Questions tagged [object-detection]

Object detection deals with recognizing the presence of objects of a certain semantic class (e.g., “humans”, “buildings”, “cars”, etc.) in digital image and video data.

187 questions
43
votes
4 answers

Is it possible to give variable sized images as input to a convolutional neural network?

Can we give images with variable size as input to a convolutional neural network for object detection? If possible, how can we do that? But if we try to crop the image, we will be losing some portion of the image and if we try to resize, then, the…
31
votes
5 answers

Yolo Loss function explanation

I am trying to understand the Yolo v2 loss function: \begin{align} &\lambda_{coord} \sum_{i=0}^{S^2}\sum_{j=0}^B \mathbb{1}_{ij}^{obj}[(x_i-\hat{x}_i)^2 + (y_i-\hat{y}_i)^2 ] \\&+ \lambda_{coord} \sum_{i=0}^{S^2}\sum_{j=0}^B…
12
votes
2 answers

Do more object classes increase or decrease the accuracy of object detection

Assume you have an object detection dataset (e.g., MS COCO or Pascal VOC) with N images where k object classes have been labeled. You train a neural network (e.g., Faster R-CNN or YOLO) and measure the accuracy (e.g., IOU@0.5). Now you introduce x…
12
votes
4 answers

Average Precision in Object Detection

I'm quite confused as to how I can calculate the AP or mAP values as there seem to be quite a few different methods. I specifically want to get the AP/mAP values for object detection. All I know for sure is: Recall = TP/(TP + FN), Precision = TP/(TP…
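As a minimal sketch of the two definitions quoted in the excerpt above, the snippet below computes precision and recall from raw detection counts. The function name `precision_recall` and the example counts are hypothetical, chosen only for illustration; how detections are matched to ground truth (for example, by an IoU threshold) is assumed to have been done beforehand and is not shown.

```python
from typing import Tuple


def precision_recall(tp: int, fp: int, fn: int) -> Tuple[float, float]:
    """Return (precision, recall) from true-positive, false-positive,
    and false-negative counts.

    Assumption: an empty denominator yields 0.0 instead of raising.
    """
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    return precision, recall


# Hypothetical counts: 8 correct detections, 2 spurious ones, 4 missed objects.
precision, recall = precision_recall(tp=8, fp=2, fn=4)
print(f"precision={precision:.3f}, recall={recall:.3f}")
```

AP is typically derived from many such (precision, recall) pairs obtained by sweeping the detection confidence threshold; the exact interpolation scheme differs between benchmarks and is not covered by this sketch.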
9
votes
1 answer

Yolo v3 loss function

The original loss function can be seen here and is more or less explained in Yolo Loss function explanation: \begin{align} &\lambda_{coord} \sum_{i=0}^{S^2}\sum_{j=0}^B \mathbb{1}_{ij}^{obj}[(x_i-\hat{x}_i)^2 + (y_i-\hat{y}_i)^2 ] \\&+…
sachinruk
9
votes
1 answer

Fine Tuning vs. Transfer Learning vs. Learning from scratch

In my master's thesis, I am researching transfer learning for a specific use case: a traffic sign detector implemented as a Single Shot Detector with a VGG16 base network for classification. The research focuses on the problem of having a detector…
8
votes
1 answer

What are the best methods for reducing false positives in TensorFlow Mask-RCNN object detection framework using transfer learning?

I am training a single-object detector with Mask R-CNN and I have tried several methods for reducing false positives. I started with a few thousand example images of the object with bounding boxes, trained on those, and got decent results, but when…
Nadav Ben-Haim
8
votes
1 answer

One-shot object detection with Deep Learning

In recent years, the field of object detection has experienced a major breakthrough after the popularization of the deep learning paradigm. Approaches such as YOLO, SSD, or Faster R-CNN hold the state of the art in the general task of object…
7
votes
1 answer

Are there networks specialised on object detection for a single class of object?

I want to detect the location of a single class of object, which might occur multiple times in an image. Specifically, this relates to research on detecting brake lights for autonomous vehicles. I imagine similar techniques could be used to detect…
craq
7
votes
2 answers

Faster R-CNN: How to avoid multiple detection in same area?

I use the TensorFlow Object Detection API to train on the Pascal VOC dataset from scratch. I just had a look at the first results after 200k training steps and the results are okay, except that I often have many detections of the same class in…
5
votes
3 answers

Sample size for the evaluation of Deep Learning Models

I'm evaluating the performance and accuracy in detecting objects for my data set using three deep learning algorithms. In total there are 24,085 images. I measure the performance in terms of time taken to detect the objects. To measure the accuracy,…
5
votes
2 answers

What is sigma function in the YOLO object detector?

I have gone through the YOLO9000 paper, in which they mention that the network predicts 5 coordinates of the bounding box, and from those we find the exact centre coordinates and the width and height. I'm confused with those…
5
votes
1 answer

Coordinate prediction parameterization in object detection networks

State-of-the-art object detection networks, such as RetinaNet, Faster R-CNN, and YOLO, use a coordinate encoding where the bounding box regression is given relative to the anchor box. Centers: $t_x = (x-x_a)/w_a$ and $t_y = (y-y_a)/h_a$. Height and…
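The anchor-relative encoding quoted in the excerpt above can be sketched as follows. The center terms follow the formulas shown in the excerpt. The width and height terms are cut off in the excerpt, so the log-scale form used here, $t_w = \log(w/w_a)$ and $t_h = \log(h/h_a)$, is an assumption based on the convention commonly attributed to Faster R-CNN. The function names `encode_box` and `decode_box` and the `(x_center, y_center, width, height)` box layout are hypothetical.

```python
import math


def encode_box(box, anchor):
    """Encode a box relative to an anchor.

    Both arguments are (x_center, y_center, width, height) tuples.
    """
    x, y, w, h = box
    xa, ya, wa, ha = anchor
    tx = (x - xa) / wa       # center offset, normalized by anchor width
    ty = (y - ya) / ha       # center offset, normalized by anchor height
    tw = math.log(w / wa)    # assumed log-scale width ratio
    th = math.log(h / ha)    # assumed log-scale height ratio
    return tx, ty, tw, th


def decode_box(t, anchor):
    """Invert encode_box: recover the box from offsets and the anchor."""
    tx, ty, tw, th = t
    xa, ya, wa, ha = anchor
    return (tx * wa + xa, ty * ha + ya, wa * math.exp(tw), ha * math.exp(th))


# Hypothetical example: a 20x10 box centered at (60, 40), anchor 20x20 at (50, 50).
anchor = (50.0, 50.0, 20.0, 20.0)
targets = encode_box((60.0, 40.0, 20.0, 10.0), anchor)
restored = decode_box(targets, anchor)
```

Dividing the center offsets by the anchor size makes the regression targets independent of the absolute scale of the anchor, which is one common motivation given for this parameterization.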
5
votes
2 answers

How to detect the exact size of an object in an image using machine learning?

I have a problem where I get an image from an underwater camera. The image is quite large compared to the objects shown, so it contains mostly background (seafloor). Objects in the image are, for instance, corals or sponges. In the image I…
jens0r
5
votes
3 answers

Object detection - how to annotate negative samples

I am using the TensorFlow Object Detection API to detect 2 objects. I need negative samples because it sometimes detects something random as one of the images. How should I annotate an image to be a 'negative sample'? I am using LabelImg to annotate…
Amaan