Object detection deals with recognizing the presence of objects of a certain semantic class (e.g., “humans”, “buildings”, “cars”, etc.) in digital image and video data.
Questions tagged [object-detection]
187 questions
43
votes
4 answers
Is it possible to give variable sized images as input to a convolutional neural network?
Can we give images with variable size as input to a convolutional neural network for object detection? If possible, how can we do that?
But if we crop the image, we lose part of it, and if we try to resize, then the…

Ashna Eldho
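One commonly used compromise between cropping and plain resizing is letterboxing: scale the image so that it fits the network input while preserving its aspect ratio, then pad the remainder. The sketch below only computes the geometry. The function name `letterbox_params` and the square `target` input size are assumptions made for illustration; they do not come from the question.

```python
def letterbox_params(width, height, target):
    """Geometry for fitting a width x height image into a target x target square.

    Returns (new_w, new_h, pad_left, pad_top). All names are hypothetical.
    """
    # Use the smaller scale factor so that both sides fit inside the square.
    scale = min(target / width, target / height)
    new_w = round(width * scale)
    new_h = round(height * scale)
    # Split the leftover space so the resized image sits in the centre.
    pad_left = (target - new_w) // 2
    pad_top = (target - new_h) // 2
    return new_w, new_h, pad_left, pad_top


# A 640x480 image placed into a 416x416 input.
print(letterbox_params(640, 480, 416))
```

Ground-truth boxes would need the same scale and offsets applied, so that they still line up with the padded image.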
31
votes
5 answers
Yolo Loss function explanation
I am trying to understand the Yolo v2 loss function:
\begin{align}
&\lambda_{coord} \sum_{i=0}^{S^2}\sum_{j=0}^B \mathbb{1}_{ij}^{obj}[(x_i-\hat{x}_i)^2 + (y_i-\hat{y}_i)^2 ] \\&+ \lambda_{coord} \sum_{i=0}^{S^2}\sum_{j=0}^B…

Kamel BOUYACOUB
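The first term of the loss quoted in the question can be sketched in plain Python as follows. The nested-list layout (one list per grid cell, one entry per predicted box) and the function name are assumptions for illustration. The default `lambda_coord=5.0` is the value reported in the original YOLO paper; it is not stated in the excerpt.

```python
def coord_center_loss(pred, target, obj_mask, lambda_coord=5.0):
    """Centre-coordinate term: lambda_coord * sum over cells i and boxes j of
    1_ij^obj * [(x - x_hat)^2 + (y - y_hat)^2].

    pred[i][j] and target[i][j] are (x, y) pairs; obj_mask[i][j] is truthy when
    box j in cell i is responsible for an object. This layout is hypothetical.
    """
    total = 0.0
    for i, cell in enumerate(obj_mask):
        for j, responsible in enumerate(cell):
            if responsible:
                px, py = pred[i][j]
                tx, ty = target[i][j]
                total += (px - tx) ** 2 + (py - ty) ** 2
    return lambda_coord * total


# One cell with two boxes; only the first box is responsible for an object.
pred = [[(0.5, 0.5), (0.0, 0.0)]]
target = [[(0.25, 0.5), (1.0, 1.0)]]
mask = [[1, 0]]
print(coord_center_loss(pred, target, mask))
```

The indicator means that boxes not responsible for any object contribute nothing to this term, however far off their coordinates are.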
12
votes
2 answers
Do more object classes increase or decrease the accuracy of object detection?
Assume you have an object detection dataset (e.g., MS COCO or Pascal VOC) with N images where k object classes have been labeled. You train a neural network (e.g., Faster R-CNN or YOLO) and measure the accuracy (e.g., IOU@0.5).
Now you introduce x…

SaiBot
12
votes
4 answers
Average Precision in Object Detection
I'm quite confused as to how I can calculate the AP or mAP values as there seem to be quite a few different methods. I specifically want to get the AP/mAP values for object detection.
All I know for sure is:
Recall = TP/(TP + FN),
Precision = TP/(TP…

User1915
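As the question notes, there are several ways to compute AP. The sketch below shows only one of them, the area under the precision-recall curve after making precision non-increasing (often called all-point interpolation). It assumes the detections of a single class have already been sorted by descending confidence and matched to ground truth, so each one is marked as a true or false positive. The function name and input format are hypothetical.

```python
def average_precision(is_tp, num_gt):
    """AP for one class.

    is_tp:  list of booleans, one per detection, sorted by descending confidence.
    num_gt: number of ground-truth objects of this class (TP + FN).
    """
    if num_gt == 0:
        return 0.0
    tp = fp = 0
    precisions, recalls = [], []
    for hit in is_tp:
        if hit:
            tp += 1
        else:
            fp += 1
        precisions.append(tp / (tp + fp))  # Precision = TP / (TP + FP)
        recalls.append(tp / num_gt)        # Recall = TP / (TP + FN)
    # Replace each precision by the maximum precision at any higher recall.
    for k in range(len(precisions) - 2, -1, -1):
        precisions[k] = max(precisions[k], precisions[k + 1])
    # Sum the rectangles under the resulting step curve.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


print(average_precision([True, False, True], num_gt=2))
```

Under this variant, mAP would be the mean of the per-class values. Other protocols, such as 11-point interpolation or averaging over several IoU thresholds, give different numbers for the same detections.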
9
votes
1 answer
Yolo v3 loss function
The original loss function can be seen here and is more or less explained in Yolo Loss function explanation:
\begin{align}
&\lambda_{coord} \sum_{i=0}^{S^2}\sum_{j=0}^B \mathbb{1}_{ij}^{obj}[(x_i-\hat{x}_i)^2 + (y_i-\hat{y}_i)^2 ] \\&+…

sachinruk
9
votes
1 answer
Fine Tuning vs. Transfer Learning vs. Learning from scratch
In my master's thesis, I am researching transfer learning for a specific use case: a traffic sign detector implemented as a Single Shot Detector with a VGG16 base network for classification. The research focuses on the problem of having a detector…

Jürgen Zornig
8
votes
1 answer
What are the best methods for reducing false positives in TensorFlow Mask-RCNN object detection framework using transfer learning?
I am training a single-object detector with Mask R-CNN and have tried several methods for reducing false positives. I started with a few thousand example images of the object with bounding boxes, trained on those, and got decent results, but when…

Nadav Ben-Haim
8
votes
1 answer
One-shot object detection with Deep Learning
In recent years, the field of object detection has experienced a major breakthrough following the popularization of the Deep Learning paradigm. Approaches such as YOLO, SSD or Faster R-CNN hold the state of the art in the general task of object…

Daniel López
7
votes
1 answer
Are there networks specialised on object detection for a single class of object?
I want to detect the location of a single class of object, which might occur multiple times in an image. Specifically, this relates to research on detecting brake lights for autonomous vehicles. I imagine similar techniques could be used to detect…

craq
7
votes
2 answers
Faster R-CNN: How to avoid multiple detection in same area?
I use the TensorFlow Object Detection API to train on the Pascal VOC dataset from scratch. I just had a look at the first results after 200k training steps, and they are okay, except that I often have many detections of the same class in…

ITiger
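Overlapping detections of the same class are usually pruned with greedy non-maximum suppression (NMS). The following is a minimal sketch of that technique, not the implementation used by the TensorFlow Object Detection API. The `(x1, y1, x2, y2)` box format, the function names, and the default threshold of 0.5 are assumptions for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Return the indices of the boxes to keep, highest score first."""
    order = sorted(range(len(boxes)), key=lambda k: scores[k], reverse=True)
    keep = []
    for k in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[k], boxes[m]) <= iou_threshold for m in keep):
            keep.append(k)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))
```

NMS is normally run separately for each class, so that a box of one class does not suppress an overlapping box of another. Lowering `iou_threshold` removes more near-duplicates but can also merge objects that genuinely sit close together.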
5
votes
3 answers
Sample size for the evaluation of Deep Learning Models
I'm evaluating the performance and accuracy in detecting objects for my data set using three deep learning algorithms. In total there are 24,085 images. I measure the performance in terms of time taken to detect the objects. To measure the accuracy,…

Nilani Algiriyage
5
votes
2 answers
What is sigma function in the YOLO object detector?
I have gone through the YOLO9000 paper, in which they mention that the network predicts 5 coordinates for the bounding box, and from those we find the exact centre coordinates and the width and height. I'm confused with those…

bibinwilson
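In the YOLO9000 paper, σ denotes the logistic sigmoid, which squashes the raw centre predictions into (0, 1) so that the predicted centre stays inside its grid cell. A minimal sketch of the decoding equations from that paper follows, with `c_x`, `c_y` as the cell's top-left offset and `p_w`, `p_h` as the anchor (prior) size. The function names are hypothetical.

```python
import math


def sigmoid(t):
    """Logistic sigmoid, written as sigma in the paper."""
    return 1.0 / (1.0 + math.exp(-t))


def decode_box(t_x, t_y, t_w, t_h, c_x, c_y, p_w, p_h):
    """Map raw network outputs to a box centre and size in grid units."""
    b_x = sigmoid(t_x) + c_x   # centre x: offset within the cell plus cell index
    b_y = sigmoid(t_y) + c_y   # centre y
    b_w = p_w * math.exp(t_w)  # width: anchor width scaled by exp(t_w)
    b_h = p_h * math.exp(t_h)  # height: anchor height scaled by exp(t_h)
    return b_x, b_y, b_w, b_h


# Zero outputs place the centre in the middle of cell (3, 4) at anchor size.
print(decode_box(0.0, 0.0, 0.0, 0.0, c_x=3, c_y=4, p_w=2.0, p_h=1.5))
```

The fifth predicted value is an objectness score, which the paper also passes through σ.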
5
votes
1 answer
Coordinate prediction parameterization in object detection networks
State-of-the-art object detection networks, such as RetinaNet, Faster R-CNN, and YOLO, use a coordinate encoding where the bounding box regression is given relative to the anchor box:
Centers:
$t_x = (x-x_a)/w_a$ and $t_y = (y-y_a)/h_a$
Height and…

NicoJ
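The centre encoding quoted in the question, together with its inverse, can be sketched as follows. Only the two centre equations are covered, because the excerpt is cut off before the height and width terms. The function names are hypothetical.

```python
def encode_center(x, y, x_a, y_a, w_a, h_a):
    """t_x = (x - x_a) / w_a and t_y = (y - y_a) / h_a."""
    return (x - x_a) / w_a, (y - y_a) / h_a


def decode_center(t_x, t_y, x_a, y_a, w_a, h_a):
    """Inverse mapping from regression targets back to image coordinates."""
    return t_x * w_a + x_a, t_y * h_a + y_a


# Anchor centred at (50, 50) with size 20x10; ground-truth centre at (60, 45).
t_x, t_y = encode_center(60, 45, 50, 50, 20, 10)
print(t_x, t_y)
print(decode_center(t_x, t_y, 50, 50, 20, 10))
```

Dividing by the anchor size makes the targets scale-invariant: the same relative shift gives the same target value for a small anchor and a large one.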
5
votes
2 answers
How to detect the exact size of an object in an image using machine learning?
I have a problem where I get an image from an underwater camera. The image is quite large compared to the objects shown, so it contains mostly background (seafloor). Objects in the image are, for instance, corals or sponges.
In the image I…

jens0r
5
votes
3 answers
Object detection - how to annotate negative samples
I am using the TensorFlow Object Detection API to detect 2 objects. I need negative samples because it sometimes detects something random as one of the images. How should I annotate an image to be a 'negative sample'? I am using LabelImg to annotate…

Amaan