Questions tagged [computer-vision]

Questions related to image representation, segmentation, visual object categorization and image processing algorithms in general.

438 questions
71
votes
4 answers

What is translation invariance in computer vision and convolutional neural network?

I don't have a computer vision background, yet when I read articles and papers on image processing and convolutional neural networks, I constantly come across the term translation invariance, or translation invariant. Or I read a lot that the…
43
votes
4 answers

Is it possible to give variable sized images as input to a convolutional neural network?

Can we give images with variable size as input to a convolutional neural network for object detection? If possible, how can we do that? But if we try to crop the image, we will be losing some portion of the image, and if we try to resize, then the…
43
votes
6 answers

Neural network references (textbooks, online courses) for beginners

I want to learn Neural Networks. I am a Computational Linguist. I know statistical machine learning approaches and can code in Python. I am looking to start with its concepts, and know one or two popular models which may be useful from a…
21
votes
2 answers

Balancing Reconstruction vs KL Loss Variational Autoencoder

I am training a conditional variational autoencoder on a dataset of faces. When I set my KL loss equal to my reconstruction loss term, my autoencoder seems unable to produce varied samples. I always get the same types of faces appearing: These…
Joe B
21
votes
3 answers

hinge loss vs logistic loss advantages and disadvantages/limitations

Hinge loss can be defined using $\max(0, 1-y_i\mathbf{w}^T\mathbf{x}_i)$ and the log loss can be defined as $\log(1 + \exp(-y_i\mathbf{w}^T\mathbf{x}_i))$. I have the following questions: Are there any disadvantages of hinge loss (e.g.…
user570593
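
As a rough illustration of the two formulas in the excerpt above, here is a minimal Python sketch that evaluates both losses for a single example. The function names and the `score` argument (standing in for $\mathbf{w}^T\mathbf{x}_i$) are assumptions made for this example and do not come from the original question; labels are assumed to be in {-1, +1}.

```python
import math


def hinge_loss(y, score):
    """Hinge loss max(0, 1 - y * score) for a label y in {-1, +1}."""
    return max(0.0, 1.0 - y * score)


def logistic_loss(y, score):
    """Log loss log(1 + exp(-y * score)) for a label y in {-1, +1}."""
    margin = y * score
    # Numerically stable form: avoids overflow in exp() for very negative margins.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


if __name__ == "__main__":
    # Compare the two losses for a positive example at several scores.
    for score in (-2.0, 0.0, 1.0, 3.0):
        print(score, hinge_loss(1, score), logistic_loss(1, score))
```

One visible difference is that the hinge loss is exactly zero once the margin $y_i\mathbf{w}^T\mathbf{x}_i$ reaches 1, while the log loss stays strictly positive for every finite margin.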
17
votes
2 answers

Fine Tuning vs Joint Training vs Feature Extraction

I am reading this paper (http://zli115.web.engr.illinois.edu/wp-content/uploads/2016/10/0479.pdf). It distinguishes between feature extraction and fine tuning in deep learning. I do not see the difference, as feature extraction is just the same as…
16
votes
2 answers

What is energy minimization in machine learning?

I was reading about optimization for an ill-posed problem in computer vision and came across the explanation below about optimization on Wikipedia. What I don't understand is, why do they call this optimization "Energy minimization" in Computer…
iamprem
14
votes
4 answers

Convolution with a non-square kernel

So far I've only encountered convolution kernels which are square (i.e., have the same number of rows as columns). Are there any cases in which a non-square kernel makes sense? If not, why?
13
votes
2 answers

Can a convolutional neural network take as input images of different sizes?

I'm working on a convolutional network for image recognition, and I was wondering if I could input images of different sizes (not hugely different though). In this project (https://github.com/harvardnlp/im2markup), they say: and group images of similar…
13
votes
1 answer

How to reduce number of false positives?

I'm trying to solve a task called pedestrian detection, and I train a binary classifier on two categories: positives (people) and negatives (background). My dataset has 3752 positives and 3800 negatives. I use an 80/20 train/test split and…
12
votes
1 answer

How to form a Precision-Recall curve when I only have one value for P-R?

I have a data mining assignment where I make a content-based image retrieval system. I have 20 images of each of 5 animals, so 100 images in total. My system returns the 10 most relevant images to an input image. Now I need to evaluate the performance of my…
jeff
12
votes
2 answers

Anchoring Faster RCNN

In the Faster RCNN paper when talking about anchoring, what do they mean by using "pyramids of reference boxes" and how is this done? Does this just mean that at each of the W*H*k anchor points a bounding box is generated? Where W = width, H =…
BadProgrammer
12
votes
5 answers

What loss function should I use for binary detection in face/non-face detection in CNN?

I want to use deep learning to train a face/non-face binary detector. What loss should I use? I think it is SigmoidCrossEntropyLoss or hinge loss. Is that right? I also wonder whether I should use softmax with only two classes.
12
votes
3 answers

How to classify an unbalanced dataset by Convolutional Neural Networks (CNN)?

I have an unbalanced dataset in a binary classification task, where the ratio of positives to negatives is 0.3% vs 99.7%. The gap between positives and negatives is huge. When I train a CNN with the structure used in the MNIST problem, the testing…
11
votes
1 answer

Training a convolution neural network

I am currently working on face recognition software that uses convolutional neural networks to recognize faces. Based on my readings, I've gathered that a convolutional neural network has shared weights, so as to save time during training. But, how…