
I have a basic question about how a CNN works.

Let's say I have one dataset (100 pictures) with:

  1. Class A (Picture of Cat: 40 pictures)
  2. Class B (Picture of Dog: 60 pictures)

Then I feed the 100 pictures into a CNN and run it.

My questions are:

  1. What output should I look at?
  2. Does that mean that if I input a picture (either a cat or a dog), I can tell whether it is a cat or a dog by looking at the output?

Thank you.

aan
  • What do you understand now? Do you understand logistic regression? (If not, do you understand linear regression?) – Dave Jul 20 '20 at 01:56
  • @Dave, logistic regression is not similar. This is CNN – aan Jul 20 '20 at 08:15
  • Logistic regression is extremely similar. In fact, a logistic regression is a neural network with no hidden layers. Now do you understand logistic regression? – Dave Jul 20 '20 at 09:56
  • @Dave, yes, I did learn logistic regression. So both work similarly, just that a CNN has hidden layers. – aan Jul 20 '20 at 11:36
  • Let’s look at how it would work for logistic regression. Transform your pictures into vectors that you input into the logistic regression model, perhaps just by flattening the image. (How you do this doesn’t matter for the example.) What is your answer to #1? What is your answer to #2? – Dave Jul 25 '20 at 23:21

2 Answers


The classifier would not necessarily be a logistic regression (it could be an SVM or some other classifier), but it is simpler to illustrate the issue using logistic regression.

What output should I look at?

I will assume you mean the logistic regression case. The output would just be a float between 0 and 1, or two floats that complement each other and add up to 1 (we use the latter form here, but the two are equivalent). You can refer to this answer. You can see from the picture below that the blue and red nodes form a binomial logistic regression model.
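To make the "one float vs. two complementary floats" equivalence concrete, here is a minimal NumPy sketch. The score `z = 1.5` and the choice of "dog" as the positive class are assumptions made up for illustration; they are not taken from any trained model.

```python
import numpy as np

def sigmoid(z):
    """Map a single raw score to a probability in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def softmax(logits):
    """Map a vector of raw scores to probabilities that sum to 1."""
    shifted = logits - np.max(logits)  # subtract the max for numerical stability
    exps = np.exp(shifted)
    return exps / exps.sum()

z = 1.5  # hypothetical raw score for the "dog" class

# Form 1: a single float, read as P(dog); P(cat) is implied as 1 - P(dog).
p_dog = sigmoid(z)

# Form 2: two floats [P(cat), P(dog)], from a softmax over the scores [0, z].
two_floats = softmax(np.array([0.0, z]))

print(p_dog, two_floats)
```

Both forms carry the same information: the second entry of `two_floats` equals `p_dog`, and the first entry equals `1 - p_dog`.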

Does that mean that if I input a picture (either a cat or a dog), I can tell whether it is a cat or a dog by looking at the output?

Yes. The CNN, like any other encoder, is just there to extract features. The features coming out of the CNN's last layer play the same role as the input to a logistic regression.

By the last layer of the CNN I mean the fully connected layer depicted below, and the output shown there is what you want. In your case there would be just 2 floats (each red node is one float). Each red node stands for a class; the values add up to 1, and you choose the class with the largest value. For example, if node 0 represents cats and node 1 represents dogs (you train the model to do that by using those labels), and node 1 is larger than node 0 at inference time, we say that the input to the model is a dog.

[Figure: a simple and typical CNN binary classifier.]
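As a rough sketch of that final stage, the snippet below takes a feature vector, applies one fully connected layer, and turns the two resulting scores into two floats that add up to 1. The feature vector and the weights are random placeholders (an assumption for illustration); in a real model the features would come from the convolutional layers and the weights from training. The index convention 0 = cat, 1 = dog follows the example above.

```python
import numpy as np

CLASS_NAMES = ["cat", "dog"]  # node 0 = cat, node 1 = dog

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    exps = np.exp(logits - np.max(logits))
    return exps / exps.sum()

rng = np.random.default_rng(0)

# Placeholder for the features extracted by the convolutional layers.
features = rng.normal(size=8)

# Fully connected layer: one row of weights per output node (untrained here).
W = rng.normal(size=(2, 8))
b = np.zeros(2)

logits = W @ features + b   # two raw scores, one per class
probs = softmax(logits)     # the two "red nodes"
predicted = CLASS_NAMES[int(np.argmax(probs))]

print(probs, predicted)
```

Because the weights are random, the predicted label here is meaningless; the point is only the shape of the computation and of the output.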

Hope this removes your doubts.

Lerner Zhang

The basic intuition behind a CNN is that it is a modified neural network that uses parameter sharing to capture increasingly abstract patterns in images, which are then used to distinguish between classes such as dog and cat in your case. So the purpose of the CNN architecture is efficient feature extraction. Each layer in a CNN captures different abstract patterns, which add to the overall separability of the classes. The extracted features can then be fed to a fully connected layer that acts as a classifier, and the final classification uses a sigmoid (binary classification) or softmax (multi-class) function, which produces probabilities.
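To illustrate what parameter sharing means, here is a small NumPy sketch of a single convolution filter sliding over an image. The 6x6 image and the 3x3 vertical-edge kernel are made-up values for illustration, and the loop computes the cross-correlation that deep learning libraries usually call "convolution".

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide one kernel over the image (no padding, stride 1)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # The same kernel weights are reused at every position.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Toy 6x6 "image" and a hand-written vertical-edge kernel.
image = np.arange(36, dtype=float).reshape(6, 6)
kernel = np.array([[1.0, 0.0, -1.0],
                   [1.0, 0.0, -1.0],
                   [1.0, 0.0, -1.0]])

feature_map = conv2d_valid(image, kernel)

shared_params = kernel.size                   # weights in the shared kernel
dense_params = image.size * feature_map.size  # weights a dense layer would need
print(feature_map.shape, shared_params, dense_params)
```

The 4x4 feature map is produced by only 9 shared weights, whereas a fully connected layer mapping the same 36 inputs to 16 outputs would need 576.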

  1. Look at which class has the maximum probability in the output layer. E.g., if your output layer gives .2 for cat and .8 for dog, then the final output of your model is dog.
  2. If you feed an image of a cat to the input layer and the output layer gives a probability of .95 for the cat label, then you know the model's output is cat.
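The two points above can be sketched in a few lines of NumPy. The probabilities are the ones from the examples (.2/.8 and .95); the .05 completing the second row and the column order cat, dog are assumptions for illustration.

```python
import numpy as np

CLASS_NAMES = ["cat", "dog"]  # assumed column order of the output layer

# One row per input picture: [P(cat), P(dog)].
probs = np.array([
    [0.20, 0.80],  # first example: .2 for cat, .8 for dog
    [0.95, 0.05],  # second example: .95 for cat
])

predicted_idx = probs.argmax(axis=1)               # most probable class per row
labels = [CLASS_NAMES[i] for i in predicted_idx]
confidences = probs.max(axis=1)                    # probability of that class

for label, conf in zip(labels, confidences):
    print(f"{label} ({conf:.2f})")
```

The maximum probability also tells you how confident the model is, which is useful when deciding whether to trust a borderline prediction.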
Vivek
  • Thanks. So is `sigmoid` (binary classification) only for classifying 2 classes, e.g. Cat and Dog, and `softmax` (multi-class) for classifying more than 2 classes, e.g. Cat, Dog, Bird? – aan Jul 26 '20 at 12:27
  • Yes, that's right. – Vivek Jul 26 '20 at 15:38
  • @Vivek, thanks. Do you have any resource for such an implementation using Matlab? – aan Jul 26 '20 at 16:51
  • @aan, you can check out this GitHub repo: https://github.com/xuzhenqi/cnn for a Matlab implementation. – Vivek Jul 27 '20 at 04:31
  • I was unable to run the code after downloading it from GitHub. I'm not sure what the problem is, and I couldn't find any clue there. – aan Jul 27 '20 at 12:48
  • I have not used MATLAB before so can't really help you with that. You can explore mathworks for matlab implementations. – Vivek Jul 27 '20 at 15:41
  • thanks for your reply. How about Python? – aan Jul 27 '20 at 17:14
  • There are tons of implementations online for CNN in python. Take a look at this and try to implement yourself: https://towardsdatascience.com/building-a-convolutional-neural-network-cnn-in-keras-329fbbadc5f5 – Vivek Jul 28 '20 at 09:50
  • Are the outputs of `sigmoid` and `softmax` the same? – aan Aug 22 '20 at 00:17