Is it possible to classify images by vectorizing them (and achieve a good performance)?

Question

Many applications of image classification involve convolutional neural networks, where the image is treated directly as a 2D (or 3D, if multiple images) matrix.

I wonder whether images can be classified (with reasonably good performance) by an MLP, softmax regression, or even an SVM after vectorizing them, that is, stacking the rows or columns of this 2D matrix into a single row or column vector and feeding that into the model directly (no convolution).
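To make "vectorizing" concrete, here is a minimal sketch of the two stackings described above, using plain Python lists and a tiny 2x3 "image". The helper names `flatten_rows` and `flatten_columns` are made up for illustration; in practice a library call such as NumPy's `reshape` would typically do this.

```python
def flatten_rows(image):
    """Stack the rows of a 2D image (list of lists) into one flat vector."""
    return [pixel for row in image for pixel in row]


def flatten_columns(image):
    """Stack the columns of a 2D image (list of lists) into one flat vector."""
    num_cols = len(image[0])
    return [row[c] for c in range(num_cols) for row in image]


# A toy 2x3 grayscale "image"; the values stand in for pixel intensities.
image = [
    [1, 2, 3],
    [4, 5, 6],
]

row_vector = flatten_rows(image)        # [1, 2, 3, 4, 5, 6]
column_vector = flatten_columns(image)  # [1, 4, 2, 5, 3, 6]
```

Either vector has the same 6 features; only their order differs, and that vector is what an MLP, softmax regression, or SVM would receive as input.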

The answer is probably negative... but I wonder if anyone knows whether this is possible.

An image is usually represented by a 3D tensor, except for gray scale images. — Michael M, Sep 05 '20 at 07:53

score 1 · Accepted Answer · answered Sep 05 '20 at 12:00

LeCun tracks MNIST performance on many classification models.

yann.lecun.com/exdb/mnist/

Unsurprisingly, CNNs do the best, but his “Neural Nets” category gives a number of fully-connected models with classification error under $1\%$. The SVM and KNN categories also report models with error rates under $1\%$.

score 1 · Answer 2 · answered Sep 05 '20 at 13:33

While you may get reasonable performance on a dataset like MNIST, which consists of small greyscale images, treating an image as a vector will generally not lead to good performance:

- A reasonably sized image may be 224x224 pixels with 3 channels, i.e. 224 x 224 x 3 = 150,528 features. Most methods (logistic regression, SVM, fully connected neural network) will probably overfit or fail to learn a model that generalizes well.

- The main problem is that by treating the image as a vector, you are throwing away the information about the spatial distance between pixels. Useful features in an image are usually local, depending on pixels in a small area. CNNs perform so well because they exploit this prior information.
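The arithmetic behind both points can be sketched in a few lines of Python. The hidden-layer width of 1000 units and the 64 filters of size 3x3 are assumptions chosen only for illustration, and `flat_index` is a made-up helper for a row-major flattening of a single channel.

```python
HEIGHT, WIDTH, CHANNELS = 224, 224, 3


def flat_index(row, col, width=WIDTH):
    """Position of pixel (row, col) after row-major flattening of one channel."""
    return row * width + col


# Number of input features once the image is flattened.
num_features = HEIGHT * WIDTH * CHANNELS

# Weights in a first fully connected layer (hypothetical width: 1000 units).
HIDDEN_UNITS = 1000
dense_weights = num_features * HIDDEN_UNITS

# Weights in a first convolutional layer (hypothetical: 64 filters of 3x3).
NUM_FILTERS, KERNEL = 64, 3
conv_weights = KERNEL * KERNEL * CHANNELS * NUM_FILTERS

# Neighbouring pixels in the image versus their distance in the vector.
horizontal_gap = flat_index(10, 11) - flat_index(10, 10)
vertical_gap = flat_index(11, 10) - flat_index(10, 10)

print(num_features, dense_weights, conv_weights)
print(horizontal_gap, vertical_gap)
```

Under these assumptions the dense layer needs about 150 million weights while the convolutional layer needs 1,728 (biases not counted). Two vertically adjacent pixels end up 224 positions apart in the vector, and nothing in the flattened input tells the model that they are neighbours.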
