I just followed a machine learning example and programmed a gradient descent optimization that reads in a 64x64x3 image vector to predict whether the image is a cat or not. The model has one parameter per pixel per color channel, so 12288 parameters were optimized. It worked: a vector of parameter estimates was obtained, and the predictions were pretty good (good prediction accuracy). How did this work when I only had 200 samples, so p >> n? I thought we could not estimate parameters in regression when p
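
For reference, here is a minimal sketch of the kind of setup described above. It rests on several assumptions that are not in the post: the model is taken to be plain logistic regression, random synthetic data stands in for the cat images, and the names `sigmoid` and `fit_logistic_gd` are made up for illustration. It only shows that gradient descent returns some parameter vector even though the design matrix has rank at most n; it says nothing about accuracy on held-out data.

```python
import numpy as np

def sigmoid(z):
    # Logistic function mapping real-valued scores to (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic_gd(X, y, lr=1.0, n_iter=500):
    # Hypothetical helper: batch gradient descent on the mean
    # cross-entropy loss of a logistic regression model.
    n, p = X.shape
    w = np.zeros(p)   # one weight per input feature
    b = 0.0           # intercept
    for _ in range(n_iter):
        a = sigmoid(X @ w + b)        # predicted probabilities
        grad_w = X.T @ (a - y) / n    # gradient with respect to w
        grad_b = np.mean(a - y)       # gradient with respect to b
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Synthetic stand-in for the image data (assumption): 200 samples,
# 64 * 64 * 3 = 12288 features, random 0/1 labels.
rng = np.random.default_rng(0)
n, p = 200, 64 * 64 * 3
X = rng.standard_normal((n, p)) / np.sqrt(p)  # rows scaled to norm near 1
y = (rng.random(n) < 0.5).astype(float)

w, b = fit_logistic_gd(X, y)

# Fraction of training samples classified correctly at threshold 0.5.
train_acc = np.mean((sigmoid(X @ w + b) > 0.5) == (y == 1.0))

# The rank of X cannot exceed n, so with p > n the 12288 weights are
# not uniquely determined by the 200 samples.
rank = np.linalg.matrix_rank(X)

print("parameters:", w.shape[0])
print("rank of X:", rank)
print("training accuracy:", train_acc)
```

Gradient descent started from zero only ever moves `w` within the span of the rows of `X`, so it returns one particular vector out of the many that fit the training data equally well.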