In basic machine learning we are taught the following "rules of thumb":
a) the size of your training set should be at least 10 times the VC dimension of your hypothesis set.
b) a neural network with N connections (weights) has a VC dimension of approximately N.
So when a deep learning network has, say, millions of units, the number of connections is typically far larger still. Does this mean we should have, say, billions (or more) of data points? Can someone shed some light on this?
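To make the arithmetic concrete, here is a quick back-of-the-envelope sketch of what rules (a) and (b) would imply. The network here is a hypothetical fully connected one, with layer sizes I made up purely for illustration:

```python
# Back-of-the-envelope check of the 10x rule, assuming (per rule (b))
# that the VC dimension is roughly the number of connections.
# Layer sizes below are hypothetical, chosen to total ~1M units.

def required_data(layer_sizes, multiplier=10):
    """Estimate data needed under rules (a) and (b) for a fully connected net."""
    units = sum(layer_sizes)
    # Each consecutive pair of layers is fully connected (weights only; biases ignored).
    connections = sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))
    vc_dim = connections                  # rule (b): d_VC ~ number of connections
    return units, connections, multiplier * vc_dim  # rule (a): data >= 10 * d_VC

# A hypothetical net with about a million units spread over a few wide layers:
layers = [300_000, 400_000, 300_000, 10]
units, conns, data = required_data(layers)
print(f"units: {units:,}  connections: {conns:,}  data needed: {data:,}")
# ~1e6 units but ~2.4e11 connections, so the rule demands ~2.4e12 examples.
```

If I run the numbers this way, a net with about a million units demands on the order of trillions of labeled examples, which is vastly more than deep networks are actually trained on in practice. That gap is exactly what I am asking about.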