In a paper giving an overview of machine learning (Domingos, 2012), the author writes:
> Generalizing correctly becomes exponentially harder as the dimensionality (number of features) of the examples grows, because a fixed-size training set covers a dwindling fraction of the input space. Even with a moderate dimension of $100$ and a huge training set of a trillion examples, the latter covers only a fraction of about $10^{-18}$ of the input space. This is what makes machine learning both necessary and hard.
I don't understand what he means by the "input space" in this context. I know he's referring to a vector space, and I think he might mean a vector space that somehow represents all of the parameters. But I don't understand how this relates to the training set examples, or where he gets the fraction $10^{-18}$ from $100$ features and a trillion training examples. Is there some way of calculating this, or is it some kind of estimate?
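My best guess at the arithmetic, assuming (and this is purely my assumption) that each of the $100$ features is binary, is that the input space would then contain $2^{100}$ distinct possible examples, so a trillion ($10^{12}$) training examples would cover only

$$
\frac{10^{12}}{2^{100}} \approx \frac{10^{12}}{1.27 \times 10^{30}} \approx 8 \times 10^{-19} \approx 10^{-18}
$$

of it. But I can't tell whether that is actually what the author intends, or whether $10^{-18}$ is meant as a rougher estimate for real-valued features.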
I have the impression that the fraction of the input space covered by the training set relates to the idea that one cannot uniquely solve a system of equations with more unknowns (columns) than equations (rows), and I can vaguely see how in machine learning there could be problems if one didn't have many more rows (examples) than columns (features). But I'm not sure, and I wish I understood why this matters; I don't even know where to look for this information.
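To make the analogy I have in mind concrete (this is just my own illustration, not something from the paper): a system like

$$
\begin{aligned}
x + y + z &= 1,\\
x - y + 2z &= 0
\end{aligned}
$$

has more unknowns (three "columns") than independent equations (two "rows"), so it has infinitely many solutions rather than a unique one, and I imagine having too few training examples relative to the number of features causes a similar kind of underdetermination. Is that the right way to think about it?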
References
Domingos, P. (2012). A few useful things to know about machine learning. Communications of the ACM, 55(10), 78-87.