Suppose we have a training set $X$ of size $n\times d$, where $n$ is the number of examples and $d$ is the number of features.
Assume that $d>n$, so the number of features is greater than the number of examples/observations.
In this case, a simple multiple regression model cannot be learned, right? I think so because fitting it requires solving the normal equations:
$$X^TXw=X^Ty$$
- $w$ is the vector whose elements are the parameters of the regression model
- $y$ is the vector whose elements are the numerical target values in the training set
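Since $\operatorname{rank}(X)=n$ (assuming the examples in the training set are linearly independent), we have $\operatorname{rank}(X^TX)=\operatorname{rank}(X)=n$. Because $n<d$ and $X^TX$ is a $d\times d$ matrix, it does not have full rank and is therefore not invertible.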
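To sanity-check the rank argument numerically, here is a minimal sketch using NumPy. The sizes `n = 5` and `d = 8`, the random Gaussian data, and the tolerance `1e-10` are all assumptions chosen only for illustration; they are not part of the actual problem.

```python
import numpy as np

# Hypothetical sizes with more features than examples (d > n).
n, d = 5, 8

# Random Gaussian rows are linearly independent with probability 1,
# which mimics the assumption rank(X) = n.
rng = np.random.default_rng(0)
X = rng.standard_normal((n, d))

# The d x d matrix that appears on the left of the normal equations.
gram = X.T @ X

# An explicit absolute tolerance (an assumption) separates the
# numerically zero singular values from the genuinely nonzero ones.
rank_X = np.linalg.matrix_rank(X, tol=1e-10)
rank_gram = np.linalg.matrix_rank(gram, tol=1e-10)

print("rank(X)     =", rank_X)
print("rank(X^T X) =", rank_gram)
print("size of X^T X:", gram.shape)
```

If the argument holds, both ranks should equal `n`, which is smaller than the dimension `d` of $X^TX$, so the matrix is singular.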
Is this reasoning correct?