If you regress a randomly generated dependent variable on randomly generated independent variables, is the expected R-squared simply a function of n (the number of observations) and k (the number of independent variables)? If so, why?
In some old regression course notes I was rereading, the expected R-squared in this case is given as k / (n - k - 1). I tried this with some randomly generated data (e.g. n = 100 and k = 20) and did get a value very close to 20 / 79 ≈ 0.2532, but I don't understand how it can be this simple. Thanks for any color anyone might have.
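For anyone who wants to reproduce the experiment, here is a minimal sketch in Python. It is not the exact script behind the number above. It assumes NumPy, independent standard normal draws for both the dependent and the independent variables, and an ordinary least squares fit that includes an intercept. The function name `simulate_mean_r2` and its parameters are illustrative. It averages R-squared over many replications and prints the result next to the k / (n - k - 1) value from the notes. It also prints k / (n - 1), which is the mean of the Beta(k/2, (n - k - 1)/2) distribution that R-squared follows under these normal-theory assumptions.

```python
import numpy as np


def simulate_mean_r2(n=100, k=20, reps=500, seed=0):
    """Average R-squared from regressing pure noise on pure noise.

    Assumptions (not taken from the course notes): standard normal data,
    OLS with an intercept, and independent replications.
    """
    rng = np.random.default_rng(seed)
    r2_values = np.empty(reps)
    for i in range(reps):
        X = rng.standard_normal((n, k))   # k unrelated regressors
        y = rng.standard_normal(n)        # response unrelated to X
        A = np.column_stack([np.ones(n), X])  # add intercept column
        beta, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ beta
        ss_res = resid @ resid                 # residual sum of squares
        ss_tot = np.sum((y - y.mean()) ** 2)   # total sum of squares
        r2_values[i] = 1.0 - ss_res / ss_tot
    return r2_values.mean()


if __name__ == "__main__":
    n, k = 100, 20
    print("simulated mean R^2:", simulate_mean_r2(n, k))
    print("k / (n - k - 1):   ", k / (n - k - 1))
    print("k / (n - 1):       ", k / (n - 1))
```

Changing `n`, `k`, or `seed` shows how the simulated average moves with the two sample-size parameters and how it compares with each candidate formula.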