In a linear (OLS) regression, I find that the sizes of the coefficients vary a lot: one of them is about 0.0005 and another is ~150. The p-values are all under 0.1. Is this a reliable result? These are time-series variables, by the way.

Separately, in another OLS time-series regression, I am finding a variable to be significant which, according to the business client, should not be. This variable has much higher variance than the other input variables we tried in the model and is also highly correlated with the dependent variable. Could the low p-value (statistical significance) and high correlation arise from the variance? The other variables have low variance over the observed time period.

Thanks for any insight.

user2450223

1 Answer

The size of a coefficient depends on the scale of its explanatory (right-hand-side, independent, x) variable. Say we use annual household income, measured in euros, as an explanatory variable. The coefficient tells us how much we expect the outcome to change for a one-euro change in annual income. That effect will in all likelihood be very small. It would probably make more sense to measure annual income in 1000s of euros (just divide the old income by 1000). In that case the coefficient will be 1000 times the old one. The p-value will remain unchanged, as we did not change the model: if a 1 euro increase leads to a 0.01 change in y, then a 1000 euro increase leads to a change of 10 in y. These are just different ways of saying the same thing. So it is entirely possible to have such huge differences in coefficients alongside similar p-values. In practice, if I see such large differences in coefficients, I always reconsider the units of the explanatory variables.
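To make the rescaling argument concrete, here is a minimal sketch on simulated data. Everything in it is an assumption made for illustration: the helper `simple_ols`, the variable names, the sample size, and the noise level are not from the question. It fits a one-predictor OLS regression twice, once with income in euros and once in 1000s of euros, and compares the slope and the t-statistic. With the residual degrees of freedom fixed, the p-value is determined by the t-statistic, so an unchanged t-statistic means an unchanged p-value.

```python
import math
import random


def simple_ols(x, y):
    """Return (slope, standard error, t-statistic) for the model y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares, then the usual standard error of the slope.
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)
    return slope, se, slope / se


random.seed(0)  # simulated data, purely illustrative

# Hypothetical annual income in euros, with an assumed true effect of 0.01 per euro.
income_eur = [random.uniform(10_000, 80_000) for _ in range(200)]
y = [5 + 0.01 * inc + random.gauss(0, 100) for inc in income_eur]

# The same variable expressed in 1000s of euros.
income_keur = [inc / 1000 for inc in income_eur]

b_eur, se_eur, t_eur = simple_ols(income_eur, y)
b_keur, se_keur, t_keur = simple_ols(income_keur, y)

print(f"euros:      slope={b_eur:.5f}  se={se_eur:.6f}  t={t_eur:.3f}")
print(f"1000 euros: slope={b_keur:.5f}  se={se_keur:.6f}  t={t_keur:.3f}")
```

The slope and its standard error both scale by the same factor of 1000, so their ratio (the t-statistic) stays the same up to floating-point rounding. The same holds for each coefficient in a multiple regression, although this sketch only covers the single-predictor case.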

Maarten Buis
  • (+1). See also http://stats.stackexchange.com/questions/165027/why-is-the-age-squared-variable-divided-by-100-or-1000/165122#165122 – Christoph Hanck Aug 31 '15 at 08:37
  • Thanks for the answer, it is very helpful. However, I am also wondering about the role of the variance of an input predictor variable: does higher variance make them somehow more likely to be significant in a model? If I have 2 variables X1 and X2 in a regression model, such that X1 doesn't vary much, but X2 does, does this impact their final p-values/ coefficients in the model? Thanks. – user2450223 Sep 01 '15 at 16:11