8

From my model, I'm asked to determine which variables are statistically significant.

fitted.model <- lm(spending ~ sex + status + income + verbal, data=spending)

My results were as follows:

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 22.55565   17.19680   1.312   0.1968    
sex        -22.11833    8.21111  -2.694   0.0101 *  
status       0.05223    0.28111   0.186   0.8535    
income       4.96198    1.02539   4.839 1.79e-05 ***
verbal      -2.95949    2.17215  -1.362   0.1803    

Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Residual standard error: 22.69 on 42 degrees of freedom
Multiple R-squared: 0.5267, Adjusted R-squared: 0.4816 
F-statistic: 11.69 on 4 and 42 DF,  p-value: 1.815e-06

Question: Do I have to look at the last column? If so, then sex and income would be statistically significant.

chl
jerry
  • 2
    Let me reiterate my suggestion to [register](http://stats.stackexchange.com/faq#login) your account, michelle. I would also suggest providing as much background as possible when asking such a question: what is the context of your homework, what did you learn about multiple regression in your previous courses, and what is the purpose of identifying significant predictors (BTW, is this for explanatory or predictive purposes)? Thank you. – chl Sep 24 '12 at 21:07
  • 1
    See if this post has anything useful for you: http://stats.stackexchange.com/questions/5135/interpretation-of-rs-lm-output/ – Roman Luštrik Sep 24 '12 at 22:04

3 Answers

5

Yes, based on the output, sex and income are statistically significant.
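
As a rough illustration of reading that last column programmatically, here is a minimal sketch. It assumes simulated stand-in data (the data frame `dat` and its values are hypothetical, not the asker's data set) and the conventional 0.05 cutoff, which is a choice and not part of the output itself.

```r
# Simulated stand-in data (hypothetical; not the asker's data set)
set.seed(1)
n <- 47
dat <- data.frame(
  sex    = rbinom(n, 1, 0.4),
  status = rnorm(n, 45, 17),
  income = rexp(n, 1 / 4.5),
  verbal = rpois(n, 6)
)
dat$spending <- 20 - 20 * dat$sex + 5 * dat$income + rnorm(n, sd = 20)

fit <- lm(spending ~ sex + status + income + verbal, data = dat)

# coef(summary(fit)) is the coefficient matrix that summary() prints
coef_table <- coef(summary(fit))
p_values   <- coef_table[, "Pr(>|t|)"]

# Names of the terms whose p-value falls below the assumed 0.05 cutoff
significant <- names(p_values)[p_values < 0.05]
print(significant)
```

Each of these p-values tests one coefficient given that all the other terms are in the model, so the list can change when terms are added or dropped.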

sex and possibly status are nominal variables, so it's odd that they appear in the model as is. The single coefficient labelled sex in the output shows that R is treating it as a numeric variable. That works if it is coded 0/1, but it still opens up the potential for error.

To be on the safe side, wrap sex and any other nominal variable in factor() in the model formula:

fitted.model <- lm(spending ~ factor(sex) + status + income + verbal, data=spending)
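
To see what factor() changes in the 0/1 case, here is a small sketch on simulated data (the data frame `dat` is hypothetical, and it assumes sex is coded 0/1). With that coding the two fits should agree, and only the term label differs.

```r
# Simulated stand-in data (hypothetical); sex is assumed to be coded 0/1
set.seed(2)
n <- 47
dat <- data.frame(sex = rbinom(n, 1, 0.4), income = rexp(n, 1 / 4.5))
dat$spending <- 20 - 20 * dat$sex + 5 * dat$income + rnorm(n, sd = 20)

fit_numeric <- lm(spending ~ sex + income, data = dat)
fit_factor  <- lm(spending ~ factor(sex) + income, data = dat)

# The term is labelled "sex" in one fit and "factor(sex)1" in the other
print(names(coef(fit_numeric)))
print(names(coef(fit_factor)))

# With 0/1 coding the estimates themselves are the same
same_fit <- isTRUE(all.equal(unname(coef(fit_numeric)),
                             unname(coef(fit_factor))))
print(same_fit)
```

If the variable had three or more numeric codes, the numeric version would fit a single slope across the codes, while factor() would give each level its own coefficient. That is the error the wrapper guards against.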
Jessica
4

The p-value in the last column tells you the significance of the regression coefficient for a given variable. If the p-value is small enough to claim statistical significance, that just means there is strong evidence that the coefficient is different from 0 once the other variables in the model are accounted for. But in the regression context it would be a little naive to conclude that sex and income are the only significant factors. As we have seen (I think with this data set), the variables are correlated, and their coefficients and t statistics can change a lot depending on which other variables are included in the regression. You should look at what those t-tests say when only sex and income are included in the model.
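
One way to carry out that check is sketched below on simulated data (the data frame `dat` is hypothetical, not the asker's data set). It refits the model with only sex and income, then uses anova() to compare the reduced and full fits with a partial F-test.

```r
# Simulated stand-in data (hypothetical; not the asker's data set)
set.seed(3)
n <- 47
dat <- data.frame(
  sex    = rbinom(n, 1, 0.4),
  status = rnorm(n, 45, 17),
  income = rexp(n, 1 / 4.5),
  verbal = rpois(n, 6)
)
dat$spending <- 20 - 20 * dat$sex + 5 * dat$income + rnorm(n, sd = 20)

full    <- lm(spending ~ sex + status + income + verbal, data = dat)
reduced <- lm(spending ~ sex + income, data = dat)

# t-tests for sex and income when they are the only predictors
print(coef(summary(reduced)))

# Partial F-test: do status and verbal jointly improve on the reduced model?
cmp <- anova(reduced, full)
print(cmp)
```

A large p-value in the second row of the anova() table would suggest that dropping status and verbal together costs little, which is a different question from the two separate t-tests in the full model.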

BGreene
Michael R. Chernick
  • It's for class. We had to indicate the possibly significant variables from our lm, then create another lm with just those two variables, which I did. I also did a logit, and it does indicate that sex and income are significant. – jerry Sep 25 '12 at 13:12
3

Who has asked you to determine this? This looks like homework, and if it is, it should be tagged as such.

The answer to your question depends very much on what is meant by "statistically significant" in a regression context. Looking at the last column as you suggest will meet one definition, but a rather simplistic definition.

Your quoted output above does not include the rest of the summary, which includes the overall F-test. That p-value should be examined before the individual tests: it is possible for the overall test to tell you that nothing is significant while an individual test or two shows significance because of alpha inflation from multiple testing.
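
For reference, the overall F-test p-value can be recomputed from the F statistic and its degrees of freedom with pf(). The sketch below uses the values that the updated question reports (11.69 on 4 and 42 DF), and then shows how to pull the same quantities from a fitted model; the second fit uses simulated data and hypothetical names.

```r
# F statistic and degrees of freedom as reported in the question's summary output
p_overall <- pf(11.69, df1 = 4, df2 = 42, lower.tail = FALSE)
print(p_overall)

# The same quantities can be extracted from any fitted lm object.
# Simulated stand-in data (hypothetical; not the asker's data set):
set.seed(4)
dat <- data.frame(x = rnorm(47))
dat$y <- 1 + 2 * dat$x + rnorm(47)
fit <- lm(y ~ x, data = dat)

fstat <- summary(fit)$fstatistic   # named vector: value, numdf, dendf
p_fit <- pf(fstat["value"], fstat["numdf"], fstat["dendf"],
            lower.tail = FALSE)
print(unname(p_fit))
```

Because 11.69 is a rounded value, the first result should land close to, but not exactly on, the p-value printed in the summary.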

If status and verbal are correlated with each other then it is possible that either could be a very "significant" predictor of spending, but show up as redundant given the other.
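
The redundancy effect can be reproduced with a small simulation. This is only a sketch: the variables below are simulated, with verbal deliberately built as a near copy of status, and they merely borrow the names from the question.

```r
# Simulated illustration (hypothetical data): verbal is nearly a copy of status
set.seed(5)
n <- 100
status   <- rnorm(n)
verbal   <- status + rnorm(n, sd = 0.1)
spending <- status + verbal + rnorm(n)

fit_status <- lm(spending ~ status)
fit_verbal <- lm(spending ~ verbal)
fit_both   <- lm(spending ~ status + verbal)

print(cor(status, verbal))

# Each predictor on its own
print(coef(summary(fit_status))["status", ])
print(coef(summary(fit_verbal))["verbal", ])

# Both together: the standard errors inflate because the two overlap
print(coef(summary(fit_both))[c("status", "verbal"), ])

se_alone <- coef(summary(fit_status))["status", "Std. Error"]
se_joint <- coef(summary(fit_both))["status", "Std. Error"]
print(se_joint / se_alone)
```

Each predictor looks strongly related to the response when fitted alone, but in the joint model the standard errors grow severalfold, so the individual t-tests can lose their significance even though the pair clearly matters.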

Greg Snow
  • Question updated. But then how would I predict significant variables from the linear model? – jerry Sep 24 '12 at 18:45
  • 2
    @jerry, your question looks simple, but a meaningful answer is really the topic of multiple chapters in a regression textbook. The answer depends on what is meant by "significant": there are several different questions that investigate significance. The above output has tests for two such questions, but others will require fitting additional models and comparing them. What question are you trying to answer? – Greg Snow Sep 25 '12 at 14:27