0

[Image: R `summary()` output for the regression of Wear on Brand]

As shown, y is Wear and X is Brand (5 brands in total). Why does the summary show only 4 brand variables? How do I write the model based on the R output? Is the model wear = beta0 + beta1*brandAjax + beta2*brandChamp + beta3*brandTuffy + beta4*brandXtra right? Does the intercept stand for the mean of the fifth brand?

melody
  • 1
  • 2
  • You need to understand `contrasts` in R. You'll find lots of explanations in CV or any introduction to R. – user20637 Apr 30 '20 at 20:52

2 Answers

1

The summary shows 4 variables because `lead$Brand` is a nominal variable with 5 levels: AJAX, CHAMP, TUFFY, XTRA, and a fifth one that is not visible in this output. Because assigning numbers to the levels of `lead$Brand` would not make sense, as it is (probably) not an interval or ordinal variable, R automatically creates dummy variables to account for the variation in the data (for more information on dummy variables, see https://www.statisticssolutions.com/dummy-coding-the-how-and-why/).

Your model
$Wear = \beta_0 + \beta_1*brandAJAX + \beta_2*brandCHAMP + \beta_3*brandTUFFY + \beta_4*brandXTRA$
is correct.

The intercept ($\beta_0$) can be interpreted as the predicted value (the estimated mean Wear) for the fifth brand, which serves as the reference level.
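
To make this concrete, here is a minimal sketch on simulated data. The asker's data are not available, so the numbers, the data frame layout, and the fifth brand name "ACME" are all assumptions made for illustration. The coefficient names come out as `BrandAJAX` etc. because the column is called `Brand` here.

```r
# Hypothetical data: 5 brands, 5 observations each (simulated, not the asker's data).
set.seed(1)
brands <- c("ACME", "AJAX", "CHAMP", "TUFFY", "XTRA")  # "ACME" is a made-up fifth brand
lead <- data.frame(
  Brand = factor(rep(brands, each = 5)),
  Wear  = rnorm(25, mean = rep(c(2.3, 2.0, 2.5, 2.2, 2.4), each = 5), sd = 0.15)
)

fit <- lm(Wear ~ Brand, data = lead)
coef(fit)  # (Intercept) plus 4 dummies; the first factor level, "ACME", is the reference

# The intercept equals the sample mean of the reference brand ...
group_means <- tapply(lead$Wear, lead$Brand, mean)
all.equal(coef(fit)[["(Intercept)"]], group_means[["ACME"]])

# ... and intercept + dummy coefficient equals the sample mean of that brand.
all.equal(coef(fit)[["(Intercept)"]] + coef(fit)[["BrandAJAX"]],
          group_means[["AJAX"]])

# 95% confidence interval for the reference-brand mean, by hand and via confint().
# The reported standard error is used as is; it is not divided by sqrt(n) again.
se_int <- summary(fit)$coefficients["(Intercept)", "Std. Error"]
t_crit <- qt(0.975, df = df.residual(fit))
ci_by_hand <- coef(fit)[["(Intercept)"]] + c(-1, 1) * t_crit * se_int
ci_by_hand
confint(fit)["(Intercept)", ]
```

By default R takes the first factor level (alphabetical order unless the levels are set explicitly) as the reference; `relevel()` can be used to pick a different one.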

User33268
  • 1,408
  • 2
  • 10
  • 21
  • Thanks, it helps. Two more questions: 1. Assume each brand has 5 samples. If I want to calculate the confidence interval by hand for the fifth brand, do I need to divide sd(beta0) by sqrt(5), or is sd(beta0) itself fine? 2. Would the mean for AJAX be 2.325-0.275*5? – melody Apr 30 '20 at 21:24
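  • I presume you are talking about the confidence interval for the mean of the fifth brand, right? The approximate 95% CI for that is $\hat\beta_0 \pm 1.96 \cdot SE$, that is, $2.33 \pm 1.96 \cdot 0.07$, because the intercept is the estimated mean of the fifth brand. The reported standard error is already the standard error of that mean, so there is no need to divide by sqrt(5); with a small sample, use the t quantile for the residual degrees of freedom instead of 1.96. The mean for AJAX would be 2.33-0.275. – User33268 May 01 '20 at 07:55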
0

The fifth brand is omitted to avoid the dummy variable trap: with an intercept in the model, a dummy for every brand would be perfectly collinear with it. The betas for the four brand dummies are all multiplied by zero when the brand in question is the fifth brand. So yes, in this case the intercept is the estimated mean for the fifth brand, and it is the reference group that the other brands are compared to.
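
The design matrix R builds can be inspected directly. A small sketch, assuming a factor with one observation per brand and the made-up name "ACME" for the fifth brand (neither is from the question):

```r
# Hypothetical factor with five brands; "ACME" is a made-up name for the fifth brand.
Brand <- factor(c("ACME", "AJAX", "CHAMP", "TUFFY", "XTRA"))

# Design matrix under R's default treatment contrasts:
# an intercept column plus one 0/1 dummy per non-reference level.
X <- model.matrix(~ Brand)
X

# The reference-brand row has zeros in every dummy column,
# so its fitted value is the intercept alone.
X[Brand == "ACME", ]

# Adding a fifth dummy makes the columns linearly dependent
# (the five dummies sum to the intercept column): the dummy variable trap.
X_full <- cbind(X, BrandACME = as.numeric(Brand == "ACME"))
qr(X_full)$rank   # rank is smaller than the number of columns
ncol(X_full)
```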

kkz
  • 205
  • 1
  • 5
  • Thanks, it helps. Two more questions: 1. Assume each brand has 5 samples. If I want to calculate the confidence interval by hand for the fifth brand, do I need to divide sd(beta0) by sqrt(5), or is sd(beta0) itself fine? 2. Would the mean for AJAX be 2.325-0.275*5? – melody Apr 30 '20 at 23:20