I am looking for examples where someone has graphed the results of a simple linear regression model in which the dependent variable is continuous and the predictor variable is categorical with more than two categories.

For example, predicting mean temperatures from 3 different types of ovens.

I am finding plenty of graphed examples with two categories, but none with 3 or 4 categories. I would like to see what the fitted model looks like when plotted on the data.

Thanks.

phaser
  • Do you mean the *dependent* variable is continuous? See https://stats.stackexchange.com/questions/87487, https://stats.stackexchange.com/questions/82557, https://stats.stackexchange.com/questions/68847, or https://stats.stackexchange.com/questions/62756 for some examples. Otherwise, could you describe your data more clearly? – whuber Jul 04 '17 at 21:03
  • @whuber see my edits. Yes the dependent variable is continuous. – phaser Jul 05 '17 at 03:30
  • Then it would seem any of the threads I referenced already answer your question. – whuber Jul 05 '17 at 12:43

1 Answer

This overview of visuals from ggplot shows several options:

  • histograms
  • density plots
  • (Tufte-styled) box plots
  • violin plots

If the goal is only to present the output of a regression model (predicted means, standard errors, other calculated statistics, etc.) rather than a picture of the raw data, then a table will be sufficient.
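As a minimal sketch of such a table, assuming the built-in `iris` data with `Petal.Length` as the continuous response and `Species` as the three-level predictor (the object names `fit`, `groups`, and `means_table` are only illustrative):

```r
# Fit the one-way model: continuous response, three-level categorical predictor.
fit <- lm(Petal.Length ~ Species, data = iris)

# One row per category, keeping the same factor levels as the fitted model.
groups <- data.frame(
  Species = factor(levels(iris$Species), levels = levels(iris$Species))
)

# Predicted mean per category with a 95% confidence interval
# (predict() returns the columns fit, lwr, and upr).
means_table <- cbind(groups, predict(fit, newdata = groups, interval = "confidence"))
print(means_table)
```

The `fit` column holds the per-category predicted means, which for this model are simply the group averages.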

A quick example (just one of many possible variations!) using R and ggplot2:

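library(ggplot2)

# Overlaid histograms and density curves of petal length, one color per species.
# For one panel per species, add facet_wrap(~ Species).
ggplot(data = iris, aes(x = Petal.Length,
                        fill = Species,
                        color = Species)
      ) +
          # draw the histogram on the density scale so it matches geom_density
          geom_histogram(aes(y = after_stat(density)),
                         position = "identity", alpha = 0.5, bins = 50) +
          geom_density(alpha = 0.5) +
          scale_fill_manual(values = c("#009999", "#AAAAAA", "#999900")) +
          scale_color_manual(values = c("#009999", "#AAAAAA", "#999900"))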

[Figure: multiple histogram example]
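To show the fitted model on top of the raw data, rather than the distributions alone, one option is to overlay the fitted values on jittered points. This is only a sketch under the same assumptions (`iris`, `Petal.Length ~ Species`); the names `plot_data` and `p` and the styling choices are illustrative:

```r
library(ggplot2)

# One-way model: every observation's fitted value is the mean of its category.
fit <- lm(Petal.Length ~ Species, data = iris)
plot_data <- transform(iris, fitted = fitted(fit))

p <- ggplot(plot_data, aes(x = Species, y = Petal.Length)) +
  # raw observations, jittered horizontally only so the y values stay exact
  geom_jitter(width = 0.15, height = 0, alpha = 0.4) +
  # the model: one fitted value (the group mean) per category
  geom_point(aes(y = fitted), color = "red", shape = 18, size = 4) +
  labs(x = "Species", y = "Petal length")
print(p)
```

With three or four categories the picture is the same as with two: one fitted mean per category, with the vertical scatter of the points around each mean showing how well the model fits.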

Sextus Empiricus
  • I am looking for a way to see visually how well the model fits the data. – phaser Jul 05 '17 at 15:42
  • I have added a visual example. The model is just the three group averages (setosa ~ 1.462, versicolor ~ 4.260, virginica ~ 5.552), and you can run an ANOVA or another test to put a (modelled) number on how much setosa differs from the other two. – Sextus Empiricus Jul 05 '17 at 15:55
  • The case of DependentContinuousVariable ~ IndependentFactorVariable is not so complex, and the graphs described give good insight. In more complex models you could plot the residuals (which can be done in various ways, but the principle is the same every time). – Sextus Empiricus Jul 05 '17 at 15:58
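The ANOVA and the residual plot mentioned in the comments could be sketched as follows. This uses base R graphics for brevity and again assumes the `iris` example; `aov_table` and `res` are illustrative names, and a box plot per category is only one of several ways to display residuals:

```r
# Same one-way model as in the example: petal length explained by species.
fit <- lm(Petal.Length ~ Species, data = iris)

# F test for whether the category means differ at all.
aov_table <- anova(fit)
print(aov_table)

# Residuals are the observations minus their category mean,
# so they average to zero within every category.
res <- residuals(fit)

# Residuals by category: similar spread around zero suggests the
# "one mean per category" model describes each group comparably well.
boxplot(res ~ iris$Species, xlab = "Species", ylab = "Residual")
abline(h = 0, lty = 2)
```

Pairwise comparisons (for example, setosa against each of the other two species) would need a follow-up test, which is not shown here.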