
One standard approach we use when describing our data for a paper is to cut our dependent variable into tertiles. For each tertile we then report mean/SD or median/IQR for continuous independent variables, and the number and percentage in each category for categorical independent variables.
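A minimal sketch of that kind of summary in base R follows. The data frame `d` and its columns `biomarker`, `age`, and `smoker` are made up for illustration and are not our real data.

```r
# Hypothetical example data: 'biomarker' is the dependent variable,
# 'age' a continuous and 'smoker' a categorical independent variable.
set.seed(1)
d <- data.frame(
  biomarker = rnorm(300),
  age       = rnorm(300, mean = 60, sd = 10),
  smoker    = factor(sample(c("no", "yes"), 300, replace = TRUE))
)

# Cut the dependent variable into tertiles
# (change 'probs' to get other quantile groups).
breaks <- quantile(d$biomarker, probs = seq(0, 1, length.out = 4))
d$tertile <- cut(d$biomarker, breaks = breaks, include.lowest = TRUE,
                 labels = c("T1", "T2", "T3"))

# Continuous variable: mean and SD per tertile (2 x 3 matrix).
age_summary <- sapply(split(d$age, d$tertile),
                      function(x) c(mean = mean(x), sd = sd(x)))

# Categorical variable: counts and within-tertile percentages.
smoker_n   <- table(d$smoker, d$tertile)
smoker_pct <- 100 * prop.table(smoker_n, margin = 2)

print(round(age_summary, 1))
print(round(smoker_pct, 1))
```

Swapping `mean`/`sd` for `median`/`IQR` gives the median/IQR variant.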

This is fine for an overview. But I can't help feeling that it would be much nicer if, in one extra column, I could plot a scatterplot plus smoother for continuous and a barplot for categorical independent variables. However, I have no idea how to create this kind of tabular mix of text and graphics with R. I know I could use text() and position the text by pixels, or I could make the text table separately, batch-create the graphics, and then stitch them together by hand. That seems very inelegant to me, though, and I'd be surprised if there weren't a more automated, convenient way, as there often is with R.

I'm searching for something a bit like latex(describe()) from the Hmisc package (which is loaded along with rms), except that I want the numerical summary by tertiles (or arbitrary quantiles), and instead of a histogram I want a scatterplot or barplot.

Here is what our usual table looks like. I'd be happy to add one more column on the right with a meaningful graphical depiction of the relationship, but I don't know how to do it with R.

[Image: tertilesummary]

miura

1 Answer


What you are describing is essentially what sparklines are. The question Plotting Sparklines in R on this site lists various resources for R, and I include many more references in my blog post on this site, Some notes on making effective tables. In particular, the Sparklines for Excel add-on has the widest variety of example plots to peruse that I am aware of.

I can't really say much about your particular application (I don't really get what you are doing!), and some plots contain too much information to be shrunk into such tiny spaces. But hopefully this shows what is possible and encourages some experimentation to see whether any of those representations are reasonable for your application.

Andy W
  • This is very nice, thanks a lot. Also the notes on table making were valuable. I wonder: Can I show different "sparkline" types in the same column with this package, i.e. lines for continuous and bars for discrete independent variables? What we're doing is we try to give some exploratory overview of the crude relationship of our dependent variable, most likely some hot new biomarker, with loads of other variables such as other biomarkers, demographic variables, disease status. – miura May 31 '12 at 07:50
  • @miura, I'm not sure. I only know of the packages; I have never used them. After quickly perusing some examples from both sparkTable and the YaleToolkit, I would say the sparkTable package (which I believe exports HTML or LaTeX tables) is the best bet for accomplishing this. It may be a good question for the R gurus on Stack Overflow. – Andy W May 31 '12 at 12:08
  • @Andy W, thanks. sparkTable looks very nice, although it appears you have to decide on one type of plot per column. It would probably not take too much effort to achieve mixed plot types in a column manually with what sparkTable provides. – miura Jun 01 '12 at 11:14
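
For illustration, mixing plot types within one column can also be sketched with base graphics alone. This does not use sparkTable; it is only a rough sketch built on `layout()`. The data frame `d`, the helper `mini_plot()`, and the choice to show group means of the dependent variable in the barplot are all assumptions made for the example.

```r
# Hypothetical data, for illustration only.
set.seed(1)
d <- data.frame(
  biomarker = rnorm(300),
  age       = rnorm(300, mean = 60, sd = 10),
  smoker    = factor(sample(c("no", "yes"), 300, replace = TRUE))
)

# Hypothetical helper: draws one mini panel and returns the plot type used.
# Numeric x: scatterplot with a lowess smoother.
# Otherwise: barplot of the mean of y within each level of x.
mini_plot <- function(x, y) {
  if (is.numeric(x)) {
    plot(x, y, pch = 16, cex = 0.3, axes = FALSE, xlab = "", ylab = "")
    lines(lowess(x, y), lwd = 2)
    "scatter"
  } else {
    barplot(tapply(y, x, mean), axes = FALSE, axisnames = FALSE)
    "bar"
  }
}

vars <- c("age", "smoker")
out  <- tempfile(fileext = ".pdf")
pdf(out, width = 4, height = 1.2 * length(vars))

# One row per variable: text cell on the left, mini plot on the right.
layout(matrix(seq_len(2 * length(vars)), ncol = 2, byrow = TRUE),
       widths = c(1, 2))
par(mar = rep(0.5, 4))

types <- character(0)
for (v in vars) {
  plot.new()
  text(0.5, 0.5, v)  # the text cell; summary numbers could be pasted in here
  types[v] <- mini_plot(d[[v]], d$biomarker)
}
dev.off()
```

The text cell only holds the variable name here. The tertile summaries could be formatted with `sprintf()` and written into further text columns in the same way.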