
I am doing a study to compare the effect of quarterly GDP on the number of successful loans issued in each US state. Over a five-year period, I will have 20 time points, with 51 values at each one (loan counts for 50 states plus 1 GDP figure). What is the best technique for visualizing this data? I thought of a line graph with GDP and loan count on the y-axis and time on the x-axis, but that would give me 50 different graphs. Is there a more sophisticated way to see everything on one graph, or perhaps two? Or would it be a good approach to compare all 50 and then combine similar ones?

Nick Cox

1 Answer


No one plot can do everything, especially for a dataset of nontrivial complexity, and this dataset is indeed nontrivially complex. Here are some ideas:

  • A spaghetti plot. By this, I mean one line graph per state of the number of successful loans versus time, all plotted on top of each other, and then a line graph of GDP versus time on top of that. You won't be able to make out much about individual states, but you may be able to spot overall patterns.
  • A plot like the spaghetti plot, but without lines connecting the points, except in the case of the GDP. You can add a boxplot at each timepoint showing the distribution across states.
  • One dotplot per timepoint, each with one dot per state.
  • 50 different loans versus time plots, each with the GDP plotted on the same set of axes. This is useful for inspecting states individually.
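For the boxplot idea above, the per-timepoint distribution across states can be computed before any plotting. The following is only a sketch: the function name `quarter_summaries`, the dict-of-lists layout, and the example numbers are all assumptions made for illustration, and plotting libraries may use different quartile and whisker conventions than the ones chosen here.

```python
from statistics import quantiles


def quarter_summaries(loans):
    """Summarize the cross-state distribution at each time point.

    loans: dict mapping a state code to a list of values, one per quarter
    (a hypothetical layout). Returns one (min, q1, median, q3, max) tuple
    per quarter, i.e. the numbers a basic boxplot would display.
    """
    n_quarters = len(next(iter(loans.values())))
    summaries = []
    for t in range(n_quarters):
        # Collect every state's value for quarter t.
        values = sorted(series[t] for series in loans.values())
        q1, med, q3 = quantiles(values, n=4, method="inclusive")
        summaries.append((values[0], q1, med, q3, values[-1]))
    return summaries


# Made-up numbers for three states over two quarters, purely illustrative.
example = {"OH": [10, 12], "WY": [2, 3], "CA": [30, 27]}
for quarter, summary in enumerate(quarter_summaries(example), start=1):
    print(quarter, summary)
```

With 50 states, each tuple condenses 50 points into five numbers, which is what keeps that plot readable next to the GDP line.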

In any case, it may be easier to make comparisons between states if you normalize the number of successful loans somehow, as by dividing by the state population.
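As a minimal sketch of that normalization, assuming loan counts and populations are held in plain dicts keyed by state code (the function name `loans_per_100k`, the scaling to 100,000 residents, and every number below are made up for illustration):

```python
def loans_per_100k(loans, population):
    """Convert raw loan counts to loans per 100,000 residents.

    loans: dict mapping state code -> list of counts, one per quarter.
    population: dict mapping state code -> population (assumed constant
    over the period, which is a simplification).
    """
    return {
        state: [count * 100_000 / population[state] for count in series]
        for state, series in loans.items()
    }


# Hypothetical counts and populations, not real data.
loans = {"OH": [1200, 1500], "WY": [30, 90]}
population = {"OH": 12_000_000, "WY": 600_000}
rates = loans_per_100k(loans, population)
print(rates)
```

On the raw scale the larger state dominates any shared axis; after dividing by population the two series fall in a comparable range and can be overlaid.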

Kodiologist
  • +1. In this case you can perhaps identify 5-10 states by identifiers (e.g. OH, WY) without crowding the graph unnecessarily. So, perhaps you can show 8 or 9 graphs (2 x 4 or 3 x 3 layout) with a bundle of states in each. A variant is to show all time lines as a backdrop in grey. 50 graphs for 50 states is a clear design, but a strain on readers' powers of synthesis. – Nick Cox Oct 16 '16 at 18:17
  • See http://stats.stackexchange.com/questions/190152/visualising-many-variables-in-one-plot for various solutions to the spaghetti problem, except that no example there has 50 series. That's why I am suggesting a half-way house. – Nick Cox Oct 16 '16 at 22:11
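The half-way house described in these comments, a handful of panels with a bundle of states in each, needs the 50 states split into groups first. One possible sketch follows; the function name `bundle_states` and the placeholder state codes are assumptions, and how to order the states beforehand (for example by average loan rate, so that similar states share a panel) is left open.

```python
def bundle_states(states, n_panels=9):
    """Split a list of state codes into n_panels groups of near-equal size.

    The input order is preserved, so sort the list first if similar
    states should end up in the same panel.
    """
    size, extra = divmod(len(states), n_panels)
    bundles, start = [], 0
    for i in range(n_panels):
        # The first `extra` panels take one additional state each.
        end = start + size + (1 if i < extra else 0)
        bundles.append(states[start:end])
        start = end
    return bundles


# Placeholder codes standing in for the 50 real state abbreviations.
states = [f"S{i:02d}" for i in range(50)]
panels = bundle_states(states, n_panels=9)
print([len(panel) for panel in panels])
```

Each bundle would then be drawn in its own panel of a 3 x 3 layout, optionally with all 50 series in grey behind it.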