
I have a dataset that contains information about the prevalence of two mutually exclusive cultural traits over time in different study regions.

It looks roughly like this (in tall format):

|  region  |   year   |   trait  |  proportion |
|----------|----------|----------|-------------|
| regionA  |  -2200   |  traitA  |     0.1     |
| regionA  |  -2200   |  traitB  |     0.9     |
| regionB  |  -2200   |  traitA  |     0.7     |
| regionB  |  -2200   |  traitB  |     0.3     |
| regionA  |  -2199   |  traitA  |     0.15    |
| regionA  |  -2199   |  traitB  |     0.85    |
|   ...    |   ...    |   ...    |     ...     |
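A VAR needs one column per series, so the tall table has to be reshaped into a wide matrix. The following is a minimal sketch in base R that uses only the example rows above. The data frame name `tall` and the decision to keep only `traitA` are assumptions for illustration; the latter rests on the two traits being mutually exclusive, so their proportions sum to 1.

```r
# Hypothetical data frame `tall`, holding only the example rows shown above
tall <- data.frame(
  region     = c("regionA", "regionA", "regionB", "regionB", "regionA", "regionA"),
  year       = c(-2200, -2200, -2200, -2200, -2199, -2199),
  trait      = c("traitA", "traitB", "traitA", "traitB", "traitA", "traitB"),
  proportion = c(0.1, 0.9, 0.7, 0.3, 0.15, 0.85)
)

# The two traits sum to 1, so one of them carries all the information
one_trait <- tall[tall$trait == "traitA", ]

# Rows = years, columns = regions; combinations absent from the data become NA
wide <- tapply(one_trait$proportion,
               list(one_trait$year, one_trait$region),
               mean)
```

Here `wide` has one row per year and one column per region, which is the layout that time series functions generally expect.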

In the real dataset there are 8 regions and 2 traits. It spans over 1400 years. The development in the regions is correlated, but it seems to be more complicated than a simple spatiotemporal diffusion of innovation. I would like to know which regions influence the others or, more precisely, which ones lead others to change.

Can I apply vector autoregression (VAR) to answer this question? If so, which tests do I have to run beforehand? Which VAR settings are suitable for my question and the data? Which methods should I use to obtain parameters and plots for analysing the results of the VAR model effectively?

I believe I would have to reduce the temporal resolution, because I don't expect the year-wise information to be meaningful. A resolution of one measurement per generation is more interesting. That would leave me with a time series of fewer than 50 observations.
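The reduction to one value per generation can be sketched as follows. This is only an illustration under stated assumptions: the yearly matrix is filled with made-up placeholder values (not the real data), only two regions are shown, the generation length of 30 years is an assumed value, and the names `yearly`, `gen_length` and `per_gen` are hypothetical.

```r
# Placeholder yearly data: 1400 years x 2 regions (made-up values, not the real data)
set.seed(1)
years  <- seq(-2200, by = 1, length.out = 1400)
yearly <- cbind(regionA = runif(1400), regionB = runif(1400))

# Assumed generation length of 30 years; this is an illustrative choice
gen_length <- 30

# Integer bin index per year: 0 for the first 30 years, 1 for the next 30, ...
generation <- (years - min(years)) %/% gen_length

# Mean proportion per generation, computed separately for each region (column)
per_gen <- apply(yearly, 2, function(x) tapply(x, generation, mean))
```

With these assumptions the last bin is shorter than the others (20 years instead of 30), so it may be worth dropping it or choosing a start year that divides the range evenly.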

In a proof-of-concept test I did exactly this and estimated a very basic VAR for one of the two variables with the vars package in R. The result is promising, but I don't want to look deeper into this before I get some feedback. Time series analysis is pretty new to me.

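```r
# First-difference each column of bt
bt_diff <- sapply(bt, diff)
# VAR with lag order 1, including both a constant and a linear trend
varfit <- vars::VAR(bt_diff, type = "both", p = 1)
# Plot the forecast error variance decomposition of the fitted model
plot(vars::fevd(varfit))
```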
nevrome