I have a dataset that contains information about the prevalence of two mutually exclusive cultural traits over time in different study regions.
Imagine it looks something like this (in tall format):
| region | year | trait | proportion |
|----------|----------|----------|-------------|
| regionA | -2200 | traitA | 0.1 |
| regionA | -2200 | traitB | 0.9 |
| regionB | -2200 | traitA | 0.7 |
| regionB | -2200 | traitB | 0.3 |
| regionA | -2199 | traitA | 0.15 |
| regionA | -2199 | traitB | 0.85 |
| ... | ... | ... | ... |
In the real dataset there are 8 regions and 2 traits. It spans over 1400 years. The developments in the regions are correlated, but the pattern seems to be more complicated than a simple spatiotemporal diffusion of innovation. I would like to know which regions influence the others or, more precisely, which ones lead others to change.
Can I apply Vector Autoregression to answer this question? If so, which tests do I have to apply beforehand? Which VAR settings are suitable for my question and the data? Which methods should I use to get parameters and plots to effectively analyse the results of the VAR model?
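To make the "tests beforehand" part concrete: the checks most often mentioned for a VAR are a unit root test per series and an information-criterion search for the lag order. The following is only a minimal sketch on simulated data, not on the real dataset. The three region names, the 45 observations, and the random-walk stand-in `lev` are assumptions; `urca::ur.df()` and `vars::VARselect()` are existing functions (urca is installed as a dependency of vars).

```r
# Requires the vars package (urca is installed as one of its dependencies).
# Simulated stand-in: 45 "generations" for 3 hypothetical regions (random walks).
set.seed(42)
n <- 45
lev <- apply(matrix(rnorm(n * 3, sd = 0.05), ncol = 3), 2, cumsum)
colnames(lev) <- c("regionA", "regionB", "regionC")
dif <- diff(lev)  # first differences, one row fewer than lev

# Augmented Dickey-Fuller statistic per series (null hypothesis: unit root).
adf_stat <- function(x) urca::ur.df(x, type = "drift", lags = 1)@teststat[1, 1]
adf_levels <- apply(lev, 2, adf_stat)
adf_diffs  <- apply(dif, 2, adf_stat)

# Critical values to compare the statistics against.
urca::ur.df(lev[, "regionA"], type = "drift", lags = 1)@cval[1, ]

# Lag order suggested by AIC, HQ, SC and FPE. lag.max is kept small because
# each additional lag adds K^2 coefficients and the series is short.
sel <- vars::VARselect(dif, lag.max = 3, type = "const")
sel$selection
```

If the levels look non-stationary and the differences stationary, differencing before fitting (as in the snippet at the end of this post) is the usual route. With 8 regions and fewer than 50 observations, a low lag order is probably all the data can support.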
I believe I would have to reduce the temporal resolution, because I don't expect the year-by-year information to be meaningful. A resolution of one measurement per generation would be more informative. That would leave me with a time series of fewer than 50 observations.
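The reduction to one value per generation could look roughly like the sketch below. It uses toy data in the tall format shown above. The generation length of 30 years, the exact year range, the three regions, and the names `tall`, `gen_length` and `wide` are assumptions for illustration only. Because the two traits are mutually exclusive and their proportions sum to 1, only one trait is kept.

```r
# Toy data in the tall format (simulated values, not the real dataset).
set.seed(1)
years   <- -2200:-801  # 1400 yearly steps (assumed range)
regions <- c("regionA", "regionB", "regionC")
tall <- expand.grid(year = years, region = regions, stringsAsFactors = FALSE)
tall$trait <- "traitA"  # traitB is 1 - traitA, so it carries no extra information
tall$proportion <- runif(nrow(tall))

# Assumption: one generation corresponds to 30 years.
gen_length <- 30
tall$generation <- (tall$year - min(tall$year)) %/% gen_length

# Mean proportion per generation (rows) and region (columns).
wide <- as.data.frame(
  tapply(tall$proportion, list(tall$generation, tall$region), mean)
)
dim(wide)  # 47 generations x 3 regions
```

Averaging within bins is only one possible choice; taking the value at the midpoint of each generation would be another.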
In a proof-of-concept test I did exactly this and estimated a very basic VAR of one of the two variables with the vars package in R. The result is promising, but I don't want to look deeper into this before I get some feedback. Time series analysis is pretty new to me.
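```r
# Assumed: bt is a data frame with one column per region (one row per generation)
bt_diff <- sapply(bt, diff)  # first differences of each regional series
# VAR(1) with constant and linear trend
varfit <- vars::VAR(bt_diff, type = "both", p = 1)
# Forecast error variance decomposition
plot(vars::fevd(varfit))
```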
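
Besides the variance decomposition, the vars package also offers Granger causality tests, impulse response functions and residual diagnostics, which seem closest to the "which region leads" question. The sketch below runs them on simulated data in which a hypothetical regionA leads regionB by construction. The data-generating coefficients and all object names are assumptions. `vars::causality()`, `vars::irf()` and `vars::serial.test()` are existing functions.

```r
# Simulated stand-in: 45 observations, regionA feeds into regionB with one lag.
set.seed(42)
n <- 45
e <- matrix(rnorm(n * 3), ncol = 3)
y <- matrix(0, nrow = n, ncol = 3,
            dimnames = list(NULL, c("regionA", "regionB", "regionC")))
for (t in 2:n) {
  y[t, "regionA"] <- 0.5 * y[t - 1, "regionA"] + e[t, 1]
  y[t, "regionB"] <- 0.4 * y[t - 1, "regionB"] + 0.5 * y[t - 1, "regionA"] + e[t, 2]
  y[t, "regionC"] <- 0.3 * y[t - 1, "regionC"] + e[t, 3]
}

fit <- vars::VAR(y, p = 1, type = "const")

# Granger causality: do lags of regionA help predict the other regions?
gc <- vars::causality(fit, cause = "regionA")
gc$Granger

# Impulse response of regionB to a shock in regionA, with bootstrap bands.
# Orthogonalised responses depend on the column order of y.
ir <- vars::irf(fit, impulse = "regionA", response = "regionB",
                n.ahead = 8, boot = TRUE, runs = 200)
plot(ir)

# Portmanteau test for remaining autocorrelation in the residuals
# (small-sample adjusted version).
vars::serial.test(fit, lags.pt = 8, type = "PT.adjusted")
```

Repeating `causality()` with each region as `cause` would give one test per candidate "leader", though with 8 regions the multiple testing would need some care.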