Let's say I want to estimate the effect of wars on GDP. My data looks something like this, where War 1 is the first war in each country during the time period (so it is not the same war across countries), and Long War 1 is a dummy variable that takes the value 1 if that war lasted longer than 3 months and 0 otherwise:
Country | Year | Years since War 1 | Years since War 2 | Long War 1 | GDP |
---|---|---|---|---|---|
Afghanistan | 1970 | 50 | 25 | 0 | 3,000,000 |
Iraq | 1970 | -5 | 2 | 1 | 4,000,000 |
Of course, a war that has not happened yet in a given year cannot have any effect on GDP in that year. So I recoded the data above to NA for any year in which the war has not happened yet:
Country | Year | Years since War 1 | Years since War 2 | Long War 1 | GDP |
---|---|---|---|---|---|
Afghanistan | 1970 | 50 | 25 | 0 | 3,000,000 |
Iraq | 1970 | NA | 2 | NA | 4,000,000 |
I am trying to estimate the following regression:
lm(GDP ~ YearsSinceWar1 + YearsSinceWar2 + LongWar1 + LongWar2 + YearsSinceWar1*LongWar1, data = mydata)
Of course, R doesn't let me run this because of the NA values. Is there any way around this to get the regression I want? The only alternative I can see is setting the NA values to 0, but this comes with its own set of problems (like also coding LongWar1 as 0, which would wrongly suggest a short war, since this is a dummy variable).
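To make the two options concrete, here is a minimal sketch on toy data. The values are made up and are not my real data, and `War1Happened`, `YearsSinceWar1_0` and `LongWar1_0` are hypothetical helper columns introduced only for illustration. As far as I understand, `lm()` by default drops every row that contains an NA (`na.action = na.omit`) and only fails outright when no complete rows remain. The second fit shows the zero-coding alternative with an added indicator for whether the war has happened, which is one possible way to keep pre-war rows distinct from short wars.

```r
# Toy data with made-up values (NOT the real data set); NA marks a war
# that has not happened yet in that country-year.
mydata <- data.frame(
  Country        = c("A", "A", "A", "B", "B", "B", "C", "C"),
  Year           = c(1970, 1971, 1972, 1970, 1971, 1972, 1970, 1971),
  YearsSinceWar1 = c(NA, 0, 1, 3, 4, 5, NA, NA),
  LongWar1       = c(NA, 1, 1, 0, 0, 0, NA, NA),
  GDP            = c(4.0, 3.5, 3.7, 3.0, 3.1, 3.3, 5.0, 5.2)
)

# Option 1: default NA handling. Rows with any NA are silently dropped,
# so every pre-war observation is lost.
fit_default <- lm(GDP ~ YearsSinceWar1 * LongWar1, data = mydata)
nobs(fit_default)  # 5 of the 8 toy rows are used

# Option 2 (assumption, for illustration): zero-code the NAs and add a
# hypothetical indicator so that "no war yet" is not read as "short war".
mydata$War1Happened     <- as.integer(!is.na(mydata$YearsSinceWar1))
mydata$YearsSinceWar1_0 <- ifelse(is.na(mydata$YearsSinceWar1), 0, mydata$YearsSinceWar1)
mydata$LongWar1_0       <- ifelse(is.na(mydata$LongWar1), 0, mydata$LongWar1)

fit_zero <- lm(
  GDP ~ War1Happened + YearsSinceWar1_0 + LongWar1_0 +
    YearsSinceWar1_0:LongWar1_0,
  data = mydata
)
nobs(fit_zero)  # all 8 toy rows are used
```

In the second fit the war terms are all zero for pre-war rows, so those rows only inform the intercept, and `War1Happened` picks up the shift once the war has occurred. Whether that is a valid way to specify the model is part of what I am asking.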
Thanks!