I am trying to run fixed effects and random effects regressions on data that is not a true panel but rather a set of independent cross-sections over several years. In addition, the yearly cross-sections differ in size. The dataset consists of syndicated loans, one per row, with loan-level variables and borrower-country-level variables. I'd like to run country fixed and random effects models while also controlling for year and industry fixed effects. In the end I aim to compare their results with each other and with the results from a pooled OLS model.
My question is: how can I apply pdata.frame() so that I can estimate the models with plm() when my data has multiple observations (loans) per country-year pair? If I use country and year as the index, I get the following warning:
Warning in pdata.frame(df, index = c("borrower_country", "year")) :
duplicate couples (id-time) in resulting pdata.frame
to find out which, use, e.g., table(index(your_pdataframe), useNA = "ifany")
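For reference, here is a minimal sketch that reproduces the situation with made-up data. Only the column names borrower_country and year match my real data; the industry and spread columns and all values are hypothetical placeholders. The pdata.frame() call is guarded so the snippet still runs if plm is not installed.

```r
# Toy data (invented for illustration): several loans per country-year pair
df <- data.frame(
  borrower_country = c("US", "US", "US", "DE", "DE", "FR"),
  year             = c(2010, 2010, 2011, 2010, 2010, 2011),
  industry         = c("tech", "energy", "tech", "auto", "auto", "retail"),  # hypothetical column
  spread           = c(150, 220, 180, 130, 140, 200)                         # hypothetical column
)

# Count loans per (country, year) cell; any count above 1 is a duplicated id-time pair
cell_counts <- table(df$borrower_country, df$year)
print(cell_counts)

# With country and year as the index, plm reports the duplicated pairs
if (requireNamespace("plm", quietly = TRUE)) {
  pdf <- plm::pdata.frame(df, index = c("borrower_country", "year"))
}
```

In this toy data the US-2010 and DE-2010 cells each contain two loans, which mirrors the structure of my real dataset.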
I know that it is caused by duplicate id-time pairs, but since that is a central feature of the data, I don't know how to fix it. I know that panel data methods have been used in studies with a similar data structure, but I don't understand how to apply them to my data. I've searched online, on Stack Exchange and Stack Overflow, and in the textbooks listed below, but I can't find a solution.
Any help would be much appreciated.
- Wooldridge (2012). Introductory Econometrics: A Modern Approach.
- Wooldridge (2010). Econometric Analysis of Cross Section and Panel Data.
- Baltagi (2005). Econometric Analysis of Panel Data.
- Tsionas (2019). Panel Data Econometrics: Empirical Applications.
P.S. Would this be easier to solve in Stata? I'm used to working in R, but I'm willing to try Stata if it would help. Based on previous questions I've read, though, I suspect I'd face the same issue there.