Are there formal definitions that help distinguish between univariate, multivariate, cross-sectional, repeated/pooled cross-sectional, panel, and longitudinal analysis?
Considering my particular research as an example:
I am producing forecasts for only one predicted variable. The models I am building draw on varying combinations of explanatory variables (chronological data, weather data, demand data, production data, and more) across several time periods (period one is 2019, period two is 2018-2019, period three is 2010-2019). Each model considers anywhere from one to all of the variables.
From the definitions that I share in point 2 (below), it looks like the table is correct and that Longitudinal is used interchangeably with Panel. Also, repeated/pooled cross-sectional seems to sit between cross-sectional and panel/longitudinal. Am I right?
But how would you categorize the research that I am doing? Does it fit Longitudinal Analysis?
1
In a conversation with a teacher, it seemed that the difference between Univariate, Multivariate and Longitudinal is the following
However, in that case, it is not necessarily considering Cross-Sectional, Repeated/Pooled Cross-Sectional, nor Panel Analysis (it depends on how one defines them).
2
Then I did some quick research to find definitions for the terms and to try to validate the table above.
• Univariate: from this Stata blog entry, we read
Univariate time series data typically arise from the collection of many data points over time from a single source, such as from a person, country, financial instrument, etc.
• Multivariate: from Wikipedia, we read
Multivariate analysis is based on the principles of multivariate statistics, which involves observation and analysis of more than one statistical outcome variable at a time.
• Cross-sectional: from Wikipedia, we read
Cross-sectional data, or a cross section of a study population, in statistics and econometrics is a type of data collected by observing many subjects (such as individuals, firms, countries, or regions) at the one point or period of time.
• Pooled Cross Sectional: From this material we read that it is
Randomly sampled cross sections of individuals at different points in time.
• Longitudinal: The blog entry referenced above also defines Longitudinal as follows
Longitudinal data typically arise from collecting a few observations over time from many sources, such as a few blood pressure measurements from many people.
We can complement it with this article, where we read
Then longitudinal analysis is the study of collections of variables; in most applications the variables are strongly associated. We can associate each time point with a separate variable, in the spirit of the original definition of the term variable.
• Panel: From this article it looks like Panel Analysis is the same as Longitudinal Analysis
Longitudinal or panel data analysis refers to the statistical analysis of pooled data which consists of a cross‐section of units (e.g., countries, firms, households, individuals) for which there exist repeated observations over time.
This Wikipedia page goes along the same lines
In statistics and econometrics, panel data and longitudinal data are both multi-dimensional data involving measurements over time.
And this Quora answer as well
longitudinal or panel data, observations of multiple phenomena over multiple time periods for the same data units; involves repeated observations of the same variables (e.g., people) over periods of time; can prove cause and effect, but time-consuming and expensive
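To make the quoted definitions concrete, here is a minimal sketch in Python with pandas that labels a long-format table by how many units and time points it contains. The column names `entity` and `time`, the helper `describe_structure`, and the toy data are all hypothetical, and the rule itself is a simplification of the definitions above rather than a formal criterion: it treats any unit observed at more than one time point as evidence of panel/longitudinal data, even though a pooled cross section could resample the same unit by chance.

```python
import pandas as pd


def describe_structure(df, entity="entity", time="time"):
    """Label a long-format table using the informal definitions quoted above.

    `entity` and `time` are assumed column names, not a standard.
    """
    n_entities = df[entity].nunique()
    n_periods = df[time].nunique()

    # One source followed over many time points.
    if n_entities == 1 and n_periods > 1:
        return "time series (single source)"
    # Many subjects observed at one point in time.
    if n_periods == 1:
        return "cross-sectional"
    # Several units and several periods: are the same units followed over time?
    periods_per_entity = df.groupby(entity)[time].nunique()
    if (periods_per_entity > 1).any():
        return "panel / longitudinal"
    return "pooled (repeated) cross-sectional"


# Toy data, invented purely for illustration.
series = pd.DataFrame({"entity": ["A"] * 3, "time": [2017, 2018, 2019]})
cross = pd.DataFrame({"entity": ["A", "B", "C"], "time": [2019] * 3})
pooled = pd.DataFrame({"entity": ["A", "B", "C", "D"],
                       "time": [2018, 2018, 2019, 2019]})
panel = pd.DataFrame({"entity": ["A", "B", "A", "B"],
                      "time": [2018, 2018, 2019, 2019]})

for name, frame in [("series", series), ("cross", cross),
                    ("pooled", pooled), ("panel", panel)]:
    print(name, "->", describe_structure(frame))
```

Note that the univariate/multivariate distinction is a separate axis in the quoted definitions: it concerns how many outcome variables are analysed, not how units and time points are arranged, so this sketch does not address it.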
Note
I went through a variety of Q&As across Stack Exchange (I will share some below) and, apart from finding only old content, I found neither formal definitions nor a comparison of all of the terms in one place. This question may serve as canonical.