
This question has been asked once on CV before, but the answer is imprecise and, in my opinion, does not really answer the question.

So: What are the assumptions for estimating a linear regression model via quantile regression?

To my understanding (and as several CV users have mentioned), quantile regression does not assume any specific distribution of the error terms - does that mean that, in a time series model, autocorrelation and heteroscedasticity do not have to be accounted for?

What about the other Gauss-Markov assumptions? I would assume that the assumption of no perfect multicollinearity has to be met when applying quantile regression, but does the model have to be linear in the parameters? I would assume that the linearity assumption only has to hold for the specific quantile being estimated.

Anyway, I cannot find support for any of these thoughts in the scientific literature, and I would appreciate a comprehensive answer. Thank you!

Richard Hardy
shenflow
  • You should link to the previous question that you mention. – Kodiologist Dec 31 '17 at 14:17
  • shenflow, Could you include the link to the other question? – Richard Hardy Feb 21 '18 at 19:08
  • https://stats.stackexchange.com/questions/47929/what-are-the-assumptions-for-quantile-regression The answer is really imprecise: the author does not say which properties of the estimators he is referring to. This in turn kind of makes the answer "wrong", in the sense that one cannot state those things in general. Heteroskedasticity, for example, has an effect on the efficiency of QR estimators. There are a lot of theoretical constructs that have been proposed to deal with this. – shenflow Feb 22 '18 at 09:52
  • Actually, one can differentiate quite precisely which assumptions have to be fulfilled for the QR estimators to inherit certain properties. There are very specific derivations of, for instance, consistency. To just say "it does not have strong assumptions when it comes to the error term" is not really an answer, at least in my opinion. – shenflow Feb 22 '18 at 09:52

1 Answer

Quantile regression assumes

  • the normal regression assumptions of linearity and additivity (unless you add more terms to the model)
  • independence of observations
  • very large sample size, as quantile regression is not very efficient
  • $Y$ is very continuous; quantile regression doesn't work well when there are many ties at one or more values of $Y$

You might also consider semiparametric regression models (e.g., proportional odds or proportional hazards models), which are more efficient and also allow you to estimate the mean.

My RMS course notes go a bit more into quantile and semiparametric regression in the chapter on ordinal models for continuous $Y$.
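
For readers who want the linearity assumption stated against a concrete model, here is a sketch of the standard linear quantile regression setup. The notation ($\tau$ for the quantile level, $\beta(\tau)$, $\rho_\tau$, $\sigma$) is not taken from the answer and is assumed here for illustration. Linearity is imposed only on the conditional quantile at the level being fitted, and the estimator minimizes the check (pinball) loss:

$$Q_\tau(Y \mid X = x) = x^\top \beta(\tau), \qquad \hat\beta(\tau) = \arg\min_{b} \sum_{i=1}^{n} \rho_\tau\!\left(y_i - x_i^\top b\right), \qquad \rho_\tau(u) = u\,\bigl(\tau - \mathbf{1}\{u < 0\}\bigr).$$

The objective itself names no error distribution. It does, however, coincide (up to constants) with the negative log-likelihood under i.i.d. errors with the asymmetric Laplace density

$$f(u) = \frac{\tau(1-\tau)}{\sigma}\,\exp\!\left(-\frac{\rho_\tau(u)}{\sigma}\right),$$

which is one way to see why the efficiency of the estimator depends on how far the actual errors are from that shape.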

Frank Harrell
  • Thank you for the answer, Mr Harrell! Is there any comprehensive literature on this? In most (basic) econometric textbooks I have read, quantile regression is not mentioned. The original paper on quantile regression by Koenker kind of implies what you are saying, but does not explicitly mention the assumptions (at least to my understanding). – shenflow Dec 31 '17 at 13:24
  • See the link I just added – Frank Harrell Dec 31 '17 at 13:54
  • Sorry, but I have a follow-up question @FrankHarrell. I am trying to prove consistency of quantile regression estimators. I have found the following lecture notes: https://eml.berkeley.edu/~powell/e241a_sp10/qrnotes.pdf . Assumption A1 says that the data has to be iid. In the case of OLS, the data only has to be covariance stationary. Does this imply that data showing some sort of autocorrelation structure (autoregressive processes) cannot be used to estimate consistent coefficients by applying quantile regression? – shenflow Jan 23 '18 at 14:08
  • Possibly but I'm not well versed in that aspect. More interested in small sample properties. But start with simplest case: one predictor $X$ that is categorical with $k$ levels. This is completely equivalent to computing $k$ ordinary sample quantiles. – Frank Harrell Jan 24 '18 at 13:06
  • Okay, I understand that. But the issue I am having is to define precisely in which cases quantile regression leads to which kind of estimators. I am particularly interested in quantile regression (QR) in the context of time series, and the work I have found so far is rather sparse and at a comparably high technical level. Keeping in mind that there are actually quite a lot of papers which use QR for research purposes and **do not mention any underlying assumptions**, it surprises me that there is no set of "standard assumptions" that researchers can refer to (like in the case of OLS). – shenflow Jan 24 '18 at 13:19
  • And in your answer you state that the observations have to be independent. This rules out autoregressive processes. Yet, there are applications of time series quantile regressions. – shenflow Jan 24 '18 at 13:27
  • This is an oversimplification but for some models the dependence can be ignored when estimating regression parameters. But it must be incorporated to get valid standard errors etc. – Frank Harrell Jan 24 '18 at 19:45
  • @FrankHarrell, https://stats.stackexchange.com/questions/429885/assumptions-of-l1-regression – jeza Oct 05 '19 at 22:14
  • @FrankHarrell, can the same be said of L1 regression? If yes, could you please point me to academic resources that I can refer to? – jeza Oct 05 '19 at 22:16
  • Switching to a different penalty function does not change the underlying model assumptions. It just adds an assumption that is equivalent to tilting the regression coefficients towards certain values (in this case, zero). – Frank Harrell Oct 06 '19 at 12:52
  • The last two bullet points seem more like practical tips rather than formal assumptions. I guess bullet point #3 is voided by focusing on asymptotics, which the statistical theory usually does anyway. Bullet point #4 is an interesting one; a more formal formulation would be interesting to see. – Richard Hardy Jan 06 '21 at 18:50
  • More on bullet point #3: The efficiency of quantile regression depends on the distribution of errors. The quantile regression estimator is the MLE for the asymmetric double exponential distribution (I learned this in the thread ["Probability distribution models compatible with quantile regression"](https://stats.stackexchange.com/questions/258362)). Being the MLE, it is (in a certain technical sense) the most efficient there is. – Richard Hardy Jan 06 '21 at 18:50
  • Which immediately tells you that quantile regression is inefficient because I've never seen a distribution that looks anything like a double exponential. – Frank Harrell Jan 07 '21 at 00:24
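
The simplest case raised in the comments (one categorical predictor with $k$ levels being equivalent to computing $k$ ordinary sample quantiles) can be checked numerically. The following is a minimal sketch, not anyone's reference implementation: the function names, the simulated data, and the group sizes are all hypothetical, and the group sizes are chosen so that $n\tau$ is never an integer, which keeps each minimizer unique.

```python
import math
import random


def check_loss(u, tau):
    """Check (pinball) loss: u * (tau - 1[u < 0])."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def qr_one_factor(x, y, tau):
    """Quantile regression of y on one categorical predictor x,
    with one free coefficient per level (saturated design).

    The summed check loss separates by level, and within a level a
    minimizer always exists among the observed values, so a search
    over those values is enough for this illustration.
    """
    groups = {}
    for level, value in zip(x, y):
        groups.setdefault(level, []).append(value)
    fit = {}
    for level, values in groups.items():
        fit[level] = min(
            values,
            key=lambda q: sum(check_loss(v - q, tau) for v in values),
        )
    return fit


def sample_quantile(values, tau):
    """Inverse-ECDF sample quantile: the ceil(n * tau)-th order statistic."""
    ordered = sorted(values)
    return ordered[math.ceil(len(ordered) * tau) - 1]


def make_data(seed=0):
    """Hypothetical data: three levels with different locations and spreads."""
    rng = random.Random(seed)
    sizes = {"a": 11, "b": 15, "c": 21}  # n * tau is non-integer for the taus below
    x, y = [], []
    for i, (level, n) in enumerate(sizes.items()):
        for _ in range(n):
            x.append(level)
            y.append(rng.gauss(i, 1 + i))
    return x, y


if __name__ == "__main__":
    x, y = make_data()
    for tau in (0.25, 0.5, 0.9):
        fit = qr_one_factor(x, y, tau)
        direct = {
            level: sample_quantile([v for l, v in zip(x, y) if l == level], tau)
            for level in set(x)
        }
        print(tau, fit == direct)
```

Note that the groups are given different spreads on purpose: the per-level quantile estimates do not require equal variances across levels, although, as discussed in the comments, efficiency and valid standard errors are a separate matter.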