3

I have noticed that, in many posts explaining the origin of the term "Regression" (e.g. HERE), the explanations basically revolve around how the term got stuck after a biologist named Francis Galton used it to describe some biological phenomena.

This suggests, as I understand, that the term "Regression" was adopted for historical reason, which means that there may be an alternative term that better describes "Regression" as we know it today.

As a non statistician/non-native English speaker, I wonder if such an alternative term does exist?

  • You could call it *curve-fitting*, but you're better off sticking to the term everyone else uses. – Dan Jan 02 '18 at 10:52
  • Thanks. I agree with you. However, knowing the best descriptive term will help me in translating it using the most appropriate term/word in a language where there is no conventional term, yet! – PatternRecognition Jan 02 '18 at 11:01
  • Then take Dan answer and use it only in your native language. I had the opposite experience: I learned the technique under the “fitting” name, and got lost the first few times I heard about regression. Besides, “curve-fitting”is appropriate only in 1D regression scenario. – famargar Jan 02 '18 at 11:16
  • @famargar you could say *hyper-surface fitting* if you really wanted, but I think *curve* is general enough. Most curve-fitting software packages handle multiple dimensions (e.g. https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.curve_fit.html) and I think it would be generally understood that it would generalize to hyper-surfaces. – Dan Jan 02 '18 at 12:20
  • 1
    @PatternRecognition then I would first make absolutely certain you really are the first person to ever translate *regression* to that language (seem unlikely no?) and if you are then why not just stick with the word regression itself. It doesn't hold a descriptive meaning in English and that doesn't seem to have prevented people from understanding it. – Dan Jan 02 '18 at 12:22
  • I would use the term 'statistical model' but to be specific enough it would be 'statistical model in which effects of predictors are separable and began with an additive basis'. – Frank Harrell Jan 02 '18 at 12:53
  • @Dan, Of course I am not the first, but different people use different translations, one of which is the literal translation of the word "Regression"! Knowing the best descriptive term helps in choosing the most appropriate translation, or even using a totally new one ! – PatternRecognition Jan 02 '18 at 12:58
  • Thanks @FrankHarrell. What 2/3 English words would you use if you were to suggest an alternative to the term Regression? curve-fitting? – PatternRecognition Jan 02 '18 at 13:10
  • 3
    Separate model specification from parameter estimation for a moment. Things like 'statistical model' and 'general linear model' are appropriate. A regression model is a statistical model with predictors that are assumed to at least partially operate additively. – Frank Harrell Jan 02 '18 at 13:14
  • What about *predictive-model*? – Dan Jan 02 '18 at 14:49
  • @Dan, this will be too generic and least descriptive. – PatternRecognition Jan 02 '18 at 16:18
  • Why not use the word "approximation" instead of "regression" ? – asmaier May 22 '20 at 16:12

1 Answers1

3

Estimate, predict, and model are three verbs that can often be used instead of "regression" in most sentences. Assess, measure, or infer are applications which may pertain to statistical testing based on such models. This solves a different problem of scientific writing being too obtuse. We cannot defenestrate "regression" for its historical context without throwing out the entire English language. However, in modern science it is given that the point of statistical tests and models is to separate signal from noise in data. This is the essential meaning that was abstracted by the term "regression".

Examples include:

"We estimated a generalized least squares model to describe mean differences in income earnings for various household structures in urban East Asian settings."

"Ordinary least squares was used to predict lower body functioning two years following transplantation."

"Hemoglobin A1c was modeled as a linear combination of age, sex, race, family history of diabetes from self-report questionnaires as well as triglycerides, blood pressure, creatinine, and serum cholesterol at follow-up."

"The association between traumatic exposure and development of PTSD was assessed with logistic regression controlling for age, military service, housing instability, and educational setting."

"We measured change in knowledge following the preparatory exam course using the T-test."

"We inferred vallium's superiority to oxycodone for pain management of cystic fibrosis using the log-rank test for time until readmission for symptoms of unmanaged pain."

AdamO
  • 52,330
  • 5
  • 104
  • 209