
I am looking at the relationship between Population and a measure of Accessibility. I have found an inverted-U relationship:

[Figure: inverted-U relationship]

Unfortunately, the fitted relationship would predict negative population in areas with very high Accessibility. This makes no intuitive sense, so I would like the model to have an asymptote at zero.

I don't know how to incorporate this into the regression. Could anyone please suggest an appropriate functional form to look at?

Chris
  • If you possibly can, use a theoretical argument to deduce the likely shape of the curve. Then apply it as shown in the example at https://stats.stackexchange.com/a/64039/919, which looks remarkably similar to me in that (a) your data apparently are counts and (b) the shape is qualitatively the same. If you have no such theory, then you are reduced to fitting splines or some other flexible (but relatively meaningless) family of curves. – whuber May 18 '17 at 15:44
  • That was an excellent answer. Unfortunately a binomial model won't work here since the y axis is actually a density of population in an area. Making this into a proportion of overall density wouldn't make too much sense. I did want to avoid the more black-box curves, but they may be unavoidable. – Chris May 18 '17 at 17:12
  • A "density of population" is a count divided by a (known) area. Model the counts. (The area can be included in the model as an "offset.") A Poisson model ought to work well for a start. – whuber May 18 '17 at 17:14
  • Thanks @whuber, I'm not clear what you mean by included as an 'offset'. The areas are different for each point on the plot so modelling them as counts makes them unrepresentative of the actual data. – Chris May 18 '17 at 17:22
  • Areas are not modeled as counts. Please see https://stats.stackexchange.com/questions/11182. – whuber May 18 '17 at 18:00
  • Say my initial model is glm(pop_density ~ accessibility, poisson) and my new model is glm(population ~ accessibility, offset = area, poisson). Is that correct? If so, unfortunately the fit is pretty hopeless. – Chris May 18 '17 at 18:45
  • Of course the fit is hopeless, because such models predict monotonic relationships and this obviously is not. The point is that you need to modify this model in one or both of two ways: (1) change the link function (as described in the reference I gave) and/or (2) employ more flexible regressors, such as splines or theoretically suggested functions (as is most often done). – whuber May 18 '17 at 18:48
  • If you are using fitting software that allows weighted fitting, you could weight a data point at the asymptote very heavily, with all other data points weighted at 1.0. In weighted fitting, the weights are inversely proportional to the uncertainty, so a small uncertainty means a large weight. While this can be done and might produce the desired result, I consider it an inferior but possibly workable solution to the problem. I mention it because I have used this technique when I was literally certain a fitted curve should pass through the origin but my data was noisy. – James Phillips Feb 10 '18 at 16:39
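
The offset idea from the comments can be sketched as follows. With a log link, the offset enters on the log scale, so the model is log E[count] = log(area) + linear predictor, and the predicted density exp(linear predictor) can never be negative. The quadratic term in accessibility is purely an assumption made for illustration (it is one simple "more flexible regressor", not something the thread or any theory prescribes); with a negative coefficient it gives an inverted-U density that decays toward zero. The function names are hypothetical, and the fit uses a plain-Python iteratively reweighted least squares loop rather than any particular library.

```python
import math

def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x

def fit_poisson_offset(x, counts, area, n_iter=25):
    """Fit log E[count_i] = log(area_i) + b0 + b1*x_i + b2*x_i**2 by IRLS.

    The quadratic form is an illustrative assumption, not a recommendation.
    """
    X = [[1.0, xi, xi * xi] for xi in x]
    offset = [math.log(a) for a in area]   # offset is log(area), not area
    mu = [c + 0.1 for c in counts]         # starting values, strictly positive
    beta = [0.0, 0.0, 0.0]
    for _ in range(n_iter):
        # Working response with the offset removed; weights equal mu (log link).
        z = [math.log(m) - o + (c - m) / m
             for m, o, c in zip(mu, offset, counts)]
        # Weighted normal equations: (X' W X) beta = X' W z.
        A = [[sum(m * r[j] * r[k] for m, r in zip(mu, X)) for k in range(3)]
             for j in range(3)]
        rhs = [sum(m * r[j] * zi for m, r, zi in zip(mu, X, z))
               for j in range(3)]
        beta = solve_linear(A, rhs)
        mu = [math.exp(o + sum(bj * rj for bj, rj in zip(beta, r)))
              for o, r in zip(offset, X)]
    return beta

def predicted_density(beta, x):
    """Expected count per unit area; the exponential is never negative."""
    return math.exp(beta[0] + beta[1] * x + beta[2] * x * x)
```

Calling fit_poisson_offset(accessibility, population, area) models the raw counts while the differing areas are handled by the offset, so nothing needs to be divided beforehand. If theory suggests a different curve, only the columns of X would change.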

0 Answers