1

I am trying to perform a relatively simple multiple regression with a single breakpoint using the segmented package in R. My question is how to handle the interaction between the predictor variables as follows.

My model is: $$Runoff = \beta_0+\beta_1(A+\beta_2P)$$ where there is a breakpoint at some value of $(A+\beta_2P)$. I expect $\beta_0$ and $\beta_1$ to be different on each side of the breakpoint, and $\beta_2$ to be the same (in fact I'll just say a priori that $\beta_2$ should not vary across the breakpoint, but this would be interesting to test)

My question is how to implement this using segmented (or a similar breakpoint linear regression model). Options are:
1. Run some loop through values of $\beta_2$ and create an intermediate variable $\tau = (A+\beta_2P)$, then it's a simple application of segmented. My concern is that I'd like to get a fitted value of $\beta_2$ from the data and this seems like a crude way of doing it.
2. Some clever way of using segmented on the fully explicit model that will estimate $\beta_0, \beta_1,$ and $\beta_2$
any help appreciated

ADDENDUM:

To describe what I'm trying to do. This is for a watershed runoff generation project. My hypothesis is that runoff amount is a function of both precipitation and soil water table position, and that there is a very strong threshold response in the latter. Whenever it rains, the water table goes up, but if it does not reach the threshold no runoff is generated. If enough rain falls to raise the water table above the threshold, then runoff is generated. Thus, runoff is a function of $(Ant + Pcp)$ with a notable breakpoint.

So, in the model as formulated in my original question $\beta_0$ and $\beta_1$ are the slope and intercept of the runoff response, and $\beta_2$ is a coefficient that converts rainfall depth to water table rise, which I assume is something like the porosity of the soil.

I'm interested in (in order of importance): (1) is this model better than just a linear relationship between runoff and precipitation?; (2) can the model estimate where the water table threshold is?; (3) Can the model estimate the "porosity" coefficient (this would be a nice check against physical reality and I think also improve the physical fidelity of the estimated threshold position).

Below is a graph using a hand-picked value of $\beta_2$, which let me do a piece wise linear regression. Solid line is the piece wise, dashed is a linear regression between runoff and precip alone. I'd like to be able to estimate the parameters of the model ($\beta_0, \beta_1, \beta_2$, and the breakpoint position) from the data. Ultimately, I have three replicates of three watershed types, so I'm interested in comparisons there as well.

enter image description here

Nakx
  • 431
  • 4
  • 20
Iceberg Slim
  • 261
  • 2
  • 11
  • Please reread your question. Where you said "*I expect β0 and β2 to be different on each side of the breakpoint, and β2 to be the same*" -- should one of those "*β2*" values have been β1? As specified your model (even without segments) is non-linear in the parameters, but a simple reparameterization will make it linear; you might consider dealing with it in the linear form – Glen_b Oct 11 '14 at 00:36
  • Ah, you are correct about the subscript error. Edited to fix. I'm afraid I am not following the implications of being non-linear in parameters. Is the reparameterization to multiply through to get $Runoff = \beta_0 + \beta_1 A + (\beta_1 \beta_2) P$? – Iceberg Slim Oct 13 '14 at 14:23
  • Almost: $\text{Runoff} = \beta_0 + \beta_1 A + (\beta_1 \beta_2) P= \beta_0 + \beta_1 A +\beta_3 P$. *Now* it's linear. – Glen_b Oct 13 '14 at 14:34
  • Got it. The trick is that I expect the breakpoint to be at some point in $(A+\beta_2P)$, and I'm not sure how to work that regression, particularly in a way that allows $\beta_2$ to be estimated from the data. It seems that I could do if if I could fit a model to $\beta_0 + \beta_1 A + \beta_3 P$ that had breakpoints in both $A$ and $P$, which I'm not sure how to do. – Iceberg Slim Oct 13 '14 at 15:25
  • You can't get a breakpoint in $(A+\beta_2P)$ by putting one in the individual $A$ and $P$ terms. It's like trying to cut your sandwich diagonally by putting a pair of cuts crosswise (parallel to the sides). By trying implement your attempted solution ("*use the `segmented` package*") instead of solve your original problem, we're in a dead end - we have to go back and solve your actual problem ... (ctd) – Glen_b Oct 13 '14 at 22:41
  • (ctd) ... `segmented` just can't deal with this problem; it's for models that are linear in parameters, but your problem seems to be nonlinear - and, because of where you want to have the breakpoint (which I hadn't clearly understood before) - it's not linearizable. You could formulate a breakpoint as a nonlinear least squares problem (in which case there should be no problem leaving the whole thing nonlinear in the parameters as in your original formulation). – Glen_b Oct 13 '14 at 22:42
  • But before we start down that path ... what do you want to do with the identified model? – Glen_b Oct 13 '14 at 22:43
  • Edited to add info. I may just rewrite this question more appropriately, as I had sort of pre-supposed a partial answer (segmented package). – Iceberg Slim Oct 16 '14 at 21:46
  • Sure thing, appreciate the help. – Iceberg Slim Oct 16 '14 at 22:19
  • Story of this analysis man, I guess it's time to dust off the linear alg. book. Thanks for taking the time! – Iceberg Slim Oct 17 '14 at 16:56
  • But you should still be able to fit it with a nonlinear regression. The potentially tricky part would be the inference (depending on which parts you might need say hypothesis tests or intervals for), but there may be a couple of ways to approach that. – Glen_b Oct 17 '14 at 20:39

0 Answers0