I have some temperature data gathered over the course of a few days which follow a cyclic pattern. I've fit a linear regression model to it with sine and cosine waves of multiple periods and the result is very close to what it should be.

My question is the following: the model produces a cyclic wave which matches the phase of my data (with peaks and troughs in the same place). How is this possible since I only vary the period length and amplitudes? I don't explicitly model the phase.

Is the phase modeled implicitly by the intercept of the linear model? My code (Python) is below; note that I rescale the time so that a full day (24 hours) corresponds to $2\pi$ radians.

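import numpy as np
from sklearn import linear_model

order = 10  # number of harmonics to include
var = temp  # observed temperatures
# Map the hour of day to an angle: 24 hours -> 2*pi radians
T = np.asarray([x.hour * np.pi / 12.0 for x in time])
sT = [np.sin((a + 1) * T) for a in range(order)]
cT = [np.cos((a + 1) * T) for a in range(order)]
# Design matrix: one row per observation, sine columns then cosine columns
X = np.vstack(sT + cT).transpose()

clf = linear_model.LinearRegression()
clf.fit(X, var)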
  • Check my question here: http://stats.stackexchange.com/questions/224990/whats-wrong-to-fit-periodic-data-with-polynomials – Haitao Du Jan 23 '17 at 20:46
  • I think the answer below addresses the question better than the thread above. Basically, you ARE modeling the phase and amplitude; check out the "sum and difference formulas" from trigonometry, particularly the one for $\cos$ – Taylor Jan 24 '17 at 05:48

1 Answer

The reason you can account for arbitrary phase is that you include both sine and cosine components as regressors.

One way to write a sinusoid with amplitude $a$, (angular) frequency $f$ and phase $\phi$ is:

$$a \sin(f t + \phi)$$

But you can also write it as a linear combination of a cosine and a sine at the same frequency:

$$w_1 \cos(f t) + w_2 \sin(f t)$$
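
To see why the two forms agree, one can expand the first with the angle-sum identity for sine. This is a standard trigonometric step, spelled out here for convenience rather than taken from the answer itself:

$$a \sin(f t + \phi) = a \sin\phi \, \cos(f t) + a \cos\phi \, \sin(f t)$$

Matching terms against the linear combination gives $w_1 = a \sin\phi$ and $w_2 = a \cos\phi$, so the phase is carried entirely by the ratio of the two weights.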

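For any choice of $a$ and $\phi$ (in the first representation), there are corresponding weights $w_1$ and $w_2$ (in the second representation) such that the signals are identical. The relationships are:

$$a = \sqrt{w_1^2 + w_2^2}$$

$$\phi = \tan^{-1}\frac{w_1}{w_2}$$

The inverse tangent has to be taken in the correct quadrant, using the signs of both $w_1$ and $w_2$ (i.e. $\phi = \operatorname{atan2}(w_1, w_2)$); otherwise the recovered phase can be off by $\pi$.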
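
As a rough numerical check of these relationships, the sketch below fits cosine and sine regressors to a noiseless sinusoid with a known phase and then converts the fitted weights back to amplitude and phase. It assumes $f = 1$, uses `np.linalg.lstsq` with no intercept instead of scikit-learn, and the helper name `weights_to_amp_phase` and the test values are made up for illustration.

```python
import numpy as np

def weights_to_amp_phase(w1, w2):
    """Convert the cosine weight w1 and sine weight w2 to (amplitude, phase).

    Hypothetical helper: arctan2 is used instead of a plain arctan so the
    phase lands in the correct quadrant.
    """
    a = np.hypot(w1, w2)      # sqrt(w1**2 + w2**2)
    phi = np.arctan2(w1, w2)  # angle whose sine ~ w1 and cosine ~ w2
    return a, phi

# Synthetic signal with a known amplitude and phase (assumed values)
a_true, phi_true = 2.5, 2.0
t = np.linspace(0.0, 2.0 * np.pi, 200, endpoint=False)
y = a_true * np.sin(t + phi_true)

# Regress on cos(t) and sin(t) only; no phase parameter appears anywhere
X = np.column_stack([np.cos(t), np.sin(t)])
w, *_ = np.linalg.lstsq(X, y, rcond=None)

a_hat, phi_hat = weights_to_amp_phase(w[0], w[1])
print(a_hat, phi_hat)
```

The true phase here is deliberately chosen so that $w_2 < 0$; a plain `np.arctan(w[0] / w[1])` would return a value shifted by $\pi$ in that case, which is why `np.arctan2` is used.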

user20160