I came across an interview question about linear regression:
If the amount of data is very large and it cannot all be held in memory at once, why can we break it down into a few parts for the calculation?
I don't really understand what "break it down into a few parts for the calculation" means here.
I have only heard of the partitioned regression model, where the partition is along the feature (column) direction. How can we partition along the data (sample) direction?
Does that mean we divide the samples into $D_1$ and $D_2$, fit the coefficients on each part independently to get $\theta_1$ and $\theta_2$, and then take $\theta = (\theta_1 + \theta_2)/2$ as the final solution?
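
To make the question concrete, here is a minimal sketch of the two readings I can think of, on synthetic data. Everything in it is an assumption made for illustration: the data, the names `ols`, `theta_avg` and `theta_acc`, and the even split into two halves. The first reading is the averaging scheme described above. The second accumulates $X^\top X$ and $X^\top y$ part by part and solves the normal equations once at the end, which seems to be another way to "break the data into parts".

```python
import numpy as np

# Synthetic data (assumption: small enough to fit in memory, so that the
# chunked results can be compared against a fit on the full data).
rng = np.random.default_rng(0)
n, p = 1000, 3
X = rng.normal(size=(n, p))
theta_true = np.array([1.5, -2.0, 0.5])
y = X @ theta_true + 0.1 * rng.normal(size=n)


def ols(X, y):
    """Ordinary least squares fit via numpy's least-squares solver."""
    return np.linalg.lstsq(X, y, rcond=None)[0]


# Reference: fit on all samples at once.
theta_full = ols(X, y)

# Split the samples (rows) into D_1 and D_2.
X1, X2 = X[: n // 2], X[n // 2:]
y1, y2 = y[: n // 2], y[n // 2:]

# Reading 1: fit each part independently, then average the coefficients.
theta_avg = (ols(X1, y1) + ols(X2, y2)) / 2

# Reading 2: accumulate X^T X and X^T y over the parts, then solve once.
XtX = np.zeros((p, p))
Xty = np.zeros(p)
for Xk, yk in ((X1, y1), (X2, y2)):
    XtX += Xk.T @ Xk
    Xty += Xk.T @ yk
theta_acc = np.linalg.solve(XtX, Xty)

print("full data:  ", theta_full)
print("averaged:   ", theta_avg)
print("accumulated:", theta_acc)
```

In this sketch the accumulated version should agree with the full-data fit up to floating-point error, because $X^\top X = X_1^\top X_1 + X_2^\top X_2$ and $X^\top y = X_1^\top y_1 + X_2^\top y_2$. The averaged version is close here but is not in general the same as the full-data least-squares solution. I am not sure which of the two the interviewer had in mind.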