Questions tagged [linear-model]

Refers to any model where a random variable is related to one or more random variables by a function that is linear in a finite number of parameters.

A linear model is any model where a random variable, $Y$, is related to one or more random variables, $X$, by a function that is linear in a finite number of parameters. That is, the parameters being estimated are all coefficients. Note that it does not matter if the resulting function looks like a straight line.

2189 questions
160 votes, 9 answers

When is it ok to remove the intercept in a linear regression model?

I am running linear regression models and wondering what the conditions are for removing the intercept term. In comparing results from two different regressions where one has the intercept and the other does not, I notice that the $R^2$ of the…
analyticsPierce
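A minimal sketch of the comparison described in this question, using simulated data and statsmodels (both my own choices; the post names neither). It illustrates why the two $R^2$ values are not directly comparable:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
x = rng.uniform(1, 10, 200)
y = 3.0 + 2.0 * x + rng.normal(0, 1, 200)  # true intercept is 3

with_int = sm.OLS(y, sm.add_constant(x)).fit()  # intercept included
no_int = sm.OLS(y, x).fit()                     # regression through the origin

# Without an intercept, statsmodels computes R^2 against the
# uncentered total sum of squares, so the two values are not
# comparable even though the no-intercept fit is worse here.
print(with_int.rsquared, no_int.rsquared)
```

Dropping the intercept is usually defensible only when theory forces the mean response through the origin; otherwise the slope estimate absorbs the missing constant and becomes biased.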
136 votes, 3 answers

What is the difference between linear regression and logistic regression?

What is the difference between linear regression and logistic regression? When would you use each?
B Seven
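A short sketch of the distinction, again with simulated data and statsmodels (hypothetical choices): linear regression models a continuous conditional mean, while logistic regression models the log-odds of a binary outcome.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
x = rng.normal(size=300)
X = sm.add_constant(x)

# Linear regression: continuous y, models E[y | x] directly.
y_cont = 1.0 + 2.0 * x + rng.normal(size=300)
linear = sm.OLS(y_cont, X).fit()

# Logistic regression: binary y, models log P(y=1|x) / P(y=0|x).
p = 1.0 / (1.0 + np.exp(-(1.0 + 2.0 * x)))
y_bin = rng.binomial(1, p)
logistic = sm.Logit(y_bin, X).fit(disp=0)  # disp=0 silences the optimizer

print(linear.params, logistic.params)
```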
134 votes, 9 answers

What is the difference between linear regression on y with x and x with y?

The Pearson correlation coefficient of x and y is the same whether you compute pearson(x, y) or pearson(y, x). This suggests that doing a linear regression of y given x or x given y should be the same, but I don't think that's the case. Can…
user9097
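A numeric sketch of why the two regressions differ even though the correlation is symmetric (simulated data and numpy, both my assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=500)
y = 0.5 * x + rng.normal(size=500)

cov_xy = np.cov(x, y)[0, 1]
b_yx = cov_xy / np.var(x, ddof=1)  # slope of y regressed on x
b_xy = cov_xy / np.var(y, ddof=1)  # slope of x regressed on y
r = np.corrcoef(x, y)[0, 1]

# Each regression minimizes errors in a different direction, so the
# slopes differ; their product recovers the squared correlation.
print(b_yx, b_xy, b_yx * b_xy, r**2)
```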
128 votes, 2 answers

Removal of statistically significant intercept term increases $R^2$ in linear model

In a simple linear model with a single explanatory variable, $\alpha_i = \beta_0 + \beta_1 \delta_i + \epsilon_i$, I find that removing the intercept term improves the fit greatly (value of $R^2$ goes from 0.3 to 0.9). However, the intercept term…
Ernest A
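A sketch of the mechanism behind the jump in $R^2$, assuming simulated data and statsmodels conventions (not from the post): removing the intercept changes the baseline against which $R^2$ is defined.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
x = rng.uniform(5, 10, 100)
y = 20.0 + x + rng.normal(0, 2, 100)  # large true intercept

no_int = sm.OLS(y, x).fit()
rss = np.sum(no_int.resid ** 2)

# With no intercept, R^2 = 1 - RSS / sum(y^2) rather than
# 1 - RSS / sum((y - ybar)^2); when y is far from zero the
# denominator is huge, so R^2 looks large despite a worse fit.
print(no_int.rsquared, 1 - rss / np.sum(y ** 2))
```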
118 votes, 4 answers

PCA and proportion of variance explained

In general, what is meant by saying that the fraction $x$ of the variance in an analysis like PCA is explained by the first principal component? Can someone explain this intuitively but also give a precise mathematical definition of what "variance…
user9097
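A compact sketch with scikit-learn (my choice, not named in the post): each "proportion of variance explained" is an eigenvalue of the sample covariance matrix divided by the sum of all eigenvalues.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(4)
X = rng.multivariate_normal([0.0, 0.0], [[3.0, 1.0], [1.0, 1.0]], size=1000)

pca = PCA().fit(X)
# explained_variance_ holds the covariance eigenvalues; the ratio
# normalizes them by the total variance (their sum).
print(pca.explained_variance_ratio_)
print(pca.explained_variance_ / pca.explained_variance_.sum())
```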
87 votes, 3 answers

Shape of confidence interval for predicted values in linear regression

I have noticed that the confidence interval for predicted values in a linear regression tends to be narrow around the mean of the predictor and wider near the minimum and maximum values of the predictor. This can be seen in plots of these 4 linear…
luciano
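The shape follows from the standard error of the fitted mean at a point $x_0$, $\hat\sigma\sqrt{1/n + (x_0-\bar x)^2/\sum_i(x_i-\bar x)^2}$, which is smallest at $\bar x$. A quick numeric check, with simulated data as an assumption:

```python
import numpy as np

rng = np.random.default_rng(5)
x = rng.uniform(0, 10, 50)
y = 1.0 + 0.5 * x + rng.normal(0, 1, 50)

n, xbar, sxx = len(x), x.mean(), np.sum((x - x.mean()) ** 2)
b1 = np.sum((x - xbar) * (y - y.mean())) / sxx
b0 = y.mean() - b1 * xbar
sigma2 = np.sum((y - b0 - b1 * x) ** 2) / (n - 2)  # residual variance

# The band is narrowest at xbar and widens quadratically away from it.
for x0 in (x.min(), xbar, x.max()):
    print(x0, np.sqrt(sigma2 * (1.0 / n + (x0 - xbar) ** 2 / sxx)))
```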
62 votes, 3 answers

What is the effect of having correlated predictors in a multiple regression model?

I learned in my linear models class that if two predictors are correlated and both are included in a model, one will be insignificant. For example, assume the size of a house and the number of bedrooms are correlated. When predicting the cost of a…
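A small simulation of the house-price example in the question (variable names and data are illustrative), showing the usual symptom: inflated standard errors rather than biased coefficients.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(6)
size = rng.normal(size=200)
bedrooms = 0.9 * size + rng.normal(0, 0.3, 200)  # strongly correlated
price = 2.0 * size + bedrooms + rng.normal(0, 1, 200)

X = sm.add_constant(np.column_stack([size, bedrooms]))
fit = sm.OLS(price, X).fit()

# Estimates remain unbiased, but the correlated columns inflate the
# standard errors, so each predictor can look individually insignificant.
print(fit.params)
print(fit.bse)
```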
59 votes, 4 answers

Choosing between LM and GLM for a log-transformed response variable

I'm trying to understand the philosophy behind using a Generalized Linear Model (GLM) vs a Linear Model (LM). I've created an example data set below where: $$\log(y) = x + \varepsilon $$ The example does not have the error $\varepsilon$ as a…
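A sketch of the two candidates being compared, assuming a Gaussian GLM with a log link in statsmodels (the post's exact setup may differ): an LM on $\log(y)$ models $E[\log y]$ with additive error on the log scale, while the GLM models $\log E[y]$ with error on the original scale.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(7)
x = rng.uniform(0, 2, 300)
y = np.exp(1.0 + 1.5 * x + rng.normal(0, 0.3, 300))  # multiplicative error

X = sm.add_constant(x)
lm = sm.OLS(np.log(y), X).fit()  # models E[log y]
glm = sm.GLM(y, X, family=sm.families.Gaussian(sm.families.links.Log())).fit()  # models log E[y]

# The fits differ because transforming y changes the error structure,
# while the link function leaves y on its original scale.
print(lm.params, glm.params)
```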
55 votes, 3 answers

Where does the misconception that Y must be normally distributed come from?

Seemingly reputable sources claim that the dependent variable must be normally distributed: Model assumptions: $Y$ is normally distributed; the errors are normally distributed, $e_i \sim N(0,\sigma^2)$, and independent; $X$ is fixed; and …
colorlace
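A short simulation (entirely illustrative) of why the claim is a misconception: the normality assumption concerns the errors conditional on $X$, and the marginal distribution of $Y$ can be far from normal.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(8)
x = rng.choice([0.0, 10.0], size=500)  # two clusters of x values
y = 2.0 * x + rng.normal(0, 1, 500)    # errors are exactly N(0, 1)

resid = y - 2.0 * x  # residuals around the true line
print(stats.shapiro(y).pvalue)      # tiny: y is bimodal, not normal
print(stats.shapiro(resid).pvalue)  # should be large: errors look normal
```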
53 votes, 4 answers

Fast linear regression robust to outliers

I am dealing with linear data with outliers, some of which lie more than 5 standard deviations away from the estimated regression line. I'm looking for a linear regression technique that reduces the influence of these points. So far what I did is…
Matteo Fasiolo
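One standard fast option is M-estimation with a Huber loss; a sketch with statsmodels' RLM, where the data and outliers are simulated assumptions:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(9)
x = rng.uniform(0, 10, 100)
y = 1.0 + 2.0 * x + rng.normal(0, 1, 100)
y[:5] += 30.0  # gross outliers

X = sm.add_constant(x)
ols = sm.OLS(y, X).fit()
rlm = sm.RLM(y, X, M=sm.robust.norms.HuberT()).fit()  # Huber M-estimator

# RLM iteratively downweights large residuals, so the outliers pull
# the robust fit far less than they pull OLS.
print(ols.params, rlm.params)
```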
49 votes, 3 answers

Derive Variance of regression coefficient in simple linear regression

In simple linear regression, we have $y = \beta_0 + \beta_1 x + u$, where the $u$ are i.i.d. $\mathcal N(0,\sigma^2)$. I derived the estimator: $$ \hat{\beta_1} = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2}\ , $$ where $\bar{x}$…
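A sketch of the remaining step: because $\sum_i (x_i-\bar x)(y_i-\bar y) = \sum_i (x_i-\bar x)\,y_i$, the estimator is a linear combination of the $y_i$ with fixed weights, $$\hat\beta_1 = \sum_i w_i\, y_i, \qquad w_i = \frac{x_i-\bar x}{\sum_j (x_j-\bar x)^2},$$ and since the $y_i$ are independent with variance $\sigma^2$, $$\operatorname{Var}(\hat\beta_1) = \sigma^2 \sum_i w_i^2 = \frac{\sigma^2}{\sum_i (x_i-\bar x)^2}.$$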
47 votes, 3 answers

Why is polynomial regression considered a special case of multiple linear regression?

If polynomial regression models nonlinear relationships, how can it be considered a special case of multiple linear regression? Wikipedia notes that "Although polynomial regression fits a nonlinear model to the data, as a statistical estimation…
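"Linear" refers to linearity in the coefficients, not in $x$: a cubic fit is ordinary least squares on a design matrix with columns $1, x, x^2, x^3$. A minimal numpy sketch on simulated data:

```python
import numpy as np

rng = np.random.default_rng(10)
x = rng.uniform(-2, 2, 100)
y = 1.0 - 3.0 * x + 0.5 * x**3 + rng.normal(0, 0.2, 100)

# Nonlinear in x, but linear in the unknown coefficients, so the
# usual least-squares machinery applies unchanged.
X = np.column_stack([np.ones_like(x), x, x**2, x**3])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta)
```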
42 votes, 1 answer

Proof that the coefficients in an OLS model follow a $t$-distribution with $(n-k)$ degrees of freedom

Background Suppose we have an Ordinary Least Squares model with $k$ coefficients in our regression model, $$\mathbf{y}=\mathbf{X}\mathbf{\beta} + \mathbf{\epsilon}$$ where $\mathbf{\beta}$ is a $(k\times1)$ vector of coefficients,…
Garrett
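The core of the argument, sketched: under normal errors, $\hat{\boldsymbol\beta} \sim \mathcal N\big(\boldsymbol\beta,\; \sigma^2(\mathbf X^\top\mathbf X)^{-1}\big)$ and $(n-k)\,\hat\sigma^2/\sigma^2 \sim \chi^2_{n-k}$, with the two independent, so $$\frac{\hat\beta_j-\beta_j}{\hat\sigma\sqrt{[(\mathbf X^\top\mathbf X)^{-1}]_{jj}}} = \frac{(\hat\beta_j-\beta_j)\big/\big(\sigma\sqrt{[(\mathbf X^\top\mathbf X)^{-1}]_{jj}}\big)}{\sqrt{\hat\sigma^2/\sigma^2}} \sim t_{n-k},$$ a standard normal divided by the square root of an independent $\chi^2_{n-k}$ over its degrees of freedom.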
40 votes, 5 answers

How to derive the least square estimator for multiple linear regression?

In the simple linear regression case $y=\beta_0+\beta_1x$, you can derive the least square estimator $\hat\beta_1=\frac{\sum(x_i-\bar x)(y_i-\bar y)}{\sum(x_i-\bar x)^2}$ such that you don't have to know $\hat\beta_0$ to estimate…
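In matrix form the derivation is one step: minimizing $S(\boldsymbol\beta) = (\mathbf y-\mathbf X\boldsymbol\beta)^\top(\mathbf y-\mathbf X\boldsymbol\beta)$ and setting the gradient $-2\mathbf X^\top(\mathbf y-\mathbf X\boldsymbol\beta)$ to zero yields the normal equations $$\mathbf X^\top\mathbf X\,\hat{\boldsymbol\beta} = \mathbf X^\top\mathbf y \quad\Longrightarrow\quad \hat{\boldsymbol\beta} = (\mathbf X^\top\mathbf X)^{-1}\mathbf X^\top\mathbf y,$$ provided $\mathbf X^\top\mathbf X$ is invertible; the simple-regression formula is the special case with columns $1$ and $x$.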
38 votes, 4 answers

(Why) do overfitted models tend to have large coefficients?

I imagine that the larger a coefficient on a variable is, the more ability the model has to "swing" in that dimension, providing an increased opportunity to fit noise. Although I think I've got a reasonable sense of the relationship between the…
David Marx
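A quick numeric illustration of the intuition (simulated data; numpy's polyfit is my choice): as the polynomial degree approaches the number of points, the fitted coefficients blow up to chase noise.

```python
import numpy as np

rng = np.random.default_rng(11)
x = rng.uniform(-1, 1, 15)
y = np.sin(2 * x) + rng.normal(0, 0.1, 15)

# Largest absolute coefficient by degree; the highest degree may
# trigger a conditioning warning, itself a symptom of the problem.
for degree in (2, 5, 12):
    coefs = np.polyfit(x, y, degree)
    print(degree, np.max(np.abs(coefs)))
```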