You are describing a situation in which the fitted model has one more parameter (the intercept) than the generating model, so it is mildly overparameterized. You are asking two questions: 1) what happens to the model fit when you add the extra parameter, and 2) does the overparameterized model correctly infer that the intercept is non-significant?
I will answer with simulations. First, let's simulate 500 datasets from the scenario you describe (a linear model with intercept = 0), and for each one fit both the overparameterized regression model (slope and intercept) and a model with just a slope and no intercept. We'll extract the p-value for the intercept in the full model, along with the log likelihoods and R-squared values for both models.
# Simulate 500 datasets
set.seed(23)
p <- r1 <- r2 <- lh1 <- lh2 <- numeric(500)
for (i in 1:500) {
  x <- runif(20)
  y <- x + rnorm(20, sd = 0.1)
  # Fit model with intercept
  fit1 <- lm(y ~ x)
  # Fit model without intercept
  fit2 <- lm(y ~ -1 + x)
  # Store the intercept p-value, log likelihoods, and R-squared values
  p[i]   <- summary(fit1)$coefficients[1, 4]
  lh1[i] <- logLik(fit1)
  lh2[i] <- logLik(fit2)
  r1[i]  <- summary(fit1)$r.squared
  r2[i]  <- summary(fit2)$r.squared
}
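As an aside, if you want to see exactly where those stored quantities come from, here is a minimal sketch for a single simulated dataset (the seed here is arbitrary and not part of the loop above, so the numbers will differ):
# One simulated dataset, to show where the stored quantities come from
set.seed(1)                 # arbitrary seed, for illustration only
x <- runif(20)
y <- x + rnorm(20, sd = 0.1)
fit <- lm(y ~ x)
summary(fit)$coefficients   # row 1 is the intercept; column 4 holds its p-value
logLik(fit)                 # the value stored as lh1[i] above
summary(fit)$r.squared      # the value stored as r1[i] above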
Let's address your second question first, by looking at Type I error rates (i.e., false positives) for the intercept term in the overparameterized model.
# Type I error rate for intercept
sum(p < 0.05)/500
## 0.05
The Type I error rate is 0.05: in 5% of the simulations the overparameterized model incorrectly declared the intercept significant, and in the other 95% it correctly inferred that the intercept is not significant. That matches the nominal 5% level, so the intercept test behaves exactly as it should despite the extra parameter.
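If you want a check that goes beyond the 0.05 cutoff (this is not part of the original comparison), the full distribution of those p-values should be roughly uniform on [0, 1] when the true intercept is zero:
# Under the null (true intercept = 0), the intercept p-values should be
# approximately uniform on [0, 1]; 5% below 0.05 is just one slice of that
hist(p, breaks = 20, main = "Intercept p-values", xlab = "p-value")
ks.test(p, "punif")   # approximate formal check of uniformity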
Now what about the model fits? Let's compare the log likelihoods for the two models.
# Difference between log likelihoods
lhdiff <- lh1 - lh2
# Mean difference
mean(lhdiff)
## 0.5877251
# Number of times simpler model has a higher log likelihood
sum(lhdiff < 0)
## 0
You can see that on average the log likelihood is higher for the full model, and there are 0 cases in which the log likelihood is higher for the simpler model (even though the simpler model is the correct one). This is the overfitting problem: for nested models, adding parameters can never decrease the in-sample fit, so the raw log likelihood will always favor the extra parameter.
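This is also why model comparison usually relies on a penalized criterion rather than the raw log likelihood. I did not include this in the simulations above, but a sketch of the same loop using AIC (which charges 2 units per extra parameter) would look like the following; the exact proportion you get depends on the seed:
# Same simulation, but compare the models with AIC, which penalizes the extra parameter
set.seed(23)
aic1 <- aic2 <- numeric(500)
for (i in 1:500) {
  x <- runif(20)
  y <- x + rnorm(20, sd = 0.1)
  aic1[i] <- AIC(lm(y ~ x))        # slope + intercept
  aic2[i] <- AIC(lm(y ~ -1 + x))   # slope only (the generating model)
}
mean(aic2 < aic1)   # proportion of datasets in which AIC prefers the simpler model
Now, you will notice something odd when you compare the R-squared values.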
# Mean R-squared
mean(r1)
## 0.8941026
mean(r2)
## 0.9710714
The mean R-squared is larger for the simpler model. What is going on? Shouldn't R-squared be higher for the model with more parameters? A good explanation for why this occurs can be found here. In short, R-squared is computed differently for models with and without an intercept: with an intercept, the total sum of squares is taken about the mean of y, whereas without an intercept it is taken about zero. The two R-squared values are therefore measured against different baselines and are not directly comparable, and the no-intercept version is typically inflated (a hand computation below shows the two baselines). This is not what usually happens when you add parameters, though: adding parameters other than the intercept (e.g., higher-order terms or additional predictors/slopes) will generally increase the R-squared, and for nested models it can never decrease it.
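If you want to verify the two baselines directly, here is a minimal sketch for a single dataset (again, the seed is arbitrary and the objects are regenerated here rather than taken from the loop above):
# Hand-compute the two R-squared definitions to see the different baselines
set.seed(42)                                     # arbitrary seed, for illustration only
x <- runif(20); y <- x + rnorm(20, sd = 0.1)
fit1 <- lm(y ~ x); fit2 <- lm(y ~ -1 + x)
# With an intercept, the total sum of squares is taken about mean(y)
1 - sum(resid(fit1)^2) / sum((y - mean(y))^2)    # equals summary(fit1)$r.squared
# Without an intercept, the total sum of squares is taken about zero
1 - sum(resid(fit2)^2) / sum(y^2)                # equals summary(fit2)$r.squared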