If the confidence intervals do not overlap, the estimates are significantly different; if they do overlap, the estimates may still be significantly different, so overlap by itself is inconclusive. With this in mind, also note that splitting the dataset on the binary variable is not a great idea, because you lose statistical power. The problem gets worse the more unbalanced the groups are (many more 0s than 1s, or vice versa) and the more the two groups' variances differ; a simulation at the end of this answer illustrates the power loss.
Let's look at a simple example where the variable is statistically significant, yet splitting on it yields overlapping confidence intervals:
set.seed(9876)
N <- 1000
x <- rbinom(N, 1, 0.2)          # binary predictor, roughly 20% ones
y <- 0 + x + rnorm(N, 0, 6)     # true group difference of 1, with large noise
dt <- data.frame(y = y, x = x)  # name the columns explicitly; data.frame(y, as.factor(x)) would mangle the x column's name
m0 <- lm(y ~ x, data = dt)      # model fitted to all of the data
summary(m0)
This produces:
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.0967 0.2116 -0.457 0.6478
x 1.0216 0.4767 2.143 0.0324 *
So x is significant at the 5% level.
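Since x is binary, the t-test on its coefficient is equivalent to a pooled two-sample t-test, which you can run directly as a check (var.equal = TRUE requests the pooled version that matches the regression):
t.test(y ~ x, data = dt, var.equal = TRUE)  # reproduces the t value and p-value from summary(m0)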
Now we split the data, as suggested in the OP, and form a confidence interval for the mean of the response in each group:
dt1 <- subset(dt, x == 0)      # group with x = 0
dt2 <- subset(dt, x == 1)      # group with x = 1
m1 <- lm(y ~ 1, data = dt1)    # intercept-only model: estimates the group mean
m2 <- lm(y ~ 1, data = dt2)
lapply(list(m1, m2), confint)  # 95% confidence interval for each group mean
which produces:
[[1]]
2.5 % 97.5 %
(Intercept) -0.5180074 0.3246138
[[2]]
2.5 % 97.5 %
(Intercept) 0.1338219 1.715985
These intervals overlap, so the naive split-and-compare approach finds no significant difference between the group means, even though the full model just told us the difference is significant at the 5% level. Note that overlapping confidence intervals do not let us conclude that the means are equal; checking for overlap amounts to a much more conservative, lower-power procedure than testing the difference directly.
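A more direct check is to form a confidence interval for the difference itself, which the full model already provides:
confint(m0)  # the "x" row is the 95% CI for the difference in group means
From the estimate and standard error printed above, this interval is roughly (0.09, 1.96): it excludes zero, in agreement with the significant t-test, even though the two group-wise intervals overlap.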
So, it is better to use the t-test from the model that includes all the data.
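If you want to see the power loss directly, here is a small simulation sketch; the effect size, noise level, and replication count are arbitrary choices that mirror the example above:
# Compare the power of the full-model t-test with the "CIs must not overlap" rule.
# The settings (effect = 1, sd = 6, nsim = 2000) are illustrative assumptions.
set.seed(1)
nsim <- 2000
sig_model <- sig_overlap <- logical(nsim)
for (i in seq_len(nsim)) {
  x <- rbinom(1000, 1, 0.2)
  y <- x + rnorm(1000, 0, 6)
  sig_model[i] <- summary(lm(y ~ x))$coefficients["x", "Pr(>|t|)"] < 0.05
  ci0 <- confint(lm(y[x == 0] ~ 1))  # CI for the mean of group 0
  ci1 <- confint(lm(y[x == 1] ~ 1))  # CI for the mean of group 1
  sig_overlap[i] <- ci0[1] > ci1[2] || ci1[1] > ci0[2]  # "significant" only if the CIs are disjoint
}
mean(sig_model)    # power of the regression t-test
mean(sig_overlap)  # power of the non-overlap rule
The non-overlap rule should reject far less often, because requiring two 95% intervals to be disjoint is a much stricter criterion than testing the difference at the 5% level, and the gap grows as the groups become more unbalanced.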