Factor variables in R and other software are automatically expanded into several binary indicator (dummy) variables. So for instance, if I create a variable
n <- 100
dayn <- sample(1:7, n, replace=TRUE)
dayf <- factor(dayn, levels=1:7, labels=c('Sun', 'Mon', 'Tues', 'Weds', 'Thurs', 'Fri', 'Sat'))
and I analyze it in a linear regression model, the regression model automatically creates the binary indicator variables, taking "Sunday" as the referent level. Each indicator's coefficient gives a comparison of one day of the week versus Sunday. Sunday vs Sunday is redundant, so it is dropped.
For instance:
mm <- model.matrix(~dayf)
head(mm)
Gives me:
> head(mm)
  (Intercept) dayfMon dayfTues dayfWeds dayfThurs dayfFri dayfSat
1           1       1        0        0         0       0       0
2           1       0        1        0         0       0       0
3           1       0        0        0         0       1       0
4           1       0        0        0         0       1       0
5           1       0        0        1         0       0       0
6           1       1        0        0         0       0       0
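Sunday is the referent only because it is the first level of the factor. As a minimal sketch, assuming you wanted Wednesday as the baseline instead, relevel() changes which column gets dropped. The names dayf_demo, dayf_wed, mm_sun and mm_wed are made up here for illustration and are not part of the example above.

```r
# Small illustrative factor with the same seven labels as dayf
day_labels <- c('Sun', 'Mon', 'Tues', 'Weds', 'Thurs', 'Fri', 'Sat')
dayf_demo <- factor(c(1, 2, 3, 4, 5, 6, 7, 2, 3), levels = 1:7, labels = day_labels)

# Default treatment coding drops the first level (Sun)
mm_sun <- model.matrix(~ dayf_demo)
colnames(mm_sun)

# relevel() moves 'Weds' to the front, so it becomes the dropped referent
dayf_wed <- relevel(dayf_demo, ref = 'Weds')
mm_wed <- model.matrix(~ dayf_wed)
colnames(mm_wed)
```

In the second matrix there is a Sunday column and no Wednesday column, so every coefficient would be read as "day vs Wednesday".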
Suppose further I had an outcome variable which is Poisson distributed... yet I analyze it with a linear regression model because I can:
# Monday and Tuesday get a higher rate; the strings must match the labels of dayf
sickdays <- rpois(n, lambda = exp(1 + 2*(dayf %in% c('Mon','Tues'))))
boxplot(sickdays ~ dayf)

Now if my hypothesis is "Does day of the week affect the number of people taking sick days?", an appropriate test is a single 6 degree of freedom test of whether mean sick days differ among any of the days of the week. Note that I am not concerned with exactly which day is affected. The regression model, however, gives me 6 separate coefficients, one per non-referent day:
library(lmtest)
big.model <- lm(sickdays ~ dayf)
summary(big.model)
null.model <- lm(sickdays ~ 1)
lrtest(big.model, null.model)
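Depending on your seed, the likelihood ratio test may or may not be significant and the 6 separate Wald tests may or may not be significant. The problem with reading the 6 separate Wald tests is multiple testing: six tests, each at the 5% level, inflate the chance of at least one false positive, and each one only compares a single day against Sunday.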
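The same joint 6 degree of freedom comparison can be sketched in base R. The block below assumes freshly simulated data with an arbitrary seed (sim_day, sim_y, fit_full and fit_null are illustrative names). It uses anova() on the two nested lm fits, which gives an F test rather than the likelihood ratio test from lrtest(), and p.adjust() with Holm's method in case you do want to look at the six Wald p-values individually.

```r
set.seed(1)  # arbitrary seed, chosen only for reproducibility
n_sim <- 200
sim_day <- factor(sample(1:7, n_sim, replace = TRUE), levels = 1:7,
                  labels = c('Sun', 'Mon', 'Tues', 'Weds', 'Thurs', 'Fri', 'Sat'))
sim_y <- rpois(n_sim, lambda = exp(1 + 2 * (sim_day %in% c('Mon', 'Tues'))))

fit_full <- lm(sim_y ~ sim_day)
fit_null <- lm(sim_y ~ 1)

# One joint F test of all six indicators at once (6 numerator df)
joint <- anova(fit_null, fit_full)
joint

# The six individual Wald p-values, Holm-adjusted for multiple testing
wald_p <- summary(fit_full)$coefficients[-1, 'Pr(>|t|)']
wald_p_holm <- p.adjust(wald_p, method = 'holm')
wald_p_holm
```

The joint test answers "does day of the week matter at all?" with one p-value, while the adjusted Wald p-values still only describe contrasts against Sunday.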
This relates to LASSO because with factors we do not hypothesize that individual levels, taken separately, are predictive. So we either include all of a factor's levels together as one "feature" or we leave the factor out.
As a reminder, LASSO does feature selection. What is a feature? In a regression model, the particular comparison "Tuesday vs Sunday" or "Friday vs Sunday" is not a feature. The whole factor dayf, with its 6 indicator columns, is considered a feature. So for model selection, it should be all or nothing: either all 6 indicators are included, along with their penalization, or they are all excluded. (A grouped penalty such as the group LASSO enforces this; a plain LASSO on the dummy-coded columns penalizes each column separately.)
From a theoretical perspective this makes sense. If I kept "Tuesday vs Sunday" as the only indicator and dropped the others, it no longer means "Tuesday vs Sunday" but becomes "Tuesday vs every other day". So there are real practical differences in how an indicator is interpreted depending on which other indicators are in the model. Suppose the model is then expanded to include what would usually be "Wednesday vs Sunday". In that case, the two indicators are Tuesday vs S/M/Th/F/Sa and Wednesday vs S/M/Th/F/Sa. And you cannot compare them.
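The change in meaning can be checked numerically. This is a small sketch on made-up data, with a normally distributed outcome whose mean rises by day (day_sim, y_sim and tues_ind are hypothetical names, and the seed is arbitrary). It compares the Tuesday coefficient from the full factor with the coefficient obtained when only the Tuesday indicator is kept.

```r
set.seed(2)  # arbitrary seed
n_sim <- 700
day_sim <- factor(sample(1:7, n_sim, replace = TRUE), levels = 1:7,
                  labels = c('Sun', 'Mon', 'Tues', 'Weds', 'Thurs', 'Fri', 'Sat'))
y_sim <- rnorm(n_sim, mean = as.integer(day_sim))  # made-up outcome

# Full factor: the Tues coefficient is mean(Tues) - mean(Sun)
full_fit <- lm(y_sim ~ day_sim)
b_full <- unname(coef(full_fit)['day_simTues'])
d_vs_sun <- mean(y_sim[day_sim == 'Tues']) - mean(y_sim[day_sim == 'Sun'])

# Only the Tues column kept: the same indicator now compares
# Tuesday with all other days pooled together
tues_ind <- model.matrix(~ day_sim)[, 'day_simTues']
tues_fit <- lm(y_sim ~ tues_ind)
b_tues_only <- unname(coef(tues_fit)['tues_ind'])
d_vs_rest <- mean(y_sim[day_sim == 'Tues']) - mean(y_sim[day_sim != 'Tues'])

c(b_full = b_full, d_vs_sun = d_vs_sun)
c(b_tues_only = b_tues_only, d_vs_rest = d_vs_rest)
```

Within each printed pair the two numbers agree, but the pairs differ from each other: the identical column of 0s and 1s estimates a different contrast once the other indicators are removed.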