First, consider the simpler case of your model but with continuous variables. Say, something like:
mod <- lm(y ~ x1+x2)
Your regression coefficient for $x_1$ is just the partial derivative of $y$ with respect to $x_1$. This is the marginal effect, and because the model is linear and additive, that derivative is simply the constant $b_1$, whatever the values of $x_1$ and $x_2$.
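A quick way to convince yourself of this is with simulated data. In the sketch below the data-generating values are made up for illustration; the point is only that raising $x_1$ by one unit while holding $x_2$ fixed changes the prediction by exactly $b_1$.

```r
# Simulated data; the true coefficients (1, 2, -3) are arbitrary assumptions.
set.seed(1)
n  <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 1 + 2 * x1 - 3 * x2 + rnorm(n)

mod <- lm(y ~ x1 + x2)
b1  <- unname(coef(mod)["x1"])

# Predict at two points that differ only by one unit of x1.
pred_low  <- predict(mod, newdata = data.frame(x1 = 0, x2 = 5))
pred_high <- predict(mod, newdata = data.frame(x1 = 1, x2 = 5))

# The difference equals b1, and it would for any other fixed value of x2.
diff_pred <- unname(pred_high - pred_low)
all.equal(diff_pred, b1)
```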
However, when we specify an interaction like you have, what we're really specifying is this:
mod2 <- lm(y ~ x1 + x2 + x1:x2)  # identical to y ~ x1*x2, which R expands to this
When you have an interaction term, you also need to include the additive (constituent) terms. This is why you get so many regression coefficients in your summary. So for this model with continuous variables, summary() would give us an intercept, a coefficient for $x_1$, a coefficient for $x_2$, and a coefficient for $x_1 x_2$. But this time, when we take the partial derivative of $y$ with respect to $x_1$, we don't just get a single coefficient. Instead, the marginal effect is $b_1 + b_3 x_2$. So the effect of $x_1$ depends on the value of $x_2$ at any point. As a result, the p-values in your summary() tell you less than you might think: $b_1$ and its p-value describe the effect of $x_1$ only when $x_2 = 0$, which may not even be a value that occurs in your data. It's a common mistake to think that you can interpret your coefficients the same way in an interaction model. The coefficient $b_3$ does not explain the 'variance due to the interaction'. Brambor, Clark and Golder do a great job of explaining just how common this mistake is, and its perils.
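To make this concrete, here is a minimal sketch on simulated data (the true coefficients and the helper name `marginal_effect_x1` are made up for illustration). It computes the marginal effect of $x_1$ at several values of $x_2$, along with a standard error from the usual variance formula for a linear combination of coefficients, $\mathrm{Var}(b_1) + x_2^2\,\mathrm{Var}(b_3) + 2 x_2\,\mathrm{Cov}(b_1, b_3)$. The 1.96 multiplier assumes a normal approximation.

```r
# Simulated data with a true interaction; all numbers here are assumptions.
set.seed(42)
n  <- 500
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 1 + 0.5 * x1 + 1.5 * x2 + 2 * x1 * x2 + rnorm(n)

mod2 <- lm(y ~ x1 * x2)
b <- coef(mod2)   # named vector: (Intercept), x1, x2, x1:x2
V <- vcov(mod2)   # variance-covariance matrix of the coefficients

# Hypothetical helper: marginal effect of x1 at given values of x2.
marginal_effect_x1 <- function(x2_value) {
  est <- b["x1"] + b["x1:x2"] * x2_value
  se  <- sqrt(V["x1", "x1"] +
              x2_value^2 * V["x1:x2", "x1:x2"] +
              2 * x2_value * V["x1", "x1:x2"])
  data.frame(x2     = x2_value,
             effect = unname(est),
             se     = unname(se),
             lower  = unname(est - 1.96 * se),
             upper  = unname(est + 1.96 * se))
}

me <- marginal_effect_x1(c(-2, -1, 0, 1, 2))
print(me)
```

The row for $x_2 = 0$ reproduces the estimate and standard error on the `x1` line of `summary(mod2)`, which is a useful reminder that the reported p-value speaks to that one point only.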
Now, your question had to do with factor variables. The explanation is the same; it's just that you have many more coefficients, because you're multiplying many binary variables with many other binary variables. Be cautious here: $R^2$ never goes down when you add terms, so the increase you see may just be due to having many more terms in your model.
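To get a feel for how quickly the coefficients multiply, here is a small sketch with two made-up factors, one with three levels and one with four. The names `f1`, `f2`, `m_add` and `m_int` are hypothetical, and the response is deliberately pure noise.

```r
set.seed(7)
n  <- 300
f1 <- factor(sample(c("a", "b", "c"), n, replace = TRUE))
f2 <- factor(sample(c("p", "q", "r", "s"), n, replace = TRUE))
y  <- rnorm(n)  # pure noise: there are no real effects to find

m_add <- lm(y ~ f1 + f2)  # additive model
m_int <- lm(y ~ f1 * f2)  # additive terms plus all pairwise dummy products

length(coef(m_add))  # intercept + 2 dummies + 3 dummies = 6
length(coef(m_int))  # the 6 above + 2 * 3 interaction dummies = 12

# R^2 is at least as large in the bigger model, even though y is noise.
summary(m_add)$r.squared
summary(m_int)$r.squared
```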
A common approach to reporting on interaction effects is to plot predicted values for various values of your interacted variables, holding the other variables at their means (or some other sensible values). There's a nice package in R that will do this for you. Try:
library(effects)
plot(allEffects(mod2))  # use the model that contains the interaction
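If you'd rather see what is going on under the hood, the same idea can be sketched in base R with `predict()`: build a grid of values, predict, and draw one line per chosen value of the other variable. The simulated data and the choice of $x_2$ values (-1, 0, 1) are assumptions made for illustration.

```r
# Simulated data with a true interaction; all numbers here are assumptions.
set.seed(42)
n  <- 500
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 1 + 0.5 * x1 + 1.5 * x2 + 2 * x1 * x2 + rnorm(n)
mod2 <- lm(y ~ x1 * x2)

# Grid: a range of x1 values crossed with a few representative x2 values.
grid <- expand.grid(x1 = seq(-2, 2, length.out = 50),
                    x2 = c(-1, 0, 1))
grid$fit <- predict(mod2, newdata = grid)

# Empty plot sized to the predictions, then one line per x2 value.
plot(range(grid$x1), range(grid$fit), type = "n",
     xlab = "x1", ylab = "Predicted y")
x2_values <- unique(grid$x2)
for (i in seq_along(x2_values)) {
  rows <- grid$x2 == x2_values[i]
  lines(grid$x1[rows], grid$fit[rows], lty = i)
}
legend("topleft", legend = paste("x2 =", x2_values),
       lty = seq_along(x2_values))
```

Lines that are not parallel are the visual signature of the interaction: the slope in $x_1$ changes with $x_2$.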
As for which model you should go with - I'd say that should depend on your theory more so than anything else. Does an interaction make sense for what you're trying to model? I wouldn't suggest adding interaction terms just to get better fit statistics...