The way to proceed depends a bit on how your data are coded. I'll proceed by describing what to do if your categorical variables have what's called "dummy" or treatment coding, the default in R.
With that coding, let's say that $X$ is coded 0 for non-working-class and 1 for working class, and that $Z$ is coded 0 for non-competitive and 1 for competitive. Presumably either $X$ or $Z$ or both for any district $i$ can change from election to election. Presumably $H$ and $S$ are other predictors related to the outcome, $y$.
If the term $\mu_{ij}$ is supposed to represent the mean value of $y$ in district $i$ for election period $j$, then it's superfluous, so we'll omit it. If it's supposed to represent another predictor, you can just add it back and treat it like $H$ and $S$. The error term $e_{ij}$ is assumed to be random with a mean value of 0 in linear regression, so we'll ignore that too when it comes to predictions from the regression model.
In a regression based on that equation, the intercept, $b_0$, represents the predicted value of $y$ when all predictors are at values of 0: in particular, that means non-working-class, non-competitive districts. Check that by substituting 0 for each of $X$, $Z$, $H$, and $S$ in your equation (and omitting $\mu$ and $e$).
The coefficient $b_1$ for $X$ then is the predicted difference from the intercept value for working-class districts when all the other predictors are at values of 0: in particular, for non-competitive districts. The coefficient $b_2$ for $Z$ is the predicted difference from the intercept value for competitive districts when all the other predictors are at values of 0: in particular, for non-working-class districts. Again, check those 2 statements by substituting 0 for all predictors except for $X$ and $Z$, respectively. Note that when only 1 of $X$ or $Z$ has a value of 1, their product, the interaction term $XZ$, has a value of 0.
Now, substitute 1 for both $X$ and $Z$: a working-class, competitive district. If there weren't an interaction term, then you would simply predict $y=b_0+b_1+b_2$. So the coefficient $b_3$ for the interaction term is the difference from that simple prediction, due to having a district that is both working class and competitive.
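To summarize the four cases just described: with $H$ and $S$ held at 0 and $\mu$ and $e$ dropped, and assuming the equation has the form $y = b_0 + b_1X + b_2Z + b_3XZ$ plus terms for $H$ and $S$ (as implied by the coefficient names used here), the predictions are:

$$
\begin{aligned}
X=0,\ Z=0:&\quad \hat y = b_0\\
X=1,\ Z=0:&\quad \hat y = b_0 + b_1\\
X=0,\ Z=1:&\quad \hat y = b_0 + b_2\\
X=1,\ Z=1:&\quad \hat y = b_0 + b_1 + b_2 + b_3
\end{aligned}
$$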
With that explanation of the coefficients out of the way, let's get back to your primary question:
> I want to calculate the difference ... between those working class areas that are competitive and those working class areas that are not competitive.
What you need to do is to compare a situation with $\{X=1, Z=0\}$ against one with $\{X=1, Z=1\}$. If you want to do that comparison with other predictors like $H$ and $S$ held constant, then plugging in to your equation shows that the difference between those cases is $b_2+b_3$: $b_2$ for the difference for competitiveness in non-working-class districts, and $b_3$ for the extra difference from being a competitive working-class district.
That might not be all you need if the districts differ in values of $H$ or $S$. Then you need to take the entire regression equation into account to make predictions. Also, if you want to get estimated errors for your predicted differences (a good idea), you would have to take the variances and covariances of the regression coefficient estimates into account.
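For the simplest case, the difference $b_2+b_3$ with $H$ and $S$ held constant, the standard formula for the variance of a sum gives $\mathrm{Var}(b_2+b_3)=\mathrm{Var}(b_2)+\mathrm{Var}(b_3)+2\,\mathrm{Cov}(b_2,b_3)$. A minimal sketch of that calculation follows; the numbers and the helper `se_of_sum` are hypothetical, standing in for entries you would read from the coefficient covariance matrix of your fitted model (in R, `vcov()` returns that matrix for a fitted model).

```python
import math

# Hypothetical entries of the coefficient covariance matrix, for illustration only.
var_b2 = 0.40      # variance of the estimate of b2
var_b3 = 0.90      # variance of the estimate of b3
cov_b2_b3 = -0.25  # covariance between the two estimates

def se_of_sum(var_a, var_b, cov_ab):
    """Standard error of the sum of two estimates:
    sqrt(Var(a) + Var(b) + 2 * Cov(a, b))."""
    return math.sqrt(var_a + var_b + 2.0 * cov_ab)

se_diff = se_of_sum(var_b2, var_b3, cov_b2_b3)
print(se_diff)
```

Ignoring the covariance term would give the wrong standard error here, since the estimates of a main effect and its interaction are typically correlated. This sketch also inherits the independence assumption of ordinary linear regression.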
Finally, a few warnings about actually doing such an analysis. First, your observations aren't all independent, as is assumed in standard linear regression. They are made repeatedly on the same districts, and the districts might have factors not included in your model that affect $y$. Second, as the measurements are carried out over time (multiple election periods), you might also have to take into account correlations in building activity across time. That's another type of non-independence. Third, simple categorical breakdowns into things like working-class or not, competitive or not, are often not as useful as more continuous measures of such characteristics. Fourth, unless there are a lot of new houses built in each observation period and district, a simple linear regression (assuming a continuous outcome variable) might not be suitable. Fifth, building new houses takes time, and it's not clear how your model takes that lag time between making a decision to build and finishing the house into account. If this is an exercise to start thinking hard about interaction terms, that's OK, but if you want to carry out such an analysis you will need a more sophisticated approach that takes those issues into account.