Can someone explain to me how the explained / residual variance reported by the dissassoc
command is computed? I tried to add up the within-group variance using the discrepancies of the different levels, but it didn't work. Let's take a small random sample (n = 10).
library(TraMineR)  # provides mvad, seqdef, seqdist and dissassoc
library(dplyr)
data(mvad)
set.seed(10)
mv = mvad %>% group_by(male) %>% sample_n(5)
mvad.seq <- seqdef(mv[,17:20])
# compute the dissimilarity matrix
mvad.ham <- seqdist(mvad.seq, method="HAM")
# compute the discrepancy analysis
d = dissassoc(mvad.ham, group = mv$male, R=10)
d
# Pseudo ANOVA table:
#         SS df      MSE
# Exp    1.7  1 1.700000
# Res    9.6  8 1.200000
# Total 11.3  9 1.255556
I understand how to compute the Total by hand:
sum(mvad.ham) / (2* 10) # = Total 11.3
Questions:
- How do you compute the Exp and Res values by hand?
- Could you please demonstrate how to compute the total sum of squares?
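To make the first question concrete, here is a minimal sketch of what I would expect the decomposition to look like, assuming the residual sum of squares is simply the Total formula applied inside each group and then added up, with the explained part being the remainder. The vector `y`, the factor `grp`, the matrix `d` and the helper `ss_from_diss` are all made up for illustration on toy data in base R; I have not confirmed that this is what dissassoc does internally.

```r
# Toy data in base R; y, grp and d are hypothetical, not the mvad sample.
y   <- c(1, 2, 4, 7, 8, 12)               # made-up 1-D values
grp <- c("a", "a", "a", "b", "b", "b")    # made-up grouping factor
d   <- as.matrix(dist(y))^2               # squared Euclidean distances

# Sum of squares from a full symmetric dissimilarity matrix:
# every pair appears twice in the matrix, hence the division by 2 * n.
ss_from_diss <- function(m) sum(m) / (2 * nrow(m))

ss_total <- ss_from_diss(d)

# Residual (within) SS: the same formula on each group's sub-matrix, summed.
ss_res <- sum(sapply(split(seq_along(grp), grp),
                     function(idx) ss_from_diss(d[idx, idx, drop = FALSE])))

# Explained (between) SS: whatever is left of the total.
ss_exp <- ss_total - ss_res

c(Exp = ss_exp, Res = ss_res, Total = ss_total)
```

If the same logic carries over (an assumption on my part), replacing `d` with mvad.ham and `grp` with mv$male should give the Res value. Note that this adds up group sums of squares, not group discrepancies: as far as I can tell a discrepancy is a sum of squares divided by the group size, which may be why simply adding the discrepancies did not work.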
I understand from (Studer et al., 2010) that the equation is:
$$ SS = \sum_{i=1}^{n} w_i(y_i - \bar{y})^2 $$
What does $\bar{y}$ represent exactly? Could you demonstrate on this example how one can compute $SS$ from mvad.ham manually?
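My current reading, which may be wrong, is that $\bar{y}$ is the ordinary weighted mean when the $y_i$ are numeric, and that the same $SS$ can be rewritten using only pairwise squared differences, which is the form that would still make sense for sequences where no mean exists. The sketch below checks that identity on made-up numbers in base R; `y`, `w` and the other names are hypothetical and unrelated to the mvad data.

```r
# Hypothetical numeric values and weights, for illustration only.
y <- c(1, 2, 4, 7, 8, 12)
w <- c(1, 2, 1, 1, 3, 1)

# Classical form: weighted squared deviations from the weighted mean.
y_bar      <- sum(w * y) / sum(w)
ss_classic <- sum(w * (y - y_bar)^2)

# Pairwise form: no mean needed, only squared differences between cases.
d_sq        <- outer(y, y, "-")^2         # squared Euclidean distances
ss_pairwise <- sum(outer(w, w) * d_sq) / (2 * sum(w))

all.equal(ss_classic, ss_pairwise)
```

With unit weights the pairwise form reduces to sum(d_sq) / (2 * n), which has the same shape as the Total computation above with the Hamming distances in place of `d_sq`.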