If you're looking to get a "family effect" and an "item effect," we can think of there being random intercepts for both of these, and then model this with the 'lme4' package.
But first, we have to give each sibling an id that is unique across the whole dataset, rather than unique only within family.
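A minimal sketch of one way to do that, assuming your data frame is called family and has columns family and sibling (with sibling numbered within family -- the column names here are assumptions about your data):
# build a globally unique sibling id from the family/sibling pair
family$sib_id<-interaction(family$family, family$sibling, drop=TRUE)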
Then for "the correlation between measurements taken on siblings within the same family for different items," we can specify something like:
mod<-lmer(value ~ (1|family)+(1|item), data=family)
This will give us one fixed-effect intercept for all siblings, plus two random intercepts (each with an estimated variance), one for family and one for item.
Then, for "the correlation between measurements taken on siblings within the same family for the same item," we can do the same thing but just subset our data, so we have something like:
mod2<-lmer(value ~ (1|family), data=subset(family,item=="1"))
I think this might be an easier approach to your question. But, if you just want the ICC for item or family, the 'psych' package has an ICC() function -- just be cautious about how item and value are melted in your example data.
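For instance, a hedged sketch of that route, assuming the long data frame family has columns sib_id, item, and value (as above), and that you want consistency across items:
# cast long data to a siblings-by-items layout, then compute the ICCs
library(reshape2)
library(psych)
wide<-dcast(family, sib_id ~ item, value.var="value")
ICC(wide[,-1]) # drop the id column; rows are siblings, columns are items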
Update
Some of the below is new to me, but I enjoyed working it out. I’m really not familiar with the idea of negative intraclass correlation, though I do see on Wikipedia that “early ICC definitions” allowed for a negative correlation with paired data. But as it’s most commonly used now, ICC is understood as the proportion of the total variance that is between-group variance, and this value is always positive. While Wikipedia may not be the most authoritative reference, this summary corresponds with how I’ve always seen ICC used:
An advantage of this ANOVA framework is that different groups can have different numbers of data values, which is difficult to handle using the earlier ICC statistics. Note also that this ICC is always non-negative, allowing it to be interpreted as the proportion of total variance that is “between groups.” This ICC can be generalized to allow for covariate effects, in which case the ICC is interpreted as capturing the within-class similarity of the covariate-adjusted data values.
That said, with data like you’ve given here, the inter-class (i.e., Pearson) correlation between items 1, 2, and 3 could very well be negative. We can model this, but the proportion of variance that is between groups will still be positive.
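To make that concrete, the quantity we’ll compute below is ICC = var_between / (var_between + var_within). Both terms are variance components, hence non-negative, so the ratio can’t be negative.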
# load our data and lme4
library(lme4)
## Loading required package: Matrix
dat<-read.table("http://www.wvbauer.com/fam_sib_item.dat", header=TRUE)
So what percentage of the variance is between families, also accounting for the variance between item-groups? We can use a random-intercepts model like you suggested:
mod<-lmer(yijk ~ (1|family)+(1|item), data=dat)
summary(mod)
## Linear mixed model fit by REML ['lmerMod']
## Formula: yijk ~ (1 | family) + (1 | item)
## Data: dat
##
## REML criterion at convergence: 4392.3
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -3.6832 -0.6316 0.0015 0.6038 3.9801
##
## Random effects:
## Groups Name Variance Std.Dev.
## family (Intercept) 0.3415 0.5843
## item (Intercept) 0.8767 0.9363
## Residual 4.2730 2.0671
## Number of obs: 1008, groups: family, 100; item, 3
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 2.927 0.548 5.342
We calculate the ICC from the variance components: the variances of the two random intercepts and the residual variance, which VarCorr() returns directly (so there is no need to square anything). The family ICC is the family variance divided by the sum of all three variances.
temp<-as.data.frame(VarCorr(mod))$vcov
# family variance over the total variance
temp.family<-temp[1]/(temp[1]+temp[2]+temp[3])
temp.family
## [1] 0.0622
We can then do the same for the other two variance estimates:
# variance between item-groups
temp.items<-temp[2]/(temp[1]+temp[2]+temp[3])
temp.items
## [1] 0.1597
# variance unexplained by either grouping factor
temp.resid<-temp[3]/(temp[1]+temp[2]+temp[3])
temp.resid
## [1] 0.7782
# clearly then, these will sum to 1
temp.family+temp.items+temp.resid
## [1] 1
These results suggest that only a modest share of the total variance is between families (about 6%) or between item-groups (about 16%); most of it is residual. But, as noted above, the inter-class correlation between items could still be negative. First let’s get our data in a wider format:
# not elegant but does the trick
# (assumes siblings appear in the same order within each item subset)
dat2<-cbind(subset(dat,item==1),subset(dat,item==2)[,1],subset(dat,item==3)[,1])
names(dat2)<-c("item1","family","sibling","item","item2","item3")
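If you’d rather not rely on the row ordering, a sketch with base R’s reshape(), assuming dat has the columns yijk, family, sibling, and item as above:
# reshape to wide by matching on family and sibling, not row order
dat2w<-reshape(dat, idvar=c("family","sibling"), timevar="item", direction="wide")
# rename yijk.1, yijk.2, yijk.3 to item1, item2, item3
names(dat2w)<-sub("yijk.", "item", names(dat2w), fixed=TRUE)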
Now we can model the correlation between, for example, item1 and item3 with a random intercept for family as before. But first, it’s perhaps worth recalling that for a simple linear regression, the square root of the model’s r-squared equals the absolute value of the inter-class correlation coefficient (Pearson’s r) between item1 and item3; here they match exactly because the correlation is positive.
# a simple linear regression
mod2<-lm(item1~item3,data=dat2)
# extract pearson's r
sqrt(summary(mod2)$r.squared)
## [1] 0.6819125
# check this
cor(dat2$item1,dat2$item3)
## [1] 0.6819125
# yep, equal
# now, add random intercept to the model
mod3<-lmer(item1 ~ item3 + (1|family), data=dat2)
summary(mod3)
## Linear mixed model fit by REML ['lmerMod']
## Formula: item1 ~ item3 + (1 | family)
## Data: dat2
##
## REML criterion at convergence: 1188.8
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -2.3148 -0.5348 -0.0136 0.5724 3.2589
##
## Random effects:
## Groups Name Variance Std.Dev.
## family (Intercept) 0.686 0.8283
## Residual 1.519 1.2323
## Number of obs: 336, groups: family, 100
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) -0.07777 0.15277 -0.509
## item3 0.52337 0.02775 18.863
##
## Correlation of Fixed Effects:
## (Intr)
## item3 -0.699
The relationship between item1 and item3 is positive. But, just to check that we can get a negative correlation here, let’s manipulate our data:
# just going to multiply one column by -1
# to force this cor to be negative
dat2$neg.item3<-dat2$item3*-1
cor(dat2$item1, dat2$neg.item3)
## [1] -0.6819125
# now we have a negative relationship
# replace item3 with this manipulated value
mod4<-lmer(item1 ~ neg.item3 + (1|family), data=dat2)
summary(mod4)
## Linear mixed model fit by REML ['lmerMod']
## Formula: item1 ~ neg.item3 + (1 | family)
## Data: dat2
##
## REML criterion at convergence: 1188.8
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -2.3148 -0.5348 -0.0136 0.5724 3.2589
##
## Random effects:
## Groups Name Variance Std.Dev.
## family (Intercept) 0.686 0.8283
## Residual 1.519 1.2323
## Number of obs: 336, groups: family, 100
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) -0.07777 0.15277 -0.509
## neg.item3 -0.52337 0.02775 -18.863
##
## Correlation of Fixed Effects:
## (Intr)
## neg.item3 0.699
So yes, the relationship between items can be negative. But if we look at the proportion of variance that’s between families in this relationship, i.e., ICC(family), that number will still be positive. As before:
temp2<-as.data.frame(VarCorr(mod4))$vcov
# family variance over (family + residual) variance
temp2[1]/(temp2[1]+temp2[2])
## [1] 0.311
So for the relationship between item1 and item3, about 31% of the variance is due to variance between families. And, we’ve still allowed for there to be a negative correlation between items.
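If you’d rather not assemble these ratios by hand, the 'performance' package computes this kind of variance-partitioning ICC straight from a merMod fit; a sketch, assuming performance is installed:
# cross-check the by-hand calculations above
library(performance)
icc(mod, by_group=TRUE) # separate ICCs for family and item
icc(mod4)               # family ICC, adjusting for the neg.item3 fixed effect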