Thanks in advance.
I am new to mixed models and have several questions about a multilevel binomial model (lme4's glmer) whose response is a proportion in [0, 1] measured in three time periods.
My data (without controls):
- The dependent variable is the share of votes for a given party in a given city-year that were cast for women (Female / Party.Total in the model).
- "year": the election years 2008, 2012, and 2016, as a factor variable.
- "impeachment": a factor variable that groups the parties into those that supported impeachment (PSDB and the right) and those that opposed impeachment (PT and the left) in 2016.
- "pct_bolsonaro": a time-constant, continuous city-level variable that shows (latent) support for an anti-system candidate in a subsequent election.
- My levels are the city ("ibge7") and the party list ("party.list"); each party.list groups one to three of the yearly city-level party lists.
I'm interested in how voting for female candidates in local city council elections changed in anti-system districts in the election year 2016, which occurred during the impeachment process of the female president.
Here is a Dropbox link to the data
My understanding is that I should nest the party.list level inside the city level. This, along with the fixed effects for year (of substantive interest), seems the most appropriate way to handle the repeated measures. Does that sound correct?
My problem is that I get a nesting error when I use the explicit nesting syntax (1 | ibge7 / party.list): "couldn't evaluate grouping factor ibge7:party.list...". This post suggests I can nest the party.list level using separate random intercepts (as I did in the model below), as long as my party.list variable is unique within cities, a condition my data meet. However, this tutorial suggests using explicit nesting. Could the nesting error be related to the unbalanced nature of the yearly party.lists? Are separate random intercepts a valid nesting strategy, given that party.lists are unique to cities? And do readers agree that I need to nest party.lists in cities, given my research question above?
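To make the nesting question concrete: lme4 expands (1 | a/b) into (1 | a) + (1 | a:b), so the two specifications describe the same grouping whenever the a:b interaction is one-to-one with b. The sketch below checks that on a small, made-up data frame that only mimics the structure of my data (the rows and the helper column city_list are hypothetical); it does not reproduce or explain the error itself.

```r
# Hypothetical toy data mirroring the structure described in the post
toy <- data.frame(
  year          = factor(c(2008, 2008, 2012, 2012, 2008, 2012)),
  ibge7         = c(1100015, 1100015, 1100015, 1100015, 1100023, 1100023),
  SIGLA_PARTIDO = c("DEM", "PT", "DEM", "PP", "DEM", "DEM")
)
toy$party.list <- factor(paste0(toy$ibge7, toy$SIGLA_PARTIDO))

# Condition for "implicit" nesting: every party.list level sits in exactly one city
cities_per_list <- tapply(toy$ibge7, toy$party.list,
                          function(x) length(unique(x)))
all(cities_per_list == 1)

# The grouping factor that (1 | ibge7/party.list) would add, built by hand
# (city_list is a hypothetical column name used only for this check)
toy$city_list <- interaction(factor(toy$ibge7), toy$party.list, drop = TRUE)

# If city_list and party.list are one-to-one, (1 | ibge7) + (1 | party.list)
# and (1 | ibge7/party.list) define the same random-effects grouping
tab <- table(toy$party.list, toy$city_list)
one_to_one <- all(rowSums(tab > 0) == 1) && all(colSums(tab > 0) == 1)
one_to_one
```

Running the same two checks on the real ver data frame would show whether the uniqueness condition holds there as well.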
I am also getting a convergence error (see below). Could this be related to the nesting issue?
Note: I weight the response variable following this advice
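For readers unfamiliar with this weighting: with a binomial family, a proportion response combined with weights equal to the number of trials is the same model as a two-column successes/failures response. The sketch below illustrates this with base R's glm on simulated data (the data are made up; only the column names Female and Party.Total come from my model). My understanding is that glmer treats binomial responses the same way, but the example only demonstrates the glm case.

```r
set.seed(42)

# Simulated stand-in data: x is a hypothetical covariate
toy <- data.frame(x = runif(50), Party.Total = rpois(50, 100) + 1)
toy$Female <- rbinom(50, toy$Party.Total, plogis(-2 + toy$x))

# Form 1: proportion response with the number of trials as weights
fit_prop <- glm(Female / Party.Total ~ x, weights = Party.Total,
                family = binomial, data = toy)

# Form 2: two-column response of successes and failures
fit_cbind <- glm(cbind(Female, Party.Total - Female) ~ x,
                 family = binomial, data = toy)

# The two forms give the same coefficient estimates
all.equal(coef(fit_prop), coef(fit_cbind))
```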
# Model
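library(ggplot2)
library(lme4)
library(ggeffects)
library(see)
library(dplyr)  # for %>% and select(), used in the edit further down

ver <- readRDS("data/gender_democracy.rds")

# Binomial GLMM on the proportion Female / Party.Total, weighted by Party.Total,
# with separate random intercepts for city (ibge7) and party list (party.list)
conditional.intercepts <- glmer(
  Female / Party.Total ~ pct_bolsonaro * year * impeachment +
    (1 | ibge7) + (1 | party.list),
  weights = Party.Total, family = binomial, data = ver,
  control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 10000000)),
  nAGQ = 1
)
summary(conditional.intercepts)

# Predictions by year, impeachment group, and pct_bolsonaro
plot(ggpredict(conditional.intercepts, terms = c("year", "impeachment", "pct_bolsonaro")))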
Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) ['glmerMod']
 Family: binomial  ( logit )
Formula: Female/Party.Total ~ pct_bolsonaro * year * impeachment + (1 | ibge7) + (1 | party.list)
   Data: ver
Weights: Party.Total
Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 1e+07))

       AIC       BIC    logLik  deviance  df.resid
  32142242  32142570 -16071089  32142178    207439

Scaled residuals:
     Min       1Q   Median       3Q      Max
-224.681   -4.557   -0.408    3.519  277.110

Random effects:
 Groups     Name        Variance Std.Dev.
 party.list (Intercept) 11.5619  3.4003
 ibge7      (Intercept)  0.2379  0.4878
Number of obs: 207471, groups:  party.list, 98462; ibge7, 5568

Fixed effects:
                                                        Estimate Std. Error  z value Pr(>|z|)
(Intercept)                                           -3.0965054  0.0345264  -89.685  < 2e-16 ***
pct_bolsonaro                                          0.6874152  0.1096435    6.270 3.62e-10 ***
year2012                                               0.0664325  0.0014046   47.296  < 2e-16 ***
year2016                                               0.1606648  0.0012625  127.260  < 2e-16 ***
impeachmentNo Federal Deputies                        -0.4166597  0.0454099   -9.176  < 2e-16 ***
impeachmentPSDB                                        0.3412179  0.0591027    5.773 7.77e-09 ***
impeachmentPT                                          0.5758914  0.0601784    9.570  < 2e-16 ***
impeachmentVoted to Impeach                           -0.0492452  0.0366724   -1.343  0.17932
pct_bolsonaro:year2012                                 0.2002510  0.0082936   24.145  < 2e-16 ***
pct_bolsonaro:year2016                                 0.7151838  0.0074416   96.107  < 2e-16 ***
pct_bolsonaro:impeachmentNo Federal Deputies           1.3278641  0.1978885    6.710 1.94e-11 ***
pct_bolsonaro:impeachmentPSDB                          1.5146030  0.3291996    4.601 4.21e-06 ***
pct_bolsonaro:impeachmentPT                            0.7387725  0.2374355    3.111  0.00186 **
pct_bolsonaro:impeachmentVoted to Impeach              1.0411148  0.1178119    8.837  < 2e-16 ***
year2012:impeachmentNo Federal Deputies                0.2530214  0.0022913  110.426  < 2e-16 ***
year2016:impeachmentNo Federal Deputies                0.1681143  0.0021400   78.560  < 2e-16 ***
year2012:impeachmentPSDB                              -0.0108824  0.0020784   -5.236 1.64e-07 ***
year2016:impeachmentPSDB                               0.0006234  0.0018372    0.339  0.73437
year2012:impeachmentPT                                 0.0307188  0.0019116   16.070  < 2e-16 ***
year2016:impeachmentPT                                -0.0051848  0.0018075   -2.868  0.00412 **
year2012:impeachmentVoted to Impeach                   0.0278516  0.0015123   18.417  < 2e-16 ***
year2016:impeachmentVoted to Impeach                  -0.0104830  0.0013640   -7.685 1.53e-14 ***
pct_bolsonaro:year2012:impeachmentNo Federal Deputies -0.6279720  0.0137620  -45.631  < 2e-16 ***
pct_bolsonaro:year2016:impeachmentNo Federal Deputies -1.1303625  0.0129276  -87.438  < 2e-16 ***
pct_bolsonaro:year2012:impeachmentPSDB                -0.5342600  0.0117564  -45.444  < 2e-16 ***
pct_bolsonaro:year2016:impeachmentPSDB                -1.0927577  0.0103405 -105.677  < 2e-16 ***
pct_bolsonaro:year2012:impeachmentPT                  -0.2603739  0.0111271  -23.400  < 2e-16 ***
pct_bolsonaro:year2016:impeachmentPT                  -0.1764549  0.0106061  -16.637  < 2e-16 ***
pct_bolsonaro:year2012:impeachmentVoted to Impeach    -0.1913510  0.0088472  -21.628  < 2e-16 ***
pct_bolsonaro:year2016:impeachmentVoted to Impeach    -0.6798329  0.0079621  -85.383  < 2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Correlation matrix not shown by default, as p = 30 > 12.
Use print(x, correlation=TRUE)  or
    vcov(x)        if you need it

convergence code: 0
Model failed to converge with max|grad| = 7.71506 (tol = 0.001, component 1)
Model is nearly unidentifiable: very large eigenvalue
 - Rescale variables?
Model is nearly unidentifiable: large eigenvalue ratio
 - Rescale variables?
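Since the warnings suggest rescaling, here is a minimal sketch of two checks I understand to be standard from lme4's convergence help page: standardising the continuous predictor, and restarting the fit from the previous estimates. It runs on simulated data; the names city, x, x_z, total, and female are hypothetical stand-ins for ibge7, pct_bolsonaro, its standardised version, Party.Total, and Female. I do not know whether either step would resolve the warnings on my real data.

```r
library(lme4)

set.seed(1)
n_city <- 40

# Simulated stand-in data: one row per city-year
toy <- expand.grid(city = factor(seq_len(n_city)),
                   year = factor(c(2008, 2012, 2016)))
city_x    <- runif(n_city, 0.1, 0.8)   # constant within city, like pct_bolsonaro
toy$x     <- city_x[as.integer(toy$city)]
toy$total <- rpois(nrow(toy), 200) + 1
u   <- rnorm(n_city, sd = 0.5)         # city random intercepts
eta <- -2 + 0.5 * toy$x + 0.2 * (toy$year == "2016") + u[as.integer(toy$city)]
toy$female <- rbinom(nrow(toy), toy$total, plogis(eta))

# Step 1: centre and scale the continuous predictor
toy$x_z <- as.numeric(scale(toy$x))

form <- cbind(female, total - female) ~ x_z * year + (1 | city)
ctrl <- glmerControl(optimizer = "bobyqa")
fit  <- glmer(form, family = binomial, data = toy, control = ctrl)

# Step 2: restart from the fitted values to see whether the estimates move
start <- getME(fit, c("theta", "fixef"))
fit2  <- glmer(form, family = binomial, data = toy, control = ctrl,
               start = start)

# Similar log-likelihoods and coefficients suggest a stable optimum
c(first = as.numeric(logLik(fit)), restart = as.numeric(logLik(fit2)))
cbind(first = fixef(fit), restart = fixef(fit2))
```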
Edit: The following shows how I nested party.list within municipalities (ibge7) while avoiding the error I described above. The level variable "party.list" is unique to the municipality but occurs up to three times per municipality (once per year). You can see that in the first municipality, "1100015", the party list "1100015DEM" occurs in both 2008 and 2012, whereas "1100015PPS" occurs in 2008 but not in 2012. SIGLA_PARTIDO is just the name of the political party, so I combined it with ibge7 to create a unique municipality-list factor (party.list):
> head(ver %>% select(year, ibge7, SIGLA_PARTIDO, party.list), n=20)
# A tibble: 20 x 4
   year    ibge7 SIGLA_PARTIDO party.list
   <fct>   <dbl> <fct>         <fct>
 1 2008  1100015 DEM           1100015DEM
 2 2008  1100015 PC do B       1100015PC do B
 3 2008  1100015 PDT           1100015PDT
 4 2008  1100015 PMDB          1100015PMDB
 5 2008  1100015 PPS           1100015PPS
 6 2008  1100015 PR            1100015PR
 7 2008  1100015 PSB           1100015PSB
 8 2008  1100015 PSDB          1100015PSDB
 9 2008  1100015 PSDC          1100015PSDC
10 2008  1100015 PSL           1100015PSL
11 2008  1100015 PT            1100015PT
12 2008  1100015 PTB           1100015PTB
13 2008  1100015 PTN           1100015PTN
14 2008  1100015 PV            1100015PV
15 2012  1100015 DEM           1100015DEM
16 2012  1100015 PC do B       1100015PC do B
17 2012  1100015 PDT           1100015PDT
18 2012  1100015 PMDB          1100015PMDB
19 2012  1100015 PP            1100015PP
20 2012  1100015 PR            1100015PR
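To show what I mean by "unbalanced", the sketch below builds the party.list key the way I described (pasting ibge7 and SIGLA_PARTIDO together) and counts how many election years each list appears in. The six rows are a made-up subset modelled on the first municipality, not my actual data.

```r
# Hypothetical subset modelled on municipality 1100015
toy <- data.frame(
  year          = factor(c(2008, 2008, 2008, 2012, 2012, 2012)),
  ibge7         = 1100015,
  SIGLA_PARTIDO = c("DEM", "PPS", "PR", "DEM", "PP", "PR")
)

# Municipality-list key: city code pasted to the party name
toy$party.list <- factor(paste0(toy$ibge7, toy$SIGLA_PARTIDO))

# Number of election years in which each list appears (1 to 3 in the real data)
years_per_list <- table(toy$party.list)
years_per_list

# Distribution of that count: how many lists appear once, twice, and so on
balance <- table(as.vector(years_per_list))
balance

# Presence/absence of each list by year
xtabs(~ party.list + year, data = toy)
```

Here DEM and PR appear in both years while PPS and PP appear in one year each, which is the same kind of imbalance as in the printout above.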