
I have a data frame called "cleaned", which consists of about 300,000 rows and 13 variables. Except for the dependent variable, all variables are categorical and have multiple levels ($\geq2$). The dependent variable is numeric and takes values ranging from -1,574 to 3,296, mostly positive. Here is a summary of the dependent variable and the first rows of the data frame:

> summary(cleaned$SumOf1st.Yr.Cash)
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
-1574.00    37.37   101.50   155.60   204.60  3296.00 

> dput(head(cleaned))
structure(list(Submit.Qtr = structure(c(2L, 2L, 2L, 2L, 2L, 2L
), .Label = c("1Q11", "2Q11", "3Q11", "4Q11"), class = "factor"), 
    SUBMIT_CAL_MONTH = c(201104L, 201104L, 201104L, 201104L, 
    201104L, 201104L), SumOf1st.Yr.Cash = c(-221.81, 127.86, 
    662.09, 77.24, 370.4, 176), CARRIER_NAME = structure(c(56L, 
    116L, 4L, 116L, 82L, 114L), .Label = c("AARP-branded plans, insured by Aetna", 
    "Aetna", "Aetna Life Insurance Company", "Altius Health Plans", 
    "Altius One", "American Family Life Assurance Company of Columbus (Aflac)", 
    "American Family Life Assurance Company of New York (Aflac New York)", 
    "AmeriHealth - New Jersey", "Ameritas Life Insurance Corp.", 
    "Anthem BCBS. Serving residents of Indiana", "Anthem BCBS. Serving residents of Kentucky", 
    "Anthem BCBS. Serving residents of Ohio", "Anthem Blue Cross", 
    "Anthem Blue Cross and Blue Shield", "Anthem Blue Cross and Blue Shield of CT", 
    "Anthem Blue Cross and Blue Shield of NH", "Anthem Blue Cross and Blue Shield of VA", 
    "Anthem Blue Cross Blue Shield", "Anthem Blue Cross Blue Shield Indiana", 
    "Anthem Blue Cross Blue Shield Kentucky", "Anthem Blue Cross Blue Shield of Connecticut", 
    "Anthem Blue Cross Blue Shield of Missouri", "Anthem Blue Cross Blue Shield of Wisconsin", 
    "Anthem Blue Cross Blue Shield Ohio", "Anthem BlueCross BlueShield", 
    "Anthem Health Plans of Kentucky  Inc.", "Anthem Health Plans of New Hampshire  Inc.", 
    "Argus", "Arise Health Plan", "Arkansas Blue Cross and Blue Shield", 
    "Assurant", "Assurant Employee Benefits", "Assurant Health", 
    "Asuris Northwest Health", "Avera Health Plans", "AvMed Health Plans", 
    "Bay Dental", "BCBS of GA", "Blue Cross and Blue Shield of GA", 
    "Blue Cross and Blue Shield of Georgia", "Blue Cross and Blue Shield of Illinois", 
    "Blue Cross and Blue Shield of Kansas City", "Blue Cross and Blue Shield of Minnesota", 
    "Blue Cross and Blue Shield of South Carolina", "Blue Cross and Blue Shield of Texas", 
    "Blue Cross Blue Shield", "Blue Cross Blue Shield of Arizona", 
    "Blue Cross Blue Shield of Delaware", "Blue Cross Blue Shield of Florida", 
    "Blue Cross Blue Shield of Georgia", "Blue Cross Blue Shield of Michigan", 
    "Blue Cross Blue Shield of North Dakota", "Blue Cross of Idaho", 
    "Blue Cross of Northeastern Pennsylvania through its subsidiary First Priority Life Insurance Company", 
    "Blue Shield of California", "BlueCross BlueShield of Louisiana", 
    "BlueCross BlueShield of Montana", "BlueCross BlueShield of Nebraska", 
    "BlueCross BlueShield of Tennessee", "BlueCross BlueShield of Wyoming", 
    "Capital Blue Cross", "Care Improvement Plus", "CareFirst BlueCross BlueShield", 
    "Celtic Ins. Co.", "CeltiCare Healthplan of MA, Inc.", "Cigna", 
    "Clear One Health Plans", "ConnectiCare Inc.", "Coventry", 
    "Coventry Health and Life Insurance Co. FL", "Coventry Health and Life Insurance Company", 
    "Coventry Health Care of Delaware, Inc", "Coventry Health Care of Georgia, Inc.", 
    "Coventry Health Care of Illinois, Inc.", "Coventry Health Care of Iowa, Inc.", 
    "Coventry Health Care of Kansas Inc.", "Coventry Health Care of Louisiana, Inc.", 
    "Coventry Health Care of Missouri, Inc.", "Coventry Health Care of Oklahoma Inc.", 
    "Coventry Health Care of the Carolinas, Inc.", "Coventry Health Plan of Florida, Inc.", 
    "Dean Health Plan, Inc.", "Delta Dental Insurance Company (Delta Dental)", 
    "Delta Dental of California", "Delta Dental of Colorado", 
    "Delta Dental of Iowa", "Delta Dental of Minnesota", "Delta Dental of North Carolina", 
    "Dentegra Insurance Company", "Dominion Dental Services, Inc", 
    "Easy Choice Health Plan of New York", "EmblemHealth", "Empire", 
    "Empire BlueCross", "Evercare by UnitedHealthcare", "Everest Dental Plan", 
    "Fallon Community Health Plan", "Geisinger Choice", "Generic Medicare Carrier", 
    "Group Health", "HCC Life Insurance Company", "HCC Medical Insurance Services", 
    "Health Alliance Plan", "Health Insurance Innovations", "Health Net", 
    "Health Net of Arizona", "Health Net of Oregon", "Health Plan of Nevada", 
    "HealthAmerica", "HealthPartners", "HealthPlus Insurance Company", 
    "Highmark Blue Cross Blue Shield Delaware", "Highmark Blue Cross Blue Shield West Virginia", 
    "Horizon Blue Cross Blue Shield of New Jersey", "Humana", 
    "Humana CompBenefits", "Humana Health Benefit Plan of Louisiana  Inc.", 
    "Humana Health Insurance Company of Florida", "Humana Insurance Company of Kentucky", 
    "IHC Group", "IMG Global", "Independence Blue Cross", "Kaiser Foundation Health Plan of the NW", 
    "Kaiser Mid-Atlantic", "Kaiser Permanente CO", "Kaiser Permanente GA", 
    "Kaiser Permanente of CA", "Kaiser Permanente of HI", "Kaiser Permanente of Ohio", 
    "KPS Health Plans", "LifeWise Health Plan of Oregon", "LifeWise Health Plan of Washington", 
    "Lovelace Health Plans", "Madison National Life Insurance Company", 
    "Medica", "Medica of Minnesota", "Medical Mutual", "Mercy Health Plans", 
    "Mutual of Omaha", "Mutual Of Omaha", "Mutual of Omaha Insurance Company", 
    "MVP", "My Health Alliance", "Nationwide Life Insurance Company", 
    "Next Generation Insurance Group", "ODS Alaska", "ODS Health Plan, Inc.", 
    "Optima Health Insurance Company", "Oxford NJ", "Oxford NY", 
    "PacifiCare", "PacificSource Health Plans", "PacificSource Health Plans of Idaho", 
    "Physicians Health Plan of Northern Indiana, Inc.", "Physicians Plus", 
    "PreferredOne Insurance Company", "Premera Blue Cross", "Premera Blue Cross Blue Shield of Alaska", 
    "Presbyterian", "Providence Health Plan", "QCA Health Plan Inc", 
    "Regence Blue Cross Blue Shield of Oregon", "Regence Blue Cross Blue Shield of Utah", 
    "Regence Blue Shield of Idaho", "Regence BlueCross BlueShield of Oregon", 
    "Regence BlueCross BlueShield of Utah", "Regence BlueShield", 
    "Regence BlueShield of Idaho", "Regence Life and Health", 
    "Regence Life and Health Insurance Company", "RegenceBCBS", 
    "RegenceBS", "Rocky Mountain Health Plans", "Scott & White Health Plan", 
    "SecureHorizons by UnitedHealthcare", "Security Health Plan", 
    "Security Life Insurance Company of America", "SelectHealth", 
    "Seven Corners", "Sierra Health and Life", "Standard Security Life", 
    "Standard Security Life Insurance Company", "SummaCare Inc of Ohio", 
    "Symetra Life Insurance Company", "Total Dental Administrators Health Plan, Inc.", 
    "UniCare", "United Concordia Dental", "United of Omaha", 
    "United World", "UnitedHealthcare", "UnitedHealthcare Community Plan", 
    "UnitedHealthOne", "Unity Health Insurance", "Vision Plan of America", 
    "VSP", "WellCare", "WellCare Health Plans of New Jersey, Inc.", 
    "WellCare of Florida, Inc.", "WellCare of New York, Inc.", 
    "WellCare of Ohio, Inc.", "WellCare of Texas, Inc.", "WellCare Prescription Insurance, Inc.", 
    "Wellmark Blue Cross and Blue Shield of Iowa", "Wellmark Blue Cross and Blue Shield of South Dakota", 
    "WellPath Select, Inc.", "WINhealth Partners", "WPS", "WPS Health Insurance"
    ), class = "factor"), GENDER = structure(c(2L, 2L, 1L, 1L, 
    1L, 1L), .Label = c("F", "M", "U"), class = "factor"), FAMILY_TYPE = structure(c(1L, 
    1L, 1L, 1L, 2L, 2L), .Label = c("FAMILY", "INDIVIDUAL", "Unknown"
    ), class = "factor"), EXECUTIVE_AGE_GROUP = structure(c(5L, 
    5L, 4L, 3L, 3L, 6L), .Label = c("0-18 (>=0 AND <19)", "19-25 (>=19 AND <26)", 
    "26-29 (>=26 AND <30)", "30-39 (>=30 AND <40)", "40-49 (>=40 AND <50)", 
    "50-64 (>=50 AND <65)", "65+", "Unknown"), class = "factor"), 
    MARITAL_STATUS = structure(c(4L, 1L, 8L, 1L, 7L, 1L), .Label = c("", 
    "D", "L", "M", "O", "P", "S", "W"), class = "factor"), SELECTED_RIDERS = structure(c(11L, 
    1L, 1L, 10L, 1L, 10L), .Label = c("", "/Dental_Vision/", 
    "/Dental/", "/Dental/Other/", "/Life/", "/Life/Dental_Vision/", 
    "/Life/Dental/", "/Life/Dental/Other/", "/Life/Other/", "/None/", 
    "/Other/", "/Vision/", "/Vision/Dental/", "/Vision/Dental/Other/", 
    "/Vision/Life/", "/Vision/Life/Dental/", "/Vision/Life/Dental/Other/", 
    "/Vision/Life/Other/", "/Vision/Other/"), class = "factor"), 
    STATE_ABBR = structure(c(19L, 44L, 45L, 10L, 49L, 32L), .Label = c("AK", 
    "AL", "AR", "AZ", "CA", "CO", "CT", "DC", "DE", "FL", "GA", 
    "HI", "IA", "ID", "IL", "IN", "KS", "KY", "LA", "MA", "MD", 
    "ME", "MI", "MN", "MO", "MS", "MT", "NC", "ND", "NE", "NH", 
    "NJ", "NM", "NV", "NY", "OH", "OK", "OR", "PA", "RI", "SC", 
    "SD", "TN", "TX", "UT", "VA", "VT", "WA", "WI", "WV", "WY"
    ), class = "factor"), SumOfAPPROVED.MEMBERS = c(7L, 6L, 4L, 
    2L, 1L, 1L), PRODUCTLINE_TYPE = structure(c(4L, 3L, 4L, 3L, 
    4L, 4L), .Label = c("ACC", "CAN", "DT", "IFP", "MA", "MAPD", 
    "MD", "MS", "PDC", "ST", "STU", "TRV", "VSP"), class = "factor"), 
    CHANNEL = structure(c(1L, 1L, 2L, 3L, 2L, 3L), .Label = c("Direct", 
    "Online Advertising", "Performance Partners"), class = "factor")), .Names = c("Submit.Qtr", 
"SUBMIT_CAL_MONTH", "SumOf1st.Yr.Cash", "CARRIER_NAME", "GENDER", 
"FAMILY_TYPE", "EXECUTIVE_AGE_GROUP", "MARITAL_STATUS", "SELECTED_RIDERS", 
"STATE_ABBR", "SumOfAPPROVED.MEMBERS", "PRODUCTLINE_TYPE", "CHANNEL"
), row.names = c(2L, 4L, 5L, 6L, 7L, 8L), class = "data.frame")

I was trying to fit a mixed-effects model to these data with `lme`:

lme(SumOf1st.Yr.Cash ~ GENDER + FAMILY_TYPE +
      EXECUTIVE_AGE_GROUP + MARITAL_STATUS + SELECTED_RIDERS +
      PRODUCTLINE_TYPE + CHANNEL,
    data = cleaned, random = ~ 1 | STATE_ABBR)

But I got an error message that I couldn't understand.

Error in MEEM(object, conLin, control$niterEM) : 
  Singularity in backsolve at level 0, block 1

The model runs when the formula contains only one predictor, but with two predictors I get the same error message:

lme(SumOf1st.Yr.Cash ~ GENDER + EXECUTIVE_AGE_GROUP,
    data = cleaned, random = ~ 1 | STATE_ABBR)

Error in MEEM(object, conLin, control$niterEM) : 
  Singularity in backsolve at level 0, block 1

lme(SumOf1st.Yr.Cash ~ GENDER,
    data = cleaned, random = ~ 1 | STATE_ABBR)

Linear mixed-effects model fit by REML
  Data: cleaned    
  Log-restricted-likelihood: -1877607   
  Fixed: SumOf1st.Yr.Cash ~ GENDER 

(Intercept)     GENDERM     GENDERU     
  146.53490    12.27621   494.77983 

Random effects:
Formula: ~1 | STATE_ABBR

        (Intercept) Residual  
StdDev:    39.21808 177.8023

Number of Observations: 284485
Number of Groups: 51 
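
One thing that may be worth checking, offered as a hedged sketch rather than a diagnosis of the actual data: this `lme` error is often reported when the fixed-effects design matrix is rank-deficient, for instance if the `U` level of `GENDER` occurs only together with the `Unknown` level of `EXECUTIVE_AGE_GROUP`. The toy data frame below is invented to reproduce that pattern. Only the column names come from the question, and whether `cleaned` actually has this structure is an assumption that would need to be verified.

```r
# Toy data (hypothetical values): GENDER "U" occurs only with age group
# "Unknown", so the two corresponding dummy columns are identical.
toy <- data.frame(
  GENDER = factor(c("F", "M", "F", "M", "U", "U", "F", "M")),
  EXECUTIVE_AGE_GROUP = factor(c("26-29", "30-39", "30-39", "26-29",
                                 "Unknown", "Unknown", "26-29", "30-39"))
)

# A cross-tabulation shows whether a level of one factor is confined
# to a single level of the other.
print(table(toy$GENDER, toy$EXECUTIVE_AGE_GROUP))

# Build the fixed-effects design matrix and compare its rank
# with its number of columns.
X <- model.matrix(~ GENDER + EXECUTIVE_AGE_GROUP, data = toy)
design_rank <- qr(X)$rank
rank_deficient <- design_rank < ncol(X)
print(c(columns = ncol(X), rank = design_rank))
```

On the real data the same check would use `data = cleaned` and the full right-hand side of the model formula. If the rank is smaller than the number of columns, merging or dropping the overlapping levels is one possible way forward.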
Vokram
Fiona.Ding
  • Perhaps try the `lmer()` function in package **lme4**, but singularity errors usually point to problems with numerical stability. You could try centring the data and even reordering the rows in your data frame. Another cause is trying to estimate too complex a model from the data. Could `EXECUTIVE_AGE_GROUP` be confounded with the random effect `STATE_ABBR` (or vice versa)? – Gavin Simpson Jun 13 '13 at 17:37
  • Thanks for your response, Gavin. I don't think EXECUTIVE_AGE_GROUP is confounded with the random effect STATE_ABBR. I'm trying out `lmer` but don't really understand how to structure the arguments. Do you know of any resource where I can find some examples? – Fiona.Ding Jun 13 '13 at 23:04
  • @Fiona.Ding Try the following link: http://stats.stackexchange.com/questions/13166/rs-lmer-cheat-sheet I think it will help you on how to structure your arguments for `lmer`. – usεr11852 Jun 18 '13 at 22:10
  • For what it's worth, I voted to leave this open since there may be actual statistical issues here (e.g., numerical stability, confounding) that a good answer could address. – Matt Krause Nov 03 '16 at 20:09
  • Could it be that you have too many missing values in your dataset? That error is sometimes associated with missing data. You can check with, for example, `which(is.na(data_name))`. – Mireia Llorente Oct 16 '18 at 11:51
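
Two of the suggestions in these comments can be sketched as follows. The data frame `toy` and all of its values are invented for illustration, with only the column names taken from the question. The `lmer` call is skipped unless the lme4 package is installed. This shows the syntax only and is not a fix confirmed for the original data.

```r
# Hypothetical stand-in for `cleaned`; all values are invented.
set.seed(1)
toy <- data.frame(
  SumOf1st.Yr.Cash = round(rnorm(60, mean = 150, sd = 180), 2),
  GENDER = factor(rep(c("F", "M"), times = 30)),
  STATE_ABBR = factor(rep(c("CA", "NY", "TX"), each = 20))
)
toy$SumOf1st.Yr.Cash[c(3, 17)] <- NA  # inject two missing values

# Count missing values per column instead of listing their positions.
na_counts <- colSums(is.na(toy))
print(na_counts)

# In lmer() the random effect goes inside the formula: `(1 | STATE_ABBR)`
# plays the role of `random = ~ 1 | STATE_ABBR` in lme().
if (requireNamespace("lme4", quietly = TRUE)) {
  fit <- lme4::lmer(SumOf1st.Yr.Cash ~ GENDER + (1 | STATE_ABBR),
                    data = toy, na.action = na.omit)
  print(lme4::fixef(fit))
}
```

With only three groups and random values, `lmer` may report a singular fit on this toy example. That message concerns the estimated variance components and is a different issue from the `lme` error in the question.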
