I apologize in advance if my questions seem incredibly dull. I'm trying to teach myself Stata before I get to grad school, as someone whose brain is very much not made for handling statistics.
I'm trying to understand how to perform a logistic regression in Stata, using this data set from Pew: https://www.pewresearch.org/social-trends/dataset/american-trends-panel-wave-68/.
I thought it might be interesting to see if mask wearing can be explained by partisanship, knowledge of health risk (measured by living in a metropolitan area, age, educational level, and how closely respondents follow the news), and income (because I'm thinking this would affect access to resources/masks?).
I recoded my variables of interest and ran the models. The pseudo R2 value seems to imply that explanatory power increases with each model, but when I check the Hosmer-Lemeshow goodness-of-fit test, my model is not a good fit at all. Any tips? Am I setting up this model correctly?
Thanks in advance!
Here is my code:
import spss using "C:\Users\NAME\Desktop\ATP W68.sav"
recode F_AGECAT (1/3 = 0) (4 = 1), gen(age)
label variable age "Age"
label define abracket 0 "18-64" 1 "65+"
label value age abracket
replace age = . if age == 99
recode F_INCOME_RECODE (3 = 0) (1/2 = 1), gen(income)
label variable income "Income Level"
label define irange 0 "<30,000" 1 "30,000+"
label value income irange
replace income = . if income == 99
recode COVIDFOL_W68 (3/4 = 0) (1/2 = 1), gen(news)
label variable news "News Following"
label define closely 0 "Not too closely/Not at all" 1 "Very/Fairly closely"
label value news closely
replace news = . if news == 99
recode F_EDUCCAT (3 = 0) (1/2 = 1), gen(edu)
label variable edu "College Degree?/Educational Attainment"
label define elevel 0 "HS or less" 1 "College+"
label value edu elevel
replace edu = . if edu == 99
recode COVIDMASK1_W68 (3/4 = 0) (1/2 = 1), gen(mask)
label variable mask "Mask Wearing Behaviour"
label define often 0 "Hardly/Never" 1 "All/Some of the Time"
label value mask often
drop if mask == 5
drop if mask == 99
recode F_PARTYSUM_FINAL (1 = 0) (2 = 1), gen(party)
label variable party "Party"
label define demrep 0 "Republican" 1 "Democrat"
label value party demrep
replace party = . if party == 9
recode F_METRO (1 = 1) (2 = 0), gen(metro)
label variable metro "Environment"
label define area 0 "Non-Metropolitan" 1 "Metropolitan"
label value metro area
tab1 metro party mask edu news income age
tabulate mask party, column V
tabulate mask metro, column V
tabulate mask edu, column V
tabulate mask news, column V
tabulate mask income, column V
tabulate mask age, column V
*logit model with risk factors and knowledge
logit mask metro age edu news
logit mask metro age edu news, or
*logit model with risk factors and knowledge, and income
logit mask metro age edu news income
logit mask metro age edu news income, or
*logit model with risk factors and knowledge (edu not included here), income, and party
logit mask metro age news income party
logit mask metro age news income party, or
estat classification
lroc
estat gof, group(10) table
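
For reference, here is a minimal sketch of one way to compare the three models like for like. logit drops any respondent with a missing value on a variable in the model, so models that add income or party may be fitted on fewer respondents than the first one, and their pseudo R2 values are then not based on the same observations. The sketch below restricts every model to a common sample before comparing them. It assumes the recoded variables from the code above already exist; the names touse, m1, m2, and m3 are made up for this example, and including edu in the third model is an assumption (the third model above leaves it out).

```stata
* Sketch only: assumes mask, metro, age, edu, news, income, and party
* have already been created by the recode steps above.

* Flag respondents with no missing values on any variable used in any model.
* "touse" is a hypothetical variable name chosen for this example.
mark touse
markout touse mask metro age edu news income party

* Fit the three nested models on the common sample and store each one.
* Assumption: edu is kept in the third model so that the models are nested.
quietly logit mask metro age edu news if touse
estimates store m1
quietly logit mask metro age edu news income if touse
estimates store m2
quietly logit mask metro age edu news income party if touse
estimates store m3

* Likelihood-ratio tests between consecutive nested models.
lrtest m1 m2
lrtest m2 m3

* AIC and BIC for the three models side by side.
estimates stats m1 m2 m3

* Make the third model the active one again, then check its fit.
estimates restore m3
* Pearson chi-squared test over covariate patterns.
estat gof
* Hosmer-Lemeshow test with 10 groups.
estat gof, group(10) table
```

Since every predictor here is binary, there are at most 64 distinct covariate patterns, so it may be worth looking at the ungrouped estat gof result alongside the grouped Hosmer-Lemeshow version rather than relying on either one alone.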