Context: I'm testing the effects of various email interventions on getting people to sign up for a financial literacy event, benchmarking each against a default control email. The outcome variable is sign-up (yes or no), so it is binary.
For clarity:
- Control (T0) Email: A default email that the sponsoring agency has been using for years. ("Please sign up!")
- Treatment 1 (T1) Email: Same email, but adds an extra 'note'. (e.g., "By the way, did you know that.....")
- Treatment 2 (T2) Email: Same email again, with another type of 'note'.
- Treatment 3 (T3) Email: Same email again, with another type of 'note'.
- Treatment 4 (T4) Email: Same email again, with another type of 'note'.
Beyond that, I want to study subgroups to see how the effects vary. (Namely, within each intervention, did males have a higher take-up rate? Did the lower-income groups have a higher take-up rate?)
So I have done my randomization: I split my pool into 5 groups and sent each group one of the 5 emails. I'm counting responses as we speak.
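To illustrate the subgroup question, here is a minimal sketch of the tally I have in mind: the sign-up rate for each (email, subgroup) pair. The records are made up for illustration, and `records` and `takeup_rates` are hypothetical names, not anything from my actual data.

```python
from collections import defaultdict

# Hypothetical records: (arm, sex, signed_up). Made-up data for illustration only.
records = [
    ("T0", "M", 1), ("T0", "M", 0), ("T0", "F", 0), ("T0", "F", 0),
    ("T1", "M", 1), ("T1", "M", 1), ("T1", "F", 1), ("T1", "F", 0),
]

def takeup_rates(rows):
    """Return the sign-up rate for each (arm, subgroup) pair."""
    tallies = defaultdict(lambda: [0, 0])  # (arm, subgroup) -> [sign-ups, total]
    for arm, subgroup, signed_up in rows:
        tallies[(arm, subgroup)][0] += signed_up
        tallies[(arm, subgroup)][1] += 1
    return {key: signups / total for key, (signups, total) in tallies.items()}

rates = takeup_rates(records)
for (arm, subgroup), rate in sorted(rates.items()):
    print(f"{arm} {subgroup}: {rate:.2f}")
```

The same tally would work for income bands by swapping the second field of each record.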
Problem: My initial thought is to run a logistic regression comparing the control with each of the treatments, since the outcome is binary and I can simply code one dummy variable per treatment. However, my control is not complete inaction: there is a default message, so it is in principle a treatment itself ("T0", if you will). In that sense, running a logistic regression as-is would seem ill-suited to the task, since, as I understand it, the regression would be comparing complete inaction against treatment, not my control against treatment.
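To make the setup concrete, here is a minimal sketch of the dummy coding I have in mind, with T0 as the omitted reference arm. The counts are made up for illustration (my real responses are still coming in), and `counts`, `dummy_row` and `coefficients` are hypothetical names. It relies on the fact that a logistic model containing only treatment dummies is saturated, so its maximum-likelihood coefficients can be computed directly from the per-arm sign-up rates instead of with a fitting library.

```python
import math

# Hypothetical counts per arm: (sign-ups, emails sent). Made-up numbers.
counts = {
    "T0": (50, 1000),
    "T1": (65, 1000),
    "T2": (80, 1000),
    "T3": (50, 1000),
    "T4": (40, 1000),
}

def log_odds(signups, sent):
    """Log-odds of signing up, given raw counts for one arm."""
    p = signups / sent
    return math.log(p / (1 - p))

def dummy_row(arm, reference="T0"):
    """One 0/1 indicator per non-reference arm; the reference arm is all zeros."""
    return [1 if arm == other else 0 for other in counts if other != reference]

def coefficients(reference="T0"):
    """Closed-form coefficients of the dummy-only (saturated) logistic model."""
    base = log_odds(*counts[reference])
    coefs = {"intercept": base}  # log-odds of sign-up in the reference arm
    for arm, (signups, sent) in counts.items():
        if arm != reference:
            # Difference in log-odds between this arm and the reference arm
            coefs[arm] = log_odds(signups, sent) - base
    return coefs

coefs = coefficients()
for name, value in coefs.items():
    print(f"{name}: {value:+.3f}")
```

In this coding, the intercept is the log-odds of sign-up in whichever arm is omitted (here T0), and each remaining coefficient is that arm's log-odds minus the omitted arm's.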
My ask: Could any of the folks here point me to a viable analysis method? Is there some sort of logistic regression suited for two treatments (hence the title)?
Thanks.