I am investigating data from a randomised controlled trial in which treatment was allocated in a 2:1 ratio (2 patients on the experimental treatment for every 1 patient on placebo): for example, 400 on the experimental treatment and 200 on placebo.
I am conducting some exploratory analysis to investigate whether there are any significant interactions between the treatment term (a binary covariate) and other selected covariates, using death (a binary dependent variable) as the outcome.
As mentioned, the trial has 2:1 randomisation, so there is an imbalance in the number of patients on the experimental drug and placebo.
My plan is to build a logistic regression model with death as the dependent variable. The model will include a treatment term, other relevant covariates (selected using AIC), and any relevant interactions involving the treatment term.
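To make the planned model concrete, here is a minimal sketch on simulated data. Everything in it is an assumption for illustration only: the covariate `age`, the "true" coefficients, the seed, and the hand-rolled Newton-Raphson fitter `fit_logistic` (used so that the example needs nothing beyond NumPy). In practice I would use a standard GLM routine, and none of the numbers come from the actual trial.

```python
import numpy as np

def fit_logistic(X, y, n_iter=25, tol=1e-8):
    """Fit a logistic regression by Newton-Raphson.

    Returns the coefficient estimates and their model-based standard errors.
    """
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        w = p * (1.0 - p)                      # per-patient binomial variance
        grad = X.T @ (y - p)                   # score vector
        info = X.T @ (X * w[:, None])          # Fisher information
        step = np.linalg.solve(info, grad)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    # Standard errors from the inverse information at the final estimate.
    p = 1.0 / (1.0 + np.exp(-(X @ beta)))
    w = p * (1.0 - p)
    cov = np.linalg.inv(X.T @ (X * w[:, None]))
    return beta, np.sqrt(np.diag(cov))

rng = np.random.default_rng(0)
n_trt, n_pla = 400, 200                        # 2:1 allocation as in the trial
treat = np.concatenate([np.ones(n_trt), np.zeros(n_pla)])
age = rng.normal(0.0, 1.0, size=n_trt + n_pla) # hypothetical standardised covariate

# Design matrix: intercept, treatment, covariate, treatment-by-covariate interaction.
X = np.column_stack([np.ones_like(treat), treat, age, treat * age])

# Hypothetical "true" coefficients, used only to simulate the outcome.
true_beta = np.array([-0.5, -0.4, 0.3, 0.2])
death = rng.binomial(1, 1.0 / (1.0 + np.exp(-(X @ true_beta))))

beta_hat, se = fit_logistic(X, death)
for name, b, s in zip(["intercept", "treat", "age", "treat:age"], beta_hat, se):
    print(f"{name:>10s}  estimate = {b: .3f}  SE = {s:.3f}")
```

The `treat:age` row is the interaction term I am interested in; its standard error is the quantity I would expect to depend on how patients are split between the arms.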
My question concerns the imbalance in the treatment allocation (note: not the imbalance in the dependent variable, death). Does the fact that the treatment arms are unequal (400 experimental, 200 placebo) have any impact on the conclusions I can draw from the logistic model? I have been led to believe that the treatment imbalance leads to a different variance in each treatment arm (np(1-p)). Is this really a problem? If so, can it be solved?
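To pin down what I mean by the variance differing between arms: np(1-p) is the variance of the number of deaths in an arm of size n, so the estimated death proportion in that arm has variance p(1-p)/n. The sketch below compares the standard error of a simple difference in proportions under 2:1 and 1:1 allocation for the same total of 600 patients. The death probabilities 0.30 and 0.40 and the helper name `se_risk_difference` are made up for illustration and are not trial results.

```python
import math

def se_risk_difference(p_trt, p_pla, n_trt, n_pla):
    """Standard error of (p_trt - p_pla) for two independent binomial arms."""
    return math.sqrt(p_trt * (1 - p_trt) / n_trt + p_pla * (1 - p_pla) / n_pla)

# Hypothetical death probabilities, chosen only to illustrate the arithmetic.
p_trt, p_pla = 0.30, 0.40

se_2to1 = se_risk_difference(p_trt, p_pla, n_trt=400, n_pla=200)
se_1to1 = se_risk_difference(p_trt, p_pla, n_trt=300, n_pla=300)

print(f"SE with 400/200 allocation: {se_2to1:.4f}")
print(f"SE with 300/300 allocation: {se_1to1:.4f}")
print(f"ratio: {se_2to1 / se_1to1:.3f}")
```

Under these assumed probabilities the 2:1 design gives a somewhat larger standard error than a 1:1 design of the same total size, which is the effect I am trying to understand in the context of the logistic model.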
I have considered using upsampling to balance the treatment arms. Alternatively, would something like weighted logistic regression or conditional logistic regression be suitable?
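For clarity, this is the kind of upsampling I have in mind, sketched on simulated placebo outcomes (the death probability 0.40, the seed, and the helper `naive_se` are assumptions for illustration). Each placebo patient is simply duplicated so that the placebo arm has 400 rows.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical placebo outcomes: 200 patients, assumed death probability 0.40.
placebo_death = rng.binomial(1, 0.40, size=200)

# Upsample by duplicating every placebo patient once (200 -> 400 rows).
upsampled = np.repeat(placebo_death, 2)

def naive_se(y):
    """Binomial standard error of a proportion, treating every row as independent."""
    p = y.mean()
    return np.sqrt(p * (1 - p) / len(y))

print(f"death proportion, original : {placebo_death.mean():.3f}")
print(f"death proportion, upsampled: {upsampled.mean():.3f}")
print(f"naive SE, original : {naive_se(placebo_death):.4f}")
print(f"naive SE, upsampled: {naive_se(upsampled):.4f}")
```

The estimated proportion is unchanged, but the naive standard error shrinks by a factor of sqrt(2) even though no new patients were observed, which is part of why I am unsure whether upsampling is a legitimate fix here.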
I realise there has been much discussion of unbalanced datasets, but it usually focuses on imbalance in the dependent variable, for example: Are unbalanced datasets problematic, and (how) does oversampling (purport to) help?