Imagine I run a randomized experiment at the beginning of the school year. Incoming freshmen are randomly assigned to either (a) take a diversity class or (b) not take it. At the end of the year, I email them asking them to answer a single 4-point Likert item on how they feel toward diversity on campus.
Now, imagine that some $k$ percent of students do not answer this item. However, I have a large number of variables on all students, both those who responded and those who did not: demographics, classes they took, where they are from, their high school GPA, etc.
I want to make a valid causal inference about whether the diversity class had any effect on attitudes toward diversity, using an ordered logistic regression. However, non-response could bias this: what if the people who do not support diversity respond at a lower rate? How should I use the information I have on the $k$ percent of cases that did not respond? I could fit a logistic regression to see whether any of my variables predict non-response, but what do I do after that?
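To make the concern concrete, here is a minimal simulation sketch of the situation I am worried about. Everything in it is an assumption made up for illustration: the sample size, the latent-attitude model, the cut points, the size of the treatment shift, and the response probabilities (`P_RESPOND`) that rise with the Likert answer. It only compares the difference in mean Likert scores using all simulated students with the same difference using respondents only; it is not a proposed analysis.

```python
import random

random.seed(1)

# All values below are hypothetical, chosen only to illustrate the concern.
N = 40_000                      # number of simulated freshmen
CUTS = (-0.5, 0.5, 1.5)         # cut points turning a latent attitude into a 1-4 answer
TRUE_SHIFT = 0.5                # effect of the class on the latent attitude
P_RESPOND = {1: 0.10, 2: 0.40, 3: 0.80, 4: 0.95}  # less supportive students respond less


def likert(latent):
    """Map a latent attitude to a 4-point Likert answer (1..4)."""
    return 1 + sum(latent > c for c in CUTS)


def mean(values):
    values = list(values)
    return sum(values) / len(values)


rows = []
for _ in range(N):
    treated = random.random() < 0.5                  # randomized assignment
    latent = TRUE_SHIFT * treated + random.gauss(0, 1)
    y = likert(latent)
    responded = random.random() < P_RESPOND[y]       # non-response depends on the outcome
    rows.append((treated, y, responded))


def effect(data):
    """Difference in mean Likert score, treated minus control."""
    treated_mean = mean(y for t, y, _ in data if t)
    control_mean = mean(y for t, y, _ in data if not t)
    return treated_mean - control_mean


full_effect = effect(rows)                           # what I would see with no dropout
resp_effect = effect([r for r in rows if r[2]])      # what I actually get to analyze
rate_control = mean(r for t, _, r in rows if not t)
rate_treated = mean(r for t, _, r in rows if t)

print(f"difference using everyone:         {full_effect:.3f}")
print(f"difference using respondents only: {resp_effect:.3f}")
print(f"response rate, control vs treated: {rate_control:.3f} vs {rate_treated:.3f}")
```

In this made-up setup the respondents-only difference comes out smaller than the difference using everyone, and the two arms respond at different rates. That is the kind of distortion I would like to know how to correct for in the real data, where I observe the covariates but not the missing answers.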
Note that many treatments of this problem focus on surveys and polls, where the goal is to generalize to a population. That is not my goal here: I am interested in preserving the validity of the causal inference about my experimental condition.
I am unfamiliar with this area: what are some references to get me up to speed on how to analyze these data in a way that yields valid causal inferences? I know there are approaches like propensity score matching and weighting cases based on demographics, but I do not know where to begin studying the issue. Are there any good papers, books, tutorials, R packages and vignettes, etc.?