
I work for a government department. We fund a program that provides case management support for adult students in vocational training. We have around 1,000 students per year that access the support. There are around 100,000 other students per year.

I'm trying to determine whether their participation in the support program has an impact on whether they complete the course.

I think what I need to do is establish a group of students that have similar characteristics to the students that get support, then compare success in training across both groups.

I've been working with R for a little bit and I've been doing some research. I think I'd need to develop a multilevel model that gives a propensity score. I've also been looking at tutorials on decision trees, but I'm not sure if this would give me what I'm looking for.

My questions are:

  • What approaches can be used to select a sample group from the 100,000 students that mirrors the characteristics of the group of 1,000 that received support? The variables are a mixture of ages, genders, regional/metro locations, disability and previous study levels. Links to examples or tutorials would be helpful.

  • Is a propensity score the right tack for examining the relationship between support and completion? Again, please suggest any tutorials, examples, or alternatives.

Thanks in advance

Ethan G
  • With 100,000 students and only 5 characteristics (age, gender, location, disability, study levels), it might be possible to do exact matching, depending on how diverse the 1,000 treated students are. Also, for future iterations, could you randomly assign people to the *option* of accessing the support? – Yannis Vassiliadis May 31 '18 at 07:01
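The exact matching idea in the comment above can be sketched as follows. This is a minimal illustration in plain Python, not a recommended implementation: the field names (`age_band`, `gender`, `region`, `disability`, `prior_study`), the toy records, and the helper `exact_match` are all hypothetical, and it assumes age has already been binned into bands so that identical profiles can exist.

```python
import random
from collections import defaultdict

def exact_match(treated, pool, keys, seed=0):
    """Pair each treated record with one unused pool record that has
    identical values on every field in `keys` (1:1, without replacement)."""
    rng = random.Random(seed)
    by_profile = defaultdict(list)
    for row in pool:
        by_profile[tuple(row[k] for k in keys)].append(row)
    # Shuffle so the control picked within a profile is random, not file order.
    for candidates in by_profile.values():
        rng.shuffle(candidates)
    pairs, unmatched = [], []
    for row in treated:
        candidates = by_profile.get(tuple(row[k] for k in keys))
        if candidates:
            pairs.append((row, candidates.pop()))
        else:
            unmatched.append(row)  # no student in the pool shares this profile
    return pairs, unmatched

# Hypothetical field names and toy records, for illustration only.
KEYS = ["age_band", "gender", "region", "disability", "prior_study"]
treated = [
    {"id": "t1", "age_band": "25-34", "gender": "F", "region": "metro",
     "disability": False, "prior_study": "Year 12"},
    {"id": "t2", "age_band": "45-54", "gender": "M", "region": "regional",
     "disability": True, "prior_study": "Certificate III"},
]
pool = [
    {"id": "c1", "age_band": "25-34", "gender": "F", "region": "metro",
     "disability": False, "prior_study": "Year 12"},
    {"id": "c2", "age_band": "25-34", "gender": "M", "region": "metro",
     "disability": False, "prior_study": "Year 12"},
]
pairs, unmatched = exact_match(treated, pool, KEYS)
```

Treated students left in `unmatched` indicate profiles that the comparison pool does not cover; a large number of them would suggest coarser bins or a propensity score approach instead.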

1 Answer


You can use propensity score matching to test whether participation in the support program (the treatment) affects students' successful completion. Peter C. Austin's article (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3144483/) gives an introduction to the topic and the different alternatives. The article by Abdia, Younathan, et al., "Propensity scores based methods for estimating average treatment effect and average treatment effect among treated: A comparative study," Biometrical Journal 59.5 (2017): 967-985, is also worth reading. It gives detailed estimation strategies depending on whether your interest is the average treatment effect (ATE) or the average treatment effect among the treated (ATT). There is a tutorial on how to implement it in R at https://sejdemyr.github.io/r-tutorials/statistics/tutorial8.html. TWANG also provides tutorials for both R and Stata. If you choose the doubly robust technique, you will benefit from the previous CV thread "Doubly robust estimation of causal effects implementation".
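To make the mechanics concrete, here is a rough, self-contained sketch of propensity score matching in plain Python: fit a logistic model for the probability of receiving support, match each supported student to the nearest unsupported student on that score, and compare completion within pairs to estimate the ATT. Everything in it is an assumption for illustration: the data are simulated, the covariates and coefficients are made up, and the helper names (`fit_logistic`, `match_nearest`, `att`) are hypothetical. For a real analysis in R, the packages covered in the tutorials above handle these steps.

```python
import math
import random

def propensity(w, x):
    """Logistic model: estimated P(treated | x) for weights w = [intercept, coefs...]."""
    z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, t, lr=0.5, epochs=500):
    """Fit the propensity model by plain batch gradient descent."""
    n, p = len(X), len(X[0])
    w = [0.0] * (p + 1)
    for _ in range(epochs):
        grad = [0.0] * (p + 1)
        for xi, ti in zip(X, t):
            err = propensity(w, xi) - ti
            grad[0] += err
            for j in range(p):
                grad[j + 1] += err * xi[j]
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w

def match_nearest(scores, t, caliper=0.05):
    """1:1 nearest-neighbour matching on the score, without replacement.
    Treated units with no control within `caliper` are left unmatched."""
    available = [i for i, ti in enumerate(t) if ti == 0]
    pairs = []
    for i in (k for k, tk in enumerate(t) if tk == 1):
        if not available:
            break
        j = min(available, key=lambda c: abs(scores[c] - scores[i]))
        if abs(scores[j] - scores[i]) <= caliper:
            pairs.append((i, j))
            available.remove(j)
    return pairs

def att(pairs, y):
    """Mean within-pair difference in the outcome (treated minus control)."""
    return sum(y[i] - y[j] for i, j in pairs) / len(pairs)

# Simulated data; every coefficient below is invented for the example.
rng = random.Random(42)
X, t, y = [], [], []
for _ in range(600):
    disability = 1.0 if rng.random() < 0.15 else 0.0
    regional = 1.0 if rng.random() < 0.40 else 0.0
    age = rng.uniform(-1.0, 1.0)  # age, rescaled to roughly [-1, 1]
    p_treat = 1.0 / (1.0 + math.exp(-(-2.0 + 1.2 * disability + 0.6 * regional)))
    ti = 1 if rng.random() < p_treat else 0
    p_complete = 0.55 - 0.15 * disability - 0.05 * regional + 0.10 * ti
    X.append([disability, regional, age])
    t.append(ti)
    y.append(1 if rng.random() < p_complete else 0)

w = fit_logistic(X, t)
scores = [propensity(w, xi) for xi in X]
pairs = match_nearest(scores, t)
effect = att(pairs, y)
print(f"matched pairs: {len(pairs)}, estimated ATT: {effect:.3f}")
```

After matching, it is worth checking that the covariates are balanced between the matched groups before interpreting the estimate; the sketch skips that step for brevity.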

In any case, if you have time, it's worthwhile to read the original article by Rosenbaum, Paul R.; Rubin, Donald B. (1983). "The Central Role of the Propensity Score in Observational Studies for Causal Effects".

Good luck.

Million
  • Thanks very much for your advice. I've since read the papers and used the catholic school example as a guide for my own analysis. The MatchIt function is quite easy to use. – Ethan G Jun 08 '18 at 05:05