Background:
I'm studying people seeking help. Participants described their contacts with between 1 and 3 "responders" (e.g., friends, the police) in the order in which they contacted them. For example, a participant could have contacted just responder 1, or responder 1, then responder 2, then responder 3. I'm trying to predict help-seeking dropout: a participant who contacted responder 1 but did not go on to contact a second or third responder would have a dropout at responder 1. So, unlike in a typical survival analysis, the observations are responders rather than time points, but they are still ordered in time. The independent variables in my model include characteristics of the people seeking help (e.g., gender) and aspects of their interactions with the responders (e.g., whether they liked the interaction). The data are right-censored for participants who said that they contacted more than three responders, because the survey could not record more than three. Two people reported only on responder 3; those people are left-censored because their data are missing for responders 1 and 2.
Data setup:
The data are set up as a person-period dataset with one line per responder, so participants who contacted more than one responder have multiple lines. Responders are nested within participants. For example, a participant who contacted two responders has two lines in the dataset; the participant-level data are the same in both lines and the responder-level data differ.
Here's what the data look like: https://flic.kr/p/s1fW6k
Variables:
id is the ID number for the participant.
responder represents the responder number in the order that the participant contacted them. Possible values are 1, 2, and 3.
stoppedhelpseeking represents whether the participant stopped seeking help/dropped out after contacting that responder; 0 = no and 1 = yes.
gender is the participant's gender; 1 = woman and 2 = man.
likedresponder represents whether the participant liked their interaction with the responder; 0 = no and 1 = yes.
censor represents whether the participant did not report a dropout by responder 3.
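To make the layout concrete, here is a minimal sketch of the person-period structure using made-up rows, not my real data. The dataset name helpseeking_example is hypothetical, and the sketch assumes that censor is coded 1 = yes and 0 = no and is constant within a participant. Participant 1 stops after responder 1, participant 2 stops after responder 2, and participant 3 reports three responders without a dropout.

```sas
/* Hypothetical rows for illustration only; not the real data.           */
/* Assumption: censor is coded 1 = yes, 0 = no, constant within each id. */
data helpseeking_example;
    input id responder stoppedhelpseeking gender likedresponder censor;
    datalines;
1 1 1 1 1 0
2 1 0 2 0 0
2 2 1 2 1 0
3 1 0 1 1 1
3 2 0 1 0 1
3 3 0 1 1 1
;
run;

/* Print the rows to check that each responder has its own line. */
proc print data=helpseeking_example noobs;
run;
```

Within each id, gender repeats on every line, while responder, stoppedhelpseeking, and likedresponder vary from line to line.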
Here is the code that I have (from the Allison survival analysis/SAS book):
proc phreg data=helpseeking plots=survival;
    class id;
    model responder*stoppedhelpseeking(0) = gender likedresponder / ties=efron;
run;
My questions:
-If I'm predicting dropout, should the code be stoppedhelpseeking(0) or stoppedhelpseeking(1)?
-How do I account for the right-censoring? I'm concerned that it's not explicitly reflected in my code.
-Do I need to account for the left-censoring, or should I drop those two people from analyses?
-Any other issues with the code that I should know about?