I am a bioinformatics master's student working on a study at work, and I would like some advice on a biostatistics issue. As a side note: we will be consulting with a biostats core, but I wanted to get practice working through problems like this.
Data: I have a data set that contains phenotypic information on a cell population for each subject. The readout for each marker is the percentage of cells in that population that are positive for it. Here is an example table showing the data:
    Group Participant Gender      Age Caucasian API Hispanic Other       Date Marker1 Marker2 Marker3
2  Group1          A1      0 39.74795         1   0        0     0 2018-07-11    1.77   13.60    77.8
3  Group1          A2      1 39.50411         0   1        0     0 2018-07-11    1.38    3.90    96.1
4  Group1          A3      0 43.79178         1   0        0     0 2018-07-25    9.34   13.60    85.2
5  Group1          A4      0 42.80274         0   0        0     0 2018-07-11    2.06    4.08    77.6
6  Group2          A5      1 41.27619         1   0        0     0 2018-07-25    0.65   16.00    79.9
7  Group2          A6      0 42.07710         1   0        0     0 2018-07-25    2.46   18.20    93.8
8  Group2          A7      0 42.70411         0   1        0     0 2018-07-11    0.30    0.00    75.0
10 Group2          A8      0 38.70387         0   0        0     0 2018-07-11    1.48    3.73    84.4
11 Group3          A9      0 40.71483         0   0        0     0 2018-07-25    1.76    7.48    90.5
13 Group3         A10      1 38.96690         0   1        0     0 2018-07-25    5.87   12.90    81.6
15 Group3         A11      0 41.46002         0   1        0     0 2018-07-25    2.40   18.80    96.0
16 Group3         A12      0 33.87945         0   1        0     0 2018-07-11    4.16    8.56    60.4
As you can see, there are 3 groups, each made up of different individuals. Now to my question: we want to determine the impact of aging on these phenotypic markers. However, several other variables could also affect them, such as gender, race, and, in the infected groups, the type and length of treatment. The Date column is the date on which the samples were run. I am looking for guidance on how best to specify the linear model formula. My initial reading suggested:
Marker1 ~ Age + Gender + Caucasian + API + Hispanic + Other + Date
But I am curious whether this is the best approach. I read about the lmer() function (from the lme4 package) and was wondering whether Date should instead be treated as a random effect, using something like:
Marker1 ~ Age + Gender + Caucasian + API + Hispanic + Other + (1 | Date)
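For concreteness, here is a minimal sketch of how I think the two candidate models would be fit in R, using only the twelve example rows above and Marker1. A few things in it are my own assumptions rather than anything settled: the names `dat`, `fit_fixed`, and `fit_mixed` are made up for illustration, Date is converted to a factor so it acts as a batch label, and Hispanic and Other are left out only because they are all 0 in these example rows (they would not be estimable here). The lmer() part runs only if the lme4 package is installed.

```r
# Example rows from the table above (Marker1 only); the real data set is larger.
dat <- data.frame(
  Participant = paste0("A", 1:12),
  Gender    = c(0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0),
  Age       = c(39.74795, 39.50411, 43.79178, 42.80274, 41.27619, 42.07710,
                42.70411, 38.70387, 40.71483, 38.96690, 41.46002, 33.87945),
  Caucasian = c(1, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0),
  API       = c(0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 1, 1),
  Date      = c("2018-07-11", "2018-07-11", "2018-07-25", "2018-07-11",
                "2018-07-25", "2018-07-25", "2018-07-11", "2018-07-11",
                "2018-07-25", "2018-07-25", "2018-07-25", "2018-07-11"),
  Marker1   = c(1.77, 1.38, 9.34, 2.06, 0.65, 2.46, 0.30, 1.48,
                1.76, 5.87, 2.40, 4.16)
)

# Assumption: run date is treated as a categorical batch label, not a number.
dat$Date <- factor(dat$Date)

# Candidate 1: Date as an ordinary (fixed) covariate.
# Hispanic and Other are omitted only because they are constant in these rows.
fit_fixed <- lm(Marker1 ~ Age + Gender + Caucasian + API + Date, data = dat)
print(coef(summary(fit_fixed)))

# Candidate 2: random intercept for each run date (needs the lme4 package).
# With only two run dates in these rows, the fit may be reported as singular.
if (requireNamespace("lme4", quietly = TRUE)) {
  fit_mixed <- lme4::lmer(Marker1 ~ Age + Gender + Caucasian + API + (1 | Date),
                          data = dat)
  print(lme4::fixef(fit_mixed))
}
```

In both versions the coefficient on Age is the quantity of interest; the only difference is whether the run date is estimated as its own coefficient or absorbed into a random intercept.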
Thanks for all advice!