I have a dataset containing information about patients in a hospital, with the following variables:
- Status for a certain disease (binary outcome)
- Hundreds of continuous biomarkers
- A few variables for adjustment (age, gender, etc.)
- Patient ID
My objective is to find biomarkers that are associated with the disease. I have more biomarkers than observations, so I thought of using ridge/LASSO logistic regression. However, I have multiple samples collected from the same patient, so I also need to account for the within-patient correlation, e.g. by including Patient ID as a random effect.
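In symbols, the model I have in mind is roughly the following. The notation is my own and only meant as a sketch: $y_{ij}$ is the disease status for sample $j$ of patient $i$, $\mathbf{x}_{ij}$ the biomarkers, $\mathbf{z}_{ij}$ the adjustment variables, and $u_i$ a patient-level random intercept.

$$
\operatorname{logit} P(y_{ij} = 1 \mid u_i) = \beta_0 + \mathbf{x}_{ij}^\top \boldsymbol{\beta} + \mathbf{z}_{ij}^\top \boldsymbol{\gamma} + u_i, \qquad u_i \sim N(0, \sigma_u^2)
$$

with only the biomarker coefficients penalized, e.g. for the LASSO:

$$
(\hat\beta_0, \hat{\boldsymbol{\beta}}, \hat{\boldsymbol{\gamma}}, \hat\sigma_u^2) = \arg\max \; \ell(\beta_0, \boldsymbol{\beta}, \boldsymbol{\gamma}, \sigma_u^2) - \lambda \lVert \boldsymbol{\beta} \rVert_1
$$

where $\ell$ is the marginal log-likelihood; for ridge the penalty would be $\lambda \lVert \boldsymbol{\beta} \rVert_2^2$ instead.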
The problem is that I am also expected to provide p-values (or maybe posterior probabilities) for each biomarker. The R packages that I tried did not provide this information.
I thought of doing permutation tests, but this would be very time-consuming: I have many biomarkers, and the p-values would have to be precise enough to survive a multiple-testing correction afterwards, which means a large number of permutations.
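To make the cost concrete, here is a rough sketch of the kind of permutation test I have in mind. It is a simplification, not the penalized mixed model itself: it assumes disease status is constant within a patient, averages each patient's samples, permutes status across patients (so samples from the same patient stay together), uses a plain difference in means as the statistic, and ignores the adjustment variables. The function names are my own.

```python
import math
import numpy as np

def min_permutations(n_tests, alpha=0.05):
    """Smallest B with 1 / (B + 1) <= alpha / n_tests (Bonferroni threshold)."""
    return math.ceil(n_tests / alpha) - 1

def patient_permutation_pvalues(X, status, patient_ids, n_perm=999, seed=0):
    """Per-biomarker permutation p-values, permuting status across patients.

    X           : (n_samples, n_biomarkers) array
    status      : (n_samples,) 0/1 array, ASSUMED constant within patient
    patient_ids : (n_samples,) array of patient labels
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    status = np.asarray(status)
    _, inv = np.unique(np.asarray(patient_ids), return_inverse=True)
    n_pat = inv.max() + 1

    # Collapse repeated samples to one mean row per patient.
    X_pat = np.zeros((n_pat, X.shape[1]))
    np.add.at(X_pat, inv, X)
    X_pat /= np.bincount(inv, minlength=n_pat)[:, None]

    # One status value per patient (relies on the constant-status assumption).
    y_pat = np.zeros(n_pat)
    y_pat[inv] = status

    def stat(y):
        # Absolute difference in mean biomarker level, cases vs. controls.
        return np.abs(X_pat[y == 1].mean(axis=0) - X_pat[y == 0].mean(axis=0))

    observed = stat(y_pat)
    exceed = np.zeros(X.shape[1])
    for _ in range(n_perm):
        exceed += stat(rng.permutation(y_pat)) >= observed

    # Add-one correction: the smallest possible p-value is 1 / (n_perm + 1).
    return (exceed + 1) / (n_perm + 1)
```

Since the smallest attainable p-value is 1 / (n_perm + 1), a Bonferroni correction over, say, 500 biomarkers at alpha = 0.05 already needs `min_permutations(500, 0.05)`, i.e. 9,999 permutations, and each permutation would mean refitting the penalized model rather than computing a cheap mean difference.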
Answers with R/Python code or references would be appreciated. Thank you in advance!