I'm trying to reproduce a transcriptomic analysis that I found in a research paper. The methods section says:
After normalization an expression threshold for each cell line was calculated to get rid of low intensity probes that can be considered technical noise. First, probe sets were sorted by increasing expression value. For each probe set a t-test was performed to evaluate the differential expression between this probe set and the median value of the probe sets with less expression values.
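To check that I'm reading that description correctly, here is a minimal base-R sketch of how I interpret it, run on simulated data rather than my arrays. The matrix `toy`, the helper name `threshold_pvalues`, sorting by mean expression, and the one-sided, one-sample t-test against the median of the lower-ranked probe sets are all my own assumptions. The paper does not give these details.

```r
set.seed(1)

# Simulated log2 expression matrix: 200 probe sets x 6 samples (toy data only)
toy <- matrix(
  rnorm(200 * 6, mean = rep(seq(2, 10, length.out = 200), 6), sd = 0.3),
  nrow = 200, ncol = 6
)

# Hypothetical helper: my interpretation of the quoted procedure
threshold_pvalues <- function(x) {
  # Sort probe sets by increasing mean expression across samples (assumption)
  ord <- order(rowMeans(x))
  x <- x[ord, , drop = FALSE]
  avg <- rowMeans(x)

  p <- rep(NA_real_, nrow(x))
  # The lowest probe set has nothing below it, so its p-value stays NA
  for (i in seq_len(nrow(x))[-1]) {
    # Median of the probe sets with lower expression than probe set i
    ref <- median(avg[seq_len(i - 1)])
    # One-sample, one-sided test (assumption): is probe set i above that median?
    p[i] <- t.test(x[i, ], mu = ref, alternative = "greater")$p.value
  }

  data.frame(row = ord, mean = avg, p.value = p)
}

res <- threshold_pvalues(toy)
head(res)
```

Under this reading, the expression threshold would presumably be the expression level at which the p-values become consistently small, but that cutoff rule is also my guess.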
I've used the shorth function in R to try to reproduce this analysis, and I would like to know whether this is an appropriate method.
My data are from an Affymetrix HuGene 1.1 ST array. After RMA normalization (the result is stored in dat), I ran the following code to identify the less expressed probe sets:
med.exprs <- rowMedians(exprs(dat))
med <- shorth(med.exprs)
Then I ran a t.test comparing each probe set with the med value:
tt <- rep(0.8256, 23) ## 0.8256 is the value of shorth(med.exprs), and I have 23 samples
result.pvalue <- sapply(1:nrow(exprs(dat)), function(i) t.test(exprs(dat)[i,], tt)$p.value)
Is this approach valid?