Should one always expect the central tendency (i.e., mean and/or median) of a bootstrap distribution to be similar to the observed value of the statistic?
In this particular case I have responses that are distributed exponentially for subjects across two conditions (I didn't run the experiment; I only have the data). I have been tasked with bootstrapping the effect size in terms of Cohen's d, using the one-sample formula $\frac{\bar{M}_D}{s_D}$, where $s_D$ is the sample estimate of the population standard deviation of the difference scores. The formula is given in Rosenthal & Rosnow (2008), p. 398, equation 13.27. They use $\sigma$ in the denominator because that is historically correct, but standard practice has misdefined d as using $s$, and so I follow through with that error in the calculation above.
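For concreteness, here is a minimal sketch of that calculation (the function name, and the assumption that I start from per-participant difference scores, are illustrative):

```python
import numpy as np

def cohens_d(diffs):
    """One-sample Cohen's d: the mean of the difference scores divided
    by their sample standard deviation (ddof=1 gives s rather than the
    population sigma of the original formula)."""
    diffs = np.asarray(diffs, dtype=float)
    return diffs.mean() / diffs.std(ddof=1)
```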
I have resampled both within participants (i.e., a participant's RTs may be drawn more than once) and across participants (a participant may be sampled more than once), such that even if participant 1 is sampled twice, their mean RT in the two draws is unlikely to be exactly equal. For each resampled dataset I recalculate d, with $N_{sim} = 10000$; a sketch of this loop appears below. What I'm observing is a tendency for the observed value of Cohen's d to lie closer to the 97.5th percentile of the simulated values than to the 2.5th percentile. It also tends to be closer to 0 than the median of the bootstrap distribution (by 5% to 10% of the density of the simulated distribution).
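Schematically, the resampling loop looks something like this (here `rts_by_subject`, mapping each participant to an array of per-trial RT differences, is an illustrative stand-in for my actual data structure):

```python
import numpy as np

rng = np.random.default_rng(1)  # fixed seed for reproducibility

def resample_d(rts_by_subject, n_sim=10_000):
    """Two-level resampling: draw participants with replacement, then
    draw each sampled participant's trials with replacement, so that a
    participant drawn twice contributes two (usually different) mean RTs.
    Returns the n_sim resampled values of d."""
    subjects = list(rts_by_subject)
    ds = np.empty(n_sim)
    for i in range(n_sim):
        # across-participant resampling
        sampled = rng.choice(subjects, size=len(subjects), replace=True)
        # within-participant resampling of trials, then per-subject means
        means = np.array([
            rng.choice(rts_by_subject[s], size=len(rts_by_subject[s]),
                       replace=True).mean()
            for s in sampled
        ])
        ds[i] = means.mean() / means.std(ddof=1)  # recompute d
    return ds
```

The percentiles I compare the observed d against are then just `np.percentile(ds, [2.5, 50, 97.5])`.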
What can account for this (keeping in mind the magnitude of the effect I'm observing)? Is it because, upon resampling, it is 'easier' to obtain variances more extreme than the observed one than it is to obtain comparably extreme means? Might this be a sign of data that have been overly massaged or selectively trimmed? Is this resampling approach the same as a bootstrap? If not, what else must be done to arrive at a confidence interval?