I'll leave out the biological details and experiments and describe only the statistical problem and what I have done so far. I would like to know whether it's right and, if not, how to proceed. If the data (or my explanation) isn't clear enough, I'll edit the question to explain better.
Suppose I have two groups of observations, X and Y, with sizes $N_x=215$ and $N_y=40$. I would like to know whether the means of these two groups are equal. My first question is:
If the assumptions are satisfied, is it appropriate to use a parametric two-sample t-test here? I ask because, as I understand it, the t-test is usually applied when the sample size is small.
I plotted histograms of both X and Y, and neither looked normally distributed, which violates one of the assumptions of a two-sample t-test. My confusion is this: I treat X and Y as two populations, which is why I checked them for normality, yet I am about to perform a two-SAMPLE t-test. Is this right?
From the central limit theorem, I understand that if you draw samples repeatedly (with or without replacement, depending on your population size) and compute the mean of each sample, those means will be approximately normally distributed, and the mean of these random variables will be a good estimate of the population mean. So I did this on both X and Y, 1000 times each, and treated the mean of each sample as a random variable. The resulting plots looked very close to normal. The means for X and Y were 4.2 and 15.8 (the same as the population means to within $\pm 0.15$), and the variances were 0.95 and 12.11.
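The resampling step described above can be sketched roughly as follows. This is only an illustration using the Python standard library: the data are synthetic, skewed stand-ins (the real X and Y are not shown here), and the helper name `resampled_means`, the choice to sample with replacement, and the per-sample size are all assumptions.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Synthetic stand-ins for X and Y (assumption: the real data are not shown).
# Exponential draws give skewed, clearly non-normal histograms.
x = [random.expovariate(1 / 4.2) for _ in range(215)]
y = [random.expovariate(1 / 15.8) for _ in range(40)]

def resampled_means(data, n_resamples=1000, sample_size=None):
    """Draw n_resamples samples with replacement; return the mean of each."""
    k = sample_size or len(data)  # assumption: resample at the original size
    return [statistics.fmean(random.choices(data, k=k))
            for _ in range(n_resamples)]

x_means = resampled_means(x)
y_means = resampled_means(y)

# Mean and variance of the 1000 sample means for each group.
print(statistics.fmean(x_means), statistics.variance(x_means))
print(statistics.fmean(y_means), statistics.variance(y_means))
```

Note that the variance printed here is the variance of the sample means, which shrinks as the per-sample size grows; it is not the variance of the underlying observations.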
I then performed a t-test assuming unequal variances on these two sets of sample means (1000 data points each), because the variances are very different (0.95 and 12.11). The null hypothesis was rejected.
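For reference, the unequal-variance (Welch) t-test mentioned above can be sketched like this. The function name `welch_t` is made up for illustration, and the inputs are simulated normal draws that merely borrow the means and variances quoted above. The p-value uses a normal approximation, which is only reasonable because the degrees of freedom are large here; `scipy.stats.ttest_ind` with `equal_var=False` would give the exact t-based p-value.

```python
import math
import random
import statistics

def welch_t(a, b):
    """Return Welch's t statistic and the Welch-Satterthwaite degrees of freedom."""
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)  # sample variances
    se2 = va / na + vb / nb  # squared standard error of the mean difference
    t = (statistics.fmean(a) - statistics.fmean(b)) / math.sqrt(se2)
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t, df

random.seed(1)
# Simulated inputs (assumption): 1000 points each, using the quoted
# means (4.2, 15.8) and variances (0.95, 12.11). Not the real data.
a = [random.gauss(4.2, math.sqrt(0.95)) for _ in range(1000)]
b = [random.gauss(15.8, math.sqrt(12.11)) for _ in range(1000)]

t, df = welch_t(a, b)
# Two-sided p-value via a normal approximation (acceptable only for large df).
p_approx = 2 * statistics.NormalDist().cdf(-abs(t))
print(t, df, p_approx)
```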
Does this make sense at all? Is this a correct and meaningful approach, would a two-sample z-test be sufficient, or is it totally wrong? I also performed a non-parametric Wilcoxon test (on the original X and Y) just to be sure, and the null hypothesis was convincingly rejected there as well. In the event that my previous method was utterly wrong, I suppose a non-parametric test is a good choice, except perhaps for some loss of statistical power?
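A minimal sketch of the non-parametric comparison follows, assuming the Wilcoxon test meant here is the rank-sum (Mann-Whitney U) variant for two independent samples, since the groups have different sizes. The helper names are hypothetical, the data are synthetic stand-ins, and the p-value comes from a normal approximation without a tie correction; `scipy.stats.mannwhitneyu` is the usual library route.

```python
import math
import random
import statistics

def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def mann_whitney_u(a, b):
    """Return (U for sample a, z score, two-sided normal-approximation p-value)."""
    na, nb = len(a), len(b)
    ranks = average_ranks(list(a) + list(b))
    u_a = sum(ranks[:na]) - na * (na + 1) / 2
    mu = na * nb / 2
    sigma = math.sqrt(na * nb * (na + nb + 1) / 12)  # no tie correction
    z = (u_a - mu) / sigma
    return u_a, z, 2 * statistics.NormalDist().cdf(-abs(z))

random.seed(2)
# Synthetic, skewed stand-ins for the original X and Y (assumption).
x = [random.expovariate(1 / 4.2) for _ in range(215)]
y = [random.expovariate(1 / 15.8) for _ in range(40)]

u, z, p = mann_whitney_u(x, y)
print(u, z, p)
```

Strictly speaking, this test compares the two distributions (whether values in one group tend to be larger than in the other) rather than the means themselves.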
In both cases, the means were significantly different. However, I would like to know whether either or both of the approaches are faulty or totally wrong and, if so, what the alternative is.