Below is a partial data set showing the mean user ratings for a number of products, each of which is available in several standard versions. A version is a common feature added to every product, for example adding electric windows to a car.
Each mean rating is based on circa 17 user ratings, though some are based on more. The user ratings were submitted on a scale from -100 to +100. N/A indicates a product that is not available in that particular version. I understand from this question I asked previously that I should not replace the N/A values with 0 or an average.
The data are continuous, interval-scale data.
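| Product | Control | V1 | V2 | V3 | V4 | V5 | V6 |
|---|---|---|---|---|---|---|---|
| Product 1 | 1.63 | -5.19 | -0.48 | 5.79 | 8.89 | 4.19 | 15.73 |
| Product 2 | 0.60 | 0.84 | 4.47 | N/A | 0.52 | 21.17 | N/A |
| Product 3 | 4.53 | -15.20 | -19.66 | N/A | 2.84 | N/A | 13.07 |
| Product 4 | 7.30 | 17.53 | 20.25 | 17.04 | N/A | 4.60 | 9.28 |
| Product 5 | -4.05 | -21.33 | -14.00 | -13.00 | N/A | -23.71 | -8.71 |
| Product 6 | 26.27 | 14.53 | N/A | 21.24 | N/A | 27.25 | 35.18 |
| Product 7 | -3.12 | N/A | N/A | N/A | N/A | 7.88 | 17.38 |
| Mean Ratings | 4.74 | -1.47 | -1.88 | 7.77 | 4.08 | 6.90 | 13.66 |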
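As a cross-check on the Mean Ratings row, here is a minimal Python sketch that averages each column while skipping the N/A cells rather than replacing them. The dictionary layout and the helper name `column_means` are illustrative assumptions, not part of the original analysis; the numbers are copied from the table above.

```python
from statistics import fmean

# One list per version, ordered Product 1..7; None marks an N/A cell.
ratings = {
    "Control": [1.63, 0.60, 4.53, 7.30, -4.05, 26.27, -3.12],
    "V1": [-5.19, 0.84, -15.20, 17.53, -21.33, 14.53, None],
    "V2": [-0.48, 4.47, -19.66, 20.25, -14.00, None, None],
    "V3": [5.79, None, None, 17.04, -13.00, 21.24, None],
    "V4": [8.89, 0.52, 2.84, None, None, None, None],
    "V5": [4.19, 21.17, None, 4.60, -23.71, 27.25, 7.88],
    "V6": [15.73, None, 13.07, 9.28, -8.71, 35.18, 17.38],
}


def column_means(table):
    """Mean of each column over the available (non-N/A) products only."""
    return {
        name: fmean([v for v in values if v is not None])
        for name, values in table.items()
    }


means = column_means(ratings)
for name, value in means.items():
    print(f"{name}: {value:.2f}")
```

Note that each column mean is then based on a different number of products (seven for the Control, three for V4), which matters for any later comparison.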
I want to compare the effect of each standard version against the control.
So, I think I should be using two-tailed Z-tests, so that I can see how far above or below the control mean each version mean, or each of its individual product means, lies. Here is my reasoning:
- The vast majority of the user ratings that make up my means are normally distributed. I checked using the Kolmogorov-Smirnov and Shapiro-Wilk tests.
- I checked everything that failed either of the tests with Q-Q plots and they are approximately normally distributed.
- My sample size is > 30
- While each individual mean is based on circa 15 user ratings, the total for the Control is circa 105 and for V1 is 105. I think this is correct.
- I can derive the standard deviation for:
  - the individual mean scores
  - the combined mean scores of the Control and V1 shown below
My Hypothesis
H0 - Version 1 will have no effect on the mean user rating
- The mean will not be significantly different to the control
Ha - Version 1 will have an effect on the mean user rating
- The mean will be significantly different to the control
I need to test this claim at alpha = 0.05 (two-tailed), i.e. with critical Z values of -1.96 and +1.96.
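The link between alpha = 0.05 and the ±1.96 cut-off can be sketched in Python using only the standard library. The helper name `is_significant` is a hypothetical name introduced for illustration.

```python
from statistics import NormalDist

alpha = 0.05

# Two-tailed test: alpha is split evenly between the two tails,
# so the critical value is the 1 - alpha/2 quantile of the standard normal.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)


def is_significant(z, crit=z_crit):
    """True if z falls in either rejection tail."""
    return abs(z) > crit


print(f"critical value: +/-{z_crit:.2f}")
```

Any Z statistic whose absolute value exceeds this critical value would lead to rejecting H0 at the 5% level.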
Z-Test
I first took the mean ratings for each of the versions (from above)
Mean Ratings 4.74 -1.47 -1.88 7.77 4.08 6.90 13.66
And used SPSS to calculate Z-Scores for each
Control = -.13054
Version 1 = -1.42611
Version 2 = -.72721
Version 3 = .50160
Version 4 = -.26823
Version 5 = .32009
Version 6 = 1.73041
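For reference, a minimal sketch of what this standardization step amounts to, assuming the Z-scores are computed as (value - mean of the seven means) / sample standard deviation with an n - 1 divisor. That formula is an assumption about the SPSS step, not something confirmed here, and the variable names are illustrative. Run on the rounded means from the table, it is not guaranteed to reproduce the SPSS figures above digit for digit.

```python
from statistics import mean, stdev

# Mean rating per version, copied from the Mean Ratings row.
version_means = {
    "Control": 4.74,
    "V1": -1.47,
    "V2": -1.88,
    "V3": 7.77,
    "V4": 4.08,
    "V5": 6.90,
    "V6": 13.66,
}

values = list(version_means.values())
grand_mean = mean(values)  # mean of the seven means, Control included
sd = stdev(values)         # sample standard deviation (n - 1 divisor)

# Standardize each mean against the spread of the seven means themselves.
z_scores = {name: (v - grand_mean) / sd for name, v in version_means.items()}

for name, z in z_scores.items():
    print(f"{name}: {z:.5f}")
```

Scores produced this way describe where each mean sits relative to the other six means (the Control included); they do not use the spread or number of the underlying user ratings.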
This tells me that none of the common versions had a significant effect on the mean user ratings for the products.
My questions are:
Are Z tests suitable for this sort of data analysis? (Since answered Yes)
And if yes, are my reasoning and my attempt correct?
If anyone could point out any glaringly obvious mistakes or problems with my method, it would be much appreciated.
As always, any help is much appreciated.
EDIT:
I suspect that one issue is that I am including my control mean in my Z-test calculation, which is skewing the results. But I am not sure how to undertake such a test when I am comparing against a known mean.
Edit 2:
In response to David C's answer: I am able to calculate the variance for each mean rating, e.g. for the Control it is 105.999.
Descriptive Statistics
| | N | Minimum | Maximum | Mean | Std. Deviation | Variance |
|---|---|---|---|---|---|---|
| Control | 7 | -4.05 | 26.27 | 4.7371 | 10.29558 | 105.999 |
| Valid N (listwise) | 7 | | | | | |
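The Control figures in this output can be reproduced from the seven Control product means with a short Python sketch. It uses the sample (n - 1) variance, which is what matches the reported 105.999; the variable names are illustrative.

```python
from statistics import mean, stdev, variance

# The seven Control product means from the data table.
control = [1.63, 0.60, 4.53, 7.30, -4.05, 26.27, -3.12]

n = len(control)
control_mean = mean(control)
control_sd = stdev(control)        # sample standard deviation (n - 1 divisor)
control_var = variance(control)    # sample variance (n - 1 divisor)

print(f"N={n} min={min(control)} max={max(control)}")
print(f"mean={control_mean:.4f} sd={control_sd:.5f} var={control_var:.3f}")
```

This variance describes the spread of the seven product means in the Control column, not the spread of the roughly 105 individual user ratings behind them.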