In R, I have produced a boxplot for two different groups, with discrete y-values between 1 and 20. My goal is to investigate whether the average count differs between groups A and B:
unscaled = ggplot(data3, aes(x = A_or_B, y = Count))+geom_boxplot()
The problem is that the counts are strongly right-skewed, which I believe should be addressed with an appropriate transformation. The difficulty is that the mode of the counts is 1, and the frequency decreases roughly exponentially as the count increases. I have tried and considered several transformations.
First, I tried a log transformation, which I believe failed because most of the values sit at 1, and log(1) = 0:
data3$logCount = log(data3$Count)
logTransformed = ggplot(data3, aes(x = A_or_B, y = logCount))+geom_boxplot()
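To check my reasoning about the log, here is a small sketch on simulated data. This is not my real data3: the data frame `sim`, the geometric distribution, and the sample sizes are assumptions made only to mimic counts with mode 1 that tail off towards 20. Because log(1) = 0, the most frequent value maps to the minimum of the transformed scale, so the lower quartile stays pinned at the bottom.

```r
set.seed(1)

# Hypothetical stand-in for data3: counts with mode 1 that decay
# roughly geometrically and are capped at 20 (assumed, not real data).
sim <- data.frame(
  A_or_B = rep(c("A", "B"), each = 500),
  Count  = pmin(rgeom(1000, prob = 0.5) + 1, 20)
)
sim$logCount <- log(sim$Count)

# Share of observations sitting exactly at the mode
mean(sim$Count == 1)

# Quartiles before and after the log transformation
quantile(sim$Count,    c(0.25, 0.5, 0.75))
quantile(sim$logCount, c(0.25, 0.5, 0.75))
```

The same ggplot call as above can be applied to `sim` to see the corresponding boxplot.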
I also tried 1/e^Count (that is, exp(-Count)) as a transformation:
data3$One_over_e_Count = 1/exp(data3$Count)
One_ovr_e_Transformed = ggplot(data3, aes(x = A_or_B, y = One_over_e_Count))+geom_boxplot()
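For what it's worth, here is a quick check of what this transformation does to the possible values 1 to 20. It is a sketch on the grid of values only, not on data3, and the standalone vector `Count` is just an illustration.

```r
# All possible count values in my data
Count <- 1:20

# Same transformation as above; 1/exp(x) is identical to exp(-x)
One_over_e_Count <- 1 / exp(Count)

# Smallest and largest transformed values
range(One_over_e_Count)

# TRUE if larger counts always map to smaller transformed values,
# i.e. the transformation reverses the ordering
all(diff(One_over_e_Count) < 0)
```

So the ordering is flipped and the larger counts are squeezed very close to zero, which may be part of why the plot looks odd.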
Neither of these looks the way I would expect or want. I'm struggling to find other appropriate transformations that could be applied.