Problem: There are two groups of customers, group A and group B. Group A has been subject to a marketing and e-mail campaign, while group B has not been exposed to anything. By looking at these customers' spending over the last 12 months (or for as long as the experiment has been running), I want to know whether the average spending of customers in A and B differs as a result of the marketing.
The distribution of spending in both groups looks like this:
This is expected, since many customers do not buy anything within the time period we look at, so the spending is not normally distributed. According to my co-worker, one could still run a two-sample t-test here, with the following motivation:
"In many cases one can do a t-test to compare two means from a non-normal population, since the two means being compared can, given a large enough sample size, always be assumed to be normally distributed by the CLT. The assumption of normality is made about the parameter being tested and its distribution rather than about the distribution of the population itself."
I feel there are some pitfalls here because of the overrepresentation of zeroes. Also, by this CLT argument, it seems that the only tests one would ever need are z/t-tests, since everything apparently becomes normal given a sufficiently large sample size.
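One way to probe this concern would be a small simulation. The sketch below is only an illustration under assumptions of my own, not something taken from the actual data: spending is zero with probability `p_zero` and log-normal otherwise (the values 0.8, 3.0 and 1.0 are made up), and both groups are drawn from the same distribution, so every rejection is a false positive. It computes Welch's t statistic by hand and uses the normal cutoff 1.96 as a large-sample approximation, then reports how often the test rejects at different sample sizes.

```python
import math
import random


def simulate_spending(n, p_zero, rng, mu=3.0, sigma=1.0):
    """Zero-inflated spending: 0 with probability p_zero, else log-normal.

    mu and sigma are hypothetical values chosen for illustration only.
    """
    return [
        0.0 if rng.random() < p_zero else rng.lognormvariate(mu, sigma)
        for _ in range(n)
    ]


def mean_and_variance(xs):
    """Sample mean and unbiased sample variance (requires len(xs) >= 2)."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / (n - 1)
    return mean, var


def welch_t(a, b):
    """Welch's t statistic for the difference in means of a and b."""
    mean_a, var_a = mean_and_variance(a)
    mean_b, var_b = mean_and_variance(b)
    se = math.sqrt(var_a / len(a) + var_b / len(b))
    if se == 0.0:
        # Both samples are constant (e.g. all zeroes): treat as no evidence.
        return 0.0
    return (mean_a - mean_b) / se


def false_positive_rate(n, p_zero, n_sims=1000, seed=0):
    """Share of simulated null experiments with |t| > 1.96."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        a = simulate_spending(n, p_zero, rng)
        b = simulate_spending(n, p_zero, rng)
        if abs(welch_t(a, b)) > 1.96:
            rejections += 1
    return rejections / n_sims


if __name__ == "__main__":
    # If the CLT argument holds well at a given n, the rate should be near 0.05.
    for n in (20, 200, 1000):
        print(n, false_positive_rate(n, p_zero=0.8))
```

A rate far from 0.05 at a given group size would suggest the normal approximation is not yet reliable there. This only checks the false positive rate under one assumed distribution; it says nothing about power or about whether the mean is the right quantity to compare.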
Is my co-worker right?