Determine if data is IID

Question

I have data usage from Android phones and iPhones, and I want to check whether iPhone users consume more than Android users. I thought about doing a t-test, but I am not sure whether the ID (identically distributed) part holds; I think we should be OK on independence. Also, how important is this requirement?

If you have multiple measurements for at least some users then your data is not iid. — user2974951, Feb 15 '19 at 08:42
See also https://stats.stackexchange.com/questions/116355/what-does-independent-observations-mean/326161#326161 — kjetil b halvorsen, May 08 '21 at 13:19

Peter Leopold · Answer 1 · 2019-02-22T02:34:38.550

Based on the comments below, I've removed my first answer and have replaced it with my follow-up comment, which the OP may have found much more useful.

If the two populations are normal and you are testing for a difference in means (so there is a chance that they are not identically distributed), then yes, you can use the t-test of means. If the variances are different, then the Welch t-test adjusts the value of the standard error used in determining the test statistic. It also adjusts the number of degrees of freedom non-trivially. See Wikipedia's Welch's t-test page for details.
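As a rough illustration of the two adjustments mentioned above, here is a minimal sketch that computes the Welch test statistic and the Welch-Satterthwaite degrees of freedom by hand. The function name `welch_t` and the sample values are hypothetical, made up for this example, and the sketch assumes one independent measurement per user.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance t-test."""
    na, nb = len(a), len(b)
    # Squared standard error of each sample mean (sample variance / n).
    va, vb = variance(a) / na, variance(b) / nb
    # The standard error of the difference does not pool the variances.
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


# Hypothetical monthly data usage in GB, one value per user.
iphone = [5.1, 6.3, 4.8, 7.2, 5.9, 6.6]
android = [4.2, 5.0, 3.9, 4.7, 6.1, 4.4, 5.3]

t, df = welch_t(iphone, android)
print(f"t = {t:.3f}, df = {df:.2f}")
```

If SciPy is available, `scipy.stats.ttest_ind(iphone, android, equal_var=False)` runs the same test and also returns a p-value.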

What IID has to do with a) study design (lots!) and/or b) actual raw unconditional data (nothing!) is a question for another post and really should not have been part of my answer to this question. Apologies.

edited Feb 22 '19 at 02:34

answered Feb 15 '19 at 02:49


Thanks Peter. So let's suppose my data is normally distributed, but the mean and variance differ so it's not ID. I'd like to test if that difference is significant. Should I be ok doing a t-test? What is the risk there? – Luis Feb 15 '19 at 03:59

If the two populations are normal and you are testing for a difference in means (so there is a chance that they are not identically distributed), then yes, you can use the t-test of means. If the variances are different, then the Welch t-test adjusts the value of the standard error used in determining the test statistic t. It *also* adjusts the number of degrees of freedom non-trivially. See https://en.wikipedia.org/wiki/Welch's_t-test for Welch's t-test with different variances. – Peter Leopold Feb 15 '19 at 04:23

Wow! I applaud your enthusiasm, and in good faith welcome you to the site, but I find this answer lousy! Nonparametric tests are not tests of means (not without additional extremely rigid assumptions). I am also baffled as to why you consider IID a property of the analytic model, and do not direct attention to the question of whether a data generating process produces IID data, particularly given that there are a host of tests to provide evidence for and against the IID assumption. Finally, you go wide in your answer instead of narrowly homing in on the OP's question. – Alexis Feb 21 '19 at 17:39
PS If I am misunderstanding your meaning of "model" I am happy to be enlightened. – Alexis Feb 21 '19 at 17:45

@Alexis, thank you for your warm welcome, and I take your criticism to heart, so thank you for that as well. I seem to have upset several people with this answer. I think my key error was trying to read more into the question than the OP intended, and the rebound lay-up -- which was the conventional Welch's T-test answer -- was what the OP was looking for. I should simply ask the question "What does IID have to do with a) study design and b) actual data?" in a separate post. – Peter Leopold Feb 22 '19 at 02:22

No upsets here. Just intellectual criticism. And welcome again, please stay around. – Alexis Feb 22 '19 at 04:03
