Small Sample Size - Skewed Data - Which statistical methods am I allowed to use in my bachelor thesis?

Question

I have 14 sample network graphs and I want to empirically analyze a number of their metrics (e.g. density, clustering coefficient, and 8 others) to draw conclusions about how dense and clustered one can expect these graphs to get.

The resulting metrics generally appear to be skewed. Given that, is it still appropriate to use the mean rather than the median, report confidence intervals for the means, and analyze the standard deviation of the different metrics, while analyzing the correlation between the variables with the Pearson or Spearman coefficient?

Am I obliged to test every metric of my sample for normality and then apply different (parametric/non-parametric) methods to each metric?

Since I am on my own with this thesis, I do not even know whether my approach is right in the first place. I would be grateful for any suggestions about the design of my resulting report.

Thank you so much for your help!

Each of these issues is discussed on site already, but you ask multiple questions, so their answers are split across multiple previous questions. 1. The term "allowed" suggests there's a person or people preventing you from doing analyses you might otherwise choose to do. There's no such group; you *can* do what you please analysis-wise, and nobody can stop you. The questions to ask yourself are what the properties are of what you might consider doing, and what claims you can make that are supported well enough to convince others. 2. With skewed distributions, means are not "bad" ... - Glen_b, Feb 17 '22 at 09:21
... nor are medians "good"; it depends on what you're trying to find out. Sometimes means are exactly the thing you want. 3. The central issue with Pearson or Spearman correlations is whether you're interested in linear relationships or monotonic relationships. The issue of normality is fairly secondary and in any case can be avoided (you don't have to use a test that assumes it). 4. On testing normality assumptions, see [Is normality testing essentially useless?](https://stats.stackexchange.com/questions/2492/is-normality-testing-essentially-useless) (among a number of other posts on site) - Glen_b, Feb 17 '22 at 09:23

score 0 · Answer 1 · answered Feb 16 '22 at 22:28

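Normality testing isn't very powerful with small samples like yours, so it can easily miss real departures from normality (and it can show "significant" but unimportant deviations in very large samples). See [Is normality testing essentially useless?](https://stats.stackexchange.com/questions/2492/is-normality-testing-essentially-useless) for extensive discussion.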

There are useful non-parametric tests that correspond to standard parametric tests and can perform almost as well as the parametric test even when the assumptions of the parametric test are met. For example, see this page about the Mann-Whitney-Wilcoxon test versus a t-test. This page examines Pearson correlation versus non-parametric Spearman or Kendall. It probably makes the most sense for you to concentrate on non-parametric tests with small sample sizes and substantial skew.
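As a rough illustration of the Pearson versus Spearman distinction, the sketch below computes both coefficients with the Python standard library only. The data are invented for the example (14 points with a perfectly monotonic but non-linear relationship), and the helper names `average_ranks`, `pearson` and `spearman` are hypothetical, not taken from any particular package.

```python
import math
import statistics


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation: strength of the linear relationship."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up data: y grows monotonically but not linearly with x.
x = list(range(1, 15))
y = [v ** 3 for v in x]

r_pearson = pearson(x, y)
r_spearman = spearman(x, y)
print(f"Pearson:  {r_pearson:.3f}")
print(f"Spearman: {r_spearman:.3f}")
```

Because the relationship is perfectly monotonic, the Spearman coefficient reaches 1 while the Pearson coefficient stays below 1. Which one to report depends on whether a linear or a monotonic relationship is the quantity of interest.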

The means and standard deviations are probably worth reporting even if you use nonparametric significance tests. You might want to supplement those characterizations with the median and the median absolute deviation, which are less affected by outliers.
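A minimal sketch of reporting both sets of summaries side by side, using only the Python standard library. The `densities` values are made-up, right-skewed numbers standing in for one metric across 14 graphs, and `median_abs_deviation` and `summarize` are hypothetical helper names. The MAD here is the unscaled version, without the consistency factor sometimes applied to make it comparable to a standard deviation.

```python
import statistics


def median_abs_deviation(values):
    """Unscaled median absolute deviation from the median."""
    center = statistics.median(values)
    return statistics.median([abs(v - center) for v in values])


def summarize(values):
    """Return classical and robust summaries of one metric."""
    return {
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),  # sample standard deviation (n - 1)
        "median": statistics.median(values),
        "mad": median_abs_deviation(values),
    }


# Invented, right-skewed density values for 14 graphs (illustration only).
densities = [0.02, 0.03, 0.03, 0.04, 0.04, 0.05, 0.05,
             0.06, 0.07, 0.08, 0.10, 0.15, 0.30, 0.55]

summary = summarize(densities)
for name, value in summary.items():
    print(f"{name:>6}: {value:.4f}")
```

With data like these, the two largest values pull the mean and standard deviation upward, while the median and MAD barely respond to them, which is why reporting both pairs gives a fuller picture.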
