What statistical test would I use?

Question

I currently have a sample of 50 participants in SPSS.

I want to identify whether people are more willing to travel interstate for holidays than to travel overseas for holidays in the near future.

The variable "travelling interstate" is ordinal as it is divided into four categories (1=definitely not, 2=probably not, 3=possibly yes, 4=definitely yes).

The variable "travelling overseas" is also ordinal as it is also divided into four categories (1=definitely not, 2=probably not, 3=possibly yes, 4=definitely yes).

I'm confused as to which test I would use to carry out this analysis. I'm assuming it wouldn't be any of the t-tests due to the ordinal variables.

BruceET · Answer 1 · 2021-05-25T22:48:29.880

Paired t tests and Wilcoxon signed-rank (SR) tests are inappropriate here. If you were to do a paired t test on the differences between preferences for US and foreign travel, one would wonder about the validity of the results, because it is unlikely that fifty small-integer values would be approximately normally distributed.

If you were to try to do a Wilcoxon signed-rank test, then you would likely get a warning message about the large number of 0 differences and other ties among differences.

A sign test is a better choice. Suppose you had score differences d (score for travel in the US minus score for foreign travel) such as the fictitious ones below:

table(d)
d
-2 -1  0  1 
17 12 13  8

Then you could do a sign test by ignoring the $0$-differences, which provide no direct information about a preference. Out of $m=37$ subjects who had different opinions, $x = 29$ had negative differences (with $d$ defined as US minus foreign, they rated foreign travel higher), while $m-x = 8$ had positive differences (they rated travel in the US higher). Under the null hypothesis that the two kinds of travel are equally favored, $X \sim \mathsf{Binom}(37, .5).$
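The counts used in this step can be recovered from the tabulated differences. Here is a minimal sketch in R, which rebuilds d from the frequencies in the table rather than from the original sampling code (the names x and m follow the text; n_pos is only illustrative):

```r
# Rebuild the fictitious differences from the tabulated counts
d <- rep(c(-2, -1, 0, 1), times = c(17, 12, 13, 8))

x     <- sum(d < 0)   # number of negative differences
n_pos <- sum(d > 0)   # number of positive differences
m     <- x + n_pos    # zero differences are dropped for the sign test

c(negative = x, positive = n_pos, m = m)

# Under H0 the count of either sign is Binom(m, 0.5)
c(mean = m * 0.5, sd = sqrt(m * 0.25))
```

Only the signs of the differences enter the test, so the sizes of the differences (for example, -2 versus -1) play no role.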

Then prop.test in R rejects the null hypothesis against the two-sided alternative with a P-value of about $0.001.$ So an imbalance this extreme between positive and negative differences would be very unlikely if $H_0$ were true. This use of prop.test is sometimes called a sign test.

prop.test(29, 37)

        1-sample proportions test with continuity correction

data:  29 out of 37, null probability 0.5
X-squared = 10.811, df = 1, p-value = 0.001009
alternative hypothesis: true p is not equal to 0.5
95 percent confidence interval:
 0.6133573 0.8957582
sample estimates:
        p 
0.7837838
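For readers who want to see where these numbers come from, the statistic can be recomputed by hand. This is a sketch assuming the usual continuity-corrected chi-squared form for a one-sample proportion; the names stat and pval are illustrative:

```r
x  <- 29    # count of one sign among the nonzero differences
m  <- 37    # number of nonzero differences
p0 <- 0.5   # null probability

# Continuity-corrected chi-squared statistic for H0: p = p0
stat <- (abs(x - m * p0) - 0.5)^2 / (m * p0 * (1 - p0))

# Upper-tail probability of a chi-squared variable with 1 df
pval <- pchisq(stat, df = 1, lower.tail = FALSE)

round(stat, 3)
signif(pval, 4)
```

The two values should agree with the X-squared and p-value lines of the prop.test output.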

The version of the sign test above uses a normal approximation, which should be OK for $m$ as large as 37. An exact test, which uses binomial CDFs, could also be used:

binom.test(29, 37)

        Exact binomial test

data:  29 and 37
number of successes = 29, number of trials = 37, p-value = 0.0007529
alternative hypothesis: true probability of success is not equal to 0.5
95 percent confidence interval:
 0.6178635 0.9017344
sample estimates:
probability of success 
             0.7837838

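Because $\mathsf{Binom}(37, .5)$ is symmetric, the two-sided P-value of the exact binomial test is twice the lower-tail probability, or equivalently the sum of both tails: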

pbinom(8, 37, .5) * 2
[1] 0.0007528971

pbinom(8, 37, .5) + 1 - pbinom(28, 37, .5)
[1] 0.0007528971

Also, the 'sign test for median 0' from Minitab statistical software agrees with the exact binomial test in R:

Sign Test for Median: d 

Sign test of median =  0.00000 versus ≠ 0.00000

    N  Below  Equal  Above       P  Median
d  50     29     13      8  0.0008  -1.000

Notes: (1) For the record, abbreviated output for t.test and wilcox.test applied to the differences (equivalent to the paired versions) is shown below. P-values are small and would lead to rejection, but one would have to wonder whether to trust them.

t.test(d)$p.val
[1] 1.117456e-05

wilcox.test(d)$p.val
[1] 3.225493e-05
Warning messages:
1: In wilcox.test.default(d) : cannot compute exact p-value with ties
2: In wilcox.test.default(d) : cannot compute exact p-value with zeroes

(2) Here is R code used to sample the fictitious differences used in the examples above. These fictitious differences are clearly not normal.

set.seed(2021)
d = sample(-2:1, 50, replace = TRUE, prob = c(1,2,2,1)/6)

shapiro.test(d)

        Shapiro-Wilk normality test

data:  d
W = 0.84736, p-value = 1.313e-05

Let's say I had a much larger sample (e.g. 800). Would a Wilcoxon test be suitable then? – user322684, May 25 '21 at 22:56
That's changing the question quite a bit. // Only if the 800 difference scores were roughly symmetrical would the paired Wilcoxon SR test be OK. (In R, anyhow, with such a large n, the P-value would be computed using a normal approximation that is not so sensitive to ties.) However, a Wilcoxon SR test on asymmetrical data is inappropriate, and can lead to misleading results. See this [Q&A](https://stats.stackexchange.com/questions/14434/appropriateness-of-wilcoxon-signed-rank-test). – BruceET, May 25 '21 at 23:21
Note that ties can be dealt with by using the exact conditional distribution of the test statistic. A more fundamental issue with the signed rank test in this case is that the calculation of differences requires data on an interval, & not merely an ordinal, scale. – Scortchi - Reinstate Monica, May 26 '21 at 00:22

Hussain · Answer 2 · 2021-05-25T20:54:04.913

Actually, you have one sample with repeated measures on it (one for interstate and one for overseas), so this is a repeated-measures design.

If the data were continuous, this would call for a dependent-samples t-test. But since you have ordinal variables, this should be a nonparametric repeated-measures test (specifically, the Wilcoxon test for 2 related samples), which is analogous to the dependent-samples t-test.

You'll find this test in SPSS. However, if you want more accurate results, especially if you suspect that your data meet the assumptions of the test, you can use the Exact option. If you try it and SPSS hangs (or struggles because of the excessive amount of computation), you can go for the other option, namely Monte Carlo.

If you don't want to bother, just keep things at the defaults :)

edited May 25 '21 at 20:54

answered May 25 '21 at 20:46


Thank you for your comment! However, I assumed that the Wilcoxon test for 2 related samples was used to compare before and after? Could it be used for this scenario as well? – user322684 May 25 '21 at 21:45
In this scenario, I guess there will be too many ties to get a reliable Wilcoxon P-value. – BruceET May 25 '21 at 21:59
For user322684: since you have 1 sample and are measuring attitudes on the same sample in 2 different situations, you can assume that it's a valid test to use. As for what BruceET said, yes, he's right. But another test closely related to Wilcoxon is Marginal Homogeneity (you'll find it under the same dialog). This test compares the different directions in answering (from Question 1 to Question 2). Example: it looks at those with answer 1 in Q1 who also chose answer 1 in Q2; then checks 1->2, then 1->3, then 1->4; 2->1, 2->2, etc. So it'll see whether participants significantly change their direction of answering – Hussain May 27 '21 at 03:09
You can imagine the change as follows: whether participants changing from low to high are significantly more numerous than those changing in the opposite direction (high->low), or vice versa. – Hussain May 27 '21 at 04:13
And to elaborate on your assumption of "before-after": remember the name "Repeated Measures". The whole idea stems from having two or more measures for a group of people. Yes, "before-after" is one application (and maybe the clearest one) of having two measures. Even if we accept your assumption (before-after), we can still see a few seconds (or minutes) between answering Q1 and answering Q2 :) – Hussain May 27 '21 at 04:20
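The direction-of-change idea described in these comments can be illustrated by cross-tabulating the paired answers. The sketch below is in R and uses made-up responses for ten hypothetical participants (the vectors interstate and overseas are not the asker's data). It only counts the changes in each direction and does not run the Marginal Homogeneity test itself:

```r
# Hypothetical paired answers on the 1-4 scale (illustration only)
interstate <- c(4, 3, 3, 2, 4, 1, 3, 2, 4, 3)
overseas   <- c(2, 3, 1, 2, 3, 1, 4, 1, 2, 3)

# Square 4 x 4 table: rows = interstate answer, columns = overseas answer
lev <- 1:4
tab <- table(interstate = factor(interstate, levels = lev),
             overseas   = factor(overseas,   levels = lev))
tab

# Off-diagonal cells record a change of answer between the two questions
higher_overseas   <- sum(tab[upper.tri(tab)])  # overseas rated above interstate
higher_interstate <- sum(tab[lower.tri(tab)])  # interstate rated above overseas
unchanged         <- sum(diag(tab))            # same answer to both questions

c(higher_interstate = higher_interstate,
  higher_overseas   = higher_overseas,
  unchanged         = unchanged)
```

The two off-diagonal totals are the same counts that a sign test on the differences would compare; the diagonal corresponds to the zero differences.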
