Given
- Likert scale survey responses:
"Very Satisfied", "Very Dissatisfied", "Dissatisfied", "Neutral", "Neutral", "Very Satisfied", "Satisfied", "Very Satisfied", "Very Satisfied", "Satisfied", "Very Satisfied"
- Map between Likert scale and numeric values:
"Very Dissatisfied": 1, "Dissatisfied": 2, "Neutral": 3, "Satisfied": 4, "Very Satisfied": 5
- Manager's claim: The typical sentiment of customers at her store is more than neutral.
Questions (with attempts)
- Translate the raw data into pseudo numbers.
- x = [5, 1, 2, 3, 3, 5, 4, 5, 5, 4, 5]
- Specify clearly the "parameter" in the nonparametric setting that best captures what must be measured to assess the manager's claim. Denote that quantity $\theta$.
- Since the mean is greatly affected by outliers, we should use the median: $\theta = \text{median}(x)$.
- Write down the null and alternative hypotheses of the hypothesis testing task that translates the manager's claim.
- ${\tt H_0}$: $\theta \leq 3$ vs. ${\tt H_A}$: $\theta > 3$.
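The recoding and the sample median can be reproduced in base R. This is a minimal sketch; the names `responses`, `likert_map`, and `x` are only illustrative and not part of the assignment.

```r
# Raw Likert responses, in the order given above
responses <- c("Very Satisfied", "Very Dissatisfied", "Dissatisfied",
               "Neutral", "Neutral", "Very Satisfied", "Satisfied",
               "Very Satisfied", "Very Satisfied", "Satisfied",
               "Very Satisfied")

# Named vector mapping each label to its numeric score
likert_map <- c("Very Dissatisfied" = 1, "Dissatisfied" = 2, "Neutral" = 3,
                "Satisfied" = 4, "Very Satisfied" = 5)

# Look up each response by name; unname() drops the labels
x <- unname(likert_map[responses])
x

# Sample median of the recoded scores
median(x)
```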
This is where I start to be unsure of my answers.
- Write down the formula of the appropriate test statistic $B$ for the sign test to be used to assess the manager's claim [Hint: Use the exclusion approach on ties, so that used sample size is reduced]. $$ B = \sum_{i=1}^{n} \psi_i \text{ where } \psi_i = \begin{cases} 1 & \text{if } x_i > 3 \\ 0 & \text{if } x_i < 3 \end{cases} $$
- Write down the sampling distribution of $B$.
- $B$ is the sum of Bernoulli random variables, so it has a binomial distribution: $B \sim \text{Binom}(n, p)$.
- Since two values of $x$ equal $3$, we exclude them, giving $n=9$. Under $H_0$ at the boundary $\theta = 3$, each remaining observation is equally likely to fall above or below $3$, so $p=0.5$.
- At significance level $\alpha=0.05$, write down the rejection region ${\tt RR}_{0.05}$.
- Since $n \times p_0 = 4.5$ and $n \times (1-p_0) = 4.5$ are both less than $5$, we cannot use the approximate test statistic: $Z = \frac{B - \text{mean}(B)}{\sqrt{\text{variance}(B)}} = \frac{B - n p_0}{\sqrt{n p_0 (1 - p_0)}}$
- Thus, our test statistic is $\min(\#\{x_i < 3\}, \#\{x_i > 3\}) = \min(2, 7) = 2$.
- ${\tt RR}_{\alpha=0.05} = \{B\colon B_{\tt obs} \geq 2\}$.
- Compute $B_{\tt obs}$, the observed value of the test statistic for the data.
- $B_{\tt obs} = 7$
- Provide your final decision on this test at significance level $\alpha=0.05$.
- $B_{\tt obs} = 7$ falls inside of the rejection region, so we reject $H_0$ in favor of $H_A$.
- Compute the p-value for this test and comment on what it says.
- p-value $= \text{Prob}(B \geq B_{\tt obs} \mid H_0 \text{ is true})$
- In ${\tt R}$: p-value $= 1-\text{pbinom}(7, 9, 0.5) = 0.01953125$.
- Since the p-value is less than $\alpha$, reject $H_0$.
- Provide a $95\%$ lower confidence bound for $\theta$.
- Not sure how to do this.
- Use the ${\tt R}$ function ${\tt SIGN.test()}$ to perform the same test performed step by step earlier.
library(BSDA)  # provides SIGN.test()
sign_test <- SIGN.test(x, md = 3, alternative = 'greater', conf.level = 0.95)
pvalue <- sign_test$p.value
- Since the p-value $= 0.089$ is greater than $\alpha = 0.05$, we fail to reject $H_0$.
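The hand calculation in the steps above can be cross-checked in base R. This is only a sketch under the exclusion approach to ties; the variable names (`theta0`, `kept`, `n`, `b_obs`, `p_upper`, `p_strict`, `p_inclusive`) are illustrative, and the tail probability is written out directly from the definition $\text{Prob}(B \geq B_{\tt obs} \mid H_0 \text{ is true})$.

```r
x <- c(5, 1, 2, 3, 3, 5, 4, 5, 5, 4, 5)
theta0 <- 3                      # hypothesized median under H_0

kept  <- x[x != theta0]          # exclusion approach: drop ties with theta0
n     <- length(kept)            # effective sample size
b_obs <- sum(kept > theta0)      # number of observations above theta0

# Upper-tail probability P(B >= b_obs) for B ~ Binom(n, 0.5),
# summed term by term from the binomial pmf
p_upper <- sum(dbinom(b_obs:n, size = n, prob = 0.5))

# pbinom(q, ...) returns P(B <= q), so:
#   1 - pbinom(b_obs, ...)     is P(B >  b_obs)
#   1 - pbinom(b_obs - 1, ...) is P(B >= b_obs)
p_strict    <- 1 - pbinom(b_obs, size = n, prob = 0.5)
p_inclusive <- 1 - pbinom(b_obs - 1, size = n, prob = 0.5)

c(n = n, b_obs = b_obs, p_upper = p_upper,
  p_strict = p_strict, p_inclusive = p_inclusive)
```

Comparing `p_strict` and `p_inclusive` with the value reported by `SIGN.test()` may help locate where the two approaches diverge.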
The conclusion from Question 11 disagrees with those from Questions 8 and 9. Since we have the sample data, we know that $\text{median}(x) = 4$, so we definitely should be rejecting $H_0$, but the ${\tt SIGN.test()}$ function is telling us that we can't.
I don't just want answers; I want to understand how to solve problems like this. Thank you in advance for any help you can give me!