Small Sample (N=32), using simple regression instead of multiple regression to test Hypothesis?

Question

As part of my thesis I have proposed hypotheses:

Perceived Supervisor Support is positively related to Employee Engagement
Perceived Organization Support is positively related to Employee Engagement
Perceived HRM Practices are positively related to Employee Engagement

I ran the survey in my company and received 32 valid responses. Correlation analysis gives significant results. To test the hypotheses I used multiple regression, but none of my beta values is statistically significant, whereas when I run a simple regression of Employee Engagement on each individual driver I get statistically significant results.

My question is: is using simple regression a valid approach to prove these hypotheses? If yes, is there a reference I can cite to justify this approach in this case? Is there another way I can prove these hypotheses?

You describe an experiment with two fatal flaws that are uncorrectable: it is subject to non-response bias and the sample is too small to achieve your aims. — whuber, Oct 07 '17 at 16:26
I see, however when you are doing a survey for a company where the population size is already small there is not much of a choice left. I can see if I can extend the survey to other companies as well but then my results are not valid for my company itself. What can be a better approach here then? — aparna gupta, Oct 07 '17 at 17:01
You need to clarify your objectives. The narrowest possible one is to characterize the relationships among the current employees of the company. If you were to obtain 100% response, then there remains no uncertainty (assuming the responses are full, honest, and correct) and "significance" is meaningless. If your objective is to draw inferences about other employees of the company, or the same employees at different times, or any other larger population, then the lack of significance in the multiple regression suggests redesigning the survey. — whuber, Oct 07 '17 at 17:07
Well, the sample size of 32 constitutes about 60% of the employees. Can we ignore significance when we specifically mention that the results are just for the participants? Also, if we ignore this aspect, the question from a statistical point of view still stands unanswered: can we use simple regression to prove individual hypotheses or not? — aparna gupta, Oct 07 '17 at 17:42
Let's cut to the chase: when your original plan is to relate a response to three regressors and your chosen analytical method does not conclude there is a response, *then everything else you do has to be seen as an effort to create "significance" where none exists.* I hope that the problems with that are apparent. It doesn't mean your data are worthless--for instance, you can explore them to develop hypotheses for future investigations--but it does mean that any subsequent claims of "significance" in any results have to be viewed with suspicion. — whuber, Oct 07 '17 at 18:12
The reason the point estimates in the multiple regression (with 3 regressors) are insignificant individually is quite likely related to the fact that there may well be near-multicollinearity between the 3 regressors. Please state the values of the three pairwise correlation coefficients among the regressors. — Mico, Oct 07 '17 at 23:38
@whuber: I appreciate your time and will look into your recommendation. — aparna gupta, Oct 09 '17 at 08:28

score 4 · Answer 1 · answered Oct 07 '17 at 18:22

First, as @whuber mentioned, you have the problem of selective response. This really dwarfs all other problems and makes any results dubious. Even with about 60% of the population in your sample, the estimates could be way off, because the people who responded may differ systematically from those who did not. However, if you say, as per one of your comments, "these results only apply to these respondents", then you don't need inference at all.

Second, your three independent variables are almost certainly collinear. You need to deal with that, even if you solve the nonresponse issue. One way to do this is to use ridge regression.
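To make the collinearity point concrete, here is a minimal sketch, assuming simulated data rather than your survey: three predictors that share a common factor, N = 32, and an arbitrary ridge penalty `lam = 10` that in practice you would pick by cross-validation. It reports the pairwise correlations, the variance inflation factors, and how ridge shrinks the coefficients compared with ordinary least squares. All variable names and the data-generating setup are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable
n = 32                          # same sample size as in the question

# Simulated data (NOT the survey): three predictors sharing a common factor.
common = rng.normal(size=n)
X = np.column_stack([common + 0.4 * rng.normal(size=n) for _ in range(3)])
y = X.sum(axis=1) / 3 + rng.normal(size=n)

# Standardize the predictors and center the response, so no intercept is needed.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
yc = y - y.mean()

# Collinearity diagnostics: pairwise correlations and variance inflation factors.
# The VIFs are the diagonal of the inverse of the predictor correlation matrix.
R = np.corrcoef(Z, rowvar=False)
vif = np.diag(np.linalg.inv(R))

def ridge(Z, yc, lam):
    """Closed-form ridge estimate: solve (Z'Z + lam * I) b = Z'y."""
    p = Z.shape[1]
    return np.linalg.solve(Z.T @ Z + lam * np.eye(p), Z.T @ yc)

beta_ols = ridge(Z, yc, 0.0)     # lam = 0 reduces to ordinary least squares
beta_ridge = ridge(Z, yc, 10.0)  # lam = 10 is an arbitrary illustrative value

print("pairwise correlations:\n", np.round(R, 2))
print("VIF:", np.round(vif, 2))
print("OLS coefficients:  ", np.round(beta_ols, 3))
print("ridge coefficients:", np.round(beta_ridge, 3))
```

Ridge trades some bias for lower variance, so the coefficients are more stable, but the usual t-tests and p-values no longer apply to them directly.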

Third, doing three simple regressions is OK, but it answers a different question than doing one multiple regression. Specifically, in a multiple regression, the effect of each IV is estimated controlling for the other IVs. In simple regression there is no control.
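The difference between the two questions can be sketched numerically. The example below again uses simulated data (an assumption, not your responses) and a small hypothetical helper `ols`. It fits one multiple regression and three simple regressions, then checks the in-sample identity that each simple slope equals the multiple-regression coefficients weighted by the covariances among the predictors. In other words, the simple slope for one driver absorbs part of the effect of the other two whenever they are correlated.

```python
import numpy as np

def ols(X, y):
    """Least squares with an intercept; returns coefficients and t-statistics."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    dof = len(y) - A.shape[1]
    sigma2 = resid @ resid / dof
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(A.T @ A)))
    return beta, beta / se

rng = np.random.default_rng(1)  # fixed seed so the sketch is repeatable
n = 32

# Simulated, correlated predictors (NOT the survey data).
common = rng.normal(size=n)
X = np.column_stack([common + 0.4 * rng.normal(size=n) for _ in range(3)])
y = X.sum(axis=1) / 3 + rng.normal(size=n)

# One multiple regression: each slope is estimated controlling for the others.
beta_multi, t_multi = ols(X, y)

# Three simple regressions: no control for the other predictors.
simple = [ols(X[:, [j]], y) for j in range(3)]
simple_slopes = np.array([b[1] for b, _ in simple])
simple_t = np.array([t[1] for _, t in simple])

# Identity: simple slope_j = sum_k beta_k * cov(x_k, x_j) / var(x_j).
S = np.cov(X, rowvar=False)
implied = S @ beta_multi[1:] / np.diag(S)

print("multiple-regression slopes:", np.round(beta_multi[1:], 3))
print("multiple-regression t:     ", np.round(t_multi[1:], 2))
print("simple-regression slopes:  ", np.round(simple_slopes, 3))
print("simple-regression t:       ", np.round(simple_t, 2))
print("slopes implied by identity:", np.round(implied, 3))
```

So a significant simple regression says the driver is associated with engagement marginally. It does not say the driver adds anything once the other two are taken into account.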

score 3 · Answer 2 · answered Oct 07 '17 at 17:13

Significant predictors become non-significant in multiple logistic regression

sounds like a duplicate of your question. For example, (2) might be an issue for you.

Now, obviously, the two variables are strongly related, as you need to be older to have more experience. Hence, the two variables basically "compete" for explaining the status, which may, especially in small samples,

Your sample size is small, as already noted by @whuber. Your variables also sound as if they are strongly positively correlated with each other.
