Is an F-test the correct method for power analysis of my multivariate regression data?

Question

I have been working on a problem that requires running a power analysis to determine whether the subset of data we're analyzing is sufficient before we run the analysis itself. I think an F-test is the correct test to use for the power analysis, but I wanted to ask here since programming is my specialty, not math, so a lot of this is new to me.

The data that I have is basically a response value that responds to about 15 different practices. Every observation does have a value recorded for the response variable as well as each of the 15 practices. Each practice has a simple "Yes" or "No" recorded for it. I'm trying to determine whether the practices do affect the response variable. The response variable works like a normal number (where going up is good and going down is bad). The response variable cannot be negative (not sure this matters).

I've been trying for a few weeks to determine the best method of running power analysis for this dataset. I already have the number of observations (data is already collected). What I'm trying to do is calculate the power to ensure power is > 0.80 for my data.

My question is: would an F-test be appropriate for calculating the power in this situation, or would you recommend something else? The specific implementation that I'm using is `pwr.f2.test` from the R package pwr: https://cran.r-project.org/web/packages/pwr/pwr.pdf

I would like to add that I have another response variable, recorded alongside everything else, that can be negative. The response variables are never analyzed together. In other words, I want to run the power analysis and the subsequent analysis (assuming power is high enough) for the first response variable plus all independent variables, and then run a separate analysis using the same data with the other response variable. Would the type of power analysis need to be different for the second response variable, which can be negative?

Thank you for your help. Let me know if a data sample would be helpful.

Edit: I should also add that when we're running power analysis we're running it on the subset of data that we currently have for a given user in our software system. More data is entered daily, so the power value will change as more data comes in. What we're trying to do is determine if the subset of data that we're looking at for a given user has adequate power at that point in time to run the report we're running.

If you have already collected the data and if, as I infer from your question, you do not intend to collect more, what do you hope to accomplish by performing a power analysis at this stage of the study? At this stage you analyze the data you have. If you find significant results you had adequate power; if not, you didn’t and you could perhaps consider these data as a pilot study to help design a new study that might have adequate power. — EdM, Feb 03 '20 at 13:21
Ah, yeah my mistake. I meant to mention that we are always collecting more data (this is through software). More data is coming in every day. What we're doing is looking at a subset of the total data based on the user viewing the report and determining if that subset has high enough power. If it doesn't we don't run the report, and if it does we do run the report. So the point of the power analysis is to determine if the data currently has adequate power. More data is collected day by day. I'll update the question to make that clear. — Dylan Hamilton, Feb 03 '20 at 14:55

score 1 · Answer 1 · answered Feb 03 '20 at 15:56

For a linear regression of either of your outcomes against your predictors, the F-test evaluates whether any one of your predictors ("practices" in your case) has a regression coefficient significantly different from zero. So in principle you are on the right track.
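
As a sketch of what that omnibus test computes (the symbols $n$ for the number of observations, $p$ for the number of predictors, and $\beta_j$ for the regression coefficients are introduced here for illustration and are not from the question), the null hypothesis is $H_0: \beta_1 = \dots = \beta_p = 0$, and for a model with an intercept the statistic can be written in terms of $R^2$:

$$F = \frac{R^2 / p}{(1 - R^2) / (n - p - 1)}$$

Under $H_0$ and the usual linear-model assumptions, $F$ follows an F distribution with $p$ and $n - p - 1$ degrees of freedom; with the 15 practices described in the question, $p = 15$.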

Without knowing more about the data, however, it's hard to know whether linear regression will be an appropriate model. It doesn't matter whether the outcome is necessarily non-negative. What matters most is that each of your predictors has an additive effect upon the outcome; it's better if the magnitudes of the deviations from the model don't depend on the predicted values; it's best if those deviations have something close to a normal distribution as assumed by the F-test. A search for "linear regression diagnostics" will show ways to evaluate these properties of your models.

Finding the number of cases needed to provide a desired power requires knowing the significance level of the test, the difference that you would like to detect, and an estimate of the variability in what you wish to detect. The ratio of the last two factors (desired detectable difference to the variability) is related to the effect size that functions like `pwr.f2.test()` expect as an input.

What you seem to want to do is to make some power estimate without looking at the data. That leads to two problems. First, you don't know whether the linear regression will be appropriate for your data. Second, just what will you specify as an effect size, if you don't know anything about the variability of the data or of the results from your modeling of the data?

As you are already collecting data and plan to collect more, take advantage instead of the data that you have (or some appropriately sampled subset) and treat it like a pilot study, which is generally the best way to proceed with power analysis for study design. You will see how well the linear regression model works and whether it might need to be modified in some respects. You will get regression and error mean-square values to inform your power analysis, rather than depending on the arbitrary "small," "medium" and "large" effect sizes that the pwr package will give you in the absence of data. You will find whether certain predictors have relationships to outcome that are of particular interest; in that case, your power analysis might be better done with respect to the values of those individual regression coefficients rather than the omnibus F-test.

Also, you probably should evaluate how your two outcome measures are related to each other. Sometimes analyzing multiple outcomes together works better than analyzing them separately. Again, you need to see a sample of your data to evaluate that.

Thank you for the thoughtful input. One thing I must have miscommunicated is that we are considering the data we currently have when running power analysis. The effect size is being calculated using an R-squared value that is calculated by the `lm` function in R. So when I run `pwr.f2.test`, all of the data-dependent parameters (numerator degrees of freedom, denominator degrees of freedom, and the f2 effect size) are calculated from the data, with a significance level of 0.05. I should also note that the analysis we run if power is high enough is separate from the power analysis itself. — Dylan Hamilton, Feb 03 '20 at 17:40
@DylanHamilton be careful in how you translate $R^2$ values to the effect-size values. See [this page](https://stats.stackexchange.com/q/56881/28500) for the relationship between $R^2$ and F values. `pwr.f2.test` uses the effect size to determine the non-centrality parameter needed for the power calculation. It's really not clear what this apparently repeated power analysis is accomplishing, unless you suspect that the relationships between the practices and the responses are changing over time. — EdM, Feb 03 '20 at 18:04
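
To make the translation in the comment above concrete, here is a minimal Python sketch (standard library only) of how an $R^2$ value maps to the quantities that `pwr.f2.test` takes. Several things here are assumptions rather than anything stated in the thread: the helper names `cohen_f2` and `f_test_inputs` are hypothetical, the sample size and $R^2$ are made-up illustrative numbers, and the non-centrality convention $\lambda = f^2 (u + v + 1)$ is what recent versions of pwr appear to use, so check it against the version you have installed.

```python
def cohen_f2(r_squared: float) -> float:
    """Convert R^2 to Cohen's f^2 effect size: f2 = R^2 / (1 - R^2)."""
    if not 0.0 <= r_squared < 1.0:
        raise ValueError("r_squared must be in [0, 1)")
    return r_squared / (1.0 - r_squared)


def f_test_inputs(n_obs: int, n_predictors: int, r_squared: float) -> dict:
    """Hypothetical helper: derive omnibus F-test inputs from a fitted model.

    u   : numerator degrees of freedom (number of predictors)
    v   : denominator degrees of freedom (n - p - 1, model with intercept)
    f2  : Cohen's f^2 effect size
    ncp : non-centrality parameter, ASSUMING lambda = f2 * (u + v + 1)
    """
    u = n_predictors
    v = n_obs - n_predictors - 1
    if v <= 0:
        raise ValueError("need more observations than predictors plus one")
    f2 = cohen_f2(r_squared)
    ncp = f2 * (u + v + 1)  # assumed convention; equals f2 * n_obs here
    return {"u": u, "v": v, "f2": f2, "ncp": ncp}


# Illustrative numbers only: 200 observations, 15 yes/no practices, R^2 = 0.10.
inputs = f_test_inputs(n_obs=200, n_predictors=15, r_squared=0.10)
print(inputs)
```

The point of the sketch is that $f^2$ is $R^2 / (1 - R^2)$, not $R^2$ itself; passing $R^2$ in directly would understate the effect size, and the gap widens as $R^2$ grows.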
