I have several hundred sample points, each with values for three variables ranging from 0.0 to 1.0. I would like to use a statistical method to find a function of these variables that predicts a phenomenon. I also have measurements of this phenomenon at the same sample points. Here is an example table:

[Image: example data table, not reproduced]

Could you point me to a test or analysis that achieves this? Preferably an R package, or something in Excel. I'm sorry if this question is too simple, but statistics is not really my area of expertise.

Nick Cox
Albert
  • What about linear regression? – Tim Nov 24 '15 at 10:22
  • In the first instance, this is just regression. Excel can do it, although it is much better to use almost any statistical software, such as R. – Nick Cox Nov 24 '15 at 10:22
  • I don't know how to do it with 3 variables in the same function. The thing is these variables influence each other in the outcome. Maybe I didn't make myself clear or I don't understand your answer, sorry. – Albert Nov 24 '15 at 10:28
  • @AlbertC the interactions are fine and can be modeled by linear regression. I think you should start with an introductory statistics handbook and read about linear regression; without that, it would be hard to apply this or similar methods in practice. – Tim Nov 24 '15 at 10:31
  • Ok @Tim, thank you! I'll take a look at it. But, will a regression tell me which variables are more correlated and also which interactions among them are stronger? – Albert Nov 24 '15 at 10:35
  • No. As the name suggests, to see the correlation between individual variables you have to look at the correlations (see also [partial correlation](http://stats.stackexchange.com/questions/174022/how-could-i-get-a-correlation-value-that-accounts-for-gender/174025#174025)). However, correlation won't help you with prediction, while regression does. – Tim Nov 24 '15 at 10:37

1 Answer

The kind of data you have will often dictate the type of test you run. From what you've provided, it seems you have multiple continuous predictor variables (independent variables) and one continuous outcome variable (dependent variable). There are many resources online that provide flowcharts to help you determine which test is appropriate for your data; for example, this page from the Institute for Digital Research and Education. The first column asks how many dependent variables you have: in your case, it looks like one. The second column asks about the nature of your independent variable(s): in your case, there are three, and they appear to be continuous/interval. The row for "1 or more interval IVs and/or 1 or more categorical IVs" seems most appropriate to your case. The third column asks for the nature of your dependent variable, which again looks continuous/interval. The fourth column suggests the test appropriate for these conditions: multiple regression. The other option, analysis of covariance, would only be appropriate if you had at least one categorical predictor, which you don't. The site also links to instructions for running the appropriate test in various common software packages.
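As a rough sketch of what multiple regression fits here, write the three predictors as $x_1, x_2, x_3$ and the outcome as $y$ (these symbols are assumptions for illustration, not names from the question). The basic model, and a version with pairwise interaction terms, would look like this:

$$
\begin{aligned}
y &= \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \varepsilon \\
y &= \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_{12} x_1 x_2 + \beta_{13} x_1 x_3 + \beta_{23} x_2 x_3 + \varepsilon
\end{aligned}
$$

The second form is still a linear regression, because it is linear in the coefficients $\beta$; the products $x_i x_j$ are simply treated as extra predictors.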

Not only is this the test that fits your data, but regression also produces coefficients that let you construct a prediction equation. In the output, you will get a) R^2, which estimates the proportion of variance in your dependent variable that is explained by your 3 predictors, b) model significance, to determine whether the relationship among the 4 variables is statistically significant, c) significance for the individual variables, to determine which specific variables (one, two, or all three) are statistically significant in the model, d) standardized (beta) coefficients, which let you compare the strength of the predictive value of the three variables, and e) unstandardized coefficients, which let you construct a prediction equation. Presuming you have done the appropriate assumption checks (sample size, normality, etc.), you should be able to use your prediction equation. There are a number of resources that could help you construct a prediction equation; this one isn't bad.
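To make the idea of a prediction equation concrete, here is a minimal sketch in plain Python (standard library only). Everything in it is an assumption for illustration: the data are synthetic stand-ins for the real table, the "true" coefficients are made up, and the helper names `design_row`, `fit_ols`, `predict`, and `r_squared` are hypothetical. It fits a least-squares model with three predictors plus their pairwise interactions and reports R^2; it does not compute significance tests, which real statistical software provides.

```python
import random

def design_row(x1, x2, x3):
    """Predictors plus pairwise interaction terms (hypothetical helper)."""
    return [x1, x2, x3, x1 * x2, x1 * x3, x2 * x3]

def fit_ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y.

    An intercept column of 1s is added here. Returns [b0, b1, ..., bk].
    """
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(xi[i] * xi[j] for xi in X) for j in range(k)]
         + [sum(xi[i] * yi for xi, yi in zip(X, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * p for a, p in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

def predict(coefs, row):
    """Prediction equation: intercept plus the sum of coefficient * predictor."""
    return coefs[0] + sum(b * x for b, x in zip(coefs[1:], row))

def r_squared(coefs, rows, y):
    """Proportion of variance in y explained by the fitted model."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - predict(coefs, r)) ** 2 for r, yi in zip(rows, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

# Synthetic data (assumption): 300 points, three variables in [0, 1],
# and an outcome built from made-up coefficients plus a little noise.
random.seed(0)
raw = [[random.random() for _ in range(3)] for _ in range(300)]
y = [0.5 + 1.0 * a - 2.0 * b + 0.3 * c + 1.5 * a * b + random.gauss(0, 0.05)
     for a, b, c in raw]

rows = [design_row(*r) for r in raw]
coefs = fit_ols(rows, y)
r2 = r_squared(coefs, rows, y)
new_prediction = predict(coefs, design_row(0.2, 0.4, 0.6))

print("coefficients:", [round(b, 3) for b in coefs])
print("R^2:", round(r2, 3))
print("prediction at (0.2, 0.4, 0.6):", round(new_prediction, 3))
```

In R, the same kind of model can be fitted with `lm`, for example `lm(y ~ (x1 + x2 + x3)^2, data = d)` for main effects plus pairwise interactions, where `d`, `y`, `x1`, `x2`, and `x3` are placeholder names for your data frame and columns; `summary()` on the result then reports the coefficients, their significance, and R^2.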

Albert
jeramy townsley
  • Thank you very much for taking the time to write this answer! The table is a really nice resource and you have presented it in a very clear way. Thanks!! – Albert Dec 02 '15 at 07:55