
I have some time series data; see http://ww2.coastal.edu/kingw/statistics/R-tutorials/simplenonlinear.html

In this article, the author applies a log transformation to the pressure data.

How can I recognize that the pressure data should be log-transformed? I don't want to rely on a plot.

Max Usanin
  • Just curious, why not plot? – zx8754 Aug 04 '15 at 09:06
  • The idea is that I will run multiple regression programmatically, so my program needs a method that does not rely on visual inspection. – Max Usanin Aug 04 '15 at 09:09
  • What do you want to do with the result of your transformation? It looks dangerous to me to list for instance a mean after logarithmic transformation and a mean after no transformation in the same table. – Dirk Horsten Aug 04 '15 at 09:37

1 Answer


Please review "When (and why) should you take the log of a distribution (of numbers)?". I have programmed this in AUTOBOX (a commercially available time series software package that I have helped develop), which eliminates the normally required visual/graphical analysis of the model errors by optionally or automatically performing the Box-Cox test. Notice that I said model errors, NOT the original data. Note also that the error variance often changes deterministically over time. This is very important and has been virtually ignored. See http://www.unc.edu/~jbhill/tsay.pdf
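To make the "model errors, not the original data" point concrete, here is a minimal sketch of a Box-Cox check done without any plot. It is not AUTOBOX's procedure; it assumes a simple straight-line model in time, strictly positive data, and simulated values. The function names (`boxcox`, `profile_loglik`, `best_lambda`) and the 0.25 cutoff are hypothetical choices for illustration. The idea is to pick the power λ that maximizes the profile log-likelihood of the regression errors; a λ near 0 suggests a log transformation.

```python
import math
import random

def boxcox(y, lam):
    """Box-Cox transform of positive data; lam = 0 is the natural log."""
    if abs(lam) < 1e-12:
        return [math.log(v) for v in y]
    return [(v ** lam - 1.0) / lam for v in y]

def ols_rss(x, z):
    """Residual sum of squares from a straight-line fit of z on x."""
    n = len(x)
    mx, mz = sum(x) / n, sum(z) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - mz) for a, b in zip(x, z)) / sxx
    intercept = mz - slope * mx
    return sum((b - intercept - slope * a) ** 2 for a, b in zip(x, z))

def profile_loglik(x, y, lam):
    """Profile log-likelihood of the model errors (up to a constant)."""
    n = len(y)
    rss = ols_rss(x, boxcox(y, lam))
    jacobian = (lam - 1.0) * sum(math.log(v) for v in y)
    return -0.5 * n * math.log(rss / n) + jacobian

def best_lambda(x, y, grid=None):
    """Grid search for the lambda with the highest profile log-likelihood."""
    grid = grid or [i / 10 for i in range(-20, 21)]
    return max(grid, key=lambda lam: profile_loglik(x, y, lam))

# Simulated example (an assumption): exponential growth, multiplicative noise.
random.seed(1)
x = list(range(1, 41))
y = [math.exp(0.5 + 0.1 * t + random.gauss(0, 0.05)) for t in x]

lam = best_lambda(x, y)
# The 0.25 cutoff is an arbitrary illustrative threshold, not a standard rule.
suggestion = "log" if abs(lam) <= 0.25 else "power %.1f" % lam
print(lam, suggestion)
```

In a real program the straight-line fit would be replaced by whatever model is actually being estimated, since the test concerns that model's errors.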

More importantly, you should not perform ordinary multiple regression on time series data, because of the opportunities and complications involved in time series analysis. Specifically, identifying the appropriate lag structure is affected by auto-correlation in the data, by Pulses, Level Shifts, Seasonal Pulses, and Local Time Trends, by changes in parameters over time, and of course by the homogeneity of the model error variance. See a blog post I wrote discussing the differences between regression and Box-Jenkins: http://www.autobox.com/cms/index.php/afs-university/intro-to-forecasting/doc_download/24-regression-vs-box-jenkins . Perhaps Prof. King might be interested in the pitfalls of using ordinary multiple regression (designed for cross-sectional data) when faced with time series data.
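One way a program can detect the auto-correlation problem without looking at a plot is the Durbin-Watson statistic on the regression residuals. The sketch below is only an illustration on simulated data (the AR(1) error process with coefficient 0.8 is an assumption, and the helper names are hypothetical). Values near 2 indicate little lag-1 auto-correlation; values well below 2 indicate positively auto-correlated errors, which is the situation where ordinary regression output is misleading.

```python
import random

def ols_residuals(x, y):
    """Residuals from a straight-line least-squares fit of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    intercept = my - slope * mx
    return [b - intercept - slope * a for a, b in zip(x, y)]

def durbin_watson(resid):
    """Durbin-Watson statistic: roughly 2 * (1 - lag-1 autocorrelation)."""
    num = sum((resid[t] - resid[t - 1]) ** 2 for t in range(1, len(resid)))
    den = sum(e * e for e in resid)
    return num / den

# Simulated example (an assumption): regression with AR(1) errors.
random.seed(2)
n = 200
x = [random.gauss(0, 1) for _ in range(n)]
err = [0.0] * n
for t in range(1, n):
    err[t] = 0.8 * err[t - 1] + random.gauss(0, 1)
y = [1.0 + 2.0 * x[t] + err[t] for t in range(n)]

dw = durbin_watson(ols_residuals(x, y))
print(round(dw, 2))  # well below 2 when errors are positively auto-correlated
```

This only checks lag 1; it does not detect pulses, level shifts, or changing variance, which need their own tests.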

The good news is that ultimately a Transfer Function can be restated as a multiple regression with coefficients and lag structures. This view is very useful in explaining the model in layman's terms.
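As a small illustration of that restatement, the sketch below assumes a first-order transfer function of the form ω / (1 − δB) applied to x (this specific form, and the names `impulse_weights` and `lagged_design`, are assumptions for the example). Expanding it gives regression coefficients ω·δ^k on x lagged k periods, and the second helper builds the corresponding rows of lagged regressors.

```python
def impulse_weights(omega, delta, n_lags):
    """Coefficients on x[t], x[t-1], ... implied by omega / (1 - delta * B)."""
    return [omega * delta ** k for k in range(n_lags)]

def lagged_design(y, x, max_lag):
    """Pair each y[t] with the regressors x[t], x[t-1], ..., x[t-max_lag]."""
    rows = []
    for t in range(max_lag, len(y)):
        regressors = [x[t - k] for k in range(max_lag + 1)]
        rows.append((y[t], regressors))
    return rows

# Toy numbers, purely illustrative.
weights = impulse_weights(2.0, 0.5, 4)  # [2.0, 1.0, 0.5, 0.25]
rows = lagged_design([10, 20, 30, 40, 50], [1, 2, 3, 4, 5], 2)
print(weights)
print(rows[0])  # (30, [3, 2, 1])
```

The first `max_lag` observations are dropped because their lagged values are not available.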

IrishStat
  • As I understand it (or perhaps not), it is not necessary to make my time series stationary in preparation for multiple regression. Is that correct? – Max Usanin Aug 04 '15 at 10:05
  • Essentially any required/needed transformation will be in the final equation. One doesn't necessarily have to pre-filter / pre-specify ALTHOUGH one can if they are careful. Again standard multiple regression procedures should be avoided like the plague as auto-correlation in the data can create havoc/opportunity. – IrishStat Aug 04 '15 at 10:24
  • I apologize; maybe I will find the answers to my next question in your links, but I still want to ask up front: can you briefly describe the steps you would take to do the regression analysis? I'm a little confused, and I do not know whether this tutorial can help me with multiple regression (http://ww2.coastal.edu/kingw/statistics/R-tutorials/multregr.html) – Max Usanin Aug 04 '15 at 11:00
  • When you have time series data you pre-filter the stationary x variable and apply the same filter to the stationary y series. Compute the cross-correlation between these two filtered series to aid the identification of the relationship between the original Y and X. One then identifies the necessary ARIMA structure and deterministic structure to augment the model. Test for necessity and sufficiency and culminate in a parsimonious model. See http://www.autobox.com/cms/index.php/afs-university/autobox-examples/modeling-with-autobox#_Toc509396199 particularly section 2.3.3 – IrishStat Aug 04 '15 at 11:39
  • Can this approach be applied to multiple regression? – Max Usanin Aug 04 '15 at 11:45
  • Yes, even though SAS says it shouldn't because of interdependence among the X's. I have found that in practice, with fine-tuning through residual diagnostic checking, one can effectively handle multiple-input problems with cross-correlated X's. – IrishStat Aug 04 '15 at 11:52
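
The pre-filtering and cross-correlation step described in the comments above can be sketched roughly as follows. This is a simplified illustration, not the AUTOBOX procedure: it assumes both series are already stationary, uses a plain AR(1) filter estimated from the lag-1 autocorrelation of x (in practice an ARIMA model would be identified for x), and runs on simulated data in which y depends on x three periods earlier. All function names are hypothetical.

```python
import random

def lag1_autocorr(s):
    """Lag-1 sample autocorrelation, used here as the AR(1) coefficient."""
    n = len(s)
    m = sum(s) / n
    den = sum((v - m) ** 2 for v in s)
    return sum((s[t] - m) * (s[t - 1] - m) for t in range(1, n)) / den

def ar1_filter(s, phi):
    """Apply the filter (1 - phi * B) to a series."""
    return [s[t] - phi * s[t - 1] for t in range(1, len(s))]

def cross_corr(a, b, k):
    """Sample correlation between a[t-k] and b[t], for k >= 0."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sa = (sum((v - ma) ** 2 for v in a) / n) ** 0.5
    sb = (sum((v - mb) ** 2 for v in b) / n) ** 0.5
    cov = sum((a[t - k] - ma) * (b[t] - mb) for t in range(k, n)) / n
    return cov / (sa * sb)

# Simulated example (an assumption): x is AR(1), y responds to x at lag 3.
random.seed(3)
n = 300
x = [0.0] * n
for t in range(1, n):
    x[t] = 0.7 * x[t - 1] + random.gauss(0, 1)
y = [2.0 * x[t - 3] + random.gauss(0, 0.5) for t in range(3, n)]
x = x[3:]  # align x with y so both cover the same time points

phi = lag1_autocorr(x)        # filter estimated from x only
alpha = ar1_filter(x, phi)    # pre-whitened x
beta = ar1_filter(y, phi)     # the same filter applied to y
ccf = {k: cross_corr(alpha, beta, k) for k in range(9)}
peak_lag = max(ccf, key=lambda k: abs(ccf[k]))
print(peak_lag)
```

The lag with the largest cross-correlation between the two filtered series is a candidate for the delay in the relationship between the original Y and X; the remaining steps (ARIMA structure for the errors, necessity and sufficiency tests) are not covered by this sketch.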