15

I would like to know, or have references on, the analysis process that most statistical data analysts go through for each data analysis project.

If I make a "list", to complete a data analysis project an analyst has to:

  1. collect requirements for the project,
  2. plan/design the data analysis based on those requirements,
  3. pre-process the data,
  4. execute the data analysis, and
  5. write a report based on the analysis results.

For this question, I am interested in more detail on Step 2. I understand that in practice the steps are not clear-cut, since the analyst might have to change the plan or design depending on the output of the analysis. Is there any reference on this subject?

Steffen Moritz
Tae-Sung Shin

3 Answers

17

My favorite "plan" or "list" is Scott Emerson's document Organizing Your Approach to a Data Analysis.

Note: the last two pages are under the heading "General Requirements for Ph.D. Applied Exam" but the advice given there generalizes to working on any analysis problem.

5

I found The Workflow of Data Analysis Using Stata to be a good book, particularly (but not only) as a Stata user. I found much with which to disagree, but even that helped clarify why I do things certain ways.

dimitriy
    +1 but, *caveat emptor*: this book is only valuable if you are a Stata user. I don't use Stata (in fact I never have). On the other hand, I like Long, so I checked this out from the library. I'm sure there's a lot of good info in there for everyone, but it is so thoroughly intertwined with the use of Stata that it's impossible to extract the domain general information. – gung - Reinstate Monica Feb 14 '12 at 05:24
2

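CRISP-DM (Cross-Industry Standard Process for Data Mining), developed by a consortium that included SPSS (now part of IBM), describes the data mining process, which is essentially the same as the process for "data analysis". SAS has a similar process called SEMMA (Sample, Explore, Modify, Model, Assess).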

Galit Shmueli