Questions tagged [analysis]

For questions dealing with the applied analysis of a specific dataset or design of experiment. Posts tagged analysis are requesting statistical consulting assistance from the network. Questions tagged analysis need to be phrased appropriately.

Give your post an informative title for visibility

The title should indicate the field of study, the nature of the hypothesis being tested, and what you are asking about the appropriateness or relative benefits of particular analytic methods. Pose it as a question to help secure the analysis tag quickly.

Do not use sentence fragments. Be as detailed as you can within a maximum title length of 50-70 characters.

Examples of good titles:

  • Biostatistics: Can I use Cox models to predict failure outcomes in a cohort of heart transplant recipients?
  • Economics: How do I assess stationarity for ARMA models for index performance metrics?
  • Neuroscience: How do I estimate neural spiking models based on cortical EEGs?
  • Social Science: How do I assess differential item functioning in a cognitive tool for elderly veterans?

Examples of bad titles:

  • What test do I use?
  • Bioinformatics microarrays
  • Does bootstrapping work for my data?

Be as detailed as possible with the body of the question:

Include the following information:

  1. The field of application (e.g. biostatistics, econometrics, social sciences...)
  2. A basic statement of hypothesis or hypotheses
  3. A detailed description of the dataset including the sample size, sampling method(s) used, variables collected, the measuring methodology (e.g. mass spectrometry, questionnaire, physical exam), missingness, and variable coding.
  4. Basic proposed data analysis plan
  5. Precise description of the problems encountered

Explain the jargon in your field. Write out all acronyms and shorthand unless they are obvious.

Be clear about the answer you are looking for

Details on how to implement the suggestions given here in statistical software are not appropriate for this site. Do not tag such questions with R, SAS, or SPSS, as they will likely be migrated to Stack Overflow.

Typical statistical solutions that you may expect here are:

  • Suggestions of particular statistical tests to evaluate researcher hypotheses
  • Graphical or numerical summaries that may enhance a particular analysis
  • Interpretation of results including p-values, confidence intervals, credibility regions, or Bayes factors for non-researchers or other statisticians
  • Interpretation of statistical output
  • Correctness of proposed methods or the appropriateness of secondary analyses if certain assumptions are violated
  • Clarification of the assumptions necessary for a particular test
  • Guidance on how to handle correlated observations
  • Ascertaining adequate power or sample size for particular methods
  • Detailed specifications on how to control for blocking, stratum, or confounding / mediating factors in analyses
435 questions
21
votes
6 answers

What is the difference between data mining and statistical analysis?

What is the difference between data mining and statistical analysis? For some background, my statistical education has been, I think, rather traditional. A specific question is posited, research is designed, and data are collected and analyzed to…
Brett
  • 5,708
  • 3
  • 29
  • 41
9
votes
2 answers

When do we "stop" using multiple correction techniques?

I understand that when performing a simple t-test, we typically control the type I error rate at $\alpha = .05$. This signifies that if the null hypothesis holds, the test will "incorrectly" reject the null hypothesis in 5% of all instances. Therefore,…
6
votes
2 answers

Can the differential entropy be negative infinity?

Define the (differential) entropy for density $f$ as $$ H(f) :=-\int_{0}^{1} f(x) \log_{2}(f(x)) dx \, .$$ I am trying to find a Lebesgue measurable $f$ defined on $[0,1]$ such that $f\geq 0, \int_{0}^{1} f(x) dx = 1$ and $H(f)= -\infty $. I am…
6
votes
1 answer

Correct statistical test when people could appear in multiple groups

Thank you in advance for your help. I ran a survey. People first answered a multiple selection question - they could select as many of the choices as they wanted. The question asked them their purpose for visiting a website (for example, purchase an…
John
  • 61
  • 1
5
votes
1 answer

Difference between using a propensity score for matching vs. regression analysis

So I am confused about what the difference is if I match patients based on propensity scores vs. using the propensity score and then applying that in a multivariate regression analysis. Is there a difference? One you match 1:1 and discard the groups…
5
votes
1 answer

Visually Comparing the Kaplan-Meier Curve to the Cox PH Model Curve

I am conducting a survival analysis and have a few questions regarding its interpretation with respect to the Cox Proportional Hazards Model: Why does the inclusion of different covariates change the shape of the plotted survival function? When I…
Zach
  • 53
  • 4
5
votes
1 answer

Whether to apply the logit transformation to proportional predictor variables in a multiple linear regression? [including proportions of 0.0%]

In a linear regression, I have a number of predictor variables that are expressed as proportions. The outcome variable is continuous. My residuals are not normally distributed, with a mild to moderate positive skew. Should I use a logit…
5
votes
1 answer

Fit a time series model with unknown lag in Stan

I am trying to fit a population time-series model in Stan/rstan (2.7.0) where the death rate depends on the generation before (n-1) but the reproduction depends on an unknown generation (n-x). I haven't found a way to estimate x since Stan has no options…
Julian
  • 153
  • 5
4
votes
2 answers

Measuring Circularity of a set of Co-planar points

I am trying to find a way to quantify how circular or symmetric a shape is. What is a good way to measure the circularity of this set of points? I already have the coordinates of the points along the perimeter of the shape. The image below shows…
4
votes
1 answer

Controlling for baseline in pre-post between design: using $\Delta(T_2-T_1)$ or controlling for T1 in the regression model (or both)?

I have a mixed between-within design, with three groups and Pre (T1) and Post (T2) measurements. I'm hesitating on the right statistical analyses to do, but I would like to compare each group to the other two separately while taking into account the…
RemPsyc
  • 221
  • 2
  • 11
4
votes
1 answer

Principal Component Analysis Stock Returns

I am new to PCA and am having trouble understanding some parts of the methodology. In multi-factor models you can run regressions like: Stock Return = A(1)*Size + A(2)*Market Return + ... So in this case, you have a clear understanding of what is…
Vladmir Putin
  • 301
  • 1
  • 4
  • 11
4
votes
2 answers

Can a ROC Curve have a continuous outcome variable?

I'm currently undertaking research creating cut-off scores using ROC curves. I have encountered some confusion regarding the outcome variable. My outcome variable can range in score from 10-50, and we are using a cut-off previously established of 20…
user171984
  • 41
  • 1
  • 2
4
votes
1 answer

Test for comparing data on the same days of the week in different years

So I'm doing some power system analysis on active and reactive power loads and I'd like to compare the data from 2015 and data from 2015 to see how much they differ. The nature of the load is such that on same days in a week we have more or less a…
4
votes
1 answer

Is an ANOVA applicable for these data?

I have a data set from 7 groups, with 20 fish in each group. Measurement of a parameter is made on 25 cells from each fish (so each observation in the data-set is completely independent, right?). One of the groups functions as the control group…
4
votes
1 answer

Psychometrics: Survival analysis of help seeking behaviors

Background: I'm studying people seeking help. Participants described contacts with between 1 and 3 "responders" (e.g., friends, the police) in order; for example, a participant could have contacted just responder 1, or responder 1, then responder 2,…
Emily
  • 618
  • 4
  • 16