Highest Voted Questions - Statistical Analysis Stack Exchange

33

votes

4 answers

How to measure smoothness of a time series in R?

Is there a good way to measure smoothness of a time series in R? For example, -1, -0.8, -0.6, -0.4, -0.2, 0, 0.2, 0.4, 0.6, 0.8, 1.0 is much smoother than -1, 0.8, -0.6, 0.4, -0.2, 0, 0.2, -0.4, 0.6, -0.8, 1.0 although they have the same mean and…

r time-series

asked Mar 14 '12 at 02:29

agmao

431
1
4
3
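As a rough illustration of the smoothness question above, the two example series can be compared with two simple statistics: the standard deviation of the first differences (smaller suggests smoother) and the lag-1 autocorrelation (closer to 1 suggests smoother). This is only a sketch under those assumptions, written in Python rather than R; the function names `roughness` and `lag1_autocorr` are hypothetical and not taken from the question or its answers.

```python
from statistics import mean, stdev

def first_differences(xs):
    # Successive differences x[t+1] - x[t].
    return [b - a for a, b in zip(xs, xs[1:])]

def roughness(xs):
    # Standard deviation of the first differences; smaller means smoother.
    return stdev(first_differences(xs))

def lag1_autocorr(xs):
    # Sample lag-1 autocorrelation, normalised by the total sum of squares.
    m = mean(xs)
    num = sum((a - m) * (b - m) for a, b in zip(xs, xs[1:]))
    den = sum((a - m) ** 2 for a in xs)
    return num / den

# The two series quoted in the question.
smooth = [-1, -0.8, -0.6, -0.4, -0.2, 0, 0.2, 0.4, 0.6, 0.8, 1.0]
rough = [-1, 0.8, -0.6, 0.4, -0.2, 0, 0.2, -0.4, 0.6, -0.8, 1.0]

for name, series in (("smooth", smooth), ("rough", rough)):
    print(name, roughness(series), lag1_autocorr(series))
```

On these inputs the first series has essentially zero roughness and a positive lag-1 autocorrelation, while the second has a large roughness and a negative one. Other measures may suit other notions of smoothness.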

33

votes

4 answers

Independent variable = Random variable?

I'm slightly confused about whether an independent variable (also called a predictor or feature) in a statistical model, for example the $X$ in the linear regression $Y=\beta_0+\beta_1 X$, is a random variable.

regression random-variable experiment-design predictor faq

asked Nov 15 '16 at 12:19

l7ll7

1,075
2
9
15

33

votes

1 answer

What are the properties of a half Cauchy distribution?

I am currently working on a problem, where I need to develop a Markov chain Monte Carlo (MCMC) algorithm for a state space model. To be able to solve the problem, I have been given the following probability of $\tau$: p($\tau$) =…

distributions bayesian prior state-space-models cauchy-distribution

asked Oct 01 '16 at 02:16

Christoph

335
1
3
4

33

votes

3 answers

Is there a Project Euler-alike for machine learning?

I found Project Euler http://projecteuler.net/ to be incredibly useful in learning programming languages. Is there a similar site for Machine Learning? I did see http://www.kaggle.com/, but it is not nearly as accessible to beginners as Project…

teaching

asked Feb 10 '12 at 23:04

B Seven

2,873
4
24
29

33

votes

3 answers

Is whitening always good?

A common pre-processing step for machine learning algorithms is whitening of data. It seems like it is always good to do whitening since it de-correlates the data, making it simpler to model. When is whitening not recommended? Note: I'm referring to…

machine-learning data-transformation whitening

asked Feb 15 '12 at 07:48

Ran

1,476
3
16
25

33

votes

3 answers

In boosting, why are the learners "weak"?

See also a similar question on stats.SE. In boosting algorithms such as AdaBoost and LPBoost it is known that the "weak" learners to be combined only have to perform better than chance to be useful, from Wikipedia: The classifiers it uses can be…

machine-learning mathematical-statistics boosting

asked Feb 16 '12 at 13:37

tdc

7,289
5
32
62

33

votes

3 answers

Do we need a test set when using k-fold cross-validation?

I've been reading about k-fold validation, and I want to make sure I understand how it works. I know that for the holdout method, the data is split into three sets, and the test set is only used at the very end to assess the performance of the…

cross-validation validation out-of-sample

asked Jul 27 '16 at 17:30

b_pcakes

435
1
4
5
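One common arrangement relevant to the cross-validation question above is to set a test set aside first and run k-fold cross-validation only on the remaining data. The sketch below shows just the index bookkeeping for that arrangement; it is an assumption about one possible workflow, not a claim about what the answers recommend, and the function name `train_test_kfold` and its parameters are hypothetical.

```python
import random

def train_test_kfold(n, test_frac=0.2, k=5, seed=0):
    # Shuffle the indices 0..n-1 reproducibly.
    rng = random.Random(seed)
    idx = list(range(n))
    rng.shuffle(idx)

    # Hold out a test set that cross-validation never touches.
    n_test = int(round(n * test_frac))
    test, dev = idx[:n_test], idx[n_test:]

    # Split the remaining indices into k roughly equal folds.
    folds = [dev[i::k] for i in range(k)]

    # Each fold serves once as validation; the other folds form the training set.
    splits = []
    for i in range(k):
        val = folds[i]
        train = [j for f_i, fold in enumerate(folds) if f_i != i for j in fold]
        splits.append((train, val))
    return test, splits

test, splits = train_test_kfold(100)
print(len(test), [(len(tr), len(va)) for tr, va in splits])
```

Model selection would use only the `(train, val)` pairs, with the held-out `test` indices reserved for a single final assessment.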

33

votes

1 answer

How to train and validate a neural network model in R?

I am new to modeling with neural networks, but I managed to establish a neural network with all available data points that fits the observed data well. The neural network was done in R with the nnet package: require(nnet) ##33.8 is the highest…

r neural-networks

asked Jan 25 '12 at 20:21

Strohmi

815
1
10
13

33

votes

4 answers

Optimising for Precision-Recall curves under class imbalance

I have a classification task where I have a number of predictors (one of which is the most informative), and I am using the MARS model to construct my classifier (I am interested in any simple model, and using glms for illustrative purposes would be…

machine-learning roc precision-recall unbalanced-classes data-visualization

asked Jan 24 '12 at 04:11

highBandWidth

2,092
2
21
34

33

votes

4 answers

How to create an arbitrary covariance matrix

For example, in R, the MASS::mvrnorm() function is useful for generating data to demonstrate various things in statistics. It takes a mandatory Sigma argument which is a symmetric matrix specifying the covariance matrix of the variables. How would…

r random-generation covariance-matrix

asked May 31 '16 at 06:07

rsl

845
2
9
15
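For the covariance-matrix question above, one standard construction is to draw any square matrix $A$ and form $\Sigma = A A^\top$, which is symmetric and positive semidefinite by construction. The sketch below is in plain Python rather than R and is only one of several possible approaches; the function name `random_covariance` is hypothetical.

```python
import random

def random_covariance(p, seed=0):
    # Draw a p x p matrix A with independent standard normal entries.
    rng = random.Random(seed)
    a = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(p)]
    # Sigma = A A^T: entry (i, j) is the dot product of rows i and j of A.
    return [[sum(a[i][k] * a[j][k] for k in range(p)) for j in range(p)]
            for i in range(p)]

sigma = random_covariance(3)
for row in sigma:
    print(row)
```

A matrix built this way could then be passed as the `Sigma` argument mentioned in the question, assuming it is converted to the form the target function expects.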

33

votes

2 answers

Understanding bias-variance tradeoff derivation

I am reading the chapter on the bias-variance tradeoff in The elements of statistical learning and I don't understand the formula on page 29. Let the data arise from a model such that $$ Y = f(x)+\varepsilon$$ where $\varepsilon$ is random number…

machine-learning unbiased-estimator mse bias-variance-tradeoff

asked Mar 28 '16 at 14:53

emanuele

2,008
3
21
34

33

votes

8 answers

Is there a plateau-shaped distribution?

I am looking for a distribution where the probability density decreases quickly after some point away from the mean, or in my own words a "plateau-shaped distribution". Something in between the Gaussian and the uniform.

distributions normal-distribution uniform-distribution

asked Mar 25 '16 at 09:03

dontloo

13,692
7
51
80

33

votes

2 answers

How to model non-negative zero-inflated continuous data?

I'm currently trying to apply a linear model (family = gaussian) to an indicator of biodiversity that cannot take values lower than zero, is zero-inflated and is continuous. Values range from 0 to a little over 0.25. As a consequence, there is quite…

regression zero-inflation tobit-regression tweedie-distribution

asked Dec 21 '15 at 21:57

David

331
1
4
3

33

votes

3 answers

When to use fixed effects vs using cluster SEs?

Suppose you have a single cross-section of data where individuals are located within groups (e.g. students within schools) and you wish to estimate a model of the form Y_i = a + B*X_i where X is a vector of individual level characteristics and a a…

econometrics multilevel-analysis fixed-effects-model endogeneity clustered-standard-errors

asked Dec 07 '15 at 00:53

QuestionAnswer

463
1
6
7

33

votes

3 answers

Why is variable selection necessary?

Common data-based variable selection procedures (for example, forward, backward, stepwise, all subsets) tend to yield models with undesirable properties, including: Coefficients biased away from zero. Standard errors that are too small and…

modeling feature-selection

asked Nov 10 '11 at 21:32

user7322
