Most Popular

1500 questions
36
votes
1 answer

How to interpret variance and correlation of random effects in a mixed-effects model?

I hope you all don't mind this question, but I need help interpreting the output of a linear mixed-effects model I've been trying to learn to fit in R. I am new to longitudinal data analysis and linear mixed-effects regression. I have a model I…
Zeda
36
votes
2 answers

Can you explain Parzen window (kernel) density estimation in layman's terms?

Parzen window density estimation is described as $$ p(x)=\frac{1}{n}\sum_{i=1}^{n} \frac{1}{h^2} \phi \left(\frac{x_i - x}{h} \right) $$ where $n$ is number of elements in the vector, $x$ is a vector, $p(x)$ is a probability density of $x$, $h$ is…
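To make the formula above concrete, here is a minimal sketch in plain Python. It assumes a standard Gaussian kernel for $\phi$ and normalises by $h^d$, where $d$ is the dimension of each point, so the $1/h^2$ factor in the excerpt corresponds to two-dimensional data. The function names `gaussian_kernel` and `parzen_density` are hypothetical, chosen only for illustration.

```python
import math


def gaussian_kernel(u):
    """Standard multivariate Gaussian density evaluated at the vector u (assumed choice of phi)."""
    d = len(u)
    sq_norm = sum(c * c for c in u)
    return math.exp(-0.5 * sq_norm) / (2.0 * math.pi) ** (d / 2.0)


def parzen_density(x, samples, h):
    """Parzen window estimate of the density at the point x.

    x       : sequence of d coordinates (the query point)
    samples : list of n observed points, each a sequence of d coordinates
    h       : window width (bandwidth), must be positive
    """
    if h <= 0:
        raise ValueError("h must be positive")
    n = len(samples)
    d = len(x)
    total = 0.0
    for xi in samples:
        # Scaled difference between an observed point and the query point.
        u = [(a - b) / h for a, b in zip(xi, x)]
        total += gaussian_kernel(u)
    # Average the kernel contributions and normalise by h^d.
    return total / (n * h ** d)


# Usage: density at the origin from three 2-D observations.
points = [[0.0, 0.0], [1.0, 0.5], [-0.5, 1.0]]
estimate = parzen_density([0.0, 0.0], points, h=1.0)
```

Each observation contributes a small "bump" centred on itself; a larger `h` spreads every bump wider and gives a smoother estimate.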
36
votes
7 answers

Are there algorithms for computing "running" linear or logistic regression parameters?

A paper "Accurately computing running variance" at http://www.johndcook.com/standard_deviation.html shows how to compute running mean, variance and standard deviations. Are there algorithms where the parameters of a linear or logistic regression…
adrcuth
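For context, a running mean and variance of the kind the question refers to can be sketched as below. This is a Welford-style single-pass update, which is an assumption about the method the linked page describes; the class name `RunningStats` and its methods are hypothetical. It does not answer the regression part of the question.

```python
import math


class RunningStats:
    """Single-pass running mean, variance and standard deviation (Welford-style update)."""

    def __init__(self):
        self.n = 0        # number of values seen so far
        self.mean = 0.0   # running mean
        self.m2 = 0.0     # running sum of squared deviations from the mean

    def push(self, x):
        """Incorporate one new observation."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        # Uses the deviation from the old mean and from the updated mean.
        self.m2 += delta * (x - self.mean)

    def variance(self):
        """Sample variance (n - 1 denominator); 0.0 with fewer than two values."""
        return self.m2 / (self.n - 1) if self.n > 1 else 0.0

    def std(self):
        """Sample standard deviation."""
        return math.sqrt(self.variance())


# Usage: feed values one at a time without storing them.
stats = RunningStats()
for value in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
    stats.push(value)
```

Updating the squared deviations incrementally avoids subtracting two large, nearly equal sums, which is the numerical problem with the naive sum-of-squares formula.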
36
votes
2 answers

Is Tikhonov regularization the same as Ridge Regression?

Tikhonov regularization and ridge regression are terms often used as if they were identical. Is it possible to specify exactly what the difference is?
36
votes
3 answers

How to draw neat polygons around scatterplot regions in ggplot2

How do I add a neat polygon around a group of points on a scatterplot? I am using ggplot2 but am disappointed with the results of geom_polygon. The dataset is over there, as a tab-delimited text file. The graph below shows two measures of attitudes…
Fr.
36
votes
2 answers

Bootstrap prediction interval

Is there any bootstrap technique available to compute prediction intervals for point predictions obtained e.g. from linear regression or other regression method (k-nearest neighbour, regression trees etc.)? Somehow I feel that the sometimes proposed…
Michael M
36
votes
5 answers

Why use regularisation in polynomial regression instead of lowering the degree?

When doing regression, for example, two hyperparameters to choose are often the capacity of the function (e.g. the largest exponent of a polynomial) and the amount of regularisation. What I'm confused about is why not just choose a low capacity…
36
votes
1 answer

Cross-validation misuse (reporting performance for the best hyperparameter value)

Recently I have come across a paper that proposes using a k-NN classifier on a specific dataset. The authors used all the data samples available to perform k-fold cross-validation for different k values and report cross-validation results of the…
36
votes
3 answers

Why is the mean function in Gaussian Process uninteresting?

I have just started reading about GPs and, analogous to the regular Gaussian distribution, a GP is characterized by a mean function and a covariance function (the kernel). I was at a talk and the speaker said that the mean function is usually quite…
Luca
36
votes
3 answers

Computing p-value using bootstrap with R

I use the "boot" package to compute an approximate two-sided bootstrapped p-value, but the result is too far from the p-value given by t.test. I can't figure out what I did wrong in my R code. Can someone please give me a hint for this time =…
Tu.2
36
votes
2 answers

Is there a reliable nonparametric confidence interval for the mean of a skewed distribution?

Very skewed distributions such as the log-normal do not result in accurate bootstrap confidence intervals. Here is an example showing that the left and right tail areas are far from the ideal 0.025 no matter which bootstrap method you try in…
Frank Harrell
36
votes
2 answers

Is this the state-of-the-art regression methodology?

I've been following Kaggle competitions for a long time and I have come to realize that many winning strategies involve using at least one of the "big three": bagging, boosting and stacking. For regressions, rather than focusing on building one best…
36
votes
3 answers

How to fit an ARIMAX-model with R?

I have four different time series of hourly measurements: the heat consumption inside a house, the temperature outside the house, the solar radiation, and the wind speed. I want to be able to predict the heat consumption inside the house. There is a clear…
utdiscant
36
votes
1 answer

Bootstrapping vs Bayesian Bootstrapping conceptually?

I'm having trouble understanding what a Bayesian bootstrapping process is, and how it would differ from normal bootstrapping. If someone could offer an intuitive/conceptual review and comparison of both, that would be great. Let's take…
SpicyClubSauce
36
votes
3 answers

Pre-training in deep convolutional neural network?

Has anyone seen any literature on pre-training in deep convolutional neural networks? I have only seen unsupervised pre-training in autoencoders or restricted Boltzmann machines.