Most Popular
1500 questions
43 votes, 3 answers
Softmax layer in a neural network
I'm trying to add a softmax layer to a neural network trained with backpropagation, so I'm trying to compute its gradient.
The softmax output is $h_j = \frac{e^{z_j}}{\sum_i e^{z_i}}$, where $j$ indexes the output neurons.
If I derive it then I…
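
A minimal sketch of the gradient the question is after, assuming the softmax definition above. Differentiating gives the Jacobian $\partial h_j / \partial z_k = h_j(\delta_{jk} - h_k)$. The function names here are illustrative, not from the original post, and the finite-difference routine is only a sanity check.

```python
import math

def softmax(z):
    # Subtracting the max keeps exp() from overflowing; the output is unchanged.
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_jacobian(z):
    # J[j][k] = dh_j/dz_k = h_j * (delta_jk - h_k)
    h = softmax(z)
    n = len(h)
    return [[h[j] * ((1.0 if j == k else 0.0) - h[k]) for k in range(n)]
            for j in range(n)]

def numerical_jacobian(z, eps=1e-6):
    # Central finite differences, used only to check the analytic result.
    n = len(z)
    J = [[0.0] * n for _ in range(n)]
    for k in range(n):
        up = list(z)
        up[k] += eps
        down = list(z)
        down[k] -= eps
        h_up, h_down = softmax(up), softmax(down)
        for j in range(n):
            J[j][k] = (h_up[j] - h_down[j]) / (2 * eps)
    return J
```

Each row of the Jacobian sums to zero, because the outputs always sum to one.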

Ran
43 votes, 2 answers
How to interpret glmnet?
I am trying to fit a multivariate linear regression model with approximately 60 predictor variables and 30 observations, so I am using the glmnet package for regularized regression because p>n.
I have been going through documentation and other…

Alice
43 votes, 2 answers
Multiple regression or partial correlation coefficient? And relations between the two
I don't even know if this question makes sense, but what is the difference between multiple regression and partial correlation (apart from the obvious differences between correlation and regression, which is not what I am aiming at)?
I want to…

user34927
43 votes, 2 answers
Interpreting the residuals vs. fitted values plot for verifying the assumptions of a linear model
Consider the following figure from Faraway's Linear Models with R (2005, p. 59).
The first plot seems to indicate that the residuals and the fitted values are uncorrelated, as they should be in a homoscedastic linear model with normally distributed…

Evan Aad
43 votes, 1 answer
Manually calculated $R^2$ doesn't match up with randomForest() $R^2$ for testing new data
I know this is a fairly specific R question, but I may be thinking about the proportion of variance explained, $R^2$, incorrectly. Here goes.
I'm trying to use the R package randomForest. I have some training data and testing data. When I fit a random…
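
One possible source of a mismatch like this is that "$R^2$" can mean two different things on new data: $1 - \mathrm{SSE}/\mathrm{SST}$, or the squared correlation between observed and predicted values. The sketch below, with made-up numbers and hypothetical function names, shows that the two can disagree sharply. Which definition a given package reports is something to check in its documentation.

```python
def r2_from_errors(y, pred):
    # 1 - SSE/SST: penalizes bias and wrong scale in the predictions.
    mean_y = sum(y) / len(y)
    sse = sum((a - b) ** 2 for a, b in zip(y, pred))
    sst = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - sse / sst

def squared_correlation(y, pred):
    # Squared Pearson correlation: ignores any linear rescaling of pred.
    n = len(y)
    mean_y = sum(y) / n
    mean_p = sum(pred) / n
    cov = sum((a - mean_y) * (b - mean_p) for a, b in zip(y, pred))
    var_y = sum((a - mean_y) ** 2 for a in y)
    var_p = sum((b - mean_p) ** 2 for b in pred)
    return cov * cov / (var_y * var_p)

# Illustrative data: predictions are perfectly correlated with y but twice too large.
y = [1.0, 2.0, 3.0, 4.0]
pred = [2.0, 4.0, 6.0, 8.0]
print(r2_from_errors(y, pred))       # negative: worse than predicting the mean
print(squared_correlation(y, pred))  # essentially 1
```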

Stephen Turner
43 votes, 5 answers
Does the beta distribution have a conjugate prior?
I know that the beta distribution is conjugate to the binomial. But what is the conjugate prior of the beta? Thank you.

Brash Equilibrium
43 votes, 2 answers
Understanding the parameters inside the Negative Binomial Distribution
I was trying to fit various models to my data and found that the fitdistr function from the MASS library in R gives the Negative Binomial as the best fit. Now from the wiki page, the definition is given as:
NegBin(r,p) distribution describes the…
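
A hedged sketch of how the two parameterizations relate, assuming the fit reports a mean `mu` and a dispersion `size` (as R's negative binomial routines do), and assuming the NegBin(r, p) convention in which $p$ is the success probability and the variable counts failures before the $r$-th success. Other sources swap the roles of success and failure, so check the convention before reusing this. The function name is hypothetical.

```python
def mu_size_to_r_p(mu, size):
    # Under the assumed convention: r = size and p = size / (size + mu).
    r = size
    p = size / (size + mu)
    return r, p

def nb_mean_variance(r, p):
    # Mean and variance of the number of failures before the r-th success.
    mean = r * (1.0 - p) / p
    variance = r * (1.0 - p) / (p * p)
    return mean, variance

# Illustrative values only.
r, p = mu_size_to_r_p(mu=4.0, size=2.0)
mean, variance = nb_mean_variance(r, p)
# The mean recovers mu, and the variance equals mu + mu**2 / size.
```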

Legend
43 votes, 4 answers
Should covariates that are not statistically significant be 'kept in' when creating a model?
I have several covariates in my calculation for a model, and not all of them are statistically significant. Should I remove those that are not?
This question discusses the phenomenon, but does not answer my question:
How to interpret…

A.M.
43 votes, 5 answers
Analysis with complex data, anything different?
Say for example you are doing a linear model, but the data $y$ is complex.
$ y = x \beta + \epsilon $
My data set is complex, as in all the numbers in $y$ are of the form $(a + bi)$. Is there anything procedurally different when working with such…

bill_e
43 votes, 2 answers
What are the practical differences between the Benjamini & Hochberg (1995) and the Benjamini & Yekutieli (2001) false discovery rate procedures?
My statistics program implements both the Benjamini & Hochberg (1995) and Benjamini & Yekutieli (2001) false discovery rate (FDR) procedures. I have done my best to read through the later paper, but it is quite mathematically dense and I am not…
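
As a rough illustration of the practical difference: both procedures are step-up tests on the sorted p-values, and Benjamini-Yekutieli divides the Benjamini-Hochberg thresholds by $c(m) = \sum_{k=1}^{m} 1/k$ so that FDR control holds under arbitrary dependence. The sketch below uses a hypothetical function name and made-up p-values.

```python
def fdr_step_up(pvalues, q=0.05, arbitrary_dependence=False):
    """Return sorted indices of rejected hypotheses.

    arbitrary_dependence=False gives the Benjamini-Hochberg thresholds i*q/m;
    arbitrary_dependence=True gives the Benjamini-Yekutieli thresholds i*q/(m*c(m)).
    """
    m = len(pvalues)
    c = sum(1.0 / k for k in range(1, m + 1)) if arbitrary_dependence else 1.0
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        # Step-up rule: keep the largest rank whose p-value is under its threshold.
        if pvalues[idx] <= rank * q / (m * c):
            cutoff = rank
    return sorted(order[:cutoff])

# Illustrative p-values only.
p = [0.001, 0.012, 0.025, 0.038, 0.5]
bh = fdr_step_up(p)                             # rejects the first four
by = fdr_step_up(p, arbitrary_dependence=True)  # rejects only the first
```

With the same inputs, BY never rejects more hypotheses than BH; the price of the weaker dependence assumption is lower power.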

russellpierce
43 votes, 2 answers
Mean absolute percentage error (MAPE) in Scikit-learn
How can we calculate the mean absolute percentage error (MAPE) of our predictions using Python and scikit-learn?
From the docs, we have only these four metric functions for regression:
metrics.explained_variance_score(y_true,…
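
A minimal sketch of MAPE in plain Python, assuming the usual definition $\frac{100}{n}\sum_i \left|\frac{y_i - \hat{y}_i}{y_i}\right|$. The function name `mape` is illustrative. Recent scikit-learn releases also provide `sklearn.metrics.mean_absolute_percentage_error`, which returns a fraction rather than a percentage; check that your installed version includes it.

```python
def mape(y_true, y_pred):
    # MAPE is undefined when any true value is zero, so fail loudly.
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if any(t == 0 for t in y_true):
        raise ValueError("MAPE is undefined when y_true contains zeros")
    total = sum(abs((t - p) / t) for t, p in zip(y_true, y_pred))
    return 100.0 * total / len(y_true)

# Illustrative values: errors of 10%, 10% and 0%.
score = mape([100.0, 200.0, 400.0], [110.0, 180.0, 400.0])
```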

Nyxynyx
43 votes, 2 answers
Test for bimodal distribution
I wonder whether there is a statistical test for the significance of bimodality, that is, for how well my data fit a bimodal distribution. If so, is such a test available in R?

Pauloc
43 votes, 5 answers
Negative values for AICc (corrected Akaike Information Criterion)
I have calculated AIC and AICc to compare two general linear mixed models. The AICs are positive, with model 1 having a lower AIC than model 2. However, the values for AICc are both negative (model 1's is still lower than model 2's). Is it valid to use and…
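
For reference, a sketch of the standard small-sample correction, $\mathrm{AICc} = \mathrm{AIC} + \frac{2k(k+1)}{n - k - 1}$, where $k$ is the number of estimated parameters and $n$ the sample size. Under this formula the correction term is positive whenever $n > k + 1$, so a positive AIC can only turn into a negative AICc when $n < k + 1$. The numbers below are made up, and whether this is what happened in the question is an assumption.

```python
def aicc(aic, k, n):
    # Small-sample corrected AIC; undefined when n - k - 1 == 0.
    denom = n - k - 1
    if denom == 0:
        raise ValueError("AICc is undefined when n == k + 1")
    return aic + 2.0 * k * (k + 1) / denom

# Plenty of data relative to parameters: a small positive correction.
well_sampled = aicc(100.0, k=3, n=50)

# More parameters than the sample supports: the correction flips sign.
over_parameterized = aicc(100.0, k=12, n=10)
```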

Freya Harrison
43 votes, 3 answers
How do DAGs help to reduce bias in causal inference?
I have read in several places that the use of DAGs can help to reduce bias due to:
- Confounding
- Differential selection
- Mediation
- Conditioning on a collider
I also see the term “backdoor path” a lot.
How do we use DAGs to reduce these biases, and…

LeelaSella
43 votes, 4 answers
What are the factors that cause the posterior distributions to be intractable?
In Bayesian statistics, it is often mentioned that the posterior distribution is intractable and thus approximate inference must be applied. What are the factors that cause this intractability?

Nick