Most Popular

1500 questions
43
votes
3 answers

Softmax layer in a neural network

I'm trying to add a softmax layer to a neural network trained with backpropagation, so I'm trying to compute its gradient. The softmax output is $h_j = \frac{e^{z_j}}{\sum_i e^{z_i}}$, where $j$ is the output neuron number. If I derive it then I…
Ran
43
votes
2 answers

How to interpret glmnet?

I am trying to fit a multivariate linear regression model with approximately 60 predictor variables and 30 observations, so I am using the glmnet package for regularized regression because p>n. I have been going through documentation and other…
Alice
43
votes
2 answers

Multiple regression or partial correlation coefficient? And relations between the two

I don't even know if this question makes sense, but what is the difference between multiple regression and partial correlation (apart from the obvious differences between correlation and regression, which is not what I am aiming at)? I want to…
43
votes
2 answers

Interpreting the residuals vs. fitted values plot for verifying the assumptions of a linear model

Consider the following figure from Faraway's Linear Models with R (2005, p. 59). The first plot seems to indicate that the residuals and the fitted values are uncorrelated, as they should be in a homoscedastic linear model with normally distributed…
Evan Aad
43
votes
1 answer

Manually calculated $R^2$ doesn't match up with randomForest() $R^2$ for testing new data

I know this is a fairly specific R question, but I may be thinking about proportion variance explained, $R^2$, incorrectly. Here goes. I'm trying to use the R package randomForest. I have some training data and testing data. When I fit a random…
Stephen Turner
43
votes
5 answers

Does the beta distribution have a conjugate prior?

I know that the beta distribution is conjugate to the binomial. But what is the conjugate prior of the beta? Thank you.
Brash Equilibrium
43
votes
2 answers

Understanding the parameters inside the Negative Binomial Distribution

I was trying to fit my data into various models and figured out that the fitdistr function from library MASS of R gives me Negative Binomial as the best-fit. Now from the wiki page, the definition is given as: NegBin(r,p) distribution describes the…
Legend
43
votes
4 answers

Should covariates that are not statistically significant be 'kept in' when creating a model?

I have several covariates in my calculation for a model, and not all of them are statistically significant. Should I remove those that are not? This question discusses the phenomenon, but does not answer my question: How to interpret…
A.M.
43
votes
5 answers

Analysis with complex data, anything different?

Say, for example, you are fitting a linear model $y = x \beta + \epsilon$, but the data $y$ are complex: all the numbers in $y$ are of the form $(a + bi)$. Is there anything procedurally different when working with such…
bill_e
43
votes
2 answers

What are the practical differences between the Benjamini & Hochberg (1995) and the Benjamini & Yekutieli (2001) false discovery rate procedures?

My statistics program implements both the Benjamini & Hochberg (1995) and Benjamini & Yekutieli (2001) false discovery rate (FDR) procedures. I have done my best to read through the later paper, but it is quite mathematically dense and I am not…
russellpierce
43
votes
2 answers

Mean absolute percentage error (MAPE) in Scikit-learn

How can we calculate the Mean absolute percentage error (MAPE) of our predictions using Python and scikit-learn? From the docs, we have only these 4 metric functions for Regressions: metrics.explained_variance_score(y_true,…
Nyxynyx
43
votes
2 answers

Test for bimodal distribution

I wonder if there is a statistical test for the significance of bimodality. That is, how well do my data fit a bimodal distribution? If there is such a test, is it available in R?
Pauloc
43
votes
5 answers

Negative values for AICc (corrected Akaike Information Criterion)

I have calculated AIC and AICc to compare two general linear mixed models. The AICs are positive, with model 1 having a lower AIC than model 2. However, the values for AICc are both negative (model 1 is still < model 2). Is it valid to use and…
Freya Harrison
43
votes
3 answers

How do DAGs help to reduce bias in causal inference?

I have read in several places that the use of DAGs can help to reduce bias due to confounding, differential selection, mediation, and conditioning on a collider. I also see the term “backdoor path” a lot. How do we use DAGs to reduce these biases, and…
LeelaSella
43
votes
4 answers

What are the factors that cause the posterior distributions to be intractable?

In Bayesian statistics, it is often mentioned that the posterior distribution is intractable and thus approximate inference must be applied. What are the factors that cause this intractability?
Nick