Most Popular

1500 questions
62 votes, 2 answers

What does the inverse of covariance matrix say about data? (Intuitively)

I'm curious about the nature of $\Sigma^{-1}$. Can anybody tell me something intuitive about "What does $\Sigma^{-1}$ say about data?" Edit: Thanks for the replies. After taking some great courses, I'd like to add some points: It is a measure of information,…
Arya
62 votes, 1 answer

Wald test for logistic regression

As far as I understand, the Wald test in the context of logistic regression is used to determine whether a certain predictor variable $X$ is significant or not. It rejects the null hypothesis of the corresponding coefficient being zero. The test…
user695652
62 votes, 13 answers

Two-tailed tests... I'm just not convinced. What's the point?

The following excerpt is from the entry, What are the differences between one-tailed and two-tailed tests?, on UCLA's statistics help site. ... consider the consequences of missing an effect in the other direction. Imagine you have developed a new…
62 votes, 3 answers

Questions about how random effects are specified in lmer

I recently measured how the meaning of a new word is acquired over repeated exposures (practice: day 1 to day 10) by measuring ERPs (EEGs) when the word was viewed in different contexts. I also controlled properties of the context, for instance, its…
alwin hoff
62 votes, 3 answers

F1/Dice-Score vs IoU

I was confused about the differences between the F1 score, the Dice score, and IoU (intersection over union). By now I have found out that F1 and Dice mean the same thing (right?) and that IoU has a very similar formula to the other two. F1 / Dice:…
pietz
62 votes, 2 answers

A more definitive discussion of variable selection

Background: I'm doing clinical research in medicine and have taken several statistics courses. I've never published a paper using linear/logistic regression and would like to do variable selection correctly. Interpretability is important, so no fancy…
sharper_image
62 votes, 1 answer

How to interpret type I, type II, and type III ANOVA and MANOVA?

My primary question is how to interpret the output (coefficients, F, P) when conducting a Type I (sequential) ANOVA? My specific research problem is a bit more complex, so I will break my example into parts. First, if I am interested in the effect…
djhocking
62 votes, 12 answers

Software needed to scrape data from graph

Does anybody have any experience with software (preferably free, preferably open source) that will take an image of data plotted on Cartesian coordinates (a standard, everyday plot) and extract the coordinates of the points plotted on the…
Alex Holcombe
62 votes, 4 answers

What are the differences between 'epoch', 'batch', and 'minibatch'?

As far as I know, when adopting Stochastic Gradient Descent as the learning algorithm, some people use 'epoch' for the full dataset and 'batch' for the data used in a single update step, while others use 'batch' and 'minibatch' respectively, and still others use…
Tim
62 votes, 4 answers

Under what conditions should Likert scales be used as ordinal or interval data?

Many studies in the social sciences use Likert scales. When is it appropriate to use Likert data as ordinal and when is it appropriate to use it as interval data?
A Lion
61 votes, 4 answers

Comparing SVM and logistic regression

Can someone please give me some intuition as to when to choose either SVM or LR? I want to understand the intuition behind the difference between the optimization criteria the two use to learn the hyperplane, where the respective aims are…
user41799
61 votes, 7 answers

Data normalization and standardization in neural networks

I am trying to predict the outcome of a complex system using neural networks (ANNs). The outcome (dependent) values range between 0 and 10,000. The different input variables have different ranges. All the variables have roughly normal…
61 votes, 3 answers

Clustering with K-Means and EM: how are they related?

I have studied algorithms for clustering data (unsupervised learning): EM and k-means. I keep reading the following: k-means is a variant of EM, with the assumption that clusters are spherical. Can somebody explain the above sentence? I do…
61 votes, 4 answers

How to derive variance-covariance matrix of coefficients in linear regression

I am reading a book on linear regression and have some trouble understanding the variance-covariance matrix of $\mathbf{b}$: The diagonal items are easy enough, but the off-diagonal ones are a bit more difficult; what puzzles me is that…
qed
61 votes, 6 answers

Which permutation test implementation in R to use instead of t-tests (paired and non-paired)?

I have data from an experiment that I analyzed using t-tests. The dependent variable is interval scaled and the data are either unpaired (i.e., 2 groups) or paired (i.e., within-subjects). E.g. (within subjects): x1 <- c(99, 99.5, 65, 100, 99,…
Henrik