Most Popular

1500 questions
37
votes
2 answers

Is there any algorithm combining classification and regression?

I'm wondering if there's any algorithm that could do classification and regression at the same time. For example, I'd like the algorithm to learn a classifier and, at the same time, learn a continuous target within each label. Thus, for each…
37
votes
5 answers

Difference between feedback RNN and LSTM/GRU

I am trying to understand different Recurrent Neural Network (RNN) architectures to be applied to time series data and I am getting a bit confused with the different names that are frequently used when describing RNNs. Is the structure of Long…
Josie
37
votes
3 answers

Variable importance from SVM

How to obtain a variable (attribute) importance using SVM?
user88
37
votes
3 answers

Do Bayesian priors become irrelevant with large sample size?

When performing Bayesian inference, we operate by maximizing our likelihood function in combination with the priors we have about the parameters. Because the log-likelihood is more convenient, we effectively maximize $\sum \ln (\text{prior}) + \sum…
pixels
37
votes
3 answers

Building an autoencoder in Tensorflow to surpass PCA

Hinton and Salakhutdinov, in "Reducing the Dimensionality of Data with Neural Networks" (Science, 2006), proposed a non-linear PCA through the use of a deep autoencoder. I have tried to build and train a PCA autoencoder with Tensorflow several times but I…
Donbeo
37
votes
4 answers

Experimental evidence supporting Tufte-style visualizations?

Q: Does there exist experimental evidence supporting Tufte-style, minimalist, data-speak visualizations over the chart-junked visualizations of, say, Nigel Holmes? I asked how to add chart-junk to R plots here and responders threw a hefty amount of…
lowndrul
37
votes
4 answers

Cloud computing platforms for machine learning

I've got a small list of companies that provide a platform for running R, Python, or Octave scripts on clusters built on top of Amazon EC2. Are there other names I should add? Cloudnumbers, Opani, crdata
Zach
37
votes
5 answers

Will the fact that my Italian son is going to attend a primary school change the expected number of Italian children to be present in his class?

This is a question stemming from a real-life situation, and I have been genuinely puzzled about its answer. My son is due to start primary school in London. As we are Italian, I was curious to know how many Italian children are already…
jj90213
37
votes
3 answers

What is the relationship between orthogonal, correlation and independence?

I've read an article saying that when using planned contrasts to find means that differ in a one-way ANOVA, contrasts should be orthogonal so that they are uncorrelated and prevent the type I error from being inflated. I don't understand…
Carl Levasseur
37
votes
10 answers

What is your favorite layman's explanation for a difficult statistical concept?

I really enjoy hearing simple explanations to complex problems. What is your favorite analogy or anecdote that explains a difficult statistical concept? My favorite is Murray's explanation of cointegration using a drunkard and her dog. Murray…
brotchie
37
votes
4 answers

Measures of similarity or distance between two covariance matrices

Are there any measures of similarity or distance between two symmetric covariance matrices (both having the same dimensions)? I am thinking here of analogues to KL divergence of two probability distributions or the Euclidean distance between vectors…
37
votes
2 answers

What is the difference between censoring and truncation?

In the book Statistical Models and Methods for Lifetime Data, it is written: Censoring: When an observation is incomplete due to some random cause. Truncation: When the incomplete nature of the observation is due to a systematic selection process…
ABC
37
votes
5 answers

Is p-value essentially useless and dangerous to use?

This article, "The Odds, Continually Updated" from the NY Times, happened to catch my attention. In short, it states that [Bayesian statistics] is proving especially useful in approaching complex problems, including searches like the one the Coast…
37
votes
2 answers

When is t-SNE misleading?

Quoting from one of the authors: t-Distributed Stochastic Neighbor Embedding (t-SNE) is a (prize-winning) technique for dimensionality reduction that is particularly well suited for the visualization of high-dimensional datasets. So it sounds…
Lyndon White
37
votes
7 answers

Are all simulation methods some form of Monte Carlo?

Is there a simulation method that is not Monte Carlo? All simulation methods involve substituting random numbers into the function to find a range of values for the function. So are all simulation methods in essence Monte Carlo methods?
Victor