Questions tagged [machine-learning]

Machine learning algorithms build a model of the training data. The term "machine learning" is vaguely defined; it includes what is also called statistical learning, reinforcement learning, unsupervised learning, etc. ALWAYS ADD A MORE SPECIFIC TAG.

Overview

From The Discipline of Machine Learning by Tom Mitchell:

The field of Machine Learning seeks to answer the question "How can we build computer systems that automatically improve with experience, and what are the fundamental laws that govern all learning processes?" This question covers a broad range of learning tasks, such as how to design autonomous mobile robots that learn to navigate from their own experience, how to data mine historical medical records to learn which future patients will respond best to which treatments, and how to build search engines that automatically customize to their user's interests. To be more precise, we say that a machine learns with respect to a particular task T, performance metric P, and type of experience E, if the system reliably improves its performance P at task T, following experience E. Depending on how we specify T, P, and E, the learning task might also be called by names such as data mining, autonomous discovery, database updating, programming by example, etc.
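As a toy illustration of the T, P, E framing in the quote above, here is a minimal sketch in Python. The task, the rote "learner", and all names (`true_label`, `train`, `performance`) are assumptions made up for this example; they do not come from Mitchell's text. The learner only memorizes what it has seen, so it shows the bookkeeping of the definition rather than any real generalization.

```python
def true_label(x):
    """Task T: classify an integer as odd (1) or even (0)."""
    return x % 2


def train(examples):
    """Experience E: memorize the labelled examples seen so far."""
    return dict(examples)


def performance(model, domain):
    """Performance P: fraction of the domain classified correctly.

    Inputs the model has never seen fall back to a default guess of 0.
    """
    correct = sum(model.get(x, 0) == true_label(x) for x in domain)
    return correct / len(domain)


domain = list(range(10))
scores = []
for n in (0, 4, 10):  # increasing amounts of experience
    model = train((x, true_label(x)) for x in domain[:n])
    scores.append(performance(model, domain))

print(scores)  # [0.5, 0.7, 1.0]
```

Performance P on task T does not decrease as experience E grows, which is the sense of "learning" used in the definition.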

High-level machine learning problems include:

  • supervised learning (tag);
  • unsupervised learning (tag);
  • semi-supervised learning (tag);
  • outlier or anomaly detection (tag); and
  • reinforcement learning (tag).

18068 questions
530 votes, 11 answers

What is the difference between test set and validation set?

I found this confusing when I used the neural network toolbox in Matlab. It divided the raw data set into three parts: a training set, a validation set, and a test set. I notice that in many training or learning algorithms, the data is often divided into 2 parts, the…
xiaohan2012
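For readers unfamiliar with the three-way split the question describes, a minimal sketch follows. The function name `three_way_split`, the 60/20/20 proportions, and the fixed seed are assumptions chosen for illustration; they are not what the Matlab toolbox does internally.

```python
import random


def three_way_split(data, val_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle a copy of `data` and cut it into train/validation/test lists."""
    items = list(data)
    random.Random(seed).shuffle(items)  # seeded so the split is reproducible
    n_test = round(len(items) * test_frac)
    n_val = round(len(items) * val_frac)
    test = items[:n_test]
    val = items[n_test:n_test + n_val]
    train = items[n_test + n_val:]
    return train, val, test


train, val, test = three_way_split(range(100))
print(len(train), len(val), len(test))  # 60 20 20
```

In the usual workflow the training set fits the model parameters, the validation set guides choices such as hyperparameters or early stopping, and the test set is touched only once, for the final performance estimate.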
478 votes, 20 answers

The Two Cultures: statistics vs. machine learning?

Last year, I read a blog post from Brendan O'Connor entitled "Statistics vs. Machine Learning, fight!" that discussed some of the differences between the two fields. Andrew Gelman responded favorably to this: Simon Blomberg: From R's fortunes …
Shane
419 votes, 5 answers

How to understand the drawbacks of K-means

K-means is a widely used method in cluster analysis. In my understanding, this method does NOT require ANY assumptions, i.e., give me a dataset and a pre-specified number of clusters, k, and I just apply this algorithm which minimizes the sum of…
KevinKim
328 votes, 8 answers

Why is Euclidean distance not a good metric in high dimensions?

I read that 'Euclidean distance is not a good distance in high dimensions'. I guess this statement has something to do with the curse of dimensionality, but what exactly? Besides, what is 'high dimensions'? I have been applying hierarchical…
287 votes, 8 answers

Bagging, boosting and stacking in machine learning

What are the similarities and differences between these 3 methods: bagging, boosting, and stacking? Which is the best one, and why? Can you give me an example for each?
254 votes, 3 answers

How to know that your machine learning problem is hopeless?

Imagine a standard machine-learning scenario: You are confronted with a large multivariate dataset and you have a pretty blurry understanding of it. What you need to do is to make predictions about some variable based on what you have. As…
Tim
226 votes, 4 answers

ROC vs precision-and-recall curves

I understand the formal differences between them; what I want to know is when it is more relevant to use one vs. the other. Do they always provide complementary insight about the performance of a given classification/detection system? When is it…
Amelio Vazquez-Reina
219 votes, 13 answers

What is the difference between data mining, statistics, machine learning and AI?

What is the difference between data mining, statistics, machine learning and AI? Would it be accurate to say that they are 4 fields attempting to solve very similar problems but with different approaches? What exactly do they have in common and…
Olivier Lalonde
219 votes, 9 answers

Why is Newton's method not widely used in machine learning?

This is something that has been bugging me for a while, and I couldn't find any satisfactory answers online, so here goes: After reviewing a set of lectures on convex optimization, Newton's method seems to be a far better algorithm than gradient…
Fei Yang
200 votes, 4 answers

What does the hidden layer in a neural network compute?

I'm sure many people will respond with links to 'let me google that for you', so I want to say that I've tried to figure this out so please forgive my lack of understanding here, but I cannot figure out how the practical implementation of a neural…
FAtBalloon
197 votes, 3 answers

Generative vs. discriminative

I know that generative means "based on $P(x,y)$" and discriminative means "based on $P(y|x)$," but I'm confused on several points: Wikipedia (+ many other hits on the web) classify things like SVMs and decision trees as being discriminative. But…
Yang
191 votes, 7 answers

What are the advantages of ReLU over sigmoid function in deep neural networks?

The state of the art for non-linearities is to use rectified linear units (ReLU) instead of the sigmoid function in deep neural networks. What are the advantages? I know that training a network when ReLU is used would be faster, and it is more biological…
RockTheStar
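One commonly cited reason for the faster training mentioned in the question is that the sigmoid's derivative never exceeds 0.25, so gradients shrink as they are multiplied through many layers, while the ReLU derivative is exactly 1 for positive inputs. The sketch below illustrates only that arithmetic. It assumes a chain of 10 layers and ignores the weight matrices entirely, so it is a simplification and not a model of real backpropagation.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x):
    """Derivative of the sigmoid; its maximum, 0.25, is reached at x = 0."""
    s = sigmoid(x)
    return s * (1.0 - s)


def relu_grad(x):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if x > 0 else 0.0


depth = 10  # assumed number of stacked layers
sigmoid_chain = sigmoid_grad(0.0) ** depth  # best case for the sigmoid
relu_chain = relu_grad(1.0) ** depth        # active ReLU units

print(sigmoid_chain, relu_chain)
```

Even in the sigmoid's best case the product of derivatives falls below one millionth after 10 layers, whereas the product for active ReLU units stays at 1.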
190 votes, 10 answers

Why is accuracy not the best measure for assessing classification models?

This is a general question that has been asked indirectly multiple times here, but it lacks a single authoritative answer. It would be great to have a detailed answer to this for reference. Accuracy, the proportion of correct classifications among…
Tim
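A small sketch of one well-known way accuracy can mislead, namely class imbalance. The data below are made up for illustration (5 positives out of 100 cases), and the helper names `accuracy` and `recall` are assumptions of this example rather than anything taken from the question or its answers.

```python
def accuracy(y_true, y_pred):
    """Proportion of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Proportion of truly positive cases that were predicted positive."""
    preds_for_positives = [p for t, p in zip(y_true, y_pred) if t == positive]
    return sum(p == positive for p in preds_for_positives) / len(preds_for_positives)


# Hypothetical imbalanced data: 5 positives, 95 negatives.
y_true = [1] * 5 + [0] * 95
# A "classifier" that always predicts the majority class.
y_pred = [0] * 100

acc = accuracy(y_true, y_pred)
rec = recall(y_true, y_pred)
print(acc, rec)  # 0.95 0.0
```

The constant classifier reaches 95% accuracy while finding none of the positive cases, which is why accuracy alone says little on imbalanced data.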
187 votes, 10 answers

Why the sudden fascination with tensors?

I've noticed lately that a lot of people are developing tensor equivalents of many methods (tensor factorization, tensor kernels, tensors for topic modeling, etc.). I'm wondering, why is the world suddenly fascinated with tensors? Are there recent…
Y. S.
178 votes, 5 answers

Training on the full dataset after cross-validation?

TL;DR: Is it ever a good idea to train an ML model on all the data available before shipping it to production? Put another way, is it ever OK to train on all data available and not check if the model overfits, or get a final read of the expected…
Amelio Vazquez-Reina