Questions tagged [supervised-learning]

Supervised learning is the machine learning task of inferring a function from labeled training data. The training data consist of a set of training examples. In supervised learning, each example is a pair consisting of an input object (typically a vector) and a desired output value (also called the supervisory signal). A supervised learning algorithm analyzes the training data and produces an inferred function, which can be used for mapping new examples.
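
As a rough illustration of the description above, the sketch below infers a function from labeled (input, output) pairs and applies it to new examples. The 1-nearest-neighbour rule and the toy data are assumptions chosen for brevity; the tag description does not name a specific algorithm.

```python
# Minimal sketch: infer a function from labeled (input, output) pairs
# using a 1-nearest-neighbour rule. Data and names are illustrative only.

def fit_1nn(examples):
    """Return a predictor built from (input_vector, label) pairs."""
    def predict(x):
        # Squared Euclidean distance from x to a training input.
        def dist2(pair):
            xi, _ = pair
            return sum((a - b) ** 2 for a, b in zip(xi, x))
        # The label of the closest training example is the prediction.
        _, label = min(examples, key=dist2)
        return label
    return predict

# Each training example pairs an input vector with its desired output.
training_data = [
    ((0.0, 0.0), "A"),
    ((0.1, 0.2), "A"),
    ((1.0, 1.0), "B"),
    ((0.9, 1.1), "B"),
]

inferred_function = fit_1nn(training_data)
print(inferred_function((0.2, 0.1)))  # a new example near the "A" inputs
print(inferred_function((1.2, 0.8)))  # a new example near the "B" inputs
```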

596 questions
59
votes
5 answers

Apply word embeddings to entire document, to get a feature vector

How do I use a word embedding to map a document to a feature vector, suitable for use with supervised learning? A word embedding maps each word $w$ to a vector $v \in \mathbb{R}^d$, where $d$ is some not-too-large number (e.g., 500). Popular word…
53
votes
4 answers

Class imbalance in Supervised Machine Learning

This is a question in general, not specific to any method or data set. How do we deal with a class imbalance problem in supervised machine learning, where around 90% of the labels in the dataset are 0 and around 10% are 1? How do we optimally…
NG_21
49
votes
1 answer

Difference between GradientDescentOptimizer and AdamOptimizer (TensorFlow)?

I've written a simple MLP in TensorFlow that models an XOR gate. So for: input_data = [[0., 0.], [0., 1.], [1., 0.], [1., 1.]] it should produce the following: output_data = [[0.], [1.], [1.], [0.]] The network has an input layer, a hidden…
46
votes
1 answer

How is softmax_cross_entropy_with_logits different from softmax_cross_entropy_with_logits_v2?

Specifically, I suppose I wonder about this statement: Future major versions of TensorFlow will allow gradients to flow into the labels input on backprop by default. Which is shown when I use tf.nn.softmax_cross_entropy_with_logits. In the…
38
votes
4 answers

Is there any supervised-learning problem that (deep) neural networks obviously couldn't outperform any other methods?

I have seen people put a lot of effort into SVMs and kernels, and they look pretty interesting to a beginner in machine learning. But if we expect that we can almost always find a better-performing solution using (deep) neural networks, what is…
Robin
32
votes
5 answers

How can you account for COVID-19 in your models?

How are you dealing with the coronavirus "event" in your machine learning models? Let's say you used to predict the number of sales each month. The virus affected your results last year and will keep affecting them for at least a couple more months. So your…
30
votes
5 answers

Distinguishing between two groups in statistics and machine learning: hypothesis test vs. classification vs. clustering

Assume I have two data groups, labeled A and B (each containing e.g. 200 samples and 1 feature), and I want to know if they are different. I could: a) perform a statistical test (e.g. t-test) to see if they are statistically different. b) use…
30
votes
2 answers

Supervised learning, unsupervised learning and reinforcement learning: Workflow basics

Supervised learning 1) A human builds a classifier based on input and output data 2) That classifier is trained with a training set of data 3) That classifier is tested with a test set of data 4) Deployment if the output is satisfactory To be used…
29
votes
3 answers

Unsupervised, supervised and semi-supervised learning

In the context of machine learning, what is the difference between unsupervised learning, supervised learning, and semi-supervised learning? And what are some of the main algorithmic approaches to look at?
23
votes
2 answers

What is the manifold assumption in semi-supervised learning?

I am trying to figure out what the manifold assumption means in semi-supervised learning. Can anyone explain in a simple way? I cannot get the intuition behind it. It says that your data lie on a low-dimensional manifold embedded in a…
21
votes
3 answers

How to predict outcome with only positive cases as training?

For the sake of simplicity, let's say I'm working on the classic example of spam/not-spam emails. I have a set of 20000 emails. Of these, I know that 2000 are spam but I don't have any example of not-spam emails. I'd like to predict whether the…
19
votes
4 answers

Why does regularization wreck orthogonality of predictions and residuals in linear regression?

Following up on this question... In ordinary least squares, the predictions and residuals are orthogonal. $$\sum_{i=1}^n\hat{y}_i (y_i - \hat{y}_i) = 0$$ If we estimate the regression coefficients using some other method but the same model, such as…
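
The orthogonality identity quoted above can be checked numerically. The sketch below fits a one-feature line with an intercept, once by ordinary least squares and once with a ridge-style penalty on the slope, and compares $\sum_i \hat{y}_i (y_i - \hat{y}_i)$ for both fits. The data, the penalty value `lam=5.0`, and the choice of ridge as the "other method" are assumptions for illustration; the excerpt is truncated before it names one.

```python
# Minimal sketch: predictions and residuals are orthogonal under OLS,
# but not once the slope is shrunk by a ridge-style penalty.

def fit_line(x, y, lam=0.0):
    """Fit y ~ intercept + slope * x; lam penalises the slope only."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / (sxx + lam)          # lam = 0 gives the OLS slope
    intercept = ybar - slope * xbar    # unpenalised intercept
    return intercept, slope

def pred_resid_inner(x, y, intercept, slope):
    """Return sum_i yhat_i * (y_i - yhat_i)."""
    y_hat = [intercept + slope * xi for xi in x]
    return sum(yh * (yi - yh) for yh, yi in zip(y_hat, y))

# Toy data, made up for this example.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.2, 1.9, 3.2, 3.8, 5.1]

ols_inner = pred_resid_inner(x, y, *fit_line(x, y))
ridge_inner = pred_resid_inner(x, y, *fit_line(x, y, lam=5.0))

print("OLS:  ", ols_inner)    # zero up to floating-point error
print("ridge:", ridge_inner)  # clearly nonzero
```

With an unpenalised intercept the residuals still sum to zero, so the nonzero inner product in the penalised fit comes entirely from the shrunken slope: the residuals are no longer orthogonal to the feature.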
19
votes
3 answers

(Why) Is absolute loss not a proper scoring rule?

Brier score is a proper scoring rule and is, at least in the binary classification case, square loss. $$Brier(y,\hat{y}) = \frac{1}{N} \sum_{i=1}^N\big\vert y_i -\hat{y}_i\big\vert^2$$ Apparently this can be adjusted for when there are three or more…
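
As a small numerical companion to the formula above, the sketch below computes the Brier score and then compares which forecast minimises the expected loss under square loss and under absolute loss. The true probability `p_true = 0.7`, the forecast grid, and the helper names are assumptions for illustration.

```python
# Minimal sketch: Brier score, plus the forecast that minimises expected
# square loss versus expected absolute loss for a binary outcome.

def brier(y, y_hat):
    """Mean squared difference between outcomes and forecast probabilities."""
    return sum((yi - pi) ** 2 for yi, pi in zip(y, y_hat)) / len(y)

def square_loss(outcome, q):
    return (outcome - q) ** 2

def absolute_loss(outcome, q):
    return abs(outcome - q)

def expected_loss(loss, p, q):
    """Expected loss of forecast q when the outcome is 1 with probability p."""
    return p * loss(1, q) + (1 - p) * loss(0, q)

example_score = brier([1, 0, 1], [0.9, 0.2, 0.6])

p_true = 0.7                              # assumed true event probability
grid = [i / 100 for i in range(101)]      # candidate forecasts 0.00 .. 1.00

best_sq = min(grid, key=lambda q: expected_loss(square_loss, p_true, q))
best_abs = min(grid, key=lambda q: expected_loss(absolute_loss, p_true, q))

print("Brier score:", example_score)
print("best forecast, square loss:  ", best_sq)
print("best forecast, absolute loss:", best_abs)
```

Under square loss the expected loss is minimised by reporting the true probability. Under absolute loss the expected loss $p(1-q) + (1-p)q$ is linear in the forecast $q$, so the minimiser sits at an endpoint (here $q = 1$), which is the sense in which absolute loss does not reward honest probabilities.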
18
votes
4 answers

What *is* an Artificial Neural Network?

As we delve into Neural Networks literature, we get to identify other methods with neuromorphic topologies ("Neural-Network"-like architectures). And I'm not talking about the Universal Approximation Theorem. Examples are given below. Then, it makes…
17
votes
2 answers

What is the support vector machine?

What IS the support vector machine? Can someone clarify my confusion? Possible answers: The SVM is the problem: given data $(x_n, y_n), n = 1, \ldots, N$ $$\min_{w, b}\frac{1}{2}||w||^2$$ $$\text{ subject to: } y_n(w \cdot x_n + b) \geq 1,…