Questions tagged [semi-supervised-learning]

Semi-supervised learning refers to machine learning tasks that use a mix of labeled and unlabeled data. The goal is to learn a mapping from inputs to outputs, or to obtain outputs for particular unlabeled inputs. The unlabeled data are used to learn about the underlying structure of the inputs, which can improve learning about the relationship between inputs and outputs. Semi-supervised learning involves elements of both supervised and unsupervised learning.
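As a concrete illustration of the idea in this description, here is a minimal sketch using scikit-learn's `LabelSpreading` (one of many possible semi-supervised approaches, not the only one): unlabeled points are marked with `-1`, and labels propagate over the structure of the inputs. The dataset and parameter choices are illustrative.

```python
# Semi-supervised learning sketch: most points are unlabeled (-1),
# and LabelSpreading propagates the few known labels over the
# neighborhood structure of the inputs.
import numpy as np
from sklearn.datasets import make_moons
from sklearn.semi_supervised import LabelSpreading

X, y = make_moons(n_samples=200, noise=0.05, random_state=0)

# Pretend most labels are unknown: keep only 5 labeled examples per class.
y_partial = np.full_like(y, -1)
labeled_idx = np.concatenate([np.where(y == 0)[0][:5], np.where(y == 1)[0][:5]])
y_partial[labeled_idx] = y[labeled_idx]

model = LabelSpreading(kernel="knn", n_neighbors=7)
model.fit(X, y_partial)

# transduction_ holds the inferred label for every point, labeled or not.
accuracy = (model.transduction_ == y).mean()
print(f"accuracy on all points: {accuracy:.2f}")
```

With only 10 labels, the propagated labels recover most of the true classes because the unlabeled points reveal the two-cluster structure.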

140 questions
30 votes, 1 answer

Distant supervision: supervised, semi-supervised, or both?

"Distant supervision" is a learning scheme in which a classifier is learned given a weakly labeled training set (training data is labeled automatically based on heuristics / rules). I think that both supervised learning, and semi-supervised…
29 votes, 3 answers

Unsupervised, supervised and semi-supervised learning

In the context of machine learning, what is the difference between unsupervised learning, supervised learning, and semi-supervised learning? And what are some of the main algorithmic approaches to look at?
28 votes, 2 answers

What's the intuition behind the contrastive learning approach?

Maybe a noob's query, but recently I have seen a surge of papers on contrastive learning (a subset of semi-supervised learning). Some of the prominent and recent research papers I read that detail this approach are: Representation…
23 votes, 2 answers

What is the manifold assumption in semi-supervised learning?

I am trying to figure out what the manifold assumption means in semi-supervised learning. Can anyone explain it in a simple way? I cannot get the intuition behind it. It says that your data lie on a low-dimensional manifold embedded in a…
21 votes, 4 answers

"Semi-supervised learning" - is this overfitting?

I was reading the report of the winning solution of a Kaggle competition (Malware Classification). The report can be found in this forum post. The problem was a classification problem (nine classes, the metric was the logarithmic loss) with 10000…
21 votes, 3 answers

How to predict outcome with only positive cases as training?

For the sake of simplicity, let's say I'm working on the classic example of spam/not-spam emails. I have a set of 20000 emails. Of these, I know that 2000 are spam but I don't have any example of not-spam emails. I'd like to predict whether the…
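One common baseline for this positive-only setting is one-class classification: fit a model to the positive (spam) class alone, then flag new emails by whether they resemble it. A minimal sketch with scikit-learn's `OneClassSVM` follows; the synthetic features and the `nu` threshold are illustrative choices, not from the question.

```python
# One-class baseline for positive-only training data: learn the
# "spam" region from spam examples alone, then score new emails.
# Synthetic 3-dimensional features stand in for real email features.
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.RandomState(42)
spam = rng.normal(loc=2.0, scale=0.5, size=(2000, 3))   # known spam
unknown = np.vstack([
    rng.normal(loc=2.0, scale=0.5, size=(50, 3)),        # spam-like emails
    rng.normal(loc=-2.0, scale=0.5, size=(50, 3)),       # very different emails
])

# nu bounds the fraction of training points treated as outliers.
clf = OneClassSVM(kernel="rbf", gamma="scale", nu=0.05).fit(spam)

# +1 = resembles the training (spam) class, -1 = outlier (likely not spam)
pred = clf.predict(unknown)
print("flagged as spam-like:", int((pred == 1).sum()), "of", len(unknown))
```

This only answers "does this look like spam?"; more refined positive-unlabeled (PU) learning methods also exploit the unlabeled pool directly.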
19 votes, 4 answers

Why does using pseudo-labeling non-trivially affect the results?

I've been looking into semi-supervised learning methods, and have come across the concept of "pseudo-labeling". As I understand it, with pseudo-labeling you have a set of labeled data as well as a set of unlabeled data. You first train a model on…
asked by R.M.
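The pseudo-labeling loop described in this excerpt can be sketched as follows; the base classifier and the 0.95 confidence threshold are illustrative choices.

```python
# Pseudo-labeling sketch: train on labeled data, label the unlabeled
# points the model is confident about, then retrain on the union.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_lab, y_lab = X[:50], y[:50]   # small labeled set
X_unlab = X[50:]                # unlabeled pool (true labels hidden)

# Step 1: fit on labeled data only.
model = LogisticRegression(max_iter=1000).fit(X_lab, y_lab)

# Step 2: pseudo-label unlabeled points predicted with high confidence.
proba = model.predict_proba(X_unlab)
confident = proba.max(axis=1) >= 0.95   # threshold is a tunable choice
pseudo_y = proba.argmax(axis=1)[confident]

# Step 3: retrain on labeled + pseudo-labeled data.
X_aug = np.vstack([X_lab, X_unlab[confident]])
y_aug = np.concatenate([y_lab, pseudo_y])
model = LogisticRegression(max_iter=1000).fit(X_aug, y_aug)

print(f"added {int(confident.sum())} pseudo-labeled points")
```

Whether this helps depends on how accurate the confident predictions are; wrong pseudo-labels can reinforce the initial model's mistakes, which is exactly what the question probes.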
13 votes, 1 answer

Is there any difference between distant supervision, self-training, self-supervised learning, and weak supervision?

From what I have read, a distant supervision algorithm usually has the following steps: (1) it may have some labeled training data; (2) it has access to a pool of unlabeled data; (3) it has an operator that allows it to sample…
12 votes, 3 answers

Classification with partially "unknown" data

Suppose I want to learn a classifier that takes a vector of numbers as input, and gives a class label as output. My training data consists of a large number of input-output pairs. However, when I come to testing on some new data, this data is…
9 votes, 2 answers

How to find weights for a dissimilarity measure

I want to learn (deduce) attribute weights for my dissimilarity measure that I can use for clustering. I have some examples $(a_i,b_i)$ of pairs of objects that are "similar" (should be in the same cluster), as well as some examples $(c_i,d_i)$ of…
8 votes, 1 answer

What does the term "gold label" refer to in the context of semi-supervised classification?

Throughout the Snorkel tutorial here https://github.com/HazyResearch/snorkel and in the team's related white paper there are references to "gold labels", but the term evades definition. What are "gold labels" in the semi-supervised classification…
asked by raldy
8 votes, 4 answers

Semi-supervised classification with unseen classes

Consider the following problem. You have a large dataset, some small subset of which has labels from the classes A, B, and C. I would like to classify the unlabelled subset of items, each of which can be from classes A, B, and C or (crucially) also…
asked by graffe
8 votes, 2 answers

Incorporate new unlabeled data into classifier trained on a small set of labeled data

I have a set of 400 labeled samples (8 numeric features) on which I trained a binary classifier. The problem I am facing is that once the classifier is shipped to the users, I will get additional samples, but those will be unlabeled. I was…
asked by user695652
7 votes, 2 answers

Binary classification when many binary features are missing

I'm working on a binary classification problem, with about 1000 binary features in total. The problem is that for each datapoint, I only know the values of a small subset of the features (around 10-50), and the features in this subset are pretty…
asked by raegtin
7 votes, 0 answers

Computation of log-likelihood in semi-supervised Naive Bayes

I have the following two questions about log-likelihood computation in semi-supervised Naive Bayes. I have read in several documents online that, in every EM iteration of semi-supervised Naive Bayes, the log-likelihood is positive. Is this always…