Active learning is a setting in which an automated learning system can request labels from an external source, such as a human user or a real-world experiment. The goal is to learn a good model while minimizing the number of label requests required.
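As a rough illustration of the loop described above, here is a minimal sketch of pool-based active learning for a 1-D threshold classifier. The `oracle` function, the threshold of 0.6, and the query budget are all hypothetical stand-ins for the external labelling source; the query rule (pick the unlabelled point nearest the middle of the still-uncertain interval) is just one simple form of uncertainty sampling.

```python
import random

def oracle(x):
    # Hypothetical labelling source (stand-in for a human or an experiment).
    return x >= 0.6

def active_learn_threshold(pool, ask, budget):
    """Narrow down a decision threshold using as few label requests as possible."""
    lo, hi = 0.0, 1.0          # invariant: lo < true threshold <= hi
    unlabelled = sorted(pool)
    queries = 0
    while queries < budget:
        # Only points strictly inside (lo, hi) are still informative.
        candidates = [x for x in unlabelled if lo < x < hi]
        if not candidates:
            break
        mid = (lo + hi) / 2
        x = min(candidates, key=lambda c: abs(c - mid))  # most uncertain point
        unlabelled.remove(x)
        queries += 1
        if ask(x):
            hi = x             # x is positive, so the threshold is at or below x
        else:
            lo = x             # x is negative, so the threshold is above x
    return lo, hi, queries

random.seed(0)
pool = [random.random() for _ in range(1000)]
lo, hi, n_queries = active_learn_threshold(pool, oracle, budget=50)
print(f"threshold in ({lo:.4f}, {hi:.4f}] after {n_queries} label requests")
```

Labelling the whole pool would cost 1000 requests; the sketch stops as soon as no unlabelled point can change the estimate, or when the budget runs out.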
Questions tagged [active-learning]
40 questions
7
votes
3 answers
Using ML to assist human labelling in dataset with highly unbalanced classes
Are there scientific issues with using ML to assist human annotation?
I've got a 3-class unlabelled dataset where only 1 in 500 elements belongs to the 2 classes of interest.
The labels aren't trivially discernible for all the elements of the…

Aidan Connelly
- 91
- 8
5
votes
1 answer
Combine reinforcement and unsupervised learning?
I have an existing set of data and plan to generate more data that follows the same pattern. To do this, I plan to use unsupervised learning. How can I provide feedback on the generated data and reinforce "good" and discourage "bad" results?
In…

Streetlamp
- 151
- 3
5
votes
1 answer
Labeling a pool of unlabelled samples iteratively
Problem setting
I'm faced with a problem in which we have a large set of data points (100K), all of which are still unlabelled. These are to be used as input to a binary classifier at a later point in time. Since sampling is very costly, we need to…

ciri
- 1,123
- 9
- 21
4
votes
0 answers
R package for active learning and sampling
I tried to find an R package for active learning/sampling and stumbled upon the "activelearning" package. However, this package is not available on CRAN.
Does anyone know how good this package is?
Are there any other active learning packages available on…

Michael
- 219
- 1
- 9
3
votes
3 answers
Strategies for incorporating feedback for a ML algorithm
I am working on a text classification problem in which, at certain time points (say, at the end of each week), I receive a batch of feedback from users about correctly and incorrectly classified inputs. I am trying different strategies for incorporating this…

Hamed
- 31
- 2
3
votes
1 answer
Batch Active Learning for classification?
Say we have unlimited unlabeled data and we can ask an oracle for labels. We can use active learning to choose the most informative data samples for labeling, thus minimizing data labeling cost. If we choose a single sample at a time, we can choose…
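For reference, the naive batch baseline is to score every pool sample by predictive entropy and take the top k. The sketch below assumes the current model's class probabilities for the pool are already available; `pool_probs` holds made-up illustrative values, and `select_batch` is a hypothetical helper name.

```python
import math

def entropy(probs):
    """Shannon entropy of a predicted class distribution (natural log)."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def select_batch(pool_probs, k):
    """Return indices of the k pool samples with the highest predictive entropy."""
    ranked = sorted(range(len(pool_probs)),
                    key=lambda i: entropy(pool_probs[i]),
                    reverse=True)
    return ranked[:k]

# Illustrative predicted probabilities for five unlabelled samples.
pool_probs = [
    [0.98, 0.02],
    [0.55, 0.45],
    [0.80, 0.20],
    [0.50, 0.50],
    [0.30, 0.70],
]
batch = select_batch(pool_probs, k=2)
print(batch)  # → [3, 1]
```

Top-k selection scores each sample independently, so the chosen batch can contain near-duplicates; batch-mode methods usually add some diversity term to counter this.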

Adam Kosiorek
- 141
- 4
2
votes
1 answer
Query-By-Committee with abstention
I'm having some difficulty understanding how abstention works in active learning. A teacher asked me to implement the Query-by-Committee active learning algorithm, which helps a committee choose better points to ask the oracle about. I understand how the…

Valentin Dusollier
- 21
- 2
2
votes
1 answer
Are there any tutorials for text categorization with active learning?
I am going to categorize texts into predefined topics. It looks like an active learning approach would suit my needs. Are there any good tutorials or advice?

drobnbobn
- 427
- 1
- 4
- 10
2
votes
1 answer
Asymmetrical selective sampling for linear classification
I've got an online classification problem where I predict a class label {+1, -1} for an object and then show it to a user to get the real label. My task is to minimize the number of -1 objects shown to the user.
Obviously, the algorithm will not converge…

martinthenext
- 103
- 7
2
votes
0 answers
Active learning system confusion matrix
A colleague of mine has been developing a machine learning system with an active learning component. I was having trouble reproducing the metrics he's been reporting, until I found out that he's handling low-confidence results in an unusual-to-me…

Richard Turner
- 21
- 2
1
vote
0 answers
Active learning to counter concept drift
I'll be doing my thesis soon on model drift detection and possible remedies in a production environment. I'll probably be making an intuitive (hopefully!) theoretical framework with various types of model drift, root causes and solutions. Later on…

Zestar75
- 11
- 1
1
vote
0 answers
Is the variance of individual points the best strategy for choosing points in active learning?
In active learning with estimators like Gaussian processes, we typically look at the test-test covariance matrix and choose, for the next round of observation, the points that have the largest variance on the diagonal.
This comes from the intuition…
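To make the selection rule concrete, here is a minimal sketch that computes the diagonal of the posterior (test-test) covariance for a 1-D Gaussian process and picks the candidate with the largest variance. The RBF kernel, unit length scale, small noise term, and the training and candidate locations are all assumptions chosen for illustration; the linear solver is a plain Gaussian elimination so that the example needs no external libraries.

```python
import math

def rbf(a, b, length_scale=1.0):
    """Squared-exponential kernel with unit signal variance (assumed)."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def posterior_variances(train_x, test_x, noise=1e-6):
    """Diagonal of the GP posterior covariance at the test points."""
    K = [[rbf(a, b) + (noise if i == j else 0.0)
          for j, b in enumerate(train_x)]
         for i, a in enumerate(train_x)]
    variances = []
    for xs in test_x:
        k_star = [rbf(xs, a) for a in train_x]
        alpha = solve(K, k_star)               # (K + noise*I)^{-1} k_star
        variances.append(rbf(xs, xs) - sum(k * a for k, a in zip(k_star, alpha)))
    return variances

train_x = [0.0, 1.0, 2.0]   # already-observed inputs (illustrative)
test_x = [0.5, 1.5, 5.0]    # candidate inputs (illustrative)
variances = posterior_variances(train_x, test_x)
next_index = max(range(len(test_x)), key=lambda i: variances[i])
print(test_x[next_index], variances)
```

Here the candidate at 5.0 lies far from all observations, so its variance stays close to the prior variance and it is selected. Note that this rule looks only at the diagonal and ignores the off-diagonal covariances between candidates.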

user3246971
- 395
- 1
- 8
1
vote
0 answers
What is the difference between splitting the dataset into training and testing sets and collecting the training and testing data separately?
I am working on active learning and I was wondering what difference it makes whether we split one dataset into training and testing sets or collect and label the training and testing datasets separately. Either way, the ratio between training and testing…

Phoenix
- 111
- 5
1
vote
0 answers
Validation and test set definition for active learning with rare classes
Context
I've got an active learning problem with an event rate of about 1%. The data is a panel, individuals over time. We have a proxy label that is highly correlated with the true label within individuals, but not in time. In other words, if…

generic_user
- 11,981
- 8
- 40
- 63
1
vote
0 answers
Experiment Design for Black-Box Function with Multiple Outputs
I have a multi-output black-box function $f: x \rightarrow y$, where $x \in \mathbb{R}^{M}$ and $y \in \mathbb{R}^{N}$. Both $M$ and $N$ are greater than 1; for example, $M=4$ and $N=3$. My goal is to sample a set $\mathcal{X}$, so that the projected set…

Hammer. Wang
- 183
- 8