Questions tagged [active-learning]

Active learning is a setting where an automated learning system can request labels from an external source, perhaps a human user or a real-world experiment. It is used to try to learn good models while minimizing the number of interventions required.

40 questions
7
votes
3 answers

Using ML to assist human labelling in a dataset with highly unbalanced classes

Are there scientific issues with using ML to assist human annotation? I've got a 3-class unlabelled dataset where only 1 in 500 elements belongs to the 2 classes of interest. The labels aren't trivially discernible for all the elements of the…
5
votes
1 answer

Combine reinforcement and unsupervised learning?

I have an existing set of data and plan to generate more data that follows the same pattern. To do this, I plan to use unsupervised learning. How can I provide feedback on the generated data and reinforce "good" and discourage "bad" results? In…
5
votes
1 answer

Labeling a pool of unlabelled samples iteratively

Problem setting: I'm faced with a problem in which we have a large set of data points (100K), all of which are still unlabelled. These are to be used as input to a binary classifier at a later point in time. Since sampling is very costly, we need to…
ciri
4
votes
0 answers

R package for active learning and sampling

I tried to find an R package for active learning/sampling and stumbled upon the "activelearning" package. However, this package is not available on CRAN. Does anyone know how good this package is? Are there any other active learning packages available on…
Michael
3
votes
3 answers

Strategies for incorporating feedback for a ML algorithm

I am working on a text classification problem in which, at certain points in time, say at the end of each week, I receive a batch of feedback from users about correctly and wrongly classified inputs. I am trying different strategies for incorporating this…
3
votes
1 answer

Batch Active Learning for classification?

Say we have unlimited unlabeled data and we can ask an oracle for labels. We can use active learning to choose the most informative data samples for labeling, thus minimizing data labeling cost. If we choose a single sample at a time, we can choose…
2
votes
1 answer

Query-By-Committee with abstention

I’m having some difficulty understanding how abstention works in active learning. A teacher asked me to implement the active learning algorithm Query-by-Committee, which helps a committee choose better points to ask the oracle about. I’ve understood how the…
2
votes
1 answer

Are there any tutorials for text categorization with active learning?

I am going to categorize texts into predefined topics. It looks like an active learning approach would suit my needs. Are there any good tutorials or advice?
drobnbobn
2
votes
1 answer

Asymmetrical selective sampling for linear classification

I've got an online classification problem where I predict a class label {+1, -1} for an object and then show it to a user to get the real label. My task is to minimize the number of -1 objects shown to the user. Obviously, the algorithm will not converge…
2
votes
0 answers

Active learning system confusion matrix

A colleague of mine has been developing a machine learning system with an active learning component. I was having trouble reproducing the metrics he's been reporting, until I found out that he's handling low-confidence results in an unusual-to-me…
1
vote
0 answers

Active learning to counter concept drift

I'll be doing my thesis soon on model drift detection and possible remedies in a production environment. I'll probably be making an intuitive (hopefully!) theoretical framework with various types of model drift, root causes and solutions. Later on…
1
vote
0 answers

Is variance of individual points the best strategy for choosing points in active learning?

In active learning with estimators like Gaussian processes, we typically look at the test-test covariance matrix and choose, for the next round of observation, the points that have the largest variance on the diagonal. This comes from the intuition…
user3246971
1
vote
0 answers

What is the difference between splitting the dataset into training and testing sets and collecting the training and testing data separately?

I am working on active learning and I was wondering about the difference between splitting the dataset into training and testing sets and collecting and labeling the training and testing datasets separately. Either way, the ratio between training and testing…
1
vote
0 answers

Validation and test set definition for active learning with rare classes

Context: I've got an active learning problem with an event rate of about 1%. The data is a panel, individuals over time. We have a proxy label that is highly correlated with the true label within individuals, but not in time. In other words, if…
1
vote
0 answers

Experiment Design for Black-Box Function with Multiple Outputs

I have a multi-output black box function $f: x \rightarrow y$, where $x \in R^{M}$ and $y \in R^N$. Both $M$ and $N$ are greater than 1. For example, $M=4$, and $N=3$. My goal is to sample a set $\mathcal{X}$, so that the projected set…