Questions tagged [labeling]
63 questions
12 votes
2 answers
Do more object classes increase or decrease the accuracy of object detection
Assume you have an object detection dataset (e.g., MS COCO or Pascal VOC) with N images where k object classes have been labeled. You train a neural network (e.g., Faster-RCNN or YOLO) and measure the accuracy (e.g., IOU@0.5).
Now you introduce x…

SaiBot
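For readers unfamiliar with the IOU@0.5 criterion mentioned in this question, here is a minimal sketch of how it is commonly computed. It assumes axis-aligned boxes given as `(x1, y1, x2, y2)`; the function names `iou` and `is_match` are illustrative and not taken from the question.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap region, clipped at zero when boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def is_match(pred_box, true_box, threshold=0.5):
    """A prediction counts as a hit when its IoU with the ground truth reaches the threshold."""
    return iou(pred_box, true_box) >= threshold
```

With a threshold of 0.5, a predicted box has to overlap the labeled box by at least half of their combined area to count as a detection.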
8 votes
2 answers
Incorporate new unlabeled data into classifier trained on a small set of labeled data
I have a set of 400 labeled samples (8 numeric features) on which I trained a binary classifier.
The problem I am facing is that once the classifier is shipped to the users, I will get additional samples, but those will be unlabeled. I was…

user695652
5 votes
5 answers
How to deal with incorrect labels in classification?
I have a dataset with 2 classes: A and B. The problem is that 20% to 30% of the samples of class B are mislabeled (labeled as B but the right label is A) and I am not able to identify those mistakes.
Is there a way/approach/method to enhance the…

naddoth
5 votes
1 answer
Do ordinal variables require one hot encoding?
For categorical variables, one hot encoding is a must if the variable is non-binary. But what about ordinals? These variables are ordered but are mutually exclusive. Do they require the same treatment as categoricals other than labelling?

Shiv_90
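The two encodings contrasted in this question can be sketched in plain Python as follows. The level names `low`, `medium`, `high` and the helper names are hypothetical and chosen only for illustration.

```python
def ordinal_encode(values, ordered_levels):
    """Map each value to its rank in ordered_levels, preserving the ordering."""
    rank = {level: i for i, level in enumerate(ordered_levels)}
    return [rank[v] for v in values]


def one_hot_encode(values, levels):
    """Map each value to an indicator vector with a single 1; ordering is discarded."""
    index = {level: i for i, level in enumerate(levels)}
    rows = []
    for v in values:
        row = [0] * len(levels)
        row[index[v]] = 1
        rows.append(row)
    return rows


levels = ["low", "medium", "high"]   # hypothetical ordinal scale
sizes = ["medium", "low", "high"]    # hypothetical observations

as_ranks = ordinal_encode(sizes, levels)
as_indicators = one_hot_encode(sizes, levels)
```

The rank encoding keeps the order but implicitly treats the levels as equally spaced, while the indicator encoding makes no spacing assumption and loses the order.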
5 votes
1 answer
Labeling a pool of unlabelled samples iteratively
Problem setting
I'm faced with a problem in which we have a large set of data points (100K), all of which are still unlabelled. These are to be used as input to a binary classifier at a later point in time. Since sampling is very costly, we need to…

ciri
4 votes
2 answers
Regression algorithm on [0,1] with lots of mislabeled data
I have a training set mapping some Likert-scale variables (integers between 1 and 7, rescaled to real numbers between 0 and 1) to predict a continuous variable between 0 and 1. The data set is reasonably large ($10^4$-$10^5$ rows) but very noisy…

user1111929
4 votes
0 answers
Medium Frequency Trading - Better labelling strategy?
The mid-price at time $t$ is denoted by $$p_t = \frac{s_t^{a,1} + s_t^{b,1}}{2}.$$ This mid-price can evolve in minimum increments of half a tick but is almost always observed to move at increments of a tick over time intervals of a…

Jeremie
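The mid-price formula in this question can be written out as a small sketch. It assumes that $s_t^{a,1}$ and $s_t^{b,1}$ denote the best (level-1) ask and bid quotes, which the excerpt does not state explicitly. The function names and the example quotes are hypothetical.

```python
def mid_price(best_ask, best_bid):
    """p_t = (s_t^{a,1} + s_t^{b,1}) / 2, assuming level-1 ask and bid quotes."""
    return (best_ask + best_bid) / 2.0


def ticks_moved(mid_before, mid_after, tick_size):
    """Signed mid-price change expressed in ticks."""
    return (mid_after - mid_before) / tick_size


# Hypothetical quotes with a tick size of 0.25: only the ask moves up by one tick.
p_before = mid_price(best_ask=10.5, best_bid=10.25)
p_after = mid_price(best_ask=10.75, best_bid=10.25)
move = ticks_moved(p_before, p_after, tick_size=0.25)
```

When only one side of the book moves by a tick, the mid-price moves by half a tick, which is the minimum increment the question refers to.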
3 votes
1 answer
Elastic net/LASSO with soft labels
Sometimes you do not have firm Y/N labels but, e.g., an 80% probability of Y as a label. This happens, for example, if you train a model on a small amount of labelled data, predict for a large amount of unlabelled data and then want to use the predictions as…

Björn
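One way to read the soft-label setting in this question: a row labelled "80% probability of Y" contributes the same log-loss as two hard-labelled copies of that row, one with Y weighted 0.8 and one with N weighted 0.2. The sketch below checks that identity numerically. It is not a method proposed in the question, and all function names are hypothetical.

```python
import math


def soft_log_loss(p_label, p_model):
    """Cross-entropy of one sample whose label is a probability p_label of Y."""
    return -(p_label * math.log(p_model) + (1.0 - p_label) * math.log(1.0 - p_model))


def expand_soft_label(features, p_label):
    """Turn one soft-labelled row into two hard-labelled rows (features, y, weight)."""
    return [(features, 1, p_label), (features, 0, 1.0 - p_label)]


def weighted_hard_log_loss(rows, p_model):
    """Weighted log-loss over hard-labelled rows that share one predicted probability."""
    total = 0.0
    for _features, y, weight in rows:
        total += -weight * math.log(p_model if y == 1 else 1.0 - p_model)
    return total


rows = expand_soft_label([1.0, 2.0], p_label=0.8)
loss_soft = soft_log_loss(0.8, p_model=0.6)
loss_expanded = weighted_hard_log_loss(rows, p_model=0.6)
```

If the penalized regression routine being used accepts per-observation weights, this expansion is one possible way to feed soft labels to it. Whether a given implementation supports such weights is an assumption to check against its documentation.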
3 votes
2 answers
Is it scientifically correct to label data using a model built on golden data?
I am trying to find a labeled dataset of users' profile pictures with their personality trait scores. Unfortunately, I did not find any, and therefore I decided to crawl Twitter for public users' profile pictures with their tweets. At that moment,…

Krebto
3 votes
1 answer
Logistic regression - labeling outcome by confidence of classification
We have trained our logistic regression model to classify candidates attending interviews as 'pursue' or 'fail' (two possible outcomes).
Now, as a post-prediction step, we are planning to categorise the candidates as strong/mediocre/weak based on the…

Deepan Subramani
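A minimal sketch of the post-prediction step described in this question, assuming the grouping is based on the predicted probability of 'pursue' (the excerpt is cut off before saying so). The thresholds 0.8 and 0.6 and the function name are hypothetical placeholders, not values from the question.

```python
def confidence_band(p_pursue, strong=0.8, weak=0.6):
    """Return (predicted class, band) from the predicted probability of 'pursue'.

    Confidence is the probability of whichever class is predicted, so it lies
    in [0.5, 1.0]. The strong and weak cut-offs are illustrative assumptions.
    """
    predicted = "pursue" if p_pursue >= 0.5 else "fail"
    confidence = max(p_pursue, 1.0 - p_pursue)
    if confidence >= strong:
        band = "strong"
    elif confidence >= weak:
        band = "mediocre"
    else:
        band = "weak"
    return predicted, band
```

For example, `confidence_band(0.95)` gives a strong 'pursue', while `confidence_band(0.55)` gives a weak 'pursue'. Such bands are only as meaningful as the calibration of the underlying probabilities.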
2 votes
1 answer
How to make a decision - when there is a tie and no human expert
We have two algorithms (simple rule-based) working on labeling the dataset as "Yes" and "No" for a disease. There is no ML involved in this task.
For example: If Algo 1 says subject 1 has the disease (Yes) and Algo 2 also says subject 1 has the disease…

The Great
2 votes
1 answer
Supervised learning: setting labels on sliding windows of sensor data
Suppose that I have a set of accelerometer data collected with one sensor and one label for each measured data point. These labels describe different states of my system e.g., $state_A, state_B, state_C$, etc., and I want to use this information to…
user185498
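One common way to turn per-sample labels into one label per sliding window is a majority vote. The sketch below assumes that is an acceptable rule for the setting in this question (the excerpt does not say which rule the author intends). The function name, window size and step are hypothetical.

```python
from collections import Counter


def window_labels(labels, window_size, step):
    """Assign each sliding window its most frequent per-sample label.

    Returns (start index, window label, purity) tuples, where purity is the
    fraction of samples in the window that carry the chosen label.
    """
    out = []
    for start in range(0, len(labels) - window_size + 1, step):
        window = labels[start:start + window_size]
        label, count = Counter(window).most_common(1)[0]
        out.append((start, label, count / window_size))
    return out


# Hypothetical label stream: three samples in state_A followed by five in state_B.
stream = ["state_A"] * 3 + ["state_B"] * 5
windows = window_labels(stream, window_size=4, step=2)
```

Windows that straddle a state transition get a purity below 1.0, which can be used to drop or down-weight ambiguous windows.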
2 votes
2 answers
Features and Variables in Data Analysis
I am pretty new to machine learning and data analysis in general. I have been learning about different algorithms as part of my course. Now, I am stuck with a particular problem. I have been given a dataset which has 52 variables (columns) and 500…

Ambarish
2 votes
2 answers
Labels for correlation coefficients
How could we assign labels to correlation coefficients to make the data easier to read, especially for non-technical people or in qualitative analyses?
For example:
$\rho > 0.9$ - strongly correlated
$\rho > 0.7$ - moderately…

zeferino
1 vote
0 answers
How to do sentiment analysis in financial news?
I already have financial news that I got from financial news sites. Now I want to apply sentiment analysis to classify news as positive, neutral, or negative. I do not know what to do. I know some sentiment analysis models like VADER and…

Eko Putra