How does one trade off quantity of labels for accuracy of labels?

Asked Nov 02 '17 at 15:34

Active Jul 16 '18 at 00:18

Viewed 13 times

I'm working on the problem of automatically labeling groupings in music. In this domain, there is a label at each time step indicating whether a group starts at that time step. However, the labels are subjective, and not everyone agrees about the groupings.

I'm currently designing a study to collect these labels, and I'm trying to decide how many participants should listen to each sample. Of course, this trades off directly against the total number of samples I can get labels for. The number of samples will drive our choice of model, but we're particularly interested in RNNs.

How can I justify how many participants per sample I need? It could be as few as 1 person or as many as 10. One participant gives us more samples in total, but 10 give us better labels for each sample. How can I make a principled decision? My intuition says one person per sample, because that's what Magenta and other RNN approaches to music tasks have done.
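
To make the trade-off concrete, here is a rough sketch rather than a proposed method. It assumes a fixed budget of listening sessions, a binary label per time step, and annotators who are independently "correct" with the same probability `p`, with labels combined by majority vote. All of these are simplifying assumptions (the names `budget`, `p`, and both functions are hypothetical), and since the labels are subjective, a single correct label may not even exist.

```python
from math import comb


def majority_vote_accuracy(k: int, p: float) -> float:
    """Probability that a majority vote of k annotators is correct.

    Assumes each annotator is independently correct with probability p.
    Ties (possible only for even k) are broken by a fair coin flip.
    """
    accuracy = 0.0
    for correct in range(k + 1):
        prob = comb(k, correct) * p**correct * (1 - p) ** (k - correct)
        if 2 * correct > k:
            accuracy += prob          # strict majority is correct
        elif 2 * correct == k:
            accuracy += 0.5 * prob    # tie, resolved at random
    return accuracy


def tradeoff_table(budget: int, p: float, ks):
    """For each k, return (k, labeled samples, majority-vote accuracy).

    budget is the total number of listening sessions available
    (a hypothetical figure); each sample consumes k of them.
    """
    return [(k, budget // k, majority_vote_accuracy(k, p)) for k in ks]


if __name__ == "__main__":
    # Illustrative numbers only: 1000 sessions, 70% per-annotator accuracy.
    for k, n_samples, acc in tradeoff_table(1000, 0.7, range(1, 11)):
        print(f"k={k:2d}  samples={n_samples:4d}  label accuracy={acc:.3f}")
```

Under these assumptions the table shows how quickly the sample count falls relative to the gain in label accuracy as `k` grows, but it says nothing about how a particular model responds to label noise versus sample count.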

edited Jul 16 '18 at 00:18

Sycorax


asked Nov 02 '17 at 15:34

Peter Mitrano


I've removed the [tag:neural-networks] tag because designing and training NNs don't appear to be the core of the question -- instead, my understanding is that you're uncertain how to design your data collection. – Sycorax Jul 16 '18 at 00:19


0 Answers