Under mild assumptions on the noise mechanism and the data distribution (e.g. less than $\frac{1}{2}$ of the data is incorrectly labelled), some classifiers can be shown to be consistent in the binary classification setting. A classifier $C_n$, built from $n$ training points, is said to be consistent if
$$R(C_n) \to R(C^{\mathrm{Bayes}}) \quad \text{as} \quad n \to \infty,$$
where the risk of a classifier, $R(C) := \mathbb{P}(C(X) \neq Y)$, is minimised by the Bayes classifier
$$
C^{\mathrm{Bayes}}(x) :=
\begin{cases}
1, & \text{if } \eta(x) \geq 1/2,\\
0, & \text{otherwise},
\end{cases}
$$
where $\eta(x) := \mathbb{P}(Y = 1 \mid X = x)$.
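For a concrete example (my own, not one from the paper): suppose the two classes are equally likely and $X \mid Y = y \sim N(2y, 1)$. Then $\eta(x) \geq 1/2$ exactly when $x \geq 1$ (the midpoint of the two means), so the Bayes classifier thresholds at $1$ and its risk is
$$R(C^{\mathrm{Bayes}}) = \Phi(-1) \approx 0.159,$$
where $\Phi$ is the standard normal CDF.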
$K$-nearest-neighbours and support vector machines can be shown to satisfy this condition, while linear discriminant analysis does not. Since the guarantee only holds in the limit $n \to \infty$, it doesn't tell you how much data you will need in your particular case; however, the paper referenced below contains simulation studies that may help give you an intuition.
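To get a feel for the convergence, here is a minimal simulation sketch (my own, assuming NumPy, SciPy and scikit-learn are available) in the Gaussian model above: 20% of the training labels are flipped at random, and the $k$-nearest-neighbour risk, measured against the *true* labels, should approach the Bayes risk $\Phi(-1) \approx 0.159$ as $n$ grows.

```python
import numpy as np
from scipy.stats import norm
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)

def sample(n, flip=0.2):
    """1D Gaussian model: P(Y=1) = 1/2, X | Y=y ~ N(2y, 1).
    A fraction `flip` of the returned training labels is corrupted at random."""
    y = rng.integers(0, 2, n)                            # true labels
    x = rng.normal(2.0 * y, 1.0).reshape(-1, 1)          # features
    y_noisy = np.where(rng.random(n) < flip, 1 - y, y)   # flipped labels
    return x, y, y_noisy

# Clean test set: risk is always measured against the true labels.
x_test, y_test, _ = sample(50_000, flip=0.0)

# For this model eta(x) >= 1/2 iff x >= 1, so the Bayes risk is Phi(-1).
bayes_risk = norm.cdf(-1.0)

for n in [100, 1_000, 10_000]:
    x_train, _, y_noisy = sample(n, flip=0.2)
    k = max(1, round(n ** 0.7))   # k -> infinity, k/n -> 0: needed for consistency
    knn = KNeighborsClassifier(n_neighbors=k).fit(x_train, y_noisy)
    risk = np.mean(knn.predict(x_test) != y_test)
    print(f"n={n:>6}  kNN risk={risk:.4f}  Bayes risk={bayes_risk:.4f}")
```

The choice $k \approx n^{0.7}$ is just one schedule satisfying $k \to \infty$ and $k/n \to 0$, the standard conditions under which $k$-NN is consistent; despite training on 20% corrupted labels, the printed risk should settle towards $\approx 0.159$.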
Reference
Cannings, T. I., Fan, Y. and Samworth, R. J. (2018). Classification with imperfect training labels. https://arxiv.org/abs/1805.11505