Are there any indicators in the data of the maximal accuracy a machine learning solution can reach? Let's say you have labeled cookie data (web address, time, OS and browser details, ...) for 10,000 people, and you have to build an ML model that predicts their sex. You try a few models and get 70% accuracy (against a 50% baseline of simply guessing that everyone is female). How could you know whether that is all the cookie data can give you, i.e. that 70% is roughly the best accuracy achievable with that kind of data for this prediction task?
My intuition is that the stronger the relationship between the independent and dependent variables, the higher the accuracy an ML model can achieve. But to translate this into a number I can quote when selling ML projects to clients, I have to evaluate my models on their data. How can I tell whether my models have reached the upper limit that is possible?
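One heuristic I have tried (a sketch, not a way to measure the true ceiling, which is the unobservable Bayes error rate): plot a learning curve. If cross-validated accuracy has flattened out as the training set grows, more data of the same kind is unlikely to help, and the remaining gap is either irreducible noise or a sign that a different model class is needed. The snippet below uses synthetic data from `make_classification` as a stand-in for the cookie dataset, and logistic regression as a stand-in for whatever model is actually used; both are assumptions for illustration.

```python
# Heuristic: if CV accuracy stops improving as training size grows,
# the model may be near the ceiling achievable with these features.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import learning_curve

# Synthetic stand-in for the 10,000-person cookie dataset (assumption).
X, y = make_classification(n_samples=10_000, n_features=20,
                           n_informative=5, random_state=0)

# Fit the same model on growing fractions of the data, 5-fold CV each time.
sizes, train_scores, val_scores = learning_curve(
    LogisticRegression(max_iter=1000), X, y,
    train_sizes=np.linspace(0.1, 1.0, 5), cv=5, scoring="accuracy")

mean_val = val_scores.mean(axis=1)
for n, acc in zip(sizes, mean_val):
    print(f"n={n:5d}  cv accuracy={acc:.3f}")

# A flat tail suggests the data, not the sample size, is the bottleneck;
# a still-rising curve suggests collecting more examples could help.
```

Note that a plateau only bounds what *this* model class can extract; a more flexible model (gradient boosting, a neural net) might still do better, so in practice I would run the same curve for two or three model families before telling a client the data is exhausted.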