Because of the No Free Lunch theorem, no machine learning algorithm can be said to be universally better than any other. In practice, however, some algorithms almost always outperform others; for example, random forests and other tree-based ensemble methods tend to beat other algorithms on real-world tasks, see this paper.
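To make the empirical claim concrete, here is a minimal sketch (assuming scikit-learn is installed) that compares a random forest with logistic regression on a synthetic classification task; the dataset and hyperparameters are my own illustrative choices, not from the paper:

```python
# Hedged sketch: compare a tree ensemble with a linear baseline on one
# synthetic task. This illustrates the kind of comparison the paper runs
# at scale; a single dataset proves nothing by itself.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic data with some informative and some noise features.
X, y = make_classification(
    n_samples=2000, n_features=20, n_informative=10, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0
)

for model in (
    RandomForestClassifier(random_state=0),
    LogisticRegression(max_iter=1000),
):
    acc = accuracy_score(y_test, model.fit(X_train, y_train).predict(X_test))
    print(type(model).__name__, round(acc, 3))
```

On tasks like this, with nonlinear structure in the informative features, the forest usually scores higher; the NFL theorem only says this advantage cannot hold when averaged over *all* conceivable labelings of the inputs.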
One reply to this seemingly conflicting situation is that the No Free Lunch theorem holds only when we average over all possible data-generating distributions, whereas in real life we encounter only a small subset of them. So my question is: what are the properties of these real-life probability distributions?