Let's say we're building a spam classifier. When we feed it an email, it classifies it correctly as spam or not-spam 98% of the time. Then we discover that 99% of the email we receive is actually spam. Suddenly the 98% accuracy doesn't look so good anymore.
"On skewed datasets (e.g., when there are more positive examples than negative examples), accuracy is not a good measure of performance and you should instead use the F score, which is based on precision and recall." (Week 6 of Andrew Ng's Machine Learning on Coursera)
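For reference, the formula the course is referring to is (I assume) the usual F1 score, the harmonic mean of precision $P$ and recall $R$:

$$F_1 = 2 \cdot \frac{P \cdot R}{P + R}$$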
I can see the value of looking at precision and recall separately, but I don't see the value of combining them with a non-intuitive formula to replace accuracy as our primary overall measure of how the learning algorithm is performing. If 99% is the accuracy that a simple "always guess yes" algorithm can achieve, can't we just use that as a baseline and say that our algorithm is doing well when it achieves considerably better accuracy (99.76%, for instance)?
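To make the numbers concrete, here is a quick sketch of what that baseline scores (the numbers are assumed: 1000 emails, 99% spam, and the rare not-spam class treated as the positive class, which is how the course labels its skewed-class example, if I remember right):

```python
# Quick sketch: 1000 emails, 990 spam, 10 not-spam.
# Label convention assumed here: y = 1 for the rare class (not-spam),
# y = 0 for spam -- so "always guess yes (spam)" predicts 0 every time.

def scores(y_true, y_pred):
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, precision, recall, f1

y_true = [0] * 990 + [1] * 10      # 99% of the mail is spam
always_spam = [0] * 1000           # the trivial "always guess yes" baseline

print(scores(y_true, always_spam))
# -> (0.99, 0.0, 0.0, 0.0): 99% accuracy, but precision, recall and F1 are all 0
```

So the F score clearly flags the degenerate baseline, but simply comparing accuracy against the baseline's 99% would flag it too, which is what prompts my question.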
Accuracy has the obvious benefit of being simple and intuitive. What advantages does the F-score hold over accuracy in our spam-classifier example?