This is a broad topic, and you will encounter a range of reasons why data should be, or already is, bucketized. Not all of them relate to predictive accuracy.
First, here's an example where a modeler may want to bucketize. Suppose I'm building a credit scoring model: I want to know people's propensity to default on a loan. In my data, I have a column indicating the status of a credit report. That is, I ordered the report from a rating agency, and the agency returned, say, their proprietary score, along with a categorical variable indicating the reliability of this score. This indicator may be much more finely grained than I need for my purposes. For example, the "not enough information for a reliable score" status may be broken out into many classes like "less than 20 years of age", "recently moved to the country", "no prior credit history", etc. Many of these classes may be sparsely populated, and hence rather useless in a regression or other model. To deal with this, I may want to pool similar classes together to consolidate the statistical power into a "representative" class. For example, it may only be reasonable for me to use a binary indicator: "good information returned" vs. "no information returned". In my experience, many applications of bucketization fall into this general type: collapsing sparsely populated categories.
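To make this concrete, here is a minimal sketch of that pooling in pandas. The `report_status` column and the reason labels are hypothetical stand-ins for whatever the rating agency actually returns:

```python
import pandas as pd

# Hypothetical credit-report status column with sparse "no information" reasons.
df = pd.DataFrame({
    "report_status": [
        "score_returned", "score_returned", "under_20",
        "recent_immigrant", "no_credit_history", "score_returned",
    ]
})

# Pool every sparse "no information" reason into one representative class,
# leaving a binary indicator: good information vs. no information.
no_info_reasons = {"under_20", "recent_immigrant", "no_credit_history"}
df["report_status_pooled"] = df["report_status"].where(
    ~df["report_status"].isin(no_info_reasons), "no_information_returned"
)

print(df["report_status_pooled"].value_counts())
```

The same pattern works when the pooling rule is frequency-based rather than domain-based: replace the hand-picked set with the labels whose counts fall below some threshold.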
Some algorithms use bucketization internally. For example, the trees fit inside boosting algorithms often spend the majority of their time in a summarization step, where the continuous data in each node is discretized and the mean value of the response in each bucket is calculated. This greatly reduces the computational complexity of finding an appropriate split, without much sacrifice in accuracy, since subsequent boosting stages can correct for the coarseness of any one tree.
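Here is a toy sketch of that summarization step, roughly in the spirit of the histogram-based split finding used by libraries like LightGBM or XGBoost's hist mode; the function and variable names are mine, not any library's:

```python
import numpy as np

def best_split_via_histogram(x, y, n_bins=16):
    """Discretize x into bins, then scan bin boundaries as candidate splits."""
    # Bin edges from quantiles, so each bucket holds roughly equal mass.
    edges = np.quantile(x, np.linspace(0, 1, n_bins + 1)[1:-1])
    bins = np.searchsorted(edges, x)  # bucket index for every row

    # Per-bucket sufficient statistics: count and sum of the response.
    counts = np.bincount(bins, minlength=n_bins)
    sums = np.bincount(bins, weights=y, minlength=n_bins)

    # Scan the n_bins - 1 boundaries instead of all n - 1 raw thresholds.
    best_gain, best_bin = -np.inf, None
    total_n, total_s = counts.sum(), sums.sum()
    left_n = left_s = 0.0
    for b in range(n_bins - 1):
        left_n += counts[b]
        left_s += sums[b]
        right_n, right_s = total_n - left_n, total_s - left_s
        if left_n == 0 or right_n == 0:
            continue
        # Reduction in squared error from splitting at this boundary.
        gain = left_s**2 / left_n + right_s**2 / right_n - total_s**2 / total_n
        if gain > best_gain:
            best_gain, best_bin = gain, b
    return best_bin, best_gain
```

Scanning 15 bucket boundaries instead of thousands of raw thresholds is where the speedup comes from.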
You may also simply receive data pre-bucketized. Discrete data is easier to compress and store - a long array of floating point numbers is nigh incompressible, but when discretized into "high", "medium", and "low", you can save a lot of space in your database. Your data may also come from a source targeted at a non-modeling application. This tends to happen a lot when I receive data from organizations that do less analytical work. Their data is often used for reporting, and is summarized to a high level to help with the interpretability of the reports to laymen. This data can still be useful, but often some power is lost.
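The compression claim is easy to check for yourself; a quick illustration (the thresholds and sizes here are arbitrary, not a benchmark):

```python
import zlib
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=100_000)

raw = x.tobytes()
# Coarsen to three labels: low / medium / high.
labels = np.digitize(x, [-0.5, 0.5]).astype(np.uint8)

print(len(zlib.compress(raw)) / len(raw))               # close to 1.0
print(len(zlib.compress(labels.tobytes())) / len(raw))  # a tiny fraction
```

The random mantissas in the floats leave almost nothing for the compressor to exploit, while a stream drawn from three symbols compresses down to its (low) entropy.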
What I see less value in, though it's possible I may be corrected, is the pre-bucketization of continuous measurements for modeling purposes. There are plenty of very powerful methods for fitting non-linear effects to continuous predictors, and bucketization removes your ability to use these. I tend to see this as a bad practice.
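As an illustration of what gets thrown away, here is a hedged sketch contrasting a spline fit against the same predictor pre-bucketized into four bins; scikit-learn is assumed, and the simulated data is mine:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import KBinsDiscretizer, SplineTransformer

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(500, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.3, size=500)

# Smooth non-linear effect via a spline basis.
spline_model = make_pipeline(SplineTransformer(n_knots=8), LinearRegression())
# The same data, but pre-bucketized into four coarse bins.
binned_model = make_pipeline(
    KBinsDiscretizer(n_bins=4, encode="onehot-dense"), LinearRegression()
)

for name, model in [("spline", spline_model), ("binned", binned_model)]:
    model.fit(X, y)
    print(name, round(model.score(X, y), 3))
```

The spline can track the smooth sine curve; the four-bin model is stuck with a step function, and once the bucketization has happened upstream, no amount of downstream modeling gets the lost resolution back.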