It depends on where you apply the window function. If you do it in the time domain, it's because you only want to analyze the periodic behavior of the function in a short duration. You do this when you don't believe that your data is from a stationary process.
If you do it in the frequency domain, then you do it to isolate a specific set of frequencies for further analysis; you do this when you believe that (for instance) high-frequency components are spurious.
The first three chapters of "A Wavelet Tour of Signal Processing" by Stephane Mallat have an excellent introduction to signal processing in general, and chapter 4 goes into a very good discussion of windowing and time-frequency representations in both continuous and discrete time, along with a few worked-out examples.