I ran some tests in OpenGL with multisampling, by requesting a multisample buffer for the default framebuffer. I used forward shading for rendering. From what I've observed, the more primitives there are on screen, the larger the performance hit from multisampling. Is this because the built-in MSAA does its multisampling work in the fragment shader for every primitive we submit, even if its fragments end up discarded by depth testing? To avoid this problem we could use deferred shading, but then we would have to do the multisampling ourselves.
Does this mean that, in practice, the built-in MSAA is pretty much never used for scenes with overlapping objects?
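
To make the hypothesis concrete, here is a toy count of fragment shader invocations for a single pixel. It is only a sketch of the cost model I have in mind, not a description of what any driver actually does. The function names and the `per_sample_shading` and `early_z` switches are hypothetical, every primitive is assumed to cover the whole pixel, and the cost of the deferred geometry pass is ignored. Whether the hardware really behaves like the per-sample case is exactly what I am unsure about.

```python
def forward_invocations(draw_depths, samples, per_sample_shading, early_z):
    """Count fragment shader invocations for one pixel under forward shading.

    draw_depths: depths of the primitives covering the pixel, in draw order
                 (a smaller value is closer to the camera).
    samples: number of MSAA samples per pixel.
    per_sample_shading: assumption that the shader runs once per sample.
    early_z: assumption that occluded fragments are rejected before shading.
    """
    nearest = float("inf")
    invocations = 0
    for depth in draw_depths:
        visible = depth < nearest
        if visible:
            nearest = depth
        # Without early depth rejection, hidden fragments are shaded too.
        if visible or not early_z:
            invocations += samples if per_sample_shading else 1
    return invocations


def deferred_invocations(samples, per_sample_shading):
    """Lighting-pass invocations for one pixel under deferred shading.

    Overdraw does not matter here: only the surviving surface is shaded.
    """
    return samples if per_sample_shading else 1


if __name__ == "__main__":
    depths = [0.9, 0.5, 0.7, 0.2]  # four overlapping primitives
    print(forward_invocations(depths, 4, per_sample_shading=True, early_z=False))
    print(forward_invocations(depths, 4, per_sample_shading=False, early_z=True))
    print(deferred_invocations(4, per_sample_shading=True))
```

Under these assumptions the forward count grows with the number of overlapping primitives, while the deferred count depends only on the sample count, which is the behaviour I am asking about.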