Part of your misunderstanding comes from the fact that there are many ways to implement the radar signal processing chain. Depending on the type of radar, targets of interest, hardware, etc., some methods are more appropriate than others. We will consider pulse-Doppler radar here.
In the chain you describe:
In modern pulse-Doppler systems using pulse compression, the act of performing pulse compression is virtually synonymous with matched filtering. You can treat the two terms as meaning the same thing and simplify your thinking. You are requiring that range detection happen before Doppler processing, but this need not be the case. You can very well detect a target using its Doppler information only while being denied range. It is mainly during acquisition of the target that being denied either of the measurement dimensions is truly detrimental. In the tracking phase, you can "track through": keep updating the target's velocity from Doppler alone until good range measurements become available again.
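To make the "pulse compression is matched filtering" point concrete, here is a minimal NumPy sketch. Everything in it is an assumption for illustration: the waveform (a baseband linear-FM chirp), the sample rate, the noise level, and the helper names `lfm_chirp` and `pulse_compress` are not taken from your system. It simply correlates the received samples with the transmitted pulse and reads the delay off the peak.

```python
import numpy as np

def lfm_chirp(bandwidth_hz, pulse_width_s, fs_hz):
    """Complex baseband linear-FM pulse (hypothetical helper, illustration only)."""
    n = int(round(pulse_width_s * fs_hz))
    t = np.arange(n) / fs_hz
    chirp_rate = bandwidth_hz / pulse_width_s
    # Frequency sweeps from -B/2 to +B/2 over the pulse duration.
    return np.exp(1j * np.pi * chirp_rate * (t - pulse_width_s / 2) ** 2)

def pulse_compress(rx, tx_pulse):
    """Matched filter: correlate the received samples with the transmitted pulse.

    np.correlate conjugates its second argument, so output sample k is
    sum_n rx[n + k] * conj(tx_pulse[n]), i.e. the matched-filter output at lag k.
    """
    return np.correlate(rx, tx_pulse, mode="valid")

# Assumed parameters, chosen only to make the example small.
fs = 10e6                             # sample rate, Hz
pulse = lfm_chirp(2e6, 20e-6, fs)     # 2 MHz sweep over 20 us -> 200 samples
delay = 300                           # true echo delay, in samples

# One received PRI: a delayed copy of the pulse plus complex white noise.
rng = np.random.default_rng(0)
rx = np.zeros(1000, dtype=complex)
rx[delay:delay + pulse.size] = pulse
rx += 0.1 * (rng.standard_normal(rx.size) + 1j * rng.standard_normal(rx.size))

compressed = pulse_compress(rx, pulse)
peak_index = int(np.argmax(np.abs(compressed)))
print("estimated delay (samples):", peak_index)
```

Each sample of `compressed` is one range gate, so the index of the peak is the range measurement for this pulse. The 200-sample pulse collapses into a narrow peak, which is the "compression".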
In addition, the system you describe is a more classical approach that uses a Doppler filter bank to separate clutter and to determine the target's radial velocity. Many radar systems require that a signal make it past the Doppler filter bank before a measurement is even considered. This is why the block diagram you show has the "detection" block coming after the "Doppler filtering" block.
What the block diagram suggests:
The block diagram can represent two main schemes:
1. Single echoes that make it past the filter bank are considered for measurement. If a target appears at the output of one of these filters (i.e. with adequate SNR and mitigated clutter), you can declare a detection where range and Doppler are measured simultaneously. Multiple pulses may be integrated to improve SNR.
2. The radar system performs Doppler processing by forming a range-Doppler map. This map can be seen as a 2D array formed by collecting the matched-filtered returns from multiple pulse-repetition intervals (PRIs). Once a certain number of matched-filtered returns are collected (usually a power of 2), a DFT is performed along what is called the "slow-time" dimension. In other words, for every range gate a DFT is taken across the pulses (a range gate is equivalent to one sample of the matched-filter output). After this step, you have a 2D map where one axis is the delay (via the matched filter) and the other axis is the Doppler frequency. Detection then amounts to finding the cell(s) that exceed a certain threshold, and you have performed range and Doppler measurements simultaneously by virtue of simply picking the cell. Hence, you use the range-Doppler map to discriminate using Doppler and perform detection at the same time! You can go further and use both measurement dimensions to discriminate between possible detections. One application is keeping track of an airborne radar's altitude return.
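The range-Doppler scheme can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the PRF, array sizes, target parameters, and noise level are made up; the Hann taper across slow time is my addition to keep Doppler sidelobes down; and the threshold (a multiple of the median cell power) is a crude stand-in for whatever detector a real system would use, such as CFAR. The helper names `range_doppler_map` and `detect` are hypothetical.

```python
import numpy as np

def range_doppler_map(pulses):
    """Slow-time DFT of matched-filter outputs.

    pulses: complex array of shape (num_pulses, num_range_gates), one row per PRI.
    Returns the map with zero Doppler shifted to the center of axis 0.
    """
    # Taper across pulses (assumption, not required by the scheme itself).
    window = np.hanning(pulses.shape[0])[:, None]
    rd = np.fft.fft(pulses * window, axis=0)   # DFT across pulses, per range gate
    return np.fft.fftshift(rd, axes=0)

def detect(rd_map, threshold):
    """Return (doppler_bin, range_gate) index pairs whose power exceeds threshold."""
    power = np.abs(rd_map) ** 2
    return np.argwhere(power > threshold)

# Assumed scenario: 64 pulses, 128 range gates, one target.
num_pulses, num_gates = 64, 128
prf_hz = 10e3
target_gate, target_doppler_hz = 40, 2500.0

rng = np.random.default_rng(1)
pulses = 0.1 * (rng.standard_normal((num_pulses, num_gates))
                + 1j * rng.standard_normal((num_pulses, num_gates)))
m = np.arange(num_pulses)
# The target's phase advances from pulse to pulse at its Doppler frequency.
pulses[:, target_gate] += np.exp(2j * np.pi * target_doppler_hz * m / prf_hz)

rd_map = range_doppler_map(pulses)
power = np.abs(rd_map) ** 2
doppler_axis_hz = np.fft.fftshift(np.fft.fftfreq(num_pulses, d=1.0 / prf_hz))

# Crude threshold: 50x the median cell power (assumption, not CFAR).
detections = detect(rd_map, 50.0 * np.median(power))

# Picking the strongest cell yields range and Doppler at the same time.
peak_doppler_bin, peak_gate = (int(i) for i in
                               np.unravel_index(np.argmax(power), power.shape))
peak_doppler_hz = float(doppler_axis_hz[peak_doppler_bin])
print("range gate:", peak_gate, "Doppler (Hz):", peak_doppler_hz)
```

Because of the taper, the target's main lobe also spills into the neighboring Doppler cells, so `detections` typically contains a small cluster of cells at the target's range gate rather than a single one. A real system would group these into one detection.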
These are just the basics! You can implement all sorts of filtering and signal processing algorithms on the range-Doppler map alone to get even more information about your environment and your targets of interest.