I have read "Yolo Loss function explanation", but none of the answers there discuss how the box confidence scores are computed. The YOLO paper uses the following loss function:
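For reference, this is the loss as I understand it from the paper, transcribed by hand, so the notation may differ slightly from the original. Here $S^2$ is the number of grid cells, $B$ the number of boxes predicted per cell, and $\mathbb{1}_{ij}^{obj}$ indicates that box $j$ in cell $i$ is responsible for an object:

$$
\begin{aligned}
&\lambda_{coord} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{obj} \left[ (x_i - \hat x_i)^2 + (y_i - \hat y_i)^2 \right] \\
&+ \lambda_{coord} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{obj} \left[ \left(\sqrt{w_i} - \sqrt{\hat w_i}\right)^2 + \left(\sqrt{h_i} - \sqrt{\hat h_i}\right)^2 \right] \\
&+ \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{obj} \left( C_i - \hat C_i \right)^2 \\
&+ \lambda_{noobj} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{noobj} \left( C_i - \hat C_i \right)^2 \\
&+ \sum_{i=0}^{S^2} \mathbb{1}_{i}^{obj} \sum_{c \in classes} \left( p_i(c) - \hat p_i(c) \right)^2
\end{aligned}
$$

My question concerns only the third and fourth lines, the two confidence terms.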
I'm confused about how the confidence scores $\hat C_i$ and $C_i$ are obtained, where $i$ indexes the grid cell. For $C_i$, the paper seems to imply that $C_i = IOU_{pred}^{truth}$. This confuses me because the predicted box changes as training progresses, so does that mean the target is not a constant?
Is $\hat C_i$ output directly by the network? The paper defines the predicted confidence score as $P(Object) \cdot IOU_{pred}^{truth}$, which should ideally equal $C_i = IOU_{pred}^{truth}$ when an object is in the cell. However, I'm not sure whether this is just an interpretation of the scores (for Equation 1 in the paper) or whether we are supposed to estimate $P(Object)$ separately first in order to find $\hat C_i$.
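To make the reading I am asking about concrete, here is a minimal Python sketch of one possible interpretation. I am not sure it is what the paper intends. It assumes that $\hat C_i$ is a raw network output, that the target $C_i$ is recomputed at each step as the IoU between the current predicted box and the ground-truth box and then treated as a constant, and that $C_i = 0$ for cells without an object. The function names `iou` and `confidence_term` are my own (hypothetical), boxes are assumed to be `(x_center, y_center, width, height)`, and $\lambda_{noobj} = 0.5$ is the value the paper uses.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x_center, y_center, w, h)."""
    # Convert centre/size format to corner coordinates.
    ax1, ay1 = box_a[0] - box_a[2] / 2, box_a[1] - box_a[3] / 2
    ax2, ay2 = box_a[0] + box_a[2] / 2, box_a[1] + box_a[3] / 2
    bx1, by1 = box_b[0] - box_b[2] / 2, box_b[1] - box_b[3] / 2
    bx2, by2 = box_b[0] + box_b[2] / 2, box_b[1] + box_b[3] / 2

    # Overlap along each axis, clamped at zero for disjoint boxes.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    union = box_a[2] * box_a[3] + box_b[2] * box_b[3] - inter
    return inter / union if union > 0 else 0.0


def confidence_term(pred_conf, pred_box, true_box, has_object, lambda_noobj=0.5):
    """One (C_i - C_hat_i)^2 term of the loss under the assumed interpretation."""
    if has_object:
        # Assumption: the target is the current IoU, treated as a constant
        # (no gradient would flow through it in a real framework).
        target = iou(pred_box, true_box)
        return (target - pred_conf) ** 2
    # Assumption: cells without an object have target confidence 0.
    return lambda_noobj * (0.0 - pred_conf) ** 2


if __name__ == "__main__":
    pred_box = (1.0, 0.5, 1.0, 1.0)
    true_box = (0.5, 0.5, 1.0, 1.0)
    print(iou(pred_box, true_box))
    print(confidence_term(0.5, pred_box, true_box, has_object=True))
    print(confidence_term(0.5, pred_box, None, has_object=False))
```

Under this reading the target does move as the predicted box moves, which is exactly the part I would like confirmed or corrected.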