ROUGE-N for multiple references

Question

In the original paper that introduced ROUGE (https://aclanthology.org/W04-1013.pdf), the description of ROUGE-N is rather unclear about the case where multiple references are used.

First, it presents ROUGE-N as $$ \text{ROUGE-N} = \frac{\sum_{S \in \text{references}} \sum_{\text{gram}_n \in S} \text{Count}_{match}(\text{gram}_n)}{\sum_{S \in \text{references}} \sum_{\text{gram}_n \in S} \text{Count}(\text{gram}_n)} $$ and defines $\text{Count}_{match}(\text{gram}_n)$ as the maximum number of n-grams co-occurring in a candidate summary and a set of reference summaries. So it seems that the count is meant to be taken over all references together, not pairwise.
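To make this first reading concrete, here is a minimal sketch of the pooled formula. It rests on assumptions that are mine, not the paper's: inputs are pre-tokenized lists, $\text{Count}_{match}$ is taken as the clipped count (the minimum of the n-gram's count in the candidate and in the reference $S$), and the names `ngrams` and `rouge_n_pooled` are made up for illustration.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams (as tuples) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_pooled(candidate, references, n):
    """ROUGE-N with numerator and denominator summed over all references.

    Assumption: Count_match is the clipped count min(candidate, reference).
    """
    cand_counts = ngrams(candidate, n)
    matched = 0
    total = 0
    for ref in references:
        ref_counts = ngrams(ref, n)
        # Numerator: matches against this reference, clipped by the candidate.
        matched += sum(min(count, cand_counts[gram])
                       for gram, count in ref_counts.items())
        # Denominator: every n-gram occurrence in this reference.
        total += sum(ref_counts.values())
    return matched / total if total else 0.0


candidate = "the cat sat".split()
references = ["the cat sat down".split(), "a dog ran".split()]
print(rouge_n_pooled(candidate, references, 1))
```

With these toy inputs the first reference contributes 3 matches out of 4 unigrams and the second 0 out of 3, so the pooled score is 3/7.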

On the other hand, the author starts the next section (2.1 Multiple References) with: "So far, we only demonstrated how to compute ROUGE-N using a single reference."

So the author writes as if multiple references had not been discussed before, yet the formula clearly sums over them.

The multiple-reference case is then described as follows: we compute pairwise summary-level ROUGE-N between a candidate summary $s$ and every reference, $r_i$, in the reference set. We then take the maximum of pairwise summary-level ROUGE-N scores.

The paper then expresses this process with the formula $$ \text{ROUGE-N}_{\text{multi}} = \operatorname{argmax}_i \text{ROUGE-N}(r_i,s) $$

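But these descriptions and formulas are inconsistent: the first formula sums over all references even though it is said to cover the single-reference case, and the second uses argmax (which would return an index) where the text describes a maximum. I believe the formulas should rather be $$ \text{ROUGE-N}(r,s) = \frac{\sum_{\text{gram}_n \in r} \text{Count}_{match}(\text{gram}_n, r, s)}{\sum_{\text{gram}_n \in r} \text{Count}(\text{gram}_n, r)} $$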

$$ \text{ROUGE-N}_{\text{multi}} = \max_{r \in R} \text{ROUGE-N}(r,s) $$ where $\text{Count}_{match}$ is computed against a single reference only and $R$ is the set of references. Am I right?
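For comparison, this is how I would sketch the pairwise-maximum reading. The same caveats apply: token lists as input, $\text{Count}_{match}$ assumed to be the clipped count against one reference, and `count_ngrams`, `rouge_n_single` and `rouge_n_multi` are hypothetical names, not taken from the paper or from any ROUGE package.

```python
from collections import Counter


def count_ngrams(tokens, n):
    """Count the n-grams (as tuples) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_single(candidate, reference, n):
    """ROUGE-N recall of a candidate against one reference.

    Assumption: Count_match is the clipped count min(candidate, reference).
    """
    cand_counts = count_ngrams(candidate, n)
    ref_counts = count_ngrams(reference, n)
    total = sum(ref_counts.values())
    matched = sum(min(count, cand_counts[gram])
                  for gram, count in ref_counts.items())
    return matched / total if total else 0.0


def rouge_n_multi(candidate, references, n):
    """Maximum of the pairwise scores over the reference set R."""
    return max((rouge_n_single(candidate, ref, n) for ref in references),
               default=0.0)


candidate = "the cat sat".split()
references = ["the cat sat down".split(), "a dog ran".split()]
print(rouge_n_multi(candidate, references, 1))
```

With these toy inputs the pairwise scores are 3/4 and 0, so the maximum is 0.75, whereas pooling numerator and denominator over both references would give 3/7. The two readings really do produce different numbers.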
