ROUGE seems to be the standard way of evaluating the quality of machine-generated summaries of text documents by comparing them with reference summaries (human-generated). $$ROUGE_{n}= \frac {\sum_{s\in \textrm{Ref Summaries} } \sum_{gram_{n}\in s}{Count_{match}(gram_{n})}}{\sum_{s\in \textrm{Ref Summaries} } \sum_{gram_{n}\in s}{Count(gram_{n})}}$$
Based on the formula above, ROUGE checks only for recall, so I could just generate a summary that is the concatenation of all reference summaries and get a perfect ROUGE score (see the sketch below).
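To make that concrete, here is a minimal sketch of ROUGE-N recall as written in the formula above, assuming simple whitespace tokenization and clipped match counts; the function and variable names (`rouge_n_recall`, `ngrams`, the toy references) are just my own illustration, not any standard library. Concatenating all reference summaries makes $Count_{match}$ equal $Count$ for every n-gram, so the score comes out as a perfect 1.0:

```python
from collections import Counter


def ngrams(tokens, n):
    """Multiset of n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate, references, n=2):
    """ROUGE-N as in the formula above: matched n-grams (clipped to the
    candidate's counts) over the total n-grams in all reference summaries."""
    cand = ngrams(candidate.lower().split(), n)
    matched, total = 0, 0
    for ref in references:
        ref_counts = ngrams(ref.lower().split(), n)
        total += sum(ref_counts.values())
        # Count_match: each reference n-gram is credited at most as many
        # times as it occurs in the candidate.
        matched += sum(min(c, cand[g]) for g, c in ref_counts.items())
    return matched / total if total else 0.0


references = ["the cat sat on the mat", "a cat was sitting on the mat"]

# An ordinary candidate only gets partial credit.
print(rouge_n_recall("the cat sat on a mat", references))

# Concatenating the references covers every reference n-gram, so recall is 1.0
# no matter how long (and useless as a summary) the output is.
print(rouge_n_recall(" ".join(references), references))
```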
Does ROUGE therefore always have to be considered alongside some precision-related metric (e.g. BLEU, or a cap on the summary length)?