Average precision when not all the relevant documents are found

Question

I can't find a proper source on the Internet that explains this.

I have built a search engine that, for a particular query, retrieves 5 of the 10 relevant documents.

When I calculate the average precision, I sum Precision@k over every rank k at which the retrieved document is relevant. At this stage, should I divide by 10 or by 5?

score 2 · Accepted Answer · answered May 10 '15 at 14:25

Average Precision should be divided by 10 and not 5.

Formula from Manning's Introduction to Information Retrieval

$$ MeanAveragePrecision(Q) = \frac{1}{|Q|} \sum_{j=1}^{|Q|}\frac{1}{m_j} \sum_{k=1}^{m_j} Precision(R_{jk}) $$

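for a query $q_j \in Q$ with relevant documents $\{d_1, \ldots, d_{m_j}\}$, where $R_{jk}$ is the set of ranked results from the top result down to document $d_k$. You are only looking for the Average Precision, which is the inner part of the formula for a single query: $\frac{1}{m_j} \sum_{k=1}^{m_j} Precision(R_{jk})$. You can see that the sum of precisions is divided by $m_j$, the total number of relevant documents, not the number of relevant documents retrieved. A relevant document that is never retrieved contributes a precision of 0 to the sum.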

Wikipedia's Average Precision page has the formula

$$ AveragePrecision = \frac {\sum_{k=1}^{n} P(k) \times rel(k)}{numberOfRelevantDocuments} $$

where $P(k)$ is precision@k and $rel(k)$ is an indicator function equal to 1 if the item at rank $k$ is a relevant document and 0 otherwise. It also says that the average "is over all relevant documents and the relevant documents not retrieved get a precision score of zero".
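To make the difference concrete, here is a minimal Python sketch of the calculation. The function name `average_precision` and the example ranking are assumptions made for illustration (the question does not say at which ranks the 5 relevant documents appear), so treat the numbers as a hypothetical case rather than your actual result.

```python
def average_precision(ranked_relevance, total_relevant):
    """Average precision for a single query.

    ranked_relevance: 0/1 flags in rank order, 1 if the result at that rank is relevant.
    total_relevant: number of relevant documents that exist for the query,
                    whether or not they were retrieved.
    """
    if total_relevant == 0:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for k, rel in enumerate(ranked_relevance, start=1):
        if rel:
            hits += 1
            precision_sum += hits / k  # Precision@k, counted only at relevant ranks
    # Unretrieved relevant documents add 0 to the sum but still count in the divisor.
    return precision_sum / total_relevant


# Hypothetical ranking: 10 results, relevant documents at ranks 1, 3, 4, 7 and 10.
ranking = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1]

ap = average_precision(ranking, total_relevant=10)          # divide by all 10 relevant documents
ap_by_retrieved = average_precision(ranking, total_relevant=5)  # dividing by 5 inflates the score

print(round(ap, 4))               # 0.3488
print(round(ap_by_retrieved, 4))  # 0.6976
```

With this ranking the sum of precisions is 1 + 2/3 + 3/4 + 4/7 + 5/10, so dividing by 5 instead of 10 doubles the score and hides the fact that half of the relevant documents were never found.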

I have seen a few examples in university homework solutions, and they have been pretty inconsistent on this. Is there any source other than Wikipedia? — ramborambo, May 10 '15 at 16:06
@ramborambo I believe Manning's definition and Wikipedia's definition are equivalent. Are they not? — Eric Farng, May 10 '15 at 20:52
