Suppose we have a set of key value pairs, such that the keys can be ordered based on the value . We can easily extract and report the top $n$ keys. Now suppose the we havet S such sets. From each set we are supposed to extract a set of $p$ top keys and merge them to get the top $n$ keys from all the sets. How large should $p$ be? What is the relationship between the $S$, $n$ and $p$ and the error?
I wanted to know if there is any literature on this subject.
Clarification: Its possible to have the same keys in multiple sets. The merge and find top n in the ideal case would construct a combined map of key-value pairs (by adding all values corresponding to the same key) and finding the top n values.
In my data the distribution of values in each set and the combined set has a long tail.