I'm building an algorithm in R to calculate the Shapley value for players in a cooperative game. However, I do not have an outcome value for every possible coalition, partly because the number of players is relatively high (in the hundreds or thousands), so the observations will never cover all coalitions, let alone all permutations.
My question is: If I calculate the Shapley Value for players just using the information available, how valid is this in the context of fair distribution of credit? Also, is there a method to calculate how far from optimal distribution of credit I might be?
If it is not reasonable to use the standard calculation of Shapley Values with incomplete information, is there an algorithm that is more appropriate?
[EDIT - added a worked example]
A worked example using 3 players, calculated by hand:
ALL coalitions observed, with values:
A -> 5
B -> 8
C -> 3
AB -> 9
AC -> 8
BC -> 15
ABC -> 18
Permutations and the marginal value added by each player (columns in the order A, B, C):
A,B,C: 5, 4, 9
A,C,B: 5, 10, 3
B,A,C: 1, 8, 9
B,C,A: 3, 8, 7
C,A,B: 5, 10, 3
C,B,A: 3, 12, 3
Shapley values (column means):
A = 3.67
B = 8.67
C = 5.67
One coalition (AB) NEVER observed, with values:
A -> 5
B -> 8
C -> 3
AB -> ? (never seen, so let's use a marginal value of 0 for whichever player completes it)
AC -> 8
BC -> 15
ABC -> 18
Permutations and the marginal value added by each player (columns in the order A, B, C):
A,B,C: 5, 0, 13
A,C,B: 5, 10, 3
B,A,C: 0, 8, 10
B,C,A: 3, 8, 7
C,A,B: 5, 10, 3
C,B,A: 3, 12, 3
Shapley values (column means):
A = 3.5
B = 8.0
C = 6.5
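To make the manual procedure above concrete, here is a minimal R sketch of it. The function names (`permutations`, `shapley_exact`) and the lookup-by-name layout are my own illustrative choices, not from any package. The rule for an unobserved coalition is the assumption used in the example: the player who completes it is credited with a marginal value of 0. It enumerates every ordering, so it is only usable for a handful of players.

```r
# Coalition values keyed by the sorted, concatenated player names.
# (Assumes single-character player names, as in the example.)
v_full <- c(A = 5, B = 8, C = 3, AB = 9, AC = 8, BC = 15, ABC = 18)
v_missing <- v_full[names(v_full) != "AB"]  # AB never observed

# All orderings of a vector; grows as n!, so for tiny games only.
permutations <- function(x) {
  if (length(x) <= 1) return(list(x))
  out <- list()
  for (i in seq_along(x)) {
    for (p in permutations(x[-i])) {
      out[[length(out) + 1]] <- c(x[i], p)
    }
  }
  out
}

# Average each player's marginal contribution over all orderings.
shapley_exact <- function(v, players) {
  key <- function(s) paste(sort(s), collapse = "")
  perms <- permutations(players)
  total <- setNames(numeric(length(players)), players)
  for (p in perms) {
    prev <- 0  # value of the empty coalition
    for (k in seq_along(p)) {
      cur <- unname(v[key(p[seq_len(k)])])
      # Assumption: an unobserved coalition inherits the value of the
      # coalition it was built from, i.e. a marginal contribution of 0.
      if (is.na(cur)) cur <- prev
      total[p[k]] <- total[p[k]] + (cur - prev)
      prev <- cur
    }
  }
  total / length(perms)
}

players <- c("A", "B", "C")
phi_full <- shapley_exact(v_full, players)
phi_missing <- shapley_exact(v_missing, players)
```

As long as the grand coalition ABC is observed, both allocations still sum to its value, so the missing coalition only changes how the total is split, not the total itself.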
As you can see, C gets a somewhat larger share of the spoils in the second example, at the expense of A and B. This seems intuitively OK: the coalition AB never occurs, so C is working harder across all observed coalitions to produce value. My question is, is this mathematically reasonable as well? And does this reasonableness extend to, say, half of all possible coalitions never being observed?
As an addition, if anyone can point to some R code for calculating Shapley Values for large numbers of players that would also be appreciated.
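To show the kind of thing I have in mind for large games: enumerating all orderings is impossible with hundreds of players, so a common workaround is to average marginal contributions over randomly sampled orderings. Below is a minimal sketch of that idea. `shapley_sampled` and `value_fn` are hypothetical names of my own, not taken from any package. It assumes `value_fn` can return a value for any coalition, so whatever rule is used for unobserved coalitions would have to live inside it.

```r
# Monte Carlo approximation: average marginal contributions over
# randomly sampled orderings instead of all n! permutations.
# players must be a character vector of player names.
shapley_sampled <- function(value_fn, players, n_samples = 1000) {
  total <- setNames(numeric(length(players)), players)
  for (s in seq_len(n_samples)) {
    p <- sample(players)  # one random ordering of the players
    prev <- 0             # value of the empty coalition
    for (k in seq_along(p)) {
      cur <- value_fn(p[seq_len(k)])
      total[p[k]] <- total[p[k]] + (cur - prev)
      prev <- cur
    }
  }
  total / n_samples
}

# Toy check on an additive game, where each player's Shapley value
# is simply that player's own weight.
set.seed(1)
weights <- setNames(runif(200), paste0("p", 1:200))
additive_game <- function(s) sum(weights[s])
est <- shapley_sampled(additive_game, names(weights), n_samples = 50)
```

Each sampled ordering costs one call to `value_fn` per player, so the run time is roughly `n_samples` times the number of players. For games that are not additive the estimate is noisy, and more samples would be needed.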
[To give a hint of my level - I'm not a hard core statistician (but have a Th. Physics degree, some time ago!), but understand the maths and stats behind calculating Shapley Values pretty well]
Many thanks,
Andy.