The Jaccard coefficient does allow for sets of different sizes, but its interpretation becomes less intuitive in that case.
Here is an implementation in R:
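    jaccard <- function(a, b) {
      # Deduplicate first so repeated elements do not inflate the counts
      a <- unique(a)
      b <- unique(b)
      n_intersection <- length(intersect(a, b))
      # |A ∪ B| = |A| + |B| - |A ∩ B|
      n_union <- length(a) + length(b) - n_intersection
      n_intersection / n_union
    }
    jaccard(a, b)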
And here is one for Python:
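    def jaccard(list1, list2):
        # Convert to sets so duplicate elements are counted only once
        set1, set2 = set(list1), set(list2)
        intersection = len(set1 & set2)
        # |A ∪ B| = |A| + |B| - |A ∩ B|
        union = len(set1) + len(set2) - intersection
        return float(intersection) / union

    jaccard(a, b)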
(code from here)
Another option is the Sørensen–Dice coefficient. The numerator is twice the size of the intersection, and the denominator is the sum of the cardinalities of the two sets. To apply it, just change the two functions above accordingly.
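As a minimal sketch of that change in Python (the function name `dice` and the handling of two empty inputs are my own assumptions, not part of the original code):

```python
def dice(list1, list2):
    """Sørensen–Dice coefficient: 2|A ∩ B| / (|A| + |B|), computed on sets."""
    set1, set2 = set(list1), set(list2)
    total = len(set1) + len(set2)
    if total == 0:
        # Assumption: two empty sets are treated as identical
        return 1.0
    return 2 * len(set1 & set2) / total

# Two shared elements out of 4 + 4: 2*2 / 8
print(dice([1, 2, 3, 4], [3, 4, 5, 6]))  # 0.5
```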
The overlap coefficient, or Szymkiewicz–Simpson coefficient, is an alternative that does not care about different set sizes: it divides the size of the intersection by the size of the smaller set. As long as one set is a subset of the other, the coefficient is one. To apply it, just change the two functions above accordingly.
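A possible Python version, again only a sketch (the name `overlap` and the choice to return 0 when either input is empty are assumptions on my part):

```python
def overlap(list1, list2):
    """Overlap coefficient: |A ∩ B| / min(|A|, |B|), computed on sets."""
    set1, set2 = set(list1), set(list2)
    smaller = min(len(set1), len(set2))
    if smaller == 0:
        # Assumption: the coefficient is taken to be 0 if either set is empty
        return 0.0
    return len(set1 & set2) / smaller

# The first set is fully contained in the second one
print(overlap([1, 2], [1, 2, 3, 4, 5]))  # 1.0
```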
Disregarding the size of the larger set might not be ideal, though. Since the index only measures the proportion of the smaller set that is contained in the larger one, its value is the same regardless of the size of the latter (10, 1000, 1000000).
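To see this effect, one could compare the two indices on a toy case where a set of 10 elements is contained in larger and larger sets. The compact set-based helpers below are illustrative and assume non-empty inputs:

```python
def jaccard(a, b):
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)

def overlap(a, b):
    a, b = set(a), set(b)
    return len(a & b) / min(len(a), len(b))

small = range(10)  # always contained in the larger set below
for n in (10, 1000, 1000000):
    large = range(n)
    print(n, overlap(small, large), jaccard(small, large))
```

The overlap coefficient stays at one for every size, while the Jaccard coefficient shrinks as the larger set grows.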
I'm sure there are many more metrics, and the comment by @whuber is correct: which one to use depends on what you are after.