Polyphonic music transcription does not currently appear to be a solved problem.
How about the inverse of a small portion of the problem? Are there any spectral characteristics (from an STFT) that can be used to eliminate some musical chords from the probability space? (E.g., this snippet of sound most likely does not contain any C# chord, or any kind of diminished minor chord, or this is a single note and not a chord, etc.)
Assume the audio snippet is more or less stationary (transient attack removed, etc.) and that overtones for most or all individual notes are very likely present. (This question is not about inverted chords.)
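
To make the kind of elimination being asked about concrete, here is a rough sketch, not a proposed solution. It folds the magnitude spectrum of one stationary frame into 12 pitch-class energies and discards any triad that needs a pitch class with almost no energy. Everything in it is an assumption made for illustration: the use of NumPy, the function names, the 0.2 threshold, the 100 to 2000 Hz band, the restriction to major and minor triads, and the synthetic test tone (four harmonics per note with 1/h amplitudes).

```python
import numpy as np

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def chroma_from_frame(frame, sample_rate, fmin=100.0, fmax=2000.0):
    """Fold one Hann-windowed FFT frame into 12 pitch-class energies (max = 1)."""
    spectrum = np.fft.rfft(frame * np.hanning(len(frame)))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)
    keep = (freqs >= fmin) & (freqs <= fmax)
    # Map each bin to the nearest equal-tempered note (A4 = 440 Hz = MIDI 69).
    midi = 69 + 12 * np.log2(freqs[keep] / 440.0)
    pitch_class = np.round(midi).astype(int) % 12
    energy = np.abs(spectrum[keep]) ** 2
    chroma = np.bincount(pitch_class, weights=energy, minlength=12)
    return chroma / chroma.max()

def candidate_triads():
    """Major and minor triads as sets of pitch classes (hypothetical chord vocabulary)."""
    triads = {}
    for root, name in enumerate(NOTE_NAMES):
        triads[name] = {root, (root + 4) % 12, (root + 7) % 12}
        triads[name + "m"] = {root, (root + 3) % 12, (root + 7) % 12}
    return triads

def surviving_chords(chroma, threshold=0.2):
    """Keep only chords whose tones all have energy at or above the (arbitrary) threshold."""
    weak = {pc for pc in range(12) if chroma[pc] < threshold}
    return sorted(name for name, tones in candidate_triads().items()
                  if not (tones & weak))

def synth_chord(midi_notes, sample_rate=22050, n_samples=16384, n_harmonics=4):
    """Synthetic stationary test tone: each note gets harmonics with 1/h amplitude."""
    t = np.arange(n_samples) / sample_rate
    signal = np.zeros(n_samples)
    for note in midi_notes:
        f0 = 440.0 * 2 ** ((note - 69) / 12)
        for h in range(1, n_harmonics + 1):
            signal += np.sin(2 * np.pi * h * f0 * t) / h
    return signal

sample_rate = 22050
frame = synth_chord([60, 64, 67], sample_rate)  # C4, E4, G4
chroma = chroma_from_frame(frame, sample_rate)
remaining = surviving_chords(chroma)
print(remaining)
```

Even in this toy case the overtones complicate things: the third harmonics of E and G land on B and D, so those pitch classes carry some energy although no such notes were played. With a lower threshold, chords such as E minor or G major would no longer be ruled out, which is exactly the kind of ambiguity the question is about.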