Combining classification estimates and additional information

Question

This is my first question. While typing the title I saw only one slightly similar question among the suggestions (Sequential classification, combining predictions), but unfortunately it has no answers. A friend of mine asked me the following, and I have been trying to come up with a solution without success. Here it goes:

He is using a program to match a set of data (two experimental spectra of two different molecules) to two theoretical chemical structures through the following procedure: he provides the program with two structures (let's say $A$ and $B$), the program calculates a theoretical spectrum for each, and then, when he presents a set of data to the program, it outputs a probability for each of the two structures. He did this with two different sets of data (let's say $D_1$ and $D_2$), corresponding to different chemical species. He told me that he has the additional information that the two sets of data correspond to different species (they can't both have structure $A$ or both have structure $B$), so that if one has structure $A$ then the other must have structure $B$, and vice versa. For $D_1$ he got a probability of $0.8$ of corresponding to structure $A$ and $0.2$ to structure $B$, and for $D_2$ a probability of $0.7$ of corresponding to structure $B$ and $0.3$ to structure $A$.

Given this, he asked how he could combine the two estimates into a global estimate for the two sets of data; that is to say, he wants an estimate of the joint probability that the spectrum $D_1$ corresponds to structure $A$ and $D_2$ to structure $B$.

At first I thought that what he called a "probability" was a classification probability conditional on the spectrum, so combining both would mean estimating $P \left( \left( A, B \right) | D_1, D_2 \right)$, where $\left( A, B \right)$ means that the first molecule has structure $A$ and the second has structure $B$. I thought about using a Bayesian approach, but in order to do that one would have to somehow estimate $P \left( D_1 \right)$ and $P \left( D_2 \right)$ or, worse, $P \left( A \right)$ or $P \left( B \right)$, which I can't provide.
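To make the Bayesian route concrete, here is a sketch under assumptions that are not given in the problem: (i) the two permitted assignments $\left( A, B \right)$ and $\left( B, A \right)$ are equally likely a priori, (ii) the two spectra are conditionally independent given the assignment, and (iii) the program's outputs are proportional to the likelihoods, i.e. it implicitly treats $A$ and $B$ as equally likely for each spectrum. Under these assumptions $P \left( D_1 \right)$ and $P \left( D_2 \right)$ cancel in the normalisation:

$$
P \left( \left( A, B \right) | D_1, D_2 \right)
= \frac{P \left( A | D_1 \right) P \left( B | D_2 \right)}{P \left( A | D_1 \right) P \left( B | D_2 \right) + P \left( B | D_1 \right) P \left( A | D_2 \right)}
= \frac{0.8 \times 0.7}{0.8 \times 0.7 + 0.2 \times 0.3}
= \frac{0.56}{0.62} \approx 0.903
$$

If any of the three assumptions fails (for instance, if the program's outputs are not calibrated probabilities), this number would not be the right one.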

I also thought of "averaging" both estimates as if they were independent estimates of the probability that one of the species has one particular structure. The "mutual exclusivity" of the structures would imply that a higher probability in favour of one structure for the first molecule means a higher probability in favour of the other structure for the second molecule. That is, one would have $P \left( A | D_1 \right)$ and $P \left( A | D_2 \right)$, and could combine them as in, for example, equation 4 of Combining probabilities/information from different sources. But then again, both estimates came from the same classifier...

On the one hand, one would assume that the joint probability should be larger than the individual ones; on the other hand, one is combining two estimates, each of which is subject to error.
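As a numerical check of that intuition, here is a minimal Python sketch. It is not the program my friend uses; the function name `combine_exclusive` is made up for illustration, and it assumes that the two estimates can be treated as independent and that both permitted assignments are equally likely beforehand. It multiplies the two estimates and renormalises over the only two assignments the constraint allows.

```python
def combine_exclusive(p1_a: float, p2_b: float) -> float:
    """Joint probability that D1 has structure A and D2 has structure B.

    p1_a: classifier output P(A | D1)
    p2_b: classifier output P(B | D2)

    Assumptions (not from the original problem): the two estimates are
    independent, and (A, B) and (B, A) are equally likely a priori.
    """
    for p in (p1_a, p2_b):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")

    # Unnormalised weights of the two assignments allowed by the constraint.
    w_ab = p1_a * p2_b                   # D1 -> A, D2 -> B
    w_ba = (1.0 - p1_a) * (1.0 - p2_b)   # D1 -> B, D2 -> A

    total = w_ab + w_ba
    if total == 0.0:
        raise ValueError("the two estimates contradict the constraint")
    return w_ab / total


joint_ab = combine_exclusive(0.8, 0.7)
print(round(joint_ab, 3))  # 0.903
```

With the numbers above the combined value (about $0.903$) is indeed larger than either $0.8$ or $0.7$, but this says nothing about the error in the two input estimates, which carries over into the result.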

Could someone give me some recommendations as to what to do?
