I am trying to select the best answer from a set of samples, using two variables at different scales.
In the data below, each row indicates the accuracy of a given measurement, the number of times it occurs in the sample set, and the answer that it indicates.
Accuracy:0.00886982324861, Frequency:5.0, Answer:1.0
Accuracy:0.0104663914334, Frequency:1.0, Answer:2.0
Accuracy:0.0112727390014, Frequency:1.0, Answer:2.0
Accuracy:0.0143046058573, Frequency:3.0, Answer:1.0
Accuracy:0.0251741710747, Frequency:1.0, Answer:1.0
Accuracy:0.0322055218681, Frequency:1.0, Answer:2.0
Until now, I have been ignoring the accuracy and simply selecting the answer that occurs most frequently, i.e., Answer 1 occurs 9 times (5 + 3 + 1), Answer 2 occurs 3 times, so I select Answer 1.
A product of frequency and accuracy produces a metric that takes both into account, but because the two variables are on different scales, frequency is weighted disproportionately more than accuracy.
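For reference, the frequency-only selection described above can be sketched in Python as follows. This is only a minimal illustration: the names `samples`, `votes`, and `best_answer` are made up for the example, and the rows are copied from the data listed earlier.

```python
from collections import defaultdict

# (accuracy, frequency, answer) rows copied from the data above
samples = [
    (0.00886982324861, 5.0, 1.0),
    (0.0104663914334, 1.0, 2.0),
    (0.0112727390014, 1.0, 2.0),
    (0.0143046058573, 3.0, 1.0),
    (0.0251741710747, 1.0, 1.0),
    (0.0322055218681, 1.0, 2.0),
]

# Sum the frequencies per answer, ignoring accuracy entirely
votes = defaultdict(float)
for _accuracy, frequency, answer in samples:
    votes[answer] += frequency

# Pick the answer with the largest total frequency
best_answer = max(votes, key=votes.get)

print(dict(votes))   # {1.0: 9.0, 2.0: 3.0}
print(best_answer)   # 1.0
```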
How can I calculate a metric that allows me to use both accuracy and frequency, but accounts for their different scales?
Your help is greatly appreciated, John
*EDIT Thanks for your response. In the data above, accuracy is a distance measurement from the closest category to which I am attempting to quantize this data. For example, suppose I have category 1 and category 2, and the sample has a value of 1.4. I would describe its accuracy as .4 away from category 1 and .6 away from category 2, indicating that category 1 is the answer.
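To make the definition concrete, here is a small Python sketch of how I understand the quantization step. The function name `quantize` and its arguments are hypothetical and used only for illustration, and it assumes the distance is a plain absolute difference.

```python
def quantize(value, categories):
    """Return the closest category and the distance to it (the 'accuracy')."""
    # Distance from the sample value to every candidate category
    distances = {category: abs(value - category) for category in categories}
    # The answer is the category with the smallest distance
    answer = min(distances, key=distances.get)
    return answer, distances[answer]

# The example from the text: value 1.4 with categories 1 and 2
answer, accuracy = quantize(1.4, [1, 2])
print(answer)               # 1
print(round(accuracy, 2))   # 0.4
```

Note that with this definition a smaller accuracy value means a better match.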
In the above data, if I multiply the frequency of sample one by its accuracy (5 * .0088), the result is .0443, while the second sample gives me .0104 and the fourth sample yields .0429. From the data, the most accurate sample occurs 5 times (frequency 5), so it should clearly be better than the fourth most accurate sample, which occurs three times, but without some type of scaling the results differ by only .0014. What if I had a sample that occurs once but has an accuracy of .9? That would have a product of .9, which is larger than the product for the first sample.
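The arithmetic above can be reproduced with a short Python sketch. The names `samples`, `products`, `gap`, and `hypothetical` are illustrative only, and the .9 sample is the made-up case from the question, not part of the data.

```python
# (accuracy, frequency, answer) rows copied from the data above
samples = [
    (0.00886982324861, 5.0, 1.0),
    (0.0104663914334, 1.0, 2.0),
    (0.0112727390014, 1.0, 2.0),
    (0.0143046058573, 3.0, 1.0),
    (0.0251741710747, 1.0, 1.0),
    (0.0322055218681, 1.0, 2.0),
]

# Unscaled metric: accuracy multiplied by frequency for each row
products = [accuracy * frequency for accuracy, frequency, _answer in samples]
for (accuracy, frequency, answer), product in zip(samples, products):
    print(f"answer={answer} frequency={frequency} product={product:.4f}")

# Difference between the first and the fourth sample
gap = products[0] - products[3]
print(f"gap={gap:.4f}")  # gap=0.0014

# Hypothetical sample from the question: occurs once with accuracy .9
hypothetical = 0.9 * 1.0
print(hypothetical > products[0])  # True
```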