How to find confidence in output result given a histogram?

Question

I have an algorithm that outputs a distance value for a pair of objects. The smaller the value, the more similar the objects are in Euclidean space.

I ran it on hundreds of true-positive examples, i.e. pairs that I know are similar. The lowest observed value is 0.01 and the highest is 0.64. The theoretical lower and upper limits are 0.00 and 4.0, respectively.

Can I use those results, which are given below, as a measure of confidence when I give it two objects that it has not seen?

Example:

I give it hundreds of examples of a car. The values for the car pairs range from 0.01 to 0.45. Next, I give it a horse cart and a car to find the distance between them. It gives me a value of 0.6, so I reject it, because I know from the examples that any value higher than 0.45 should be rejected. 0.45 is my threshold.

What I want to find is a confidence for values that fall below the threshold.

Something like:

0.45 ------> confidence 50%

0.30 ------> confidence 70%

What I tried was to plot a histogram of the values from 448 samples, and now I am stuck.

NOTE: I am a newbie and know just the basics, like mean, var, std and some probability distributions. Any help would be appreciated. Cheers.

Here's what the values look like.

It is not clear to me what you mean by "create a confidence". This is different from the standard use of the term "confidence" in statistics, as far as I can tell from your writing. What exactly do you want to be confident in? — Christian Hennig, Nov 10 '21 at 11:35
It sounds like you would like to make decisions concerning whether two objects are "similar" or not. But since you don't report any data about objects that are *not* similar, it would be difficult to develop a procedure to help you. Perhaps that's why you're stuck? — whuber, Nov 10 '21 at 11:48
I know that the objects behind the values I am getting are similar, since the values pass a threshold, but what I want is a confidence in those values. The smaller they are, the more similar those objects are. By confidence I mean it in the literal sense, not the statistical sense. — obadul024, Nov 11 '21 at 11:17
@ChristianHennig It is different from the standard definition of confidence in statistics. My question is this: if I give my algorithm a pair of objects and it returns a value, say 0.2, then based on previous results that I already know are true, how sure can I be of that value? The problem is that sometimes the algorithm gives me a small value for objects that are not the same, for example a car and a horse cart. The smaller the value, the more similar the objects might be. Can I use my past results to say that I know with x% confidence that it is a cart, not a bus? — obadul024, Nov 11 '21 at 11:23
I don't think that this can be done using statistical reasoning based on only the information you gave us. In order to say something about how reliable the value 0.2 is, the value needs to be compared with some kind of "truth" from which it may deviate. If you had true or "gold standard" similarity values for some observations (potentially deviating from your measurements), one could use these to make statements about the reliability of your measurements in general, but without such information I don't see how to do this. (Essentially this is what @whuber wrote.) — Christian Hennig, Nov 11 '21 at 13:42
I have the truth values. They are in the image I posted in the question. The histogram shows the truth values and their frequencies. The true values have a range. The lower their value, the better they are. But what if the majority of the true values lie somewhere near the maximum threshold, i.e. 0.4? — obadul024, Nov 11 '21 at 13:46
Because the literal sense of "confidence" is a matter of psychology or personal opinion, it looks like a question we won't be able to answer. — whuber, Nov 11 '21 at 14:31
Can I use the frequency as a measure of the likelihood of a better score? — obadul024, Nov 11 '21 at 14:35
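
One possible reading of the frequency idea raised in the comments, given only as a rough sketch: for a new score, report the fraction of the known-similar reference scores that are at least as large. This is an empirical tail fraction, not a probability that the two objects are similar; as the comments point out, that would also need scores from pairs known to be dissimilar. The function name `tail_fraction` and the ten example scores are made up for illustration and are not the 448 real samples.

```python
from bisect import bisect_left


def tail_fraction(reference_scores, value):
    """Fraction of known-similar reference scores that are >= value.

    A small result means the new score sits in the upper tail of the
    reference scores (close to the rejection threshold); a result near
    1.0 means it is lower (better) than almost every reference score.
    """
    ordered = sorted(reference_scores)
    if not ordered:
        raise ValueError("reference_scores must not be empty")
    # bisect_left returns how many reference scores are strictly below `value`.
    below = bisect_left(ordered, value)
    return (len(ordered) - below) / len(ordered)


# Hypothetical placeholder scores, NOT the real data from the question.
example_scores = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.45, 0.45]

for new_score in (0.10, 0.30, 0.45):
    print(new_score, tail_fraction(example_scores, new_score))
```

With the real scores in place of `example_scores`, a value near the threshold would map to a small fraction and a very low value to a fraction near 1, which gives the kind of monotone mapping asked for in the question, with the caveat above about what it does and does not mean.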
