0

I've been trying to remember my high school math and am falling short.

I'm working on a project where I need to give a % of correctness for an integer (how close a given number is to the actual number, within a 300% difference). For example, if the number we want is 50, any number from -100 to 150 will return > 0% correctness.

The problem is that we need a curve (log, bell curve, or something similar) to return a non-linear % correctness (i.e. 100 is 50% correct in linear terms, but we would want maybe 66%? ... and 125 is 33%?) and I don't have a formula to get this response.

Something like this (sorry, I used mspaint quickly to try to explain; please go to https://stackoverflow.com/questions/7759062/math-statistics-bell-curve-computing-correct-given-2-numbers-c to see the image, as I can't post it here because I'm a new user).

Am I explaining this properly? Make sense? Any help? ;)

I ran into standard deviation too, but it's a bit complicated for me to process right now. If you understand it, can you throw me a quick formula?

Tizz
  • 2
    Although you have accepted a reply, I feel obliged to point out for any future readers that *no* reply can possibly be correct because the question is too vague. It seems to ask us to read your mind; that is, to know precisely what you mean by "non-linear % correctness." There are infinitely many solutions to questions like that and each one is tantamount to a particular way of *quantitatively valuing* almost-correct integers. If you blindly accept a suggestion like a Gaussian, you have effectively let somebody who is *completely guessing* determine this. (continued) – whuber Oct 13 '11 at 22:29
  • 1
    The process needs to be the reverse of that. Rather than letting some uninformed mathematical formula determine things--which amounts to arbitrary guesswork--you should be expressing what you know and care about with sufficient clarity that people can suggest mathematical solutions to match your values. What do you need the "nonlinearity" for? What is the purpose of reporting this "% correctness"? How will you establish that a mathematical formula is doing what you desire? In short, by providing this information and these criteria in the question, you can get a quality answer. – whuber Oct 13 '11 at 22:32

2 Answers

0

If you want a bell curve, you need to use a Gaussian. The equation for a Gaussian is: $f(x) = ae^{-\frac{(x-b)^2}{2c^2}}$

a is the maximum height of the curve. If you want 100% correct to be the maximum value, set a=100.

b is the middle of the curve. If you want the maximum value (100%) to be at 50, set b to 50.

c relates to the width of the curve. Higher values of c correspond to wider, more gradually sloping curves. Lower values correspond to sharper, steeply sloping curves. This is a value you'd have to pick depending on how much error you're willing to tolerate. The Wikipedia page has some good example graphs.

If x is the given number and b is the actual number, this function returns a when x=b. As we move away from b, we return some fraction of 100. If we assume any value less than 0.5% correct is just "wrong", and we want a value which is off by more than $3b$ to be considered "wrong", then we can set a value of c so that $f(4b)=f(-2b) = 0.5$ (with $a=100$, $f$ is already expressed in percent).

For the values presented above, here's a function with these properties.
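As a rough illustration of the recipe above, here is a minimal Python sketch. It assumes $a=100$, $b=50$, and a cutoff of 0.5 (that is, 0.5% when $a=100$) reached at a distance of $3b$ from $b$. The function names `gaussian_width` and `gaussian_score` and the parameters `floor` and `span` are made up for this example. Solving $ae^{-(3b)^2/(2c^2)} = \text{floor}$ for $c$ gives $c = 3|b| / \sqrt{2\ln(a/\text{floor})}$.

```python
import math


def gaussian_width(b, a=100.0, floor=0.5, span=3.0):
    """Return c such that the Gaussian equals `floor` at b +/- span*|b|.

    Assumption for this sketch: `floor` and `span` are illustrative
    parameters, not something fixed by the original question.
    """
    if b == 0:
        raise ValueError("b must be non-zero, otherwise the cutoff distance is 0")
    if not 0 < floor < a:
        raise ValueError("floor must lie strictly between 0 and a")
    distance = span * abs(b)
    return distance / math.sqrt(2.0 * math.log(a / floor))


def gaussian_score(x, b, c, a=100.0):
    """Evaluate f(x) = a * exp(-(x - b)^2 / (2 c^2))."""
    return a * math.exp(-((x - b) ** 2) / (2.0 * c ** 2))


b = 50
c = gaussian_width(b)
# Print the score at the target, at two nearby guesses, and at both cutoffs.
for x in (50, 100, 125, 4 * b, -2 * b):
    print(x, round(gaussian_score(x, b, c), 2))
```

The score never reaches exactly 0, so anything below the chosen cutoff would still have to be treated as "wrong" by a separate rule.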

John Doucette
  • Also, c in this case corresponds to the standard deviation. – John Doucette Oct 13 '11 at 20:53
  • Thank you for taking time to do this! This is most beneficial! – Tizz Oct 13 '11 at 21:20
  • Although this solution isn't perfect (300% range doesn't drop down to 0% closeness), the solution is acceptable at this time. Thanks! – Tizz Oct 13 '11 at 22:11
  • While this answer has been accepted, and the OP gone away happy, perhaps it is worth pointing out that $a$ and $c$ are inversely proportional to each other (in fact, as most everybody reading this comment knows, $ac= 1/\sqrt{2\pi}$), and so setting $a = 100$ means that one must live with a very small value of $c$: one cannot _also_ "pick" the value of $c$ "depending on how much error you are willing to tolerate" – Dilip Sarwate Oct 14 '11 at 00:49
  • @Dilip That's only true if we want the area under the curve to sum to one (i.e. if you want a pdf). In this case, that doesn't appear to be a constraint. – John Doucette Oct 14 '11 at 02:22
  • That's right, John, but we are still left wondering how you intend this Gaussian to be "used." @Dilip's comment is a natural one and shows there is potential for readers not to understand what you're trying to communicate here. Perhaps you could edit your reply to elaborate on this. – whuber Oct 14 '11 at 16:44
  • Thanks for the tip whuber. I've updated my answer. Do you think it's clearer this way? – John Doucette Oct 14 '11 at 20:14
  • Alas, his range is asymmetric, and extends down to $-100$ ($150$ away from $50$) and upwards to $150$ (only $100$ away from $50$). Your symmetric function may not be quite what he wants. – Dilip Sarwate Oct 14 '11 at 20:44
  • Given the symmetry about 50 of the curve in his picture, I was inclined to assume that was a typo. Perhaps I was mistaken though... – John Doucette Oct 14 '11 at 21:42
  • 1
    Tizz says "100 is 50% correct in linear terms, but we would want maybe 66%?" If he would just correct the typo in his upper point and set it to $200$ instead of $150$, he would get $100$ as $66\%$ correct on a linear scale. But "I used a bell curve to determine the percentage correctness" undoubtedly sounds more impressive... – Dilip Sarwate Oct 15 '11 at 22:55
0

"(How close a given number is to the actual number within a 300% difference). For example if the number we want is 50, any number from -100 to 150 will return $ > 0\%$ correctness"

The range of permissible errors is from $-100$ to $150$, that is, $150 = 3\times 50$ points to the left of $50$ but only $100 = 2\times 50$ points to the right of $50$? The latter does not seem to jibe with the $300\%$ difference allowable but is (almost) consistent with

"100 is 50% correct in linear terms"

that is, the degree of correctness decreases linearly from $100\%$ at $50$ to $50\%$ at $100$ and to $0\%$ at $150$ (though it would seem that the OP wants some nominal degree of correctness, say $1\%$ at $150$). For later reference, the linear decrease in correctness would give $25\%$ correctness at $125$.
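For reference, the linear baseline just described can be written as a short Python sketch. The function name `linear_correctness` and its parameters are hypothetical, and it only covers guesses at or above the target, since that is the side the quoted numbers describe.

```python
def linear_correctness(x, target=50, zero_at=150):
    """Linear score: 100 at `target`, falling to 0 at `zero_at`.

    Assumption for this sketch: only x >= target is handled.
    """
    if x < target:
        raise ValueError("this sketch only covers x >= target")
    if x >= zero_at:
        return 0.0
    return 100.0 * (zero_at - x) / (zero_at - target)


for x in (50, 100, 125, 150):
    print(x, linear_correctness(x))
```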

"The problem is that we need a curve (log, or bell curve, or something similar) to return a non-linear % correctness (i.e. 100 is 50% correct in linear terms, but we would want maybe 66%? ... and 125 is 33%?) and I dont have a formula to get this response."

The bell curve is a red herring here. The OP wants a particular response curve which passes through the points $(50, 100)$, $(100,66)$, $(125,33)$ maybe, and $(150,1)$, instead of the straight-line response curve through $(50, 100)$, $(100,50)$, $(125,25)$, $(150,0)$. There may be other points that he wants the response curve to pass through, but he is not sharing those with us. He says nothing about correctness for numbers less than $50$. So, what I can suggest is Lagrange interpolation through the four points to give a curve $f(x)$ that will work for the range $[50, 150]$, a different curve $g(x)$ for the range $[-100, 50)$, and then use cases:

  • if $x < -100$ or $x > 150$, return $0$
  • if $-100 \leq x < 50$, return $g(x)$
  • if $50 \leq x \leq 150$, return $f(x)$

If $x$ can take on only integer values, and the desired response to each $x$ is known or will be as specified by the client, put the given values in a look-up table rather than doing Lagrange interpolation, and so on.
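A minimal Python sketch of the piecewise scheme above follows. The four points on $[50, 150]$ are the ones listed earlier in this answer. Since nothing is specified for numbers below $50$, `g_placeholder` is purely an assumption made here (a straight line from $0$ at $-100$ to $100$ at $50$) and should be replaced by whatever curve is actually wanted. The names `lagrange`, `f`, `g_placeholder`, and `correctness` are made up for the example.

```python
def lagrange(points, x):
    """Evaluate the Lagrange interpolating polynomial through `points` at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(points):
        term = yi
        for j, (xj, _) in enumerate(points):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


# Points the response curve should pass through on [50, 150].
RIGHT_POINTS = [(50, 100), (100, 66), (125, 33), (150, 1)]


def f(x):
    """Cubic through the four requested points, used for 50 <= x <= 150."""
    return lagrange(RIGHT_POINTS, x)


def g_placeholder(x):
    """ASSUMPTION: linear stand-in for the unspecified curve on [-100, 50)."""
    return 100.0 * (x + 100) / 150.0


def correctness(x, g=g_placeholder):
    """Piecewise score: 0 outside [-100, 150], g below 50, f from 50 up."""
    if x < -100 or x > 150:
        return 0.0
    if x < 50:
        return g(x)
    return f(x)


for x in (-150, -100, 0, 50, 75, 100, 125, 150, 151):
    print(x, round(correctness(x), 2))
```

An interpolating polynomial is not guaranteed to be monotone between its nodes, so it is worth checking or plotting the result before relying on it. For integer-only inputs, the same values could be precomputed once into a look-up table.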

Dilip Sarwate