I am calculating 'popularity' scores for content in a web app based on 'views' and 'likes'.
I have not studied statistics but I have found the method I need to use here:
http://www.evanmiller.org/how-not-to-sort-by-average-rating.html
Score = Lower bound of Wilson score confidence interval for a Bernoulli parameter
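For reference, the lower bound from the article (which reproduces the scores in my table below) is

$$\frac{\hat{p} + \frac{z^2}{2n} - z\sqrt{\frac{\hat{p}(1-\hat{p})}{n} + \frac{z^2}{4n^2}}}{1 + \frac{z^2}{n}}$$

where $\hat{p}$ is the observed fraction of positive ratings (likes/views), $n$ is the total number of ratings (views), and $z$ is the standard normal quantile corresponding to the chosen confidence level.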
The article explains the choice of z value as follows:
"confidence refers to the statistical confidence level: pick 0.95 to have a 95% chance that your lower bound is correct"
I don't understand what 'correct' means here, and why I would choose lower or higher confidence levels. Why not choose 100% confidence and get a 'correct' result?
I have found several other questions whose answers get heavily philosophical and technical, and I'm not clear how they relate to my case:

- What does a confidence interval (vs. a credible interval) actually express?
- What, precisely, is a confidence interval?
- Are there any examples where Bayesian credible intervals are obviously inferior to frequentist confidence intervals?
I have applied the formula and calculated the scores for my data. My question is: why would I choose a lower or higher confidence level, and what does that mean for my scores?
Update
I have partially answered my own question by experimenting with different z values ('confidence levels') and looking at the scores they generate:
| Likes | Views | z    | Score                |
|-------|-------|------|----------------------|
| 1     | 4     | 1.0  | 0.1                  |
| 100   | 400   | 1.0  | 0.2289908334502525   |
| 1     | 4     | 1.96 | 0.045586062644636216 |
| 100   | 400   | 1.96 | 0.21007832849376823  |
| 1     | 4     | 2.58 | 0.029987372072017595 |
| 100   | 400   | 2.58 | 0.19854163422270693  |
From this I can see that choosing a higher z value ('confidence level') assigns a relatively lower score to the item with few views ('total votes', in the formulation of the original article).
I take this to mean that for items with few views we have a lower 'confidence' that the current known ratio is representative of the unknown 'true' ratio that would emerge if we had more data.
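For anyone wanting to check the numbers, the scores in the table above can be reproduced with a short Python sketch of the Wilson lower bound (the function name `wilson_lower_bound` is my own; the formula is the one from the article):

```python
import math

def wilson_lower_bound(likes, views, z=1.96):
    """Lower bound of the Wilson score confidence interval for a
    Bernoulli parameter. z is the standard normal quantile for the
    chosen confidence level (e.g. 1.96 for 95%, 2.58 for 99%)."""
    if views == 0:
        return 0.0  # no data: no confidence the true ratio exceeds 0
    n = views
    phat = likes / n  # observed positive ratio
    return (phat + z * z / (2 * n)
            - z * math.sqrt((phat * (1 - phat) + z * z / (4 * n)) / n)) \
        / (1 + z * z / n)

# 1 like out of 4 views at z = 1.0 gives the 0.1 from the table:
print(wilson_lower_bound(1, 4, z=1.0))
```

Increasing `z` widens the interval, so the lower bound drops more sharply for small `n`, which is exactly the penalty on low-view items visible in the table.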
In comments @whuber has suggested that:
"confidence limits are not likely to be a part of an accurate solution"
So my question now is... is there a better formula I should be using to calculate the 'popularity' score for my data set?