I wrote some code in SAS that among other things, used the BETAINV function (or BETA.INV as it's called in Microsoft Excel) to calculate the quantile in a beta distribution corresponding to a random input value. Now I'm looking for a way to calculate this in SQL (on Netezza).
It doesn't have to be super precise, I'm more concerned about speed, as this function has to be called 2 billion times per day.
What is the use case exactly? I'm interested in estimating click-through rates, and acting upon that knowledge. In fact, I'm building a Bayesian Bandit 'machine' that optimizes click-through rates by 'exploring' the performance of different messages. See this article for more information.
So, given for example the fact that we saw 20 clicks in 2000 views, the click-through rate is 1%. Now this value is just an estimate, because we sampled only 2000 people. The 'real' CTR is somewhere around 1% but probably not exactly 1%. The probability distribution for a CTR turns out to be a beta distribution.
Now I want to pull a random value from this distribution (2 billion times per day). So I want most of those random values to be around the 1% in this case, but sometimes it may be lower or higher. That's exactly what the BETAINV function does. You give it a uniform random variable between 0 and 1 and it gives you the respective quantile in the beta distribution.
There's no BETAINV function in Netezza, so can I approximate this in some way (Taylor expansion?) to calculate this in a fast way?
Note: alpha and beta in my case are always integers, with alpha equal to the number of clicks and beta equal to the number of non-clicks. Alpha is always smaller than beta. Alpha can be between 1 and say 100,000. Beta can be between 1 and say 10,000,000.
Any ideas? Thanks!