What that article claims Bayes came up with was a prior distribution for $p$.
A frequentist doesn't have a distribution on $p$ (though see the note below about incorporating previous data, for example; a frequentist is not bound to ignore other information about the parameter).
So the frequentist doesn't "guess" a distribution for $p$. In fact, neither does the Bayesian - you don't guess your prior on a parameter; as a Bayesian you choose* it (along the lines of "given what I already understand about this situation that generates successes and failures, what can I say about where $p$ is likely to be?", using "likely" in its ordinary sense).
* I'm glossing over differences between some flavors of Bayesianism here.
Given the same assumptions about the situation (presumably Bernoulli trials of some kind), both of them would have the same model for the number of successes in $n$ trials, given some $p$ (a binomial), arrived at via the same reasoning: the progression from a sequence of Bernoulli trials with constant $p$ to a binomial number of successes in $n$ trials is a straightforward application of probability rules that both agree on.
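As a quick sanity check of that progression (a sketch with made-up values for $p$ and $n$, using numpy and scipy), simulating many sequences of Bernoulli trials and counting the successes reproduces the binomial pmf:

```python
import numpy as np
from scipy.stats import binom

rng = np.random.default_rng(0)
p, n, reps = 0.3, 10, 100_000   # arbitrary illustrative values

# n Bernoulli(p) trials per row; summing each row counts the successes
successes = rng.binomial(1, p, size=(reps, n)).sum(axis=1)

# The empirical distribution of the counts matches the Binomial(n, p) pmf
for k in range(n + 1):
    emp = (successes == k).mean()
    print(f"k={k:2d}  empirical={emp:.4f}  binomial pmf={binom.pmf(k, n, p):.4f}")
```

Nothing here is controversial to either camp; the disagreement only starts once we try to learn about $p$ from the counts.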
So they agree on how to model (say) coin-flips for a given probability of a head ($p$), and they generally agree on the relevance of the likelihood for the relationship between the data, the model for the data, and the information about the parameter(s). The trick is to apply that agreed-on model for the data to a situation where you want to infer information about $p$.
They differ on how $p$ is treated. The frequentist treats $p$ as fixed but unknown and tries to get information about it via things like point estimates, confidence intervals with certain coverage probabilities, and so on. The Bayesian treats their uncertainty about $p$ as represented by a probability distribution, which the data then narrows down, via (say) a credible interval (though I'm leaving out some stuff that's important to many Bayesians here). While in many situations a credible interval and a confidence interval look very similar (or with particular choices of prior, even identical), they're not trying to achieve the same thing.
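To make the "similar but not the same" point concrete, here is a small sketch (made-up counts, using scipy) comparing an exact Clopper-Pearson confidence interval with an equal-tailed credible interval from a uniform prior:

```python
from scipy import stats

n, k = 50, 18          # hypothetical data: 18 successes in 50 trials
alpha = 0.05

# Frequentist: exact (Clopper-Pearson) 95% confidence interval,
# via the standard Beta-quantile form of its endpoints
lo_f = stats.beta.ppf(alpha / 2, k, n - k + 1)
hi_f = stats.beta.ppf(1 - alpha / 2, k + 1, n - k)
print(f"95% confidence interval: ({lo_f:.3f}, {hi_f:.3f})")

# Bayesian: uniform prior Beta(1, 1) -> posterior Beta(k+1, n-k+1),
# and an equal-tailed 95% credible interval from that posterior
post = stats.beta(k + 1, n - k + 1)
lo_b, hi_b = post.ppf(alpha / 2), post.ppf(1 - alpha / 2)
print(f"95% credible interval:   ({lo_b:.3f}, {hi_b:.3f})")
```

The numbers come out close here, but they answer different questions: the confidence interval's 95% is a coverage property over repeated experiments with $p$ fixed, while the credible interval's 95% is a posterior probability statement about $p$ given this data and this prior.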
[If you have information relating to your particular $p$ garnered from prior data (e.g. yesterday's experiment with the same coin), the Bayesian and the frequentist tend to agree how to incorporate that prior data with the current data; they're both applying the same probability rules there.]
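One way to see that agreement, sketched under a simple conjugate Beta-binomial setup with hypothetical counts: updating sequentially (yesterday's posterior becomes today's prior) lands on exactly the same posterior as pooling all the trials into one likelihood, which is also what the frequentist's pooled estimate uses.

```python
from scipy import stats

# Hypothetical counts: yesterday 7/20 heads, today 12/30 heads, same coin
k1, n1 = 7, 20
k2, n2 = 12, 30

# Frequentist: pool everything into one binomial likelihood
p_hat = (k1 + k2) / (n1 + n2)
print(f"pooled estimate: {p_hat:.3f}")

# Bayesian: starting from a Beta(1, 1) prior, yesterday's data give a
# Beta(1+k1, 1+n1-k1) posterior; using that as today's prior gives:
post_seq = stats.beta(1 + k1 + k2, 1 + (n1 - k1) + (n2 - k2))

# ...which is exactly the posterior from one update on the pooled counts:
post_pooled = stats.beta(1 + (k1 + k2), 1 + (n1 + n2) - (k1 + k2))
assert post_seq.args == post_pooled.args
print(f"posterior mean:  {post_seq.mean():.3f}")
```

Both routes are just the same probability rules applied to the same likelihood; the only extra ingredient on the Bayesian side is the starting prior.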
The use of a uniform distribution for the prior on $p$ there was presumably intended to represent "total ignorance" of $p$, but always using flat priors to represent "ignorance" leads to an interesting situation: your inference then depends on how you parameterize the problem. If person A works with the probability, $p$, while B works with the odds $\omega=\frac{p}{1-p}$, and person C uses the log-odds ($\eta=\log \omega$), then when they convert their conclusions to their friends' parameterizations, they will reach (at least slightly) different conclusions about the parameter.
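A numerical sketch of that effect (made-up counts, using scipy): a flat prior on $p$ is Beta(1, 1), while a flat prior on the log-odds $\eta$ corresponds, by the change-of-variables formula, to a prior proportional to $1/(p(1-p))$ on $p$ (the improper Beta(0, 0)), so the two posteriors differ on the same data.

```python
from scipy import stats

n, k = 20, 12   # hypothetical data: 12 successes in 20 trials

# Person A: flat prior on p, i.e. Beta(1, 1) -> posterior Beta(k+1, n-k+1)
post_A = stats.beta(k + 1, n - k + 1)

# Person C: flat prior on the log-odds eta; transformed back to the p
# scale this is proportional to 1/(p(1-p)), the improper Beta(0, 0),
# giving posterior Beta(k, n-k)
post_C = stats.beta(k, n - k)

# Same data, same likelihood, but "ignorance" encoded on different
# scales leads to (slightly) different conclusions about p:
print(f"P(p > 0.5 | data), flat on p:        {post_A.sf(0.5):.3f}")
print(f"P(p > 0.5 | data), flat on log-odds: {post_C.sf(0.5):.3f}")
```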
(There are priors, such as the Jeffreys prior, constructed so that the resulting inference doesn't depend on how you parameterize.)
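For the Bernoulli case the standard example is the Jeffreys prior, which on the $p$ scale is proportional to $p^{-1/2}(1-p)^{-1/2}$ (a Beta(1/2, 1/2)). A quick numerical check, using numpy, that pushing it through $\eta = \log\frac{p}{1-p}$ gives exactly the Jeffreys prior derived directly on the log-odds scale:

```python
import numpy as np

# Evaluate at a grid of log-odds values; p = sigmoid(eta)
eta = np.linspace(-4, 4, 9)
p = 1 / (1 + np.exp(-eta))

# Push A's Jeffreys prior (density on p) through the change of variables:
# density_eta = density_p * |dp/deta|, with dp/deta = p(1-p)
pushforward = p**-0.5 * (1 - p)**-0.5 * (p * (1 - p))

# Deriving Jeffreys directly on the eta scale (square root of the Fisher
# information for eta) gives sqrt(p(1-p)) -- the same function:
direct = np.sqrt(p * (1 - p))

assert np.allclose(pushforward, direct)
```

So A and C would reach the same conclusions after converting between parameterizations, which is exactly the invariance the flat prior lacks.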