TL;DR you can, but the result would strongly depend on your choice of prior.
With maximum likelihood, you would be maximizing the likelihood, which in this case is defined in terms of the probability mass function $f$ of the Bernoulli distribution, i.e. a binomial distribution with number of trials $n=1$, parametrized by the probability of success $\theta$
$$
\hat\theta = \underset{\theta}{\operatorname{arg\,max}} \; f(X|\theta)
$$
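If you want to see this numerically rather than taking the closed-form $\hat\theta = x/n$ for granted, here is a minimal sketch (assuming Python with NumPy and SciPy, which are not part of the original question) that maximizes the Bernoulli log-likelihood for your single observation:

```python
# Minimal sketch: numerical MLE for a Bernoulli parameter (assumes NumPy and SciPy).
import numpy as np
from scipy.optimize import minimize_scalar

X = np.array([1])  # a single observed success, as in the question

def neg_log_likelihood(theta):
    # Bernoulli log-likelihood: sum of x*log(theta) + (1-x)*log(1-theta)
    return -np.sum(X * np.log(theta) + (1 - X) * np.log(1 - theta))

result = minimize_scalar(neg_log_likelihood, bounds=(1e-9, 1 - 1e-9), method="bounded")
print(result.x)  # very close to 1.0, i.e. the closed-form estimate x/n = 1/1
```

For $x=1$, $n=1$ the optimum sits right at the boundary, which is exactly the degenerate $\hat\theta=1$ the question is worried about.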
In the Bayesian setting, what changes is that instead of looking for a point estimate of $\theta$, we learn the posterior distribution $\pi(\theta|X)$, starting from a prior distribution $\pi(\theta)$ for $\theta$
$$
\pi(\theta|X) \propto f(X|\theta)\,\pi(\theta)
$$
When calculating the maximum a posteriori point estimate, you would be maximizing the posterior probability
$$
\hat\theta = \underset{\theta}{\operatorname{arg\,max}} \; f(X|\theta) \,\pi(\theta)
$$
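As a minimal sketch (same assumed Python/SciPy setup as above, with a $\mathsf{Beta}(2,2)$ prior picked purely for illustration), the MAP estimate is just the maximizer of log-likelihood plus log-prior:

```python
# Minimal sketch: numerical MAP estimate under an illustrative Beta(2, 2) prior.
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import bernoulli, beta

X = np.array([1])               # the single observed success
a_prior, b_prior = 2.0, 2.0     # prior hyperparameters, chosen only for illustration

def neg_log_posterior(theta):
    log_lik = np.sum(bernoulli.logpmf(X, theta))       # log f(X | theta)
    log_prior = beta.logpdf(theta, a_prior, b_prior)   # log pi(theta)
    return -(log_lik + log_prior)

result = minimize_scalar(neg_log_posterior, bounds=(1e-9, 1 - 1e-9), method="bounded")
print(result.x)  # about 2/3, instead of the degenerate MLE of 1
```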
As you can see from the formula above, what changes is that we multiply the likelihood by the prior. In the case of the binomial distribution, if we choose a beta distribution as the prior, then there is a nice, closed-form solution. If as the prior we choose
$$
\theta \sim \mathsf{Beta}(\alpha, \beta)
$$
then the posterior distribution is
$$
\theta|X \sim \mathsf{Beta}(\alpha + x, \beta+ n-x)
$$
where $n=1$ is the number of trials and $x=1$ is the number of successes. So in your case, the mean of the posterior distribution is
$$
E[\theta|X] = \frac{\alpha + 1}{\alpha+1+\beta}
$$
For details, you can check the great What is the intuition behind beta distribution? thread. As you can see, choosing different prior parameters $\alpha$, $\beta$ would lead to different results and would have a significant impact on the final estimate. If you want to assume a priori that the probability is something close to $0.5$, you need to set $\alpha$ and $\beta$ to the same value. For example, setting $\alpha=\beta=1$ would give you $E[\theta|X] \approx 0.67$, while $\alpha=\beta=0.5$ would lead you to estimate it as $0.75$. This impact diminishes with growing sample size, but with a single sample it is quite profound. So using the Bayesian approach would enable you to estimate something more reasonable than $\hat\theta=\tfrac{1}{1}=1$, but how reasonable the estimate is depends on how reasonable your prior was.
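To make those numbers concrete, here is a minimal sketch (plain Python, no extra libraries) of the conjugate update above; the $\mathsf{Beta}(100,100)$ row is an extra, made-up prior thrown in to show how a strong prior keeps the estimate near $0.5$:

```python
# Minimal sketch: posterior mean of a Beta-Bernoulli model after one observed success.
def posterior_mean(alpha, beta, x, n):
    # Posterior is Beta(alpha + x, beta + n - x); its mean is (alpha + x) / (alpha + beta + n).
    return (alpha + x) / (alpha + beta + n)

x, n = 1, 1  # one trial, one success
for alpha, beta in [(1.0, 1.0), (0.5, 0.5), (100.0, 100.0)]:
    print(f"Beta({alpha:g}, {beta:g}) prior -> posterior mean {posterior_mean(alpha, beta, x, n):.3f}")
# Beta(1, 1)     -> 0.667
# Beta(0.5, 0.5) -> 0.750
# Beta(100, 100) -> 0.502
```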
As a sidenote, your example is not that uncommon. In fact, it is often the case that Bayesian estimators are used for calculating probabilities when we expect to see zero counts. This is commonly done when working with textual data, where we deal with counts of words. Obviously, some words occur very frequently, e.g. "and", "the", while others are pretty rare, e.g. "aardvark". Estimating probabilities for the common words is straightforward, but for rare words we would end up with $\tfrac{0}{n}$ as the estimated probabilities. When using algorithms like Naive Bayes, where we multiply the probabilities by each other, this would lead to zeroing out everything after plugging a single zero into the formula; that is why we use Laplace smoothing.
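For completeness, here is a minimal sketch of add-one (Laplace) smoothing on a made-up toy corpus; it is the multinomial analogue of the posterior-mean estimate above with a uniform prior on each word, so unseen words get a small nonzero probability instead of $0$:

```python
# Minimal sketch: Laplace (add-one) smoothing of word probabilities on a made-up toy corpus.
from collections import Counter

counts = Counter(["the", "the", "and", "aardvark"])   # toy word counts
vocabulary = ["the", "and", "aardvark", "zebra"]      # "zebra" was never observed
n = sum(counts.values())
V = len(vocabulary)

for word in vocabulary:
    unsmoothed = counts[word] / n               # 0/n for unseen words -> zeroes out Naive Bayes
    smoothed = (counts[word] + 1) / (n + V)     # always strictly positive
    print(f"{word:10s} unsmoothed={unsmoothed:.3f} smoothed={smoothed:.3f}")
```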