I am trying to use the typical Multinomial model:
$y_{i} \sim Multinomial(N, \theta_{1}, ...,\theta_{p})$
where each theta is defined as:
$\theta_{i}=\frac{e^{V_{i} }}{\sum_{j} e^{V_{j}}}$
I have an issue with the overflow of the exponentials. I know a solution is given here Softmax overflow which basically suggests to subtract the maximum $V$ from both the power of numerator and the denominator but then that also creates the underflow problem in my case and many thetas will become 0! Is there any better solution to correctly calculate the probabilities (thetas) without facing overflow and underflow?
Here is my data and what really happens when I want to use softmax and the logSumExp approach:
library(dplyr)
library(matrixStats)
df <- data.frame(id = seq(1,11,1),
real_probs = c(0.009589089, 0.007534532, 0.001860207, 0.05084681,
0.1630076, 0.04244464, 0.007857292, 0.3566669,
0.1793573, 0.1698827, 0.01095301),
V = c(2653.254, 2364.478, 521.0566, 14615.73,
46912.33, 12705.54, 1837.695, 102773.2,
51674.49, 48950.77, 3129.556))
df_new <- df %>%
mutate(total_V = sum(V),
pred_probs_without_softmax = V/total_V,
exp_V = exp(V), # To show the Overflow problem
exp_new_V = exp(V - max(V)), # To show the Underflow problem
logsumex = logSumExp(V),
pred_probs_with_softmax = exp(V - logsumex))