1

I'm trying to run R's maximum likelihood estimation function (stats4::mle), over a likelihood function in Free Shipping Is Not Free: A Data-Driven Model to Design Free-Shipping Threshold Policies.

Here is the likelihood function I'm trying to run mle over (equation $10$ from the paper):

$$\mathcal{L}(k_{1},\theta_{1},k_{2},\theta_{2} |\{ \textbf{baskets}, \boldsymbol{thresholds}\}) = \sum_{i} ^ n [f(basket_{i})(1 - p(theshold_{i}, basket_{i})) \cdot 1_{\{{basket_{i} < threshold_{i}}\}} + (f(basket_{i}) + g(basket_{i} - threshold_{i}) \int_{max(0, threshold_{i} - \frac{\alpha}{\beta})}^{threshold_{i}} p(threshold_{i}, x)f(x)dx) \cdot 1_{\{{basket_{i} \geq threshold_{i}}\}}]$$

I've modified the terms slightly, from $d_{i}$ to $basket_{i}$, and $\tau_{i}$ to $threshold_{i}$ just for my own understanding's sake.

Where $F(\cdot)$ and $G(\cdot)$ are observed by the authors to most closely conform to (independent) $Gamma$ distributions.

So $F(X;k_{1}, \theta_{1})$, and $G(Y;k_{2}, \theta_{2})$

As well, $\alpha \in [0,1], \beta > 0$

And finally, $p(threshold_{i}, basket_{i}) = max(0, \alpha - \beta(threshold_{i} - basket_{i}))$

The likelihood function does not look like it can be solved analytically (take derivatives w.r.t. the parameters and set to zero), so it looks like I'm forced to do this numerically.

Though I'm nowhere near a mathematical expert, I have no idea how the authors successfully performed MLE over that likelihood function, as I am encountering errors left and right with it.

The current (and probably wrong) implementation of the likelihood function I have so far is thus:

probPad <- function(basket, threshold, alpha, beta) { # p(threshold, basket) in the likelihood function
  max(0, alpha - beta * (threshold - basket))
}

integrand <- function(x, threshold, shape, scale, alpha, beta) {
  probPad(x, threshold, alpha, beta) * dgamma(x, shape = shape, scale = scale)
}

likelihoodFunction <- function(shape_1, scale_1, shape_2, scale_2, alpha, beta) {
  
  set.seed(2022)
  baskets <- sample(Baskets, 1e3)
  
  thresholds <- 100
  
  likelihoods <- c()
  
  for(i in seq_along(baskets)) {
  
  min_val <- max(0, thresholds - (alpha/beta))
  
  max_val <- thresholds
  
  if(baskets[i] < thresholds) {

    likelihoods[i] <-
      dgamma(baskets[i], shape = shape_1, scale = scale_1) * 
      (1 - probPad(baskets[i], thresholds, alpha, beta))

  } else {
    
    likelihoods[i] <-
        dgamma(baskets[i], shape = shape_1, scale = scale_1) + 
        dgamma((baskets[i] - thresholds), shape = shape_2, scale = scale_2) * 
        integrate(
          integrand,
          min_val,
          max_val,
          threshold = thresholds,
          shape = shape_1,
          scale = scale_1,
          alpha = alpha,
          beta = beta
        )$value
    }
  }
  
  return(-sum(likelihoods))
}

The problem is whenever trying to increase the size of the sampled $\boldsymbol{baskets}$ vector beyond ~ 130 observations, I always encounter the dreaded L-BFGS-B needs finite values of 'fn' error.

From other posts encountering this error, it seems to be because the box constraints of the estimated parameters not being well-behaved values (i.e. $0$ in divide-by-zero cases, $Inf$ causing divergence, etc.).

However, I do not think my starting values, or my box constraints, exhibit this issue.

Here are the starting and box constraints values for maximum likelihood estimation:

LL <- stats4::mle(
  likelihoodFunction,
  start = list(
    alpha = 0.3,
    beta = 0.03,
    shape_1 = 1.77,
    scale_1 = 2.44,
    shape_2 = 1.36,
    scale_2 = 2.69 
    ),
  lower = list(
    alpha = 1e-3, 
    beta = 1e-4,
    shape_1 = 1e-3, 
    scale_1 = 1e-3, 
    shape_2 = 1e-3, 
    scale_2 = 1e-3
    ),
  upper = list(
    alpha = 1, 
    beta = 1, 
    shape_1 = 10,
    scale_1 = 10,
    shape_2 = 10,
    scale_2 = 10
    ),
  method = "L-BFGS-B"
)

alpha_optim <- LL@coef[["alpha"]]
beta_optim <- LL@coef[["beta"]]
f_k <- LL@coef[["shape_1"]]
f_theta <- LL@coef[["scale_1"]]
g_k <- LL@coef[["shape_2"]]
g_theta <- LL@coef[["scale_2"]]

LL

I'm wondering what part of the objective function is throwing this error, or if it's returning a value larger than R's Machine$double.xmax.

If I can't resolve this, I think I'll have to resort to some super slow, brute-force grid search method; providing each parameter a vector of values to iterate over, and find the highest likelihood from those combination of values.


References:

Cachon, Gerard P and Gallino, Santiago and Xu, Joseph, Free Shipping Is Not Free: A Data-Driven Model to Design Free-Shipping Threshold Policies. (September 17, 2018). Available at SSRN: https://ssrn.com/abstract=3250971 or http://dx.doi.org/10.2139/ssrn.3250971

nmck160
  • 21
  • 3

0 Answers0