If I understand you correctly, you want to sample $x_1,\dots,x_k$ values from a multinomial distribution with probabilities $p_1,\dots,p_k$ such that $\sum_i x_i = n$, but you want the distribution truncated so that $a_i \le x_i \le b_i$ for all $i$.
I see three solutions (none as elegant as in the non-truncated case):
- Accept-reject. Sample from the non-truncated multinomial, accept the sample if it fits the truncation boundaries, otherwise reject it and repeat the process. It is fast, but can be very inefficient (see the quick check after the code below).
rtrmnomReject <- function(R, n, p, a, b) {
  x <- t(rmultinom(R, n, p))  # R x k matrix of raw multinomial draws
  # keep only the rows that respect the truncation bounds
  x[apply(a <= x & x <= b, 1, all) & rowSums(x) == n, , drop = FALSE]
}
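To see how quickly this gets wasteful, here is a quick check; the parameters below are made up purely for illustration (they are not from the question) and are chosen so that the bounds are satisfied only rarely.
set.seed(123)
n <- 500
p <- c(1,5,2,4,3)/15
# only a small fraction of the raw draws survives these bounds
kept <- rtrmnomReject(1e4, n, p, a = 40, b = 150)
nrow(kept) / 1e4  # estimated acceptance rate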
- Direct simulation. Sample in a fashion that resembles the data-generating process, i.e. draw a single marble from a random urn and repeat the process until you have sampled $n$ marbles in total, but once you deplete all the marbles in a given urn (i.e. $x_i$ reaches $b_i$), stop drawing from that urn. I implemented this in the script below (a short sanity check follows it).
# single draw from a truncated multinomial with truncation points a, b
# (a and b are vectors of length k = length(p))
rtrmnomDirect <- function(n, p, a, b) {
  k <- length(p)
  repeat {
    pp <- p          # reset pp
    x <- numeric(k)  # reset x
    repeat {
      if (sum(x < b) == 1) {              # if only a single category is left
        x[x < b] <- x[x < b] + n-sum(x)   # fill this category with the remainder
        break
      }
      i <- sample.int(k, 1, prob = pp)    # sample x[i]
      x[i] <- x[i] + 1
      if (x[i] == b[i]) pp[i] <- 0        # if x[i] is filled, do not
                                          # sample from it any more
      if (sum(x) == n) break              # if we picked n marbles, stop
    }
    if (all(x >= a)) break                # if all x >= a, the sample is valid
    # otherwise reject it and start over
  }
  return(x)
}
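A quick sanity check of a single draw (the bounds are picked arbitrarily for illustration): it should sum to $n$ and stay within the truncation points.
set.seed(123)
x <- rtrmnomDirect(500, c(1,5,2,4,3)/15, a = rep(20, 5), b = rep(150, 5))
sum(x)                   # should be 500
all(x >= 20 & x <= 150)  # should be TRUE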
- Metropolis algorithm. Finally, the third and most efficient approach is to use the Metropolis algorithm. The algorithm is initialized by direct simulation (but can be initialized differently) to draw the first sample $X_1$. In the following steps, iteratively, a proposal value $y = q(X_{i-1})$ is accepted as $X_i$ with probability $\min\{1,\, f(y)/f(X_{i-1})\}$, otherwise the $X_{i-1}$ value is taken in its place, where $f(x) \propto \prod_i p_i^{x_i}/x_i!$. As a proposal I used a function $q$ that takes the $X_{i-1}$ value, randomly picks between 0 and `step` cases from one category and moves them to another category.
# draw R values
# 'step' parameter defines the magnitude of jumps
# for the Metropolis algorithm
# 'init' is a vector of values to start with
rtrmnomMetrop <- function(R, n, p, a, b,
                          step = 1,
                          init = rtrmnomDirect(n, p, a, b)) {
  k <- length(p)
  if (length(a) == 1) a <- rep(a, k)
  if (length(b) == 1) b <- rep(b, k)
  # unnormalized target log-density
  lp <- log(p)
  lf <- function(x) {
    if (any(x < a) || any(x > b) || sum(x) != n)
      return(-Inf)
    sum(lp*x - lfactorial(x))
  }
  step <- max(2, step+1)
  # proposal function: move u cases from one category to another
  q <- function(x) {
    idx <- sample.int(k, 2)
    u <- sample.int(step, 1) - 1
    x[idx] <- x[idx] + c(-u, u)
    x
  }
  tmp <- init
  x <- matrix(nrow = R, ncol = k)
  ar <- 0
  for (i in 1:R) {
    proposal <- q(tmp)
    prob <- exp(lf(proposal) - lf(tmp))
    if (runif(1) < prob) {  # accept the proposal
      tmp <- proposal
      ar <- ar + 1
    }
    x[i, ] <- tmp
  }
  structure(x, acceptance.rate = ar/R, step = step-1)
}
The algorithm starts at $X_1$ and then wanders around different regions of the distribution. It is obviously faster than the previous ones, but you need to remember that if you use it to sample a small number of cases, you could end up with draws that are close to each other (autocorrelated). Another problem is that you need to decide on the step size, i.e. how big the jumps the algorithm makes should be -- too small may lead to moving slowly, too big may lead to making too many invalid proposals and rejecting them (a quick acceptance-rate check follows the example below). You can see an example of its usage below. On the plots you can see: marginal densities in the first row, traceplots in the second row, and plots showing subsequent jumps for pairs of variables.
n <- 500
a <- 50
b <- 125
p <- c(1,5,2,4,3)/15
k <- length(p)
x <- rtrmnomMetrop(1e4, n, p, a, b, step = 15)
cmb <- combn(1:k, 2)
par.def <- par(mfrow=c(4,5), mar = c(2,2,2,2))
for (i in 1:k)
  hist(x[,i], main = paste0("X", i))
for (i in 1:k)
  plot(x[,i], main = paste0("X", i), type = "l", col = "lightblue")
for (i in 1:ncol(cmb))
  plot(jitter(x[,cmb[1,i]]), jitter(x[,cmb[2,i]]),
       type = "l", main = paste(paste0("X", cmb[,i]), collapse = ":"),
       col = "gray")
par(par.def)
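If you are unsure about the step size, you can compare acceptance rates for a few candidate values first. This is only an illustrative check (the candidate values are arbitrary), reusing n, p, a and b from the example above.
# acceptance rate for several step sizes
sapply(c(1, 5, 15, 30), function(s)
  attr(rtrmnomMetrop(2000, n, p, a, b, step = s), "acceptance.rate"))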

The problem with sampling from this distribution is that, in general, it describes a very inefficient sampling strategy. Imagine that $p_1 \ne \dots \ne p_k$ and $a_1 = \dots = a_k$, $b_1 = \dots = b_k$, with the $a_i$'s close to the $b_i$'s; in such a case you want to sample categories with different probabilities, yet you expect similar frequencies in the end. In the extreme case, imagine a two-category distribution where $p_1 \gg p_2$ and $a_1 \ll a_2$, $b_1 \ll b_2$; in such a case you are expecting a very rare event to happen (a real-life example of such a distribution would be a researcher who repeats sampling until he finds a sample that is consistent with his hypothesis, so it has more to do with cheating than with random sampling).
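To make the two-category example concrete (the numbers are invented purely for illustration): with $n = 100$, $p_2 = 0.1$ and a lower bound forcing $x_2 \ge 50$, the chance that an unrestricted multinomial draw satisfies the bound is astronomically small, which is why plain accept-reject breaks down completely here.
# P(X2 >= 50) when X2 ~ Binomial(100, 0.1)  (with k = 2, x2 is binomial)
pbinom(49, size = 100, prob = 0.1, lower.tail = FALSE)
# roughly 6e-24, i.e. on the order of 10^23 raw draws per accepted sample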
The distribution is much less problematic if you define it as Rukhin (2007, 2008) does, where you sample $np_i$ cases for each category, i.e. you sample proportionally to the $p_i$'s.
Rukhin, A. L. (2007). Normal order statistics and sums of geometric random variables in treatment allocation problems. Statistics & Probability Letters, 77(12), 1312-1321.
Rukhin, A. L. (2008). Stopping Rules in Balanced Allocation Problems: Exact and Asymptotic Distributions. Sequential Analysis, 27(3), 277-292.