
I have the following intuitive problem:

According to Wikipedia, the log-likelihood for Poisson regression is formulated as: \begin{equation} l(\theta \mid X, Y) = \sum_{i = 1}^n y_i\theta^Tx_i - e^{\theta^T x_i}- \log(y_i) \end{equation} However, somewhere on the internet I found that you can use Poisson regression with only the observed counts $y_1, y_2, \ldots, y_n$, i.e. estimate its parameters from, for example, the sample $y = (1, 1, 2, 2, 4, 6)$. But I don't understand how, since there is no $x_i$ to include in the maximum likelihood estimation. In this case $E[Y \mid X]$ does not even make sense, since there is no $X$.

Am I correct in my way of thinking? Is there another Poisson regression that can do these things, or am I missing something?

John
  • There are *always* $x_i,$ because the first column is assumed to contain constant nonzero values. – whuber Dec 26 '21 at 16:49
  • How exactly does it work in my example, then? When I say that I have a sample of counts $y = (1, 1, 2, 2, 4, 4, 6)$ and I'm asked to estimate the parameters, what exactly is $X$ and what exactly is $Y$? – John Dec 26 '21 at 16:51
  • And, for example, how does it work when I'm working on contingency tables? I don't quite understand what the independent and dependent variables are in those tables. – John Dec 26 '21 at 16:52
  • In the first instance, $X=(1,1,1,1,1,1,1)^\prime$ and $\theta$ is a $1\times 1$ matrix. In the second instance, study "dummy coding" and regression with categorical explanatory variables. (Your question is not specific to Poisson regression.) – whuber Dec 26 '21 at 16:54
  • Hmmm... okay, I understand. Could you please send me something where I can read more about the answer to the first question? I don't quite get the intuition that when we have a sample of counts, by default we consider an intercept-only model. – John Dec 26 '21 at 17:05
  • I'm just wondering if it's common practice that when you want to model counts you just use an intercept. – John Dec 26 '21 at 17:21
  • That's equivalent to averaging them. See [posts about estimation of Poisson parameters.](https://stats.stackexchange.com/search?q=estimat*+poisson+count+answers%3A1) There's a good answer (to the highest-voted post) at https://stats.stackexchange.com/a/72015/919. – whuber Dec 26 '21 at 17:25
  • The formula that you post is more general and includes an $x$ which relates to the independent variable(s) in the regression model. See https://en.m.wikipedia.org/wiki/Dependent_and_independent_variables#Statistics – Sextus Empiricus Dec 26 '21 at 18:49
  • You have not correctly copied the log-likelihood - https://wikimedia.org/api/rest_v1/media/math/render/svg/8b8448603e01485da8fdab8fa096bbde8172695f – Glen_b Dec 27 '21 at 01:13

1 Answer


In the comments, you provide an example dataset $y = (1,1,2,2,4,4,6)$. You provide no covariates (which is absolutely fine). Let me explain why.

For the case where we do have covariates, the generative model is

$$ y_i \sim \mbox{Poisson}(\lambda_i)$$

$$ \log(\lambda_i) = \beta_0 + \sum_j \beta_j x_{j, i} $$

Here, each $y_i$ is distributed according to a Poisson distribution with parameter $\lambda_i$, which depends on the covariates. If there are no covariates, note that we still have the intercept term $\beta_0$. Hence, the model for this case would be

$$ y_i \sim \mbox{Poisson}(\lambda_i)$$

$$ \log(\lambda_i) = \beta_0 $$

The expression for $\log(\lambda)$ is often written as a matrix computation $\log(\lambda) = \mathbf{X} \beta$. In this case, $\beta$ and $\lambda$ are vectors and $\mathbf{X}$ is a matrix. To account for the intercept, we understand one column of $\mathbf{X}$ to always consist of 1s. This is why, for example, R's model.matrix returns a column of 1s in the matrix whenever the formula includes an intercept (which it does by default).
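
As a quick illustration (a minimal sketch, using the counts from the comments), an intercept-only formula produces a design matrix that is nothing but that column of 1s:

```r
# Counts from the comments, used purely for illustration
y <- c(1, 1, 2, 2, 4, 4, 6)

# Intercept-only formula: the design matrix is a single column of 1s
# named "(Intercept)", one row per observation
model.matrix(~ 1, data = data.frame(y = y))
```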

So, even when we do not observe any covariates directly, we can always consider $\mathbf{X}$ in the expression for the model to be a single column of 1s so that we can estimate a single parameter, namely the intercept.
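
Plugging that single column of 1s into the log-likelihood, the only parameter left is $\beta_0$, and setting the derivative with respect to $\beta_0$ to zero gives

$$ \frac{\partial l}{\partial \beta_0} = \sum_{i=1}^n \left( y_i - e^{\beta_0} \right) = 0 \quad\Longrightarrow\quad e^{\hat\beta_0} = \frac{1}{n}\sum_{i=1}^n y_i = \bar y, $$

which is exactly the "averaging" mentioned in the comments. A minimal sketch in R, again using the counts from the comments, confirms this:

```r
y <- c(1, 1, 2, 2, 4, 4, 6)

# Intercept-only Poisson regression: glm() builds the column of 1s itself
fit <- glm(y ~ 1, family = poisson())

# The fitted rate exp(beta_0-hat) is just the sample mean of the counts
exp(coef(fit))  # (Intercept) ~ 2.857
mean(y)         # 20/7 ~ 2.857
```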

Demetri Pananos