In a hierarchical model of data $y$ where $$y \sim \textrm{Poisson}(\lambda)$$ $$\lambda \sim \textrm{Gamma}(\alpha, \beta)$$ it appears to be typical in practice to choose values $(\alpha, \beta)$ such that the mean and variance of the gamma distribution roughly match the mean and variance of the data $y$ (e.g., Clayton and Kaldor, 1987, "Empirical Bayes Estimates of Age-Standardized Relative Risks for Use in Disease Mapping," Biometrics). This is clearly an ad hoc solution, though: it overstates the researcher's confidence in the parameters $(\alpha, \beta)$, and small fluctuations in the realized data could have large consequences for the gamma density even if the underlying data-generating process remains the same.
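To make the moment-matching step concrete, here is a minimal sketch of it, assuming the shape/rate parameterization of the gamma distribution (mean $\alpha/\beta$, variance $\alpha/\beta^2$). The function name `match_gamma_moments` and the counts are made up for illustration; this is not taken from Clayton and Kaldor.

```python
import statistics


def match_gamma_moments(y):
    """Return (alpha, beta) so that Gamma(shape=alpha, rate=beta) has the
    same mean and variance as the sample y.

    Assumption: shape/rate parameterization, i.e. mean = alpha / beta and
    variance = alpha / beta**2.
    """
    m = statistics.mean(y)
    v = statistics.variance(y)  # sample variance (n - 1 denominator)
    if m <= 0 or v <= 0:
        raise ValueError("need a positive sample mean and variance")
    beta = m / v
    alpha = m * beta  # equals m**2 / v
    return alpha, beta


# Made-up counts, for illustration only.
y = [2, 4, 6, 8]
alpha, beta = match_gamma_moments(y)
```

Because $\alpha = \bar{y}^2 / s^2$ and $\beta = \bar{y} / s^2$, both values depend on the sample variance in the denominator, which is one way to see why small changes in the realized data can move the fitted gamma density noticeably.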
Furthermore, in Bayesian Data Analysis (2nd ed.), Gelman writes that this method is "sloppy"; in the book and in this paper (starting p. 3232), he instead suggests choosing some hyperprior density $p(\alpha, \beta)$, in a fashion similar to the rat tumors example (starting p. 130).
Although it's clear that any $p(\alpha, \beta)$ is admissible so long as it yields a proper (integrable) posterior density, I have not found any examples of hyperprior densities that researchers have used for this problem in the past. I would greatly appreciate it if someone could point me to books or articles that have employed a hyperprior density to estimate a Poisson-Gamma model. Ideally, I am interested in a $p(\alpha, \beta)$ that is relatively flat and would be dominated by the data, as in the rat tumors example, or in a discussion comparing several alternative specifications and the trade-offs associated with each.
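For reference, this is roughly the computation I have in mind, as a sketch under stated assumptions: each observation $y_i$ has its own $\lambda_i$, there are no exposure terms, and the gamma uses the shape/rate parameterization. Integrating out $\lambda_i$ gives a negative binomial marginal for $y_i$, so the unnormalized log posterior of $(\alpha, \beta)$ can be evaluated on a grid, as in the rat tumors example. The hyperprior in `log_hyperprior` (independent exponentials with a small rate) is a hypothetical placeholder and not a recommendation from any reference; the data are made up.

```python
import math


def log_marginal(y, alpha, beta):
    """Log pmf of y after integrating lambda out of
    y ~ Poisson(lambda), lambda ~ Gamma(shape=alpha, rate=beta).
    This is a negative binomial distribution."""
    return (
        math.lgamma(alpha + y) - math.lgamma(alpha) - math.lgamma(y + 1)
        + alpha * math.log(beta / (beta + 1.0))
        - y * math.log(beta + 1.0)
    )


def log_hyperprior(alpha, beta, rate=0.01):
    """Hypothetical placeholder: independent Exponential(rate) priors on
    alpha and beta, up to an additive constant. Swap in any other
    p(alpha, beta) here."""
    return -rate * (alpha + beta)


def log_posterior(alpha, beta, ys):
    """Unnormalized log posterior of (alpha, beta) given counts ys."""
    if alpha <= 0 or beta <= 0:
        return -math.inf
    return log_hyperprior(alpha, beta) + sum(
        log_marginal(y, alpha, beta) for y in ys
    )


# Made-up counts, for illustration only.
ys = [0, 3, 1, 7, 2, 4, 0, 5]

# Evaluate on a coarse grid and locate the grid point with highest density.
grid = [0.1 * k for k in range(1, 101)]
best_alpha, best_beta = max(
    ((a, b) for a in grid for b in grid),
    key=lambda ab: log_posterior(ab[0], ab[1], ys),
)
```

Only `log_hyperprior` changes when comparing alternative specifications, which is why I am mainly looking for guidance on that one function.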