I know there is an R spatstat function to generate a ppp (Poisson Point Process), but I'm working in Python, and I am not clear on what spatstat.ppp is doing behind the scenes.

If I generate an array of ordered pairs of random x's and random y's (within the spatial extent), will that be a ppp, and if not, how will it differ from one? Should the x's and y's be chosen based on a Poisson distribution (np.random.poisson)?

I've read the spatstat documentation on ppp and the Wikipedia pages on the Poisson point process and the Poisson distribution, but I just have no knack for statistics. Is it the number of events that is random? Or where they fall in space?

Thanks for any help you can give.
(ps -- this problem is entirely spatial, there is no time dimension.)

J Kelly
  • I don't see a `spatstat` function in R. There's a *package* called "spatstat". It seems to have a function called `ppp` (`spatstat::ppp`), though (and a class called *ppp*). What other functions are you calling? Have you checked the spatstat vignettes or the code? – Glen_b Feb 05 '14 at 00:19
  • Sorry, my bad grammar, using spatstat as an adjective. I meant it in the way one would say an R function. I guess I have read the vignette; how can one see the code? – J Kelly Feb 05 '14 at 03:12
  • http://stackoverflow.com/search?q=[r]+see+source+code -- several different ways for different things (it's a bit different for S3 and S4 classes for example); plus anything on CRAN should have the actual source code it was built from. Actually, here you go, I [just went and found the tarball](http://cran.r-project.org/src/contrib/spatstat_1.35-0.tar.gz) – Glen_b Feb 05 '14 at 04:19
  • Wow, thanks, that is going to be interesting reading. I looked at ppp.R and it seems to have no random component at all. It seems to take any set of x,y you want to give it, even a perfectly regular grid. So why "Poisson"? Maybe it doesn't matter how my random points are distributed. – J Kelly Feb 05 '14 at 18:58
  • I gather the point is that it's *not simulation* or any other form of creating data at all. It doesn't ever consider whether the data is Poisson. Its purpose is simply to take point-pattern data (however obtained) and turn it into a *ppp*-object (on which other functions in spatstat can then operate). It's no more responsible for what data you give it than `data.frame` is, and its purpose is analogous to a call to `data.frame` or any other function to turn data into an object of a particular class. – Glen_b Feb 05 '14 at 23:23
  • The Poisson part has nothing to do with the coordinates of the points. Instead, it has to do with the distribution of *counts* of points contained within a subset of your space. Let's say you've generated points over a unit square using a Poisson Point Process. Next, you randomly choose a sub-area of the square--say a smaller square that is of length 1/100 per side. You count the points that fall in that random sub-square. If you repeat this (randomly selecting a different location for your sub-square and counting the number of points), you will find that the distribution of counts is Poisson. – adparker Apr 18 '14 at 22:33
  • @adparker "Nothing to do with" seems a little too strong to me. An equivalent formulation of a Poisson point process is that the points are independent and have a uniform distribution over the region. It is *called* a Poisson process for the reason you give. (PS I edited your comment to make it fit the space available. If any of those changes are objectionable, just flag it for moderator attention and we can fix or delete it.) – whuber Apr 18 '14 at 22:39
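The two descriptions in the comments above (Poisson counts in a sub-region, and independent uniform locations) can be checked numerically. The following is only a rough sketch, not anything taken from spatstat: it simulates many independent patterns on the unit square and counts the points in one fixed sub-square, which is a slight variation on the random sub-square described above. The intensity `lam`, the sub-square side `side`, and the number of realizations are arbitrary values chosen for illustration. If the counts are Poisson, their mean and variance should both be close to `lam * side**2`.

```python
import numpy as np

rng = np.random.default_rng(42)

lam = 2000.0           # assumed intensity: expected points per unit area
side = 0.1             # assumed side length of the sub-square
n_realizations = 2000  # number of independent simulated patterns

counts = np.empty(n_realizations, dtype=int)
for i in range(n_realizations):
    # One realization: Poisson number of points, uniform locations.
    n = rng.poisson(lam)
    xy = rng.uniform(0.0, 1.0, size=(n, 2))
    # Count the points falling in the sub-square [0, side) x [0, side).
    inside = (xy[:, 0] < side) & (xy[:, 1] < side)
    counts[i] = inside.sum()

expected = lam * side**2  # Poisson mean (and variance) for the sub-square
print("mean count:", counts.mean())
print("variance of counts:", counts.var(ddof=1))
print("expected for a Poisson count:", expected)
```

A mean and variance that roughly agree is the signature of a Poisson count; a clustered pattern would typically show a variance well above the mean.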

1 Answer

The number of points in any given region is Poisson distributed, with mean equal to the integral of the rate function over that region. For example, if you have a homogeneous PP on the unit square, with rate function $\lambda(x,y)=\lambda$ (i.e. constant because it's homogeneous) then you could sample points as

  1. Sample $N \sim \mathrm{Poisson}(\lambda)$
  2. Sample $N$ points uniformly on the unit square, i.e. $x, y \overset{iid}{\sim} U[0,1]$
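
In Python, the two steps above might be sketched as follows. This is a minimal illustration using NumPy, not a port of anything in spatstat; the function name `sample_homogeneous_ppp` and its parameters are made up for this example. It is written for a general rectangle, where the Poisson mean becomes $\lambda$ times the area (which reduces to $\lambda$ on the unit square).

```python
import numpy as np

def sample_homogeneous_ppp(lam, xmin=0.0, xmax=1.0, ymin=0.0, ymax=1.0, rng=None):
    """Sample a homogeneous Poisson point process on a rectangle.

    lam is the constant intensity (expected number of points per unit area).
    Returns an array of shape (N, 2) holding the (x, y) coordinates.
    """
    if rng is None:
        rng = np.random.default_rng()
    area = (xmax - xmin) * (ymax - ymin)
    # Step 1: the NUMBER of points is Poisson with mean lam * area.
    n = rng.poisson(lam * area)
    # Step 2: the LOCATIONS are independent and uniform over the rectangle.
    x = rng.uniform(xmin, xmax, size=n)
    y = rng.uniform(ymin, ymax, size=n)
    return np.column_stack([x, y])

rng = np.random.default_rng(0)
points = sample_homogeneous_ppp(100.0, rng=rng)
print(points.shape)  # the first dimension varies from run to run
```

Note that `np.random.poisson` (or `rng.poisson`) is used only for the count; the coordinates themselves are uniform, not Poisson.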
daknowles
  • Sorry, I don't understand your answer. I am not trying to be difficult or dense, but how is the "number of points in a given region Poisson distributed"? I mean, there is an infinite assortment of ways points can be in a region. My intuition says that some of those ways would be Poisson distributed and some wouldn't be. You could maybe help by explaining why we don't treat the points in, say, 2D space as bi-variate normally distributed, for example. – J Kelly Feb 05 '14 at 03:23
  • You could make it so the points are bivariate normal distributed by using a rate function proportional to the bivariate normal density. But how would you know how many of these points to sample? By first sampling that number from a Poisson with mean equal to the integral of your rate function. What rate function and spatial extent do you have? – daknowles Feb 06 '14 at 03:41
  • As an exercise, constant rate, though it's not. I have 1000 measured data points, clustered along rivers, so there is an underlying weighted surface. I'm trying to write a script like spatstat.Gest: compare a CDF of the nearest nbr distances with that of a random set of pts, but my "random" x,y's may not be random in the right (Poisson) way. I just can't locate the Poisson randomness; if x and y were Poisson distributed, wouldn't they cluster around the centroid? The distribution must be around the mean intensity at each pixel, but it's a slippery concept; I can't pin it down. – J Kelly Feb 06 '14 at 16:51
  • You seem to be interested in an *inhomogeneous* (spatial) Poisson process. These are surprisingly easy to generate using appropriate representations of the intensity function: a raster representation is best. – whuber Feb 06 '22 at 17:06
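
One possible way to simulate an inhomogeneous Poisson process from a raster is sketched below. It may not be the exact representation the last comment has in mind. It assumes the intensity is constant within each raster cell, draws an independent Poisson count per cell with mean equal to intensity times cell area, and scatters that many points uniformly inside the cell. The function name `sample_inhomogeneous_ppp` and the toy intensity grid are hypothetical.

```python
import numpy as np

def sample_inhomogeneous_ppp(intensity, xmin, xmax, ymin, ymax, rng):
    """Sample a Poisson process whose intensity is piecewise constant on a raster.

    intensity is a 2D array (rows index y, columns index x) giving the
    expected number of points per unit area in each cell.
    Returns an array of shape (N, 2) holding the (x, y) coordinates.
    """
    intensity = np.asarray(intensity, dtype=float)
    ny, nx = intensity.shape
    dx = (xmax - xmin) / nx
    dy = (ymax - ymin) / ny
    # Independent Poisson count per cell, mean = intensity * cell area.
    counts = rng.poisson(intensity * dx * dy)
    # Repeat each occupied cell's indices once per point it contains.
    rows, cols = np.nonzero(counts)
    reps = counts[rows, cols]
    rows = np.repeat(rows, reps)
    cols = np.repeat(cols, reps)
    # Place each point uniformly within its own cell.
    x = xmin + (cols + rng.uniform(size=cols.size)) * dx
    y = ymin + (rows + rng.uniform(size=rows.size)) * dy
    return np.column_stack([x, y])

# Toy raster: no points expected in the left half, 500 per unit area in the right half.
grid = np.zeros((10, 10))
grid[:, 5:] = 500.0
rng = np.random.default_rng(1)
pts = sample_inhomogeneous_ppp(grid, 0.0, 1.0, 0.0, 1.0, rng)
print(pts.shape)
```

For data clustered along rivers, the grid would instead hold an estimated intensity surface; how to estimate that surface is a separate question that this sketch does not address.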