Choice of weight function in Moran's I

Question

I'm doing an autocorrelation analysis for a spatially distributed collection of observations. To perform my analysis, I am using Moran's I statistic.

My questions are: (1) What are the implications and benefits of using different weighting functions, i.e. $d^{-1}$, $d^{-2}$, $\exp(-d)$, and (2) Is there any (perhaps informal) answer to which of the possible weighting functions is used most frequently in the geo-statistics literature (and for what purposes)?

As for why I care: I am trying to explore whether there is clustering in my data set at different scales of structure, following some of the methodology of Fauchald 2000. I am plotting Moran's I versus aggregation scale. The interesting thing is that the resulting correlation curves show very different qualitative behavior when calculated using $d^{-1}$ and $d^{-2}$ weighting functions ($d^{-1}$ has a discontinuity point, for example). I'm having a hard time understanding why this would be true. Does anyone have experience with this who may be able to point me to some references?

b_dev · Answer 1 · 2011-05-03T08:46:51.997

Moran's I statistic is used to explore a specific type of spatial clustering: whether high values are located in proximity to other high values and whether low values are located in proximity to other low values.

The trick, then, is first to get a sense of what you mean by proximity and second to formulate it mathematically. This idea of proximity will depend on what type of observations (attributes) you are working with and what type of questions you have in mind.
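For reference, this is the standard definition of the statistic for $n$ observations $x_1, \dots, x_n$ with mean $\bar{x}$, written in the usual textbook notation rather than anything specific to your data. The weight $w_{ij}$ is where your notion of proximity between observations $i$ and $j$ enters:

$$I = \frac{n}{S_0} \cdot \frac{\sum_{i}\sum_{j} w_{ij}\,(x_i - \bar{x})(x_j - \bar{x})}{\sum_{i} (x_i - \bar{x})^2}, \qquad S_0 = \sum_{i}\sum_{j} w_{ij}$$

Everything about proximity is carried by the weights; the rest is a normalized cross-product of deviations from the mean.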

For example, for human beings proximity could mean the distance needed to have a chat. So, if you wanted to know whether high income people like to chat with other high income people at your cocktail party, you could formulate proximity using binary weights, where the weight is 1 when 2 people are within 3 feet of each other and 0 otherwise. To see whether house prices are spatially correlated, you could define proximity as 2 houses being neighbors, or being on the same block, or being within sight of one another, and so on.
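As a rough illustration of how such a definition turns into numbers, here is a minimal NumPy sketch. It is not taken from any particular package: the function names, the toy coordinates, and the toy values are all made up for the example. It computes Moran's I on the same data under a binary threshold rule and under the three distance-decay functions mentioned in the question.

```python
import numpy as np

def pairwise_distances(coords):
    """Euclidean distance between every pair of points (n x n matrix)."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def binary_weights(dist, threshold):
    """w_ij = 1 if two distinct points are within `threshold`, else 0."""
    w = (dist <= threshold).astype(float)
    np.fill_diagonal(w, 0.0)  # a point is not its own neighbor
    return w

def decay_weights(dist, kind):
    """Distance-decay weights; assumes no two points coincide (d > 0)."""
    off_diag = ~np.eye(len(dist), dtype=bool)
    w = np.zeros_like(dist)
    if kind == "inverse":            # d^-1
        w[off_diag] = 1.0 / dist[off_diag]
    elif kind == "inverse_square":   # d^-2
        w[off_diag] = 1.0 / dist[off_diag] ** 2
    elif kind == "exponential":      # exp(-d)
        w[off_diag] = np.exp(-dist[off_diag])
    else:
        raise ValueError(f"unknown kind: {kind}")
    return w

def morans_i(values, weights):
    """Moran's I: (n / sum of weights) * (z' W z) / (z' z), z = deviations from mean."""
    z = values - values.mean()
    return (len(values) / weights.sum()) * (z @ weights @ z) / (z @ z)

# Hypothetical toy data: six points on a line, low values then high values.
coords = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0],
                   [3.0, 0.0], [4.0, 0.0], [5.0, 0.0]])
values = np.array([1.0, 1.0, 1.0, 5.0, 5.0, 5.0])
dist = pairwise_distances(coords)

results = {"binary (d <= 1)": morans_i(values, binary_weights(dist, 1.0))}
for kind in ("inverse", "inverse_square", "exponential"):
    results[kind] = morans_i(values, decay_weights(dist, kind))

for name, value in results.items():
    print(f"{name}: {value:.3f}")
```

The same observations give a different value of I under each weighting, which is why the choice of weights should follow from what you mean by proximity rather than be treated as a technical afterthought.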

Basically, you need a hypothesis of proximity that is based on some of your prior common sense ideas or expert knowledge of why 2 objects that are close to one another are more associated than 2 objects that are far from one another.

Moran's I can then be seen as a test of your hypothesis of how your notion of proximity structures high values next to one another on the landscape.
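One common way to carry out such a test is a permutation test: shuffle the observed values across the locations many times and see how often the shuffled data produce an I at least as large as the observed one. The sketch below is only an illustration under assumed inputs: the chain of 20 locations, the binary adjacency weights, and the helper names are all hypothetical.

```python
import numpy as np

def morans_i(values, weights):
    """Moran's I for a vector of values and an n x n weight matrix."""
    z = values - values.mean()
    return (len(values) / weights.sum()) * (z @ weights @ z) / (z @ z)

def permutation_pvalue(values, weights, n_perm=999, seed=0):
    """One-sided pseudo p-value for positive spatial autocorrelation."""
    rng = np.random.default_rng(seed)
    observed = morans_i(values, weights)
    # Reassign values to locations at random, keeping the weights fixed.
    permuted = np.array([morans_i(rng.permutation(values), weights)
                         for _ in range(n_perm)])
    p_value = (1 + np.sum(permuted >= observed)) / (n_perm + 1)
    return observed, p_value

# Hypothetical data: 20 locations on a chain, each a neighbor of the next.
n = 20
idx = np.arange(n - 1)
weights = np.zeros((n, n))
weights[idx, idx + 1] = 1.0
weights[idx + 1, idx] = 1.0

# Low values on one half of the chain, high values on the other.
values = np.concatenate([np.zeros(10), np.ones(10)])

observed, p_value = permutation_pvalue(values, weights)
print(f"I = {observed:.3f}, pseudo p-value = {p_value:.3f}")
```

Note that the weights are held fixed during the shuffling, so the result is a statement about the data under that particular definition of proximity, not about proximity in general.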

Andy W · Answer 2 · 2011-01-31T14:43:16.333

Although it is not within the domain of geostatistics, for question #2 I would casually say the most frequent weighting function used in my field (criminology) is a binary weighting scheme. That said, I have rarely seen a good theoretical or empirical argument for using one weighting scheme over another (or for how one defines a neighbor in a binary weighting scheme). It may simply be because of historical preference and convenience that such a scheme is typically used.

There is a distinction that should be drawn between data-driven approaches to constructing spatial weights and the theory-based approach to deriving spatial weights. You are currently performing the former, and in this approach you are implicitly treating the estimation of spatial weights as a problem of measurement error, and hence should use techniques to validate your measurements (which is considerably complicated by the endogeneity of the spatial weights). Choosing a weighting scheme based on some of the chance variation in the data and then using it in subsequent causal models is akin to other fallacies related to inference and data snooping. Unfortunately I have no good references for spatial weight models validated in any meaningful way beyond the extent of the autocorrelation, which to be frank isn't all that convincing an empirical argument. Spatial dependence can be the result of causal processes (i.e. the value at one point in space affects the value at another point in space), or it can be the result of other measurement errors (i.e. the measured support of the data does not match the support of the processes that generate those phenomena).

This is opposed to the theory-based construction of spatial weights (or "model-driven" in Luc Anselin's terminology), in which one specifies the weight matrix a priori, before estimating a model. I did not read the Fauchald paper you cited, but from the abstract it appears they have plausible theoretical explanations for the observed patterns based on some optimal foraging strategy.

For readings I would suggest Luc Anselin's book, Spatial Econometrics: Methods and Models (1988); chapters 2 and 3 will be of most interest. Another work with a similar viewpoint to mine (although it will likely be of less interest) is an essay by Gary King, "Why context should not count". I would also suggest another paper, as it appears the authors had similar goals to yours and defined the weights for a lattice system based on variogram estimates (Negreiros, 2010).

Thanks for the insight and references! The Gary King piece is indeed very interesting though not directly on topic. — pariser, Feb 01 '11 at 07:31
