HES 505 Fall 2023: Session 21
By the end of today you should be able to:
Use the spdep package to identify the neighbors of a given polygon based on proximity, distance, and a minimum number of neighbors
Understand the underlying mechanics of Moran’s I and calculate it for various neighbor definitions
Distinguish between global and local measures of spatial autocorrelation
Visualize neighbors and clusters
Attributes (features) are often non-randomly distributed
Especially true with aggregated data
Interest is in the relationship between proximity and the feature
Difference from kriging and semivariance
How do we define \(I(d)\) for areal data?
What about \(w_{ij}\)?
We can use spdep for that!
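As a reference point for the two questions above, one common textbook form of global Moran’s I can be written with the same \(w_{ij}\) notation. The symbols \(x_i\) (attribute value in polygon \(i\)), \(\bar{x}\) (its mean), and \(n\) (number of polygons) are assumptions introduced here, not taken from the slides:

\[
I = \frac{n}{\sum_{i}\sum_{j} w_{ij}} \cdot \frac{\sum_{i}\sum_{j} w_{ij}\,(x_i - \bar{x})(x_j - \bar{x})}{\sum_{i}(x_i - \bar{x})^2}
\]

For areal data the weight \(w_{ij}\) is typically nonzero only when polygons \(i\) and \(j\) count as neighbors, so the choice of neighbor definition plays the role that the distance class \(d\) plays in \(I(d)\).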
spdep
Queen, rook (and bishop) cases define neighbors by contiguity
Weights are calculated as \(1/\text{number of neighbors}\)
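To make the contiguity cases and the \(1/\text{number of neighbors}\) weights concrete, here is a minimal base-R sketch on a hypothetical 3 x 3 grid of cells. It does not use spdep or the course data, and the object names (`cells`, `rook`, `queen`, `W.queen`) are made up for illustration. In spdep, `poly2nb()` and `nb2listw(..., style = "W")` do the equivalent job for real polygons.

```r
# Hypothetical 3 x 3 grid; cells are numbered row by row, so cell 5 is the center
nr <- 3
nc <- 3
cells <- expand.grid(col = seq_len(nc), row = seq_len(nr))

# Absolute row and column offsets between every pair of cells
dr <- abs(outer(cells$row, cells$row, "-"))
dc <- abs(outer(cells$col, cells$col, "-"))

# Rook: cells share an edge. Queen: cells share an edge or a corner.
rook  <- (dr + dc) == 1
queen <- pmax(dr, dc) == 1

# Number of neighbors per cell under each rule
n.rook  <- rowSums(rook)
n.queen <- rowSums(queen)

# Row-standardized weights: each neighbor of cell i gets 1 / (number of neighbors of i)
W.queen <- queen / rowSums(queen)

print(rbind(rook = n.rook, queen = n.queen))
```

Each row of `W.queen` sums to 1, so multiplying it by a vector of values returns the average of each cell's neighbors.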
                                 asthma.lag
[1,] "Camas"      "9.9"  "10.3"
[2,] "Kootenai"   "10.4" "9.575"
[3,] "Kootenai"   "10"   "9.88"
[4,] "Kootenai"   "9.5"  "10.2666666666667"
[5,] "Twin Falls" "10.2" "9.5"
[6,] "Twin Falls" "10.4" "9.9"
Moran’s I coefficient is the slope of the regression of the lagged asthma percentage on the asthma percentage in each tract
More generally, it is the slope of the regression of the neighborhood (lagged) average on the measurement itself
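The slope interpretation can be checked numerically. The sketch below uses a hypothetical 4 x 4 grid with simulated values (not the asthma data) and queen-contiguity, row-standardized weights built by hand in base R. It compares the regression slope with the usual cross-product formula for Moran’s I; all object names are illustrative assumptions.

```r
# Hypothetical data: 4 x 4 grid with a left-to-right trend plus noise
set.seed(505)
nr <- 4
nc <- 4
cells <- expand.grid(col = seq_len(nc), row = seq_len(nr))
x <- cells$col + rnorm(nrow(cells), sd = 0.5)

# Queen contiguity with row-standardized weights (each row sums to 1)
dr <- abs(outer(cells$row, cells$row, "-"))
dc <- abs(outer(cells$col, cells$col, "-"))
W <- (pmax(dr, dc) == 1) * 1
W <- W / rowSums(W)

# Lagged value = average of each cell's neighbors
x.lag <- as.vector(W %*% x)

# Moran's I as the slope of the lagged values regressed on the values
slope <- unname(coef(lm(x.lag ~ x))[2])

# Moran's I from the cross-product formula
z <- x - mean(x)
I.formula <- (length(x) / sum(W)) * sum(W * outer(z, z)) / sum(z^2)

print(c(slope = slope, formula = I.formula))
```

The two numbers agree because the weights are row-standardized. With other weighting styles the slope and the formula need not match.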
cdc$casthma_cr
0.6467989
We can generate the expected distribution of Moran’s I coefficients under a null hypothesis of no spatial autocorrelation
Using permutation and a loop to generate simulations of Moran’s I
n <- 400L           # Number of simulations
I.r <- numeric(n)   # Pre-allocate a numeric vector for the simulated slopes
for (i in seq_len(n)) {
  # Randomly shuffle the asthma values across tracts
  x <- sample(cdc$casthma_cr, replace = FALSE)
  # Compute the new set of lagged values (lw.qn is the weights list built earlier)
  x.lag <- lag.listw(lw.qn, x)
  # Regress the lagged values on the shuffled values and store the slope
  M.r <- lm(x.lag ~ x)
  I.r[i] <- coef(M.r)[2]
}
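Once the simulated slopes are in hand, a common next step is to compare the observed Moran’s I with them and compute a pseudo p-value. The sketch below is self-contained and uses a hypothetical 5 x 5 grid with simulated values rather than `cdc` and `lw.qn`. The helper `moran_slope()` and the other names are made up for illustration, and the one-sided pseudo p-value shown is one common convention, not necessarily the one used in class. spdep's `moran.mc()` offers a packaged permutation test.

```r
# Hypothetical data: 5 x 5 grid with a left-to-right trend plus noise
set.seed(21)
nr <- 5
nc <- 5
cells <- expand.grid(col = seq_len(nc), row = seq_len(nr))
x <- cells$col + rnorm(nrow(cells), sd = 0.5)

# Queen contiguity with row-standardized weights
dr <- abs(outer(cells$row, cells$row, "-"))
dc <- abs(outer(cells$col, cells$col, "-"))
W <- (pmax(dr, dc) == 1) * 1
W <- W / rowSums(W)

# Illustrative helper: Moran's I as the slope of lagged values on values
moran_slope <- function(x, W) {
  x.lag <- as.vector(W %*% x)
  unname(coef(lm(x.lag ~ x))[2])
}

I.obs <- moran_slope(x, W)   # Observed statistic

n.sim <- 400L                # Number of permutations
I.r <- numeric(n.sim)
for (i in seq_len(n.sim)) {
  # Shuffling values across cells breaks any spatial structure
  I.r[i] <- moran_slope(sample(x), W)
}

# One-sided pseudo p-value: share of simulations at least as large as observed
p.value <- (sum(I.r >= I.obs) + 1) / (n.sim + 1)

print(c(I.obs = I.obs, p.value = p.value))
```

Adding 1 to the numerator and denominator counts the observed value as one of the possible arrangements, which keeps the pseudo p-value from being exactly zero.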