Operations With Vector Data II

HES 505 Fall 2023: Session 12

Matt Williamson

Today’s Plan

Objectives

  • By the end of today, you should be able to:

    • Translate pseudocode commands into functional workflows

    • Articulate the importance of key arguments to sf functions

    • Generate new attributes and geometries from existing data.

Motivating Questions

Example questions

  • What is the chronic heart disease risk of the 10 ID tracts that are furthest from hospitals?

  • How may \(km^2\) of ID are served by more than 1 hospital?

  • What is the difference between the average risk of chronic heart disease in the tracts served by at least two hospitals compared to those that aren’t served by any?

Key assummptions

  • All hospital locations are contained in the landmarks dataset

  • A hospital service area is defined as a 50km radius

  • Hospital service areas can cross state lines.

Example 1

What is the chronic heart disease risk of the 10 ID tracts that are furthest from hospitals?

What do we need to know?

  • Where are the hospitals?

  • How far are the hospitals from ID tracts?

  • Which tracts are the furthest?

  • What is the CHD risk?

Pseudocode

1. Load the hospital and cdc datasets
2. Align the data
3. Filter cdc so it only has Idaho tracts
4. Calculate distance from hospitals
5. Find top 10 tracts based on distance
6. Map chronic heart disease risk

Adding Functions

  1. Load the hospital and cdc datasets
library(tidyverse)
library(sf)
library(tmap)
hospital.sf <- read_csv("data/opt/data/2023/vectorexample/hospitals_pnw.csv") %>% 
  st_as_sf(., coords = c("longitude", "latitude"))
st_crs(hospital.sf)
Coordinate Reference System: NA
cdc.sf <- read_sf("data/opt/data/2023/vectorexample/cdc_nw.shp")
st_crs(cdc.sf)$epsg
[1] NA

Adding Functions

  1. Align the data
st_crs(hospital.sf) <- 4326

hospital.sf.proj <- hospital.sf %>% 
  st_transform(., crs=st_crs(cdc.sf))

st_crs(hospital.sf.proj) == st_crs(cdc.sf)
[1] TRUE
identical(st_crs(hospital.sf.proj), st_crs(cdc.sf))
[1] TRUE

Adding Functions

  1. Filter cdc so it only has Idaho tracts
cdc.idaho <- cdc.sf %>% 
  filter(STATEFP == "16")
plot(st_geometry(cdc.idaho))

Adding Functions

  1. Calculate distance from hospitals
nearest.hosp <- st_nearest_feature(cdc.idaho, hospital.sf.proj)
str(nearest.hosp)
 int [1:191] 6 45 45 45 3 3 3 3 6 3 ...
nearest.hosp.sf <- hospital.sf.proj[nearest.hosp,]
hospital.dist <- st_distance(cdc.idaho, nearest.hosp.sf, by_element = TRUE)
str(hospital.dist)
 Units: [m] num [1:191] 29501 46541 39386 32726 23534 ...

Adding Functions

  1. Find top 10 counties based on distance
cdc.idaho.hosp <- cdc.idaho %>% 
  mutate(., disthosp = hospital.dist)

cdc.furthest <- cdc.idaho.hosp %>% 
  slice_max(., n=10, order_by= disthosp)

head(cdc.furthest$disthosp)
Units: [m]
[1] 94506.47 83446.11 81134.60 70762.53 70425.16 70084.68

Adding Functions

  1. Map chronic heart disease risk
library(tmap)

tm_shape(tigris::counties("ID", progress_bar=FALSE)) +
  tm_polygons() +
  tm_shape(cdc.furthest) +
  tm_polygons("disthosp", title="Dist to Hospital (m2)") +
  tm_shape(hospital.sf.proj[cdc.idaho,]) +
  tm_symbols(size=0.25)

Example questions

How may \(km^2\) of ID are served by more than 1 hospital?

What do we need to know?

  • Where are the hospitals?

  • What is the service area for each hospital?

  • Where do those service areas overlap?

  • How big is the overlap area?

Pseudocode

1. Load the hospital dataset and add projection
2. Buffer hospitals by service area
3. Find intersection of service areas
4. Calculate area of overlap

Adding Functions

  1. Load the hospital dataset and add projection
hospital.sf <- read_csv("data/opt/data/2023/vectorexample/hospitals_pnw.csv") %>% 
  st_as_sf(., coords = c("longitude", "latitude"))

st_crs(hospital.sf) <- 4326

Adding Functions

  1. Buffer hospitals by service area
hospital.buf <- hospital.sf %>%
  filter(STATEFP == "16") %>% 
  st_buffer(., dist = units::set_units(30, "kilometers"))
plot(st_geometry(hospital.buf))

Adding Functions

  1. Find intersection of service areas ::: columns ::: {.column width=“40%”}
hospital.int <- hospital.buf %>% 
  st_intersection()
all(st_is_valid(hospital.int))

::: ::: {.column width=“40%”}

Adding Functions

  1. Calculate area of overlap

Plotting the Results

Example questions

  • What is the difference between the average risk of chronic heart disease in the counties served by at least two hospitals compared to those that aren’t served by any?

What do we need to know?

Pseudocode

Adding Functions

Plotting the Results