Assignment 3 Solutions: Coordinates and Geometries

1. Write out the pseudocode that you would use to set up an analysis of the spatial correlations between chronic asthma risk, exposure to PM2.5, and wildfire. You don’t have to write functions or any actual code. Just write the steps and insert named code blocks for each step.

This one is probably a little tricky if you haven’t taken the time to check out the attributes of the data (which you should always do). That said, some pretty generic steps would be:

1. Load each dataset
2. Check geometry validity
3. Align CRS
4. Run Correlation
5. Print Results

There are two key steps here, that you’ll repeat for any/all spatial analyses that you do: 1) checking for valid geometries and 2) making sure the data are aligned in a sensible CRS. I can add a code chunk for each now.

1. Load each dataset
2. Check geometry validity
3. Align CRS
4. Run Correlation
5. Print Results

2. Read in the cdc_nw.shp, pm_nw.shp, and wildfire_hazard_agg.tif files and print the coordinate reference system for each object. Do they match?

Here I’m going to combine the load portion of my pseudocode with the validity since I can do that without creating additional object. I use the str() function to get a sense for what the data looks like and to understand what data classes I’m working with. Then, I use the all() function to make sure that all of the results of st_is_valid() are true. I don’t need to do that with the raster file as the geometry is implicit which means that it has to be topologically valid (this doesn’t mean that the numbers are accurate, it just means that the dataset conforms to the data model R expects). Then I’ll add another code to check the CRS of the different objects.

library(sf)
library(terra)
library(tidyverse)

cdc.nw <- read_sf("data/opt/data/2023/assignment03/cdc_nw.shp")
str(cdc.nw)
sf [2,005 × 78] (S3: sf/tbl_df/tbl/data.frame)
 $ STATEFP   : chr [1:2005] "16" "16" "16" "16" ...
 $ COUNTYFP  : chr [1:2005] "025" "055" "055" "055" ...
 $ TRACTCE   : chr [1:2005] "970100" "000602" "000401" "000301" ...
 $ GEOID     : chr [1:2005] "16025970100" "16055000602" "16055000401" "16055000301" ...
 $ NAME      : chr [1:2005] "9701" "6.02" "4.01" "3.01" ...
 $ NAMELSAD  : chr [1:2005] "Census Tract 9701" "Census Tract 6.02" "Census Tract 4.01" "Census Tract 3.01" ...
 $ MTFCC     : chr [1:2005] "G5020" "G5020" "G5020" "G5020" ...
 $ FUNCSTAT  : chr [1:2005] "S" "S" "S" "S" ...
 $ ALAND     : num [1:2005] 2.78e+09 5.54e+06 1.04e+08 9.33e+07 1.05e+07 ...
 $ AWATER    : num [1:2005] 11548054 488272 1356762 2372161 1149 ...
 $ INTPTLAT  : chr [1:2005] "+43.5025515" "+47.7056992" "+47.6514553" "+47.7954122" ...
 $ INTPTLON  : chr [1:2005] "-114.7721349" "-116.9179618" "-117.0032478" "-116.9909923" ...
 $ stateabbr : chr [1:2005] "ID" "ID" "ID" "ID" ...
 $ statedesc : chr [1:2005] "Idaho" "Idaho" "Idaho" "Idaho" ...
 $ countyname: chr [1:2005] "Camas" "Kootenai" "Kootenai" "Kootenai" ...
 $ countyfips: chr [1:2005] "16025" "16055" "16055" "16055" ...
 $ totalpopul: chr [1:2005] "1117" "5727" "5834" "5316" ...
 $ access2_cr: num [1:2005] 16.5 16.2 16.6 14.1 22.7 20 24.8 12.2 28.6 24.1 ...
 $ access2__2: chr [1:2005] "(14.1, 18.8)" "(13.7, 19.3)" "(13.8, 19.6)" "(12.3, 16.1)" ...
 $ arthritis_: num [1:2005] 27.2 26.3 24 25.4 26.9 21.9 22 28 22.6 22.8 ...
 $ arthriti_2: chr [1:2005] "(25.9, 28.4)" "(25.1, 27.6)" "(22.8, 25.2)" "(24.5, 26.4)" ...
 $ binge_crud: num [1:2005] 14.2 14.9 15.8 15.4 13 15 14.9 12.1 15 15.1 ...
 $ binge_cr_2: chr [1:2005] "(13.8, 14.6)" "(14.2, 15.6)" "(15.3, 16.3)" "(15.0, 15.8)" ...
 $ bphigh_cru: num [1:2005] 35 32 29.9 30.9 34.5 28.8 30.7 35.1 29.8 29.1 ...
 $ bphigh_c_2: chr [1:2005] "(34.0, 36.0)" "(30.8, 33.2)" "(29.0, 30.8)" "(30.2, 31.6)" ...
 $ bpmed_crud: num [1:2005] 72.9 71.4 69 71.1 74.1 69.6 66.6 77.4 68.5 69.2 ...
 $ bpmed_cr_2: chr [1:2005] "(72.0, 73.9)" "(70.1, 72.7)" "(67.9, 70.0)" "(70.3, 71.9)" ...
 $ cancer_cru: num [1:2005] 7 6.5 6 6.9 7.3 5.7 4.7 9.1 5.4 5.8 ...
 $ cancer_c_2: chr [1:2005] "( 6.7,  7.4)" "( 6.2,  6.9)" "( 5.7,  6.3)" "( 6.7,  7.2)" ...
 $ casthma_cr: num [1:2005] 9.9 10.4 10 9.5 10.2 10.4 11.1 9 10 10.1 ...
 $ casthma__2: chr [1:2005] "( 9.4, 10.4)" "( 9.9, 11.0)" "( 9.5, 10.6)" "( 9.1,  9.9)" ...
 $ cervical_c: num [1:2005] 78.9 75.7 77 79.5 77.8 75.1 71.7 82.3 75.6 76.6 ...
 $ cervical_2: chr [1:2005] "(76.9, 80.7)" "(72.8, 78.8)" "(75.0, 79.3)" "(78.0, 81.1)" ...
 $ chd_crudep: num [1:2005] 7.7 7.1 6.2 6.4 8 6.1 6.4 7.5 6.8 6.6 ...
 $ chd_crude9: chr [1:2005] "( 7.1,  8.4)" "( 6.5,  7.7)" "( 5.7,  6.7)" "( 5.9,  6.8)" ...
 $ checkup_cr: num [1:2005] 70.2 70.8 69.4 70.9 70.9 68.7 66.7 74.6 65.4 66.4 ...
 $ checkup__2: chr [1:2005] "(69.0, 71.4)" "(69.7, 71.9)" "(68.3, 70.6)" "(70.1, 71.7)" ...
 $ cholscreen: num [1:2005] 82.3 79.3 79.9 83.3 81.4 76.1 72.4 87.1 76.2 76.4 ...
 $ cholscre_2: chr [1:2005] "(81.9, 82.7)" "(79.0, 79.7)" "(79.8, 80.0)" "(83.1, 83.5)" ...
 $ colon_scre: num [1:2005] 63 66.9 66.5 70.1 63.8 63.7 56.8 69.6 58.1 60.1 ...
 $ colon_sc_2: chr [1:2005] "(60.7, 65.6)" "(63.7, 70.1)" "(64.0, 69.0)" "(68.3, 71.7)" ...
 $ copd_crude: num [1:2005] 8.1 7.8 6.8 6.4 8.4 6.8 8.5 6.5 7.5 7.3 ...
 $ copd_cru_2: chr [1:2005] "( 7.2,  9.2)" "( 6.8,  8.9)" "( 5.9,  7.8)" "( 5.7,  7.0)" ...
 $ corem_crud: num [1:2005] 33.5 35.7 36.7 39.4 43.5 43.7 37.4 50.2 32.6 34.6 ...
 $ corem_cr_2: chr [1:2005] "(27.2, 40.2)" "(29.0, 42.9)" "(31.3, 42.5)" "(35.0, 43.6)" ...
 $ corew_crud: num [1:2005] 34.8 34.8 36.5 38.4 34.8 35.8 31.6 39.3 28.7 28.7 ...
 $ corew_cr_2: chr [1:2005] "(29.5, 40.4)" "(28.8, 41.4)" "(31.6, 41.6)" "(34.5, 42.2)" ...
 $ csmoking_c: num [1:2005] 19.5 18.9 18.6 16.3 18.4 17.6 24.8 12.3 20.5 19 ...
 $ csmoking_2: chr [1:2005] "(17.6, 21.5)" "(16.1, 21.7)" "(16.5, 20.7)" "(14.9, 17.7)" ...
 $ dental_cru: num [1:2005] 59.3 60.3 62.3 67.3 59.4 60.6 49.9 72 53.9 56.9 ...
 $ dental_c_2: chr [1:2005] "(56.0, 62.7)" "(56.6, 63.9)" "(58.3, 65.8)" "(64.9, 69.5)" ...
 $ depression: num [1:2005] 19.8 21.3 20.8 19.8 21.2 22.2 23.6 18.6 19.7 20.1 ...
 $ depressi_2: chr [1:2005] "(18.8, 20.9)" "(20.1, 22.5)" "(19.5, 22.1)" "(19.0, 20.6)" ...
 $ diabetes_c: num [1:2005] 11.7 10.6 9.5 9.6 11.8 9.2 10.6 10.3 11.1 10.3 ...
 $ diabetes_2: chr [1:2005] "(10.9, 12.4)" "( 9.8, 11.4)" "( 8.8, 10.2)" "( 9.0, 10.1)" ...
 $ ghlth_crud: num [1:2005] 16.3 15.8 14.1 12.6 17.5 14.5 19.6 11.8 19.4 17.5 ...
 $ ghlth_cr_2: chr [1:2005] "(14.4, 18.3)" "(13.9, 18.0)" "(12.4, 16.3)" "(11.3, 13.8)" ...
 $ highchol_c: num [1:2005] 33.4 30 29 30.7 31.9 27.5 27.7 33.4 29 28.4 ...
 $ highchol_2: chr [1:2005] "(32.6, 34.3)" "(29.0, 31.0)" "(28.2, 29.7)" "(30.1, 31.3)" ...
 $ kidney_cru: num [1:2005] 3.3 3.1 2.7 2.7 3.6 2.9 3.1 3.3 3.1 3.1 ...
 $ kidney_c_2: chr [1:2005] "( 3.1,  3.5)" "( 2.9,  3.4)" "( 2.5,  2.9)" "( 2.6,  2.9)" ...
 $ lpa_crudep: num [1:2005] 25 25 23.4 21.3 25.9 22.8 28.3 19.1 30 27.6 ...
 $ lpa_crude9: chr [1:2005] "(23.2, 26.9)" "(22.3, 27.5)" "(21.5, 25.3)" "(19.9, 22.6)" ...
 $ mammouse_c: num [1:2005] 68.8 70.5 70.4 71.3 68.7 69.5 68.7 71.9 67.3 68.7 ...
 $ mammouse_2: chr [1:2005] "(64.4, 73.1)" "(66.4, 74.5)" "(66.1, 74.7)" "(68.3, 74.1)" ...
 $ mhlth_crud: num [1:2005] 13.8 15.2 14.8 13.5 14 15.2 17 11 15.2 15.2 ...
 $ mhlth_cr_2: chr [1:2005] "(12.9, 14.6)" "(14.2, 16.3)" "(13.7, 15.9)" "(12.8, 14.2)" ...
 $ obesity_cr: num [1:2005] 36.4 32.8 31.8 30.7 34.3 32.3 36.5 29.4 36.6 35 ...
 $ obesity__2: chr [1:2005] "(34.6, 38.1)" "(31.4, 34.2)" "(30.4, 33.3)" "(29.7, 31.7)" ...
 $ phlth_crud: num [1:2005] 12.7 12.7 11.5 10.8 12.9 11 13.7 10.1 13.2 12.4 ...
 $ phlth_cr_2: chr [1:2005] "(11.6, 13.9)" "(11.5, 14.0)" "(10.4, 12.8)" "(10.0, 11.6)" ...
 $ sleep_crud: num [1:2005] 33.5 32.5 32.4 31.4 32.4 32.5 35.4 28.9 32.3 31.6 ...
 $ sleep_cr_2: chr [1:2005] "(32.0, 34.9)" "(31.0, 33.9)" "(30.8, 33.9)" "(30.3, 32.4)" ...
 $ stroke_cru: num [1:2005] 3.6 3.4 2.9 2.8 3.9 3 3.5 3.4 3.4 3.3 ...
 $ stroke_c_2: chr [1:2005] "( 3.2,  4.0)" "( 3.0,  3.8)" "( 2.6,  3.2)" "( 2.6,  3.1)" ...
 $ teethlost_: num [1:2005] 16.1 17 15.7 12.5 18.4 16.4 23.6 10.3 22.4 20.7 ...
 $ teethlos_2: chr [1:2005] "(12.4, 20.3)" "(12.1, 23.0)" "(11.9, 20.5)" "(10.1, 15.0)" ...
 $ geometry  :sfc_MULTIPOLYGON of length 2005; first list element: List of 1
  ..$ :List of 1
  .. ..$ : num [1:2521, 1:2] -115 -115 -115 -115 -115 ...
  ..- attr(*, "class")= chr [1:3] "XY" "MULTIPOLYGON" "sfg"
 - attr(*, "sf_column")= chr "geometry"
 - attr(*, "agr")= Factor w/ 3 levels "constant","aggregate",..: NA NA NA NA NA NA NA NA NA NA ...
  ..- attr(*, "names")= chr [1:77] "STATEFP" "COUNTYFP" "TRACTCE" "GEOID" ...
all(st_is_valid(cdc.nw))
[1] TRUE
pm.nw <- read_sf("data/opt/data/2023/assignment03/pm_nw.shp")
str(pm.nw)
sf [3,241 × 4] (S3: sf/tbl_df/tbl/data.frame)
 $ STATE_NAME: chr [1:3241] "Idaho" "Idaho" "Idaho" "Idaho" ...
 $ CNTY_NAME : chr [1:3241] "Ada" "Ada" "Ada" "Ada" ...
 $ PM25      : num [1:3241] 7.85 7.85 7.56 7.67 7.94 ...
 $ geometry  :sfc_MULTIPOLYGON of length 3241; first list element: List of 1
  ..$ :List of 1
  .. ..$ : num [1:88, 1:2] -12938469 -12938370 -12938267 -12938049 -12937973 ...
  ..- attr(*, "class")= chr [1:3] "XY" "MULTIPOLYGON" "sfg"
 - attr(*, "sf_column")= chr "geometry"
 - attr(*, "agr")= Factor w/ 3 levels "constant","aggregate",..: NA NA NA
  ..- attr(*, "names")= chr [1:3] "STATE_NAME" "CNTY_NAME" "PM25"
all(st_is_valid(pm.nw))
[1] TRUE
wildfire.haz <- rast("data/opt/data/2023/assignment03/wildfire_hazard_agg.tif")
str(wildfire.haz)
S4 class 'SpatRaster' [package "terra"]

Now that I’ve gotten the data into my environment, I need to make sure that the CRS are alligned. I’ll demonstrate that with a few different approaches. You can use the logical == or the identical function to check, but remember that these fnctions are not specific to spatial objects, they evaluate things very literally. So even if the CRS is the same, if st_crs returns the CRS in one format (WKT) and crs returns it in another, you’ll get FALSE even if they are actually the same CRS - pay attention to that. You’ll notice that they aren’t identical; we’ll deal with that in the next question.

st_crs(cdc.nw)
Coordinate Reference System:
  User input: NAD83 
  wkt:
GEOGCRS["NAD83",
    DATUM["North American Datum 1983",
        ELLIPSOID["GRS 1980",6378137,298.257222101,
            LENGTHUNIT["metre",1]]],
    PRIMEM["Greenwich",0,
        ANGLEUNIT["degree",0.0174532925199433]],
    CS[ellipsoidal,2],
        AXIS["latitude",north,
            ORDER[1],
            ANGLEUNIT["degree",0.0174532925199433]],
        AXIS["longitude",east,
            ORDER[2],
            ANGLEUNIT["degree",0.0174532925199433]],
    ID["EPSG",4269]]
st_crs(pm.nw)
Coordinate Reference System:
  User input: WGS 84 / Pseudo-Mercator 
  wkt:
PROJCRS["WGS 84 / Pseudo-Mercator",
    BASEGEOGCRS["WGS 84",
        ENSEMBLE["World Geodetic System 1984 ensemble",
            MEMBER["World Geodetic System 1984 (Transit)"],
            MEMBER["World Geodetic System 1984 (G730)"],
            MEMBER["World Geodetic System 1984 (G873)"],
            MEMBER["World Geodetic System 1984 (G1150)"],
            MEMBER["World Geodetic System 1984 (G1674)"],
            MEMBER["World Geodetic System 1984 (G1762)"],
            MEMBER["World Geodetic System 1984 (G2139)"],
            ELLIPSOID["WGS 84",6378137,298.257223563,
                LENGTHUNIT["metre",1]],
            ENSEMBLEACCURACY[2.0]],
        PRIMEM["Greenwich",0,
            ANGLEUNIT["degree",0.0174532925199433]],
        ID["EPSG",4326]],
    CONVERSION["Popular Visualisation Pseudo-Mercator",
        METHOD["Popular Visualisation Pseudo Mercator",
            ID["EPSG",1024]],
        PARAMETER["Latitude of natural origin",0,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8801]],
        PARAMETER["Longitude of natural origin",0,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8802]],
        PARAMETER["False easting",0,
            LENGTHUNIT["metre",1],
            ID["EPSG",8806]],
        PARAMETER["False northing",0,
            LENGTHUNIT["metre",1],
            ID["EPSG",8807]]],
    CS[Cartesian,2],
        AXIS["easting (X)",east,
            ORDER[1],
            LENGTHUNIT["metre",1]],
        AXIS["northing (Y)",north,
            ORDER[2],
            LENGTHUNIT["metre",1]],
    USAGE[
        SCOPE["Web mapping and visualisation."],
        AREA["World between 85.06°S and 85.06°N."],
        BBOX[-85.06,-180,85.06,180]],
    ID["EPSG",3857]]
crs(wildfire.haz)
[1] "PROJCRS[\"unnamed\",\n    BASEGEOGCRS[\"NAD83\",\n        DATUM[\"North American Datum 1983\",\n            ELLIPSOID[\"GRS 1980\",6378137,298.257222101004,\n                LENGTHUNIT[\"metre\",1]]],\n        PRIMEM[\"Greenwich\",0,\n            ANGLEUNIT[\"degree\",0.0174532925199433]],\n        ID[\"EPSG\",4269]],\n    CONVERSION[\"Albers Equal Area\",\n        METHOD[\"Albers Equal Area\",\n            ID[\"EPSG\",9822]],\n        PARAMETER[\"Latitude of false origin\",23,\n            ANGLEUNIT[\"degree\",0.0174532925199433],\n            ID[\"EPSG\",8821]],\n        PARAMETER[\"Longitude of false origin\",-96,\n            ANGLEUNIT[\"degree\",0.0174532925199433],\n            ID[\"EPSG\",8822]],\n        PARAMETER[\"Latitude of 1st standard parallel\",29.5,\n            ANGLEUNIT[\"degree\",0.0174532925199433],\n            ID[\"EPSG\",8823]],\n        PARAMETER[\"Latitude of 2nd standard parallel\",45.5,\n            ANGLEUNIT[\"degree\",0.0174532925199433],\n            ID[\"EPSG\",8824]],\n        PARAMETER[\"Easting at false origin\",0,\n            LENGTHUNIT[\"metre\",1],\n            ID[\"EPSG\",8826]],\n        PARAMETER[\"Northing at false origin\",0,\n            LENGTHUNIT[\"metre\",1],\n            ID[\"EPSG\",8827]]],\n    CS[Cartesian,2],\n        AXIS[\"easting\",east,\n            ORDER[1],\n            LENGTHUNIT[\"metre\",1,\n                ID[\"EPSG\",9001]]],\n        AXIS[\"northing\",north,\n            ORDER[2],\n            LENGTHUNIT[\"metre\",1,\n                ID[\"EPSG\",9001]]]]"
identical(st_crs(cdc.nw), st_crs(pm.nw))
[1] FALSE
st_crs(cdc.nw) == st_crs(pm.nw)
[1] FALSE

3. Re-project the cdc_nw.shp and pm_nw.shp shapefiles so that they have the same CRS as the wildfire_hazard_agg.tfi file. Verify that all the files have the same projection.

Now we’ll use st_transform to get the two shapefiles aligned with the raster (because we generally want to avoid projecting rasters if we can). We can then use the same steps above to see if they’re aligned. Note that I’m using the terra::crs() function to make sure that the output is printed in exactly the same format

cdc.nw.proj <- cdc.nw %>% st_transform(., crs=crs(wildfire.haz))
pm.nw.proj <- pm.nw %>% st_transform(., crs=crs(wildfire.haz))

identical(crs(cdc.nw.proj), crs(wildfire.haz))
[1] TRUE
identical(crs(pm.nw.proj), crs(wildfire.haz))
[1] TRUE

4. How does reprojecting change the coordinates of the bounding box for the two shapefiles? Show your code

Now we just want to look at the bounding box of the data before and after it was projected. We can do this using st_bbox. One of the most obvious changes is that the units for cdc.nw have changed from degrees to meters (as evidenced by the much larger numbers). For the pm.nw object we can see that the raw coordinates indicate a shift to the west; however, because the origin for this crs has also changed, the states still show up in the correct place.

st_bbox(cdc.nw)
      xmin       ymin       xmax       ymax 
-124.74918   41.98818 -111.04349   49.00232 
st_bbox(cdc.nw.proj)
    xmin     ymin     xmax     ymax 
-2295337  2208890 -1189292  3177425 
st_bbox(pm.nw)
     xmin      ymin      xmax      ymax 
-13898126   5159210 -12361307   6275276 
st_bbox(pm.nw.proj)
    xmin     ymin     xmax     ymax 
-2300791  2208891 -1189293  3177426 

5. What class of geometry does the pm_nw.shp have (show your code)? Now filter the pm_nw.shp file so that only the records from Ada County, Idaho are showing. Find the record with the lowest value for PM25. How many coordinates are associated with that geometry?

This one was probably a little tricky. First, to check the geometry type, we use st_geometry_type setting by_geometry to FALSE means we get the geometry type for the entire object instead of each observation. We then use a series of filter commands to get the records from Idaho and Ada county. Once we’ve narrowed the data to our correct region, we can filter again to find the row with the minimum value of PM25 (note that we have to set na.rm=TRUE so that we ignore the NA values). Then we just take the number of rows (nrow) of the result of st_coordinates to get the number of coordinates associated with that geometry.

st_geometry_type(pm.nw, by_geometry = FALSE)
[1] MULTIPOLYGON
18 Levels: GEOMETRY POINT LINESTRING POLYGON MULTIPOINT ... TRIANGLE
ada.pm <- pm.nw %>% 
  filter(STATE_NAME=="Idaho" & CNTY_NAME=="Ada") %>% 
  filter(PM25 == min(PM25, na.rm = TRUE))

ada.pm
Simple feature collection with 1 feature and 3 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: -12935300 ymin: 5329192 xmax: -12910260 ymax: 5402433
Projected CRS: WGS 84 / Pseudo-Mercator
# A tibble: 1 × 4
  STATE_NAME CNTY_NAME  PM25                                            geometry
* <chr>      <chr>     <dbl>                                  <MULTIPOLYGON [m]>
1 Idaho      Ada        6.68 (((-12935301 5391002, -12934885 5391290, -12934526…
nrow(st_coordinates(ada.pm))
[1] 1394