Reading Spatial Data in R

HES 505 Fall 2023: Session 4

Matt Williamson

Objectives

Revisit the components of spatial data
Describe some of the key considerations for thinking about spatial data
Introduce the two primary R packages for spatial workflows
Learn to read and explore spatial objects in R

Describing Absolute Locations

Coordinates: 2 or more measurements that specify location relative to a reference system

Cartesian coordinate system
origin (O) = the point at which both measurement systems intersect
Adaptable to multiple dimensions (e.g. z for altitude)

Locations on a Globe

The earth is not flat…

Latitude and Longitude

Global Reference Systems (GRS)
Graticule: the grid formed by the intersection of longitude and latitude
The graticule is based on an ellipsoid model of earth’s surface and contained in the datum

Global Reference Systems

The datum describes which ellipsoid to use and the precise relations between locations on earth’s surface and Cartesian coordinates

Geocentric datums (e.g., WGS84): the ellipsoid is centered on earth’s center of gravity
Local datums (e.g., NAD83): the ellipsoid is shifted to better model local variation in earth’s surface
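A quick way to see the datum and ellipsoid behind a CRS is to look it up in sf. This is a minimal sketch; the EPSG codes 4326 (WGS84) and 4269 (NAD83) are standard codes chosen here for illustration and are not taken from the slides.

```r
library(sf)

# EPSG:4326 is the geographic CRS built on the WGS84 datum
wgs84 <- st_crs(4326)
# EPSG:4269 is the geographic CRS built on the NAD83 datum
nad83 <- st_crs(4269)

# The WKT description lists the DATUM and ELLIPSOID each CRS relies on
cat(wgs84$wkt, "\n")
cat(nad83$wkt, "\n")

# Both are unprojected (longitude/latitude) systems
st_is_longlat(wgs84)
st_is_longlat(nad83)
```

Printing the WKT shows that the two systems use different ellipsoids (WGS 84 vs. GRS 1980), which is one reason coordinates from the two are not interchangeable at fine scales.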

Describing location: extent

How much of the world does the data cover?
For rasters, these are the corners of the lattice
For vectors, we call this the bounding box
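The two kinds of extent can be inspected as follows. This is a sketch only: the point coordinates, column names, and raster dimensions are made up for illustration.

```r
library(sf)
library(terra)

# Hypothetical point locations (longitude, latitude), for illustration only
pts <- data.frame(
  site = c("a", "b", "c"),
  longitude = c(-116.2, -115.5, -114.9),
  latitude = c(43.6, 44.1, 43.2)
)
pts_sf <- st_as_sf(pts, coords = c("longitude", "latitude"), crs = 4326)

# Vector extent: the bounding box (xmin, ymin, xmax, ymax)
bb <- st_bbox(pts_sf)
print(bb)

# Raster extent: the outer edges of the grid of cells
r <- rast(nrows = 10, ncols = 10,
          xmin = -117, xmax = -114, ymin = 43, ymax = 45,
          crs = "EPSG:4326")
e <- ext(r)
print(e)
```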

Describing location: resolution

Resolution: the accuracy with which the location and shape of a map’s features can be depicted
Minimum Mapping Unit: The minimum size and dimensions that can be reliably represented at a given map scale.
Map scale vs. scale of analysis
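For rasters, resolution is simply the cell size, and it can be coarsened to match the scale of analysis. A minimal sketch using a made-up 10 x 10 grid (the dimensions and values are assumptions for illustration):

```r
library(terra)

# A hypothetical raster with 1 x 1 unit cells
r <- rast(nrows = 10, ncols = 10, xmin = 0, xmax = 10, ymin = 0, ymax = 10)
values(r) <- 1:ncell(r)

# Cell size in the x and y directions
res(r)

# Coarsen by a factor of 2: each new cell averages a 2 x 2 block of old cells
r_coarse <- aggregate(r, fact = 2, fun = "mean")
res(r_coarse)
ncell(r_coarse)
```

Coarsening trades spatial detail for fewer, larger cells; the extent stays the same.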

The earth is not flat…

Projections

But maps, screens, and publications are…
Projections describe how the data should be translated to a flat surface
Rely on ‘developable surfaces’
Described by the Coordinate Reference System (CRS)

Projection necessarily induces some form of distortion (tearing, compression, or shearing)
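In sf, moving data from a geographic to a projected CRS is done with st_transform. The sketch below assumes a single made-up point and uses EPSG:5070 (a CONUS Albers equal-area projection) purely as an example target.

```r
library(sf)

# A hypothetical location stored as longitude/latitude (EPSG:4326)
pt <- st_sfc(st_point(c(-116.2, 43.6)), crs = 4326)

# Project to a planar CRS; EPSG:5070 is an illustrative choice
pt_albers <- st_transform(pt, crs = 5070)

# Coordinates are now in projected units rather than degrees
st_coordinates(pt)
st_coordinates(pt_albers)

st_is_longlat(pt)
st_is_longlat(pt_albers)
```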

Coordinate Reference Systems

Some projections minimize distortion of angle, area, or distance
Others attempt to avoid extreme distortion of any kind
A CRS includes: datum, ellipsoid, units, and other information (e.g., false easting, central meridian) to further map the projection to the geographic coordinate system (GCS)
Not all projections have/require all of the parameters

Choosing Projections

Equal-area for thematic maps
Conformal for presentations
Mercator or equidistant for navigation and distance
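To see why the choice matters, here is a rough sketch comparing the area of the same polygon under different projections. The 1 x 1 degree box and the EPSG codes (3857 for Web Mercator, 6933 for a global equal-area grid) are assumptions chosen for illustration.

```r
library(sf)

# A 1 x 1 degree box between 60 and 61 degrees north (illustrative only)
box <- st_polygon(list(rbind(
  c(10, 60), c(11, 60), c(11, 61), c(10, 61), c(10, 60)
)))
box_ll <- st_sfc(box, crs = 4326)

# Reference area computed on the globe from longitude/latitude
area_globe <- as.numeric(st_area(box_ll))

# Planar area in Web Mercator (conformal, not equal-area)
area_merc <- as.numeric(st_area(st_transform(box_ll, 3857)))

# Planar area in an equal-area projection
area_ea <- as.numeric(st_area(st_transform(box_ll, 6933)))

# Ratios relative to the area on the globe
ratio_merc <- area_merc / area_globe
ratio_ea <- area_ea / area_globe
ratio_merc
ratio_ea
```

At this latitude Mercator inflates area several-fold, while the equal-area projection stays close to the area on the globe. That is the practical reason for preferring equal-area projections in thematic maps.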

Geometries, support, and spatial messiness

Geometries

Vectors aggregate the locations of a feature into a geometry
Most vector operations require simple, valid geometries

Valid Geometries

A linestring is simple if it does not intersect itself
Valid polygons:
- Are closed (i.e., the last vertex equals the first)
- Have holes (inner rings) that lie inside the exterior boundary
- Have holes that touch the exterior at no more than one vertex (they don’t extend across a line)
- Do not repeat their own path
For multipolygons, adjacent polygons touch only at points
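These rules can be checked directly in sf. The sketch below builds a self-crossing line and a "bowtie" polygon (both made up for illustration) and tests them with st_is_simple, st_is_valid, and st_make_valid.

```r
library(sf)

# A linestring whose third segment crosses its first: not simple
crossing_line <- st_linestring(rbind(c(0, 0), c(1, 1), c(1, 0), c(0, 1)))
line_simple <- st_is_simple(crossing_line)
line_simple

# A "bowtie" ring that crosses itself: closed, but not valid
bowtie <- st_polygon(list(rbind(
  c(0, 0), c(1, 1), c(1, 0), c(0, 1), c(0, 0)
)))
bowtie_valid <- st_is_valid(bowtie)
bowtie_valid

# Ask for the reason the geometry fails
st_is_valid(bowtie, reason = TRUE)

# Attempt a repair and re-check
bowtie_fixed <- st_make_valid(bowtie)
fixed_valid <- st_is_valid(bowtie_fixed)
fixed_valid
```

Running st_is_valid before other vector operations is a cheap way to catch problems early; st_make_valid changes the geometry, so it is worth inspecting the result rather than assuming the repair matches the intended shape.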

Empty Geometries

Empty geometries arise when an operation produces NULL outcomes (like looking for the intersection between two non-intersecting polygons)
sf allows empty geometries to make sure that information about the data type is retained
Similar to a data.frame with no rows or a list with NULL values
Most vector operations require simple, valid geometries
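A small sketch of how an empty geometry arises and how to detect it; the two squares are hypothetical and deliberately do not overlap.

```r
library(sf)

# Two hypothetical squares that do not touch
a <- st_polygon(list(rbind(c(0, 0), c(1, 0), c(1, 1), c(0, 1), c(0, 0))))
b <- st_polygon(list(rbind(c(5, 5), c(6, 5), c(6, 6), c(5, 6), c(5, 5))))

# Their intersection has no coordinates, but is still a geometry object
overlap <- st_intersection(a, b)
overlap
overlap_empty <- st_is_empty(overlap)
overlap_empty

# Empty geometries can also be created directly and keep their type
empty_poly <- st_polygon()
class(empty_poly)
st_is_empty(empty_poly)
```

Checking st_is_empty after an overlay is a useful guard before computing areas or lengths on the result.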

Support

Support is the area to which an attribute applies.

For vectors, the attribute-geometry-relationship can be:
constant = applies to every point in the geometry (lines and polygons are just lots of points)
identity = a value unique to a geometry
aggregate = a single value that integrates data across the geometry
Rasters can have point (attribute refers to the cell center) or cell (attribute refers to an area similar to the pixel) support
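sf can record the attribute-geometry relationship for each column. The sketch below uses the North Carolina shapefile that ships with sf; the relationship assigned to each column is my own reading of those attributes and should be treated as an assumption.

```r
library(sf)

# Example data distributed with the sf package
nc <- read_sf(system.file("shape/nc.shp", package = "sf"))
nc_sub <- nc[, c("NAME", "AREA", "BIR74")]

# Declare how each attribute relates to its geometry
st_agr(nc_sub) <- c(
  NAME = "identity",    # county name identifies the geometry
  AREA = "aggregate",   # one value summarizing the whole polygon
  BIR74 = "aggregate"   # births counted across the whole county
)

st_agr(nc_sub)
```

Declaring the relationship lets sf warn when an operation (such as an intersection) would make an attribute value misleading for the new, smaller geometries.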

Spatial Messiness

Quantitative geography requires that our data are aligned
Achieving alignment is part of reproducible workflows
Making principled decisions about projections, resolution, extent, etc.

Mapping Location in `R`

Data Types and `R` Packages

Data Types

Vector Data
- Point features
- Line features
- Area features (polygons)
Raster Data
- Spatially continuous field
- Based on pixels (not points)

Reading in Spatial Data: spreadsheets

Most basic form of spatial data
Need x (longitude) and y (latitude) as columns
Need to know your CRS
A read_* function (e.g., read_csv) is necessary to bring in the data

library(tidyverse)
library(sf)

file.to.read <- read_csv(file = "path/to/your/file", 
                         col_names = TRUE, col_types = NULL, 
                         na = c("", "NA"))

file.as.sf <- st_as_sf(file.to.read, 
                       coords = c("longitude", "latitude"), 
                       crs=4326)
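A self-contained version of the same workflow, runnable without an external file. The site names and coordinates are made up, and the CSV is written to a temporary file only so there is something to read.

```r
library(readr)
library(sf)

# Write a small hypothetical table to a temporary CSV
tmp_csv <- tempfile(fileext = ".csv")
write_csv(
  data.frame(
    site = c("a", "b", "c"),
    longitude = c(-116.2, -115.5, -114.9),
    latitude = c(43.6, 44.1, 43.2)
  ),
  tmp_csv
)

# Read the table, then promote it to an sf object
file.to.read <- read_csv(tmp_csv, show_col_types = FALSE)
file.as.sf <- st_as_sf(file.to.read,
                       coords = c("longitude", "latitude"),  # x first, then y
                       crs = 4326)

file.as.sf
```

By default st_as_sf moves the coordinate columns into the geometry column; setting remove = FALSE keeps them as ordinary columns as well.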

Reading in Spatial Data: shapefiles

ALL FILES NEED TO BE IN THE SAME FOLDER

.shp is the shapefile itself
.prj contains the CRS information
.dbf contains the attributes
.shx contains the indices for matching attributes to geometries
other extensions contain metadata

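st_read and read_sf in the sf package will read shapefiles into R
read_sf is a quiet wrapper around st_read that leaves character vectors alone (often beneficial)
Both can handle other formats supported by GDAL (like geodatabases)
They return slightly different classes (read_sf returns an sf tibble; st_read returns an sf data.frame)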

Reading in Spatial Data: shapefiles

library(sf)
# Additional arguments (e.g., layer) are passed on to st_read
shapefile.inR <- read_sf(dsn = "path/to/file.shp")
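To compare the two readers on real data, this sketch uses the North Carolina shapefile bundled with sf (an assumption made so the example runs without a file of your own).

```r
library(sf)

# Path to example data distributed with the sf package
nc_path <- system.file("shape/nc.shp", package = "sf")

# read_sf: quiet, returns a tibble-based sf object
nc_tbl <- read_sf(nc_path)
class(nc_tbl)

# st_read: prints a summary unless quiet = TRUE, returns a data.frame-based sf object
nc_df <- st_read(nc_path, quiet = TRUE)
class(nc_df)
```

Both objects hold the same features; the difference is mainly in how they print and how they behave with tidyverse functions.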

Reading in Spatial Data: rasters

rast will read rasters using the terra package
Also used to create rasters from scratch
Returns SpatRaster object

library(terra)
# rast expects a raster file (e.g., a GeoTIFF), not a shapefile
raster.inR <- rast(x = "path/to/file.tif", 
                         lyrs=NULL)
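A runnable sketch of both uses of rast: reading a file and building a raster from scratch. The elevation file is example data shipped with terra; the dimensions and random values of the second raster are made up.

```r
library(terra)

# Read an example GeoTIFF distributed with the terra package
elev_path <- system.file("ex/elev.tif", package = "terra")
elev <- rast(elev_path)
class(elev)
elev

# Create a small raster from scratch and fill it with values
blank <- rast(nrows = 5, ncols = 5, xmin = 0, xmax = 5, ymin = 0, ymax = 5)
values(blank) <- runif(ncell(blank))
blank
```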

Introducing the Data

Good idea to get to know your data before manipulating it
str, summary, nrow, ncol are good places to start
st_crs (for sf class objects) and crs (for SpatRaster objects) report the CRS
We’ll practice a few of these now…
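A first-look routine might resemble the sketch below. It uses the example datasets bundled with sf and terra (an assumption, so that it runs anywhere); swap in your own objects.

```r
library(sf)
library(terra)

nc <- read_sf(system.file("shape/nc.shp", package = "sf"))
elev <- rast(system.file("ex/elev.tif", package = "terra"))

# Vector data: size, structure, attribute summaries, CRS, and extent
nrow(nc)
ncol(nc)
str(nc, max.level = 1)
summary(nc$BIR74)
st_crs(nc)
st_bbox(nc)

# Raster data: printing gives dimensions, resolution, extent, and CRS at once
elev
crs(elev, describe = TRUE)
res(elev)
ext(elev)
```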

Saving your data

write_sf for sf objects; writeRaster for SpatRasters

library(sf)
library(terra)

# Both functions guess the output format from the file extension in the path
write_sf(obj = object.to.save, dsn = "path/to/save/object", append = FALSE)
writeRaster(x = raster.to.save, filename = "path/to/save")
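A round-trip sketch that writes to temporary files and reads them back. The GeoPackage (.gpkg) and GeoTIFF (.tif) formats are illustrative choices, and the data are the examples bundled with sf and terra.

```r
library(sf)
library(terra)

nc <- read_sf(system.file("shape/nc.shp", package = "sf"))
elev <- rast(system.file("ex/elev.tif", package = "terra"))

# The file extension tells each function which format to write
vec_path <- tempfile(fileext = ".gpkg")
write_sf(nc, vec_path)

rast_path <- tempfile(fileext = ".tif")
writeRaster(elev, rast_path, overwrite = TRUE)

# Read the files back to confirm they were written correctly
nc_back <- read_sf(vec_path)
elev_back <- rast(rast_path)
nrow(nc_back)
elev_back
```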
