HES 505 Fall 2023: Session 4
Revisit the components of spatial data
Describe some of the key considerations for thinking about spatial data
Introduce the two primary R packages for spatial workflows
Learn to read and explore spatial objects in R
Cartesian coordinate system
origin (O) = the point at which the two axes intersect
Adaptable to multiple dimensions (e.g. z for altitude)
Latitude and Longitude
The earth is not flat…
Global Reference Systems (GRS)
Graticule: the grid formed by the intersection of longitude and latitude
The graticule is based on an ellipsoid model of earth’s surface and contained in the datum
The datum describes which ellipsoid to use and the precise relations between locations on earth’s surface and Cartesian coordinates
Geocentric datums (e.g., WGS84): centered on earth’s center of gravity
Local datums (e.g., NAD83): better models for local variation in earth’s surface
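A minimal sketch of how the datum and ellipsoid can be inspected in R, assuming the sf package is installed. The EPSG codes 4326 (WGS84) and 4269 (NAD83) are chosen here for illustration and do not come from the slides.

```r
library(sf)

# Look up two geographic CRSs by EPSG code (illustrative choices)
wgs84 <- st_crs(4326)  # geographic CRS on the WGS84 datum
nad83 <- st_crs(4269)  # geographic CRS on the NAD83 datum

# Names of the two reference systems
wgs84$Name
nad83$Name

# Ellipsoid parameters stored in each CRS
wgs84$SemiMajor
wgs84$InvFlattening
nad83$InvFlattening

# The two CRSs are not treated as the same reference system
wgs84 == nad83
```

Both systems use latitude and longitude, but they are tied to different datums, so the same coordinate pair can refer to slightly different locations.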
How much of the world does the data cover?
For rasters, these are the corners of the lattice
For vectors, we call this the bounding box
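The extent of vector and raster objects can be queried as sketched below, assuming sf and terra are installed. The polygon and raster are made-up examples, not course data.

```r
library(sf)
library(terra)

# Vector: one illustrative square polygon in lon/lat (WGS84)
coords <- rbind(c(-117, 43), c(-116, 43), c(-116, 44), c(-117, 44), c(-117, 43))
poly <- st_sfc(st_polygon(list(coords)), crs = 4326)
bb <- st_bbox(poly)  # bounding box: xmin, ymin, xmax, ymax

# Raster: an illustrative 10 x 10 lattice covering the same area
r <- rast(nrows = 10, ncols = 10,
          xmin = -117, xmax = -116, ymin = 43, ymax = 44,
          crs = "EPSG:4326")
e <- ext(r)  # SpatExtent: xmin, xmax, ymin, ymax

bb
e
```

Note that the two objects report their limits in a different order (sf: xmin, ymin, xmax, ymax; terra: xmin, xmax, ymin, ymax).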
Resolution: the accuracy with which the location and shape of a map’s features can be depicted
Minimum Mapping Unit: The minimum size and dimensions that can be reliably represented at a given map scale.
Map scale vs. scale of analysis
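For rasters, resolution is the cell size in the units of the CRS. The sketch below assumes terra is installed; the extent, the UTM zone 11N CRS (EPSG:32611), and the aggregation factor are illustrative assumptions.

```r
library(terra)

# Illustrative 1 km x 1 km raster with 100 m cells in a projected CRS
r <- rast(xmin = 500000, xmax = 501000, ymin = 4800000, ymax = 4801000,
          resolution = 100, crs = "EPSG:32611")
values(r) <- 1:ncell(r)

res(r)  # cell size (x, y) in CRS units, here metres

# Coarsen the data: each new cell is the mean of a 5 x 5 block of old cells
r_coarse <- aggregate(r, fact = 5, fun = "mean")
res(r_coarse)
```

Coarsening changes what can be reliably represented: features smaller than the new cell size are averaged away.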
The earth is not flat…
But maps, screens, and publications are…
Projections describe how the data should be translated to a flat surface
Rely on ‘developable surfaces’
Described by the Coordinate Reference System (CRS)
Projection necessarily induces some form of distortion (tearing, compression, or shearing)
Some projections minimize distortion of angle, area, or distance
Others attempt to avoid extreme distortion of any kind
The CRS includes: datum, ellipsoid, units, and other information (e.g., False Easting, Central Meridian) to further map the projection to the GCS
Not all projections have/require all of the parameters
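A short sketch of inspecting and changing a CRS with sf, assuming the package is installed. The point location and the target CRS (UTM zone 11N, EPSG:32611) are illustrative assumptions.

```r
library(sf)

# An illustrative point stored as lon/lat on the WGS84 datum
pt <- st_sfc(st_point(c(-116.2, 43.6)), crs = 4326)
st_is_longlat(pt)  # geographic (unprojected) coordinates?

# Project the point to a planar CRS
pt_utm <- st_transform(pt, 32611)
st_is_longlat(pt_utm)

# Parameters carried by the projected CRS
st_crs(pt_utm)$proj4string
st_crs(pt_utm)$units

# The coordinates change from degrees to CRS units
st_coordinates(pt)
st_coordinates(pt_utm)
```

st_transform changes the coordinates themselves; it is different from merely assigning a CRS label to data whose coordinates are already in that system.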
Equal-area for thematic maps
Conformal for presentations
Mercator or equidistant for navigation and distance
Geometries, support, and spatial messiness
A linestring is simple if it does not self-intersect
Empty geometries arise when an operation produces NULL outcomes (like looking for the intersection between two non-intersecting polygons)
sf allows empty geometries to make sure that information about the data type is retained
Similar to a data.frame with no rows or a list with NULL values
Most vector operations require simple, valid geometries
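The ideas of empty, simple, and valid geometries can be checked directly in sf. This is a minimal sketch with made-up geometries and no CRS, assuming sf is installed.

```r
library(sf)

# Empty: the intersection of two points that do not coincide
e <- st_intersection(st_point(c(0, 0)), st_point(c(1, 1)))
st_is_empty(e)

# Simple: this linestring crosses itself, so it is not simple
ls <- st_linestring(rbind(c(0, 0), c(1, 1), c(0, 1), c(1, 0)))
st_is_simple(ls)

# Valid: a "bowtie" polygon whose ring crosses itself is not valid
bowtie <- st_polygon(list(rbind(c(0, 0), c(1, 1), c(0, 1), c(1, 0), c(0, 0))))
st_is_valid(bowtie)

# st_make_valid attempts a repair
fixed <- st_make_valid(bowtie)
st_is_valid(fixed)
```

Checking st_is_valid before overlay or area calculations is a cheap way to catch geometry problems early.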
For vectors, the attribute-geometry-relationship can be:
constant = applies to every point in the geometry (lines and polygons are just lots of points)
identity = a value unique to a geometry
aggregate = a single value that integrates data across the geometry
Rasters can have point (attribute refers to the cell center) or cell (attribute refers to an area similar to the pixel) support
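In sf, the attribute-geometry relationship can be recorded per column. The sketch below assumes sf is installed; the column names and values are hypothetical.

```r
library(sf)

# One illustrative polygon with three hypothetical attributes
sq <- st_polygon(list(rbind(c(0, 0), c(1, 0), c(1, 1), c(0, 1), c(0, 0))))
parcels <- st_sf(
  name       = "Parcel A",  # identifies this geometry
  soil_type  = "loam",      # assumed to hold at every point
  population = 120,         # a total over the whole polygon
  geometry   = st_sfc(sq),
  agr = c(name = "identity", soil_type = "constant", population = "aggregate")
)

# Inspect the recorded relationships
st_agr(parcels)
```

When these relationships are left unset, sf warns in some operations (for example st_intersection) that attributes are assumed to be spatially constant.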
Quantitative geography requires that our data are aligned
Achieving alignment is part of reproducible workflows
Making principled decisions about projections, resolution, extent, etc.
R Packages
Most basic form of spatial data
Need x (longitude) and y (latitude) as columns
Need to know your CRS
read_*** necessary to bring in the data
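A sketch of turning a coordinate table into a spatial object, assuming sf is installed. The site names, coordinates, file name, and the choice of EPSG:4326 are made up for illustration; base R read.csv stands in for whichever read_*** function is used.

```r
library(sf)

# Write a small illustrative table to a temporary CSV
sites <- data.frame(
  site = c("A", "B", "C"),
  lon  = c(-116.2, -115.9, -116.5),
  lat  = c(43.6, 43.8, 43.4)
)
csv_path <- file.path(tempdir(), "sites.csv")
write.csv(sites, csv_path, row.names = FALSE)

# Read the table back in, then declare which columns hold x and y
sites_tbl <- read.csv(csv_path)
sites_sf <- st_as_sf(sites_tbl, coords = c("lon", "lat"), crs = 4326)

sites_sf
```

The crs argument only labels the coordinates; it must match the system in which they were actually recorded.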
.shp is the shapefile itself
.prj contains the CRS information
.dbf contains the attributes
.shx contains the indices for matching attributes to geometries
st_read and read_sf in the sf package will read shapefiles into R
read_sf leaves character vectors alone (often beneficial)
st_read can handle other datatypes (like geodatabases)
Returns slightly different classes
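The two readers can be compared on the North Carolina shapefile that ships with sf. This sketch assumes sf is installed, along with the tibble package that read_sf relies on for its return type.

```r
library(sf)

# Path to an example shapefile bundled with the sf package
shp <- system.file("shape/nc.shp", package = "sf")

nc_tbl <- read_sf(shp)               # quiet by default
nc_df  <- st_read(shp, quiet = TRUE) # quiet = TRUE suppresses the summary

# Same data, slightly different classes
class(nc_tbl)
class(nc_df)
```

Both objects hold the same features; the difference is mainly whether the attribute table behaves as a tibble or a plain data.frame.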
rast will read rasters using the terra package
Also used to create rasters from scratch
Returns SpatRaster object
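Both uses of rast can be sketched as follows, assuming terra is installed. The file is the example elevation raster bundled with terra; the from-scratch grid dimensions are arbitrary.

```r
library(terra)

# Read a raster from file (example data shipped with terra)
f <- system.file("ex/elev.tif", package = "terra")
elev <- rast(f)
class(elev)

# Create a raster from scratch, then fill it with values
blank <- rast(nrows = 5, ncols = 5, xmin = 0, xmax = 5, ymin = 0, ymax = 5)
values(blank) <- 1:ncell(blank)
blank
```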
Good idea to get to know your data before manipulating it
str, summary, nrow, ncol are good places to start
st_crs (for sf class objects) and crs (for SpatRaster objects)
We’ll practice a few of these now…
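One possible first look at a vector and a raster object, using the example data bundled with sf and terra (an assumption; any dataset could be substituted).

```r
library(sf)
library(terra)

nc   <- read_sf(system.file("shape/nc.shp", package = "sf"))
elev <- rast(system.file("ex/elev.tif", package = "terra"))

# Vector object: structure, size, and CRS
str(nc, max.level = 1)
summary(nc$AREA)
nrow(nc)  # number of features
ncol(nc)  # number of columns, including the geometry column
st_crs(nc)

# Raster object: dimensions, value summary, and CRS
nrow(elev)  # number of cell rows
ncol(elev)  # number of cell columns
summary(elev)
crs(elev)   # CRS as a WKT string
```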
write_sf for sf objects; writeRaster for SpatRasters
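A minimal sketch of writing both object types, assuming sf and terra are installed. The output file names and formats (GeoPackage and GeoTIFF in a temporary directory) are illustrative choices.

```r
library(sf)
library(terra)

nc   <- read_sf(system.file("shape/nc.shp", package = "sf"))
elev <- rast(system.file("ex/elev.tif", package = "terra"))

# Vector output: the driver is guessed from the file extension
out_vec <- file.path(tempdir(), "nc_copy.gpkg")
write_sf(nc, out_vec)

# Raster output: overwrite = TRUE replaces an existing file
out_ras <- file.path(tempdir(), "elev_copy.tif")
writeRaster(elev, out_ras, overwrite = TRUE)

file.exists(c(out_vec, out_ras))
```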