HES 505 Fall 2023: Session 4
Revisit the components of spatial data
Describe some of the key considerations for thinking about spatial data
Introduce the two primary R packages for spatial workflows
Learn to read and explore spatial objects in R
Cartesian coordinate system
origin (O) = the point at which both measurement systems intersect
Adaptable to multiple dimensions (e.g. z for altitude)
Latitude and Longitude
The earth is not flat…
Global Reference Systems (GRS)
Graticule: the grid formed by the intersection of longitude and latitude
The graticule is based on an ellipsoid model of earth’s surface and contained in the datum
The datum describes which ellipsoid to use and the precise relations between locations on earth’s surface and Cartesian coordinates
Geocentric datums (e.g., WGS84): referenced to earth’s center of gravity
Local datums (e.g., NAD83): better models for local variation in earth’s surface
How much of the world does the data cover?
For rasters, these are the corners of the lattice
For vectors, we call this the bounding box
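As a rough sketch of how extent shows up in R, the snippet below asks a vector object for its bounding box (st_bbox from sf) and a raster for the corners of its lattice (ext from terra). The polygon, the raster dimensions, and all coordinates are invented for illustration.

```r
library(sf)
library(terra)

# Vector: a hypothetical triangle in longitude/latitude (EPSG:4326)
poly <- st_sfc(
  st_polygon(list(rbind(c(-117, 43), c(-114, 43.5), c(-115, 45),
                        c(-117, 43)))),
  crs = 4326
)
bb <- st_bbox(poly)  # named values: xmin, ymin, xmax, ymax
print(bb)

# Raster: a hypothetical 10 x 10 lattice covering a similar area
r <- rast(nrows = 10, ncols = 10, xmin = -117, xmax = -114,
          ymin = 43, ymax = 45, crs = "EPSG:4326")
e <- ext(r)          # SpatExtent: xmin, xmax, ymin, ymax
print(e)

# Resolution follows from extent and lattice size (cell width, cell height)
print(res(r))
```

Note that st_bbox reports values in the order xmin, ymin, xmax, ymax, while ext reports xmin, xmax, ymin, ymax.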
Resolution: the accuracy with which the location and shape of a map’s features can be depicted
Minimum Mapping Unit: The minimum size and dimensions that can be reliably represented at a given map scale.
Map scale vs. scale of analysis
The earth is not flat…
But maps, screens, and publications are…
Projections describe how the data should be translated to a flat surface
Rely on ‘developable surfaces’
Described by the Coordinate Reference System (CRS)
Projection necessarily induces some form of distortion (tearing, compression, or shearing)
Some projections minimize distortion of angle, area, or distance
Others attempt to avoid extreme distortion of any kind
Includes: Datum, ellipsoid, units, and other information (e.g., False Easting, Central Meridian) to further map the projection to the GCS
Not all projections have/require all of the parameters
Equal-area for thematic maps
Conformal for presentations
Mercator or equidistant for navigation and distance
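A minimal sketch of reprojection with sf, assuming two made-up points stored in longitude/latitude. EPSG:5070 (NAD83 / Conus Albers, an equal-area projection) is chosen here only as an example; the right target CRS depends on the study area and on which kind of distortion matters for the analysis.

```r
library(sf)

# Two made-up points in longitude/latitude on the WGS84 datum (EPSG:4326)
pts <- st_sfc(st_point(c(-116.2, 43.6)), st_point(c(-114.0, 46.9)),
              crs = 4326)

# Reproject to an equal-area projected CRS
# (EPSG:5070 is an assumption made for illustration)
pts_albers <- st_transform(pts, 5070)

st_is_longlat(pts)         # TRUE: coordinates are angular
st_is_longlat(pts_albers)  # FALSE: coordinates are projected

# Printing the CRS shows the datum, ellipsoid, units, and projection parameters
st_crs(pts_albers)
```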
Geometries, support, and spatial messiness
A linestring is simple if it does not intersect itself
Empty geometries arise when an operation produces NULL outcomes (like looking for the intersection between two non-intersecting polygons)
sf allows empty geometries to make sure that information about the data type is retained
Similar to a data.frame with no rows or a list with NULL values
Most vector operations require simple, valid geometries
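The checks below sketch how simple, empty, and valid geometries can be inspected with sf. All shapes are toy examples built by hand for illustration.

```r
library(sf)

# A linestring that crosses itself is not simple
crossing <- st_sfc(st_linestring(rbind(c(0, 0), c(1, 1), c(1, 0), c(0, 1))))
st_is_simple(crossing)   # FALSE

# Two unit squares that do not touch
sq1 <- st_polygon(list(rbind(c(0, 0), c(1, 0), c(1, 1), c(0, 1), c(0, 0))))
sq2 <- st_polygon(list(rbind(c(2, 2), c(3, 2), c(3, 3), c(2, 3), c(2, 2))))

# Their intersection is an empty geometry rather than NULL
empty <- st_intersection(sq1, sq2)
st_is_empty(empty)       # TRUE
class(empty)             # still an sfg object that records a geometry type

# A "bow-tie" polygon whose ring crosses itself is not valid
bowtie <- st_sfc(st_polygon(list(
  rbind(c(0, 0), c(1, 1), c(1, 0), c(0, 1), c(0, 0))
)))
st_is_valid(bowtie)      # FALSE

# st_make_valid() attempts a repair so later operations can proceed
fixed <- st_make_valid(bowtie)
st_is_valid(fixed)       # TRUE
```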
For vectors, the attribute-geometry-relationship can be:
constant = applies to every point in the geometry (lines and polygons are just lots of points)
identity = a value unique to a geometry
aggregate = a single value that integrates data across the geometry
Rasters can have point (attribute refers to the cell center) or cell (attribute refers to an area similar to the pixel) support
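For vectors, sf stores the attribute-geometry relationship alongside the data, and it can be inspected with st_agr and declared with st_set_agr. The sketch below uses a hypothetical two-polygon layer with a single plot_id column; declaring it as "identity" is an assumption about what that column means.

```r
library(sf)

# Two toy polygons standing in for survey plots (hypothetical data)
sq1 <- st_polygon(list(rbind(c(0, 0), c(1, 0), c(1, 1), c(0, 1), c(0, 0))))
sq2 <- st_polygon(list(rbind(c(2, 2), c(3, 2), c(3, 3), c(2, 3), c(2, 2))))
plots <- st_sf(plot_id = c("A", "B"), geometry = st_sfc(sq1, sq2))

# No relationship has been declared yet, so the value is NA
st_agr(plots)

# Declare that plot_id identifies each geometry
plots_id <- st_set_agr(plots, "identity")
st_agr(plots_id)
```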
Quantitative geography requires that our data are aligned
Achieving alignment is part of reproducible workflows
Making principled decisions about projections, resolution, extent, etc.
R Packages
Most basic form of spatial data
Need x (longitude) and y (latitude) as columns
Need to know your CRS
read_*** necessary to bring in the data
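A minimal sketch of turning a table with coordinate columns into a spatial object. The sites data frame is hypothetical and is built inline to stand in for whatever a read.csv or readr::read_csv call would return; the CRS (EPSG:4326) is likewise an assumption about how the coordinates were recorded.

```r
library(sf)

# Hypothetical site table, as it might look after reading a CSV
sites <- data.frame(
  site = c("a", "b", "c"),
  x = c(-116.20, -115.50, -114.90),  # longitude
  y = c(43.60, 44.10, 43.20)         # latitude
)

# Assumption: coordinates are WGS84 longitude/latitude (EPSG:4326)
sites_sf <- st_as_sf(sites, coords = c("x", "y"), crs = 4326)

class(sites_sf)   # now an sf object as well as a data.frame
names(sites_sf)   # x and y have been folded into the geometry column
st_crs(sites_sf)
```

Without the crs argument the points would still be created, but their CRS would be missing, which is why knowing the CRS up front matters.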
.shp is the shapefile itself
.prj contains the CRS information
.dbf contains the attributes
.shx contains the indices for matching attributes to geometries
st_read and read_sf in the sf package will read shapefiles into R
read_sf leaves character vectors alone (often beneficial)
st_read can handle other datatypes (like geodatabases)
Returns slightly different classes
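To make the difference concrete, the sketch below reads the North Carolina shapefile that ships with the sf package using both functions. It assumes the tibble package is installed, which is what lets read_sf return a tibble-based object.

```r
library(sf)

# Example shapefile bundled with sf
shp <- system.file("shape/nc.shp", package = "sf")

# The shapefile is really a set of sidecar files with the same base name
files <- list.files(dirname(shp), pattern = "^nc\\.")
print(files)

nc_tbl <- read_sf(shp)               # quiet by default
nc_df  <- st_read(shp, quiet = TRUE) # quiet = TRUE suppresses the read summary

class(nc_tbl)  # sf plus tibble classes
class(nc_df)   # sf plus data.frame
```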
rast will read rasters using the terra package
Also used to create rasters from scratch
Returns SpatRaster object
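Both uses of rast can be sketched as follows. The first call reads the elevation example file bundled with terra; the second builds a small raster from scratch, with dimensions, extent, and cell values that are made up for illustration.

```r
library(terra)

# Read a raster from disk (example file shipped with terra)
elev <- rast(system.file("ex/elev.tif", package = "terra"))
class(elev)

# Create a raster from scratch: a hypothetical 5 x 5 lattice
r_new <- rast(nrows = 5, ncols = 5, xmin = 0, xmax = 5, ymin = 0, ymax = 5)
values(r_new) <- 1:25   # fill cells row by row, starting at the top left
r_new
```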
Good idea to get to know your data before manipulating it
str, summary, nrow, ncol are good places to start
st_crs (for sf class objects) and crs (for SpatRaster objects)
We’ll practice a few of these now…
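One possible practice run, using the example datasets bundled with sf and terra rather than course data:

```r
library(sf)
library(terra)

nc   <- read_sf(system.file("shape/nc.shp", package = "sf"))
elev <- rast(system.file("ex/elev.tif", package = "terra"))

# Vector object: dimensions, one attribute, and the CRS
nrow(nc)            # number of features
ncol(nc)            # attribute columns plus the geometry column
summary(nc$AREA)
st_crs(nc)

# Raster object: dimensions, cell values, and the CRS
nrow(elev)
ncol(elev)
summary(elev)
crs(elev, describe = TRUE)
```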
write_sf for sf objects; writeRaster for SpatRasters
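A short sketch of writing both kinds of object, again using the bundled example data. The output paths are temporary files and the GeoPackage and GeoTIFF formats are assumptions made for illustration; both functions infer the format from the file extension.

```r
library(sf)
library(terra)

nc   <- read_sf(system.file("shape/nc.shp", package = "sf"))
elev <- rast(system.file("ex/elev.tif", package = "terra"))

# Vector: write to a GeoPackage in a temporary location
out_vec <- tempfile(fileext = ".gpkg")
write_sf(nc, out_vec)

# Raster: write to a GeoTIFF in a temporary location
out_ras <- tempfile(fileext = ".tif")
writeRaster(elev, out_ras, overwrite = TRUE)

file.exists(c(out_vec, out_ras))
```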