Today we will be focusing on the practice of geospatial data visualization.
Once again, my preferred framework for the data visualization workflow is shown in Figure 14.1.
4.1 Load and Install Packages
Load the tidyverse and sf. Today we will be making static maps in ggplot, which is part of the tidyverse ecosystem. We need the sf package to load, transform, and display geospatial data.
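A minimal sketch of that setup step is below. The install-if-missing check and the helper variable `to_install` are illustrative assumptions; if both packages are already installed, the two `library()` calls are all you need.

```r
# Install any missing packages once, then load them for this session.
pkgs <- c("tidyverse", "sf")
to_install <- pkgs[!vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)]
if (length(to_install) > 0) install.packages(to_install)

library(tidyverse)  # attaches ggplot2, dplyr, and the rest of the core tidyverse
library(sf)         # reading, transforming, and plotting geospatial data
```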
4.3 Visualize the Data - Geospatial ggplot Edition
While the tidyverse doesn’t have all the features of leaflet, ggplot is a quick way to visualize geospatial data as static maps. There are times when a static ggplot map is sufficient, and even preferable to a more detailed leaflet map.
Note that ggplot uses + rather than the magrittr %>% for connecting lines of code.
In contrast to the leaflet map, ggplot defaults to showing the x- and y-axis coordinates (longitude and latitude), draws guidelines, and only draws the counties in the dataset, rather than defaulting to an interactive map.
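A basic version of such a map might look like the sketch below. It assumes `nc` is the North Carolina counties shapefile that ships with the sf package; the object name `basic_map` is only for illustration.

```r
library(sf)
library(ggplot2)

# Assumption: nc is the North Carolina counties shapefile bundled with sf.
nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

# Layers are chained with +, not piped with %>%.
basic_map <- ggplot(data = nc) +
  geom_sf()

basic_map  # printing the object draws the map
```

With no other settings, the plot shows the gray panel background, longitude and latitude on the axes, and only the county polygons contained in `nc`.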
Figure 4.3 shows the same style of map with nc replaced by SoCalEJ.
We have many options to improve a ggplot visualization. Let’s start by cleaning up the background with theme_bw(), which changes the background from gray to a cleaner black-and-white style, as shown in Figure 4.4.
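As a sketch, adding the theme is one more layer on the basic map. This again assumes `nc` is the North Carolina dataset bundled with sf; `map_bw` is an illustrative name.

```r
library(sf)
library(ggplot2)

# Assumption: nc is the North Carolina counties shapefile bundled with sf.
nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

# theme_bw() swaps the gray panel for a white panel with a dark border.
map_bw <- ggplot(data = nc) +
  geom_sf() +
  theme_bw()

map_bw
```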
Use aes(fill = <VARIABLE NAME>) to choose the variable that colors the counties. The color palette is selected with scale_fill_<TYPE>, where TYPE can be any of the following (viridis additionally takes a suffix: _c for continuous, _d for discrete, or _b for binned):
binned
brewer
continuous
date or datetime
discrete
fermenter
viridis
First, let’s use the default palette for BIR79 to show the number of live births in each county for the period starting in 1979. Adding fill = BIR79 to the aes() defaults to ggplot’s continuous gradient, which runs from dark blue to light blue. Figure 4.6 shows the result of adding a fill color scale.
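A sketch of the default fill, assuming the same bundled `nc` dataset (`map_fill` is an illustrative name):

```r
library(sf)
library(ggplot2)

# Assumption: nc is the North Carolina counties shapefile bundled with sf.
nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

# Mapping fill to BIR79 inside aes() colors each county by its value
# and adds a legend using the default continuous color scale.
map_fill <- ggplot(data = nc) +
  geom_sf(aes(fill = BIR79)) +
  theme_bw()

map_fill
```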
Let’s change that to a viridis color scale. The function scale_fill_viridis_c() applies the color-blind-friendly viridis palette as a continuous scale. Figure 4.7 shows this color scale option.
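The change is a single extra layer, sketched here under the same assumption about `nc` (`map_viridis` is an illustrative name):

```r
library(sf)
library(ggplot2)

# Assumption: nc is the North Carolina counties shapefile bundled with sf.
nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

# scale_fill_viridis_c() replaces the default gradient with the
# continuous viridis palette; the _c suffix means "continuous".
map_viridis <- ggplot(data = nc) +
  geom_sf(aes(fill = BIR79)) +
  scale_fill_viridis_c() +
  theme_bw()

map_viridis
```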
We can also add other geoms, like points or labels, to this map. Let’s try to label the counties.
The nc dataset has a variable called NAME for the county names. Figure 4.8 shows the figure when we add the county names using the function geom_sf_text().
Warning in st_point_on_surface.sfc(sf::st_zm(x)): st_point_on_surface may not
give correct results for longitude/latitude data
There is a lot going on in that function. I made the text white with color = 'white', set the font size to 1.5 with size = 1.5, and added the label aesthetic with aes(label = NAME). If you remove the size or the color, you can see why those alterations were made.
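Putting those arguments together, a labeled map could be sketched as follows. It assumes the bundled `nc` dataset and the viridis fill from earlier in this section; `map_labels` is an illustrative name.

```r
library(sf)
library(ggplot2)

# Assumption: nc is the North Carolina counties shapefile bundled with sf.
nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

map_labels <- ggplot(data = nc) +
  geom_sf(aes(fill = BIR79)) +
  # label comes from the data, so it goes inside aes();
  # color and size are fixed values, so they go outside aes().
  geom_sf_text(aes(label = NAME), color = "white", size = 1.5) +
  scale_fill_viridis_c() +
  theme_bw()

map_labels
```

Drawing this plot is what triggers the st_point_on_surface warning: the label positions are computed on unprojected longitude/latitude coordinates.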
4.3.3.1 Exercise - Improve the SoCalEJ Visualization
Show a variable (categorical, continuous, or quantile) using a fill option.
Add two or more SoCal locations to the map using geom_point and your locations table. If that is easy, try increasing the salience of the points through size, color, or shape modifications to that layer.
Figure 4.9 shows a potential example of what that might look like.
It is really hard to see the details here. Let’s learn one last trick: zooming in on a ggplot by adjusting the axis limits. The scale_x_continuous() and scale_y_continuous() functions allow us to set different axis limits. Figure 4.10 shows the