23  Advanced Spatial Visualization

Today we will be focusing on the theory and practice of fancy geospatial data visualization.

23.1 Visual Categories and Encodings

Let’s go back to the beginning of this course. There are three categories of information that can be displayed.

  1. Quantitative
  2. Qualitative
  3. Spatial

Lecture 1.2.1

The three types of data can be encoded in:

  • Geometric primitives - points, lines, and areas
  • Visual channels - size, color, shape, position, angle, and texture

An advanced spatial visualization covering multiple layers of information needs to use multiple sets of encodings to convey information quickly and intuitively while not overwhelming the audience.

23.2 Circles, Lines, and Polygons - Oh My!

Fancy maps need distinct visual encodings, so the eye can be drawn to the salient features.

One key way to do this is to give each type of feature its own unique fingerprint of visual encodings, so that no two layers share the same combination of style and aesthetics.

Let’s combine the three datasets we have shown in class for the sacrifice zones projects as example 1.

First, get all the libraries we need loaded up.
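
The package list below is not from the original notes. It is inferred from the functions called later in this chapter (`st_read()`, `leaflet()`, `htmlEscape()`, `clean_names()`, `mdy_hms()`, `label_date_short()`), so treat it as an assumption. This minimal sketch attaches whichever of those packages are installed and reports the rest.

```r
# Packages inferred from the functions used in this chapter (an assumption).
pkgs <- c("sf", "leaflet", "htmltools", "tidyverse", "janitor", "lubridate", "scales")

# Check which packages are installed without attaching them yet.
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- pkgs[!installed]

# Attach the installed packages; report any that still need install.packages().
for (p in pkgs[installed]) {
  library(p, character.only = TRUE)
}
if (length(missing_pkgs) > 0) {
  message("Not installed: ", paste(missing_pkgs, collapse = ", "))
}
```

Any package reported as not installed can be added with `install.packages()` before running the examples.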

23.2.1 Example 1 - Uranium Mines and Navajo Lands

23.2.1.1 Acquire datasets

23.2.1.1.1 US EPA Uranium Mills and Mines Database

Here’s a geospatial dataset the EPA created for abandoned uranium mines in the Western US. It is downloadable as a zip file, which has multiple subdirectories. We will point to the master database as a first exploration.

U <- sf::st_read(dsn = 'uld-ii_gis/Master_Database_and_Shape_Files') %>% 
  st_transform(crs = 4326)
Multiple layers are present in data source C:\Dev\EnviroDataVis\uld-ii_gis\Master_Database_and_Shape_Files, reading layer `ULD_albers'.
Use `st_layers' to list all layer names and their type in a data source.
Set the `layer' argument in `st_read' to read a particular layer.
Warning in evalq((function (..., call. = TRUE, immediate. = FALSE, noBreaks. =
FALSE, : automatically selected the first layer in a data source containing more
than one.
Reading layer `ULD_albers' from data source 
  `C:\Dev\EnviroDataVis\uld-ii_gis\Master_Database_and_Shape_Files' 
  using driver `ESRI Shapefile'
Simple feature collection with 14810 features and 30 fields
Geometry type: MULTIPOINT
Dimension:     XY
Bounding box:  xmin: -3296195 ymin: -1542681 xmax: 1955673 ymax: 4183792
Projected CRS: North_America_Albers_Equal_Area_Conic
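
The warning above appears because the data source holds several layers and `st_read()` picked the first one. A minimal sketch of the fix the message suggests is to list the layers and then name the one to read. The layer name `ULD_albers` is taken from the output above, and the block only runs if the `sf` package and the extracted directory are present.

```r
# Path to the extracted EPA data, as used above.
dsn <- "uld-ii_gis/Master_Database_and_Shape_Files"

if (requireNamespace("sf", quietly = TRUE) && dir.exists(dsn)) {
  # Show the names and geometry types of every layer in the data source.
  print(sf::st_layers(dsn))

  # Read one layer explicitly, which avoids the "multiple layers" warning.
  U <- sf::st_read(dsn = dsn, layer = "ULD_albers")

  # Reproject to WGS84 longitude/latitude for leaflet.
  U <- sf::st_transform(U, crs = 4326)
}
```
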

23.2.1.1.2 Make some exploratory maps

Let’s see what the basic mines dataset looks like.

Figure 23.1 shows the basic map of uranium mines.

leaflet() %>% 
  addTiles() %>% 
  addMarkers(data = U,
             lat = ~LATITUDE,
             lng = ~LONGITUDE,
             clusterOptions = markerClusterOptions())

Figure 23.1: Basic map of uranium mines

There are WAY more uranium mines than I expected. Let’s focus on areas near the Navajo Nation in the four-corners states of Colorado, New Mexico, Arizona, and Utah.

Let’s filter() the U dataset to the spatial scale of interest using the STATE_NAME column.

states <- c('Colorado', 'New Mexico', 'Arizona', 'Utah')
#not the band
U2 <- U %>% 
  filter(STATE_NAME %in% states)
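
A quick way to confirm that a filter like this one narrows the data is to compare row counts before and after. The sketch below uses a made-up five-row stand-in for `U` (the mine names are hypothetical) and base R subsetting in place of `filter()`, so it runs without any packages.

```r
# Toy stand-in for U: mine names and rows are made up for illustration.
toy <- data.frame(
  MINENAME   = c("Mine A", "Mine B", "Mine C", "Mine D", "Mine E"),
  STATE_NAME = c("Colorado", "Texas", "Utah", "Wyoming", "Arizona")
)
states <- c("Colorado", "New Mexico", "Arizona", "Utah")

# Base R equivalent of filter(STATE_NAME %in% states).
toy2 <- toy[toy$STATE_NAME %in% states, ]

nrow(toy)               # rows before filtering
nrow(toy2)              # rows after filtering
table(toy2$STATE_NAME)  # which states remain
```

On the real data, the same check is `nrow(U)` against `nrow(U2)`.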

Ok, let’s map the result and check that the filter limited the dataset to the four states.

Figure 23.2 shows the four-corners mines.

leaflet() %>% 
  addTiles() %>% 
  addProviderTiles(provider = providers$Stamen.Terrain) %>% 
  addCircleMarkers(data = U2,
                   lat = ~LATITUDE,
                   lng = ~LONGITUDE,
                   clusterOptions = markerClusterOptions(),
                   color = 'darkred',
                   label = ~htmlEscape(MINENAME)) %>% 
  addMiniMap()

Figure 23.2: Uranium mines in the four corners states

23.2.1.1.3 Tribal Lands

Federally recognized tribal lands are available here

It is downloadable as a zip file, which needs to be extracted.

After extracting it into the working directory, the default extracted directory is tl_2020_us_aitsn on my machine.

Let’s do the steps - import, display a simple map, then display a fancy map.

tribalLands <- sf::st_read(dsn = 'tl_2020_us_aitsn') %>% 
  st_transform(crs = 4326)
Reading layer `tl_2020_us_aitsn' from data source 
  `C:\Dev\EnviroDataVis\tl_2020_us_aitsn' using driver `ESRI Shapefile'
Simple feature collection with 484 features and 14 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: -124.0932 ymin: 31.50786 xmax: -83.15622 ymax: 48.63591
Geodetic CRS:  NAD83

Make a simple map next.

I am not displaying the basic example because my course website is limited to 100 MB, but the code does work.

leaflet() %>% 
  addTiles() %>% 
  addPolygons(data = tribalLands,
              weight = 1) 

That worked. Let’s try to combine the uranium mines map with the tribal lands of the Navajo Nation, using setView() to center the map. I clicked on Google Maps to get the lat (36.481) and lng (-109.495).

Figure 23.3 shows the result.

leaflet() %>% 
  addTiles() %>% 
  #addProviderTiles(provider = providers$Stamen.Terrain) %>% 
  setView(lng = -109.495, lat = 36.481, zoom = 7) %>% 
  addPolygons(data = tribalLands,
              weight = 1, 
              color = 'blue',
              fillOpacity = 0.2) %>% 
  addCircleMarkers(data = U2,
                   lat = ~LATITUDE,
                   lng = ~LONGITUDE,
                   clusterOptions = markerClusterOptions(),
                   color = 'darkred',
                   label = ~htmlEscape(MINENAME)) %>% 
  addMiniMap()

Figure 23.3: Uranium mines in Navajo Nation boundaries

23.2.1.2 Discussion and Critique

Do Circles and Polygons overlay in a useful way?

What changes do you think would make this map more usable and intuitive?

23.2.2 Example 2 - Water in LA County

LA County runs an annual water deficit, which requires large annual imports of water from multiple sources. Let’s try to quantify these flows.

Let’s start with water supply data from the city of Los Angeles.

Here is the LADWP water supply dataset, with volumes in acre-feet.


library(janitor)

Attaching package: 'janitor'
The following objects are masked from 'package:stats':

    chisq.test, fisher.test
H2O_data <- read_csv('https://data.lacity.org/api/views/qyvz-diiw/rows.csv?accessType=DOWNLOAD') %>% 
  clean_names()
Rows: 49 Columns: 11
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (2): Date Value, Fiscal Year
dbl (9): MWD, LA Aqueduct, Local Groundwater, Recycled Water, Total Acre Fee...

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

23.2.2.1 Plot water supply over time

Figure 23.4 shows a simple bar chart of LA City water supply sources. Note that there is some fancy data manipulation first though. Also, there’s a call to the scales package to make the x-axis label nicer.

H2O_data %>% 
  # keep the date column plus the four supply-source columns
  select(1, 3:6) %>% 
  # reshape from wide to long: one row per date and source
  pivot_longer(names_to = 'parameter', values_to = 'acreFeet', cols = 2:5) %>% 
  # parse the date strings into date-times
  mutate(date_value = lubridate::mdy_hms(date_value)) %>% 
  ggplot(aes(x = date_value, y = acreFeet, fill = parameter)) +
    #geom_line() +
    #geom_point() +
    geom_col() +  # stacked bars; same as geom_bar(stat = 'identity')
    theme_bw() +
    scale_x_datetime(labels = scales::label_date_short(), date_breaks = '5 years') +
    labs(x = '', y = 'Supply in Acre Feet')

Figure 23.4: Water supply trends for LA City
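
The reshaping step in the plotting code is easier to see on a tiny table. The sketch below uses made-up numbers and hypothetical column names (not the LADWP data) to show how `pivot_longer()` turns one column per source into one row per date and source, which is the shape `fill = parameter` needs.

```r
library(tidyr)

# Hypothetical wide table: one row per date, one column per water source.
wide <- data.frame(
  date_value  = c("01/01/2020", "01/01/2021"),
  mwd         = c(100, 120),
  la_aqueduct = c(80, 60)
)

# Stack columns 2 and 3 into name/value pairs.
long <- pivot_longer(wide, cols = 2:3,
                     names_to = "parameter", values_to = "acreFeet")
long
```

Each date now appears once per source, so a single `fill` aesthetic can color the stacked bars.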

It is very clear that groundwater and recycling are minimal. Most water is imported from MWD or the LA Aqueduct, with a general long-term trend to rely more on MWD over time.

The LA Aqueduct gets its water from the Owens Valley and by diverting water from Mono Lake. That source has been restricted over the last 20 years by agreements to divert less water in drought years in order to keep Mono Lake levels stable.

More complicated is the MWD, the Metropolitan Water District of Southern California.

Water in the MWD comes from three main sources: the Colorado River Aqueduct, the State Water Project, and local sources.

MWD Water Sources

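Calculate the latest ten-year average contribution from the LA Aqueduct and MWD sources. The code assumes that the Colorado River Aqueduct supplies 25% of MWD water and the State Water Project supplies 35%; these are the multipliers in the final mutate(). The table below shows the values.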

H2O_2 <- H2O_data %>% 
  # extract the calendar year from the date string
  mutate(yr = lubridate::year(lubridate::mdy_hms(date_value))) %>% 
  filter(yr > 2006) %>% 
  summarize(avg_LA_aqueduct = mean(la_aqueduct_percent_of_total),
            avg_MWD = mean(mwd_percent_of_total)) %>%
  # split the MWD share: 25% Colorado River Aqueduct, 35% State Water Project
  mutate(CO_aqueduct = 0.25 * avg_MWD,
         state_water_project = 0.35 * avg_MWD)

H2O_2

# A tibble: 1 × 4
  avg_LA_aqueduct avg_MWD CO_aqueduct state_water_project
            <dbl>   <dbl>       <dbl>               <dbl>
1            31.9    54.8        13.7                19.2

Ok, about 32% of the water came from the LA Aqueduct, 13.7% came from the Colorado River Aqueduct, and 19.2% came from the State Water Project. Together, the three aqueducts supplied about 65% of the city's water.
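
The arithmetic behind these percentages can be reproduced directly from the two averages in the table. This is only a check of the numbers above; the 0.25 and 0.35 multipliers are the ones used in the code.

```r
# Ten-year averages (percent of total supply), copied from the table above.
avg_LA_aqueduct <- 31.9
avg_MWD         <- 54.8

# Split the MWD share using the multipliers from the code above.
CO_aqueduct         <- 0.25 * avg_MWD
state_water_project <- 0.35 * avg_MWD

# Combined share delivered by the three aqueducts.
total_aqueducts <- avg_LA_aqueduct + CO_aqueduct + state_water_project

round(c(CO_aqueduct = CO_aqueduct,
        state_water_project = state_water_project,
        total_aqueducts = total_aqueducts), 1)
```
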

I can now overlay some markers or change the aqueduct thickness or think of some other way to visualize the source of the water coming to LA county.
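
One possible way to set the aqueduct thickness, sketched under assumptions: scale each share to a line width, with the largest flow drawn at a chosen maximum. The `max_weight` value of 10 pixels and the vector names are hypothetical choices, not part of the original analysis.

```r
# Average percent of LA City supply by aqueduct, from the table above.
share <- c(la_aqueduct = 31.9,
           colorado_river_aqueduct = 13.7,
           state_water_project = 19.2)

# Hypothetical width in pixels for the largest flow.
max_weight <- 10

# Scale each share relative to the largest one and round to whole pixels.
line_weight <- round(share / max(share) * max_weight)
line_weight
```

With aqueduct line geometries in hand, these values could be passed to the `weight` argument of leaflet's `addPolylines()`, one call per aqueduct.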

23.2.2.2 Optional Exercise #2

  1. What other way would you show this information? A Sankey diagram?