class: center, middle, inverse, title-slide

.title[
# Geospatial visualization: vector maps
]
.author[
### MACS 20400
University of Chicago
]

---

# Map data file formats

* Vector files
* Raster images
* Numeric data
* Popular formats
    * Shapefile
    * GeoJSON

---

# Shapefile

* Encodes points, lines, and polygons
* Collection of files
    * `.shp` - geographic coordinates
    * `.dbf` - data associated with the geographic features
    * `.prj` - projection of the coordinates in the shapefile

--

```
## -- cb_2013_us_county_20m.dbf
## -- cb_2013_us_county_20m.prj
## -- cb_2013_us_county_20m.shp
## -- cb_2013_us_county_20m.shp.iso.xml
## -- cb_2013_us_county_20m.shp.xml
## -- cb_2013_us_county_20m.shx
## -- county_20m.ea.iso.xml
```

---

# GeoJSON

* Uses **J**ava**S**cript **O**bject **N**otation (JSON) file format

```json
{
  "type": "Feature",
  "geometry": {
    "type": "Point",
    "coordinates": [125.6, 10.1]
  },
  "properties": {
    "name": "Dinagat Islands"
  }
}
```

* Plain text files

---

# Simple features

* [Packages in R for spatial data](https://cran.r-project.org/web/views/Spatial.html)
* Tidy packages for spatial data
* Simple features and `sf`
* Emphasizes spatial geometry
* Describes how to store and retrieve objects
* Defines geometrical operations

---

# What is a feature?
* Thing or an object in the real world
* Sets of features
* Geometry
* Attributes

---

# Dimensions

* Geometries composed of points
* Coordinates in a 2-, 3- or 4-dimensional space
* All points in a geometry have the same dimensionality
* X and Y coordinates
* Z coordinate
* M coordinate (measure associated with point rather than the feature)

---

# Simple feature geometry types

| type | description |
| ---- | ------------------------------------------------------- |
| `POINT` | zero-dimensional geometry containing a single point |
| `LINESTRING` | sequence of points connected by straight, non-self intersecting line pieces; one-dimensional geometry |
| `POLYGON` | geometry with a positive area (two-dimensional); sequence of points form a closed, non-self intersecting ring; the first ring denotes the exterior ring, zero or more subsequent rings denote holes in this exterior ring |

--

* `MULTIPOINT`
* `MULTILINESTRING`
* `MULTIPOLYGON`

---

# Simple features in R

* Uses basic R data structures
* Data frame with one row per feature
* Lots of list columns

---

# Importing shapefiles

```r
chi_shape <- here("static/data/Boundaries - Community Areas (current)/geo_export_328cdcbf-33ba-4997-8ce8-90953c6fec19.shp") %>%
  st_read()
```

```
## Reading layer `geo_export_328cdcbf-33ba-4997-8ce8-90953c6fec19' from data source `/Users/soltoffbc/Projects/Computing for Social Sciences/course-site/static/data/Boundaries - Community Areas (current)/geo_export_328cdcbf-33ba-4997-8ce8-90953c6fec19.shp'
##   using driver `ESRI Shapefile'
## Simple feature collection with 77 features and 9 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: -87.94011 ymin: 41.64454 xmax: -87.52414 ymax: 42.02304
## Geodetic CRS:  WGS84(DD)
```

---

# Importing shapefiles

```r
chi_shape
```

```
## Simple feature collection with 77 features and 9 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: -87.94011 ymin: 41.64454 xmax: -87.52414 ymax: 42.02304
## Geodetic CRS:  WGS84(DD)
## First 10 features:
##    perimeter       community shape_len shape_area area comarea area_numbe
## 1          0         DOUGLAS  31027.05   46004621    0       0         35
## 2          0         OAKLAND  19565.51   16913961    0       0         36
## 3          0     FULLER PARK  25339.09   19916705    0       0         37
## 4          0 GRAND BOULEVARD  28196.84   48492503    0       0         38
## 5          0         KENWOOD  23325.17   29071742    0       0         39
## 6          0  LINCOLN SQUARE  36624.60   71352328    0       0          4
## 7          0 WASHINGTON PARK  28175.32   42373881    0       0         40
## 8          0       HYDE PARK  29746.71   45105380    0       0         41
## 9          0        WOODLAWN  46936.96   57815180    0       0         42
## 10         0     ROGERS PARK  34052.40   51259902    0       0          1
##    area_num_1 comarea_id                       geometry
## 1          35          0 MULTIPOLYGON (((-87.60914 4...
## 2          36          0 MULTIPOLYGON (((-87.59215 4...
## 3          37          0 MULTIPOLYGON (((-87.6288 41...
## 4          38          0 MULTIPOLYGON (((-87.60671 4...
## 5          39          0 MULTIPOLYGON (((-87.59215 4...
## 6           4          0 MULTIPOLYGON (((-87.67441 4...
## 7          40          0 MULTIPOLYGON (((-87.60604 4...
## 8          41          0 MULTIPOLYGON (((-87.58038 4...
## 9          42          0 MULTIPOLYGON (((-87.57714 4...
## 10          1          0 MULTIPOLYGON (((-87.65456 4...
```

---

# Importing shapefiles

```r
select(chi_shape, community, geometry)
```

```
## Simple feature collection with 77 features and 1 field
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: -87.94011 ymin: 41.64454 xmax: -87.52414 ymax: 42.02304
## Geodetic CRS:  WGS84(DD)
## First 10 features:
##          community                       geometry
## 1          DOUGLAS MULTIPOLYGON (((-87.60914 4...
## 2          OAKLAND MULTIPOLYGON (((-87.59215 4...
## 3      FULLER PARK MULTIPOLYGON (((-87.6288 41...
## 4  GRAND BOULEVARD MULTIPOLYGON (((-87.60671 4...
## 5          KENWOOD MULTIPOLYGON (((-87.59215 4...
## 6   LINCOLN SQUARE MULTIPOLYGON (((-87.67441 4...
## 7  WASHINGTON PARK MULTIPOLYGON (((-87.60604 4...
## 8        HYDE PARK MULTIPOLYGON (((-87.58038 4...
## 9         WOODLAWN MULTIPOLYGON (((-87.57714 4...
## 10     ROGERS PARK MULTIPOLYGON (((-87.65456 4...
```

---

# Importing GeoJSON files

```r
chi_json <- here("static/data/Boundaries - Community Areas (current).geojson") %>%
  st_read()
```

```
## Reading layer `Boundaries - Community Areas (current)' from data source
##   `/Users/soltoffbc/Projects/Computing for Social Sciences/course-site/static/data/Boundaries - Community Areas (current).geojson'
##   using driver `GeoJSON'
## Simple feature collection with 77 features and 9 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: -87.94011 ymin: 41.64454 xmax: -87.52414 ymax: 42.02304
## Geodetic CRS:  WGS 84
```

---

# Drawing maps with `sf` objects

```r
usa <- here(
  "static", "data", "census_bureau",
  "cb_2013_us_state_20m", "cb_2013_us_state_20m.shp"
) %>%
  st_read()
```

```
## Reading layer `cb_2013_us_state_20m' from data source
##   `/Users/soltoffbc/Projects/Computing for Social Sciences/course-site/static/data/census_bureau/cb_2013_us_state_20m/cb_2013_us_state_20m.shp'
##   using driver `ESRI Shapefile'
## Simple feature collection with 52 features and 9 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: -179.1473 ymin: 17.88481 xmax: 179.7785 ymax: 71.35256
## Geodetic CRS:  NAD83
```

---

# USA boundaries

```r
ggplot(data = usa) +
  geom_sf()
```

<img src="index_files/figure-html/geom-sf-1.png" width="864" />

---

# Plot a subset of a map

```r
usa_48 <- usa %>%
  filter(!(NAME %in% c("Alaska", "District of Columbia", "Hawaii", "Puerto Rico")))

ggplot(data = usa_48) +
  geom_sf()
```

<img src="index_files/figure-html/usa-subset-1.png" width="864" />

---

# Just another `ggplot()`

```r
ggplot(data = usa_48) +
  geom_sf(fill = "palegreen", color = "black")
```

<img src="index_files/figure-html/usa-fill-1.png" width="864" />

---

# `albersusa`

```r
library(albersusa)

ggplot(data = usa_sf()) +
  geom_sf()
```

<img src="index_files/figure-html/albersusa-1.png" width="864" />

---

# Points

```r
library(nycflights13)
airports
```

```
## # A tibble: 1,458 × 8
##    faa   name                             lat    lon   alt    tz dst   tzone
##    <chr> <chr>                          <dbl>  <dbl> <dbl> <dbl> <chr> <chr>
##  1 04G   Lansdowne Airport               41.1  -80.6  1044    -5 A     America/…
##  2 06A   Moton Field Municipal Airport   32.5  -85.7   264    -6 A     America/…
##  3 06C   Schaumburg Regional             42.0  -88.1   801    -6 A     America/…
##  4 06N   Randall Airport                 41.4  -74.4   523    -5 A     America/…
##  5 09J   Jekyll Island Airport           31.1  -81.4    11    -5 A     America/…
##  6 0A9   Elizabethton Municipal Airport  36.4  -82.2  1593    -5 A     America/…
##  7 0G6   Williams County Airport         41.5  -84.5   730    -5 A     America/…
##  8 0G7   Finger Lakes Regional Airport   42.9  -76.8   492    -5 A     America/…
##  9 0P2   Shoestring Aviation Airfield    39.8  -76.6  1000    -5 U     America/…
## 10 0S9   Jefferson County Intl           48.1  -123.   108    -8 A     America/…
## # … with 1,448 more rows
```

---

# Points

```r
ggplot(airports, aes(lon, lat)) +
  geom_point()
```

<img src="index_files/figure-html/scatter-1.png" width="864" />

---

# Points

```r
ggplot(data = usa_48) +
  geom_sf() +
  geom_point(data = airports, aes(x = lon, y = lat), shape = 1)
```

<img src="index_files/figure-html/flights-usa-1.png" width="864" />

---

# Cropped map

```r
ggplot(data = usa_48) +
  geom_sf() +
  geom_point(data = airports, aes(x = lon, y = lat), shape = 1) +
  coord_sf(
    xlim = c(-130, -60),
    ylim = c(20, 50)
  )
```

<img src="index_files/figure-html/crop-1.png" width="864" />

---

# Converting to `sf` data frame

```r
airports_sf <- st_as_sf(airports, coords = c("lon", "lat"))
st_crs(airports_sf) <- 4326 # set the coordinate reference system
airports_sf
```

```
## Simple feature collection with 1458 features and 6 fields
## Geometry type: POINT
## Dimension:     XY
## Bounding box:  xmin: -176.646 ymin: 19.72137 xmax: 174.1136 ymax: 72.27083
## Geodetic CRS:  WGS 84
## # A tibble: 1,458 × 7
##    faa   name                    alt    tz dst   tzone             geometry
##  * <chr> <chr>                 <dbl> <dbl> <chr> <chr>          <POINT [°]>
##  1 04G   Lansdowne Airport      1044    -5 A     Amer… (-80.61958 41.13047)
##  2 06A   Moton Field Municipa…   264    -6 A     Amer… (-85.68003 32.46057)
##  3 06C   Schaumburg Regional     801    -6 A     Amer… (-88.10124 41.98934)
##  4 06N   Randall Airport         523    -5 A     Amer… (-74.39156 41.43191)
##  5 09J   Jekyll Island Airport    11    -5 A     Amer… (-81.42778 31.07447)
##  6 0A9   Elizabethton Municip…  1593    -5 A     Amer… (-82.17342 36.37122)
##  7 0G6   Williams County Airp…   730    -5 A     Amer… (-84.50678 41.46731)
##  8 0G7   Finger Lakes Regiona…   492    -5 A     Amer… (-76.78123 42.88356)
##  9 0P2   Shoestring Aviation …  1000    -5 U     Amer… (-76.64719 39.79482)
## 10 0S9   Jefferson County Intl   108    -8 A     Amer… (-122.8106 48.05381)
## # … with 1,448 more rows
```

---

# Plotting with two sf data frames

```r
ggplot() +
  geom_sf(data = usa_48) +
  geom_sf(data = airports_sf, shape = 1) +
  coord_sf(
    xlim = c(-130, -60),
    ylim = c(20, 50)
  )
```

<img src="index_files/figure-html/flights-sf-plot-1.png" width="864" />

---

# Fill (choropleths)

```r
(fb_state <- here(
  "static", "data", "census_bureau",
  "ACS_13_5YR_B05012_state", "ACS_13_5YR_B05012.csv"
) %>%
  read_csv() %>%
  mutate(rate = HD01_VD03 / HD01_VD01))
```

```
## # A tibble: 51 × 10
##    GEO.id      GEO.id2 `GEO.display-la…` HD01_VD01 HD02_VD01 HD01_VD02 HD02_VD02
##    <chr>       <chr>   <chr>                 <dbl> <lgl>         <dbl>     <dbl>
##  1 0400000US01 01      Alabama             4799277 NA          4631045      2881
##  2 0400000US02 02      Alaska               720316 NA           669941      1262
##  3 0400000US04 04      Arizona             6479703 NA          5609835      7725
##  4 0400000US05 05      Arkansas            2933369 NA          2799972      2568
##  5 0400000US06 06      California         37659181 NA         27483342     30666
##  6 0400000US08 08      Colorado            5119329 NA          4623809      5778
##  7 0400000US09 09      Connecticut         3583561 NA          3096374      5553
##  8 0400000US10 10      Delaware             908446 NA           831683      2039
##  9 0400000US11 11      District of Colu…    619371 NA           534142      2017
## 10 0400000US12 12      Florida            19091156 NA         15392410     16848
## # … with 41 more rows, and 3 more variables: HD01_VD03 <dbl>, HD02_VD03 <dbl>,
## #   rate <dbl>
```

---

# Join the data

```r
(usa_fb <- usa_48 %>%
  left_join(fb_state, by = c("STATEFP" = "GEO.id2")))
```

```
## Simple feature collection with 48 features and 18 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: -124.7332 ymin: 24.5447 xmax: -66.9499 ymax: 49.38436
## Geodetic CRS:  NAD83
## First 10 features:
##    STATEFP  STATENS    AFFGEOID GEOID STUSPS        NAME LSAD        ALAND
## 1       01 01779775 0400000US01    01     AL     Alabama   00 131172434095
## 2       05 00068085 0400000US05    05     AR    Arkansas   00 134772954601
## 3       06 01779778 0400000US06    06     CA  California   00 403482685922
## 4       09 01779780 0400000US09    09     CT Connecticut   00  12541965607
## 5       12 00294478 0400000US12    12     FL     Florida   00 138897453172
## 6       13 01705317 0400000US13    13     GA     Georgia   00 148962779995
## 7       16 01779783 0400000US16    16     ID       Idaho   00 214045724209
## 8       17 01779784 0400000US17    17     IL    Illinois   00 143793994610
## 9       18 00448508 0400000US18    18     IN     Indiana   00  92789545929
## 10      20 00481813 0400000US20    20     KS      Kansas   00 211752860834
##         AWATER      GEO.id GEO.display-label HD01_VD01 HD02_VD01 HD01_VD02
## 1   4594920201 0400000US01           Alabama   4799277        NA   4631045
## 2   2958815561 0400000US05          Arkansas   2933369        NA   2799972
## 3  20484304865 0400000US06        California  37659181        NA  27483342
## 4   1815409624 0400000US09       Connecticut   3583561        NA   3096374
## 5  31413676956 0400000US12           Florida  19091156        NA  15392410
## 6   4947803555 0400000US13           Georgia   9810417        NA   8859747
## 7   2397731902 0400000US16             Idaho   1583364        NA   1489560
## 8   6201680290 0400000US17          Illinois  12848554        NA  11073828
## 9   1536677621 0400000US18           Indiana   6514861        NA   6206801
## 10  1346718440 0400000US20            Kansas   2868107        NA   2677007
##    HD02_VD02 HD01_VD03 HD02_VD03       rate                       geometry
## 1       2881    168232      2881 0.03505361 MULTIPOLYGON (((-88.31002 3...
## 2       2568    133397      2568 0.04547570 MULTIPOLYGON (((-94.61792 3...
## 3      30666  10175839     30666 0.27020872 MULTIPOLYGON (((-118.6034 3...
## 4       5553    487187      5553 0.13595053 MULTIPOLYGON (((-73.69595 4...
## 5      16848   3698746     16848 0.19374133 MULTIPOLYGON (((-80.6602 24...
## 6       7988    950670      7988 0.09690414 MULTIPOLYGON (((-85.60516 3...
## 7       2528     93804      2528 0.05924348 MULTIPOLYGON (((-117.2151 4...
## 8      10091   1774726     10093 0.13812652 MULTIPOLYGON (((-91.51033 4...
## 9       4499    308060      4500 0.04728574 MULTIPOLYGON (((-88.09496 3...
## 10      3095    191100      3100 0.06662931 MULTIPOLYGON (((-102.0517 4...
```

---

# Draw the map

```r
ggplot(data = usa_fb) +
  geom_sf(aes(fill = rate))
```

<img src="index_files/figure-html/geom-map-state-1.png" width="864" />

---

# Exercise using `tidycensus`

![](http://www.vurtegopogo.com/wp-content/uploads/2016/10/using-a-pogo-stick-for-exercise-and-training.jpg)

---

# Bin data to discrete intervals

* Continuous vs. discrete variables for color
* Collapse to a discrete variable

---

# `cut_interval()`

```r
usa_fb %>%
  mutate(rate_cut = cut_interval(rate, n = 6)) %>%
  ggplot() +
  geom_sf(aes(fill = rate_cut))
```

<img src="index_files/figure-html/cut-interval-1.png" width="864" />

---

# `cut_number()`

```r
usa_fb %>%
  mutate(rate_cut = cut_number(rate, n = 6)) %>%
  ggplot() +
  geom_sf(aes(fill = rate_cut))
```

<img src="index_files/figure-html/cut-number-1.png" width="864" />

---

# Map projection

<iframe width="560" height="315" src="https://www.youtube.com/embed/vVX-PrBRtTY?rel=0" frameborder="0" allowfullscreen></iframe>

---

# Map projection

.center[

![[Mercator Projection](https://xkcd.com/2082/)](https://imgs.xkcd.com/comics/mercator_projection.png)

]

---

# Map projection

* Coordinate reference system
* `proj4string`

---

# Mercator projection

```r
map_proj_base +
  coord_sf(crs = "+proj=merc") +
  ggtitle("Mercator projection")
```

<img src="index_files/figure-html/mercator-sf-1.png" width="864" />

---

# Projection using standard lines

<img src="index_files/figure-html/projection-rest-1.png" width="864" />

---

# Select a color palette

<img src="index_files/figure-html/color-wheel-1.png" width="864" />

--

* Optimizing color palette

---

# Color Brewer

* [Color Brewer](http://colorbrewer2.org/)

---

# Sequential palettes

<img src="index_files/figure-html/cb-seq-1.png" width="864" />

---

# Sequential palettes

<img src="index_files/figure-html/cb-seq-map-1.png" width="864" />

---

# Diverging palettes

<img src="index_files/figure-html/cb-div-1.png" width="864" />

---

# Diverging palettes

<img src="index_files/figure-html/cb-div-map-1.png" width="864" />

---

# Qualitative

<img src="index_files/figure-html/cb-qual-1.png" width="864" />

---

# Qualitative

<img src="index_files/figure-html/cb-qual-map-1.png" width="864" />

---

# Viridis

<img src="index_files/figure-html/viridis-1.png" width="864" />
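
---

# Applying a palette in code

The palette slides show only rendered figures, so here is a minimal sketch of how such scales could be attached to an `sf` map. It assumes the `sf` and `ggplot2` packages are installed and uses the North Carolina shapefile bundled with `sf` rather than the course data; `sids_rate`, `rate_cut`, `viridis_map`, and `brewer_map` are names invented for this illustration, and the `"YlGnBu"` palette is an arbitrary choice.

```r
library(sf)
library(ggplot2)

# Assumption: the demo shapefile shipped with sf stands in for the course data
nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

# Illustrative rate: 1974 SIDS cases per 1,000 births
nc$sids_rate <- 1000 * nc$SID74 / nc$BIR74

# Continuous fill: viridis scale for a numeric variable
viridis_map <- ggplot(data = nc) +
  geom_sf(aes(fill = sids_rate)) +
  scale_fill_viridis_c()

# Discrete fill: bin into 4 equal-count groups, then use a sequential
# Color Brewer palette
nc$rate_cut <- cut_number(nc$sids_rate, n = 4)

brewer_map <- ggplot(data = nc) +
  geom_sf(aes(fill = rate_cut)) +
  scale_fill_brewer(palette = "YlGnBu")
```

Printing `viridis_map` or `brewer_map` draws the corresponding map. A sequential palette suits a rate that runs from low to high; a diverging palette would fit better if the variable had a meaningful midpoint.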