class: center, middle, inverse, title-slide # Spatial Data ## Statistical Computing & Programming ### Shawn Santo ### 05-28-20 --- ## Supplementary materials Companion videos - [Spatial data lecture preview](https://warpwire.duke.edu/w/dcwDAA/) - [Introduction to spatial data](https://warpwire.duke.edu/w/a8wDAA/) - [Spatial data in R with package `sf`](https://warpwire.duke.edu/w/bcwDAA/) - [Visualizing spatial data](https://warpwire.duke.edu/w/b8wDAA/) - [Map layers and CRS](https://warpwire.duke.edu/w/ccwDAA/) - [Wrangling spatial data](https://warpwire.duke.edu/w/c8wDAA/) Additional resources - Simple Features for R [vignettes](https://r-spatial.github.io/sf/) - [CRS in R](https://www.nceas.ucsb.edu/~frazier/RSpatialGuides/OverviewCoordinateReferenceSystems.pdf) by Melanie Frazier - [Leaflet for R](https://rstudio.github.io/leaflet/) --- class: inverse, center, middle # Introduction --- ## Spatial data is different Our typical tidy data frame: .tiny[ ``` #> # A tibble: 336,776 x 19 #> year month day dep_time sched_dep_time dep_delay arr_time #> <int> <int> <int> <int> <int> <dbl> <int> #> 1 2013 1 1 517 515 2 830 #> 2 2013 1 1 533 529 4 850 #> 3 2013 1 1 542 540 2 923 #> 4 2013 1 1 544 545 -1 1004 #> 5 2013 1 1 554 600 -6 812 #> 6 2013 1 1 554 558 -4 740 #> 7 2013 1 1 555 600 -5 913 #> 8 2013 1 1 557 600 -3 709 #> 9 2013 1 1 557 600 -3 838 #> 10 2013 1 1 558 600 -2 753 #> # … with 336,766 more rows, and 12 more variables: sched_arr_time <int>, #> # arr_delay <dbl>, carrier <chr>, flight <int>, tailnum <chr>, #> # origin <chr>, dest <chr>, air_time <dbl>, distance <dbl>, hour <dbl>, #> # minute <dbl>, time_hour <dttm> ``` ] --- A simple features object: .tiny[ ``` #> Simple feature collection with 100 features and 5 fields #> geometry type: MULTIPOLYGON #> dimension: XY #> bbox: xmin: -84.32385 ymin: 33.88199 xmax: -75.45698 ymax: 36.58965 #> epsg (SRID): 4267 #> proj4string: +proj=longlat +datum=NAD27 +no_defs #> First 10 features: #> AREA PERIMETER CNTY_ CNTY_ID NAME #> 1 0.114 1.442 1825 1825 Ashe #> 2 0.061 1.231 1827 1827 Alleghany #> 3 0.143 1.630 1828 1828 Surry #> 4 0.070 2.968 1831 1831 Currituck #> 5 0.153 2.206 1832 1832 Northampton #> 6 0.097 1.670 1833 1833 Hertford #> 7 0.062 1.547 1834 1834 Camden #> 8 0.091 1.284 1835 1835 Gates #> 9 0.118 1.421 1836 1836 Warren #> 10 0.124 1.428 1837 1837 Stokes #> geometry #> 1 MULTIPOLYGON (((-81.47276 3... #> 2 MULTIPOLYGON (((-81.23989 3... #> 3 MULTIPOLYGON (((-80.45634 3... #> 4 MULTIPOLYGON (((-76.00897 3... #> 5 MULTIPOLYGON (((-77.21767 3... #> 6 MULTIPOLYGON (((-76.74506 3... #> 7 MULTIPOLYGON (((-76.00897 3... #> 8 MULTIPOLYGON (((-76.56251 3... #> 9 MULTIPOLYGON (((-78.30876 3... #> 10 MULTIPOLYGON (((-80.02567 3... ``` ] --- Another simple features object: .tiny[ ``` #> Simple feature collection with 94 features and 1 field #> geometry type: MULTIPOLYGON #> dimension: XY #> bbox: xmin: 127456.7 ymin: 26560.42 xmax: 923523.8 ymax: 318097.4 #> epsg (SRID): 32119 #> proj4string: +proj=lcc +lat_1=36.16666666666666 +lat_2=34.33333333333334 +lat_0=33.75 +lon_0=-79 +x_0=609601.22 +y_0=0 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs #> # A tibble: 94 x 2 #> GML_HAB geometry #> <chr> <MULTIPOLYGON [m]> #> 1 Alcoa (((512096.2 183241.7, 512185.7 183203.4, 512226 1831… #> 2 Alligator River (((869633.1 244541.9, 869739.4 243987.6, 869762.7 24… #> 3 Angola Bay (((721333.1 107161.1, 721374.7 107222.9, 721474 1073… #> 4 Bachelor Bay (((813742.2 238618.7, 813730 238603.2, 813693.8 2385… #> 5 Bertie County (((797133.8 247034.5, 797119.5 247030, 797112.2 2470… #> 6 Bladen Lakes Stat… (((658970.6 95406.32, 660025.1 94245.76, 659839.4 94… #> 7 Brinkleyville (((714741 276970.3, 714623.9 276970, 714622.1 277000… #> 8 Buckhorn (((589723.7 253224.6, 589568.5 252937.2, 589689.8 25… #> 9 Buckridge (((871137.4 219894.9, 871124.9 219827.8, 871124.2 21… #> 10 Buffalo Cove (((381445.9 260375.4, 381574.9 259668.3, 381915 2597… #> # … with 84 more rows ``` ] --- ## Spatial data plotting needs care <img src="lec-10_files/figure-html/unnamed-chunk-5-1.png" width="100%" style="display: block; margin: auto;" /> --- <img src="lec-10_files/figure-html/unnamed-chunk-6-1.png" width="100%" style="display: block; margin: auto;" /> --- <img src="lec-10_files/figure-html/unnamed-chunk-7-1.png" width="100%" style="display: block; margin: auto;" /> --- class: center, middle ## Can we combine the two plots? --- <img src="lec-10_files/figure-html/unnamed-chunk-8-1.png" width="100%" style="display: block; margin: auto;" /> --- class: center, middle ## We can, but more care is needed. --- <img src="lec-10_files/figure-html/unnamed-chunk-9-1.png" width="100%" style="display: block; margin: auto;" /> --- ## Spatial data challenges 1. Different data types exist. 2. Special attention must be given to the coordinate reference system (CRS). 3. Manipulating spatial data objects is similar but not identical to manipulating data frame objects. --- class: inverse, center, middle # Spatial data and R --- ## Analysis of spatial data in R .pull-left[ <br/> - Package `raster` contains classes and tools for handling spatial raster data. <br/><br/> - Package `sf` combines the functionality of `sp`, `rgdal`, and `rgeos` into a single package based on tidy simple features. ] .pull-right[ ![](images/vector_raster_comparison.png) ] <br/> Whether or not you use vector or raster data depends on the type of problem and the data source. Our focus will be on vector data and package `sf`. *Source:* https://commons.wikimedia.org/wiki/File:Raster_vector_tikz.png --- ## Installing package `sf` From https://r-spatial.github.io/sf/index.html **Windows** Installing `sf` from source works under windows when Rtools is installed. This downloads the system requirements from rwinlib. **MacOS** ```bash brew install pkg-config brew install gdal ``` Once gdal is installed, you will be able to install sf package from source in R. **Linux** For Unix-alikes, GDAL (>= 2.0.1), GEOS (>= 3.4.0) and Proj.4 (>= 4.8.0) are required. --- ## Features and simple features - A **feature** is a thing or object in the real world: a house, a city, a park, a forest, etc. <br/><br/> - A **simple feature** as defined by OpenGIS Abstract specification is to have both spatial and non-spatial attributes. Spatial attributes are geometry valued, and simple features are based on 2D geometry with linear interpolation between vertices. <br/><br/> .tiny[ ```r #> Simple feature collection with 100 features and 1 field #> geometry type: MULTIPOLYGON #> dimension: XY #> bbox: xmin: -84.32385 ymin: 33.88199 xmax: -75.45698 ymax: 36.58965 #> epsg (SRID): 4326 #> proj4string: +proj=longlat +datum=WGS84 +no_defs #> # A tibble: 100 x 2 #> NAME geometry #> <chr> <MULTIPOLYGON [°]> *#> 1 Ashe (((-81.47276 36.23436, -81.54084 36.27251, -... #> 2 Alleghany (((-81.23989 36.36536, -81.24069 36.37942, -... ``` ] --- ## Simple features examples <img src="lec-10_files/figure-html/unnamed-chunk-12-1.png" width="100%" style="display: block; margin: auto;" /> --- ## Objects `sf` and `sfc` .tiny[ ```r nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE) ``` ] -- .tiny[ ```r nc ``` ``` #> Simple feature collection with 100 features and 14 fields #> geometry type: MULTIPOLYGON #> dimension: XY #> bbox: xmin: -84.32385 ymin: 33.88199 xmax: -75.45698 ymax: 36.58965 #> epsg (SRID): 4267 #> proj4string: +proj=longlat +datum=NAD27 +no_defs #> First 10 features: #> AREA PERIMETER CNTY_ CNTY_ID NAME FIPS FIPSNO CRESS_ID BIR74 #> 1 0.114 1.442 1825 1825 Ashe 37009 37009 5 1091 #> 2 0.061 1.231 1827 1827 Alleghany 37005 37005 3 487 #> 3 0.143 1.630 1828 1828 Surry 37171 37171 86 3188 #> 4 0.070 2.968 1831 1831 Currituck 37053 37053 27 508 #> 5 0.153 2.206 1832 1832 Northampton 37131 37131 66 1421 #> 6 0.097 1.670 1833 1833 Hertford 37091 37091 46 1452 #> 7 0.062 1.547 1834 1834 Camden 37029 37029 15 286 #> 8 0.091 1.284 1835 1835 Gates 37073 37073 37 420 #> 9 0.118 1.421 1836 1836 Warren 37185 37185 93 968 #> 10 0.124 1.428 1837 1837 Stokes 37169 37169 85 1612 #> SID74 NWBIR74 BIR79 SID79 NWBIR79 geometry #> 1 1 10 1364 0 19 MULTIPOLYGON (((-81.47276 3... #> 2 0 10 542 3 12 MULTIPOLYGON (((-81.23989 3... #> 3 5 208 3616 6 260 MULTIPOLYGON (((-80.45634 3... #> 4 1 123 830 2 145 MULTIPOLYGON (((-76.00897 3... #> 5 9 1066 1606 3 1197 MULTIPOLYGON (((-77.21767 3... #> 6 7 954 1838 5 1237 MULTIPOLYGON (((-76.74506 3... #> 7 0 115 350 2 139 MULTIPOLYGON (((-76.00897 3... #> 8 0 254 594 2 371 MULTIPOLYGON (((-76.56251 3... #> 9 4 748 1190 2 844 MULTIPOLYGON (((-78.30876 3... #> 10 1 160 2038 5 176 MULTIPOLYGON (((-80.02567 3... ``` ] --- ```r attributes(nc) ``` ``` #> $names #> [1] "AREA" "PERIMETER" "CNTY_" "CNTY_ID" "NAME" #> [6] "FIPS" "FIPSNO" "CRESS_ID" "BIR74" "SID74" #> [11] "NWBIR74" "BIR79" "SID79" "NWBIR79" "geometry" #> #> $row.names #> [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 #> [18] 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 #> [35] 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 #> [52] 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 #> [69] 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 #> [86] 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 #> #> $class #> [1] "sf" "data.frame" #> #> $sf_column #> [1] "geometry" #> #> $agr #> AREA PERIMETER CNTY_ CNTY_ID NAME FIPS FIPSNO #> <NA> <NA> <NA> <NA> <NA> <NA> <NA> #> CRESS_ID BIR74 SID74 NWBIR74 BIR79 SID79 NWBIR79 #> <NA> <NA> <NA> <NA> <NA> <NA> <NA> #> Levels: constant aggregate identity ``` --- ```r nc_polygons <- st_geometry(nc) nc_polygons ``` ``` #> Geometry set for 100 features #> geometry type: MULTIPOLYGON #> dimension: XY #> bbox: xmin: -84.32385 ymin: 33.88199 xmax: -75.45698 ymax: 36.58965 #> epsg (SRID): 4267 #> proj4string: +proj=longlat +datum=NAD27 +no_defs #> First 5 geometries: ``` ```r #> Geometry set for 100 features #> geometry type: MULTIPOLYGON #> dimension: XY #> bbox: xmin: -84.32385 ymin: 33.88199 xmax: -75.45698 ymax: 36.58965 #> epsg (SRID): 4267 #> proj4string: +proj=longlat +datum=NAD27 +no_defs #> First 5 geometries: *#> MULTIPOLYGON (((-81.47276 36.23436, -81.54084 3... #> MULTIPOLYGON (((-81.23989 36.36536, -81.24069 3... #> MULTIPOLYGON (((-80.45634 36.24256, -80.47639 3... #> MULTIPOLYGON (((-76.00897 36.3196, -76.01735 36... #> MULTIPOLYGON (((-77.21767 36.24098, -77.23461 3... ``` --- ```r attributes(nc_polygons) ``` ``` #> $n_empty #> [1] 0 #> #> $crs #> Coordinate Reference System: #> EPSG: 4267 #> proj4string: "+proj=longlat +datum=NAD27 +no_defs" #> #> $class #> [1] "sfc_MULTIPOLYGON" "sfc" #> #> $precision #> [1] 0 #> #> $bbox #> xmin ymin xmax ymax #> -84.32385 33.88199 -75.45698 36.58965 ``` <br/><br/><br/> -- We see that `nc` has a class attribute `sf`, and object `nc_polygons` has a class attribute `sfc`. What methods are available? --- ```r methods(class = "sf") ``` ``` #> [1] [ [[<- $<- #> [4] aggregate anti_join arrange #> [7] as.data.frame cbind coerce #> [10] dbDataType dbWriteTable distinct #> [13] filter full_join gather #> [16] group_by group_map group_split #> [19] identify initialize inner_join #> [22] left_join mapView merge #> [25] mutate nest plot #> [28] print rbind rename #> [31] right_join sample_frac sample_n #> [34] select semi_join separate_rows #> [37] separate show slice #> [40] slotsFromS3 spread st_agr #> [43] st_agr<- st_area st_as_sf #> [46] st_bbox st_boundary st_buffer #> [49] st_cast st_centroid st_collection_extract #> [52] st_convex_hull st_coordinates st_crop #> [55] st_crs st_crs<- st_difference #> [58] st_geometry st_geometry<- st_interpolate_aw #> [61] st_intersection st_intersects st_is #> [64] st_join st_line_merge st_nearest_points #> [67] st_node st_normalize st_point_on_surface #> [70] st_polygonize st_precision st_segmentize #> [73] st_set_precision st_simplify st_snap #> [76] st_sym_difference st_transform st_triangulate #> [79] st_union st_voronoi st_wrap_dateline #> [82] st_write st_zm summarise #> [85] transmute ungroup unite #> [88] unnest #> see '?methods' for accessing help and source code ``` --- ```r methods(class = "sfc") ``` ``` #> [1] [ [<- as.data.frame #> [4] c coerce format #> [7] fortify identify initialize #> [10] mapView obj_sum Ops #> [13] print rep scale_type #> [16] show slotsFromS3 st_area #> [19] st_as_binary st_as_grob st_as_sf #> [22] st_as_text st_bbox st_boundary #> [25] st_buffer st_cast st_centroid #> [28] st_collection_extract st_convex_hull st_coordinates #> [31] st_crop st_crs st_crs<- #> [34] st_difference st_geometry st_intersection #> [37] st_intersects st_is st_line_merge #> [40] st_nearest_points st_node st_normalize #> [43] st_point_on_surface st_polygonize st_precision #> [46] st_segmentize st_set_precision st_simplify #> [49] st_snap st_sym_difference st_transform #> [52] st_triangulate st_union st_voronoi #> [55] st_wrap_dateline st_write st_zm #> [58] str summary type_sum #> see '?methods' for accessing help and source code ``` --- ## Reading and writing spatial data - `st_read()` / `st_write()`, Shapefile, GeoJSON, KML, ... - `st_as_sfc()` - `st_as_text()`, well-known text format - `st_as_binary()`, well-known binary format <br/><br/><br> See https://r-spatial.github.io/sf/articles/sf2.html for the full set of driver availability. --- ## Plotting with `plot()` ```r plot(nc) ``` <img src="lec-10_files/figure-html/unnamed-chunk-21-1.png" style="display: block; margin: auto;" /> --- ```r plot(nc["NAME"]) ``` <img src="lec-10_files/figure-html/unnamed-chunk-22-1.png" style="display: block; margin: auto;" /> --- ```r par(oma=c(0,2,0,0)) plot(nc["AREA"], graticule = TRUE, axes = TRUE, las = 1) ``` <img src="lec-10_files/figure-html/unnamed-chunk-23-1.png" style="display: block; margin: auto;" /> --- ## What is happening with `[` and the `sf` object? ```r nc["AREA"] ``` ``` #> Simple feature collection with 100 features and 1 field #> geometry type: MULTIPOLYGON #> dimension: XY #> bbox: xmin: -84.32385 ymin: 33.88199 xmax: -75.45698 ymax: 36.58965 #> epsg (SRID): 4267 #> proj4string: +proj=longlat +datum=NAD27 +no_defs #> First 10 features: #> AREA geometry #> 1 0.114 MULTIPOLYGON (((-81.47276 3... #> 2 0.061 MULTIPOLYGON (((-81.23989 3... #> 3 0.143 MULTIPOLYGON (((-80.45634 3... #> 4 0.070 MULTIPOLYGON (((-76.00897 3... #> 5 0.153 MULTIPOLYGON (((-77.21767 3... #> 6 0.097 MULTIPOLYGON (((-76.74506 3... #> 7 0.062 MULTIPOLYGON (((-76.00897 3... #> 8 0.091 MULTIPOLYGON (((-76.56251 3... #> 9 0.118 MULTIPOLYGON (((-78.30876 3... #> 10 0.124 MULTIPOLYGON (((-80.02567 3... ``` --- ```r nc$AREA ``` ``` #> [1] 0.114 0.061 0.143 0.070 0.153 0.097 0.062 0.091 0.118 0.124 0.114 #> [12] 0.153 0.143 0.109 0.072 0.190 0.053 0.199 0.081 0.063 0.044 0.064 #> [23] 0.086 0.128 0.108 0.170 0.111 0.180 0.104 0.077 0.142 0.059 0.131 #> [34] 0.122 0.080 0.118 0.219 0.118 0.155 0.069 0.066 0.145 0.134 0.100 #> [45] 0.099 0.116 0.201 0.180 0.094 0.134 0.168 0.106 0.168 0.207 0.144 #> [56] 0.094 0.203 0.141 0.070 0.065 0.146 0.142 0.154 0.118 0.078 0.125 #> [67] 0.181 0.143 0.091 0.130 0.103 0.095 0.078 0.104 0.098 0.091 0.060 #> [78] 0.131 0.241 0.082 0.120 0.172 0.121 0.163 0.138 0.098 0.167 0.204 #> [89] 0.121 0.051 0.177 0.080 0.195 0.240 0.125 0.225 0.214 0.240 0.042 #> [100] 0.212 ``` --- ```r par(oma=c(0,2,0,0)) plot(nc["AREA"], col = "lightblue", graticule = TRUE, axes = TRUE, las = 1) ``` <img src="lec-10_files/figure-html/unnamed-chunk-26-1.png" style="display: block; margin: auto;" /> --- ```r par(oma=c(0,2,0,0)) plot(st_geometry(nc), graticule = TRUE, axes = TRUE, las = 1) ``` <img src="lec-10_files/figure-html/unnamed-chunk-27-1.png" style="display: block; margin: auto;" /> --- ## Plotting with `ggplot()` ```r ggplot(nc) + geom_sf() + theme_bw(base_size = 16) ``` <img src="lec-10_files/figure-html/unnamed-chunk-28-1.png" style="display: block; margin: auto;" /> --- ```r ggplot(nc) + geom_sf(aes(fill = AREA)) + scale_fill_gradient(low = "#fee8c8", high = "#7f0000") + theme_bw(base_size = 16) ``` <img src="lec-10_files/figure-html/unnamed-chunk-29-1.png" style="display: block; margin: auto;" /> --- class: center, middle <img src="lec-10_files/figure-html/unnamed-chunk-30-1.png" style="display: block; margin: auto;" /> --- ```r p1 <- ggplot(nc) + geom_sf(aes(fill = SID74)) + scale_fill_gradient(low = "#fff7f3", high = "#49006a") + theme_bw(base_size = 16) p2 <- ggplot(nc) + geom_sf(aes(fill = SID79)) + scale_fill_gradient(low = "#fff7f3", high = "#49006a") + theme_bw(base_size = 16) p1 / p2 ``` --- ## Plotting with `mapview()` ```r mapview(nc) ```
--- ```r mapviewOptions(legend.pos = "bottomright") mapview(nc["SID74"], col.regions = sf.colors(10), layer.name = "SID 1974") ```
--- class: inverse, center, middle # Map layers --- ## Game Lands data The North Carolina Department of Environment and Natural Resources, Wildlife Resources Commission and the NC Center for Geographic Information and Analysis has a shapefile data set available on all public Game Lands in NC. https://www.nconemap.gov/datasets/e5ddff9b96204c6181be7c022e61d946_0 We can directly download and unzip the shapefile via ```r download.file("https://opendata.arcgis.com/datasets/e5ddff9b96204c6181be7c022e61d946_0.zip?outSR=%7B%22latestWkid%22%3A32119%2C%22wkid%22%3A32119%7D", destfile = "data/Gamelands.zip") unzip("data/Gamelands.zip", exdir = "data/") ``` -- To see the available files ```r list.files(path = "data/")[1:5] ``` ``` #> [1] "679190a9-3cfb-4be4-9a1e-255188fd3f6e202044-1-707iid.5jdc8.cpg" #> [2] "679190a9-3cfb-4be4-9a1e-255188fd3f6e202044-1-707iid.5jdc8.dbf" #> [3] "679190a9-3cfb-4be4-9a1e-255188fd3f6e202044-1-707iid.5jdc8.prj" #> [4] "679190a9-3cfb-4be4-9a1e-255188fd3f6e202044-1-707iid.5jdc8.shp" #> [5] "679190a9-3cfb-4be4-9a1e-255188fd3f6e202044-1-707iid.5jdc8.shx" ``` --- ## Read in the shapefile ```r nc_gamelands <- st_read("data/679190a9-3cfb-4be4-9a1e-255188fd3f6e202044-1-707iid.5jdc8.shp", quiet = T) ``` .tiny[ ```r print(nc_gamelands, n = 5) ``` ``` #> Simple feature collection with 94 features and 6 fields #> geometry type: MULTIPOLYGON #> dimension: XY #> bbox: xmin: 127456.7 ymin: 26560.42 xmax: 923523.8 ymax: 318097.4 #> epsg (SRID): 32119 #> proj4string: +proj=lcc +lat_1=36.16666666666666 +lat_2=34.33333333333334 +lat_0=33.75 +lon_0=-79 +x_0=609601.22 +y_0=0 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs #> First 5 features: #> OBJECTID GML_HAB SUM_ACRES GameLandID Shape__Are Shape__Len #> 1 1 Alcoa 11210.840 1 45367613 447378.38 #> 2 2 Alligator River 24439.089 2 98901485 151120.16 #> 3 3 Angola Bay 34067.381 3 137865800 87534.59 #> 4 4 Bachelor Bay 2786.258 4 11275585 26613.27 #> 5 5 Bertie County 3883.768 5 15717053 67472.65 #> geometry #> 1 MULTIPOLYGON (((512096.2 18... #> 2 MULTIPOLYGON (((869633.1 24... #> 3 MULTIPOLYGON (((721333.1 10... #> 4 MULTIPOLYGON (((813742.2 23... #> 5 MULTIPOLYGON (((797133.8 24... ``` ] --- ## Metadata for each `sf` object ```r nc #> Simple feature collection with 100 features and 14 fields #> geometry type: MULTIPOLYGON #> dimension: XY #> bbox: xmin: -84.32385 ymin: 33.88199 xmax: -75.45698 ymax: 36.58965 *#> epsg (SRID): 4267 *#> proj4string: +proj=longlat +datum=NAD27 +no_defs #> First 10 features: ``` ```r nc_gamelands #> Simple feature collection with 94 features and 6 fields #> geometry type: MULTIPOLYGON #> dimension: XY #> bbox: xmin: 127456.7 ymin: 26560.42 xmax: 923523.8 ymax: 318097.4 *#> epsg (SRID): 32119 *#> proj4string: +proj=lcc +lat_1=36.16666666666666 +lat_2=34.33333333333334 +lat_0=33.75 +lon_0=-79 +x_0=609601.22 +y_0=0 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs #> First 10 features: ``` --- ## Check the CRS ```r st_crs(nc) ``` ``` #> Coordinate Reference System: #> EPSG: 4267 #> proj4string: "+proj=longlat +datum=NAD27 +no_defs" ``` ```r st_crs(nc_gamelands) ``` ``` #> Coordinate Reference System: #> EPSG: 32119 #> proj4string: "+proj=lcc +lat_1=36.16666666666666 +lat_2=34.33333333333334 +lat_0=33.75 +lon_0=-79 +x_0=609601.22 +y_0=0 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs" ``` --- ## Coordinate reference systems (CRS) - CRS provide a standardized way of describing locations. - Different CRS arise from various ways data were gathered, the locations, and purposes of the data. - A CRS is comprised of - an ellipsoid, to define the earth's shape; - a datum, to define the origin and orientation of coordinate axes; - a projection, to go from 3D to 2D. - It is important that you transform your spatial data to a common CRS before plotting. --- ## Transform CRS ```r nc_gamelands <- st_transform(nc_gamelands, st_crs(nc)) ``` -- ```r st_crs(nc) ``` ``` #> Coordinate Reference System: #> EPSG: 4267 #> proj4string: "+proj=longlat +datum=NAD27 +no_defs" ``` ```r st_crs(nc_gamelands) ``` ``` #> Coordinate Reference System: #> EPSG: 4267 #> proj4string: "+proj=longlat +datum=NAD27 +no_defs" ``` --- ## Map overlay with `plot()` ```r plot(st_geometry(nc), axes = T, las = 1, main = "NC Public Game Lands", cex.main = 3, cex.lab = 2, cex.axis = 1.5) plot(st_geometry(nc_gamelands), * add = T, col = "#ff6700") legend("bottomleft", legend = "WRC Game Lands", fill = "#ff6700") ``` --- <img src="lec-10_files/figure-html/unnamed-chunk-44-1.png" style="display: block; margin: auto;" /> --- ## Map overlay with `mapview()` ```r nc_mapview <- mapview(nc, alpha.regions = .2, alpha = .9, label = nc[, "NAME", drop = T], layer.name = "NC Counties") ``` ```r nc_gamelands_mapview <- mapview(nc_gamelands, col.regions = "#ff6700", label = round(nc_gamelands[, "SUM_ACRES", drop = T], 2), layer.name = "NC Gamelands") ``` ```r nc_mapview + nc_gamelands_mapview ``` <br/><br/><br/> *These should run in RStudio. There is an issue embedding this overlay in the slides*. --- ## Exercise Create a map that includes NC county boundaries, Game Lands, and hazardous waste sites. Data for the hazardous waste sites is available at https://www.nconemap.gov/datasets/hazardous-waste-sites This data set represents the location of sites within North Carolina that are regulated by the hazardous waste portions of the Resource Conservation and Recovery Act (RCRA). --- class: inverse, center, middle # Manipulating `sf` type objects --- ## Intersects ```r nc <- st_transform(nc, st_crs(32119)) nc_gamelands <- st_transform(nc_gamelands, st_crs(32119)) ``` *Source*: https://spatialreference.org/ref/epsg/32119/ .tiny[ ```r durham_county <- nc %>% filter(NAME == "Durham") nc[durham_county, ] ``` ``` #> Simple feature collection with 6 features and 14 fields #> geometry type: MULTIPOLYGON #> dimension: XY #> bbox: xmin: 559228.2 ymin: 195318.3 xmax: 676964.2 ymax: 310228.5 #> epsg (SRID): 32119 #> proj4string: +proj=lcc +lat_1=36.16666666666666 +lat_2=34.33333333333334 +lat_0=33.75 +lon_0=-79 +x_0=609601.22 +y_0=0 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs #> AREA PERIMETER CNTY_ CNTY_ID NAME FIPS FIPSNO CRESS_ID BIR74 #> 13 0.143 1.663 1840 1840 Granville 37077 37077 39 1671 #> 14 0.109 1.325 1841 1841 Person 37145 37145 73 1556 #> 29 0.104 1.294 1907 1907 Orange 37135 37135 68 3164 #> 30 0.077 1.271 1908 1908 Durham 37063 37063 32 7970 #> 37 0.219 2.130 1938 1938 Wake 37183 37183 92 14484 #> 48 0.180 2.142 1973 1973 Chatham 37037 37037 19 1646 #> SID74 NWBIR74 BIR79 SID79 NWBIR79 geometry #> 13 4 930 2074 4 1058 MULTIPOLYGON (((632202.4 25... #> 14 4 613 1790 4 650 MULTIPOLYGON (((626970 2753... #> 29 4 776 4478 6 1086 MULTIPOLYGON (((607963.3 23... #> 30 16 3732 10432 22 4948 MULTIPOLYGON (((607963.3 23... #> 37 16 4397 20857 31 6221 MULTIPOLYGON (((616754.3 20... #> 48 2 591 2398 3 687 MULTIPOLYGON (((559228.2 19... ``` ] --- .tiny[ ```r st_intersects(nc, durham_county, sparse = F) %>% nc[., ] ``` ``` #> Simple feature collection with 6 features and 14 fields #> geometry type: MULTIPOLYGON #> dimension: XY #> bbox: xmin: 559228.2 ymin: 195318.3 xmax: 676964.2 ymax: 310228.5 #> epsg (SRID): 32119 #> proj4string: +proj=lcc +lat_1=36.16666666666666 +lat_2=34.33333333333334 +lat_0=33.75 +lon_0=-79 +x_0=609601.22 +y_0=0 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs #> AREA PERIMETER CNTY_ CNTY_ID NAME FIPS FIPSNO CRESS_ID BIR74 #> 13 0.143 1.663 1840 1840 Granville 37077 37077 39 1671 #> 14 0.109 1.325 1841 1841 Person 37145 37145 73 1556 #> 29 0.104 1.294 1907 1907 Orange 37135 37135 68 3164 #> 30 0.077 1.271 1908 1908 Durham 37063 37063 32 7970 #> 37 0.219 2.130 1938 1938 Wake 37183 37183 92 14484 #> 48 0.180 2.142 1973 1973 Chatham 37037 37037 19 1646 #> SID74 NWBIR74 BIR79 SID79 NWBIR79 geometry #> 13 4 930 2074 4 1058 MULTIPOLYGON (((632202.4 25... #> 14 4 613 1790 4 650 MULTIPOLYGON (((626970 2753... #> 29 4 776 4478 6 1086 MULTIPOLYGON (((607963.3 23... #> 30 16 3732 10432 22 4948 MULTIPOLYGON (((607963.3 23... #> 37 16 4397 20857 31 6221 MULTIPOLYGON (((616754.3 20... #> 48 2 591 2398 3 687 MULTIPOLYGON (((559228.2 19... ``` ] --- ## Touches ```r st_touches(nc, durham_county, sparse = F) %>% nc[., ] ``` ``` #> Simple feature collection with 5 features and 14 fields #> geometry type: MULTIPOLYGON #> dimension: XY #> bbox: xmin: 559228.2 ymin: 195318.3 xmax: 676964.2 ymax: 310228.5 #> epsg (SRID): 32119 #> proj4string: +proj=lcc +lat_1=36.16666666666666 +lat_2=34.33333333333334 +lat_0=33.75 +lon_0=-79 +x_0=609601.22 +y_0=0 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs #> AREA PERIMETER CNTY_ CNTY_ID NAME FIPS FIPSNO CRESS_ID BIR74 #> 13 0.143 1.663 1840 1840 Granville 37077 37077 39 1671 #> 14 0.109 1.325 1841 1841 Person 37145 37145 73 1556 #> 29 0.104 1.294 1907 1907 Orange 37135 37135 68 3164 #> 37 0.219 2.130 1938 1938 Wake 37183 37183 92 14484 #> 48 0.180 2.142 1973 1973 Chatham 37037 37037 19 1646 #> SID74 NWBIR74 BIR79 SID79 NWBIR79 geometry #> 13 4 930 2074 4 1058 MULTIPOLYGON (((632202.4 25... #> 14 4 613 1790 4 650 MULTIPOLYGON (((626970 2753... #> 29 4 776 4478 6 1086 MULTIPOLYGON (((607963.3 23... #> 37 16 4397 20857 31 6221 MULTIPOLYGON (((616754.3 20... #> 48 2 591 2398 3 687 MULTIPOLYGON (((559228.2 19... ``` --- ## Join .tiny[ ```r durham_area_counties <- st_intersects(nc, durham_county, sparse = F) %>% nc[., ] durham_area_gamelands <- st_join(durham_area_counties, nc_gamelands, join = st_intersects) ``` ] -- .tiny[ ```r print(durham_area_gamelands, n = 3) ``` ``` #> Simple feature collection with 14 features and 20 fields #> geometry type: MULTIPOLYGON #> dimension: XY #> bbox: xmin: 559228.2 ymin: 195318.3 xmax: 676964.2 ymax: 310228.5 #> epsg (SRID): 32119 #> proj4string: +proj=lcc +lat_1=36.16666666666666 +lat_2=34.33333333333334 +lat_0=33.75 +lon_0=-79 +x_0=609601.22 +y_0=0 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs #> First 3 features: #> AREA PERIMETER CNTY_ CNTY_ID NAME FIPS FIPSNO CRESS_ID BIR74 #> 13 0.143 1.663 1840 1840 Granville 37077 37077 39 1671 #> 14 0.109 1.325 1841 1841 Person 37145 37145 73 1556 #> 14.1 0.109 1.325 1841 1841 Person 37145 37145 73 1556 #> SID74 NWBIR74 BIR79 SID79 NWBIR79 OBJECTID GML_HAB #> 13 4 930 2074 4 1058 12 Butner-Falls of Neuse #> 14 4 613 1790 4 650 37 Hyco #> 14.1 4 613 1790 4 650 49 Mayo #> SUM_ACRES GameLandID Shape__Are Shape__Len #> 13 40674.205 13 164602670 421442.14 #> 14 3979.375 38 16103959 62956.92 #> 14.1 5247.622 46 21236373 124167.24 #> geometry #> 13 MULTIPOLYGON (((632202.4 25... #> 14 MULTIPOLYGON (((626970 2753... #> 14.1 MULTIPOLYGON (((626970 2753... ``` ] --- ## Proximity .tiny[ ```r st_is_within_distance(durham_county, nc, dist = 17550, sparse = F) %>% nc[., ] ``` ``` #> Simple feature collection with 7 features and 14 fields #> geometry type: MULTIPOLYGON #> dimension: XY #> bbox: xmin: 559228.2 ymin: 195318.3 xmax: 698975.2 ymax: 310228.5 #> epsg (SRID): 32119 #> proj4string: +proj=lcc +lat_1=36.16666666666666 +lat_2=34.33333333333334 +lat_0=33.75 +lon_0=-79 +x_0=609601.22 +y_0=0 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs #> AREA PERIMETER CNTY_ CNTY_ID NAME FIPS FIPSNO CRESS_ID BIR74 #> 13 0.143 1.663 1840 1840 Granville 37077 37077 39 1671 #> 14 0.109 1.325 1841 1841 Person 37145 37145 73 1556 #> 24 0.128 1.554 1897 1897 Franklin 37069 37069 35 1399 #> 29 0.104 1.294 1907 1907 Orange 37135 37135 68 3164 #> 30 0.077 1.271 1908 1908 Durham 37063 37063 32 7970 #> 37 0.219 2.130 1938 1938 Wake 37183 37183 92 14484 #> 48 0.180 2.142 1973 1973 Chatham 37037 37037 19 1646 #> SID74 NWBIR74 BIR79 SID79 NWBIR79 geometry #> 13 4 930 2074 4 1058 MULTIPOLYGON (((632202.4 25... #> 14 4 613 1790 4 650 MULTIPOLYGON (((626970 2753... #> 24 2 736 1863 0 950 MULTIPOLYGON (((676964.2 22... #> 29 4 776 4478 6 1086 MULTIPOLYGON (((607963.3 23... #> 30 16 3732 10432 22 4948 MULTIPOLYGON (((607963.3 23... #> 37 16 4397 20857 31 6221 MULTIPOLYGON (((616754.3 20... #> 48 2 591 2398 3 687 MULTIPOLYGON (((559228.2 19... ``` ] --- .tiny[ ```r st_overlaps(nc, nc_gamelands[nc_gamelands$GML_HAB == "Jordan", ], sparse = F) %>% nc[., ] ``` ``` #> Simple feature collection with 4 features and 14 fields #> geometry type: MULTIPOLYGON #> dimension: XY #> bbox: xmin: 559228.2 ymin: 195318.3 xmax: 676964.2 ymax: 275782.8 #> epsg (SRID): 32119 #> proj4string: +proj=lcc +lat_1=36.16666666666666 +lat_2=34.33333333333334 +lat_0=33.75 +lon_0=-79 +x_0=609601.22 +y_0=0 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs #> AREA PERIMETER CNTY_ CNTY_ID NAME FIPS FIPSNO CRESS_ID BIR74 SID74 #> 29 0.104 1.294 1907 1907 Orange 37135 37135 68 3164 4 #> 30 0.077 1.271 1908 1908 Durham 37063 37063 32 7970 16 #> 37 0.219 2.130 1938 1938 Wake 37183 37183 92 14484 16 #> 48 0.180 2.142 1973 1973 Chatham 37037 37037 19 1646 2 #> NWBIR74 BIR79 SID79 NWBIR79 geometry #> 29 776 4478 6 1086 MULTIPOLYGON (((607963.3 23... #> 30 3732 10432 22 4948 MULTIPOLYGON (((607963.3 23... #> 37 4397 20857 31 6221 MULTIPOLYGON (((616754.3 20... #> 48 591 2398 3 687 MULTIPOLYGON (((559228.2 19... ``` ] <br/><br/> For more geometry predicates see [Simple Features Cheatsheet](https://github.com/rstudio/cheatsheets/raw/master/sf.pdf) --- ## Exercise Create a plot of North Carolina's Game Lands and all the waste sites within 100 meters of a Game Land area. Try two of the three plot functions. ??? ## Solution .solution[ ```r waste <- st_read("data/Hazardous_Waste_Sites.shp", quiet = T) waste <- st_transform(waste, st_crs(nc_gamelands)) ``` ```r close_waste_lgl <- st_is_within_distance(waste, nc_gamelands, dist = 100, sparse = F) close_waste <- waste %>% filter(apply(close_waste_lgl, 1, sum) > 0) ``` ```r nc_gamelands_mapview <- mapview(nc_gamelands, col.regions = "#ff6700", label = nc_gamelands$GML_HAB, layer.name = "NC Gamelands") nc_waste_mapview <- mapview(close_waste, col.regions = "#65ff00", alpha = .3, alpha.regions = .3, label = waste[, "SITE_NAME", drop = T], layer.name = "NC Waste Sites") nc_gamelands_mapview + nc_waste_mapview ``` ```r ggplot(nc) + geom_sf(alpha = .3) + geom_sf(data = close_waste, color = "#65ff00", size = 3) + geom_sf(data = nc_gamelands, fill = "#ff6700", alpha = .5) + theme_bw() ``` ] --- ## References - Simple Features for R vignette, https://r-spatial.github.io/sf/ - `mapview` vignette, https://r-spatial.github.io/mapview/index.html - Coordinate Reference Systems in R https://www.nceas.ucsb.edu/~frazier/RSpatialGuides/OverviewCoordinateReferenceSystems.pdf by Melanie Frazier