Geospatial stuff is hard

Projections

Dateline

Want to fly from the Western most point in the US to the Eastern most point?

gcd = function(x,y ) { #long1, lat1, long2, lat2) {
    R = 6378137 # Earth mean radius [m]
    x = 2*pi*x/360; y = 2*pi*y/360;
    acos(sin(x[2])*sin(y[2]) + cos(x[2])*cos(y[2]) * cos(y[1]-y[2])) * R
}

ak1 = c(179.776,51.952)
ak2 = c(-179.146,51.273)

gcd(ak1, ak2)/1000
## [1] 7610.476
geosphere::distGeo(ak1, ak2)/1000
## [1] 106.2197

Even plotting is hard

## Warning in title(...): "graticles" is not a graphical parameter

Relationships

Distance for Simple Features

How do we define the distance between A and B, A and C, or B and C?

Geospatial Data and R

Analysis of geospatial data in R

R has a rich package ecosystem for read/writing, manipulating, and analyzing geospatial data.


Some core packages:

  • sp - core classes for handling spatial data, additional utility functions.

  • rgdal - R interface to gdal (Geospatial Data Abstraction Library) for reading and writing spatial data.

  • rgeos - R interface to geos (Geometry Engine Open Source) library for querying and manipulating spatial data. Reading and writing WKT.

  • raster - classes and tools for handling spatial raster data.


See more - Spatial task view

Analysis of geospatial data in R

R has a rich package ecosystem for read/writing, manipulating, and analyzing geospatial data.


Some core packages:

  • sp - core classes for handling spatial data, additional utility functions.

  • rgdal - R interface to gdal (Geospatial Data Abstraction Library) for reading and writing spatial data.

  • rgeos - R interface to geos (Geometry Engine Open Source) library for querying and manipulating spatial data. Reading and writing WKT.

  • sf - Combines the functionality of sp, rgdal, and rgeos into a single package based on tidy principles.

  • raster - classes and tools for handling spatial raster data.


See more - Spatial task view

Before Installing sf

The sf package is under active development and is evolving rapidly. The version on CRAN should be reasonably up to date, but the most current version is always available from github.

Difficulty comes from sf's dependency on external (system) libraries (geos, gdal, and proj4).

sf should already be installed and available on saxon, for your local machine:

  • Windows - installing from source works when Rtools is installed (system requirements are downloaded from rwinlib)

  • MacOS - install dependencies via homebrew:

brew tap osgeo/osgeo4mac && brew tap --repair
brew install proj geos gdal2 udunits 
brew link gdal2 --force
  • Linux - Install development packages for GDAL (>= 2.0.0), GEOS (>= 3.3.0) and Proj4 (>= 4.8.0) from your package manager of choice.

Simple Features - Single Geometries

pt = st_point(c(30, 10))
ls = st_linestring(matrix(c(30, 10, 10, 30, 40, 40), byrow=TRUE, ncol=2))
poly = st_polygon(list(matrix(c(30, 10, 40, 40, 20, 40, 10, 20, 30, 10), ncol=2, byrow=TRUE)))
poly_hole = st_polygon(
              list(
                matrix(c(35, 10, 45, 45, 15, 40, 10, 20, 35, 10), ncol=2, byrow=TRUE),
                matrix(c(20, 30, 35, 35, 30, 20, 20, 30), ncol=2, byrow=TRUE)
              )
            )

Simple Features - Multi Geometries

pts = st_multipoint(matrix(c(10, 40, 40, 30, 20, 20, 30, 10), ncol=2, byrow=TRUE))
lss = st_multilinestring(list( 
        matrix(c(10, 10, 20, 20, 10, 40), ncol=2, byrow=TRUE),
        matrix(c(40, 40, 30, 30, 40, 20, 30, 10), ncol=2, byrow=TRUE)
      ))
polys = st_multipolygon(list(
          list(matrix(c(30, 20, 45, 40, 10, 40, 30, 20), ncol=2, byrow=TRUE)),
          list(matrix(c(15, 5, 40, 10, 10, 20, 5, 10, 15, 5), ncol=2, byrow=TRUE))
        ))
polys_hole = st_multipolygon(list(
                list(matrix(c(40, 40, 20, 45, 45, 30, 40, 40), ncol=2, byrow=TRUE)),
                list(matrix(c(20, 35, 10, 30, 10, 10, 30, 5, 45, 20, 20, 35), ncol=2, byrow=TRUE),
                     matrix(c(30, 20, 20, 15, 20, 25, 30, 20), ncol=2, byrow=TRUE))
              ))

Reading and writing geospatial data

  • maptools
    • readShapePoints / writeShapePoints - Shapefile w/ points
    • readShapeLines / writeShapeLines - Shapefile w/ lines
    • readShapePoly / writeShapePoly - Shapefile w/ polygons
    • readShapeSpatial / writeShapeSpatial - Shapefile
  • rgdal
    • readOGR / writeOGR - Shapefile, GeoJSON, KML, …
  • rgeos
    • readWKT / writeWKT - Well Known Text
  • sf
    • st_read / st_write - Shapefile, GeoJSON, KML, …

Reading and writing geospatial data

  • maptools
    • readShapePoints / writeShapePoints - Shapefile w/ points
    • readShapeLines / writeShapeLines - Shapefile w/ lines
    • readShapePoly / writeShapePoly - Shapefile w/ polygons
    • readShapeSpatial / writeShapeSpatial - Shapefile
  • rgdal
    • readOGR / writeOGR - Shapefile, GeoJSON, KML, …
  • rgeos
    • readWKT / writeWKT - Well Known Text
  • sf
    • st_read / st_write - Shapefile, GeoJSON, KML, …

Working with Geospatial Data

Example data

nc  = st_read("data/gis/nc_counties/", quiet=TRUE, stringsAsFactors=FALSE) %>% tbl_df() %>% st_sf()
air = st_read("data/gis/airports/", quiet=TRUE, stringsAsFactors=FALSE) %>% tbl_df() %>% st_sf()
hwy = st_read("data/gis/us_interstates/", quiet=TRUE, stringsAsFactors=FALSE) %>% tbl_df() %>% st_sf()

nc
## Simple feature collection with 100 features and 8 fields
## geometry type:  MULTIPOLYGON
## dimension:      XY
## bbox:           xmin: -84.32186 ymin: 33.84175 xmax: -75.46003 ymax: 36.58815
## epsg (SRID):    4269
## proj4string:    +proj=longlat +datum=NAD83 +no_defs
## # A tibble: 100 x 9
##          AREA PERIMETER COUNTYP010 STATE             COUNTY  FIPS STATE_FIPS SQUARE_MIL
##         <dbl>     <dbl>      <dbl> <chr>              <chr> <chr>      <chr>      <dbl>
##  1 0.11175964  1.610396       1994    NC        Ashe County 37009         37    429.350
##  2 0.06159483  1.354829       1996    NC   Alleghany County 37005         37    236.459
##  3 0.14023009  1.769388       1998    NC       Surry County 37171         37    538.863
##  4 0.08912401  1.425249       1999    NC       Gates County 37073         37    342.340
##  5 0.06865730  4.428217       2000    NC   Currituck County 37053         37    263.871
##  6 0.11859434  1.404309       2001    NC      Stokes County 37169         37    455.793
##  7 0.06259671  2.106357       2002    NC      Camden County 37029         37    240.615
##  8 0.11542955  1.462524       2003    NC      Warren County 37185         37    443.659
##  9 0.14328609  2.397293       2004    NC Northampton County 37131         37    550.581
## 10 0.09245561  1.810778       2005    NC    Hertford County 37091         37    355.525
## # ... with 90 more rows, and 1 more variables: geometry <simple_feature>

air
## Simple feature collection with 940 features and 16 fields
## geometry type:  POINT
## dimension:      XY
## bbox:           xmin: -176.646 ymin: 17.70156 xmax: -64.80172 ymax: 71.28545
## epsg (SRID):    4269
## proj4string:    +proj=longlat +datum=NAD83 +no_defs
## # A tibble: 940 x 17
##    AIRPRTX010 FEATURE  ICAO  IATA                            AIRPT_NAME                CITY STATE
##         <dbl>   <chr> <chr> <chr>                                 <chr>               <chr> <chr>
##  1          0 AIRPORT  KGON   GON             GROTON-NEW LONDON AIRPORT GROTON (NEW LONDON)    CT
##  2          3 AIRPORT  K6S5   6S5                RAVALLI COUNTY AIRPORT            HAMILTON    MT
##  3          4 AIRPORT  KMHV   MHV                        MOJAVE AIRPORT              MOJAVE    CA
##  4          6 AIRPORT  KSEE   SEE               GILLESPIE FIELD AIRPORT  SAN DIEGO/EL CAJON    CA
##  5          7 AIRPORT  KFPR   FPR ST LUCIE COUNTY INTERNATIONAL AIRPORT         FORT PIERCE    FL
##  6          8 AIRPORT  KRYY   RYY           COBB COUNTY-MC COLLUM FIELD             ATLANTA    GA
##  7         10 AIRPORT  KMKL   MKL      MC KELLAR-SIPES REGIONAL AIRPORT             JACKSON    TN
##  8         11 AIRPORT  KCCR   CCR                BUCHANAN FIELD AIRPORT             CONCORD    CA
##  9         13 AIRPORT  KJYO   JYO            LEESBURG EXECUTIVE AIRPORT            LEESBURG    VA
## 10         15 AIRPORT  KCAD   CAD                WEXFORD COUNTY AIRPORT            CADILLAC    MI
## # ... with 930 more rows, and 10 more variables: STATE_FIPS <chr>, COUNTY <chr>, FIPS <chr>,
## #   TOT_ENP <dbl>, LATITUDE <dbl>, LONGITUDE <dbl>, ELEV <dbl>, ACT_DATE <chr>, CNTL_TWR <chr>,
## #   geometry <simple_feature>

hwy
## Simple feature collection with 233 features and 3 fields
## geometry type:  MULTILINESTRING
## dimension:      XY
## bbox:           xmin: -7472582 ymin: 2911107 xmax: 2443707 ymax: 8208428
## epsg (SRID):    26915
## proj4string:    +proj=utm +zone=15 +datum=NAD83 +units=m +no_defs
## # A tibble: 233 x 4
##    ROUTE_NUM DIST_MILES DIST_KM          geometry
##        <chr>      <dbl>   <dbl>  <simple_feature>
##  1       I10    2449.12 3941.48 <MULTILINESTR...>
##  2      I105      20.75   33.39 <MULTILINESTR...>
##  3      I110      41.42   66.65 <MULTILINESTR...>
##  4      I115       1.58    2.55 <MULTILINESTR...>
##  5       I12      85.32  137.31 <MULTILINESTR...>
##  6      I124       1.73    2.79 <MULTILINESTR...>
##  7      I126       3.56    5.72 <MULTILINESTR...>
##  8      I129       3.10    4.99 <MULTILINESTR...>
##  9      I135      96.29  154.96 <MULTILINESTR...>
## 10       I15    1435.98 2311.00 <MULTILINESTR...>
## # ... with 223 more rows

sf classes

str(nc, max.level=1)
## Classes 'sf', 'tbl_df', 'tbl' and 'data.frame':  100 obs. of  9 variables:
##  $ AREA      : num  0.1118 0.0616 0.1402 0.0891 0.0687 ...
##  $ PERIMETER : num  1.61 1.35 1.77 1.43 4.43 ...
##  $ COUNTYP010: num  1994 1996 1998 1999 2000 ...
##  $ STATE     : chr  "NC" "NC" "NC" "NC" ...
##  $ COUNTY    : chr  "Ashe County" "Alleghany County" "Surry County" "Gates County" ...
##  $ FIPS      : chr  "37009" "37005" "37171" "37073" ...
##  $ STATE_FIPS: chr  "37" "37" "37" "37" ...
##  $ SQUARE_MIL: num  429 236 539 342 264 ...
##  $ geometry  :sfc_MULTIPOLYGON of length 100; first list element: List of 1
##   ..- attr(*, "class")= chr  "XY" "MULTIPOLYGON" "sfg"
##  - attr(*, "sf_column")= chr "geometry"
##  - attr(*, "agr")= Factor w/ 3 levels "constant","aggregate",..: NA NA NA NA NA NA NA NA
##   ..- attr(*, "names")= chr  "AREA" "PERIMETER" "COUNTYP010" "STATE" ...
class(nc)
## [1] "sf"         "tbl_df"     "tbl"        "data.frame"

class(nc$geometry)
## [1] "sfc_MULTIPOLYGON" "sfc"
class(nc$geometry[[1]])
## [1] "XY"           "MULTIPOLYGON" "sfg"

Projections

st_crs(nc)
## Coordinate Reference System:
##   EPSG: 4269 
##   proj4string: "+proj=longlat +datum=NAD83 +no_defs"
st_crs(air)
## Coordinate Reference System:
##   EPSG: 4269 
##   proj4string: "+proj=longlat +datum=NAD83 +no_defs"
st_crs(hwy)
## Coordinate Reference System:
##   EPSG: 26915 
##   proj4string: "+proj=utm +zone=15 +datum=NAD83 +units=m +no_defs"

UTM Zones

Lat/Long

nc_ll = nc
air_ll = air
hwy_ll = st_transform(hwy, st_crs(nc))

UTM (Zone 15)

utm17 = "+proj=utm +zone=17 +datum=NAD83 +units=m +no_defs"
nc_utm = st_transform(nc, st_crs(utm17))
air_utm = st_transform(air, st_crs(utm17))
hwy_utm = st_transform(hwy, st_crs(utm17))

Subsetting vs. dplyr

nc[nc$COUNTY %in% c("Durham County","Wake County","Orange County"),]
## Simple feature collection with 3 features and 8 fields
## geometry type:  MULTIPOLYGON
## dimension:      XY
## bbox:           xmin: -79.26453 ymin: 35.51905 xmax: -78.25503 ymax: 36.24385
## epsg (SRID):    4269
## proj4string:    +proj=longlat +datum=NAD83 +no_defs
## # A tibble: 3 x 9
##         AREA PERIMETER COUNTYP010 STATE        COUNTY  FIPS STATE_FIPS SQUARE_MIL          geometry
##        <dbl>     <dbl>      <dbl> <chr>         <chr> <chr>      <chr>      <dbl>  <simple_feature>
## 1 0.10401267  1.297813       2074    NC Orange County 37135         37    401.465 <MULTIPOLYGON...>
## 2 0.07714111  1.287529       2075    NC Durham County 37063         37    297.841 <MULTIPOLYGON...>
## 3 0.22144442  2.140667       2106    NC   Wake County 37183         37    857.610 <MULTIPOLYGON...>
filter(nc, COUNTY %in% c("Durham County","Wake County","Orange County"))
## Simple feature collection with 3 features and 8 fields
## geometry type:  MULTIPOLYGON
## dimension:      XY
## bbox:           xmin: -79.26453 ymin: 35.51905 xmax: -78.25503 ymax: 36.24385
## epsg (SRID):    4269
## proj4string:    +proj=longlat +datum=NAD83 +no_defs
## # A tibble: 3 x 9
##         AREA PERIMETER COUNTYP010 STATE        COUNTY  FIPS STATE_FIPS SQUARE_MIL          geometry
##        <dbl>     <dbl>      <dbl> <chr>         <chr> <chr>      <chr>      <dbl>  <simple_feature>
## 1 0.10401267  1.297813       2074    NC Orange County 37135         37    401.465 <MULTIPOLYGON...>
## 2 0.07714111  1.287529       2075    NC Durham County 37063         37    297.841 <MULTIPOLYGON...>
## 3 0.22144442  2.140667       2106    NC   Wake County 37183         37    857.610 <MULTIPOLYGON...>

Distance between NC counties

counties = c("Durham County","Wake County","Orange County")
sub = nc$COUNTY %in% counties

st_distance(nc_ll[sub, ])
## Error in st_distance(nc_ll[sub, ]): st_distance for longitude/latitude data only available for POINT geometries
st_distance(nc_utm[sub, ])
## Units: m
##          [,1] [,2]     [,3]
## [1,]    0.000    0 9716.247
## [2,]    0.000    0    0.000
## [3,] 9716.247    0    0.000

Distance between NC counties (centroids)

nc_ll[sub, ] %>% st_centroid() %>% st_distance()
## Warning in st_centroid.sfc(st_geometry(x), of_largest_polygon = of_largest_polygon): st_centroid
## does not give correct centroids for longitude/latitude data
## Units: m
##          [,1]     [,2]     [,3]
## [1,]     0.00 22185.58 52031.22
## [2,] 22185.58     0.00 34076.78
## [3,] 52031.22 34076.78     0.00
nc_utm[sub, ] %>% st_centroid() %>% st_distance()
## Units: m
##          [,1]     [,2]     [,3]
## [1,]     0.00 22186.86 52019.67
## [2,] 22186.86     0.00 34067.94
## [3,] 52019.67 34067.94     0.00

nc_ll[sub, ] %>% st_centroid() %>% st_transform(st_crs(utm17)) %>% st_coordinates()
## Warning in st_centroid.sfc(st_geometry(x), of_largest_polygon = of_largest_polygon): st_centroid
## does not give correct centroids for longitude/latitude data
##          X       Y
## 1 669269.4 3992382
## 2 691329.8 3990029
## 3 712353.8 3963206
nc_utm[sub, ] %>% st_centroid() %>% st_coordinates()
##          X       Y
## 1 669267.9 3992369
## 2 691329.7 3990017
## 3 712338.6 3963198

Distance to the closest airport from each county?

d = st_distance(air_utm, nc_utm[sub,]) 
d[1:5,]
## Units: m
##           [,1]      [,2]      [,3]
## [1,]  825897.2  816837.2  814838.6
## [2,] 3171650.0 3195066.0 3220428.4
## [3,] 3659805.7 3686988.4 3695929.7
## [4,] 3617835.4 3643452.8 3648382.3
## [5,]  934947.9  936257.0  903174.3
nearest_airport = apply(d, 2, which.min) 
air %>% slice(nearest_airport) %>% .$AIRPT_NAME
## [1] "RALEIGH-DURHAM INTERNATIONAL AIRPORT" "RALEIGH-DURHAM INTERNATIONAL AIRPORT"
## [3] "RALEIGH-DURHAM INTERNATIONAL AIRPORT"

Predicate functions

DE-9IM

Sparse vs Full Results

st_intersects(nc_utm[20:30,], air_utm) %>% str()
## List of 11
##  $ : int(0) 
##  $ : int(0) 
##  $ : int(0) 
##  $ : int(0) 
##  $ : int(0) 
##  $ : int 268
##  $ : int 717
##  $ : int(0) 
##  $ : int(0) 
##  $ : int(0) 
##  $ : int(0) 
##  - attr(*, "predicate")= chr "intersects"
##  - attr(*, "region.id")= chr [1:11] "1" "2" "3" "4" ...
##  - attr(*, "ncol")= int 940
##  - attr(*, "class")= chr "sgbp"

st_intersects(nc_utm, air_utm, sparse=FALSE) %>% str()
##  logi [1:100, 1:940] FALSE FALSE FALSE FALSE FALSE FALSE ...
st_intersects(nc_utm, air_utm, sparse=FALSE) %>% .[20:30, 260:270]
##        [,1]  [,2]  [,3]  [,4]  [,5]  [,6]  [,7]  [,8]  [,9] [,10] [,11]
##  [1,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
##  [2,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
##  [3,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
##  [4,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
##  [5,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
##  [6,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE
##  [7,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
##  [8,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
##  [9,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [10,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [11,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE

Which counties have airports?

nc_air = st_intersects(nc_utm, air_utm)
has_air = map_lgl(nc_air, ~ length(.) > 0)
nc %>% slice(which(has_air)) %>% .$COUNTY
##  [1] "Forsyth County"     "Guilford County"    "Dare County"        "Wake County"       
##  [5] "Pitt County"        "Catawba County"     "Buncombe County"    "Wayne County"      
##  [9] "Mecklenburg County" "Moore County"       "Cabarrus County"    "Lenoir County"     
## [13] "Craven County"      "Cumberland County"  "Onslow County"      "New Hanover County"
air_in_nc = nc_air %>% unlist() %>% unique()
air %>% slice(air_in_nc) %>% .$AIRPT_NAME
##  [1] "SMITH REYNOLDS AIRPORT"                                  
##  [2] "PIEDMONT TRIAD INTERNATIONAL AIRPORT"                    
##  [3] "DARE COUNTY REGIONAL AIRPORT"                            
##  [4] "RALEIGH-DURHAM INTERNATIONAL AIRPORT"                    
##  [5] "PITT-GREENVILLE AIRPORT"                                 
##  [6] "HICKORY REGIONAL AIRPORT"                                
##  [7] "ASHEVILLE REGIONAL AIRPORT"                              
##  [8] "SEYMOUR JOHNSON AIR FORCE BASE"                          
##  [9] "CHARLOTTE/DOUGLAS INTERNATIONAL AIRPORT"                 
## [10] "MOORE COUNTY AIRPORT"                                    
## [11] "CONCORD REGIONAL AIRPORT"                                
## [12] "KINSTON REGIONAL JETPORT AT STALLINGS FIELD"             
## [13] "CHERRY POINT MARINE CORPS AIR STATION /CUNNINGHAM FIELD/"
## [14] "COASTAL CAROLINA REGIONAL AIRPORT"                       
## [15] "POPE AIR FORCE BASE"                                     
## [16] "FAYETTEVILLE REGIONAL/GRANNIS FIELD"                     
## [17] "ALBERT J ELLIS AIRPORT"                                  
## [18] "WILMINGTON INTERNATIONAL AIRPORT"

plot(st_geometry(nc))
plot(st_geometry(nc[has_air,]), add=TRUE, col="lightblue")
plot(st_geometry(air[air_in_nc,]), add=TRUE, pch=16, col="blue")

A Quick Gerrymandering Example

(Fake) Data from Annearundel County, Maryland

(ac = st_read("data/gis/Annearundel/AnneArundelN.shp", quiet=TRUE) %>% tbl_df() %>% st_sf())
## Simple feature collection with 56 features and 7 fields
## geometry type:  MULTIPOLYGON
## dimension:      XY
## bbox:           xmin: 413828 ymin: 146220 xmax: 441950.6 ymax: 174397.2
## epsg (SRID):    NA
## proj4string:    +proj=lcc +lat_1=38.3 +lat_2=39.45 +lat_0=37.66666666666666 +lon_0=-77 +x_0=400000 +y_0=0 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs
## # A tibble: 56 x 8
##    E2014_R E2014_D E2016_R E2016_D Population      id DISTRICT          geometry
##      <int>   <int>   <int>   <int>      <int>   <dbl>    <int>  <simple_feature>
##  1      48      52      64      36       6122 3702100        3 <MULTIPOLYGON...>
##  2      65      35      57      43       9154 3702201        2 <MULTIPOLYGON...>
##  3      44      56      58      42      15458 3702202        3 <MULTIPOLYGON...>
##  4      55      45      65      35       2424 3702203        3 <MULTIPOLYGON...>
##  5      48      52      44      56        430 3702800        3 <MULTIPOLYGON...>
##  6      42      58      49      51       4918 3730203        5 <MULTIPOLYGON...>
##  7      62      38      52      48       5486 3730204        5 <MULTIPOLYGON...>
##  8      48      52      46      54       3093 3730300        5 <MULTIPOLYGON...>
##  9      40      60      39      61       4137 3730401        6 <MULTIPOLYGON...>
## 10      35      65      56      44       5843 3730402        6 <MULTIPOLYGON...>
## # ... with 46 more rows

plot(select(ac,DISTRICT))
## Warning in classInt::classIntervals(values, min(nbreaks, n.unq), breaks): n same as number of
## different finite values\neach different finite value is a separate class

Adjacency matrix of precincts

adj = st_touches(ac, sparse=FALSE)

corrplot::corrplot(adj,method="color",type="full",tl.col="black",cl.pos = "n",)

Creating districts

ac_dists = ac %>% group_by(DISTRICT) %>% summarize(geometry = st_union(geometry))
plot(ac_dists)

Polsby-Popper

This is a measure of district compactness

\[ PP(d) = \frac{4\pi A(d)}{P(d)^2} \]

where \(A(d)\) and \(P(d)\) are the area and perimeter of the district respectively.

4 * pi * st_area(ac_dists) / st_length(ac_dists)^2
## Units: 1
## [1] 0.1578816 0.1465205 0.2948130 0.1177227 0.1422566 0.3782004 0.1953110 0.2524293