R has a rich package ecosystem for read/writing, manipulating, and analyzing geospatial data.
Some core packages:
sp
- core classes for handling spatial data, additional utility functions.
rgdal
- R interface to gdal
(Geospatial Data Abstraction Library) for reading and writing spatial data.
rgeos
- R interface to geos
(Geometry Engine Open Source) library for querying and manipulating spatial data. Reading and writing WKT.
raster
- classes and tools for handling spatial raster data.
See more - Spatial task view
R has a rich package ecosystem for read/writing, manipulating, and analyzing geospatial data.
Some core packages:
sp
- core classes for handling spatial data, additional utility functions.
rgdal
- R interface to gdal
(Geospatial Data Abstraction Library) for reading and writing spatial data.
rgeos
- R interface to geos
(Geometry Engine Open Source) library for querying and manipulating spatial data. Reading and writing WKT.
sf
- Combines the functionality of sp
, rgdal
, and rgeos
into a single package based on tidy priciples.
raster
- classes and tools for handling spatial raster data.
See more - Spatial task view
sf
The sf
package is currently under active development and is evolving rapidly. For the time being I suggest installing the latest version from github instead of using CRAN.
library(devtools)
install_github('edzer/sfr')
library(sf)
sf
The sf
package is currently under active development and is evolving rapidly. The version on CRAN should be reasonably up to date, but the most current version is always available from github.
Difficulty comes from requirements for external libraries (geos
, gdal
, and proj4
).
Windows - installing from source works when Rtools is installed (system requirements are downloaded from rwinlib)
MacOS - install dependencies via homebrew:
brew tap osgeo/osgeo4mac && brew tap --repair
brew install proj
brew install geos
brew install udunits
brew install gdal2
brew link gdal2 --force
Point, Multipoint, Multilinestring, Polygon
sp
maptools
readShapePoints
/ writeShapePoints
- Shapefile w/ pointsreadShapeLines
/ writeShapeLines
- Shapefile w/ linesreadShapePoly
/ writeShapePoly
- Shapefile w/ polygonsreadShapeSpatial
/ writeShapeSpatial
- Shapefilergdal
readOGR
/ writeOGR
- Shapefile, GeoJSON, KML, …rgeos
readWKT
/ writeWKT
- Well Known Textsf
st_read
/ st_write
- Shapefile, GeoJSON, KML, …sf
maptools
readShapePoints
/ writeShapePoints
- Shapefile w/ pointsreadShapeLines
/ writeShapeLines
- Shapefile w/ linesreadShapePoly
/ writeShapePoly
- Shapefile w/ polygonsreadShapeSpatial
/ writeShapeSpatial
- Shapefilergdal
readOGR
/ writeOGR
- Shapefile, GeoJSON, KML, …rgeos
readWKT
/ writeWKT
- Well Known Textsf
st_read
/ st_write
- Shapefile, GeoJSON, KML, …Want to fly from the Western most point in the US to the Eastern most point?
How do we define the distance between A and B, A and C, or B and C?
sf
nc = st_read("data/gis/nc_counties/", quiet=TRUE, stringsAsFactors=FALSE)
air = st_read("data/gis/airports/", quiet=TRUE, stringsAsFactors=FALSE)
hwy = st_read("data/gis/us_interstates/", quiet=TRUE, stringsAsFactors=FALSE)
head(nc)
## Simple feature collection with 6 features and 8 fields
## geometry type: MULTIPOLYGON
## dimension: XY
## bbox: xmin: -81.74178 ymin: 36.07215 xmax: -75.77323 ymax: 36.58815
## epsg (SRID): 4269
## proj4string: +proj=longlat +datum=NAD83 +no_defs
## AREA PERIMETER COUNTYP010 STATE COUNTY FIPS STATE_FIPS SQUARE_MIL
## 1 0.11175964 1.610396 1994 NC Ashe County 37009 37 429.350
## 2 0.06159483 1.354829 1996 NC Alleghany County 37005 37 236.459
## 3 0.14023009 1.769388 1998 NC Surry County 37171 37 538.863
## 4 0.08912401 1.425249 1999 NC Gates County 37073 37 342.340
## 5 0.06865730 4.428217 2000 NC Currituck County 37053 37 263.871
## 6 0.11859434 1.404309 2001 NC Stokes County 37169 37 455.793
## geometry
## 1 MULTIPOLYGON(((-81.65648656...
## 2 MULTIPOLYGON(((-81.30999399...
## 3 MULTIPOLYGON(((-80.71416384...
## 4 MULTIPOLYGON(((-76.91183250...
## 5 MULTIPOLYGON(((-75.82777792...
## 6 MULTIPOLYGON(((-80.43314893...
head(air)
## Simple feature collection with 6 features and 16 fields
## geometry type: POINT
## dimension: XY
## bbox: xmin: -118.1506 ymin: 27.49748 xmax: -72.04514 ymax: 46.25149
## epsg (SRID): 4269
## proj4string: +proj=longlat +datum=NAD83 +no_defs
## AIRPRTX010 FEATURE ICAO IATA AIRPT_NAME CITY STATE
## 1 0 AIRPORT KGON GON GROTON-NEW LONDON AIRPORT GROTON (NEW LONDON) CT
## 2 3 AIRPORT K6S5 6S5 RAVALLI COUNTY AIRPORT HAMILTON MT
## 3 4 AIRPORT KMHV MHV MOJAVE AIRPORT MOJAVE CA
## 4 6 AIRPORT KSEE SEE GILLESPIE FIELD AIRPORT SAN DIEGO/EL CAJON CA
## 5 7 AIRPORT KFPR FPR ST LUCIE COUNTY INTERNATIONAL AIRPORT FORT PIERCE FL
## 6 8 AIRPORT KRYY RYY COBB COUNTY-MC COLLUM FIELD ATLANTA GA
## STATE_FIPS COUNTY FIPS TOT_ENP LATITUDE LONGITUDE ELEV ACT_DATE CNTL_TWR
## 1 09 NEW LONDON 09011 75 41.33006 -72.04514 9 04/1940 Y
## 2 30 RAVALLI 30081 112 46.25149 -114.12554 3642 04/1940 N
## 3 06 KERN 06029 135 35.05864 -118.15056 2801 04/1940 Y
## 4 06 SAN DIEGO 06073 30 32.82622 -116.97244 388 12/1942 Y
## 5 12 ST LUCIE 12111 33 27.49748 -80.37263 24 <NA> Y
## 6 13 COBB 13067 110 34.01316 -84.59706 1041 12/1942 Y
## geometry
## 1 POINT(-72.045139 41.330056)
## 2 POINT(-114.12554 46.251494)
## 3 POINT(-118.150556 35.058639)
## 4 POINT(-116.972444 32.826222)
## 5 POINT(-80.372632 27.497479)
## 6 POINT(-84.597056 34.013157)
head(hwy)
## Simple feature collection with 6 features and 3 fields
## geometry type: MULTILINESTRING
## dimension: XY
## bbox: xmin: -1910156 ymin: 3264168 xmax: 1591535 ymax: 5340953
## epsg (SRID): 26915
## proj4string: +proj=utm +zone=15 +datum=NAD83 +units=m +no_defs
## ROUTE_NUM DIST_MILES DIST_KM geometry
## 1 I10 2449.12 3941.48 MULTILINESTRING((-1881199.8...
## 2 I105 20.75 33.39 MULTILINESTRING((-1910155.9...
## 3 I110 41.42 66.65 MULTILINESTRING((1054138.60...
## 4 I115 1.58 2.55 MULTILINESTRING((-1013795.8...
## 5 I12 85.32 137.31 MULTILINESTRING((680741.744...
## 6 I124 1.73 2.79 MULTILINESTRING((1201467.26...
sf
classesstr(nc)
## Classes 'sf' and 'data.frame': 100 obs. of 9 variables:
## $ AREA : num 0.1118 0.0616 0.1402 0.0891 0.0687 ...
## $ PERIMETER : num 1.61 1.35 1.77 1.43 4.43 ...
## $ COUNTYP010: num 1994 1996 1998 1999 2000 ...
## $ STATE : chr "NC" "NC" "NC" "NC" ...
## $ COUNTY : chr "Ashe County" "Alleghany County" "Surry County" "Gates County" ...
## $ FIPS : chr "37009" "37005" "37171" "37073" ...
## $ STATE_FIPS: chr "37" "37" "37" "37" ...
## $ SQUARE_MIL: num 429 236 539 342 264 ...
## $ geometry :sfc_MULTIPOLYGON of length 100; first list element: List of 1
## ..$ :List of 1
## .. ..$ : num [1:1030, 1:2] -81.7 -81.7 -81.7 -81.6 -81.6 ...
## ..- attr(*, "class")= chr "XY" "MULTIPOLYGON" "sfg"
## - attr(*, "sf_column")= chr "geometry"
## - attr(*, "agr")= Factor w/ 3 levels "constant","aggregate",..: NA NA NA NA NA NA NA NA
## ..- attr(*, "names")= chr "AREA" "PERIMETER" "COUNTYP010" "STATE" ...
class(nc)
## [1] "sf" "data.frame"
class(nc$geometry)
## [1] "sfc_MULTIPOLYGON" "sfc"
class(nc$geometry[[1]])
## [1] "XY" "MULTIPOLYGON" "sfg"
st_crs(nc)$proj4string
## [1] "+proj=longlat +datum=NAD83 +no_defs"
st_crs(air)$proj4string
## [1] "+proj=longlat +datum=NAD83 +no_defs"
st_crs(hwy)$proj4string
## [1] "+proj=utm +zone=15 +datum=NAD83 +units=m +no_defs"
nc_ll = nc
air_ll = air
hwy_ll = st_transform(hwy, st_crs(nc)$proj4string)
nc_utm = st_transform(nc, st_crs(hwy)$proj4string)
air_utm = st_transform(air, st_crs(hwy)$proj4string)
hwy_utm = hwy
sub = nc$COUNTY %in% c("Durham County","Wake County","Orange County")
nc[sub,]
## Simple feature collection with 3 features and 8 fields
## geometry type: MULTIPOLYGON
## dimension: XY
## bbox: xmin: -79.26453 ymin: 35.51905 xmax: -78.25503 ymax: 36.24385
## epsg (SRID): 4269
## proj4string: +proj=longlat +datum=NAD83 +no_defs
## AREA PERIMETER COUNTYP010 STATE COUNTY FIPS STATE_FIPS SQUARE_MIL
## 29 0.10401267 1.297813 2074 NC Orange County 37135 37 401.465
## 30 0.07714111 1.287529 2075 NC Durham County 37063 37 297.841
## 37 0.22144442 2.140667 2106 NC Wake County 37183 37 857.610
## geometry
## 29 MULTIPOLYGON(((-79.25563180...
## 30 MULTIPOLYGON(((-78.94843190...
## 37 MULTIPOLYGON(((-78.71263198...
filter(nc, COUNTY %in% c("Durham County","Wake County","Orange County"))
## Simple feature collection with 3 features and 8 fields
## geometry type: MULTIPOLYGON
## dimension: XY
## bbox: xmin: -84.32186 ymin: 33.84175 xmax: -75.46003 ymax: 36.58815
## epsg (SRID): 4269
## proj4string: +proj=longlat +datum=NAD83 +no_defs
## AREA PERIMETER COUNTYP010 STATE COUNTY FIPS STATE_FIPS SQUARE_MIL
## 1 0.10401267 1.297813 2074 NC Orange County 37135 37 401.465
## 2 0.07714111 1.287529 2075 NC Durham County 37063 37 297.841
## 3 0.22144442 2.140667 2106 NC Wake County 37183 37 857.610
## geometry
## 1 MULTIPOLYGON(((-79.25563180...
## 2 MULTIPOLYGON(((-78.94843190...
## 3 MULTIPOLYGON(((-78.71263198...
counties = c("Durham County","Wake County","Orange County")
sub = nc$COUNTY %in% counties
st_distance(nc_ll[sub, ])
## Error in st_distance(nc_ll[sub, ]): st_distance for longitude/latitude data only available for POINT geometries
st_distance(nc_utm[sub, ])
## Units: m
## [,1] [,2] [,3]
## [1,] 0.000 0 9906.327
## [2,] 0.000 0 0.000
## [3,] 9906.327 0 0.000
nc_ll[sub, ] %>% st_centroid() %>% st_distance()
## Warning in st_centroid.sfc(st_geometry(x)): st_centroid does not give correct centroids for
## longitude/latitude data
## Units: m
## [,1] [,2] [,3]
## [1,] 0.00 22185.58 52031.22
## [2,] 22185.58 0.00 34076.78
## [3,] 52031.22 34076.78 0.00
nc_utm[sub, ] %>% st_centroid() %>% st_distance()
## Units: m
## [,1] [,2] [,3]
## [1,] 0.00 22616.18 53050.15
## [2,] 22616.18 0.00 34751.60
## [3,] 53050.15 34751.60 0.00
d = st_distance(air_utm, nc_utm[sub,])
d[1:5,]
## Units: m
## [,1] [,2] [,3]
## [1,] 846916.0 837771.1 836234.3
## [2,] 3122697.5 3146840.3 3172522.0
## [3,] 3556664.1 3584394.6 3592972.9
## [4,] 3514296.0 3540264.5 3545184.1
## [5,] 952881.7 954495.9 921201.2
nearest_airport = apply(d, 2, which.min)
air %>% slice(nearest_airport) %>% .$AIRPT_NAME
## [1] "RALEIGH-DURHAM INTERNATIONAL AIRPORT" "RALEIGH-DURHAM INTERNATIONAL AIRPORT"
## [3] "RALEIGH-DURHAM INTERNATIONAL AIRPORT"
st_intersects(nc[20:30,], air) %>% str()
## although coordinates are longitude/latitude, it is assumed that they are planar
## List of 11
## $ : int(0)
## $ : int(0)
## $ : int(0)
## $ : int(0)
## $ : int(0)
## $ : int 268
## $ : int 717
## $ : int(0)
## $ : int(0)
## $ : int(0)
## $ : int(0)
st_intersects(nc, air, sparse=FALSE) %>% str()
## although coordinates are longitude/latitude, it is assumed that they are planar
## logi [1:100, 1:940] FALSE FALSE FALSE FALSE FALSE FALSE ...
nc_air = st_intersects(nc, air)
## although coordinates are longitude/latitude, it is assumed that they are planar
has_air = map_lgl(nc_air, ~ length(.) > 0)
nc %>% slice(which(has_air)) %>% .$COUNTY
## [1] "Forsyth County" "Guilford County" "Dare County" "Wake County"
## [5] "Pitt County" "Catawba County" "Buncombe County" "Wayne County"
## [9] "Mecklenburg County" "Moore County" "Cabarrus County" "Lenoir County"
## [13] "Craven County" "Cumberland County" "Onslow County" "New Hanover County"
air_in_nc = nc_air %>% unlist() %>% unique()
air %>% slice(air_in_nc) %>% .$AIRPT_NAME
## [1] "SMITH REYNOLDS AIRPORT"
## [2] "PIEDMONT TRIAD INTERNATIONAL AIRPORT"
## [3] "DARE COUNTY REGIONAL AIRPORT"
## [4] "RALEIGH-DURHAM INTERNATIONAL AIRPORT"
## [5] "PITT-GREENVILLE AIRPORT"
## [6] "HICKORY REGIONAL AIRPORT"
## [7] "ASHEVILLE REGIONAL AIRPORT"
## [8] "SEYMOUR JOHNSON AIR FORCE BASE"
## [9] "CHARLOTTE/DOUGLAS INTERNATIONAL AIRPORT"
## [10] "MOORE COUNTY AIRPORT"
## [11] "CONCORD REGIONAL AIRPORT"
## [12] "KINSTON REGIONAL JETPORT AT STALLINGS FIELD"
## [13] "CHERRY POINT MARINE CORPS AIR STATION /CUNNINGHAM FIELD/"
## [14] "COASTAL CAROLINA REGIONAL AIRPORT"
## [15] "POPE AIR FORCE BASE"
## [16] "FAYETTEVILLE REGIONAL/GRANNIS FIELD"
## [17] "ALBERT J ELLIS AIRPORT"
## [18] "WILMINGTON INTERNATIONAL AIRPORT"
plot(st_geometry(nc))
plot(st_geometry(nc[has_air,]), add=TRUE, col="lightblue")
plot(st_geometry(air[air_in_nc,]), add=TRUE, pch=16, col="blue")
nc = nc[order(nc$COUNTY),]
adj = st_touches(nc, sparse=FALSE)
## although coordinates are longitude/latitude, it is assumed that they are planar
str(adj)
## logi [1:100, 1:100] FALSE FALSE FALSE FALSE FALSE FALSE ...
durham = which(nc$COUNTY == "Durham County")
nc %>% slice(which(adj[durham,])) %>% .$COUNTY
## [1] "Chatham County" "Granville County" "Orange County" "Person County" "Wake County"
library(corrplot)
rownames(adj) = str_replace(nc$COUNTY, " County", "")
colnames(adj) = str_replace(nc$COUNTY, " County", "")
corrplot(adj[1:20,1:20],method="color",type="full",tl.col="black",cl.pos = "n")
most_neighbors = rowSums(adj)==max(rowSums(adj))
plot(st_geometry(nc))
plot(st_geometry(nc[most_neighbors,]),add=TRUE,col="lightblue")
nc$COUNTY[most_neighbors]
## [1] "Iredell County" "Moore County"
least_neighbors = rowSums(adj)==min(rowSums(adj))
plot(st_geometry(nc))
plot(st_geometry(nc[least_neighbors,]),add=TRUE,col="lightblue")
nc$COUNTY[least_neighbors]
## [1] "Chowan County" "Clay County" "Currituck County" "Dare County"
## [5] "New Hanover County" "Pamlico County" "Polk County" "Tyrrell County"