Shapefiles and rgeos


Shapefiles

Data

Today’s data is all from Mike Bostock’s us-atlas git repository - https://github.com/mbostock/us-atlas.

Available on saxon in ~cr173/Sta523/data/us-atlas/shp/us.

dir("~/Sta523/data/us-atlas/shp/us/","*.shp")
##  [1] "amtrak.shp"              "cbsa.shp"                "coast.shp"               "congress-unfiltered.shp"
##  [5] "counties.shp"            "counties-unfiltered.shp" "ferries.shp"             "nation-unmerged.shp"    
##  [9] "ports.shp"               "railroads.shp"           "roads-unmerged.shp"      "states-unfiltered.shp"  
## [13] "streams-unmerged.shp"    "waterbodies.shp"         "zipcodes-unmerged.shp"

Shapefile components

$ ls -lh ~cr173/Sta523/data/us-atlas/shp/us/states-unfiltered*
-rw-r--r--+ 1 cr173 visitor 8.8K Sep 30 14:36 ~cr173/Sta523/data/us-atlas/shp/us/states-unfiltered.dbf
-rw-r--r--+ 1 cr173 visitor  167 Sep 30 14:36 ~cr173/Sta523/data/us-atlas/shp/us/states-unfiltered.prj
-rw-r--r--+ 1 cr173 visitor  772 Sep 30 14:36 ~cr173/Sta523/data/us-atlas/shp/us/states-unfiltered.sbn
-rw-r--r--+ 1 cr173 visitor  156 Sep 30 14:36 ~cr173/Sta523/data/us-atlas/shp/us/states-unfiltered.sbx
-rw-r--r--+ 1 cr173 visitor  11M Sep 30 14:36 ~cr173/Sta523/data/us-atlas/shp/us/states-unfiltered.shp
-rw-r--r--+ 1 cr173 visitor  596 Sep 30 14:36 ~cr173/Sta523/data/us-atlas/shp/us/states-unfiltered.shx
-rw-r--r--+ 1 cr173 visitor  19K Sep 30 14:36 ~cr173/Sta523/data/us-atlas/shp/us/states-unfiltered.txt
-rw-r--r--+ 1 cr173 visitor 7.4K Sep 30 14:36 ~cr173/Sta523/data/us-atlas/shp/us/states-unfiltered.xml


Required files :

  • .shp - the feature geometry data
  • .shx - shape index
  • .dbf - columnar attributes for each shape, in dBase IV format

Optional files :

  • .prj - coordinate system and projection information
  • .sbn and .sbx - spatial indexes of the features
  • .xml - geospatial metadata

Shapefile info

suppressMessages(library(rgdal))
ogrInfo("/home/vis/cr173/Sta523/data/us-atlas/shp/us/","states-unfiltered")
## Source: "/home/vis/cr173/Sta523/data/us-atlas/shp/us/", layer: "states-unfiltered"
## Driver: ESRI Shapefile number of rows 62 
## Feature type: wkbPolygon with 2 dimensions
## Extent: (-179.1 17.67) - (179.8 71.39)
## CRS: +proj=longlat +ellps=GRS80 +datum=NAD83 +no_defs  
## LDID: 87 
## Number of fields: 9 
##         name type length typeName
## 1       AREA    2     11     Real
## 2  PERIMETER    2     11     Real
## 3 STATESP010    2     19     Real
## 4      STATE    4     20   String
## 5 STATE_FIPS    4      2   String
## 6  ORDER_ADM    2     19     Real
## 7  MONTH_ADM    4     18   String
## 8    DAY_ADM    2     19     Real
## 9   YEAR_ADM    2     19     Real

Reading a Shapefile

states = readOGR("/home/vis/cr173/Sta523/data/us-atlas/shp/us/","states-unfiltered", stringsAsFactors=FALSE)
## OGR data source with driver: ESRI Shapefile 
## Source: "/home/vis/cr173/Sta523/data/us-atlas/shp/us/", layer: "states-unfiltered"
## with 62 features and 9 fields
## Feature type: wkbPolygon with 2 dimensions
class(states)
## [1] "SpatialPolygonsDataFrame"
## attr(,"package")
## [1] "sp"

Shapefile contents (data)

states@data
##       AREA PERIMETER STATESP010                STATE STATE_FIPS ORDER_ADM MONTH_ADM DAY_ADM YEAR_ADM
## 0  279.462   802.665          2               Alaska         02        49   January       3     1959
## 1    1.440    14.685         15               Hawaii         15        50    August      21     1959
## 2    0.764     8.126         72          Puerto Rico         72         0      <NA>       0        0
## 3   12.879    21.642          1              Alabama         01        22  December      14     1819
## 4   13.586    21.737          5             Arkansas         05        25      June      15     1836
## 5   28.919    23.890          4              Arizona         04        48  February      14     1912
## 6   41.613    55.323          6           California         06        31 September       9     1850
## 7   28.039    22.017          8             Colorado         08        38    August       1     1876
## 8    1.391     7.986          9          Connecticut         09         5   January       9     1788
## 9    0.017     0.866         11 District of Columbia         11         0      <NA>       0        0
## 10   0.533     5.469         10             Delaware         10         1  December       7     1787
## 11  13.477   112.896         12              Florida         12        27     March       3     1845
## 12  14.625    35.561         13              Georgia         13         4   January       2     1788
## 13  15.856    20.132         19                 Iowa         19        29  December      28     1846
## 14  24.459    30.909         16                Idaho         16        43      July       3     1890
## 15  15.408    21.660         17             Illinois         17        21  December       3     1818
## 16   0.443     3.098         17             Illinois         17        21  December       3     1818
## 17   9.873    17.293         18              Indiana         18        19  December      11     1816
## 18   0.065     1.694         18              Indiana         18        19  December      11     1816
## 19  22.005    21.410         20               Kansas         20        34   January      29     1861
## 20  10.664    22.510         21             Kentucky         21        15      June       1     1792
## 21  11.424    59.940         22            Louisiana         22        18     April      30     1812
## 22   2.293    23.275         25        Massachusetts         25         6  February       6     1788
## 23   2.664    50.893         24             Maryland         24         7     April      28     1788
## 24   9.666    51.869         23                Maine         23        23     March      15     1820
## 25  16.991    63.131         26             Michigan         26        26   January      26     1837
## 26  11.527    75.243         26             Michigan         26        26   January      26     1837
## 27  25.537    34.873         27            Minnesota         27        32       May      11     1858
## 28   0.790     7.392         27            Minnesota         27        32       May      11     1858
## 29  18.615    24.682         29             Missouri         29        24    August      10     1821
## 30  11.881    26.186         28          Mississippi         28        20  December      10     1817
## 31  45.081    36.427         30              Montana         30        41  November       8     1889
## 32  12.705    64.572         37       North Carolina         37        12  November      21     1789
## 33  21.840    24.663         38         North Dakota         38        39  November       2     1889
## 34  21.614    24.233         31             Nebraska         31        37     March       1     1867
## 35   2.677    10.088         33        New Hampshire         33         9      June      21     1788
## 36   2.073    17.481         34           New Jersey         34         3  December      18     1787
## 37  30.892    23.674         35           New Mexico         35        47   January       6     1912
## 38  29.943    23.854         32               Nevada         32        36   October      31     1864
## 39  13.909    42.221         36             New York         36        11      July      26     1788
## 40   1.166    13.524         36             New York         36        11      July      26     1788
## 41  11.325    19.294         39                 Ohio         39        17     March       1     1803
## 42   0.994    10.146         39                 Ohio         39        17     March       1     1803
## 43  18.003    26.979         40             Oklahoma         40        46  November      16     1907
## 44  28.147    33.718         41               Oregon         41        33  February      14     1859
## 45  12.531    17.686         42         Pennsylvania         42         2  December      12     1787
## 46   0.211     2.567         42         Pennsylvania         42         2  December      12     1787
## 47   0.304     6.629         44         Rhode Island         44        13       May      29     1790
## 48   7.795    34.720         45       South Carolina         45         8       May      23     1788
## 49  22.580    23.292         46         South Dakota         46        40  November       2     1889
## 50  10.890    22.359         47            Tennessee         47        16      June       1     1796
## 51  65.141    87.128         48                Texas         48        28  December      29     1845
## 52  22.975    20.001         49                 Utah         49        45   January       4     1896
## 53  10.559    57.087         51             Virginia         51        10      June      25     1788
## 54   2.799     9.455         50              Vermont         50        14     March       4     1791
## 55  20.844    58.094         53           Washington         53        42  November      11     1889
## 56   0.874    36.906         53           Washington         53        42  November      11     1889
## 57  16.482    29.492         55            Wisconsin         55        30       May      29     1848
## 58   2.739    23.805         55            Wisconsin         55        30       May      29     1848
## 59   6.492    20.039         54        West Virginia         54        35      June      20     1863
## 60  27.972    21.996         56              Wyoming         56        44      July      10     1890
## 61   0.030     2.761         78  U.S. Virgin Islands         78         0      <NA>       0        0

Shapefile contents (shapes)

plot(states, col="lightgrey", axes=TRUE)

plot of chunk unnamed-chunk-6

Shapefile contents (shapes) cont.

plot(states[states$ORDER_ADM %in% 1:48,], col="lightgrey", axes=TRUE)

plot of chunk unnamed-chunk-7

Why Illinois twice?

plot(states[states$STATE == "Illinois",], col=c("lightgrey","lightblue"), axes=TRUE)

plot of chunk unnamed-chunk-8

Why Michigan twice?

plot(states[states$STATE == "Michigan",], col=c("lightgrey","lightblue"), axes=TRUE)

plot of chunk unnamed-chunk-9

Removing duplicates / the Great Lakes

states = states[!duplicated(states$STATE),]
plot(states[states$ORDER_ADM %in% 1:48,], col="lightgrey", axes=TRUE)

plot of chunk unnamed-chunk-10

Railroads info

ogrInfo("/home/vis/cr173/Sta523/data/us-atlas/shp/us/","railroads")
## Source: "/home/vis/cr173/Sta523/data/us-atlas/shp/us/", layer: "railroads"
## Driver: ESRI Shapefile number of rows 9248 
## Feature type: wkbLineString with 2 dimensions
## Extent: (-150.1 25.47) - (-67.43 64.93)
## CRS: +proj=longlat +ellps=GRS80 +datum=NAD83 +no_defs  
## LDID: 87 
## Number of fields: 12 
##          name type length typeName
## 1      FNODE_    2     19     Real
## 2      TNODE_    2     19     Real
## 3      LENGTH    2     19     Real
## 4  RAILRDL010    2     19     Real
## 5    RROWNER1    4     85   String
## 6    RROWNER2    4     85   String
## 7    RROWNER3    4     85   String
## 8       MARK1    4      4   String
## 9       MARK2    4      4   String
## 10      MARK3    4      4   String
## 11      MILES    2     19     Real
## 12 KILOMETERS    2     19     Real

Railroad LineStrings

railroads = readOGR("/home/vis/cr173/Sta523/data/us-atlas/shp/us/","railroads",stringsAsFactors=FALSE)
## OGR data source with driver: ESRI Shapefile 
## Source: "/home/vis/cr173/Sta523/data/us-atlas/shp/us/", layer: "railroads"
## with 9248 features and 12 fields
## Feature type: wkbLineString with 2 dimensions
class(railroads)
## [1] "SpatialLinesDataFrame"
## attr(,"package")
## [1] "sp"

Railroad data

head(railroads@data, n=10)
##   FNODE_ TNODE_  LENGTH RAILRDL010                                      RROWNER1          RROWNER2 RROWNER3
## 0      1      2 0.08758          1                Union Pacific Railroad Company              <NA>     <NA>
## 1      3      4 0.04277          2                Union Pacific Railroad Company              <NA>     <NA>
## 2      4      5 0.08574          3                Union Pacific Railroad Company              <NA>     <NA>
## 3      6      7 0.09291          4                 Austin Area Terminal Railroad Longhorn Railroad     <NA>
## 4      8      9 0.17752          5            Point Comfort and Northern Railway              <NA>     <NA>
## 5     10     11 0.07076          6 Rockdale Sandow and Southern Railroad Company              <NA>     <NA>
## 6     12     13 0.14393          7                Union Pacific Railroad Company              <NA>     <NA>
## 7     14     15 0.08265          8                Union Pacific Railroad Company              <NA>     <NA>
## 8     16     17 0.01010          9                Union Pacific Railroad Company              <NA>     <NA>
## 9     10     18 0.23490         10                Union Pacific Railroad Company              <NA>     <NA>
##   MARK1 MARK2 MARK3   MILES KILOMETERS
## 0    UP  <NA>  <NA>  5.4246     8.7300
## 1    UP  <NA>  <NA>  2.8083     4.5195
## 2    UP  <NA>  <NA>  5.6913     9.1592
## 3  AUAR  LHRR  <NA>  6.1793     9.9446
## 4   PCN  <NA>  <NA> 12.0000    20.0000
## 5   RSS  <NA>  <NA>  4.7703     7.6771
## 6    UP  <NA>  <NA>  9.5339    15.0000
## 7    UP  <NA>  <NA>  5.2509     8.4505
## 8    UP  <NA>  <NA>  0.6106     0.9827
## 9    UP  <NA>  <NA> 14.0000    23.0000

Union Pacific tracks

owners = railroads@data[,c("RROWNER1","RROWNER2","RROWNER3")]
owners[is.na(owners[])] = ""

up_i = railroads$RROWNER1 == "Union Pacific Railroad Company" |
       railroads$RROWNER2 == "Union Pacific Railroad Company" |
       railroads$RROWNER3 == "Union Pacific Railroad Company"

up_rails = railroads[up_i,]

Union Pacific tracks (cont.)

plot(states[states$ORDER_ADM %in% 1:48,], border="lightgrey", axes=TRUE)
plot(up_rails, lwd=2, add=TRUE)

plot of chunk unnamed-chunk-15

Which states contain Union Pacific Tracks?

library(rgeos)
## rgeos version: 0.3-8, (SVN revision 460)
##  GEOS runtime version: 3.3.2-CAPI-1.7.2 
##  Polygon checking: TRUE
sec = gIntersects(states,up_rails,byid=TRUE)
dim(sec)
## [1] 1605   53
states$STATE[apply(sec,2,any)]
##  [1] "Arkansas"   "Arizona"    "California" "Colorado"   "Iowa"       "Idaho"      "Illinois"   "Kansas"    
##  [9] "Louisiana"  "Minnesota"  "Missouri"   "Montana"    "Nebraska"   "New Mexico" "Nevada"     "Oklahoma"  
## [17] "Oregon"     "Tennessee"  "Texas"      "Utah"       "Washington" "Wisconsin"  "Wyoming"

Spatial predicates in rgeos

  • gContains
  • gContainsProperly
  • gCovers
  • gCoveredBy
  • gCrosses
  • gDisjoint
  • gEquals
  • gEqualsExact
  • gIntersects
  • gOverlaps
  • gTouches
  • gWithin
predicates

Dimensionally Extended 9-Intersection Model

Find the Union Pacific tracks in California

up_cal = gIntersection(up_rails,states[states$STATE == "California",])

plot(states[states$STATE == "California",], border="lightgrey", axes=TRUE)
plot(up_cal, lwd=2, add=TRUE)

plot of chunk unnamed-chunk-17

Find areas within 50 kms of Nevada

UTM - http://en.wikipedia.org/wiki/Universal_Transverse_Mercator_coordinate_system

nevada  = states[states$STATE == "Nevada",]
nev_utm = spTransform(nevada, CRS("+proj=utm +zone=11 +ellps=WGS84 +datum=WGS84"))

nev_buf = gBuffer(nev_utm, width=50000)
plot(nev_utm, axes=TRUE, xlim=c(1e5, 9e5), 
     ylim=c(38e5, 47e5), asp=1)

plot of chunk unnamed-chunk-19

plot(nev_buf, axes=TRUE, xlim=c(1e5, 9e5), 
     ylim=c(38e5, 47e5), asp=1)
plot of chunk unnamed-chunk-20

Find areas within 50 kms of Nevada (cont.)

out = spTransform(gDifference(nev_buf, nev_utm), CRS(proj4string(states)))
plot(states, xlim=c(-125,-110), ylim=c(30,45), axes=TRUE)
plot(out, col=adjustcolor("blue",alpha=0.5), add=TRUE, border=rgb(0,0,0,alpha=0))

plot of chunk unnamed-chunk-21

Exercise

  • Load the roads data from roads-unmerged.shp, note that this is a largish file and may take a while to load into R.

  • Create a plot of the interstate highway system (this shapefile contains many additional road line segments) overlayed on a map of the lower 48 states.

  • Use a spatial predicate function to figure out which states I-10 passes through.