library(tidyverse)
library(janitor)
Parcel boundaries with address and revenue-related information for properties in Wake County, NC. http://data-wake.opendata.arcgis.com/datasets/parcels
wake <- read_csv("https://www2.stat.duke.edu/~sms185/data/econ/parcels.csv")
wake <- clean_names(wake)
Which city has the fewest land parcels in the dataset?
wake %>%
count(city_decode) %>%
arrange(n) %>%
slice(1)
Create a tibble that shows the year a parcel was built and the total value, where all parcels are located in Apex and are more than one acre in area. Sort the result in ascending order by year built.
wake %>%
filter(city_decode == "APEX", calc_area > 1) %>%
select(year_built, total_value_assd) %>%
arrange(year_built)
Choose a subset of five variables and 10 random rows from wake
and save it as an object named wake_mini
. Experiment renaming variables with select()
and rename()
on wake_mini
. What is the difference between the two functions?
Function select()
only keeps the variables you mention.
Function rename()
keeps all variables.
Compute the mean area for each design style.
wake %>%
group_by(design_style_decode) %>%
summarise(mean_area = mean(calc_area))
Compute the median sale price for each year. Hint: lubridate::year()
wake %>%
filter(!is.na(sale_date)) %>%
mutate(year = lubridate::year(sale_date)) %>%
group_by(year) %>%
summarise(median_sale_price = median(totsalprice, na.rm = TRUE))
Which city with at least 1,000 parcels classified as a “Townhouse” had the highest proportion of parcels as “Townhouse”?
wake %>%
count(city_decode, design_style_decode) %>%
group_by(city_decode) %>%
mutate(prop = n / sum(n)) %>%
ungroup() %>%
filter(design_style_decode == "Townhouse", n >= 1000) %>%
arrange(desc(prop)) %>%
slice(1)