deprivateR provides a unified API for accessing and
calculating measures of socioeconomic deprivation in the United States,
including the Area Deprivation Index (ADI), Neighborhood Deprivation
Index (NDI), and the Social Vulnerability Index (SVI). The Gini
Coefficient can also be returned, though it is not re-calculated on the
fly.
The sociome
and ndi
packages are excellent contributions, but offer different APIs for
returning their respective indices. deprivateR provides a
unified interface for accessing these measures of deprivation, as well
as the ability to calculate the various forms of the Social
Vulnerability Index (SVI) that the Centers for Disease Control and
Prevention (CDC) has published. Importantly, SVI can be calculated for a
variety of years and geographic levels. This functionality expands the
possibilities for implementing these measures in research and public
health practice. However, users should also be aware that ADI, NDI, and
SVI have not been extensively validated for some Census geographies.
The easiest way to install deprivateR is from CRAN:
install.packages("deprivateR")
Alternatively, you can install the development version of
deprivateR from GitHub with remotes:
# install.packages("remotes")
remotes::install_github("pfizer-opensource/deprivateR")
deprivateR retrieves data from the U.S. Census Bureau
via tidycensus, which requires a free API key. If you’ve
already configured tidycensus, no additional setup is
needed. Otherwise, use dep_set_api_key() to store your
key:
dep_set_api_key("your_key_here")
You can obtain a key at https://api.census.gov/data/key_signup.html.
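Before requesting data, it can help to confirm that a key is visible to the current R session. The sketch below assumes the key lives in the CENSUS_API_KEY environment variable, which is where tidycensus looks for it; whether dep_set_api_key() stores the key in that same variable is an assumption here, and has_census_key() is a hypothetical helper that is not part of deprivateR.

```r
# Hypothetical helper (not part of deprivateR): report whether a Census API
# key is visible to this session via the CENSUS_API_KEY environment variable,
# which is where tidycensus looks for it.
has_census_key <- function() {
  nzchar(Sys.getenv("CENSUS_API_KEY"))
}

if (!has_census_key()) {
  message("No Census API key found; see dep_set_api_key().")
}
```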
The core function in deprivateR is
dep_get_index(). This function returns the specified index
for the given geography and year:
dep_get_index(geography = "county", state = "MO", index = "adi", year = 2022)
#> Using FIPS code '29' for state 'MO'
#> # A tibble: 115 × 3
#>    GEOID NAME                         ADI
#>    <chr> <chr>                      <dbl>
#>  1 29001 Adair County, Missouri     101.
#>  2 29003 Andrew County, Missouri     69.4
#>  3 29005 Atchison County, Missouri  104.
#>  4 29007 Audrain County, Missouri   115.
#>  5 29009 Barry County, Missouri     106.
#>  6 29011 Barton County, Missouri    118.
#>  7 29013 Bates County, Missouri     107.
#>  8 29015 Benton County, Missouri    103.
#>  9 29017 Bollinger County, Missouri 104.
#> 10 29019 Boone County, Missouri      68.9
#> # ℹ 105 more rows
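Because the result is an ordinary data frame, it can be sorted or filtered with base R. The sketch below re-enters the first five rows of the output above by hand, using the rounded values as printed, so that it runs without a Census API key; in practice the object returned by dep_get_index() would take the place of the hand-built adi data frame.

```r
# First five rows of the county-level output shown above, re-entered by hand
# (values are the rounded figures as printed, not the full-precision results).
adi <- data.frame(
  GEOID = c("29001", "29003", "29005", "29007", "29009"),
  NAME = c("Adair County, Missouri", "Andrew County, Missouri",
           "Atchison County, Missouri", "Audrain County, Missouri",
           "Barry County, Missouri"),
  ADI = c(101, 69.4, 104, 115, 106)
)

# Sort counties from highest to lowest ADI
adi_sorted <- adi[order(adi$ADI, decreasing = TRUE), ]

# County with the highest ADI among these five rows
adi_sorted$NAME[1]
```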
The index argument can take multiple indices at once, as
can the year argument. This gives users the ability to
compare multiple indices across multiple years:
dep_get_index(geography = "county", state = "MO", index = c("svi20", "svi20s"), year = c(2021, 2022))
#> Using FIPS code '29' for state 'MO'
#> # A tibble: 230 × 5
#>    GEOID NAME                       YEAR SVI_20 SVI_20S
#>    <chr> <chr>                     <dbl>  <dbl>   <dbl>
#>  1 29001 Adair County, Missouri     2021  0.377   0.386
#>  2 29001 Adair County, Missouri     2022  0.456   0.439
#>  3 29003 Andrew County, Missouri    2021  0       0
#>  4 29003 Andrew County, Missouri    2022  0       0
#>  5 29005 Atchison County, Missouri  2021  0.149   0.167
#>  6 29005 Atchison County, Missouri  2022  0.149   0.167
#>  7 29007 Audrain County, Missouri   2021  0.746   0.781
#>  8 29007 Audrain County, Missouri   2022  0.886   0.904
#>  9 29009 Barry County, Missouri     2021  0.702   0.693
#> 10 29009 Barry County, Missouri     2022  0.702   0.667
#> # ℹ 220 more rows
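The multi-year output is in long format, with one row per county and year. One way to compare years is to reshape it so that each year becomes its own column. The sketch below uses base R's reshape() on three counties copied by hand from the output above; the CHANGE column is an illustrative addition, not something deprivateR returns.

```r
# SVI_20 values for three counties, copied from the printed output above
svi <- data.frame(
  GEOID = rep(c("29001", "29003", "29007"), each = 2),
  YEAR = rep(c(2021, 2022), times = 3),
  SVI_20 = c(0.377, 0.456, 0, 0, 0.746, 0.886)
)

# One row per county, one SVI_20 column per year
svi_wide <- reshape(svi, idvar = "GEOID", timevar = "YEAR", direction = "wide")

# Difference in SVI_20 between 2022 and 2021 (illustrative column)
svi_wide$CHANGE <- svi_wide$SVI_20.2022 - svi_wide$SVI_20.2021
svi_wide
```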
An alternative to dep_get_index() is
dep_calc_index(), which provides users with the ability to
calculate indices using pre-downloaded data. The
dep_sample_data() function can be used to explore how this
function works using sample data from the 2018-2022 5-year American
Community Survey for Missouri counties:
ndi_m <- dep_sample_data(index = "ndi_m")
dep_calc_index(ndi_m, geography = "county", index = "ndi_m", year = 2022)
#> Warning: The proportion of variance explained by PC1 is less than 0.50.
#> # A tibble: 115 × 4
#>    GEOID NAME                        YEAR   NDI_M
#>    <chr> <chr>                      <dbl>   <dbl>
#>  1 29001 Adair County, Missouri      2022  0.0193
#>  2 29003 Andrew County, Missouri     2022 -0.108
#>  3 29005 Atchison County, Missouri   2022 -0.0505
#>  4 29007 Audrain County, Missouri    2022  0.0107
#>  5 29009 Barry County, Missouri      2022  0.0129
#>  6 29011 Barton County, Missouri     2022  0.105
#>  7 29013 Bates County, Missouri      2022  0.0679
#>  8 29015 Benton County, Missouri     2022  0.0283
#>  9 29017 Bollinger County, Missouri  2022  0.0565
#> 10 29019 Boone County, Missouri      2022 -0.0646
#> # ℹ 105 more rows
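The warning above refers to the first principal component (PC1), which suggests the index is derived from a principal component analysis. As a rough illustration of what "proportion of variance explained by PC1" means, the sketch below runs base R's prcomp() on the built-in USArrests dataset. That dataset is unrelated to deprivation, and this is not the code deprivateR or ndi runs internally; it only shows how such a share can be computed and checked against a 0.50 threshold.

```r
# Illustrative only: USArrests is a built-in R dataset unrelated to deprivation,
# and this is not the code deprivateR or ndi runs internally.
pca <- prcomp(USArrests, center = TRUE, scale. = TRUE)

# Share of total variance captured by each principal component
var_share <- pca$sdev^2 / sum(pca$sdev^2)
pc1_share <- var_share[1]

# Mirror the 0.50 threshold mentioned in the warning above
if (pc1_share < 0.50) {
  warning("The proportion of variance explained by PC1 is less than 0.50.")
}
pc1_share
```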
The deprivateR package also contains a number of helper
functions that we use in our disparities work. These include:
dep_percentiles(): Calculate percentiles for a given
variable in a data frame. This is the method used to reproduce SVI
estimates, which include percentiles for each variable. It is also the
method used for dep_get_index() and
dep_calc_index() when
return_percentiles = TRUE.
dep_quantiles(): Calculate quantiles for a given
variable in a data frame. We use this to create tertiles and quartiles
for descriptive statistics and regression analyses.
dep_map_breaks(): Calculate map breaks for a given
variable in a data frame. This is useful for creating choropleth maps
with packages like ggplot2 or leaflet. It can
be used to create “bins” automatically, using any of the algorithms
supported by classInt::classIntervals(), or accept
pre-specified breaks.
deprivateR would not be possible without the work of the
sociome
and ndi
packages. The sociome package’s development was led by Nik
Krieger, and the ndi package’s author is Ian D. Buller.
We’re immensely grateful for their contributions to the field. Likewise,
deprivateR would not be possible without Kyle Walker’s packages tigris
and tidycensus,
which provide access to the underlying U.S. Census Bureau data for
calculating these indices.
If you have feedback on deprivateR, please open an
issue on GitHub after checking the contribution
guidelines. Please note that this project is released with a
Contributor Code
of Conduct. By participating in this project you agree to abide by
its terms.