Median Income and Race in the District of Columbia

census
wealth
Mapping U.S. Census Data with R
Published

July 19, 2025

Packages

  • tidyverse (ggplot2, dplyr) for plotting and wrangling data
  • sf (simple features) - I’ll only be using one function geom_sf() to plot the map with ggplot2, but this package is part of the geocomputation ecosystem of R.
  • tidycensus - an R package by Kyle Walker that allows users to interface with a select number of the US Census Bureau’s data APIs and return tidyverse-ready data frames, optionally with simple feature geometry included
Code
# install.packages(c("tidyverse", "extrafont", "tidycensus", "sf"))

library(tidyverse)
library(tidycensus)

extrafont::loadfonts()

Intiating API Connection

To begin pulling data from the U.S. Census into R, you must submit your email to the Census in order to receive an API key. I have hidden my key for this demo.

Take your key and instantiate it in your R session like so:

Code
census_api_key(my_key)
options(tigris_use_cache = TRUE)

Data Pull

To download the data we need (median household income at the tract level of White, Black, Asian and Hispanic residents of DC) we will use get_acs().

Code
dc <- 
  get_acs(
    geography = "tract",
    # this could be block, block group, tract, county, etc.
    variables = c(
      "White" = "B19013A_001",
      # white
      "Black" = "B19013B_001",
      # black
      "Asian" = "B19013D_001",
      # asian
      "Hispanic " = "B19013I_001"
      # hispanic
    ),
    # this could be whole plotting and thotting topic on its own...
    state = "DC",
    # DC counts as a state here
    year = 2022,
    # I first created this plot when 2022 had the most recent data
    geometry = TRUE
    # we want to return an sf object (similar to tidy table/tibble that 
    # also has a geometric component)
  ) |>
  replace_na(list(estimate = 0))
  # if it's NA it equals 0 in our situation

Cleaning Data Set

Ok, so we know that nobody lives on the National Mall (which is its own census tract), so if we included that row in our map, it’ll be filled in just like the others.

I took the creative liberty to decide to remove all tracts where the value of median household income is 0.

Code
remove <- 
  dc |> 
  filter(estimate == 0) |> 
  # all rows where estimate is 0
  group_by(GEOID) |> 
  # create groups based on the ID of the geometric feature
  count() |> 
  # for each GEOID, how many rows have a 0 in estimate?
  filter(n == 4) |> 
  # all rows where value is 4 (four variables, four zeroes)
  pull(GEOID) 
  # retrieve GEOIDs of tracts we want to remove

dc_clean <-
  dc |> 
  filter(!GEOID %in% remove)

We can observe the first few rows of this data to get an idea of what’s going on:

Code
dc_clean |>
  head(4) |>
  knitr::kable("html") |> 
  kableExtra::kable_styling(bootstrap_options = c("striped", "hover", "condensed"))
GEOID NAME variable estimate moe geometry
11001010300 Census Tract 103; District of Columbia; District of Columbia White 193333 46866 POLYGON ((-77.03636 38.9748...
11001010300 Census Tract 103; District of Columbia; District of Columbia Black 85758 47260 POLYGON ((-77.03636 38.9748...
11001010300 Census Tract 103; District of Columbia; District of Columbia Asian 0 NA POLYGON ((-77.03636 38.9748...
11001010300 Census Tract 103; District of Columbia; District of Columbia Hispanic 64233 7858 POLYGON ((-77.03636 38.9748...

Here are four rows that each refer to Census Tract 103 in the District of Columbia. The variable column describes these rows as containing the median household income by race, while estimate shows the actual value. We can see that the White median household income for Tract 103 was estimated to be $193,333 in 2022; for Black households in that same tract the estimate is $85,758, and for Hispanic households it was $64,233. It appears that there are not enough Asian households in this tract for there to be a reliable estimate.

Plotting the Map

As mentioned earlier, we will use the geom_sf() function to draw a map within the ggplot framework.

Code
map <- 
  dc_clean |>
  # filter for rows where not GEOID are in remove (GEOID not in remove)
  ggplot(aes(fill = estimate)) + 
  # start ggplot object
  geom_sf() + 
  # apply sf geom
  labs(title = "Race and Income in DC",
       subtitle = "2022 Inflation-adjusted USD\nMedian Household Income",
       caption = "Source: 2018-2022 5-year\nAmerican Community Servey") +
  # set title, subtitle and caption (using new line character "\n")
  scale_fill_viridis_c(labels = scales::dollar,
                       option = "inferno",
                       direction = -1) +
  # apply a colorblind-friendly color palette
  # we also want dollar signs on our legend
  # the "inferno" color palette
  # reverse the high/low direction from default
  facet_wrap(~ variable, nrow = 2) +
  # instead of one DC
  # we get four DC's (for each variable, what is being faceted)
  # ensure there are 2 rows of plots
  theme_bw() + 
  # black and white theme 
  theme(panel.grid = element_blank(),
        # remove all grid lines (graticules are not necessary here)
        axis.text = element_blank(),
        # no need for degrees longitude and latitude
        axis.ticks = element_blank(),
        # ticks also go
        legend.title = element_blank(),
        # a title for the legend is redundant so it goes as well
        strip.background = element_rect(fill = "white"),
        # change from grey
        legend.spacing.y = unit(.3, "cm"),
        # adjust spacing a tiny bit
        plot.title = element_text(face = "bold"),
        # put the title to bold
        text = element_text(family = "Garamond")
        # change font to Garamond
  )

map

Saving Your Map

Now that you’ve successfully plotted Census data on map, you can save the image file to your computer using ggsave() (code sample below):

Code
# ggsave("median_income_by_race_dc_census_tracts.png",
#        plot = map,
#        device = "png",
#        width = 6.3,
#        height = 4.3,
#        unit = "in")

If you found this tutorial helpful, try this out on your own.