This contains materials for a workshop with the Data Storytelling Lab at Pratt Institute. The workshop covers how to query data from the Census API and then visualize and analyze it using R tools.

DataStorytellingLab/visualize-social-patterns-census-api

Visualize Social Patterns with the Census API

Author: John Lauermann, School of Information, Pratt Institute

This repository contains materials for a workshop with the Data Storytelling Lab at Pratt Institute. The workshop covers how to query data from the Census API and then visualize and analyze it using R tools.

In this workshop, you’ll learn how to visualize and analyze data from the US Census. Census data products are the primary historic record of American society, available as open data for every community in the United States. The long temporal record and broad geographic scope of these data present opportunities for data storytelling at scale. Census data allow you to tell stories that move beyond small-scale case studies and address national patterns and historic trends. To work with data at this scale, we’ll use open-source statistical tools in R to analyze and visualize geographic patterns.

Learning outcomes

By the end of the workshop, you should be able to:

  1. Query American Community Survey data from the Census API.
  2. Visualize patterns in the data using choropleth maps and scatterplots.
  3. Analyze whether those patterns are statistically significant using correlation and simple linear regression.
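
As a rough illustration of the third outcome, the sketch below runs a correlation test and a simple linear regression on synthetic data, so it needs no API key. The data frame `toy` and its columns `income` and `rent` are hypothetical stand-ins invented for this example; they are not the variables used in the workshop notebook.

```r
# Synthetic data: 200 hypothetical areas with an income value and a rent value.
# The relationship is built in on purpose so the tests have something to find.
set.seed(42)
n <- 200
income <- rnorm(n, mean = 60000, sd = 15000)
rent <- 500 + 0.015 * income + rnorm(n, sd = 150)
toy <- data.frame(income = income, rent = rent)

# Pearson correlation test: is the linear association distinguishable from zero?
ct <- cor.test(toy$income, toy$rent)
ct$estimate   # correlation coefficient r
ct$p.value    # p-value for the null hypothesis r = 0

# Simple linear regression: does income predict rent?
fit <- lm(rent ~ income, data = toy)
coef(fit)                  # intercept and slope
summary(fit)$r.squared     # share of variance in rent explained by income
```

With a single predictor, the regression R-squared equals the square of the Pearson correlation, so the two tests describe the same relationship from different angles.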

How to set up

Initial set up steps include:

  • Sign up for a Census API key at this link
  • Install R and RStudio on your computer with this link
  • Download the code version of this workflow, visualize-social-patterns-census-api.rmd (go to the page and click the download button at the top right)
  • Open the .rmd notebook in RStudio (from the top left menu bar: File -> Open File)
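
Once you have a key, it needs to be registered in R before any query will work. The snippet below is a minimal sketch of one way to do that with tidycensus; the notebook may handle this differently, and `"YOUR_KEY_HERE"` is a placeholder for your own key.

```r
# Install the packages used in the workshop (only needed once per machine).
install.packages(c("tidycensus", "dplyr", "ggplot2"))

library(tidycensus)

# Register the key for the current R session.
# "YOUR_KEY_HERE" is a placeholder, not a working key.
census_api_key("YOUR_KEY_HERE")

# Optional: save the key to your .Renviron file so later sessions find it
# automatically. Uncomment to use.
# census_api_key("YOUR_KEY_HERE", install = TRUE)
```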

You can replicate the workflow step by step using the code chunks in the notebook. To run a chunk, either click the green ‘Run Current Chunk’ button or put your cursor in the code block and press Ctrl + Shift + Enter (on a PC) or Cmd + Shift + Enter (on a Mac).

To see the full scope of this workshop, you can check out the markdown version of the notebook: visualize-social-patterns-census-api.md.

Basic workflow

  1. Install and load the libraries we’ll need for the workflow.
  2. Explore Census data and define a query to pull two variables from the API for further analysis. For this we’ll use tidycensus to query the Census API and dplyr to clean up the data table.
  3. Visualize the geographic distribution of each variable using choropleth maps built with ggplot2.
  4. Visualize the relationship between those two variables using a scatterplot built with ggplot2.
  5. Test the correlation between those variables using cor.test().
  6. Test whether one variable predicts the other using simple linear regression with the lm() function.
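
The steps above can be sketched end to end as follows. This is an illustrative outline, not the notebook's own code: the two ACS variables (B19013_001, median household income, and B25064_001, median gross rent), the county-level New York geography, and the 2022 five-year survey are assumptions chosen for the example. It requires a registered Census API key and an internet connection.

```r
# Step 1: load the libraries.
library(tidycensus)
library(dplyr)
library(ggplot2)

# Step 2: query two variables from the ACS (assumed choices, see lead-in).
# output = "wide" gives one row per county; geometry = TRUE adds boundaries.
acs <- get_acs(
  geography = "county",
  state = "NY",
  variables = c(income = "B19013_001", rent = "B25064_001"),
  year = 2022,
  survey = "acs5",
  output = "wide",
  geometry = TRUE
)

# In wide output, estimates carry an "E" suffix and margins of error an "M".
# Keep the estimates, give them short names, and drop rows with missing values.
acs <- acs %>%
  select(GEOID, NAME, income = incomeE, rent = rentE) %>%
  filter(!is.na(income), !is.na(rent))

# Step 3: choropleth map of one variable (repeat with fill = rent for the other).
ggplot(acs) +
  geom_sf(aes(fill = income), color = NA) +
  scale_fill_viridis_c() +
  labs(title = "Median household income by county", fill = "USD") +
  theme_void()

# Drop the geometry column for the non-spatial steps.
acs_tbl <- sf::st_drop_geometry(acs)

# Step 4: scatterplot of the relationship, with a fitted straight line.
ggplot(acs_tbl, aes(x = income, y = rent)) +
  geom_point() +
  geom_smooth(method = "lm") +
  labs(x = "Median household income (USD)", y = "Median gross rent (USD)")

# Step 5: correlation test.
cor.test(acs_tbl$income, acs_tbl$rent)

# Step 6: simple linear regression of rent on income.
model <- lm(rent ~ income, data = acs_tbl)
summary(model)
```

Requesting wide output keeps both variables on the same row for each county, which is the shape that the scatterplot, cor.test(), and lm() all expect.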

Further reading




This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
