06. More on dplyr and reshaping data

Published

Tuesday, September 23, 2025

This week …

More on dplyr: using across() to apply functions to multiple columns at once; using case_when() to create new variables based on conditions; more on reshaping data with pivot_longer(); more on filtering and selecting data.
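As a rough preview of these three functions, here is a minimal sketch. The `scores` table, its column names, and the grade cutoffs are all made up for illustration; they are not from the course data.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: one row per student, one column per quiz
scores <- tibble(
  student = c("A", "B", "C"),
  quiz_1  = c(70, 85, 92),
  quiz_2  = c(65, 90, 88)
)

# across(): apply the same function to several columns at once
means <- scores |>
  summarize(across(starts_with("quiz"), mean))

# case_when(): create a new variable from a series of conditions,
# checked in order; TRUE acts as the catch-all final case
graded <- scores |>
  mutate(
    level = case_when(
      quiz_1 >= 90 ~ "high",
      quiz_1 >= 80 ~ "middle",
      TRUE         ~ "low"
    )
  )

# pivot_longer(): go from one column per quiz to one row per
# student-quiz pair
long <- scores |>
  pivot_longer(
    cols      = starts_with("quiz"),
    names_to  = "quiz",
    values_to = "score"
  )
```

Printing `means`, `graded`, and `long` in turn shows what each step does to the shape of the table.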

Required Reading

  • Karl W. Broman and Kara H. Woo, “Data Organization in Spreadsheets,” The American Statistician 72, no. 1 (January 2, 2018): 2–10, doi:10.1080/00031305.2017.1375989.

  • Next, read the vignettes that come with dplyr. (In R packages, a vignette is like an extended example or tutorial.) These are available on your computer because you’ve installed the package, but you can also read them online. Note that tidyverse vignettes do not use the base pipe, |>. They use the magrittr pipe, %>%, instead. For our purposes the two are equivalent.
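To see that the two pipes behave the same way in simple cases like the ones in the vignettes, here is a small sketch. The vector `x` is made up for illustration, and the base pipe assumes R 4.1 or later.

```r
library(magrittr)  # provides %>%; loading dplyr also makes it available

x <- c(1, 4, 9)

# Base pipe: pass x to sqrt(), then pass that result to sum()
base_result <- x |> sqrt() |> sum()

# magrittr pipe: the same steps, written with %>%
magrittr_result <- x %>% sqrt() %>% sum()

identical(base_result, magrittr_result)
```

The two pipes do differ in some more advanced uses, but not in anything the vignettes rely on.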

Read the following vignettes, in the order listed:

dplyr

tidyr

stringr

  • Regular expressions
  • Read the help page for str_detect(): ?stringr::str_detect, which is also available online. Work through the examples one at a time and make a note of any that seem confusing.
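As a warm-up before the help page examples, here is a minimal sketch of str_detect() with two simple regular expressions. The `fruit_names` vector is made up for illustration.

```r
library(stringr)

# Hypothetical example strings
fruit_names <- c("apple", "banana", "pear", "pineapple")

# Does each string contain the letter "e"? Returns one TRUE/FALSE per string
has_e <- str_detect(fruit_names, "e")

# ^ anchors the pattern to the start of the string
starts_with_p <- str_detect(fruit_names, "^p")

# $ anchors the pattern to the end of the string. Because the result
# is a logical vector, it can be used to subset the original vector
ends_in_apple <- fruit_names[str_detect(fruit_names, "apple$")]
```

The same logical vector is what you would hand to filter() in dplyr to keep only the matching rows.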

Examples

(To follow.)

Assignment

(To follow.)

Slides