Introduction to R Packages for Data Management: tidyr and dplyr

Instructor

Dawn Koffman is a Statistical Programmer at the Office of Population Research at Princeton University. She earned an MS in Computer Science from the University of Wisconsin-Madison and an MPH in Epidemiology and Biostatistics from UMDNJ and Rutgers University.

Time/Place
5/09/2016 from 9:30 AM to 12:00 PM ~ Wallace 300
Description

This workshop introduces two modern R packages, both written by Hadley Wickham, that provide intuitive tools for handling common data management tasks. The first package, tidyr, provides functions that reshape data so that it conforms to a specific “tidy” structure in which each variable is saved in its own column, each observation is saved in its own row, and each type of observational unit is stored in a separate table. The second package, dplyr, provides a set of functions (referred to as “verbs”) that allow you to easily subset observations, re-order observations, select specific variables, add new variables, group observations, and summarize groups of observations.
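
As a rough illustration of the tasks described above, the following sketch reshapes a small table with tidyr and then applies the dplyr verbs to it. The data frame and its column names (country, y2014, y2015, cases) are hypothetical and are not taken from the workshop materials. The reshaping step uses gather(), which newer tidyr releases mark as superseded by pivot_longer() but still support.

```r
library(tidyr)
library(dplyr)

# Hypothetical "wide" data: one row per country, one column per year.
wide <- data.frame(
  country = c("A", "B", "C"),
  y2014 = c(10, 20, 30),
  y2015 = c(12, 18, 33),
  stringsAsFactors = FALSE
)

# tidyr: reshape so each row is one country-year observation.
tidy <- gather(wide, key = "year", value = "cases", y2014, y2015)

# dplyr verbs: subset, add a variable, select columns, re-order rows.
result <- tidy %>%
  filter(country != "C") %>%                 # subset observations
  mutate(cases_doubled = cases * 2) %>%      # add a new variable
  select(country, year, cases_doubled) %>%   # select specific variables
  arrange(country, year)                     # re-order observations

# dplyr verbs: group observations and summarize each group.
totals <- tidy %>%
  group_by(country) %>%
  summarise(total = sum(cases))

print(result)
print(totals)
```

Each verb takes a data frame as its first argument and returns a new data frame, which is what allows the steps to be chained with the %>% pipe.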

Audience
Attendees should have previous R experience.
Format
Lecture, discussion and hands-on exercises.
Requirements
Attendees should bring a laptop with R and the R packages tidyr and dplyr already installed.
Downloads