Your Turn

Complete the summarize sections of the data-wrangling-and-analysis-exercises.Rmd file.

Learn More

General Data Wrangling and Analysis Resources

Most material on data wrangling and analysis with the dplyr package covers all of the verbs discussed in this course together, so I have chosen not to separate the resources by lesson. Instead, here are some helpful resources for learning more about all of the tidyverse verbs discussed in this course:

Chapter 5 of R for Data Science

RStudio Cloud primer on working with data

Tidyverse for Beginners by Danielle Navarro

Learning Statistics with R by Danielle Navarro

Introduction to the Tidyverse by Alison Hill

A gRadual intRoduction to data wRangling by Chester Ismay and Ted Laderas

Working in the Tidyverse by Desi Quintans and Jeff Powell

Christine Monnier video tutorials on dplyr

Have any questions? Put them below and we will help you out!

I got the count function to count the number of rows as assigned, but I wanted to find a way to get the number of rows without an NA (that is, with an answer) for hours of sleep per night. I tried adding the argument "na.rm = TRUE" to the n() function in a few places in the code chunk, but it didn't work.

Laura Hickerson

January 18, 2022
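
One possible way to approach the question above, as a sketch rather than the course's official answer: n() takes no arguments, so it cannot accept na.rm = TRUE. Counting the non-missing values with sum(!is.na(...)) inside summarize() is a common alternative. The sleep_data tibble below is a made-up stand-in for the course's nhanes data; only the sleep_hrs_night column name comes from the lesson.

```r
library(dplyr)

# Hypothetical stand-in for the nhanes data used in the course
sleep_data <- tibble(
  sleep_hrs_night = c(7, NA, 8, 6, NA)
)

sleep_counts <- sleep_data %>%
  summarize(
    n_rows = n(),                                 # every row, including NAs
    n_with_answer = sum(!is.na(sleep_hrs_night))  # only rows with a value
  )

sleep_counts
```

Another option along the same lines is to drop the missing rows first with filter(!is.na(sleep_hrs_night)) and then count what remains.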

Hi David - with the summarize function, it looks like you did not use a select statement afterward, but it still outputs the result. I have to put in a select statement afterwards to get any output for mean_hours_sleep. Any thoughts?

Kim Cataldo

April 4, 2022

After we run the summarize function to create ‘mean_hours_sleep’ is this considered a new variable? Or only if we assign it? If we don’t assign it, do we have to repeat the summarize line of code whenever we want to reference it again, like when we’re using group_by?

nhanes %>% group_by(gender, work) %>% summarize(mean_hours_sleep = mean(sleep_hrs_night, na.rm = TRUE))

Abdikadir Eftin

November 2, 2023
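
A minimal sketch of the assignment question above, using a made-up nhanes_small tibble in place of the course data (the object names nhanes_small and sleep_by_group are hypothetical). A summarize() call returns a new data frame; unless that result is assigned to a name, it is only printed and the original data is left unchanged.

```r
library(dplyr)

# Hypothetical stand-in for the nhanes data used in the course
nhanes_small <- tibble(
  gender = c("female", "female", "male", "male"),
  work = c("Working", "NotWorking", "Working", "Working"),
  sleep_hrs_night = c(7, 8, NA, 6)
)

# Not assigned: the summary is printed, then discarded
nhanes_small %>%
  group_by(gender, work) %>%
  summarize(mean_hours_sleep = mean(sleep_hrs_night, na.rm = TRUE),
            .groups = "drop")

# Assigned: the summary is stored and can be reused without rerunning the pipeline
sleep_by_group <- nhanes_small %>%
  group_by(gender, work) %>%
  summarize(mean_hours_sleep = mean(sleep_hrs_night, na.rm = TRUE),
            .groups = "drop")

sleep_by_group
```

In this sketch, mean_hours_sleep exists as a column of sleep_by_group but never becomes a column of nhanes_small.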

Hi David,

Can I use mean_hrs rather than "hours"? What difference would it make? Why are we spelling out hours rather than leaving it at just hrs?