Fundamentals of R
summarize
This lesson is called summarize, part of the Fundamentals of R course.
Your Turn
Complete the summarize sections of the data-wrangling-and-analysis-exercises.Rmd file.
Learn More
General Data Wrangling and Analysis Resources
Because most material on data wrangling and analysis with the dplyr package covers all of the verbs discussed in this course together, I have chosen not to separate the resources by lesson. Instead, here are some helpful resources for learning more about all of the tidyverse verbs discussed in this course:
Chapter 5 of R for Data Science
RStudio Cloud primer on working with data
Tidyverse for Beginners by Danielle Navarro
Learning Statistics with R by Danielle Navarro
Introduction to the Tidyverse by Alison Hill
A gRadual intRoduction to data wRangling by Chester Ismay and Ted Laderas
Elan Sykes
October 1, 2021
I got the count function to count the number of rows as assigned, but I also wanted to find the number of rows with an answer (that is, without an NA) for hours of sleep per night. I tried adding the argument "na.rm = TRUE" to the n() function in a few places in the code chunk, but it didn't work.
David Keyes
October 3, 2021
You'd want to do this in two steps. I showed how to do it in this short video. The code I used is here. Hope that helps!
Leila Maina
November 24, 2022
Hi David. I had the same question as above but cannot access the short video. Any way you could repost the link? Thanks.
David Keyes
November 26, 2022
Sorry, that video got deleted, but here's a new one! If you have other questions, please let me know.
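The video and code linked in this thread are not reproduced here. As a rough sketch of what a two-step approach could look like (not necessarily the code from the video), the example below first drops the rows with an NA and then counts what is left. The data frame nhanes_toy and the column name n_with_answer are made up for illustration; only sleep_hrs_night comes from the course data.

```r
library(dplyr)

# Toy stand-in for the nhanes data (hypothetical values, not the course data)
nhanes_toy <- tibble(
  gender = c("female", "male", "female", "male", "female"),
  sleep_hrs_night = c(7, NA, 6, 8, NA)
)

# n() takes no arguments, so na.rm = TRUE cannot be passed to it.
# Step 1: keep only the rows where sleep_hrs_night is not NA.
# Step 2: count the rows that remain.
sleep_answered <- nhanes_toy %>%
  filter(!is.na(sleep_hrs_night)) %>%
  summarize(n_with_answer = n())

sleep_answered
```

A one-step alternative is to count the non-missing values directly inside summarize with sum(!is.na(sleep_hrs_night)).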
Laura Hickerson
January 17, 2022
Hi David - with the summarize function, it looks like you did not use a select statement afterward, but it still outputs the result. I have to put in a select statement afterwards to get any output for mean_hours_sleep. Any thoughts?
Laura Hickerson
January 17, 2022
oops - sorry, now my code is displaying the output without select. Any tips about when select is required for output?
David Keyes
January 18, 2022
I recorded a short video response. Hope it's helpful!
Kim Cataldo
April 3, 2022
After we run the summarize function to create ‘mean_hours_sleep’ is this considered a new variable? Or only if we assign it? If we don’t assign it, do we have to repeat the summarize line of code whenever we want to reference it again, like when we’re using group_by?
nhanes %>% group_by(gender, work) %>% summarize(mean_hours_sleep = mean(sleep_hrs_night, na.rm = TRUE))
Charlie Hadley
April 5, 2022
Hi Kim! This is another form of the "have you printed to the console or made an assignment" conundrum.
Your code as currently provided is spitting out to the console and that's it.
But if we include an assignment (for example, putting nhanes_hours_sleep <- in front of the same code), then mean_hours_sleep will now live inside of nhanes_hours_sleep.
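The difference between printing and assigning can be sketched as follows. This is an illustration on a small made-up data frame (nhanes_toy, with hypothetical values) rather than the real nhanes data; the column names and the object name nhanes_hours_sleep follow the comments above.

```r
library(dplyr)

# Toy stand-in for the nhanes data (hypothetical values, not the course data)
nhanes_toy <- tibble(
  gender = c("female", "female", "male", "male"),
  work = c("Working", "NotWorking", "Working", "Working"),
  sleep_hrs_night = c(7, 6, NA, 8)
)

# No assignment: the summary is printed to the console and then discarded
nhanes_toy %>%
  group_by(gender, work) %>%
  summarize(mean_hours_sleep = mean(sleep_hrs_night, na.rm = TRUE))

# With assignment: the summary is stored in a new object for later use
nhanes_hours_sleep <- nhanes_toy %>%
  group_by(gender, work) %>%
  summarize(mean_hours_sleep = mean(sleep_hrs_night, na.rm = TRUE))

# mean_hours_sleep is now a column inside nhanes_hours_sleep
nhanes_hours_sleep$mean_hours_sleep
```

After the assignment, nhanes_hours_sleep can be reused in later code without repeating the summarize step, while nhanes_toy itself is unchanged.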
In terms of language, I would usually describe mean_hours_sleep as a column inside the dataset (or object) named nhanes_hours_sleep. That's because "variable" already has a few meanings: a single-value object in your environment, or a variable within a formula created with y ~ x. This isn't to "correct you". But our hope is that eventually you end up teaching someone else R as well, and when they have questions about "variables" you'll need to understand which context they mean.
Cheers,
Charlie
Kim Cataldo
April 10, 2022
This is helpful, thank you!