Fundamentals of R
group_by() and summarize()
This lesson is called group_by() and summarize(), part of the Fundamentals of R course.
Code shown in video
# Load Packages 
library(tidyverse)
# Import Data 
penguins <- read_csv("penguins.csv")
# group_by() and summarize() 
# summarize() becomes truly powerful when paired with group_by(),
# which enables us to perform calculations on multiple groups.
# Calculate the mean bill length for penguins on different islands.
penguins |>
  group_by(island) |>
  summarize(mean_bill_length = mean(bill_length_mm, na.rm = TRUE))
# We can use group_by() with multiple groups.
penguins |>
  group_by(island, year) |>
  summarize(mean_bill_length = mean(bill_length_mm, na.rm = TRUE))
# Another option is to use the .by argument in summarize().
penguins |>
  summarize(mean_bill_length = mean(bill_length_mm, na.rm = TRUE),
            .by = c(island, year))
# You can count the number of penguins in each group using the n() summary function.
penguins |>
  group_by(island) |>
  summarize(number_of_penguins = n())
# But a simpler way to do this is with the count() function.
penguins |>
  count(island)
# You can also use count() with multiple groups.
penguins |>
  count(island, year)
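If you do not have penguins.csv at hand, the same ideas can be tried on a small made-up data frame. The sketch below is only an illustration: the toy_penguins data and all object names are hypothetical and not part of the course materials, and it assumes dplyr 1.1.0 or later (for the .by argument) and R 4.1 or later (for the |> pipe).

```r
# Minimal sketch on a hypothetical toy data frame (not the course's penguins.csv).
library(dplyr)

toy_penguins <- tibble(
  island = c("Biscoe", "Biscoe", "Dream", "Dream", "Dream"),
  year = c(2007, 2008, 2007, 2007, 2008),
  bill_length_mm = c(40, 44, NA, 38, 42)
)

# group_by() followed by summarize() returns one row per island.
# na.rm = TRUE drops the missing value before the mean is taken.
by_island <- toy_penguins |>
  group_by(island) |>
  summarize(mean_bill_length = mean(bill_length_mm, na.rm = TRUE))

# The same calculation using the .by argument instead of group_by().
by_island_dot_by <- toy_penguins |>
  summarize(mean_bill_length = mean(bill_length_mm, na.rm = TRUE),
            .by = island)

# n() counts the rows in each group...
n_with_summarize <- toy_penguins |>
  group_by(island) |>
  summarize(n = n())

# ...and count() is a shortcut that gives the same numbers.
n_with_count <- toy_penguins |>
  count(island)

print(by_island)      # Biscoe: 42, Dream: 40
print(n_with_count)   # Biscoe: 2, Dream: 3
```

Both approaches to grouping give the same means here, and count(island) matches the group_by() plus summarize(n = n()) version, which is why the shorter forms are usually preferred for simple summaries.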
Your Turn
# Load Packages 
# Load the tidyverse package
library(tidyverse)
# Import Data 
# Download data from https://rfor.us/penguins
# Copy the data into the RStudio project
# Create a new R script file and add code to import your data
penguins <- read_csv("penguins.csv")
# group_by() and summarize() 
# Calculate the weight of the heaviest penguin on each island.
# YOUR CODE HERE
# Calculate the weight of the heaviest penguin on each island for each year.
# YOUR CODE HERE
Learn More
To learn more about the group_by() and summarize() functions, check out Chapter 3 of R for Data Science.