Fundamentals of R

# group_by() and summarize()

## Transcript

``````
# Load Packages -----------------------------------------------------------

library(tidyverse)

# Import Data -------------------------------------------------------------

penguins <- read_csv("penguins.csv")

# group_by() and summarize() ----------------------------------------------

# summarize() becomes truly powerful when paired with group_by(),
# which enables us to perform calculations on multiple groups.

# Calculate the mean bill length for penguins on different islands.

penguins |>
  group_by(island) |>
  summarize(mean_bill_length = mean(bill_length_mm, na.rm = TRUE))

# We can use group_by() with multiple grouping variables.

penguins |>
  group_by(island, year) |>
  summarize(mean_bill_length = mean(bill_length_mm, na.rm = TRUE))

# Another option is to use the .by argument in summarize().

penguins |>
  summarize(
    mean_bill_length = mean(bill_length_mm, na.rm = TRUE),
    .by = c(island, year)
  )

# You can count the number of penguins in each group using the n() summary function.

penguins |>
  group_by(island) |>
  summarize(number_of_penguins = n())

# But a simpler way to do this is with the count() function.

penguins |>
  count(island)

# You can also use count() with multiple grouping variables.

penguins |>
  count(island, year)
``````
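
The transcript shows two ways to group a summary (`group_by()` versus `.by`) and two ways to count rows (`n()` versus `count()`). The sketch below compares them on a small made-up data frame, so it runs without `penguins.csv`. The `toy_penguins` object and its values are hypothetical and only mimic the column names used above. It is a minimal illustration, assuming dplyr 1.1.0 or later (needed for `.by`), not a replacement for the lesson code.

``````r
library(dplyr)  # part of the tidyverse; .by needs dplyr 1.1.0 or later

# Hypothetical toy data standing in for penguins.csv (values are made up)
toy_penguins <- tibble(
  island = c("Biscoe", "Biscoe", "Dream", "Dream", "Dream", "Torgersen"),
  year = c(2007, 2008, 2007, 2007, 2008, 2007),
  bill_length_mm = c(40, 46, 38, NA, 42, 39)
)

# group_by() + summarize(): by default only the last grouping variable (year)
# is dropped, so the result is still grouped by island
# (dplyr may print a message saying so)
grouped_result <- toy_penguins |>
  group_by(island, year) |>
  summarize(mean_bill_length = mean(bill_length_mm, na.rm = TRUE))

group_vars(grouped_result)  # "island"

# .by groups for this one call only, so the result is ungrouped
by_result <- toy_penguins |>
  summarize(
    mean_bill_length = mean(bill_length_mm, na.rm = TRUE),
    .by = c(island, year)
  )

group_vars(by_result)  # character(0)

# n() inside a grouped summarize() and count() give the same counts;
# count() names its column n by default
n_result <- toy_penguins |>
  group_by(island) |>
  summarize(n = n())

count_result <- toy_penguins |>
  count(island)

all.equal(n_result$n, count_result$n)  # TRUE
``````

If the leftover grouping from `group_by()` is not wanted, adding `.groups = "drop"` to `summarize()` or calling `ungroup()` afterwards returns an ungrouped tibble.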

## Your Turn

``````
# Load Packages -----------------------------------------------------------

# Load the tidyverse package

library(tidyverse)

# Import Data -------------------------------------------------------------

# Download data from https://rfor.us/penguins
# Copy the data into the RStudio project
# Create a new R script file and add code to import your data

penguins <- read_csv("penguins.csv")

# group_by() and summarize() ----------------------------------------------

# Calculate the weight of the heaviest penguin on each island.

# YOUR CODE HERE

# Calculate the weight of the heaviest penguin on each island for each year.

# YOUR CODE HERE
``````
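
As a hint for the exercises above, the sketch below shows how `max()` behaves inside a grouped `summarize()` when a group contains a missing value. It deliberately uses a hypothetical data frame: `toy_weights`, `site`, and `weight_g` are made-up names and values, not columns from `penguins.csv`, so the pattern still has to be adapted to the real data.

``````r
library(dplyr)  # part of the tidyverse

# Hypothetical toy data; names and values are made up for illustration
toy_weights <- tibble(
  site = c("A", "A", "B", "B"),
  weight_g = c(3500, 4200, NA, 3900)
)

heaviest <- toy_weights |>
  group_by(site) |>
  summarize(
    # Without na.rm, any NA in a group makes that group's maximum NA
    max_weight_with_na = max(weight_g),
    # na.rm = TRUE drops missing values before taking the maximum
    max_weight = max(weight_g, na.rm = TRUE)
  )

heaviest
``````

Site B has one missing weight, so only the `na.rm = TRUE` column gives a usable maximum for it.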

## Learn More

To learn more about the `group_by()` and `summarize()` functions, check out Chapter 3, "Data transformation", of *R for Data Science* (2nd edition).

