Fundamentals of R

Create a New Data Frame

View code shown in video
# Load Packages -----------------------------------------------------------

library(tidyverse)

# Import Data -------------------------------------------------------------

penguins <- read_csv("penguins.csv")

# Create a New Data Frame -------------------------------------------------

# Running a pipeline without assigning it simply displays the result

penguins |> 
  group_by(island, year) |> 
  summarize(mean_bill_length = mean(bill_length_mm, na.rm = TRUE)) |> 
  arrange(mean_bill_length)

# If we want to save the result, we need to use the assignment operator

# Most people use the left-hand assignment operator as follows:

penguin_weight_by_island <- penguins |> 
  group_by(island, year) |> 
  summarize(mean_bill_length = mean(bill_length_mm, na.rm = TRUE)) |> 
  arrange(mean_bill_length)

# You can also use the right-hand assignment operator as follows:

penguins |> 
  group_by(island, year) |> 
  summarize(mean_bill_length = mean(bill_length_mm, na.rm = TRUE)) |> 
  arrange(mean_bill_length) -> penguin_weight_by_island_v2
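A minimal sketch of why the assignment matters: once the result of a pipeline is stored in an object, later code can reuse it. The small `penguins_demo` tibble and the object names `bill_length_left` and `bill_length_right` are made up for illustration, so the sketch runs without `penguins.csv`. It loads only dplyr, which is part of the tidyverse.

```r
library(dplyr)

# Hypothetical stand-in for the penguins data (values invented for illustration)
penguins_demo <- tibble(
  island = c("Biscoe", "Biscoe", "Dream", "Dream", "Torgersen"),
  year = c(2007, 2008, 2007, 2007, 2008),
  bill_length_mm = c(40, 46, NA, 38, 42)
)

# Leftward assignment: the result is stored in an object instead of being printed
bill_length_left <- penguins_demo |>
  group_by(island, year) |>
  summarize(mean_bill_length = mean(bill_length_mm, na.rm = TRUE)) |>
  arrange(mean_bill_length)

# Rightward assignment: the same pipeline, with the object name at the end
penguins_demo |>
  group_by(island, year) |>
  summarize(mean_bill_length = mean(bill_length_mm, na.rm = TRUE)) |>
  arrange(mean_bill_length) -> bill_length_right

# Both objects hold the same data frame
identical(bill_length_left, bill_length_right)

# The saved object can be used in later steps
bill_length_left |>
  filter(mean_bill_length > 40)
```

Either operator produces the same object; the choice only changes where the name appears when reading the code.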

Your Turn

# Load Packages -----------------------------------------------------------

# Load the tidyverse package

library(tidyverse)

# Import Data -------------------------------------------------------------

# Download data from https://rfor.us/penguins
# Copy the data into the RStudio project
# Create a new R script file and add code to import your data

penguins <- read_csv("penguins.csv")
			
# Create a new data frame -------------------------------------------------

# Take the pipeline that you just created and copy it below
# Then assign the result of the pipeline to an object called penguin_body_mass_by_sex

# YOUR CODE HERE

Have any questions? Put them below and we will help you out!

Da'Shon Carr • March 19, 2025

I went a step further and rounded my average to drop the decimals (I noticed some decimal places in my answer). I used mutate and round to change it, but is there an easier or simpler way to format it?

Here is my example code:

penguins |> 
  filter(island == "Biscoe") |> 
  drop_na(body_mass_g, sex) |> 
  group_by(sex) |>
  summarize(mean_body_mass = mean(body_mass_g)) |> 
  arrange(desc(mean_body_mass)) |> 
  mutate(across(where(is.numeric), round, 0)) -> penguin_2

Gracielle Higino Coach • March 20, 2025

Hi Da'Shon! This is pretty much as simple as it gets. In the live session today we learned how to use the {scales} package for similar things. One other option I could think of, if you don't want to round but just get rid of the decimals, is to use the as.integer() function to convert your numbers to the integer type, which drops the decimal part instead of rounding. Not the best approach, but it could be useful.

penguins |> 
  filter(island == "Biscoe") |> 
  drop_na(body_mass_g, sex) |> 
  group_by(sex) |>
  summarize(mean_body_mass = mean(body_mass_g)) |> 
  arrange(desc(mean_body_mass))  |> 
  mutate(mean_body_mass = as.integer(mean_body_mass))

Note that the rounding functions in R behave differently from one another. There's a very nice blog post about this here: https://www.spsanderson.com/steveondata/posts/2024-12-31/

David has also posted a video about rounding here: https://rfortherestofus.com/2024/05/r-rounding-methods
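
To make the difference between rounding and dropping decimals concrete, here is a small base R sketch. The input values and object names are made up for illustration; they are not from the penguins data.

```r
# Made-up values for illustration
x <- c(2.5, 3.5, 4716.9, -2.9)

# round() goes to the nearest whole number; exact halves go to the even digit
rounded <- round(x)
rounded      # 2 4 4717 -3

# as.integer() drops the decimal part (truncates toward zero)
# and returns an integer vector
truncated <- as.integer(x)
truncated    # 2 3 4716 -2

# floor() always rounds down, so it differs from truncation for negative numbers
floored <- floor(x)
floored      # 2 3 4716 -3
```

For positive values such as body mass, `as.integer()` and `floor()` give the same whole numbers, while `round()` can give a value one higher.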