mutate()
This lesson is called mutate(), part of the R in 3 Months (Spring 2025) course.
View code shown in video
# Load Packages -----------------------------------------------------------
library(tidyverse)
# Import Data -------------------------------------------------------------
penguins <- read_csv("penguins.csv")
# mutate() ----------------------------------------------------------------
# We use mutate() to make new variables or change existing ones.
# We can use mutate() in three ways.
# 1. Create a new variable with a specific value:
penguins |>
  mutate(continent = "Antarctica")
# 2. Create a new variable based on other variables:
penguins |>
  mutate(body_mass_lbs = body_mass_g / 453.6)
# 3. Change an existing variable:
penguins |>
  mutate(bill_length_mm = bill_length_mm + 1)
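The three uses above can also be combined in a single mutate() call. The sketch below is illustrative only: it swaps the imported penguins.csv for a small made-up tibble (penguins_sample, with invented values) so that it runs on its own, and it assigns the result to a new object (penguins_mutated, a hypothetical name) so the new columns are kept.

```r
library(dplyr)
library(tibble)

# Made-up stand-in for penguins.csv (values are invented for illustration)
penguins_sample <- tibble(
  species = c("Adelie", "Gentoo"),
  bill_length_mm = c(39.1, 46.1),
  body_mass_g = c(3750, 5000)
)

# One mutate() call can hold several comma-separated expressions
penguins_mutated <- penguins_sample |>
  mutate(
    continent = "Antarctica",             # 1. new variable, same value in every row
    body_mass_lbs = body_mass_g / 453.6,  # 2. new variable based on another variable
    bill_length_mm = bill_length_mm + 1   # 3. existing variable overwritten
  )

penguins_mutated
```

New variables are added as the last columns, while a changed variable keeps its original position.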
Your Turn
# Load Packages -----------------------------------------------------------
# Load the tidyverse package
library(tidyverse)
# Import Data -------------------------------------------------------------
# Download data from https://rfor.us/penguins
# Copy the data into the RStudio project
# Create a new R script file and add code to import your data
penguins <- read_csv("penguins.csv")
# mutate() ----------------------------------------------------------------
# 1. Use mutate() to create a variable called observation_station and set its value to "Palmer"
# YOUR CODE HERE
# 2. Create a new variable based on other variables:
# YOUR CODE HERE
# 3. Change an existing variable
# YOUR CODE HERE
Learn More
To learn more about the mutate() function, check out Chapter 3 of R for Data Science.
Have any questions? Put them below and we will help you out!
Rob Schoen • October 13, 2023
There are some things I'd like to do for part 3 of the assignment (change an existing variable) that I don't know how to do. For example, I'd like to change the "sex" variable to be called "female" and then change the values from female-> 1, male -> 0, and NA -> NA. How can I find the functions that will enable me to do that?
Libby Heeren Coach • October 14, 2023
Hey, Rob! The case_when() function will allow you to change values from female-> 1, male -> 0, and NA -> NA and the rename() function will allow you to rename a column. You'll get to these functions in time throughout the course once you get to the advanced data wrangling sections, but if you'd like to check out an older video clip (videos are in the process of being replaced) you can see one about case_when here.
If you'd like to try out the rename() function, try adding a line to your dplyr code like this: penguins |> mutate(continent = "Antarctica") |> rename(female = sex)
The syntax works like this: rename(new_column_name = old_column_name)
Please feel free to message me on Discord with any questions!
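A minimal sketch of the recoding Rob describes, using case_when() inside mutate() followed by rename(). The penguins_sample tibble is made up for illustration and stands in for the course data. Rows that match no condition (here, the missing value) are left as NA by case_when().

```r
library(dplyr)
library(tibble)

# Made-up stand-in for the penguins data
penguins_sample <- tibble(
  species = c("Adelie", "Gentoo", "Chinstrap"),
  sex = c("female", "male", NA)
)

penguins_recoded <- penguins_sample |>
  mutate(
    sex = case_when(
      sex == "female" ~ 1,  # female becomes 1
      sex == "male" ~ 0     # male becomes 0; anything unmatched stays NA
    )
  ) |>
  rename(female = sex)      # rename(new_column_name = old_column_name)

penguins_recoded
```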
Gabby Bachhuber • March 19, 2024
Is it not possible to mutate categorical variable names? I tried to mutate "species", but perhaps that isn't possible (it generated an error). I think I need to use rename() instead, as noted below?
David Keyes Founder • March 19, 2024
Yes, if you just want to rename a variable (of any type), rename() is your best bet.
Charles Obiorah • March 25, 2024
Hi David, my progress has been slowed down by how my video plays and stops. It is no longer flowing seamlessly and I wonder which settings I need to change. I tried another network provider and the problem persists. Next, I tried the assignment of creating a new variable and changing an existing one. While I could see the new variable in a new column, Antarctica, I could not see the new column body_mass_lbs as it is in your video. Note that I saw this on the Console:
David Keyes Founder • March 26, 2024
Sorry about the website issues. We're working to resolve these ASAP.
To answer your question, I'd need to see your code. Please paste it in the comment.
Valliappan Muthu • April 24, 2024
Hello, I have a question
Say I have a variable “plant” which can be either “tree” or “shrub”. The data was coded in Excel or SPSS as 1 and 0 for tree and shrub.
If I want to analyse only the “1” data, that is, only the trees, what should I do?
I have data where the information is coded as 1 and 0, and I want to filter the observations containing “1”.
Valliappan Muthu • April 24, 2024
Hello. Sorry, there was an error in my code, and question number 2 was resolved when I used the code filter(plant == "1")
I still want to know about the first question
Thanks!
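A hypothetical sketch for the first question, assuming a data frame called plants (made up here) with a numeric plant column coded 1 for tree and 0 for shrub. filter() keeps only the trees, and mutate() can add a readable label. if_else() is used for the label simply because there are only two outcomes.

```r
library(dplyr)
library(tibble)

# Made-up data: plant is coded 1 = tree, 0 = shrub
plants <- tibble(
  plant = c(1, 0, 1, 0),
  height_m = c(12.5, 1.2, 8.0, 0.9)
)

# Keep only the rows coded as trees
trees_only <- plants |>
  filter(plant == 1)

# Add a text label next to the numeric code
plants_labelled <- plants |>
  mutate(plant_type = if_else(plant == 1, "tree", "shrub"))

trees_only
plants_labelled
```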
Libby Heeren Coach • April 25, 2024
Hi, Valli! All of your questions can be solved using the case_when function inside mutate. You can watch this video from week 7 which has an example of case_when and also lists resources below the video with further examples of how to use it in different ways.
An example of it in David's code during Week 7 is this:
Valliappan Muthu • April 24, 2024
Sorry, I have been coming with questions from old lessons after I started cleaning the data I work on.
Kindly suggest ways to categorise a continuous variable. For example, say I have the income of 100 individuals as a continuous variable. Now I want to analyse my parameters of interest between income of more than $1000 and less than $1000.
Can I use mutate to create categories from a pre-existing continuous variable?
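One way mutate() can derive a category from a continuous variable is sketched below. It assumes a made-up tibble (surveys) with an income column and a cut-off of 1000. The column name income_group and its labels are hypothetical, and a value of exactly 1000 is placed in the lower group here, which is a choice the question leaves open.

```r
library(dplyr)
library(tibble)

# Made-up incomes, including a missing value
surveys <- tibble(
  person_id = 1:4,
  income = c(500, 1000, 1500, NA)
)

surveys_grouped <- surveys |>
  mutate(
    income_group = case_when(
      income > 1000 ~ "More than $1000",
      income <= 1000 ~ "$1000 or less"
      # a missing income matches neither condition and stays NA
    )
  )

surveys_grouped
```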
Libby Heeren Coach • April 25, 2024
See my other response about case_when, but you can use it to create a categorical variable based on conditional statements, such as
Denny Lu • July 10, 2024
Why does the code sometimes need quotations and other times not?
David Keyes Founder • July 11, 2024
I actually discussed this in R in 3 Months. Here's a video clip from that. Let me know if this helps!
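As a rough rule of thumb (an editorial sketch, not a summary of the linked clip): quotes mark literal text, while unquoted names refer to existing objects, functions, or columns. The tibble below is made up for illustration.

```r
library(dplyr)
library(tibble)

# Made-up data
penguins_sample <- tibble(
  species = c("Adelie", "Gentoo"),
  island = c("Torgersen", "Biscoe")
)

penguins_quoted <- penguins_sample |>
  mutate(
    location = "island",     # quoted: the literal text "island" in every row
    location_copy = island   # unquoted: the values of the island column
  )

penguins_quoted

# Both in one expression: species is a column name, "Adelie" is a text value
penguins_sample |>
  filter(species == "Adelie")
```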
Imelda Akurut • September 25, 2024
Hi David, I need some help. I posted the code penguins |> mutate(observation_station="palmer") for mutate and it doesn't show in the data. I get this instead:
Gracielle Higino Coach • September 26, 2024
Hi Imelda! I might need a bit more information to give you a more definitive answer, but I suspect that you're not seeing changes in the dataset because you're not assigning the operation to an object. If you just run penguins |> mutate(observation_station="palmer"), you'll get a sample of your dataset in the console panel, with the text you've mentioned below it. However, if you run penguins <- penguins |> mutate(observation_station="palmer"), your new dataset will be stored in the penguins object, and you'll see the result in the data viewer panel. Let me know if this was helpful!
Samreen Chhabra • November 6, 2024
I keep running into the same error as the one a few others have mentioned before, and do not see the mutated rows at all!
this is the code i am using:
would love some help here!
David Keyes Founder • November 7, 2024
Can you please share the full code you're using all the way from data import to the mutate() code?
Samreen Chhabra • November 7, 2024
hi, thanks for your response! its as follows:
David Keyes Founder • November 7, 2024
Ok, I'm not seeing anything off there. I'm guessing, though, by the message you're getting, that the height_on_weight_ratio variable is there, just not showing up. Could you record a quick video showing yourself running your code and let me know when it's posted?
Samreen Chhabra • November 11, 2024
hi, i did as instructed and it says that the video is posted, thanks!
Greg Regaignon • February 20, 2025
For creating a new var based on other vars, I am getting a function error: "The pipe operator requires a function call as RHS (:5:1)" This happens for both these lines of code: penguins |> mutate(flipperL_wt_ratio = flipper_length_mm / body_mass_g)
penguins |> mutate(both_flippers = (flipper_length_mm * 2))
Greg Regaignon • February 20, 2025
OK now the second one is working: penguins |> mutate(both_flippers = (flipper_length_mm * 2)) (Not sure why, I didn't change anything, just ran it again.) But I'm still getting the same error for the first
Gracielle Higino Coach • February 20, 2025
Hi Greg! I've seen this happen before when there were extra pipes in the script. It might be that you ran a code block with an extra pipe somewhere and R kept waiting for the rest of the code. Then, even if you run just the lines that you copied here (which contain no extra pipes), R will complain to you. 😅
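One way a stray pipe can trigger this message, assuming that is what happened here (the thread doesn't confirm it): a |> left at the end of one block makes R read the next line as its right-hand side. Because such code cannot be run directly, the sketch below stores it as text and checks it with parse() inside try(). The column names x and y are placeholders, and the native pipe requires R 4.1 or later.

```r
# Code with a dangling pipe at the end of the first block, kept as text
code_with_extra_pipe <- '
penguins |>
  mutate(x = 1) |>

penguins |>
  mutate(y = 2)
'

# R joins the two blocks, so the bare name `penguins` ends up on the
# right-hand side of a pipe, where a function call is required
parse_result <- try(parse(text = code_with_extra_pipe), silent = TRUE)

# TRUE means the text could not be parsed as valid R code
inherits(parse_result, "try-error")
```

Removing the trailing |> after the first mutate() call lets both blocks parse as separate statements.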
Queeneth Onwuka • March 19, 2025
how do i see the observation_station on the rows?
Gracielle Higino Coach • March 20, 2025
Hi Queeneth! There are a few options to do that. The first one is to create a new object with your updated dataset with the new variable and view it on the data panel. Another option is to pipe your mutate operation into a print(width=Inf) to show all the columns in the console. Alternatively, if you want to see only that column, you can use select() at the end of your mutate operation. Let us know if this answers your question!
Heather Worker • March 19, 2025
How do I see the actual new variable as a column in my console? I understand that (at least in this exercise as I understand it) we are not altering our original data frame so I know I would not see the newly created variables there but how do I see basically an extended view of my columns. I don't need to see more rows beyond the 10 that are shown.
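A minimal sketch of three ways to see a newly created column, using a made-up tibble in place of the course data: saving the result to an object, printing every column with print(width = Inf), or keeping only the new column with select(). The object names penguins_with_station and station_only are hypothetical.

```r
library(dplyr)
library(tibble)

# Made-up stand-in for the penguins data (the real data has more columns,
# which is why the console hides some of them by default)
penguins_sample <- tibble(
  species = c("Adelie", "Gentoo"),
  body_mass_g = c(3750, 5000)
)

# Option 1: save the result, then inspect the object (for example in the data viewer)
penguins_with_station <- penguins_sample |>
  mutate(observation_station = "Palmer")

# Option 2: print all columns in the console, however wide the data is
penguins_sample |>
  mutate(observation_station = "Palmer") |>
  print(width = Inf)

# Option 3: keep only the new column
station_only <- penguins_sample |>
  mutate(observation_station = "Palmer") |>
  select(observation_station)

station_only
```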
Gracielle Higino Coach • March 20, 2025
Hi Heather! There are a few options to do that. The first one is to create a new object with your updated dataset with the new variable and view it on the data panel. Another option is to pipe your mutate operation into a print(width=Inf) to show all the columns in the console. Let us know if this answers your question!
gene trevino • March 24, 2025
When I run my code, and the solutions code, I get the following error:
Caused by error in bill_length_mm + 1: ! non-numeric argument to binary operator
Gracielle Higino Coach • March 25, 2025
Hi Gene! We might need to see your code to debug that. Can you copy it here for us?
gene trevino • March 25, 2025
When I write the following code: penguins |> mutate(bill_length_mm = bill_length_mm + 1)
I get this error: Caused by error in bill_length_mm + 1: ! non-numeric argument to binary operator
Gracielle Higino Coach • March 25, 2025
There might be something before that that's causing the error. When you inspect your dataset, is bill_length_mm numeric? Do you see numbers aligned to the right side in this column?
gene trevino • March 27, 2025
Using glimpse(penguins) to view the data, all the variables, except year, are character variables.
Gracielle Higino Coach • March 27, 2025
That's very odd... Could you please share your whole code?
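The cause in this case is not confirmed. If a column really has been imported as character, one possible workaround is to convert it with as.numeric() inside mutate() before doing arithmetic. The sketch below uses a made-up tibble (penguins_chr) to mimic that situation. Fixing the import step would usually be preferable.

```r
library(dplyr)
library(tibble)

# Made-up data where the measurements were read in as text
penguins_chr <- tibble(
  species = c("Adelie", "Gentoo"),
  bill_length_mm = c("39.1", "46.1")
)

# glimpse() shows each column's type; <chr> means character
glimpse(penguins_chr)

penguins_fixed <- penguins_chr |>
  mutate(bill_length_mm = as.numeric(bill_length_mm)) |>  # text to numbers
  mutate(bill_length_mm = bill_length_mm + 1)             # arithmetic now works

penguins_fixed
```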