arrange()
This lesson is called arrange(), part of the R in 3 Months (Fall 2025) course.
View code shown in video
# Load Packages -----------------------------------------------------------
library(tidyverse)
# Import Data -------------------------------------------------------------
penguins <- read_csv("penguins.csv")
# arrange() ---------------------------------------------------------------
# With arrange(), we can reorder rows in a data frame based on the values
# of one or more variables.
# R arranges in ascending order by default.
penguins |>
  arrange(bill_length_mm)
# We can also arrange in descending order using desc().
penguins |>
  arrange(desc(bill_length_mm))
# We often use arrange() at the end of pipelines to display things in order.
penguins |>
  group_by(island, year) |>
  summarize(mean_bill_length = mean(bill_length_mm, na.rm = TRUE)) |>
  arrange(desc(mean_bill_length))
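The code comments above say arrange() can order rows by one or more variables, but every example sorts by a single column. Here is a minimal sketch of a two-column sort. The `toy_penguins` tibble is hypothetical stand-in data, not the course's penguins.csv, so the block runs on its own.

```r
library(tidyverse)

# Hypothetical toy data standing in for penguins.csv
toy_penguins <- tibble(
  island = c("Dream", "Biscoe", "Dream", "Biscoe"),
  body_mass_g = c(3700, 5000, 4100, 4300)
)

# Sort by island (A to Z) first, then by body mass (heaviest first)
# within each island. Later columns only break ties in earlier ones.
sorted <- toy_penguins |>
  arrange(island, desc(body_mass_g))

sorted
```

Each column can be wrapped in desc() independently, so ascending and descending orders can be mixed in one call.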
Your Turn
# Load Packages -----------------------------------------------------------
# Load the tidyverse package
library(tidyverse)
# Import Data -------------------------------------------------------------
# Download data from https://rfor.us/penguins
# Copy the data into the RStudio project
# Create a new R script file and add code to import your data
penguins <- read_csv("penguins.csv")
# arrange() ---------------------------------------------------------------
# Use arrange() to display the penguins data frame in order by body mass
# YOUR CODE HERE
# Now display the penguins data in descending order by body mass
# YOUR CODE HERE
# Create a pipeline that does the following:
# 1. Filters to only keep penguins on Biscoe island
# 2. Drops any rows with NA values for the body_mass_g or sex variables
# 3. Calculates the average body mass by sex
# 4. Displays the result in descending order by average body mass
# YOUR CODE HERE
Learn More
To learn more about the arrange() function, check out Chapter 3 of R for Data Science.
Have any questions? Put them below and we will help you out!
Brian Slattery • September 20, 2023
I tried using arrange() on variables with string data (island, sex, etc), and it looks like it's sorting by alphabetic order. Is that an accepted usage or is there a different function that's normally used for sorting rows with strings?
Also, is there a corresponding function to desc() that makes explicit that it's sorting in ascending order? I couldn't find one by googling. I'm imagining from a readability standpoint it might be nice to make that clear if there are ascending and descending arranges all mixed together. Or, is that just something that I would write a comment to make clear if needed?
Gracielle Higino Coach • September 20, 2023
Hi Brian! Great questions!
Yes, using arrange() to sort data alphabetically is very common and recommended. [= This should also help you find typos and hidden characters!
sort() in base R works the same way, and it has an argument you can use to make it explicit whether you are sorting in ascending or descending order. If you really want to make explicit how you are arranging your data, a trick could be to always use desc() and add a negative sign before it when you're sorting in ascending order.
I hope this helps! Ping me on Discord if you want to chat more! =D
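The code that originally accompanied this reply did not survive on the page, so what follows is only a sketch of the ideas it describes. It is not the coach's original example, and the `toy` tibble is hypothetical data made up for illustration.

```r
library(tidyverse)

# Hypothetical toy data for illustration
toy <- tibble(
  island = c("Torgersen", "Biscoe", "Dream"),
  bill_length_mm = c(39.1, 46.5, 37.2)
)

# Character columns are sorted alphabetically
by_island <- toy |>
  arrange(island)

# desc() reverses the order, for strings as well as numbers
by_island_desc <- toy |>
  arrange(desc(island))

# The "negative sign" trick: -desc() undoes desc(), which spells out
# that this sort is ascending
asc_explicit <- toy |>
  arrange(-desc(bill_length_mm))

# Base R sort() works on a vector and has an explicit decreasing argument
sort(toy$island, decreasing = FALSE)
```

A plain comment such as `# ascending` above an ordinary arrange() call achieves the same readability goal with less indirection.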
Jessica France • September 20, 2023
Hi. I answered the last question using this code:
penguins |>
  filter(island == "Biscoe", !is.na(body_mass_g) | is.na(sex)) |>
  group_by(sex) |>
  summarise(mean_weight = mean(body_mass_g)) |>
  arrange(desc(mean_weight))
And I got this output:
A tibble: 3 × 2
sex mean_weight
I would like to verify whether I did anything wrong. I do not know if I am supposed to see the 'NA' output as well.
Jessica France • September 20, 2023
I do not know if the output I copied on here is displaying. After submitting the comment, I do not see it. Kindly let me know if it can be seen on your end. Thanks.
Gracielle Higino Coach • September 21, 2023
Hi Jessica! Don't worry, we can see the formatting on the back end!
I understand your line of thought, but what you are coding translates to something like "take penguins, filter only the rows to which the column island is equal to 'Biscoe', AND the column body_mass_g is not NA, OR the column sex is NA". This final bit doesn't really do anything to your data because the logical operator allows R to include the NAs. So in the end you get the NAs in the sex column because you told R it could include them.
Alternatively, you should use the drop_na() function after you have already filtered by island (for clarity), and proceed with the grouping and summary.
Feel free to follow up on Discord if it's not clear! [=
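To make the difference concrete, here is a small sketch that compares the condition from the question with drop_na(). The `toy` tibble is hypothetical data chosen so that each kind of missing value appears once.

```r
library(tidyverse)

# Hypothetical toy data: one Biscoe row with sex missing, one with both
# sex and body mass missing, and one row from another island
toy <- tibble(
  island = c("Biscoe", "Biscoe", "Biscoe", "Biscoe", "Dream"),
  sex = c("male", "female", NA, NA, "male"),
  body_mass_g = c(5000, 4300, 4800, NA, 3900)
)

# Condition from the question: keeps a row if body mass is present
# OR sex is missing, so rows with missing values still pass
kept_with_or <- toy |>
  filter(island == "Biscoe", !is.na(body_mass_g) | is.na(sex))

# drop_na() removes every row with an NA in any of the listed columns
kept_with_drop_na <- toy |>
  filter(island == "Biscoe") |>
  drop_na(body_mass_g, sex)

nrow(kept_with_or)       # all four Biscoe rows survive
nrow(kept_with_drop_na)  # only the two complete Biscoe rows survive
```

The filter() version would give the same result as drop_na() if both conditions were negated and combined with AND: `!is.na(body_mass_g), !is.na(sex)`.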
Maria Dougherty • April 26, 2024
Hi Grace! How do I join the discord? Thank you!
Libby Heeren Coach • April 26, 2024
Hello, Maria! The Discord server you see mentioned here is for members of the R in 3 Months program! Sorry for any confusion!
gene trevino • May 22, 2025
When I run the code and the code in the solutions,
penguins |>
  filter(island == "Biscoe") |>
  drop_na(body_mass_g, sex) |>
  group_by(sex) |>
  summarize(avg_body_mass = mean(body_mass_g)) |>
  arrange(desc(avg_body_mass))
I get the following:
Warning message:
There were 3 warnings in summarize().
The first warning was:
ℹ In argument: avg_body_mass = mean(body_mass_g).
ℹ In group 1: sex = "NA".
Caused by warning in mean.default():
! argument is not numeric or logical: returning NA
ℹ Run dplyr::last_dplyr_warnings() to see the 2 remaining warnings.
Gracielle Higino Coach • May 23, 2025
Hi Gene! I can't reproduce the warning with the code you provided, but it seems like R is misunderstanding some of your variables. Maybe you have objects with the same name in your environment. Do you get a 2x2 tibble like this?
If so, you should ignore the warning. If not, then try refreshing your session without saving the history or RData.
Let us know if the message persists!
Kaela Scott • September 30, 2025
If I wanted to arrange and remove NAs when I arranged, can I add that in as an argument or do I need to do that first?
penguins |> arrange(body_mass_g, rm.na = TRUE)
OR
penguins |> drop_na(body_mass_g) |> arrange(body_mass_g)
Gracielle Higino Coach • October 2, 2025
Hi Kaela! That's a very interesting question!
In functions such as mean(), the na.rm argument ignores NAs when doing an operation; it doesn't actually remove them from the dataset. arrange() has no such argument and simply sorts NAs to the end, which is why you end up with the same number of rows as the original data if you run arrange(body_mass_g, na.rm = TRUE).
On the other hand, drop_na() removes the NAs from the dataset, and that's why you end up with fewer rows when running drop_na(body_mass_g) |> arrange(body_mass_g).
I hope that helps! Let me know what other questions you have!
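A short sketch of the row counts described above, using a hypothetical one-column `toy` tibble. It relies on the documented dplyr behavior that arrange() keeps rows with missing values and places them last, even inside desc().

```r
library(tidyverse)

# Hypothetical toy data with one missing body mass
toy <- tibble(body_mass_g = c(4300, NA, 3700, 5000))

# arrange() keeps the NA row and puts it at the end
sorted_all <- toy |>
  arrange(body_mass_g)

# The NA row is still last when sorting in descending order
sorted_desc <- toy |>
  arrange(desc(body_mass_g))

# drop_na() first, then arrange(): the NA row is gone
sorted_complete <- toy |>
  drop_na(body_mass_g) |>
  arrange(body_mass_g)

nrow(sorted_all)       # same number of rows as toy
nrow(sorted_complete)  # one row fewer
```

So the second pipeline in the question is the one to use when the missing rows should not appear in the output at all.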
Jessica Purser • October 17, 2025
Hi! It would be helpful if you also put the results of the functions in the solution. I'm left wondering if my code, which ran, gave me the correct answers. For the last question, I used the code:
penguins |>
  filter(island == "Biscoe") |>
  drop_na(body_mass_g, sex) |>
  summarize(mass_sex = mean(body_mass_g), .by = c(sex)) |>
  arrange(desc(mass_sex))
In the console, my answers were male 5105 and female 4319, but I'm still like, ah, I don't know if I did it right.
Gracielle Higino Coach • October 19, 2025
Hi Jessica! I totally get it, and it does sometimes take time for us to trust R! Overall, if the process is exactly the same, it should yield the same results. This means that if the code in the solutions is the same as the code you are using, both will produce the same numbers. The code is what's important here, and getting the same numbers is not a guarantee that you got the process right.
But I do understand the importance of "testing the machine". One thing I would do when I first started learning R, 15 years ago, would be to do a series of stats in Excel (and some by hand) and then compare the result with what I'd get with R. Excel often gives a slightly different result because of how it deals with decimals and rounding, but overall the results were the same. [=
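The same kind of cross-check can be done inside R on data small enough to verify by hand. The sketch below uses a hypothetical `toy` tibble and compares the group_by() approach with the `.by` argument used in the question. `.by` assumes dplyr 1.1.0 or later.

```r
library(tidyverse)

# Hypothetical toy data, small enough to average by hand
toy <- tibble(
  sex = c("male", "female", "male", "female"),
  body_mass_g = c(5000, 4300, 5200, 4500)
)

# Approach 1: group_by() followed by summarize()
with_group_by <- toy |>
  group_by(sex) |>
  summarize(avg_body_mass = mean(body_mass_g)) |>
  arrange(desc(avg_body_mass))

# Approach 2: per-operation grouping with .by
with_by <- toy |>
  summarize(avg_body_mass = mean(body_mass_g), .by = sex) |>
  arrange(desc(avg_body_mass))

# By hand: the male average is (5000 + 5200) / 2
male_by_hand <- (5000 + 5200) / 2

with_group_by
with_by
male_by_hand
```

If both pipelines and the hand calculation agree on toy data, that gives some confidence the same pipeline is doing what you expect on the full dataset.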