arrange()
This lesson is called arrange(), part of the Fundamentals of R course.
View code shown in video
# Load Packages -----------------------------------------------------------
library(tidyverse)
# Import Data -------------------------------------------------------------
penguins <- read_csv("penguins.csv")
# arrange() ---------------------------------------------------------------
# With arrange(), we can reorder rows in a data frame based on the values
# of one or more variables.
# arrange() sorts in ascending order by default.
penguins |>
  arrange(bill_length_mm)
# We can also arrange in descending order using desc().
penguins |>
  arrange(desc(bill_length_mm))
# We often use arrange() at the end of pipelines to display things in order.
penguins |>
  group_by(island, year) |>
  summarize(mean_bill_length = mean(bill_length_mm, na.rm = TRUE)) |>
  arrange(desc(mean_bill_length))
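The comments above say arrange() can reorder rows by "one or more variables", but every example sorts by a single column. Here is a minimal sketch of sorting by two columns at once. The `toy_penguins` tibble and its values are invented for illustration and are not taken from penguins.csv.

```r
library(tidyverse)

# Hypothetical toy data standing in for penguins.csv (values are made up)
toy_penguins <- tibble(
  island = c("Dream", "Biscoe", "Dream", "Biscoe"),
  bill_length_mm = c(39.5, 45.2, 50.1, 37.8)
)

# Sort by island first (ascending, so alphabetical), then order the rows
# within each island by bill length, longest first
sorted <- toy_penguins |>
  arrange(island, desc(bill_length_mm))

sorted
```

The order of the arguments matters: the first variable is the main sort key, and each later variable only breaks ties left by the ones before it.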
Your Turn
# Load Packages -----------------------------------------------------------
# Load the tidyverse package
library(tidyverse)
# Import Data -------------------------------------------------------------
# Download data from https://rfor.us/penguins
# Copy the data into the RStudio project
# Create a new R script file and add code to import your data
penguins <- read_csv("penguins.csv")
# arrange() ---------------------------------------------------------------
# Use arrange() to display the penguins data frame in order by body mass
# YOUR CODE HERE
# Now display the penguins data in descending order by body mass
# YOUR CODE HERE
# Create a pipeline that does the following:
# 1. Filters to only keep penguins on Biscoe island
# 2. Drops any rows with NA values for the body_mass_g or sex variables
# 3. Calculates the average body mass by sex
# 4. Displays the result in descending order by average body mass
# YOUR CODE HERE
Learn More
To learn more about the arrange() function, check out Chapter 3 of R for Data Science.
Have any questions? Put them below and we will help you out!
Brian Slattery • September 19, 2023
I tried using arrange() on variables with string data (island, sex, etc), and it looks like it's sorting by alphabetic order. Is that an accepted usage or is there a different function that's normally used for sorting rows with strings?
Also, is there a corresponding function to desc() that makes explicit that it's sorting in ascending order? I couldn't find one by googling. I'm imagining from a readability standpoint it might be nice to make that clear if there are ascending and descending arranges all mixed together. Or, is that just something that I would write a comment to make clear if needed?
Gracielle Higino Coach • September 20, 2023
Hi Brian! Great questions!
Yes, using arrange() to sort data alphabetically is very common and recommended. [= This should also help you find typos and hidden characters! sort() in base R works the same way, and it has an argument you can use to make it explicit if you are sorting in ascending or descending order. If you really want to make it explicit how you are arranging your data, a trick could be to always use desc() and add a negative sign before that if you're sorting by ascending order:
I hope this helps! Ping me on Discord if you want to chat more! =D
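The snippet that followed the colon in the reply above did not survive on this page. The sketch below shows one way the described trick could look; it may not be the exact code the coach posted. The `toy_penguins` tibble is invented for illustration.

```r
library(tidyverse)

# Hypothetical toy data (values are made up)
toy_penguins <- tibble(
  island = c("Dream", "Biscoe", "Torgersen"),
  body_mass_g = c(3700, 5100, 3400)
)

# Default behaviour: ascending order
asc_default <- toy_penguins |>
  arrange(body_mass_g)

# Explicit version: desc() reverses the order and the minus sign
# reverses it back, so the result is ascending again
asc_explicit <- toy_penguins |>
  arrange(-desc(body_mass_g))

# The same trick on a string column gives alphabetical order
by_island <- toy_penguins |>
  arrange(-desc(island))

# Base R sort() works on a vector rather than a data frame;
# its decreasing argument states the direction explicitly
sort(toy_penguins$island, decreasing = FALSE)
sort(toy_penguins$island, decreasing = TRUE)
```

Whether `-desc()` reads more clearly than a plain `arrange()` with a comment is a matter of taste; both produce the same ordering here.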
Jessica France • September 20, 2023
Hi. I answered the last question using this code:
penguins |>
  filter(island == "Biscoe", !is.na(body_mass_g) | is.na(sex)) |>
  group_by(sex) |>
  summarise(mean_weight = mean(body_mass_g)) |>
  arrange(desc(mean_weight))
And I got this output:
A tibble: 3 × 2
sex mean_weight
I would like to verify whether I did anything wrong. I do not know if I am supposed to see the 'NA' output as well.
Jessica France • September 20, 2023
I do not know if the output I copied on here is displaying. After submitting the comment, I do not see it. Kindly let me know if it can be seen on your end. Thanks.
Gracielle Higino Coach • September 20, 2023
Hi Jessica! Don't worry, we can see the formatting on the back end!
I understand your line of thought, but what you are coding translates to something like "take penguins, keep only the rows where the column island is equal to 'Biscoe', AND where either the column body_mass_g is not NA OR the column sex is NA". That final bit doesn't remove anything from your data: the OR operator (|) lets a row through as soon as one side is true, so it actually allows R to include the rows where sex is NA. So in the end you get the NAs in the sex column because you told R it could include them.
Alternatively, you should use the drop_na() function after you have already filtered by island (for clarity), and proceed with the grouping and summary.
Feel free to follow up on Discord if it's not clear! [=
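As a rough sketch of the difference described above, the code below runs both versions on a small made-up tibble. `toy_penguins`, its values, and the column name `mean_body_mass` are assumptions for illustration; the real penguins.csv will give different numbers.

```r
library(tidyverse)

# Hypothetical toy data (values are made up)
toy_penguins <- tibble(
  island = c("Biscoe", "Biscoe", "Biscoe", "Biscoe", "Dream"),
  sex = c("male", "female", NA, "male", "female"),
  body_mass_g = c(5500, 4700, 4900, NA, 3500)
)

# The condition from the question: a row is kept when body mass is present
# OR sex is missing, so the row with a missing sex gets through
with_or <- toy_penguins |>
  filter(island == "Biscoe", !is.na(body_mass_g) | is.na(sex))

# Filter by island first, then drop_na() removes every row that has an NA
# in either of the listed columns
result <- toy_penguins |>
  filter(island == "Biscoe") |>
  drop_na(body_mass_g, sex) |>
  group_by(sex) |>
  summarize(mean_body_mass = mean(body_mass_g)) |>
  arrange(desc(mean_body_mass))

result
```

On this toy data, `with_or` still contains the row where sex is NA, while `result` has one row per sex and no NA group.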
Maria Dougherty • April 26, 2024
Hi Grace! How do I join the Discord? Thank you!
Libby Heeren Coach • April 26, 2024
Hello, Maria! The Discord server you see mentioned here is for members of the R in 3 Months program! Sorry for any confusion!