View code shown in video
# Load Packages -----------------------------------------------------------

library(tidyverse)

# Import Data -------------------------------------------------------------

penguins <- read_csv("penguins.csv")

# arrange() ---------------------------------------------------------------

# With arrange(), we can reorder rows in a data frame based on the values 
# of one or more variables. 
# arrange() sorts in ascending order by default.

penguins |> 
  arrange(bill_length_mm)

# We can also arrange in descending order using desc().

penguins |>  
  arrange(desc(bill_length_mm))

# We often use arrange() at the end of pipelines to display things in order.

penguins |> 
  group_by(island, year) |> 
  summarize(mean_bill_length = mean(bill_length_mm, na.rm = TRUE)) |> 
  arrange(desc(mean_bill_length))
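
The comments above mention arranging by "one or more variables", but each example sorts by a single one. As a minimal sketch of sorting by two variables, assuming a small made-up data frame (`toy_penguins` and its values are invented for illustration and are not the course data): arrange() sorts by the first variable and uses the second to order rows within each value of the first. Missing values are placed last, even inside desc().

```r
library(dplyr)

# Toy data invented for illustration; not the real penguins dataset.
toy_penguins <- tibble(
  island = c("Dream", "Biscoe", "Dream", "Biscoe"),
  bill_length_mm = c(40.1, 45.2, 38.5, NA)
)

# Sort by island (A to Z), then by bill length (longest first) within each island.
sorted <- toy_penguins |>
  arrange(island, desc(bill_length_mm))

sorted
```

Here the two Biscoe rows come first (45.2, then the NA), followed by the two Dream rows (40.1, then 38.5).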

Your Turn

# Load Packages -----------------------------------------------------------

# Load the tidyverse package

library(tidyverse)

# Import Data -------------------------------------------------------------

# Download data from https://rfor.us/penguins
# Copy the data into the RStudio project
# Create a new R script file and add code to import your data

penguins <- read_csv("penguins.csv")
			
# arrange() ---------------------------------------------------------------

# Use arrange() to display the penguins data frame in order by body mass

# YOUR CODE HERE

# Now display the penguins data in descending order by body mass

# YOUR CODE HERE

# Create a pipeline that does the following:
# 1. Filters to only keep penguins on Biscoe island
# 2. Drops any rows with NA values for the body_mass_g or sex variables
# 3. Calculates the average body mass by sex
# 4. Displays the result in descending order by average body mass

# YOUR CODE HERE

Learn More

To learn more about the arrange() function, check out Chapter 3 of R for Data Science.

Have any questions? Put them below and we will help you out!

Brian Slattery

September 20, 2023

I tried using arrange() on variables with string data (island, sex, etc.), and it looks like it sorts in alphabetical order. Is that an accepted usage, or is there a different function that's normally used for sorting rows by strings?

Also, is there a corresponding function to desc() that makes explicit that it's sorting in ascending order? I couldn't find one by googling. I'm imagining that, from a readability standpoint, it might be nice to make that clear if ascending and descending arranges are mixed together. Or is that just something I would clarify with a comment if needed?

Gracielle Higino Coach

September 20, 2023

Hi Brian! Great questions!

Yes, using arrange() to sort data alphabetically is very common and recommended. [= It should also help you spot typos and hidden characters! sort() in base R does the same kind of sorting on a single vector, and its decreasing argument lets you state explicitly whether you are sorting in ascending or descending order. If you really want to make explicit how you are arranging your data, one trick is to always use desc() and put a negative sign in front of it when you want ascending order:

penguins |> 
  arrange(-desc(island))
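
To check this on a small scale, here is a rough sketch using a made-up one-column data frame (`toy` is invented for illustration, not the course data). It compares the default ascending sort, the negated desc() trick, and base R's sort() with its decreasing argument:

```r
library(dplyr)

# Toy data invented for illustration.
toy <- tibble(island = c("Torgersen", "Biscoe", "Dream"))

# Ascending (A to Z) is the default for character columns.
ascending <- toy |>
  arrange(island)

# Negating desc() flips the order back to ascending, so this matches the default.
explicit_ascending <- toy |>
  arrange(-desc(island))

# Base R's sort() works on a vector and states the direction with `decreasing`.
sort(toy$island, decreasing = FALSE)
sort(toy$island, decreasing = TRUE)
```

Both arrange() calls should return the rows in the same A to Z order, so the negated desc() is only there for readability.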

I hope this helps! Ping me on Discord if you want to chat more! =D

Jessica France

September 20, 2023

Hi. I answered the last question using this code:

penguins |> 
  filter(island == "Biscoe", !is.na(body_mass_g) | is.na(sex)) |> 
  group_by(sex) |> 
  summarise(mean_weight = mean(body_mass_g)) |> 
  arrange(desc(mean_weight))

And I got this output:

A tibble: 3 × 2

sex mean_weight

I would like to verify whether I did anything wrong. I do not know if I am supposed to see the 'NA' output as well.

Jessica France

September 20, 2023

I do not know if the output I copied on here is displaying. After submitting the comment, I do not see it. Kindly let me know if it can be seen on your end. Thanks.

Gracielle Higino Coach

September 21, 2023

Hi Jessica! Don't worry, we can see the formatting on the back end!

I understand your line of thought, but what you are coding translates to something like "take penguins and keep only the rows where the column island is equal to 'Biscoe' AND (the column body_mass_g is not NA OR the column sex is NA)". Because of the OR, any row where sex is NA passes the filter, so that last condition keeps those rows instead of dropping them. In the end you get the NAs in the sex column because you told R it could include them.

Instead, you can use the drop_na() function after you have filtered by island (for clarity), and then proceed with the grouping and summary.
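
As a rough sketch of the drop_na() approach, assuming a small made-up data frame (`toy_penguins` and its values are invented for illustration, not the course data):

```r
library(dplyr)
library(tidyr)

# Toy data invented for illustration; values are not from the real dataset.
toy_penguins <- tibble(
  island = c("Biscoe", "Biscoe", "Biscoe", "Biscoe", "Dream"),
  sex = c("female", "male", NA, "male", "female"),
  body_mass_g = c(4000, 5000, 4500, NA, 3500)
)

result <- toy_penguins |>
  filter(island == "Biscoe") |>          # keep only Biscoe penguins
  drop_na(body_mass_g, sex) |>           # drop rows with NA in either column
  group_by(sex) |>
  summarize(mean_weight = mean(body_mass_g)) |>
  arrange(desc(mean_weight))

result
```

With the NA rows dropped before grouping, the summary has one row per sex and no NA group. Staying with filter() alone, the equivalent conditions would be !is.na(body_mass_g), !is.na(sex), separated by a comma so that both must hold.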

Feel free to follow up on Discord if it's not clear! [=

Maria Dougherty

April 26, 2024

Hi Grace! How do I join the Discord? Thank you!

Libby Heeren Coach

April 26, 2024

Hello, Maria! The Discord server you see mentioned here is for members of the R in 3 Months program! Sorry for any confusion!