
Fundamentals of R

filter()


Code shown in video
# Load Packages -----------------------------------------------------------

library(tidyverse)

# Import Data -------------------------------------------------------------

penguins <- read_csv("penguins.csv")

# filter() ----------------------------------------------------------------

# We use filter() to choose a subset of observations.

# We use == to select all observations that meet the criteria.

penguins |> 
  filter(species == "Adelie")

# We use != to select all observations that don't meet the criteria.

penguins |> 
  filter(species != "Adelie")

# We can combine comparisons and logical operators.

penguins |> 
  filter(species == "Adelie" | species == "Chinstrap")

# We can use %in% to collapse multiple comparisons into one.

penguins |> 
  filter(species %in% c("Adelie", "Chinstrap"))

# We can chain together multiple filter functions. 
# Doing it this way, we don't have to create complex logic in one line.

# Complicated version

penguins |> 
  filter(species %in% c("Adelie", "Chinstrap") & island == "Torgersen")

# Simpler version

penguins |> 
  filter(species %in% c("Adelie", "Chinstrap")) |> 
  filter(island == "Torgersen")

# We can use <, >, <=, and >= for numeric data.

penguins |> 
  filter(body_mass_g > 4000)

# We can drop NAs with !is.na(). 

penguins |> 
  filter(!is.na(sex))

# But the double negative is confusing.
# We can also drop NAs with drop_na().

penguins |> 
  drop_na(sex)
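
To check that the "complicated" and "simpler" versions above really return the same rows, here is a minimal sketch. It uses a small made-up data frame called toy (an assumption for illustration, not the real penguins data), so it runs without penguins.csv. It also shows a third spelling: conditions separated by a comma inside one filter() are combined with "and", just like &.

```r
library(dplyr)

# Toy stand-in for the penguins data (values are made up for illustration)
toy <- tibble(
  species = c("Adelie", "Chinstrap", "Gentoo", "Adelie"),
  island  = c("Torgersen", "Dream", "Biscoe", "Biscoe")
)

# One filter() with & between the two conditions
combined <- toy |>
  filter(species %in% c("Adelie", "Chinstrap") & island == "Torgersen")

# Two chained filter() calls, one condition each
chained <- toy |>
  filter(species %in% c("Adelie", "Chinstrap")) |>
  filter(island == "Torgersen")

# Conditions separated by a comma are also combined with "and"
comma <- toy |>
  filter(species %in% c("Adelie", "Chinstrap"), island == "Torgersen")

# All three approaches keep the same single row (Adelie on Torgersen)
identical(combined, chained)
identical(combined, comma)
```

Which form to use is mostly a readability choice: chaining keeps each line to one idea, while & or a comma keeps everything in a single step.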

Your Turn

# Load Packages -----------------------------------------------------------

# Load the tidyverse package

library(tidyverse)

# Import Data -------------------------------------------------------------

# Download data from https://rfor.us/penguins
# Copy the data into the RStudio project
# Create a new R script file and add code to import your data

penguins <- read_csv("penguins.csv")

# filter() ----------------------------------------------------------------

# Use filter() to only keep female penguins

# YOUR CODE HERE

# Use filter() to only keep penguins NOT on Torgersen island

# YOUR CODE HERE

# Use filter() to only keep penguins on Torgersen island or Biscoe island
# Use the or logical operator (|) to do this

# YOUR CODE HERE

# Rewrite your filter() code above to keep the penguins from Torgersen island or Biscoe island
# This time, though, use the %in% operator

# YOUR CODE HERE

# Use a comparison operator to keep penguins with flipper lengths greater than or equal to 193 millimeters

# YOUR CODE HERE

# Drop any rows that have missing data in the flipper_length_mm variable

# Do this first with !is.na()

# YOUR CODE HERE

# Do this a second time with drop_na()

# YOUR CODE HERE

Learn More

To learn more about the filter() function, check out Chapter 3 of R for Data Science.

Have any questions? Put them below and we will help you out!


Rachel Udow • March 17, 2024

Hello! Two questions about this lesson:

  1. Why is it required to use the summarize() function before using the more specific summary functions (e.g., mean())?
  2. Does the "rm" in "na.rm" stand for anything? Just asking as it might help me remember that argument if so. Thank you!

Linda Thomson • March 24, 2024

Thanks for any clarification on this: How are you viewing the result of your filter in your R script window?

penguins |> filter(sex == "female") view()

Console: Use print(n = ...) to see more rows

view() Error in view() : argument "x" is missing, with no default

Libby Heeren Coach • March 24, 2024

Hi, Linda! You'll need to put a pipe after your filter line in order for it to feed the results of your query to the view function.
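
As a rough sketch of that fix: the example below uses a small made-up data frame named toy (an assumption, standing in for penguins) and prints the result so it also runs outside RStudio. The view() line is left as a comment because it needs an interactive session; view() is available once the tidyverse is loaded.

```r
library(dplyr)

# Toy stand-in for the penguins data (values are made up for illustration)
toy <- tibble(sex = c("female", "male", NA, "female"))

# Without a pipe after filter(), view() runs on its own with nothing
# passed to it, which is what triggers: argument "x" is missing
# toy |> filter(sex == "female")
# view()

# With the pipe, the filtered rows are handed to the next function
females <- toy |>
  filter(sex == "female")

females |>
  print()

# In an interactive RStudio session, the same pattern opens the data viewer:
# toy |> filter(sex == "female") |> view()
```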

Linda Thomson • March 24, 2024

many thanks!!

Derrick Watsala • March 25, 2024

Hi Coach, thanks for this interesting lesson on the tidyverse functions. I am learning a lot! However, I need to know how to save the output for reference, say after I run filter code successfully.

David Keyes Founder • March 25, 2024

You'll learn how to do this in the Create a New Data Frame lesson! If you still have questions after reviewing that lesson, let me know.
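
In the meantime, here is a minimal sketch of the general idea, assuming a small made-up data frame named toy and a hypothetical object name, heavy_penguins. The assignment operator <- stores the filtered rows under a new name so you can reuse them later.

```r
library(dplyr)

# Toy stand-in for the penguins data (values are made up for illustration)
toy <- tibble(
  species     = c("Adelie", "Gentoo", "Adelie"),
  body_mass_g = c(3750, 5000, 4100)
)

# Assign the filtered result to a new object instead of only printing it
# (heavy_penguins is a hypothetical name; pick whatever is meaningful to you)
heavy_penguins <- toy |>
  filter(body_mass_g > 4000)

# The saved data frame can now be printed or used in later steps
heavy_penguins
```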

Douglas Ndowo • April 2, 2024

Hi, is it possible to use !is.na() or drop_na() to drop the NAs from multiple variables? Let's say I wanted to drop the NAs from both the flipper_length_mm and sex variables. I've tried several versions of the code but still can't figure it out lol

Douglas Ndowo • April 2, 2024

Figured it out 😀 drop_na() does this so magically:

penguins |>
  drop_na(flipper_length_mm, sex) |>
  View()

🎉
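
For the !is.na() half of the question, a minimal sketch: one filter() call can take a separate !is.na() check for each variable. The data frame toy below is made up for illustration (an assumption, not the real penguins data), and both approaches keep only the rows that are complete in both columns.

```r
library(dplyr)
library(tidyr)

# Toy stand-in for the penguins data (values are made up for illustration)
toy <- tibble(
  flipper_length_mm = c(181, NA, 195, 190),
  sex               = c("male", "female", NA, "female")
)

# drop_na() with two variables: drop rows missing either one
with_drop_na <- toy |>
  drop_na(flipper_length_mm, sex)

# filter() with one !is.na() check per variable does the same job
with_is_na <- toy |>
  filter(!is.na(flipper_length_mm), !is.na(sex))

# Both results keep the first and fourth rows of toy
with_drop_na
with_is_na
```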

Grace Lau • September 26, 2024

Hello,

I have a question about %n%. It's not working for me. This is my code:

penguins |> filter(species %n% c("Adelie", "Chinstrap"))

I get an error message, like so:

Error in filter():
ℹ In argument: species %n% c("Adelie", "Chinstrap").
Caused by error in species %n% c("Adelie", "Chinstrap"):
! could not find function "%n%"
Run rlang::last_trace() to see where the error occurred.

Gracielle Higino Coach • September 26, 2024

Hey Grace! I know you got the answer in our live session just now, but just to keep it on record: it's a typo in your %in% operator =D You were missing the "i".
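
For the record, a small sketch of the corrected operator, using a made-up data frame named toy (an assumption for illustration). On its own, %in% returns TRUE or FALSE for each value on its left, and filter() keeps the rows where the result is TRUE.

```r
library(dplyr)

# Toy stand-in for the penguins data (values are made up for illustration)
toy <- tibble(species = c("Adelie", "Chinstrap", "Gentoo"))

# %in% checks each value on the left against the set on the right
matches <- c("Adelie", "Gentoo") %in% c("Adelie", "Chinstrap")
matches
# → TRUE FALSE

# Inside filter(), only rows where the check is TRUE are kept
kept <- toy |>
  filter(species %in% c("Adelie", "Chinstrap"))

kept
```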

Raouf Kilada • October 8, 2024

Why do I get this error message when I use View()?

Error in is.data.frame(x) : argument "x" is missing, with no default

Gracielle Higino Coach • October 8, 2024

One possibility is that you're running the function without a mandatory argument. To use View(), you must designate which dataframe you want to see. So you can write the code like this:

dataset |> View()

Or this:

View(dataset)

Replace "dataset" with the name of your data object and it should work!

Raouf Kilada • October 8, 2024

THANK YOU....It worked