# filter()

View code shown in video
``````
# Load Packages -----------------------------------------------------------

library(tidyverse)

# Import Data -------------------------------------------------------------

# Assumes a data frame named penguins is available, for example from the
# palmerpenguins package or from an imported CSV file.

# filter() ----------------------------------------------------------------

# We use filter() to choose a subset of observations.

# We use == to select all observations that meet the criteria.
# (Illustrative condition, assumed for this example.)

penguins |>
  filter(island == "Torgersen")

# We use != to select all observations that don't meet the criteria.
# (Illustrative condition, assumed for this example.)

penguins |>
  filter(island != "Torgersen")

# We can combine comparisons and logical operators.

penguins |>
  filter(species == "Adelie" | species == "Chinstrap")

# We can use %in% to collapse multiple comparisons into one.

penguins |>
  filter(species %in% c("Adelie", "Chinstrap"))

# We can chain together multiple filter functions.
# Doing it this way, we don't have to create complex logic in one line.

# Complicated version

penguins |>
  filter(species %in% c("Adelie", "Chinstrap") & island == "Torgersen")

# Simpler version

penguins |>
  filter(species %in% c("Adelie", "Chinstrap")) |>
  filter(island == "Torgersen")

# We can use <, >, <=, and >= for numeric data.

penguins |>
  filter(body_mass_g > 4000)

# We can drop NAs with !is.na().

penguins |>
  filter(!is.na(sex))

# But the double negative is confusing.
# We can also drop NAs with drop_na().

penguins |>
  drop_na(sex)
``````
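
The equivalences mentioned in the comments above can be checked on a small data frame. This is only a sketch: `penguins_demo` and its values are made up for illustration and are not the course data.

```r
library(tidyverse)

# Hypothetical stand-in data so the example runs without the course file
penguins_demo <- tibble(
  species = c("Adelie", "Chinstrap", "Gentoo", "Adelie"),
  island = c("Torgersen", "Dream", "Biscoe", "Biscoe"),
  body_mass_g = c(3750, 3800, 5000, NA)
)

# | and %in% keep the same rows
with_or <- penguins_demo |>
  filter(species == "Adelie" | species == "Chinstrap")

with_in <- penguins_demo |>
  filter(species %in% c("Adelie", "Chinstrap"))

all.equal(with_or, with_in)

# Chained filter() calls behave like conditions joined with &
one_step <- penguins_demo |>
  filter(species %in% c("Adelie", "Chinstrap") & island == "Torgersen")

two_steps <- penguins_demo |>
  filter(species %in% c("Adelie", "Chinstrap")) |>
  filter(island == "Torgersen")

all.equal(one_step, two_steps)

# filter() keeps only rows where the condition is TRUE,
# so the row with a missing body mass is dropped as well
heavy <- penguins_demo |>
  filter(body_mass_g > 4000)
```

Note that the last filter removes the `NA` row even though no `!is.na()` was written, because a comparison with `NA` is neither `TRUE` nor `FALSE`.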

``````
# Load Packages -----------------------------------------------------------

library(tidyverse)

# Import Data -------------------------------------------------------------

# Copy the data into the RStudio project
# Create a new R script file and add code to import your data

# filter() ----------------------------------------------------------------

# Use filter() to only keep female penguins

# Use filter() to only keep penguins NOT on Torgersen island

# Use filter() to only keep penguins on Torgersen island or Biscoe island
# Use the or logical operator (|) to do this

# Rewrite your filter() code above to keep the penguins from Torgersen island or Biscoe island
# This time, though, use the %in% operator

# Use a comparison operator to keep penguins with flipper lengths greater than or equal to 193 millimeters

# Drop any rows that have missing data in the flipper_length_mm variable

# Do this first with !is.na()

# Do this a second time with drop_na()
``````

To learn more about the `filter()` function, check out Chapter 3 of R for Data Science.

## Have any questions? Put them below and we will help you out!


#### Rachel Udow

1. Why is it required to use the summarize() function before using the more specific summary functions (e.g., mean())?
2. Does the "rm" in "na.rm" stand for anything? Just asking as it might help me remember that argument if so. Thank you!

#### Linda Thomson

Thanks for any clarification on this: How are you viewing the result of your filter in your R script window?

penguins |> filter(sex == "female") view()

Console: Use `print(n = ...)` to see more rows

view() Error in view() : argument "x" is missing, with no default

#### Libby Heeren Coach

Hi, Linda! You'll need to put a pipe after your filter line in order for it to feed the results of your query to the view function.
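
A minimal sketch of that fix, using a small made-up data frame (`penguins_demo` and its values are hypothetical, not the course data):

```r
library(tidyverse)

# Hypothetical stand-in for the penguins data
penguins_demo <- tibble(
  species = c("Adelie", "Gentoo", "Chinstrap"),
  sex = c("female", "male", "female")
)

# The pipe after filter() passes the filtered rows to view() as its input.
# Without that pipe, view() is called with no input and raises the
# 'argument "x" is missing' error.
female_penguins <- penguins_demo |>
  filter(sex == "female") |>
  view()
```

In RStudio, this should open the filtered rows in the data viewer. Assigning the result is optional and is only done here so the filtered rows can be inspected afterwards.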

many thanks!!

#### Derrick Watsala

Hi Coach, thanks for this interesting lesson on the tidyverse functions. I am learning a lot! However, I need to know how to save the output for reference, say after I run filter code successfully.

#### David Keyes Founder

You'll learn how to do this in the Create a New Data Frame lesson! If you still have questions after reviewing that lesson, let me know.

#### Douglas Ndowo

Hi, is it possible to use !is.na() or drop_na() to drop the NAs from multiple variables? Let's say I wanted to drop the NAs from both the flipper_length_mm and sex variables. I've tried several approaches but still can't figure it out lol

#### Douglas Ndowo

Figured it out 😀. drop_na() does this so magically:

penguins |>
drop_na(flipper_length_mm, sex) |>
View() 🎉
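
For comparison, the same result can probably be reached with `filter()` and `!is.na()`, which was the other half of the question. The sketch below uses a made-up data frame (`penguins_demo` and its values are hypothetical) and relies on `filter()` treating comma-separated conditions as "and":

```r
library(tidyverse)

# Hypothetical stand-in data with missing values in both columns
penguins_demo <- tibble(
  flipper_length_mm = c(181, NA, 195, 210),
  sex = c("male", "female", NA, "female")
)

# drop_na() removes rows with an NA in any of the listed columns
with_drop_na <- penguins_demo |>
  drop_na(flipper_length_mm, sex)

# filter() keeps rows where every condition is TRUE
with_filter <- penguins_demo |>
  filter(!is.na(flipper_length_mm), !is.na(sex))

all.equal(with_drop_na, with_filter)
```

Both versions keep only the rows where neither variable is missing; `drop_na()` is shorter once more than one column is involved.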