Fundamentals of R
filter()
This lesson is called filter(), part of the Fundamentals of R course.
View code shown in video
# Load Packages -----------------------------------------------------------
library(tidyverse)
# Import Data -------------------------------------------------------------
penguins <- read_csv("penguins.csv")
# filter() ----------------------------------------------------------------
# We use filter() to choose a subset of observations.
# We use == to select all observations that meet the criteria.
penguins |>
  filter(species == "Adelie")
# We use != to select all observations that don't meet the criteria.
penguins |>
  filter(species != "Adelie")
# We can combine comparisons and logical operators.
penguins |>
  filter(species == "Adelie" | species == "Chinstrap")
# We can use %in% to collapse multiple comparisons into one.
penguins |>
  filter(species %in% c("Adelie", "Chinstrap"))
# We can chain together multiple filter functions.
# Doing it this way, we don't have to create complex logic in one line.
# Complicated version
penguins |>
  filter(species %in% c("Adelie", "Chinstrap") & island == "Torgersen")
# Simpler version
penguins |>
  filter(species %in% c("Adelie", "Chinstrap")) |>
  filter(island == "Torgersen")
# We can use <, >, <=, and >= for numeric data.
penguins |>
  filter(body_mass_g > 4000)
# We can drop NAs with !is.na().
penguins |>
  filter(!is.na(sex))
# But the double negative is confusing.
# We can also drop NAs with drop_na().
penguins |>
  drop_na(sex)
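The patterns above can be checked on a small made-up data frame. The sketch below uses a hypothetical five-row tibble (`toy_penguins`, which is not the course's penguins.csv) to illustrate two points. First, conditions separated by a comma inside one filter() call behave like `&`, so they keep the same rows as the chained version. Second, filter() keeps only rows where the condition is TRUE, so rows where a comparison evaluates to NA are dropped too.

```r
library(tidyverse)

# Hypothetical toy data standing in for penguins.csv
toy_penguins <- tibble(
  species = c("Adelie", "Adelie", "Chinstrap", "Gentoo", "Adelie"),
  island = c("Torgersen", "Biscoe", "Dream", "Biscoe", "Torgersen"),
  sex = c("female", NA, "male", "female", "male"),
  body_mass_g = c(3800, 3450, 4100, 5000, NA)
)

# One filter() call with & ...
with_and <- toy_penguins |>
  filter(species %in% c("Adelie", "Chinstrap") & island == "Torgersen")

# ... the same conditions separated by a comma ...
with_comma <- toy_penguins |>
  filter(species %in% c("Adelie", "Chinstrap"), island == "Torgersen")

# ... and two chained filter() calls all keep the same rows.
chained <- toy_penguins |>
  filter(species %in% c("Adelie", "Chinstrap")) |>
  filter(island == "Torgersen")

all.equal(with_and, with_comma)
all.equal(with_and, chained)

# body_mass_g > 4000 is NA for the row with a missing body mass,
# so filter() drops that row along with the rows where it is FALSE.
heavy <- toy_penguins |>
  filter(body_mass_g > 4000)

# Likewise, sex != "female" is NA where sex is missing,
# so that row is not kept either.
not_female <- toy_penguins |>
  filter(sex != "female")
```

If rows with missing values should be kept in a case like the last one, the condition has to say so explicitly, for example `filter(sex != "female" | is.na(sex))`.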
Your Turn
# Load Packages -----------------------------------------------------------
# Load the tidyverse package
library(tidyverse)
# Import Data -------------------------------------------------------------
# Download data from https://rfor.us/penguins
# Copy the data into the RStudio project
# Create a new R script file and add code to import your data
penguins <- read_csv("penguins.csv")
# filter() ----------------------------------------------------------------
# Use filter() to only keep female penguins
# YOUR CODE HERE
# Use filter() to only keep penguins NOT on Torgersen island
# YOUR CODE HERE
# Use filter() to only keep penguins on Torgersen island or Biscoe island
# Use the or logical operator (|) to do this
# YOUR CODE HERE
# Rewrite your filter() code above to keep the penguins from Torgersen island or Biscoe island
# This time, though, use the %in% operator
# YOUR CODE HERE
# Use a comparison operator to keep penguins with flipper lengths greater than or equal to 193 millimeters
# YOUR CODE HERE
# Drop any rows that have missing data in the flipper_length_mm variable
# Do this first with !is.na()
# YOUR CODE HERE
# Do this a second time with drop_na()
# YOUR CODE HERE
Learn More
To learn more about the filter() function, check out Chapter 3 of R for Data Science.
Rachel Udow
March 17, 2024
Hello! Two questions about this lesson:
Linda Thomson
March 24, 2024
Thanks for any clarification on this: How are you viewing the result of your filter in your R script window?
penguins |> filter(sex == "female") view()
Console: Use print(n = ...) to see more rows
Libby Heeren Coach
March 24, 2024
Hi, Linda! You'll need to put a pipe after your filter line in order for it to feed the results of your query to the view function.
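A minimal sketch of that fix, using a hypothetical three-row tibble in place of the course data: the pipe after filter() passes the filtered rows on to the next function. Because view() opens a data viewer only in an interactive session, the sketch guards it with interactive() and also shows print(n = Inf), which prints every row to the console.

```r
library(tidyverse)

# Hypothetical toy data standing in for penguins.csv
toy_penguins <- tibble(
  species = c("Adelie", "Gentoo", "Chinstrap"),
  sex = c("female", "male", "female")
)

# Keep the filtered rows in an object so they can be reused below
females <- toy_penguins |>
  filter(sex == "female")

# Print all rows of the result to the console
females |>
  print(n = Inf)

# Open the result in the data viewer (only when running interactively)
if (interactive()) {
  toy_penguins |>
    filter(sex == "female") |>
    view()
}
```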
Linda Thomson
March 24, 2024
many thanks!!
Derrick Watsala
March 25, 2024
Hi Coach, thanks for this interesting lesson on the tidyverse functions. I am learning a lot! However, I need to know how to save the output for reference, say after I run filter() code successfully.
David Keyes Founder
March 25, 2024
You'll learn how to do this in the Create a New Data Frame lesson! If you still have questions after reviewing that lesson, let me know.
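As a rough preview (an assumption about the general approach, not a summary of that lesson), one common way to keep a filtered result is to assign it to a new object with `<-`. The tibble and object names below are hypothetical.

```r
library(tidyverse)

# Hypothetical toy data standing in for penguins.csv
toy_penguins <- tibble(
  species = c("Adelie", "Gentoo", "Adelie"),
  body_mass_g = c(3800, 5000, 3450)
)

# Assigning the pipeline's result stores it instead of only printing it
adelie_penguins <- toy_penguins |>
  filter(species == "Adelie")

# The original data frame is unchanged; the new object holds the subset
adelie_penguins
```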
Douglas Ndowo
April 2, 2024
Hi, is it possible to use !is.na() or drop_na() to drop the NAs from multiple variables? Let's say I wanted to drop the NAs from both the flipper_length_mm and sex variables. I've tried several approaches but still can't figure it out lol
Douglas Ndowo
April 2, 2024
Figured it out 😀 drop_na() does this so magically:
penguins |>
  drop_na(flipper_length_mm, sex) |>
  View()
🎉
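For comparison, the same result can be reached with filter() and !is.na(). This is a minimal sketch on a hypothetical four-row tibble rather than the course data; each column gets its own !is.na() condition, and separating them with a comma means both must be TRUE.

```r
library(tidyverse)

# Hypothetical toy data with missing values in both columns
toy_penguins <- tibble(
  flipper_length_mm = c(181, NA, 195, 210),
  sex = c("female", "male", NA, "male")
)

# drop_na() with several columns drops rows missing any of them
with_drop_na <- toy_penguins |>
  drop_na(flipper_length_mm, sex)

# The filter() version needs one !is.na() per column
with_filter <- toy_penguins |>
  filter(!is.na(flipper_length_mm), !is.na(sex))

all.equal(with_drop_na, with_filter)
```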