Pipe Data into ggplot
This lesson is called Pipe Data into ggplot, part of the Going Deeper with R course.
Code shown in video
# Load Packages -----------------------------------------------------------
library(tidyverse)
library(fs)
# Create Directory --------------------------------------------------------
dir_create("data")
# Download Data -----------------------------------------------------------
download.file("https://github.com/rfortherestofus/going-deeper-v2/raw/main/data/third_grade_math_proficiency.rds",
              mode = "wb",
              destfile = "data/third_grade_math_proficiency.rds")
# Import Data -------------------------------------------------------------
third_grade_math_proficiency <-
  read_rds("data/third_grade_math_proficiency.rds") |>
  select(academic_year, school, school_id, district, proficiency_level, number_of_students) |>
  mutate(is_proficient = case_when(
    proficiency_level >= 3 ~ TRUE,
    .default = FALSE
  )) |>
  group_by(academic_year, school, district, school_id, is_proficient) |>
  summarize(number_of_students = sum(number_of_students, na.rm = TRUE)) |>
  ungroup() |>
  group_by(academic_year, school, district, school_id) |>
  mutate(percent_proficient = number_of_students / sum(number_of_students, na.rm = TRUE)) |>
  ungroup() |>
  mutate(percent_proficient = case_when(
    is.nan(percent_proficient) ~ NA,
    .default = percent_proficient
  )) |>
  filter(is_proficient == TRUE) |>
  select(academic_year, school, district, percent_proficient) |>
  rename(year = academic_year)
# Plot --------------------------------------------------------------------
third_grade_math_proficiency |>
  filter(year == "2021-2022") |>
  filter(district == "Portland SD 1J") |>
  ggplot(aes(x = percent_proficient,
             y = school)) +
  geom_col()
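The plot code above works because the pipe hands the filtered data frame to ggplot() as its first argument, which is data. A minimal sketch of that equivalence, using a small made-up data frame (scores and its values are hypothetical stand-ins, not the lesson's data):

```r
library(dplyr)
library(ggplot2)

# Hypothetical toy data standing in for the lesson's data frame.
scores <- data.frame(
  school = c("A", "B", "C", "D"),
  district = c("North", "North", "South", "North"),
  percent_proficient = c(0.42, 0.61, 0.35, 0.55)
)

# Piped version: the filtered data becomes ggplot()'s data argument.
piped_plot <- scores |>
  filter(district == "North") |>
  ggplot(aes(x = percent_proficient, y = school)) +
  geom_col()

# Equivalent version that stores the filtered data in an intermediate object.
north <- filter(scores, district == "North")
unpiped_plot <- ggplot(data = north, aes(x = percent_proficient, y = school)) +
  geom_col()
```

Both objects should draw the same chart. Note that layers are still added with + after ggplot(); the pipe is only used for the steps that come before it.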
Your Turn
Create a new R script file.
Download the enrollment data by race/ethnicity and create a data frame called enrollment_by_race_ethnicity using the starter code below.
Pipe your data into a bar chart that shows the breakdown of race/ethnicity among students in Beaverton SD 48J in 2022-2023.
# Load Packages -----------------------------------------------------------
library(tidyverse)
library(fs)
# Create Directory --------------------------------------------------------
dir_create("data")
# Download Data -----------------------------------------------------------
download.file("https://github.com/rfortherestofus/going-deeper-v2/raw/main/data/enrollment_by_race_ethnicity.rds",
              mode = "wb",
              destfile = "data/enrollment_by_race_ethnicity.rds")
# Import Data -------------------------------------------------------------
enrollment_by_race_ethnicity <-
  read_rds("data/enrollment_by_race_ethnicity.rds") |>
  select(-district_institution_id) |>
  select(year, district, everything()) |>
  mutate(year = case_when(
    year == "School 2021-22" ~ "2021-2022",
    year == "School 2022-23" ~ "2022-2023"
  ))
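One detail of the starter code worth knowing: because the case_when() call has no .default, any value of year that matches neither condition becomes NA. A minimal sketch with made-up labels (the third label is hypothetical and may not appear in the real data):

```r
library(dplyr)

# Hypothetical year labels; the last one is invented to show an unmatched case.
years <- data.frame(
  year = c("School 2021-22", "School 2022-23", "School 2020-21")
)

# Same recoding pattern as the starter code, written to a new column
# so the original and recoded values can be compared side by side.
recoded <- years |>
  mutate(year_clean = case_when(
    year == "School 2021-22" ~ "2021-2022",
    year == "School 2022-23" ~ "2022-2023"
  ))

recoded
```

If the 2022-2023 filter in your plot returns no rows, checking the recoded year column for NA values is a reasonable first step.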
Have any questions? Put them below and we will help you out!
Kimber Carman • May 14, 2025
Hello, I'm having a problem downloading the file. Am I doing something wrong? This is the error message I'm getting:
Gracielle Higino Coach • May 15, 2025
Hi Kimber! Are you able to download the file from your browser?
What happens if you try to use method = "curl" or method = "wget" instead of method = "wb"?
Hajira Koeller • May 21, 2025
Hi! I am getting an error importing the rds file. The error message is "Error in readRDS(con, refhook = refhook) : unknown input format". Any ideas?
Gracielle Higino Coach • May 22, 2025
Hi Hajira! This could be caused by a few different things, and I'd need to look at your session to make a more precise guess. I'd recommend that you start your session over and make sure not to save the .Rhistory or .RData. Then try downloading the RDS file again to make sure it's not corrupted, and read it in.
All of these things can cause problems when reading RDS files (a cluttered session or overloaded memory, a corrupted RDS file, an .Rhistory with conflicts...), and this process should take care of them. If you still get this error, please feel free to send me a screenshot or a video showing your screen, or book a session with me so we can try to debug it live.
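For the "unknown input format" error discussed above, one way to narrow things down is to check whether the file on disk is a readable RDS file at all. A minimal sketch in base R, assuming nothing about the lesson's data: read_rds_or_null() is a hypothetical helper (not part of the lesson or of any package), and the temporary files stand in for the downloaded file.

```r
# Hypothetical helper: read an RDS file, returning NULL with a message
# instead of stopping when the file is missing, empty, or unreadable.
read_rds_or_null <- function(path) {
  if (!file.exists(path) || file.size(path) == 0) {
    message("File is missing or empty: ", path)
    return(NULL)
  }
  tryCatch(
    readRDS(path),
    error = function(e) {
      message("Could not read ", path, ": ", conditionMessage(e))
      NULL
    }
  )
}

# A valid RDS file is read back as a data frame.
good_path <- tempfile(fileext = ".rds")
saveRDS(data.frame(x = 1:3), good_path)
good <- read_rds_or_null(good_path)

# A file that only has an .rds extension (for example, a failed download
# saved to disk) cannot be read, so the helper returns NULL.
bad_path <- tempfile(fileext = ".rds")
writeLines("this is not an RDS file", bad_path)
bad <- read_rds_or_null(bad_path)
```

If the lesson's file comes back as NULL, deleting it and downloading it again is a reasonable next step. Note that mode and method are separate arguments to download.file(): mode = "wb" writes the file in binary mode, while method (for example "libcurl", "curl", or "wget") chooses the download tool, so the two can be used together.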