Get access to all lessons in this course.
-
Welcome to R in 3 Months
- Welcome to R in 3 Months (Spring 2024)
- Complete Pre-Survey
- Sign up for Discord
-
Week 1: Getting Started with R
- Welcome to Getting Started with R
- Install R
- Install RStudio
- Projects
- Files in R
- Packages
- Import Data
- Objects and Functions
- Examine our Data
- Import Our Data Again
- Getting Help
- Getting Started with R Questions (R in 3 Months Spring 2024)
- Week 1 Live Session (Spring 2024)
-
Week 2: Data Wrangling and Analysis
- Welcome to Fundamentals of R
- Update Everything
- Start a New Project
- The Tidyverse
- Pipes
- select()
- mutate()
- filter()
- summarize()
- group_by() and summarize()
- arrange()
- Create a New Data Frame
- Bring it All Together (Data Wrangling)
- Week 2 Project Assignment
- Week 2 Live Session (Spring 2024)
-
Week 3: Data Visualization
- The Grammar of Graphics
- Scatterplots
- Histograms
- Bar Charts
- Setting color and fill Aesthetic Properties
- Setting color and fill Scales
- Setting x and y Scales
- Adding Text to Plots
- Plot Labels
- Themes
- Facets
- Save Plots
- Bring it All Together (Data Visualization)
- Week 3 Project Assignment
- Week 3 Live Session (Spring 2024)
-
Week 4: Quarto
- Quarto Overview
- YAML
- Text
- Code Chunks
- Tips for Working with Quarto
- Bring It All Together (Quarto)
- An Important Workflow Tip
- Week 4 Project Assignment
- Week 4 Live Session (Spring 2024)
-
Week 5: Catch-Up Week
- Catch-Up Week (Week 5)
-
Week 6: Git + GitHub
- What is Git? What is GitHub?
- Install and Configure GitHub Desktop
- Create a Local Git Repository
- Commits
- Commit History
- Push an RStudio Project to GitHub
- Keep RStudio and GitHub in Sync
- Week 6 Project Assignment
- Week 6 Live Session (Spring 2024)
-
Week 7: Advanced Data Wrangling (Part 1)
- Downloading and Importing Data
- Overview of Tidy Data
- Tidy Data Rule #1: Every Column is a Variable
- Tidy Data Rule #3: Every Cell is a Single Value
- Tidy Data Rule #2: Every Row is an Observation
- Changing Variable Types
- Dealing with Missing Data
- Advanced Summarizing
- Binding Data Frames
- Week 7 Live Session (Spring 2024)
-
Week 8: Advanced Data Wrangling (Part 2)
- Functions
- Data Merging
- Exporting Data
- Bring It All Together (Advanced Data Wrangling)
- Week 8 Project Assignment
- Week 8 Live Session (Spring 2024)
-
Week 9: Advanced Data Visualization (Part 1)
- Best Practices in Data Visualization
- Tidy Data
- Pipe Data into ggplot
- Reorder Plots to Highlight Findings
- Line Charts
- Use Color to Highlight Findings
- Declutter
- Add Descriptive Labels to Your Plots
- Use Titles to Highlight Findings
- Use Annotations to Explain
-
Week 10: Catch-Up Week
-
Week 11: Advanced Data Visualization (Part 2)
- Tweak Spacing
- Create a Custom Theme
- Customize Your Fonts
- Try New Plot Types
- Bring it All Together (Advanced Data Visualization)
- Week 11 Project Assignment
-
Week 12: Advanced Quarto
- Advanced Markdown
- Tables
- Advanced YAML and Code Chunk Options
- Inline R Code
- Making Your Reports Shine: Word Edition
- Making Your Reports Shine: PDF Edition
- Making Your Reports Shine: HTML Edition
- Presentations
- Dashboards
- Websites
- Publishing Your Work
- Quarto Extensions
- Parameterized Reporting, Part 1
- Parameterized Reporting, Part 2
- Parameterized Reporting, Part 3
-
Week 13: Final Project
-
Additional Lessons
- All R in 3 Months (Spring 2024) Videos
- Factors in R
- Reading documentation pages
- Add citations to Quarto documents
- Change titles of facet plots
- Population pyramid plot
- Why use Git - example case
- How to access data not on GitHub
- Dealing with merge conflicts in GitHub Desktop
- Using Function Arguments
R in 3 Months (Spring 2024)
Tidy Data Rule #3: Every Cell is a Single Value
This lesson is locked
This lesson is called Tidy Data Rule #3: Every Cell is a Single Value, part of the R in 3 Months (Spring 2024) course. This lesson is called Tidy Data Rule #3: Every Cell is a Single Value, part of the R in 3 Months (Spring 2024) course.
If the video is not playing correctly, you can watch it in a new window
Transcript
Click on the transcript to go to that point in the video. Please note that transcripts are auto generated and may contain minor inaccuracies.
View code shown in video
# Load Packages -----------------------------------------------------------
library(tidyverse)
library(janitor)
library(readxl)
library(fs)
# Create Directory --------------------------------------------------------
dir_create("data-raw")
# Download Data -----------------------------------------------------------
# download.file(url = "https://www.oregon.gov/ode/educator-resources/assessment/Documents/TestResults2122/pagr_schools_math_tot_raceethnicity_2122.xlsx",
# mode = "wb",
# destfile = "data-raw/pagr_schools_math_tot_raceethnicity_2122.xlsx")
#
# download.file("https://www.oregon.gov/ode/educator-resources/assessment/Documents/TestResults2122/TestResults2019/pagr_schools_math_tot_raceethnicity_1819.xlsx",
# mode = "wb",
# destfile = "data-raw/pagr_schools_math_tot_raceethnicity_1819.xlsx")
#
# download.file("https://www.oregon.gov/ode/educator-resources/assessment/TestResults2018/pagr_schools_math_raceethnicity_1718.xlsx",
# mode = "wb",
# destfile = "data-raw/pagr_schools_math_raceethnicity_1718.xlsx")
#
# download.file("https://www.oregon.gov/ode/educator-resources/assessment/TestResults2017/pagr_schools_math_raceethnicity_1617.xlsx",
# mode = "wb",
# destfile = "data-raw/pagr_schools_math_raceethnicity_1617.xlsx")
#
# download.file("https://www.oregon.gov/ode/educator-resources/assessment/TestResults2016/pagr_schools_math_raceethnicity_1516.xlsx",
# mode = "wb",
# destfile = "data-raw/pagr_schools_math_raceethnicity_1516.xlsx")
# Import Data -------------------------------------------------------------
math_scores_2021_2022 <-
read_excel("data-raw/pagr_schools_math_tot_raceethnicity_2122.xlsx") |>
clean_names()
third_grade_math_proficiency_2021_2022 <-
math_scores_2021_2022 |>
filter(student_group == "Total Population (All Students)") |>
filter(grade_level == "Grade 3") |>
select(academic_year, school_id, contains("number_level_")) |>
pivot_longer(cols = contains("number_level_"),
names_to = "proficiency_level",
values_to = "number_of_students")
third_grade_math_proficiency_2021_2022 |>
mutate(proficiency_level = str_remove(proficiency_level,
pattern = "number_level_"))
third_grade_math_proficiency_2021_2022 |>
mutate(proficiency_level = recode(proficiency_level,
"number_level_4" = "4",
"number_level_3" = "3",
"number_level_2" = "2",
"number_level_1" = "1"))
third_grade_math_proficiency_2021_2022 |>
mutate(proficiency_level = if_else(proficiency_level == "number_level_4",
true = "4",
false = proficiency_level)) |>
mutate(proficiency_level = if_else(proficiency_level == "number_level_3",
true = "3",
false = proficiency_level)) |>
mutate(proficiency_level = if_else(proficiency_level == "number_level_2",
true = "2",
false = proficiency_level)) |>
mutate(proficiency_level = if_else(proficiency_level == "number_level_1",
true = "1",
false = proficiency_level))
third_grade_math_proficiency_2021_2022 |>
mutate(proficiency_level = case_when(
proficiency_level == "number_level_4" ~ "4",
proficiency_level == "number_level_3" ~ "3",
proficiency_level == "number_level_2" ~ "2",
proficiency_level == "number_level_1" ~ "1"
))
third_grade_math_proficiency_2021_2022 |>
mutate(proficiency_level = parse_number(proficiency_level))
Your Turn
Remove the “x2022_23_” portion of the
race_ethnicity
variable usingstr_remove()
.Convert all instances of the
race_ethnicity
variable to more meaningful observations (e.g. turn “american_indian_alaska_native” into “American Indian/Alaska Native”) usingrecode()
,if_else()
, orcase_when()
.
Use the following starter code to help you if necessary.
# Load Packages -----------------------------------------------------------
library(tidyverse)
library(fs)
library(readxl)
library(janitor)
# Create Directories ------------------------------------------------------
dir_create("data-raw")
# Download Data -----------------------------------------------------------
# https://www.oregon.gov/ode/reports-and-data/students/Pages/Student-Enrollment-Reports.aspx
# download.file("https://www.oregon.gov/ode/reports-and-data/students/Documents/fallmembershipreport_20222023.xlsx",
# mode = "wb",
# destfile = "data-raw/fallmembershipreport_20222023.xlsx")
#
# download.file("https://www.oregon.gov/ode/reports-and-data/students/Documents/fallmembershipreport_20212022.xlsx",
# mode = "wb",
# destfile = "data-raw/fallmembershipreport_20212022.xlsx")
#
# download.file("https://www.oregon.gov/ode/reports-and-data/students/Documents/fallmembershipreport_20202021.xlsx",
# mode = "wb",
# destfile = "data-raw/fallmembershipreport_20202021.xlsx")
#
# download.file("https://www.oregon.gov/ode/reports-and-data/students/Documents/fallmembershipreport_20192020.xlsx",
# mode = "wb",
# destfile = "data-raw/fallmembershipreport_20192020.xlsx")
#
# download.file("https://www.oregon.gov/ode/reports-and-data/students/Documents/fallmembershipreport_20182019.xlsx",
# mode = "wb",
# destfile = "data-raw/fallmembershipreport_20182019.xlsx")
# Import Data -------------------------------------------------------------
enrollment_2022_2023 <- read_excel(path = "data-raw/fallmembershipreport_20222023.xlsx",
sheet = "School 2022-23") |>
clean_names()
# Tidy and Clean Data -----------------------------------------------------
enrollment_by_race_ethnicity_2022_2023 <-
enrollment_2022_2023 |>
select(district_institution_id, x2022_23_american_indian_alaska_native:x2022_23_percent_multi_racial) |>
select(-contains("percent")) |>
pivot_longer(cols = -district_institution_id,
names_to = "race_ethnicity",
values_to = "number_of_students")
Learn More
Bob Rudis has a semi-complex article on how he uses case_when()
to work with data related to his work on internet security.
You might also find this video by Sharon Machlis on case_when()
helpful.
And, in case you think I’m the only one who loves case_when()
check out this love letter to the function by Matt Kerlogue.
You need to be signed-in to comment on this post. Login.
Joseph Lehman
April 23, 2024
We may get to this in the future or already covered it, but is there a way to drop characters either at the beginning or end of a cell? I am thinking something like the "RIGHT" and "LEFT" functions in Excel.
David Keyes Founder
April 24, 2024
Yup, check out this quick video! Code I used in it is below:
Hope this helps!