Bring it All Together (Data Wrangling)

This lesson is called Bring it All Together (Data Wrangling), part of the Fundamentals of R course.

View code shown in video

# Load Packages -----------------------------------------------------------

library(tidyverse)
library(janitor)

# Import Data -------------------------------------------------------------

# Data from https://github.com/rstudio/r-community-survey

survey_data <- read_tsv("2020-combined-survey-final.tsv") |> 
  clean_names()

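# Explore Data ------------------------------------------------------------

# Find the columns related to enjoyment
survey_data |>
  select(contains("enjoy"))

# Look at the rows where qr_enjoyment is missing
survey_data |>
  filter(is.na(qr_enjoyment)) |>
  select(qr_enjoyment)

# Preview every column and its type
survey_data |>
  glimpse()

# Summarize Data ----------------------------------------------------------

# Average R enjoyment by country, keeping only countries with at least
# 10 respondents and sorting from highest to lowest average
avg_r_enjoyment <- survey_data |>
  drop_na(qr_enjoyment) |>
  group_by(qcountry) |>
  summarize(avg_enjoyment = mean(qr_enjoyment),
            n = n()) |>
  filter(n >= 10) |>
  arrange(desc(avg_enjoyment))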

Learn More

If you want to see a visual representation of how the various dplyr functions you've learned in this section of the course work, check out the Tidy Data Tutor website.

A less visual, though equally useful, approach is the tidylog package. It gives you feedback on each step of your pipeline, showing how the data was transformed.
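
As a rough illustration of what tidylog adds, here is a minimal sketch that runs a pipeline like the one from the video on a small made-up tibble. The `toy_survey` data and its values are invented for this example and are not taken from the survey. The sketch assumes the tidylog package is installed; loading it after dplyr and tidyr lets its wrappers print a short message for the verbs it covers.

```r
library(dplyr)
library(tidyr)
library(tidylog) # load last so its wrappers mask the dplyr/tidyr verbs

# Made-up data for illustration only
toy_survey <- tibble(
  qcountry = c("A", "A", "A", "B", "B", "C"),
  qr_enjoyment = c(5, 4, NA, 3, 5, NA)
)

toy_avg <- toy_survey |>
  drop_na(qr_enjoyment) |>   # tidylog reports the rows dropped here
  group_by(qcountry) |>
  summarize(avg_enjoyment = mean(qr_enjoyment),
            n = n()) |>
  filter(n >= 2) |>          # tidylog reports the rows removed here
  arrange(desc(avg_enjoyment))

toy_avg
```

The messages are printed to the console as each step runs, so they do not change the result stored in `toy_avg`.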

Have any questions? Put them below and we will help you out!

Valliappan Muthu • May 17, 2024

Similar to select(var1:var2), is there a way to do drop_na(var1:var2)?

David Keyes Founder • May 17, 2024

I believe you can select a range in drop_na() though I've never actually tried it myself. Give it a shot and let me know if it works!
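
For reference, drop_na() accepts the same column-selection syntax as select(), so a range can be passed directly. A minimal sketch, where `toy_data` and the column names `var1`, `var2`, `var3`, and `other` are hypothetical:

```r
library(dplyr)
library(tidyr)

# Made-up data; all column names here are hypothetical
toy_data <- tibble(
  id    = 1:4,
  var1  = c(1, NA, 3, 4),
  var2  = c("a", "b", NA, "d"),
  var3  = c(TRUE, FALSE, TRUE, TRUE),
  other = c(NA, 10, 20, NA)
)

# Drop rows with an NA in any column from var1 through var3.
# NAs in columns outside the range (here, other) are ignored.
complete_range <- toy_data |>
  drop_na(var1:var3)

complete_range
```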

Valliappan Muthu • May 17, 2024

Hi. It works!

But the problem is that I had one or more missing values in almost every observation, so I had zero observations left after drop_na(). I probably need to recode NA to something else for a meaningful analysis.

David Keyes Founder • May 18, 2024

Yes, sounds more like an issue with your data at this point!
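
One way to look into a situation like this is to count the missing values per column before dropping anything, then decide which columns really need to be complete. The sketch below uses made-up data (`toy_data` and `var1` to `var3` are hypothetical), and recoding NA to a "Missing" label is only one option; whether it is appropriate depends on the analysis.

```r
library(dplyr)
library(tidyr)

# Made-up data where every row is missing something
toy_data <- tibble(
  var1 = c(1, NA, 3),
  var2 = c(NA, "b", "c"),
  var3 = c("x", "y", NA)
)

# Dropping across the whole range leaves no rows
rows_left <- nrow(drop_na(toy_data, var1:var3))

# Count missing values per column to see where they are
na_counts <- toy_data |>
  summarize(across(var1:var3, \(x) sum(is.na(x))))

# One option: recode NA in the text columns to an explicit label,
# then drop rows only on the column the analysis needs
recoded <- toy_data |>
  mutate(across(c(var2, var3), \(x) replace_na(x, "Missing"))) |>
  drop_na(var1)

na_counts
recoded
```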

Zachary Li • December 20, 2024

Hi, David, where is the link to download 2020-combined-survey-final as mentioned in the lecture? Thank you for your time.

David Keyes Founder • December 20, 2024

You can find the data itself here. It is part of this GitHub repo: https://github.com/rstudio/r-community-survey. Hope that helps!

Gaurab Pradhan • March 15, 2025

Is there a read Excel function in the tidyverse package, or do I need to install the readxl package?

Gracielle Higino Coach • March 16, 2025

Hi Gaurab! The {readxl} package is part of the Tidyverse! If you install {tidyverse}, {readxl} will be installed as well. However, since it's not a "core" Tidyverse package, library(tidyverse) does not attach it, so you still need to load it separately with library(readxl).
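
A minimal sketch of what that looks like in practice. It reads one of the example workbooks that ships with readxl rather than a real project file, so the `example_path` and `example_data` names are just placeholders for this illustration:

```r
library(tidyverse) # attaches the core packages; readxl is installed but not attached
library(readxl)    # so readxl is loaded explicitly

# readxl ships example workbooks; readxl_example() returns the path to one
example_path <- readxl_example("datasets.xlsx")

# Read the first sheet of the workbook into a tibble
example_data <- read_excel(example_path)

example_data
```

For your own file, replace `example_path` with the path to the spreadsheet, and use the `sheet` argument of read_excel() if the data is not on the first sheet.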