Fundamentals of R
The Tidyverse
This lesson is called The Tidyverse, part of the Fundamentals of R course.
Your Turn
First, download the course project using the following code:
install.packages("usethis")
library(usethis)
use_course("http://bit.ly/fundamentals-of-r-course")
Then, do the following:
Open the data-wrangling-and-analysis-exercises.Rmd file
Load packages
Import NHANES data and use the clean_names function on it.
Learn More
To see the most downloaded R packages, check out the trends page of the RDocumentation website. Note that because tidyverse is a collection of packages, you will see it show up among the most downloaded packages as well as individual packages that are part of it (e.g. ggplot2).
To learn about the clean_names function, check out the janitor package docs. Note that there are many options for how you can format your variable names. I like snake_case myself so use that (it's also the default).
To see a great set of slides on Tidyverse basics, check out these from Alison Hill.
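As a rough illustration of the clean_names naming options mentioned above, here is a minimal sketch on a tiny made-up data frame. The column names are borrowed from the NHANES messages quoted in the comments, the values are invented, and it assumes the janitor package is installed.

```r
library(janitor)

# A tiny stand-in data frame; the values are made up for illustration.
demo <- data.frame(
  SurveyYr = c("2009_10", "2011_12"),
  HomeOwn = c("Own", "Rent")
)

# Default behaviour: snake_case variable names.
snake <- clean_names(demo)
names(snake)

# One of the other supported styles, chosen via the `case` argument.
camel <- clean_names(demo, case = "lower_camel")
names(camel)
```

The same calls should work on the imported NHANES data frame in place of demo.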
Hatem Kotb
March 20, 2021
I've followed the same code but I'm getting an error message as shown below. Any idea why? Thanks! :)
-- Column specification --------------------------------------------------------
cols(
  .default = col_character(),
  ID = col_double(),
  Age = col_double(),
  Weight = col_double(),
  Height = col_double(),
  BMI = col_double(),
  DaysPhysHlthBad = col_double(),
  DaysMentHlthBad = col_double(),
  SleepHrsNight = col_double(),
  PhysActiveDays = col_double(),
  TVHrsDay = col_logical()
)
i Use spec() for the full column specifications.

4859 parsing failures.
 row      col           expected    actual                file
5001 TVHrsDay 1/0/T/F/TRUE/FALSE 2_hr      './data/nhanes.csv'
5002 TVHrsDay 1/0/T/F/TRUE/FALSE More_4_hr './data/nhanes.csv'
5003 TVHrsDay 1/0/T/F/TRUE/FALSE 4_hr      './data/nhanes.csv'
5004 TVHrsDay 1/0/T/F/TRUE/FALSE 4_hr      './data/nhanes.csv'
5005 TVHrsDay 1/0/T/F/TRUE/FALSE 1_hr      './data/nhanes.csv'
.... ........ .................. ......... ...................
See problems(...) for more details
Hatem Kotb
March 20, 2021
Seems like it's a warning message rather, not an error message. Would still like to understand more how to fix/navigate it. Thanks! :)
David Keyes
March 20, 2021
Yup, so that message is just telling you how R is interpreting the variables.
col_character() = string
col_double() = numeric
col_logical() = TRUE/FALSE
Hope that helps!
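For anyone who wants to avoid the parsing failures rather than just ignore them: they most likely appear because readr guessed that TVHrsDay was a TRUE/FALSE column from its early, empty rows and then met text values such as 2_hr further down. A minimal sketch of one possible fix is to state the column type up front. The small file below is a made-up stand-in for nhanes.csv, not the course data, and the sketch assumes the readr package is installed.

```r
library(readr)

# Build a small stand-in for nhanes.csv: TVHrsDay is empty in the first rows
# and only contains text later on (hypothetical data, not the course file).
tmp <- tempfile(fileext = ".csv")
demo <- data.frame(
  ID = 1:6,
  TVHrsDay = c(NA, NA, NA, "2_hr", "More_4_hr", "1_hr")
)
write_csv(demo, tmp, na = "")

# Declaring the column type means readr does not have to guess it
# from the empty early rows; other columns are still guessed.
nhanes_demo <- read_csv(
  tmp,
  col_types = cols(TVHrsDay = col_character())
)
nhanes_demo$TVHrsDay
```

With the real file, the same col_types argument would go inside the read_csv("data/nhanes.csv") call.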
Hatem Kotb
March 21, 2021
(phew), thanks David! :)
Oindrila Bhattacharyya
March 24, 2021
While trying to download the janitor package, I am getting an error message "Error in library(janitor) : there is no package called ‘janitor’ " Can you please tell me what to do now?
Oindrila Bhattacharyya
March 24, 2021
I figured out the issue. Never mind!
Harold Stanislaw
March 31, 2021
I had the same problem. Not sure how you solved it Oindrila, but in my case I went to the "Packages" tab of the files pane, clicked on Install, and told it to download janitor from CRAN. After that I was able to load janitor in the R Markdown code chunk.
Vuk Sekicki
March 27, 2021
Just a word of advice from a beginner. I just watched https://vimeo.com/320973807 (at 02:47). This slide and explanation were essential for me. Before that, I could not work out which of the forthcoming functions work on rows and which on columns.
Erin Guthrie
March 31, 2021
For a number of attempts, I kept getting this error:
library(tidyverse)
Error: package or namespace load failed for ‘tidyverse’ in loadNamespace(j <- i[[1L]], c(lib.loc, .libPaths()), versionCheck = vI[[j]]):
 there is no package called ‘cellranger’
After some googling, I just installed "cellranger" separately and that seemed to resolve the issue. Is this common or do I have something set up incorrectly?
David Keyes
March 31, 2021
This type of thing doesn't happen that often, but it does occasionally. One thing you could do when you install the tidyverse is use this code:
install.packages("tidyverse", dependencies = TRUE)
The dependencies = TRUE part will make sure that any packages the tidyverse needs in order to run (like cellranger) will also be installed. This should happen even without adding dependencies = TRUE, but sometimes doesn't, as you've found out.
Naomi Nichols
April 13, 2021
Should I be worried about all of this red text?
Warning: 4859 parsing failures.
 row      col           expected    actual              file
5001 TVHrsDay 1/0/T/F/TRUE/FALSE 2_hr      'data/nhanes.csv'
5002 TVHrsDay 1/0/T/F/TRUE/FALSE More_4_hr 'data/nhanes.csv'
5003 TVHrsDay 1/0/T/F/TRUE/FALSE 4_hr      'data/nhanes.csv'
5004 TVHrsDay 1/0/T/F/TRUE/FALSE 4_hr      'data/nhanes.csv'
5005 TVHrsDay 1/0/T/F/TRUE/FALSE 1_hr      'data/nhanes.csv'
.... ........ .................. ......... .................
See problems(...) for more details.
David Keyes
April 13, 2021
Nope! Despite the fact that it's red, it's just informational, telling you that there are some observations whose data type R wasn't certain of when importing.
Juan Clavijo
September 27, 2021
Hello! I'm getting this error message: Error in library(janitor) : there is no package called ‘janitor’
Not sure what to do about it to be able to use clean names.
Thanks! Juan
Juan Clavijo
September 27, 2021
Never mind, solved using an answer from above, thanks!
David Keyes
September 27, 2021
For anyone who needs it, you can use this URL in place of the bit.ly one to access the course materials as a zip file: https://codeload.github.com/rfortherestofus/fundamentals/zip/refs/heads/master
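If you would rather stay in R than use a browser, the zip can in principle be fetched with base R functions. The sketch below is only one way to do it: download_course is a hypothetical helper name, and the Desktop destination in the usage comment is an assumption.

```r
# Hypothetical helper: download a course zip file and unpack it into dest_dir.
download_course <- function(url, dest_dir = ".") {
  zip_path <- file.path(tempdir(), "fundamentals.zip")
  # mode = "wb" keeps the zip file intact on Windows.
  download.file(url, destfile = zip_path, mode = "wb")
  # Expand "~" ourselves so the destination path is unambiguous.
  unzip(zip_path, exdir = path.expand(dest_dir))
}

# Usage (requires an internet connection):
# download_course(
#   "https://codeload.github.com/rfortherestofus/fundamentals/zip/refs/heads/master",
#   dest_dir = "~/Desktop"
# )
```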
David Keyes
September 27, 2021
You can definitely do that, though I think there are some caveats to consider. I recorded a video to explain my thinking and show you how I'd do things. The code I show in the video is here.
Israel Johnson
October 4, 2021
After executing the 2nd code chunk instructions the following message was generated:
Column specification
Delimiter: ","
chr (13): SurveyYr, Gender, AgeDecade, Race1, Education, MaritalStatus, HHIncome, HomeOwn, Work, HealthGen, ...
dbl (9): ID, Age, Weight, Height, BMI, DaysPhysHlthBad, DaysMentHlthBad, SleepHrsNight, PhysActiveDays
i Use spec() to retrieve the full column specification for this data.
i Specify the column types or set show_col_types = FALSE to quiet this message.
Should I be worried about these last two rows?
David Keyes
October 4, 2021
No need to worry! It's just giving you some info about the data that it read in. Here's a quick video explanation.
Dana Loll
March 31, 2022
Hi there! Getting this code when loading janitor. The following objects are masked from ‘package:stats’:
I know that this may be ok for our purposes now but I love chi square & fisher's tests! Any way to keep them? Or are they done in another package?
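One common way to keep using the base versions of functions that another package masks is to call them with the stats:: prefix. A minimal sketch, using the built-in mtcars data rather than the course data:

```r
# Cross-tabulate two columns of the built-in mtcars data.
tab <- table(mtcars$am, mtcars$vs)

# The stats:: prefix reaches the base R versions even when another
# package has masked the bare function names.
chi_result <- stats::chisq.test(tab)
fisher_result <- stats::fisher.test(tab)

chi_result$p.value
fisher_result$p.value
```

As far as I know, janitor's own chisq.test and fisher.test also hand ordinary tables on to the stats versions, so the masking message should not mean the tests are lost.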
Carolyn Ford
April 2, 2022
Whenever I try these "usethis" commands, I always get this feedback:
Error in curl::curl_download(url = url, destfile = destfile, quiet = quiet, : Recv failure: Connection was reset
Carolyn Ford
April 2, 2022
Never mind - ended up using the codeload from github.
Charlie Hadley
April 5, 2022
Hi Carolyn, Glad you were able to find a workaround. There are some known issues with usethis::use_course() on corporate networks. It's just a convenience function for downloading the exact same zip file that you got from GitHub directly. Cheers, Charlie
Tatiana Bustos
July 27, 2022
Hi there
I keep getting errors about the janitor package. I don't see it listed in the tab for packages either. Any other suggestions?
> library(janitor)
Error in library(janitor) : there is no package called ‘janitor’
> library (janitor)
Error in library(janitor) : there is no package called ‘janitor’
> install.packages (janitor)
Error in install.packages : object 'janitor' not found
> install.packages(janitor)
Error in install.packages : object 'janitor' not found
Tatiana Bustos
July 27, 2022
Nevermind - it appeared when I updated! Weird.
David Keyes
July 28, 2022
Just FYI, when you install a package you need to put its name in quotes:
install.packages("janitor")
Once you've installed it, you can load it using its name without quotes:
library(janitor)
Hope that helps!
Kirstin O'Dell
September 29, 2022
I keep getting this error message: Error in read.csv("data/nhanes.csv") %>% clean_names() : could not find function "%>%"
I have the following libraries loaded with the code below. What am I missing?
Kirstin O'Dell
September 29, 2022
looks like the code didn't copy correctly, trying again: nhanes % clean_names()
Kirstin O'Dell
September 29, 2022
will create a gist and email it - sorry
Ellen Wilson
January 23, 2023
This all worked for me when I did the sample exercise. I am now trying to work with some SurveyMonkey data I have from an actual survey I am working on, and having some difficulty. First of all, the variable names are insanely long (all the words of every question). I don't know if there is a better way to deal with it, but I was thinking that I would just rename all of them (e.g., rename " x2_what_grade_are_you_in" to "grade"). This is tedious, but maybe necessary? Making it more tedious, though, is the fact that even though I ran clean_names, the autocomplete function doesn't seem to be working. Any thoughts on what the problem could be? Typing out all the words in every one of these very long variable names would be a huge pain... Thanks for any guidance!
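The renaming approach described above could look roughly like this with dplyr's rename function. This is a minimal sketch on made-up data: x2_what_grade_are_you_in comes from the comment, while x1_what_is_your_name and all the values are hypothetical.

```r
library(dplyr)

# Made-up survey export with long, question-style column names.
survey <- data.frame(
  x1_what_is_your_name = c("A", "B"),
  x2_what_grade_are_you_in = c(5, 6)
)

# rename() takes new_name = old_name pairs.
survey_short <- survey %>%
  rename(
    name = x1_what_is_your_name,
    grade = x2_what_grade_are_you_in
  )

names(survey_short)
```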
David Keyes
January 23, 2023
SurveyMonkey data is very annoying that way! Check out the mash_colnames() function from the unheadr package. Charlie shows an example of using it in the Survey Monkey section of this blog post.
Chelsea Ruder
March 23, 2023
Where do I find the data-wrangling-and-analysis-exercises.Rmd file ??
David Keyes
March 23, 2023
It's in the fundamentals project that you downloaded. It's probably on your Desktop. Does that help?
Amanda Braley
March 25, 2023
Hmmm I have an error getting started:
> library(usethis)
Warning: package ‘usethis’ was built under R version 4.2.3
Error: package or namespace load failed for ‘usethis’ in loadNamespace(i, c(lib.loc, .libPaths()), versionCheck = vI[[i]]):
 namespace ‘rlang’ 1.0.6 is already loaded, but >= 1.1.0 is required
David Keyes
March 27, 2023
The issue here is that the rlang package needs to be updated. Can you try running install.packages("rlang") and then see if it works?
Amanda Braley
March 28, 2023
No luck. I'm really struggling to get the tidyverse loaded.
Error: package or namespace load failed for ‘tidyverse’:
 .onAttach failed in attachNamespace() for 'tidyverse', details:
  call: NULL
  error: package or namespace load failed for ‘ggplot2’ in loadNamespace(i, c(lib.loc, .libPaths()), versionCheck = vI[[i]]):
 namespace ‘rlang’ 1.0.6 is already loaded, but >= 1.1.0 is required
Amanda Braley
March 28, 2023
Something terrible has happened. I tried to re-run the week 2 exercises that I was able to do last week and now they won't run either...
> # Load Packages -----------------------------------------------------------
>
> # Load the tidyverse and skimr packages using the library function
> library(tidyverse)
Error: package or namespace load failed for ‘tidyverse’:
 .onAttach failed in attachNamespace() for 'tidyverse', details:
  call: NULL
  error: package or namespace load failed for ‘ggplot2’ in loadNamespace(i, c(lib.loc, .libPaths()), versionCheck = vI[[i]]):
 namespace ‘rlang’ 1.0.6 is already loaded, but >= 1.1.0 is required
In addition: Warning message:
package ‘tidyverse’ was built under R version 4.2.3
>
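The phrase "is already loaded" in this error suggests that the old rlang is still in memory, so installing a newer copy usually only takes effect after R has been restarted. A minimal sketch for checking whether an update is needed without loading the package follows. needs_update is a hypothetical helper name, and the 1.1.0 minimum is taken from the error message.

```r
# Hypothetical helper: TRUE if pkg is missing or older than `minimum`.
# It reads the installed version from disk without loading the package.
needs_update <- function(pkg, minimum) {
  installed <- nzchar(system.file(package = pkg))
  !installed || packageVersion(pkg) < minimum
}

# In a fresh session (in RStudio: Session > Restart R), before loading
# any other packages:
# if (needs_update("rlang", "1.1.0")) install.packages("rlang")
```

After the install finishes, restarting R once more and then running library(tidyverse) is a reasonable way to confirm the fix.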
Andrew Paquin
March 26, 2023
When I try to do the last step, I get this message: Error in read_csv("data/nhanes.csv") %>% clean_names() : could not find function "%>%". Not sure what I'm doing wrong. I'm having a lot of trouble with RStudio tonight.
Andrew Paquin
March 26, 2023
Okay, I just re-ran everything above it, and I did not get that message. I did get this:
Rows: 10000 Columns: 22
── Column specification ──────────────────────────────────────────────────────────────────
Delimiter: ","
chr (13): SurveyYr, Gender, AgeDecade, Race1, Education, MaritalStatus, HHIncome, Home...
dbl (9): ID, Age, Weight, Height, BMI, DaysPhysHlthBad, DaysMentHlthBad, SleepHrsNigh...
ℹ Use spec() to retrieve the full column specification for this data.
ℹ Specify the column types or set show_col_types = FALSE to quiet this message.
Is this success?
David Keyes
March 27, 2023
On this, the issue is almost certainly that you hadn't run library(tidyverse) before running this line. The pipe comes from the tidyverse so it won't work unless you load the tidyverse first.
Zain Asaf
March 31, 2023
Hi David, Previously I was able to get half-way through the select exercise assignments. However, now when I run the code from the start and run the clean_names function, I use the following code: nhanes% clean_names() But I get the following error message:
i Use spec() to retrieve the full column specification for this data.
i Specify the column types or set show_col_types = FALSE to quiet this message.
Zain Asaf
March 31, 2023
Sorry, this is the code I use:
nhanes% clean_names()
Charlie Hadley
April 1, 2023
Hi Zain, Please note that not all red text in the console is an error message. This is an informative message generated by the function; it does not mean the code isn't working. Thanks, Charlie
Abdulkadir Lokhandvala
April 7, 2023
Hi David/Charlie - I am continuously getting the error below and am not able to progress in my course materials.
Warning: package ‘tidyverse’ was built under R version 4.2.3
Error: package or namespace load failed for ‘tidyverse’:
 .onAttach failed in attachNamespace() for 'tidyverse', details:
  call: NULL
  error: package or namespace load failed for ‘ggplot2’ in loadNamespace(i, c(lib.loc, .libPaths()), versionCheck = vI[[i]]):
 namespace ‘rlang’ 1.0.6 is already loaded, but >= 1.1.0 is required
David Keyes
April 7, 2023
Can you please try to install the rlang package manually using install.packages("rlang") and then let me know if it's fixed?
Innocent Ouko
June 3, 2023
Hello, I have been trying to practice lesson but on a single file and this is the error message I got after trying to knit the entire document, which, as a result, makes no output:
Error: The name of the input file cannot contain the special shell characters: [ ()|:&;#?*'] (attempted to copy to a version without those characters 'Fundamentals-of-R.Rmd' however that file already exists) Execution halted
Innocent Ouko
June 4, 2023
I solved it. No big deal
David Keyes
June 4, 2023
Glad you figured it out!