Overview
This lesson is called Overview, part of the R in 3 Months (Fall 2022) course.
In this first section of the course, you'll learn to clean and tidy data as well as do some analysis of it.
As I probably don’t need to tell you, you rarely receive data in sparkling format, so having the skills to wrangle your data is key to your success with R. Fortunately, R has some great tools to do this. The dplyr and tidyr packages are your main friends here, with a wide range of functions to get your data into the format you need it in.
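To give a rough sense of what that wrangling can look like, here is a minimal sketch using a few dplyr and tidyr functions. The `raw_counts` data and its column names are made up for illustration and are not part of the course materials.

```r
library(dplyr)
library(tidyr)

# Hypothetical messy data: one row per site, one column per year,
# with a missing site name and a missing count.
raw_counts <- tibble(
  site = c("North", "South", NA),
  `2020` = c(12, 7, 3),
  `2021` = c(15, NA, 4)
)

tidy_counts <- raw_counts %>%
  # dplyr: drop rows that have no site name
  filter(!is.na(site)) %>%
  # tidyr: reshape the year columns into one row per site and year
  pivot_longer(
    cols = c(`2020`, `2021`),
    names_to = "year",
    values_to = "count"
  ) %>%
  # dplyr: store the year as an integer rather than text
  mutate(year = as.integer(year)) %>%
  # tidyr: drop rows where the count is missing
  drop_na(count)

tidy_counts
```

The result has one row per site and year, which is the shape most tidyverse functions expect. The lessons in this section cover these functions in much more detail.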
The Learn More section on most lessons has resources to help you, well, learn more about any topic. Many of the links there are to R for Data Science, the free book that is the tidyverse bible, and to Stat 545, a course taught by Jenny Bryan at the University of British Columbia whose materials are available for free online.
If you want to learn more about data cleaning in R more generally, here are a few resources:
Crystal Lewis gave a presentation to R-Ladies St Louis in November 2019 on data cleaning in R.
Gina Reynolds has put together a flipbook on common data cleaning techniques in R.
Sharla Gelfand has an extremely thorough overview of tidying Toronto Transit Commission data.
There is also a series of example walkthroughs starting in Chapter 7 of the book Data Science in Education Using R. They go step by step through importing, cleaning, tidying, and analyzing data. They're great examples to learn from.
Have any questions? Put them below and we will help you out!
Amanda Braley • April 25, 2023
I love that you've shared examples from women in data science! Thanks! I don't know why, but my employer's network flagged the Sharla Gelfand link as a threat to network security.