When you work with most data analysis tools, the first step in any project is to download your data. Your workflow might look something like this:
Go to the website where the data is located
Find the data you need
Download the data to your computer
Copy the data to wherever you need it in order to begin analysis
With R, it’s different. You can download data from within R itself, as I show in this video demonstration (which, incidentally, is from my upcoming Going Deeper with R course).
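Here's a minimal sketch of what downloading data from within R can look like, using base R's `download.file()` and `read.csv()`. The helper name `fetch_csv`, the example URL, and the `data/survey.csv` path are placeholders I made up for illustration, not a real dataset.

```r
# Download a CSV to a local path (skipping the download if the file is
# already there), then read it into a data frame.
# `data_url` and `dest` are hypothetical inputs supplied by the caller.
fetch_csv <- function(data_url, dest) {
  if (!file.exists(dest)) {
    # mode = "wb" avoids line-ending problems on Windows
    download.file(data_url, destfile = dest, mode = "wb")
  }
  read.csv(dest)
}

# Hypothetical usage (the URL is a placeholder, and data/ must already exist):
# survey <- fetch_csv("https://example.com/survey.csv", "data/survey.csv")
```

Checking `file.exists()` first is a design choice: rerunning the script reuses the local copy instead of downloading it again. Drop that check if you always want the latest version of the file.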
There are several benefits of being able to download data directly in this way.
First, you save time by not having to manually download the data and copy it to where you need it.
You also make your work reproducible by writing code that you or anyone else could rerun to get access to the exact same data you worked with. Your entire workflow is completely transparent, making it easy for everyone to understand.
And once your R skills are more advanced, you can automate your work, for example by bulk importing dozens of CSV files, as Saint Louis University sociology professor Chris Prener shows below.
Just used the #rstats package `purrr` to read in 53 individual csv files and combine them all easy peasy. I've used this approach before, and it never ceases to make me smile. 5 years ago I would have combined them all by hand in Excel. My how far I've come. pic.twitter.com/JFR7fHVJwe
Chris Prener (@chrisprener), December 20, 2019
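A bulk import like that takes only a few lines. What follows is a rough sketch, not the code from the tweet. It uses base R's `lapply()` instead of `purrr` so it runs without extra packages, and the function name `read_all_csvs` is hypothetical. It also assumes every file has the same columns.

```r
# Read every CSV in a folder and stack them into one data frame.
# Assumes all files share the same column names and compatible types.
read_all_csvs <- function(dir) {
  # Full paths of every file in `dir` whose name ends in .csv
  files <- list.files(dir, pattern = "\\.csv$", full.names = TRUE)
  # One data frame per file; with purrr this would be purrr::map(files, read.csv)
  tables <- lapply(files, read.csv)
  # Bind the data frames together row-wise
  do.call(rbind, tables)
}

# Hypothetical usage:
# all_data <- read_all_csvs("data/raw")
```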
Downloading data directly within R is just one more example of how this incredible tool can have a dramatic impact on your workflow. It demonstrates the trade-off involved in learning to use R, which Ryan Estrellado, Emily Bovee, Jesse Mostipak, Joshua Rosenberg, and Isabella Velásquez spell out in their new book, Data Science in Education Using R:
The beginning of the learning journey is particularly challenging because it feels slow. If you have experience as an educator or consultant, you already have efficient solutions you use in your day-to-day work. Introducing code to your workflow slows you down at first because you won’t be as fast as you are with your favorite spreadsheet software. However, you’re probably reading this book because you realize that learning how to analyze data using R is like investing in your own personal infrastructure–it takes time while you’re building the initial skills, but the investment pays off when you start solving complex problems faster and at scale.
There are many more tips like this to improve your R workflow in my upcoming Going Deeper with R course. It will be released in May 2020, but if you sign up before it's out, you'll get 33% off!