Import Data
This lesson is called Import Data, part of the R in 3 Months (Fall 2022) course.
Your Turn
1. Create a new R script file and save it as import.R
2. Add the line library(tidyverse) at the top of your R script file and run it to load the tidyverse package.
3. Use the read_csv() function (not read.csv()) to import the penguins_data.csv file.
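The steps above can be sketched roughly as follows. This is a minimal illustration, not the lesson's official solution: it assumes penguins_data.csv sits at the top level of your project folder, and the object names csv_path and penguins are hypothetical choices made for this sketch.

```r
# import.R

# Load the tidyverse; this attaches readr, which provides read_csv()
library(tidyverse)

# Assumption: penguins_data.csv is in the top level of the project folder
csv_path <- "penguins_data.csv"

if (file.exists(csv_path)) {
  # read_csv() (with an underscore) returns a tibble. Assigning the result
  # to an object makes it appear in the Environment pane instead of only
  # printing to the Console.
  penguins <- read_csv(csv_path)
} else {
  # Relative paths are resolved against the working directory
  message("Could not find ", csv_path, " in ", getwd())
}
```

If the message branch runs, the file is not in the folder R is currently working from, which is the most common problem reported in the comments below.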
Have any questions? Put them below and we will help you out!
S. Revi Sterling • March 17, 2021
> faketucky <- read.csv(data/faketucky.csv)
Error in read.table(file = file, header = header, sep = sep, quote = quote, : object 'faketucky.csv' not found
Atlang Mompe • March 29, 2021
Hi David,
In your example you have double quotes around the file path, but it won't work on my computer (using Windows) unless I use single quotes. Is that normal? This is the code that works for me:
faketucky <-read_csv ('data/faketucky.csv')
Faythe Aiken • March 30, 2021
Hi David - I'm unable to load the read_csv function from tidyverse. When trying to install the tidyverse package, I get the following failure to download either the binary or source files. What's puzzling is that I can download them directly in my browser but not in R Studio.
> install.packages("vctrs", type="binary")
Installing package into ‘\pdcnt19/AikenF$/My Documents/R-local’
(as ‘lib’ is unspecified)
There is a binary version available (and will be installed) but the source version is later:
      binary source
vctrs  0.3.6  0.3.7
trying URL 'https://cran.rstudio.com/bin/windows/contrib/3.6/vctrs_0.3.6.zip'
Warning in install.packages : InternetOpenUrl failed: 'The operation timed out'
Error in download.file(url, destfile, method, mode = "wb", ...) : cannot open URL 'https://cran.rstudio.com/bin/windows/contrib/3.6/vctrs_0.3.6.zip'
Warning in install.packages : download of package ‘vctrs’ failed
> install.packages("vctrs", type="source")
Installing package into ‘\pdcnt19/AikenF$/My Documents/R-local’
(as ‘lib’ is unspecified)
trying URL 'https://cran.rstudio.com/src/contrib/vctrs_0.3.7.tar.gz'
Warning in install.packages : InternetOpenUrl failed: 'The operation timed out'
Error in download.file(url, destfile, method, mode = "wb", ...) : cannot open URL 'https://cran.rstudio.com/src/contrib/vctrs_0.3.7.tar.gz'
Warning in install.packages : download of package ‘vctrs’ failed
Lisa Janz • March 31, 2021
I can't figure out why, but it keeps throwing the following error: Error: object 'faketucky' not found
Here is what I have done:
> library(tidyverse)
-- Attaching packages ---------------- tidyverse 1.3.0 --
v ggplot2 3.3.3     v purrr   0.3.4
v tibble  3.1.0     v dplyr   1.0.5
v tidyr   1.1.3     v stringr 1.4.0
v readr   1.4.0     v forcats 0.5.1
-- Conflicts ------------------- tidyverse_conflicts() --
x dplyr::filter() masks stats::filter()
x dplyr::lag()    masks stats::lag()
> library(skimr)
> faketucky->read_csv("faketucky.csv")
Error: object 'faketucky' not found
> setwd("C:/Users/ArcticFox/Desktop/getting-started-master/data")
> faketucky->read_csv("faketucky.csv")
Error: object 'faketucky' not found
Lisa Janz • March 31, 2021
And it doesn't work if I put the arrow going in the right direction either. I have used R pretty regularly and tried several things with this, but for some reason, I really can't get it to open the file.
Lisa Janz • March 31, 2021
> library(tidyverse)
-- Attaching packages ---------------- tidyverse 1.3.0 --
v ggplot2 3.3.3     v purrr   0.3.4
v tibble  3.1.0     v dplyr   1.0.5
v tidyr   1.1.3     v stringr 1.4.0
v readr   1.4.0     v forcats 0.5.1
-- Conflicts ------------------- tidyverse_conflicts() --
x dplyr::filter() masks stats::filter()
x dplyr::lag()    masks stats::lag()
> faketucky<-read_csv("faketucky.csv")
Error: 'faketucky.csv' does not exist in current working directory ('C:/Users/ArcticFox/Desktop/getting-started-master').
Josh Rodriguez • May 14, 2021
Hey David, it appears I am getting the common issue noted in the comments here: "faketucky does not exist in the current working directory." I looked at your response as a way to resolve the matter, but it doesn't appear that faketucky is in my Rproj by default. This is where my R session is attempting to pull the data from by default
Scott Clark • July 18, 2021
Hi David. Tidyverse was installed and loaded. I could see and use read.csv, but not read_csv. I noticed readr wasn't listed in the packages:
> library(tidyverse)
-- Attaching packages ---------------------------------------------------- tidyverse 1.3.1 --
v ggplot2 3.3.5     v dplyr   1.0.7
v tibble  3.1.2     v stringr 1.4.0
v tidyr   1.1.3     v forcats 0.5.1
v purrr   0.3.4
I was able to install and load readr separately to get around this, but is there a reason why it might not have installed with the rest of tidyverse? Could I be missing any other packages that I might need later?
Christine Mahoney • August 22, 2021
I'm having difficulty. I keep receiving this error:
Error: 'faketucky.csv' does not exist in current working directory ('/Users/christinemahoney/Desktop/getting-started-master').
Prince Baawuah • October 12, 2021
I mostly work with very, very large datasets. Are there any packages and/or tips for importing and working with very large datasets quickly and efficiently on the desktop (e.g., with parallel processing)?
Lukas Harringer • March 10, 2022
Hi, when I run the read_csv function, the data appears in my Console not in the Environment section.
Michael Steinhoff • March 17, 2022
I get: could not find function "read_csv". I looked back at the error from loading tidyverse and found this:
Error: package or namespace load failed for ‘tidyverse’ in loadNamespace(j = 0.7.6 is required
Seems like something is not up to date, but I'm not sure what.
Jessica Brewer • October 5, 2022
What is meant by "the working directory"? The main folder in the Files environment?
Amy Williams • October 10, 2022
Hi, I'm trying to import the data but I get a message saying the file is not in my current working directory:
> library(tidyverse)
> library(skimr)
> #open up data file use code below
> faketucky <-read_csv("data/faketucky.csv")
Error: 'data/faketucky.csv' does not exist in current working directory
Not sure how to change this.
Thank you
Hani Alnakhli • January 19, 2023
Hi David, I got this text: "Enter an item from the menu, or 0 to exit". I'm not sure what my mistake was.
Mike Horton • March 9, 2023
Hi, I'm not sure what is going wrong here, but I am getting this error message in response to my syntax. Please note that I am typing a < and then a - in the syntax, but this question box converts them into an arrow.
faketucky <- read_csv(“data/faketucky.csv”)
Error: unexpected input in "faketucky <- read_csv(“"
Any ideas? Thanks!
Mercy Abarike • March 26, 2023
I get this feedback anytime I try importing the faketucky data:
faketucky <-read_csv("data/faketucky.csv")
Error: 'data/faketucky.csv' does not exist in current working directory ('C:/Users/Mrs.Mercy/OneDrive/Desktop/Nat 1').
ashwath gadapa • April 26, 2023
Hi David,
I'm unable to load the read_csv function. I have included the log below for your reference:
> install.packages("skimr")
WARNING: Rtools is required to build R packages but is not currently installed. Please download and install the appropriate version of Rtools before proceeding:
https://cran.rstudio.com/bin/windows/Rtools/
Installing package into ‘C:/Users/Admin/AppData/Local/R/win-library/4.3’
(as ‘lib’ is unspecified)
trying URL 'https://cran.rstudio.com/bin/windows/contrib/4.3/skimr_2.1.5.zip'
Content type 'application/zip' length 1236705 bytes (1.2 MB)
downloaded 1.2 MB
package ‘skimr’ successfully unpacked and MD5 sums checked
The downloaded binary packages are in C:\Users\Admin\AppData\Local\Temp\Rtmpm8G3tz\downloaded_packages
> library(skimr)
> library(skimr)
> faketucky faketucky <- read_csv("data/faektucky.csv")
Error in read_csv("data/faektucky.csv") : could not find function "read_csv"
Tuhin CHATURVEDI • April 29, 2023
For Posit Cloud Users: Posit Cloud allows us to go to the file "faketucky.csv". When we left-click on the file, it gives us an option to "Import Dataset". When we choose "Import Dataset", it loads the (readr) package [via library(readr)] and then automatically imports faketucky.csv using the self-generated code [faketucky <- read_csv("~/getting-started-master/data/faketucky.csv")]. Very neat!
Gabriela Elizondo • July 1, 2023
Hi David, I cannot get it to work. I have tried restarting R, loading the packages and it keeps giving me warnings and errors.
Restarting R session...
> install.packages("tidyverse")
Installing package into ‘/cloud/lib/x86_64-pc-linux-gnu-library/4.3’
(as ‘lib’ is unspecified)
trying URL 'http://rspm/default/linux/focal/latest/src/contrib/tidyverse_2.0.0.tar.gz'
Content type 'application/x-gzip' length 425237 bytes (415 KB)
downloaded 415 KB
The downloaded source packages are in ‘/tmp/RtmpdYRVTq/downloaded_packages’
> faketucky load("/cloud/home/r2101164/getting-started-master/data/faketucky.csv")
Error in load("/cloud/home/r2101164/getting-started-master/data/faketucky.csv") : bad restore file magic number (file may be corrupted) -- no data loaded
In addition: Warning message:
file ‘faketucky.csv’ has magic number 'stude'
Use of save versions prior to 2 is deprecated
> install.packages(“tidyverse”)
Gabriela Elizondo • July 1, 2023
Tuhin CHATURVEDI's comment worked! Thank you!
Maia Volk • August 26, 2023
Hi David,
I'm having an issue loading the tidyverse package. I keep getting this message:
Error: package or namespace load failed for ‘tidyverse’: .onAttach failed in attachNamespace() for 'tidyverse', details: call: NULL error: package or namespace load failed for ‘ggplot2’ in loadNamespace(j <- i[[1L]], c(lib.loc, .libPaths()), versionCheck = vI[[j]]): there is no package called ‘fansi’
Can you help me? Thank you!
David Keyes Founder • August 26, 2023
What happened is that, when you tried to install the tidyverse package, one of its dependency packages (packages that the tidyverse needs to run) did not install correctly. I'd manually try to install that package using this code:
Try that and let me know if it helps.
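The code from this reply is not reproduced here. As a rough sketch of how a missing dependency can be installed by hand, the snippet below assumes the package in question is fansi, the one named in the error message in the preceding comment. That is an assumption, not a record of what was originally posted.

```r
# Assumption: "fansi" is the missing dependency, as named in the
# error message quoted in the comment above
if (!requireNamespace("fansi", quietly = TRUE)) {
  install.packages("fansi")
}

# Then try loading the tidyverse again
library(tidyverse)
```

If loading fails again with a different package name, the same pattern can be repeated for that package.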
Valerie Kaster • September 11, 2023
I am getting an error. I went back and started over to make sure I did all the steps and same response.
help please
David Keyes Founder • September 11, 2023
When did you start the course? I made some changes to it recently that may be confusing you because you may have watched old lessons previously.
Archana Joshi • September 13, 2023
My current working directory that R Studio shows is C:\users\username\Documents
When I created a new R script file - import and followed the above steps to read the penguins file, it gives me an error - 'penguins_data.csv' does not exist in current working directory ('C:/Users/Rajeev Joshi/Documents'). I saved the import.R in getting-started-main folder.
How do I change the current working directory to getting-started-main?
Please help
Libby Heeren Coach • September 15, 2023
Hi, Archana! As long as you're inside an R Project, your working directory will be the project, so make sure you're inside the getting-started-main project before typing the library and read_csv code into your import.R file (which is saved in the project folder).
I made a short video to demonstrate what it should look like.
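To check which folder R is working from, something like the following can be run in the Console. This is a small sketch using base R functions only, and it assumes the file from this lesson, penguins_data.csv, is the one you are trying to read.

```r
# Print the folder R resolves relative paths against
getwd()

# List the CSV files R can see in that folder
list.files(pattern = "\\.csv$")

# TRUE only if the file is in the working directory
file.exists("penguins_data.csv")
```

If getwd() does not show the project folder, opening the project's .Rproj file in RStudio and rerunning these lines should show the working directory set to the project.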
Bhumika Bhattacharya • September 18, 2023
I have installed the tidyverse package, but when I run the code read.csv("penguins_data.csv") it shows this on the console:
Bhumika Bhattacharya • September 18, 2023
it is working for the tibble only