Getting Started With R
Import Our Data Again
This lesson is called Import Our Data Again, part of the Getting Started With R course.
Your Turn
Adjust your read_csv() code so that you import the data again.
Use the na argument to tell read_csv() what data should be treated as missing.
Use the col_types argument to make sure that sex_v2 gets imported as character data.
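As a rough sketch of what the finished call might look like, the example below writes a small stand-in file and imports it with both arguments. The file contents, the column names other than sex_v2, and the csv_path variable are made up for illustration; the course's penguins_data.csv will look different.

```r
library(readr)

# Hypothetical stand-in for penguins_data.csv (values invented for this sketch)
csv_path <- tempfile(fileext = ".csv")
writeLines(
  c(
    "species,body_mass_g,sex_v2",
    "Adelie,3750,1",
    "Adelie,-999,2",
    "Gentoo,5000,-999"
  ),
  csv_path
)

penguins_data <- read_csv(
  csv_path,
  # Treat both "-999" and "NA" as missing values
  na = c("-999", "NA"),
  # Force sex_v2 to character; the other columns are still guessed
  col_types = cols(sex_v2 = "c")
)

# sex_v2 should now be a character column
class(penguins_data$sex_v2)
```

Only sex_v2 is named inside cols(), so readr keeps guessing the types of the remaining columns.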
Betsy Dalton
September 15, 2023
Hm, when I try to run
penguins_data <- read_csv("penguins_data.csv", na = "-999")
I keep getting the following error:
David Keyes Founder
September 15, 2023
This appears to be a bug in RStudio (others have seen the same thing, as have I). You can safely ignore it. If you update RStudio in a few weeks, my guess is it will be fixed.
Betsy Dalton
September 15, 2023
Thanks!
Alberto Cabrera
November 5, 2023
penguins <- read_csv("https://raw.githubusercontent.com/rfortherestofus/rin3-fall-2023/main/data-raw/penguins.csv")
This retrieves a CSV file from the data-raw subfolder of a GitHub repository and creates a data frame named penguins.
Kiara Sanchez
September 17, 2023
I keep running my code: penguins_data <- read_csv("penguins_data.csv", na = c("-999", "NA"), col_types = cols(sex_v2 = "c")) with the skim function, but the only column that changes to NA is the sex column.
Kiara Sanchez
September 17, 2023
Never mind, all fixed.
Shubhra Murarka
March 1, 2024
Error when running: penguins_data <- read_csv("penguins_data.csv",
Error in base::nchar(wide_chars$test, type = "width") :
  lazy-load database '/Users/shubhra/Library/R/x86_64/4.1/library/cli/R/sysdata.rdb' is corrupt
In addition: Warning messages:
1: In base::nchar(wide_chars$test, type = "width") :
  restarting interrupted promise evaluation
2: In base::nchar(wide_chars$test, type = "width") :
  internal error -3 in R_decompress1
David Keyes Founder
March 1, 2024
Hmm, I'm not quite sure what's going on. One quick question, though: did you just install R/RStudio for this course or had you installed it previously?
Alyssa Jeffers
March 14, 2024
Hi there, when I ran the code to change sex_v2 to character data, col_types showed up as a new data item in my Data Environment. On your demo screen it didn't show up as an additional item after running the code; you only have penguins_data. Should this not have happened? This was my code: penguins_data <- read_csv("penguins_data.csv", na = c("-999", "NA"), col_types = cols(sex_v2 = "c"))
David Keyes Founder
March 14, 2024
Very strange! I've not seen that. Can you record a quick video using this and show me what you're seeing? Please email me after you upload the video so I know to look for it.
David Keyes Founder
March 19, 2024
Ok, I've watched your video and I've got a solution for you! Check out this video. Let me know if this helps!
Alyssa Jeffers
March 20, 2024
Got it, thanks for explaining that! I bet that's what I did.
Sandra Virgo
March 20, 2024
When I run the read_csv() code again to deal with the -999 values, it does not completely work.
Looking at the data using view(penguins_data), I still have some -999 values in case 4, as well as some -999.0 values in case 4.
I have adapted the code to read read_csv("penguins_data.csv", na = "-999, -999.0") to deal with the values with the decimal point and the zero, but even after that these issues remain, including in sex_v2.
I can see that in sex there are some NA values now, which means the code has partially worked, I guess.
Apologies, I don't seem to be able to get a screenshot in here.
Libby Heeren Coach
March 20, 2024
Hi, Sandra! I made a short video going over the process of replacing values with NA in the data. Please take a look at it and let me know if you have any questions! https://muse.ai/v/4i7KhQx
Recap: -999 and -999.0 are the same value, just displayed differently. When using the na argument, you can either use one value inside quotation marks, like "-999", or two values inside the c() function, each in their own set of quotation marks, like this: c("-999", "NA")
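To make the recap concrete, here is a minimal sketch comparing the two ways of writing the na argument. The file contents and the names csv_path, one_string, and two_strings are invented for illustration and are not from the course data.

```r
library(readr)

# Hypothetical stand-in for penguins_data.csv (values invented for this sketch)
csv_path <- tempfile(fileext = ".csv")
writeLines(
  c(
    "species,bill_length_mm,sex",
    "Adelie,39.1,male",
    "Adelie,-999,-999",
    "Gentoo,46.5,NA"
  ),
  csv_path
)

# A single string: readr looks for the literal text "-999, -999.0",
# which never appears in the file, so nothing is treated as missing
one_string <- read_csv(csv_path, na = "-999, -999.0", show_col_types = FALSE)

# Two separate strings inside c(): each one is matched on its own
two_strings <- read_csv(csv_path, na = c("-999", "NA"), show_col_types = FALSE)

# Count the missing values in bill_length_mm under each approach
sum(is.na(one_string$bill_length_mm))
sum(is.na(two_strings$bill_length_mm))
```

With this made-up file, the first count should be 0 and the second 1, since only the c() version tells readr that "-999" on its own means missing.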
Sandra Virgo
March 20, 2024
Thanks so much, Libby. The moment you said that the -999.0 values were the same as the -999 values it got me back on the right track again. I thought I had tried every combination of solutions but I clearly hadn't tried the easiest one. The video was very helpful - thanks so much!
Libby Heeren Coach
March 20, 2024
Woo! That's a win! So glad it helped, and thanks for asking questions! They help everyone :)