Your Turn

Complete the scatterplot sections of the data-visualization-exercises.Rmd file.

Learn More

Scatterplot Resources

Claus Wilke talks about scatterplots in Chapter 12 of his book Fundamentals of Data Visualization. Michael Toth also has a long blog post about all of the ins and outs of making scatterplots in ggplot.

You can also find examples of code to make scatterplots on the Data to Viz website, the R Graph Gallery website, and in Chapter 5 of the R Graphics Cookbook.

Have any questions? Put them below and we will help you out!

Jeff Shandling

April 4, 2021

I'm getting the following error: Error in loadNamespace(name) : there is no package called ‘farver’. Also, when I add the mapping code, the variables are not auto-populating.

David Keyes

April 5, 2021

That's super weird. It seems that the farver package, which I believe is a dependency of ggplot2 (i.e., a package that ggplot2 relies on), didn't get installed. Try installing it with install.packages('farver') and let me know if it works after that!
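
For anyone hitting the same error, here is a minimal sketch of that fix. The helper name missing_packages() is made up for illustration; requireNamespace() and install.packages() are base R functions. The install step is guarded so that it only runs in an interactive session and only when farver cannot be found.

```r
# Hypothetical helper: return the packages in `pkgs` that R cannot find.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

to_install <- missing_packages("farver")

# Only install when something is missing and we are in an interactive session.
if (interactive() && length(to_install) > 0) {
  install.packages(to_install)
}
```

After installing, restarting R and rerunning the plotting code is usually enough to confirm the error is gone.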

Abby Isaacson

April 6, 2021

We may not be there yet, but is it easy to add units to the axis labels?

David Keyes

April 6, 2021

Can you clarify what you mean by adding units to the labels? I'm not sure I quite get it.

Abby Isaacson

April 6, 2021

Sure! I mean simply height (cm) and age (years).

David Keyes

April 6, 2021

Ah, ok! Yes, you use the labs() function for this, which you'll learn about in the plot labels lesson.
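
As a quick preview, a sketch of labs() with units in the axis labels might look like the following. The people data frame and the height_by_age name are made up for illustration; the course itself uses the nhanes data.

```r
library(ggplot2)

# Hypothetical example data (not the course's nhanes data).
people <- data.frame(
  age = c(21, 35, 48, 62),
  height = c(170, 165, 180, 158)
)

# labs() sets the axis titles, so units can simply be typed into the text.
height_by_age <- ggplot(data = people, mapping = aes(x = age, y = height)) +
  geom_point() +
  labs(x = "Age (years)", y = "Height (cm)")

# Run `height_by_age` in the console to draw the plot.
```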

Jimmy Frickey

April 25, 2021

Hi David,

Here are two versions of code that both produce the scatterplot of height vs. weight from the nhanes dataset. The first is from your solutions, and the second follows the R4DS text. Can you briefly comment on why they both "work"? Is one better than the other?

ggplot(data = nhanes, mapping = aes(x = weight, y = height)) + geom_point()

ggplot(data = nhanes) + geom_point(mapping = aes(x = weight, y = height))

David Keyes

April 26, 2021

This is a great question! I made a short video to show the difference between the two approaches. Hope that's helpful!
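
For readers who cannot watch the video, here is a rough sketch of the general ggplot2 behavior behind the two versions (it is not a transcript of the video). A mapping given in ggplot() is inherited by every layer, while a mapping given inside geom_point() applies to that layer only. With a single layer the two plots look the same; the difference shows up once a second layer is added. The nhanes_small data frame is a made-up stand-in for the course data, and geom_smooth() is used only as an example of a second layer.

```r
library(ggplot2)

# Hypothetical stand-in for the nhanes data used in the course.
nhanes_small <- data.frame(
  weight = c(55, 68, 80, 92),
  height = c(160, 168, 175, 183)
)

# Mapping set in ggplot(): every layer added afterwards inherits it.
global_mapping <- ggplot(data = nhanes_small, mapping = aes(x = weight, y = height)) +
  geom_point() +
  geom_smooth(method = "lm")

# Mapping set in geom_point(): it applies to that layer only,
# so any further layer needs its own mapping.
local_mapping <- ggplot(data = nhanes_small) +
  geom_point(mapping = aes(x = weight, y = height)) +
  geom_smooth(mapping = aes(x = weight, y = height), method = "lm")
```

Neither style is wrong. Putting the mapping in ggplot() saves repetition when all layers share the same aesthetics, and putting it in the geom is handy when layers need different ones.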

Tatiana Bustos

July 28, 2022

Great explanation!

Juan Clavijo

October 11, 2021

You mentioned that ggplot will automatically remove observations with missing data. If I'm plotting average test scores for mid-term and final exams, for example, and one student took the final but did not take the mid-term, will ggplot remove that student's data from the graph completely, or will it just plot the final exam and omit the mid-term score that does not exist?
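
One way to explore this question is with a small made-up example. The sketch below relies on the fact that geom_point() drops (with a warning) any row where a mapped x or y value is missing, so the result depends on how the data is shaped. All data and object names here are hypothetical.

```r
library(ggplot2)

# Hypothetical scores; student "C" missed the mid-term.
scores_wide <- data.frame(
  student = c("A", "B", "C"),
  midterm = c(78, 85, NA),
  final = c(82, 90, 88)
)

# Wide data, one point per student: a point needs both x and y,
# so a student with a missing mid-term cannot be drawn at all.
midterm_vs_final <- ggplot(scores_wide, aes(x = midterm, y = final)) +
  geom_point()
drawable_wide <- sum(complete.cases(scores_wide[, c("midterm", "final")]))

# Long data, one row per student per exam: only the row with the
# missing score is dropped, and the student's final score is still drawn.
scores_long <- data.frame(
  student = rep(scores_wide$student, times = 2),
  exam = rep(c("midterm", "final"), each = 3),
  score = c(scores_wide$midterm, scores_wide$final)
)
score_by_exam <- ggplot(scores_long, aes(x = exam, y = score)) +
  geom_point()
drawable_long <- sum(!is.na(scores_long$score))
```

Here drawable_wide and drawable_long count the rows that have everything a point needs in each layout, which is a simple way to check how many points to expect before plotting.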

Esther Okoye

April 5, 2022

Hello, I can't find the data visualization exercise in my RStudio. Do I have to do anything?

Esther Okoye

April 5, 2022

Hey, thank you, I have figured it out.

Charlie Hadley

April 5, 2022

Hi Esther,

Could you follow the instructions in this video and tell me what you see?

Thanks, Charlie

Acarilia Eduardo

April 18, 2022

Hi Charlie,

Out of curiosity, I clicked on the link to the video above and got a 404 error message.

Charlie Hadley

April 19, 2022

Hi Acarilia, I've changed the settings for the video so you can see it too.

Elijah Phillips

October 13, 2022

Where do we get the .rmd file for this?

Charlie Hadley

October 13, 2022

Hi Elijah! The .Rmd is from this GitHub repository: https://github.com/rfortherestofus/fundamentals. You can download the project onto your machine with this link: https://github.com/rfortherestofus/fundamentals/archive/refs/heads/master.zip

Cheers, Charlie

Ellen Wilson

October 31, 2022

It seems like the clean_names function didn't work for me. When I start typing the code for the scatterplot, it isn't suggesting the variable names. This is what I put for clean_names:

nhanes %
  clean_names()

And then I got this message (which looks different from what you got):

Rows: 10000 Columns: 22
── Column specification ──────────────────────────────────────────────────────────────────────────────
Delimiter: ","
chr (13): SurveyYr, Gender, AgeDecade, Race1, Education, MaritalStatus, HHIncome, HomeOwn, Work, H...
dbl (9): ID, Age, Weight, Height, BMI, DaysPhysHlthBad, DaysMentHlthBad, SleepHrsNight, PhysActiv...
ℹ Use spec() to retrieve the full column specification for this data.
ℹ Specify the column types or set show_col_types = FALSE to quiet this message.

Charlie Hadley

November 1, 2022

Hello Ellen,

It looks like you've mistyped the pipe. In the code you provided you have:

nhanes %
  clean_names()

Whereas the pipe should be written as %>%, i.e.

nhanes %>%
  clean_names()

Please also note that you will need to make an assignment if you want the effect of clean_names() to be applied to the object. When we run code like this, the only outcome is that the output is printed to the console. Assignments are how we change objects.

Thanks,

Charlie
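
To make the point about assignment concrete, here is a small sketch. It assumes the dplyr and janitor packages are installed, and it uses a made-up two-column tibble as a stand-in for the imported nhanes data.

```r
library(dplyr)
library(janitor)

# Hypothetical stand-in for the imported nhanes data.
nhanes <- tibble(Weight = c(70, 82), Height = c(170, 181))

# Without assignment: the cleaned version is printed to the console,
# but nhanes itself still has the original column names.
nhanes %>%
  clean_names()
names_before <- names(nhanes)

# With assignment: nhanes is overwritten with the cleaned version,
# so the lower-case names are what autocomplete will suggest.
nhanes <- nhanes %>%
  clean_names()
names_after <- names(nhanes)
```

Comparing names_before and names_after shows that only the assigned version changes the object.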

Oscar Tetteh

March 27, 2023

Could you please email me the nhanes dataset? This is my email: bismarktetteh25@gmail.com