Fundamentals of R



View code shown in video
# Load Packages -----------------------------------------------------------

# read_csv() and ggplot() are provided by the tidyverse
library(tidyverse)

# Import Data -------------------------------------------------------------

penguins <- read_csv("penguins.csv")

# Histograms --------------------------------------------------------------

# We use geom_histogram() to make a histogram.

ggplot(data = penguins,
       mapping = aes(x = bill_length_mm)) +
  geom_histogram()

# How does ggplot know what to plot on the y axis? 
# It's using the default statistical transformation for geom_histogram, 
# which is stat = "bin".

# If we add stat = "bin" we get the same thing. 
# Each geom has a default stat.

ggplot(data = penguins,
       mapping = aes(x = bill_length_mm)) +
  geom_histogram(stat = "bin")

# We can adjust the number of bins using the bins argument. 

ggplot(data = penguins,
       mapping = aes(x = bill_length_mm)) +
  geom_histogram(bins = 100)

Your Turn

# Histograms --------------------------------------------------------------

# Make a histogram that shows the distribution of the body_mass_g variable.


# Adjust your histogram so it has 50 bins.


Learn More

You can find examples of code to make histograms on the Data to Viz website, the R Graph Gallery website, in Chapter 6 of the R Graphics Cookbook, and in Chapter 7 of Fundamentals of Data Visualization.

To learn more about statistical transformations, see the discussion in Chapter 9 of R for Data Science.
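To see what stat = "bin" computes behind the scenes, one option is to inspect the data that ggplot2 builds for the histogram layer. The sketch below uses a small made-up data frame (toy is a hypothetical stand-in for the penguins data, and its values are invented) together with layer_data() from ggplot2.

```r
library(ggplot2)

# Hypothetical stand-in for the penguins data (values are made up)
toy <- data.frame(
  bill_length_mm = c(36.2, 39.1, 40.3, 41.1, 45.2, 46.5, 47.8, 50.0)
)

p <- ggplot(data = toy,
            mapping = aes(x = bill_length_mm)) +
  geom_histogram(bins = 4)

# layer_data() returns the data frame computed by the layer's stat.
# For stat = "bin", each row is one bin: its edges and how many
# observations fell inside it.
binned <- layer_data(p, 1)
binned[, c("xmin", "xmax", "count")]
```

The count column is what ends up on the y axis, which is why the histogram needs only an x aesthetic.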

Have any questions? Put them below and we will help you out!

gene trevino • January 20, 2025

geom_histogram() does not work. It only works like this: + geom_histogram(stat = "count")

Why?

David Keyes Founder • January 21, 2025

I'm really not sure. It's always worked for me without the stat = "count". In fact, adding stat = "count" will yield something more like a bar chart, because it counts how many times each unique value occurs rather than putting observations into bins like geom_histogram() does by default.

gene trevino • January 21, 2025

I realized that the body_mass_g variable is a character variable. I changed it to a numeric variable and the histogram worked. I still don't know why it won't work without stat = "count".

Gene
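A likely explanation for the behavior described in this thread: the default stat = "bin" has to cut a continuous range of numbers into intervals, so it needs a numeric x variable. A character column is treated as discrete, which leaves nothing to bin, whereas stat = "count" only tallies each distinct value and so accepts it. The sketch below uses a made-up data frame (toy and its values are hypothetical) to show the conversion that makes the default histogram work.

```r
library(ggplot2)

# Hypothetical data: body mass stored as text, as can happen on import
# when a column contains non-numeric entries (values are made up)
toy <- data.frame(
  body_mass_g = c("3750", "3800", "3250", "4675", "5400"),
  stringsAsFactors = FALSE
)

class(toy$body_mass_g)

# Convert the column to numeric so it can be binned
toy$body_mass_g <- as.numeric(toy$body_mass_g)

p <- ggplot(data = toy,
            mapping = aes(x = body_mass_g)) +
  geom_histogram(bins = 5)

# Counts per bin computed by the default stat = "bin"
counts <- layer_data(p, 1)$count
```

If the real column contains entries that are not numbers, as.numeric() turns them into NA with a warning, so it is worth checking for those before plotting.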