Get Data Into the Right Format to Create your First Table

This lesson is called Get Data Into the Right Format to Create your First Table, part of the Making Beautiful Tables with R course. This lesson is called Get Data Into the Right Format to Create your First Table, part of the Making Beautiful Tables with R course.

Transcript

Click on the transcript to go to that point in the video. Please note that transcripts are auto generated and may contain minor inaccuracies.

This lesson is about getting the data to the right format before we can even begin to build the table. Regardless of what package you use to create the table, it will just take your data frame and put it into a table as it is. Thus, you need to rearrange that data in your data frame before even calling, say, flextable().

The most powerful function to get this job done are pivot_wider() and pivot_longer(), they are able to transition the data frame from long format to white format and vice versa.

Your Turn

In this exercise, we want to rearrange the data in the data set gapminder from the gapminder package. This data set contains life expectancies for different countries all over the world from different points in time. We want to consider only a couple of years and a couple of countries from this data set. Take the full gapminder data set, i.e. use:

gapminder::gapminder |> 
  janitor::clean_names() # optional but makes nicer column names

You should get this:

## # A tibble: 1,704 × 6
##    country     continent  year life_exp      pop gdp_percap
##    <fct>       <fct>     <int>    <dbl>    <int>      <dbl>
##  1 Afghanistan Asia       1952     28.8  8425333       779.
##  2 Afghanistan Asia       1957     30.3  9240934       821.
##  3 Afghanistan Asia       1962     32.0 10267083       853.
##  4 Afghanistan Asia       1967     34.0 11537966       836.
##  5 Afghanistan Asia       1972     36.1 13079460       740.
##  6 Afghanistan Asia       1977     38.4 14880372       786.
##  7 Afghanistan Asia       1982     39.9 12881816       978.
##  8 Afghanistan Asia       1987     40.8 13867957       852.
##  9 Afghanistan Asia       1992     41.7 16317921       649.
## 10 Afghanistan Asia       1997     41.8 22227415       635.
## # ℹ 1,694 more rows

From there, extract the correct data and rearrange it so that you get a data set like this:

# A tibble: 10 × 8
   continent country       year1957 year1967 year1977 year1987 year1997 year2007
   <fct>     <chr>            <dbl>    <dbl>    <dbl>    <dbl>    <dbl>    <dbl>
 1 Africa    Egypt             44.4     49.3     53.3     59.8     67.2     71.3
 2 Africa    Sierra Leone      31.6     34.1     36.8     40.0     39.9     42.6
 3 Americas  Jamaica           62.6     67.5     70.1     71.8     72.3     72.6
 4 Americas  Nicaragua         45.4     51.9     57.5     62.0     68.4     72.9
 5 Asia      Singapore         63.2     67.9     70.8     73.6     77.2     80.0
 6 Asia      Syria             48.3     53.7     61.2     67.0     71.5     74.1
 7 Europe    Netherlands       73.0     73.8     75.2     76.8     78.0     79.8
 8 Europe    United Kingd…     70.4     71.4     72.8     75.0     77.2     79.4
 9 Oceania   Australia         70.3     71.1     73.5     76.3     78.8     81.2
10 Oceania   New Zealand       70.3     71.5     72.2     74.3     77.6     80.2

Once you’ve done that you can create your first basic table with flextable(). Hint: In this exercise, you can use this vector as a helpful tool (so that you don’t have to type out the countries).

selected_countries_vector <- c(
  'Egypt', 'Sierra Leone', 'Nicaragua', 'Jamaica', 'Syria', 'Singapore',
  'Netherlands', 'United Kingdom', 'New Zealand', 'Australia'
)

Have any questions? Put them below and we will help you out!

You need to be signed-in to comment on this post. Login.

Alberto Cabrera • May 17, 2024

To save time, I saved the 'penguins_wider' df as an RDS file. Unfortunately, it had the effect of replacing the white spaces, standing for missing values, with zeroes. What mutate routine could I use to restore the original spaces. Thanks.

David Keyes Founder • May 20, 2024

Could you please share the full code you used so I can run it myself?

Alberto Cabrera • May 20, 2024

Hola David.

I am surprised and much impressed by your reply. I was expecting to hear from Albert. My question deals with the Add Summaries Lesson. I was exploring an alternative approach to the one followed by Albert in merging the data files containing the summary statistics to be joined with a wide version of the penguins dataset (see lines 272 thru ). After binding together the files containing the total and maximum and total summaries across species (see lines 475-500), I noticed the resulting file had lots of NAs. I studied Albert's book on using gt(), and assumed I had to recode the NAs before rendering the table. It turns out I did not have to do recode the missing values. Unlike gt(), flextable codes as spaces. I am sending you the Quarto file over email.