Making Beautiful Tables with R
Get Data Into the Right Format to Create your First Table
This lesson is called Get Data Into the Right Format to Create your First Table, part of the Making Beautiful Tables with R course. This lesson is called Get Data Into the Right Format to Create your First Table, part of the Making Beautiful Tables with R course.
Transcript
Click on the transcript to go to that point in the video. Please note that transcripts are auto generated and may contain minor inaccuracies.
This lesson is about getting the data to the right format before we can even begin to build the table. Regardless of what package you use to create the table, it will just take your data frame and put it into a table as it is. Thus, you need to rearrange that data in your data frame before even calling, say, flextable()
.
The most powerful function to get this job done are pivot_wider()
and pivot_longer()
, they are able to transition the data frame from long format to white format and vice versa.
Your Turn
In this exercise, we want to rearrange the data in the data set gapminder
from the gapminder
package. This data set contains life expectancies for different countries all over the world from different points in time. We want to consider only a couple of years and a couple of countries from this data set. Take the full gapminder
data set, i.e. use:
gapminder::gapminder |>
janitor::clean_names() # optional but makes nicer column names
You should get this:
## # A tibble: 1,704 × 6
## country continent year life_exp pop gdp_percap
## <fct> <fct> <int> <dbl> <int> <dbl>
## 1 Afghanistan Asia 1952 28.8 8425333 779.
## 2 Afghanistan Asia 1957 30.3 9240934 821.
## 3 Afghanistan Asia 1962 32.0 10267083 853.
## 4 Afghanistan Asia 1967 34.0 11537966 836.
## 5 Afghanistan Asia 1972 36.1 13079460 740.
## 6 Afghanistan Asia 1977 38.4 14880372 786.
## 7 Afghanistan Asia 1982 39.9 12881816 978.
## 8 Afghanistan Asia 1987 40.8 13867957 852.
## 9 Afghanistan Asia 1992 41.7 16317921 649.
## 10 Afghanistan Asia 1997 41.8 22227415 635.
## # ℹ 1,694 more rows
From there, extract the correct data and rearrange it so that you get a data set like this:
# A tibble: 10 × 8
continent country year1957 year1967 year1977 year1987 year1997 year2007
<fct> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 Africa Egypt 44.4 49.3 53.3 59.8 67.2 71.3
2 Africa Sierra Leone 31.6 34.1 36.8 40.0 39.9 42.6
3 Americas Jamaica 62.6 67.5 70.1 71.8 72.3 72.6
4 Americas Nicaragua 45.4 51.9 57.5 62.0 68.4 72.9
5 Asia Singapore 63.2 67.9 70.8 73.6 77.2 80.0
6 Asia Syria 48.3 53.7 61.2 67.0 71.5 74.1
7 Europe Netherlands 73.0 73.8 75.2 76.8 78.0 79.8
8 Europe United Kingd… 70.4 71.4 72.8 75.0 77.2 79.4
9 Oceania Australia 70.3 71.1 73.5 76.3 78.8 81.2
10 Oceania New Zealand 70.3 71.5 72.2 74.3 77.6 80.2
Once you’ve done that you can create your first basic table with flextable()
. Hint: In this exercise, you can use this vector as a helpful tool (so that you don’t have to type out the countries).
selected_countries_vector <- c(
'Egypt', 'Sierra Leone', 'Nicaragua', 'Jamaica', 'Syria', 'Singapore',
'Netherlands', 'United Kingdom', 'New Zealand', 'Australia'
)
Have any questions? Put them below and we will help you out!
Course Content
16 Lessons
You need to be signed-in to comment on this post. Login.
Alberto Cabrera • May 17, 2024
To save time, I saved the 'penguins_wider' df as an RDS file. Unfortunately, it had the effect of replacing the
white
spaces, standing for missing values, with zeroes. What mutate routine could I use to restore the original spaces. Thanks.David Keyes Founder • May 20, 2024
Could you please share the full code you used so I can run it myself?
Alberto Cabrera • May 20, 2024
Hola David.
I am surprised and much impressed by your reply. I was expecting to hear from Albert. My question deals with the Add Summaries Lesson. I was exploring an alternative approach to the one followed by Albert in merging the data files containing the summary statistics to be joined with a wide version of the penguins dataset (see lines 272 thru ). After binding together the files containing the total and maximum and total summaries across species (see lines 475-500), I noticed the resulting file had lots of NAs. I studied Albert's book on using gt(), and assumed I had to recode the NAs before rendering the table. It turns out I did not have to do recode the missing values. Unlike gt(), flextable codes as spaces. I am sending you the Quarto file over email.