# Crosstabs
Sometimes you want your results in a crosstab. We can use the `tabyl` function in the `janitor` package to make crosstabs automatically.
Create a crosstab of `gender` and `health_gen`.
```{r}
nhanes %>%
  tabyl(gender, health_gen)
```
Add a `drop_na` line before your line with `tabyl` to drop rows with missing values (`NA`) in either variable.
```{r}
nhanes %>%
  drop_na(gender, health_gen) %>%
  tabyl(gender, health_gen)
```
`janitor` has a set of functions, all starting with `adorn_`, that add things such as totals and percentages to our crosstabs. We call them after `tabyl`. One example is `adorn_totals`.
Use the code above and then add totals using `adorn_totals` in the rows and columns.
```{r}
nhanes %>%
  drop_na(gender, health_gen) %>%
  tabyl(gender, health_gen) %>%
  adorn_totals(where = c("row", "col"))
```
We can use `adorn_percentages` to convert the counts to percentages. By default, these are row percentages.
Use the code above and then add percentages using `adorn_percentages`.
```{r}
nhanes %>%
  drop_na(gender, health_gen) %>%
  tabyl(gender, health_gen) %>%
  adorn_totals(where = c("row", "col")) %>%
  adorn_percentages()
```
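If you want column percentages instead, `adorn_percentages` takes a `denominator` argument. Here is a minimal sketch. Because `nhanes` is not a built-in dataset, it assumes the built-in `mtcars` data as a stand-in, and the name `col_pct` is only illustrative.

```{r}
library(dplyr)    # provides the %>% pipe
library(janitor)

# Assumption: mtcars stands in for nhanes so the chunk runs on its own
col_pct <- mtcars %>%
  tabyl(cyl, gear) %>%
  adorn_percentages(denominator = "col")

# Each gear column now sums to 1 (proportions, not yet formatted)
col_pct
```

Use `denominator = "all"` if you want each cell as a share of the whole table.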
We can then format these percentages using `adorn_pct_formatting`.
Use the code above and then format the percentages using `adorn_pct_formatting`. Add arguments so that the percentages are rounded to 1 digit. Note that R uses the "half to even" rounding method by default, so if you want to round, say, 14.5 up to 15, you must use the `rounding` argument (type `?adorn_pct_formatting` in the console to learn more).
```{r}
nhanes %>%
  drop_na(gender, health_gen) %>%
  tabyl(gender, health_gen) %>%
  adorn_totals(where = c("row", "col")) %>%
  adorn_percentages() %>%
  adorn_pct_formatting(digits = 1,
                       rounding = "half up")
```
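To see the difference between the two rounding methods, here is a small base R sketch. The helper `half_up` is hypothetical and written only for illustration. It is not how `janitor` implements `rounding = "half up"`, and it ignores floating-point edge cases.

```{r}
# Base R rounds halves to the nearest even number ("half to even")
round(14.5)  # 14
round(15.5)  # 16

# Hypothetical helper: round halves away from zero instead
half_up <- function(x, digits = 0) {
  scale <- 10^digits
  sign(x) * floor(abs(x) * scale + 0.5) / scale
}

half_up(14.5)  # 15
half_up(15.5)  # 16
```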
If we want to include the n alongside percentages, we can use `adorn_ns`.
Use the code above and then add a line with `adorn_ns` to include the n.
```{r}
nhanes %>%
  drop_na(gender, health_gen) %>%
  tabyl(gender, health_gen) %>%
  adorn_totals(c("row", "col")) %>%
  adorn_percentages() %>%
  adorn_pct_formatting(digits = 1,
                       rounding = "half up") %>%
  adorn_ns()
```
We can add titles to our crosstabs using `adorn_title`.
Use the code above and then add a title using `adorn_title`. Use the `placement` argument and see what you get.
```{r}
nhanes %>%
  drop_na(gender, health_gen) %>%
  tabyl(gender, health_gen) %>%
  adorn_totals(c("row", "col")) %>%
  adorn_percentages() %>%
  adorn_pct_formatting(digits = 0,
                       rounding = "half up") %>%
  adorn_ns() %>%
  adorn_title(placement = "combined")
```
We can also make three-way crosstabs automatically by adding a third variable to the `tabyl` function.
Use the code above, but add a third variable (`age_decade`) to the line with `drop_na` and the line with `tabyl`. You should get a series of crosstabs.
```{r}
nhanes %>%
  drop_na(gender, health_gen, age_decade) %>%
  tabyl(gender, health_gen, age_decade) %>%
  adorn_totals(c("row", "col")) %>%
  adorn_percentages() %>%
  adorn_pct_formatting(digits = 0, rounding = "half up") %>%
  adorn_ns() %>%
  adorn_title()
```
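A three-way `tabyl` returns a list of two-way crosstabs, one for each value of the third variable. That is likely why the printed output looks different from a single crosstab: R prints each list element under its name, prefixed with `$`. The sketch below assumes the built-in `mtcars` data as a stand-in for `nhanes`, and the name `three_way` is only illustrative.

```{r}
library(dplyr)    # provides the %>% pipe
library(janitor)

# Assumption: mtcars stands in for nhanes so the chunk runs on its own
three_way <- mtcars %>%
  tabyl(cyl, gear, am)

# The result is a list with one crosstab per value of am
names(three_way)

# Pull out a single crosstab by name to work with it like any two-way tabyl
three_way[["1"]]
```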
Daniel Sossa
March 14, 2021
I noticed that the rendered table you get when knitting after adding a third variable is not the same you get when doing only two (it is only text, not the nice table format). Is there a way to get the same type of table rendered?
David Keyes
March 15, 2021
Alright, so this is not my strongest area, but I tried to explain this a bit in this video. Hope it helps!
Vuk Sekicki
March 29, 2021
This video does not have sound. Tnx
David Keyes
March 29, 2021
Oops, sorry! I'll re-record later today.
David Keyes
March 29, 2021
Ok, the video linked now from the comment above is updated. Hope it helps!
Abby Isaacson
March 30, 2021
I had this same question, thanks for addressing it (even though it makes me want to avoid that for now!).
Jyoni Shuler
March 26, 2021
To solve the error message and remove the N/As, would we use the 'drop_na' function then?
David Keyes
March 28, 2021
Yup, that works! You can run that before doing the crosstab, which will get rid of any missing values and thus should get rid of the error message.
Tatiana Bustos
July 28, 2022
For the last coding solution - what do the $ indicate next to the age_decade? Will this show up in the table in the final report (once we convert to word or other)? Is there a way to remove that and just show the actual age_decade labels?
David Keyes
July 28, 2022
I'm not sure exactly why there is a $, but it will show up in your Word document when you knit. To be honest, I don't use the `tabyl()` functions for final outputs that I would share with others, just for exploring the data on my own when starting out. As a result, I don't care that much about having the ugly $.
Emma Spielfogel
October 5, 2022
Is there a way to show both column & row percentages using tabyl? Or is there another crosstab function that could do this? (I'm thinking similar to a proc freq in SAS, which has frequency, percent, column percent and row percent for each cell)
David Keyes
October 5, 2022
I believe the adorn_percentages() function should do this, unless I'm missing something about your question.
Emma Spielfogel
October 5, 2022
Hi David! I figured it out, I just had to mess with the denominator to get what I was looking for: `adorn_percentages(denominator = "col")`
Thanks!
David Keyes
October 6, 2022
Great!