R for the Rest of Us Podcast Episode 15: Nicola Rennie
In this episode, I talk with Nicola Rennie about making data viz for mobile devices. Nicola is a lecturer in health data science based within the Center for Health Informatics, Computing, and Statistics at Lancaster University in the UK.
She recounts her initial encounter with R and how she got deeper into data visualization in R as a means of creative expression. Amidst the plethora of programming languages available, Nicola sheds light on why she chose R specifically for data visualization. Additionally, she offers valuable advice for people wanting to get started with data visualization using R.
Connect with Nicola on her website, X, LinkedIn, Mastodon, and ORCID.
Learn More
If you want to receive emails when we publish new podcast episodes, sign up for the R for the Rest of Us newsletter. And if you're ready to learn R, check out our courses.
Audio Version
Video Version
In the video version, Nicola gives a code walkthrough of how to make data viz for mobile devices.
Resources Discussed
Transcript
[00:00:25] David Keyes: Well, I'm delighted to be joined today by Nicola Rennie. Nicola is a lecturer in health data science based within the Center for Health Informatics, Computing, and Statistics at Lancaster University in the UK. Her research interests include applications of statistics and machine learning to health care,
[00:00:40] communicating data through visualization, and understanding how we teach statistical concepts. Nicola also has experience in data science consultancy, collaborates closely with external research partners, and is co-author of the Royal Statistical Society's Best Practices for Data Visualization Guide.
[00:00:58] She can often be found at data science meetups and presenting at conferences, and is the R-Ladies Lancaster chapter organizer. Nicola, welcome, and thanks for joining.
[00:01:07] Nicola Rennie: Thank you very much for having me, I'm very happy to be here.
[00:01:11] David Keyes: Great. Um, well, we learned a little bit just now about, um, kind of what you do now. And I want to come back to that in a second, but I'm curious kind of how you got into R in the first place.
[00:01:21] Nicola Rennie: So I think the first time I ever used R was about nine or ten years ago in an undergraduate statistics class and I hated it so much. I'd never done any sort of programming before and I very much felt like I was sort of thrown in to the deep end of it and I didn't, it just didn't click at all. And I think sort of for the rest of my degree program, I actually mainly used Python before I finally came back to R, maybe about sort of six years or so ago.
[00:01:53] And I think that was more when I was working with data, and I found the sort of data wrangling processes in R a little bit easier. And that was sort of during my PhD program. I did my PhD in partnership with industry, and I actually saw people using R in industry, sort of, this is how we actually do things in the real world with R, and it all just sort of started to click a little bit there.
[00:02:19] So I think, yeah, originally nine or ten years ago but actually in practice about six years ago was probably when I started properly using R.
[00:02:28] David Keyes: That's interesting because you hear so much, you know, especially as Posit, the company, has kind of made the shift into Python, you hear a lot about people going the opposite direction. But it sounds like you actually had worked with Python first but then moved into R because you felt it was better suited to working with data. It's just a contrast to what you hear so much these days. I wonder if you could give me just kind of an overview of what your day-to-day use of R looks like. I know you do a bunch of things, from, you know, teaching to data visualization, which we'll talk about in a bit.
[00:03:06] But yeah, on a day to day basis, what does your, your usage of R look like?
[00:03:11] Nicola Rennie: So I'm a lecturer in health data science, so in terms of the sort of data processing side of it, you know, I'm doing stuff like
[00:03:17] processing and wrangling health-related data. Um, a lot of that comes with applying statistical models, um, some machine learning as well, and then things like making data visualizations for papers or reports or presentations. I think what I use R the most for at the moment is actually developing teaching materials in combination with R Markdown or Quarto. So I make things like lecture slides, lecture notes, course websites, and simulated data for teaching as well. Because I work with health data most of the time, you can't actually share that with students, so you have to sort of simulate data, and R is really helpful for that. The other thing I really like about using R for this is that I can make parameterized reports, um, in Quarto. So I make parameterized tutorials for students, so I can make a version for students that doesn't have the answers in it, and then I can make a version for myself that has the full solutions in R as well.
[00:04:17] Um, and it means I know that the code that I'm giving to students actually works as well when you're sort of making live documents every week, um, which is really nice. And then more...
[00:04:27] David Keyes: Can I ask you about that? Sorry to interrupt you, but I'm very curious about that. So you make, say, a Quarto document, and I assume you put all the code in that has the, um, solutions. And then the parameterization of it involves rendering a version where you, like, set echo equals false or something like that so that it doesn't show the code.
[00:04:52] Is that accurate?
[00:04:54] Nicola Rennie: Uh, yes, pretty much. Um, so I have, uh, a parameter that's like, hide or show answers, and then I actually use, uh, sort of conditional displays in Quarto, so you can sort of say, you know, if the parameter is true, then show this content or hide this content. Um, because that means that, rather than using echo, I can also put sentences and sort of explanations for why I'm using code in the same sort of hidden or shown section.
[00:05:24] Uh, yeah,
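The two-version workflow described here might be scripted from R roughly as follows. This is a minimal sketch under stated assumptions: `tutorial.qmd`, the `show_answers` parameter name, and both helper functions are hypothetical, not Nicola's actual setup. It also swaps in a plain knitr technique (a helper called from a chunk with `results: asis`) in place of the Quarto conditional content she mentions.

```r
# Hypothetical helper: print solution text only when the parameter is TRUE.
# Intended for a chunk with `#| results: asis` inside the tutorial document,
# where it would be called as show_answer("...", params$show_answers).
show_answer <- function(text, show) {
  if (isTRUE(show)) cat(text, "\n")
  invisible(isTRUE(show))
}

# Render one version of the tutorial. The input file name and the output
# file names are assumptions; calling this needs the quarto R package and
# a Quarto installation.
render_tutorial <- function(input = "tutorial.qmd", show_answers = FALSE) {
  suffix <- if (show_answers) "solutions" else "student"
  quarto::quarto_render(
    input = input,
    output_file = paste0("tutorial-", suffix, ".html"),
    execute_params = list(show_answers = show_answers)
  )
}

# Example of the helper on its own (the answer text is made up):
show_answer("**Answer:** use the median because the data are skewed.", show = TRUE)
```

Calling `render_tutorial(show_answers = FALSE)` and then `render_tutorial(show_answers = TRUE)` would produce the student and solutions versions from the same source, which is what keeps the code shown to students in sync with code that has actually run.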
[00:05:25] David Keyes: Oh, that's really interesting. I mean, I've heard about parameterization. I mean, we do, we do a lot of parameterized reporting, but never in that context. That's, that's a fantastic use case that I never would have thought of. So, um, but I interrupted you. You were talking about other things that you do, uh, with R on a daily basis.
[00:05:42] I'll let you continue with that list.
[00:05:44] Nicola Rennie: So the other thing I
[00:05:45] did, which you kind of mentioned, is sort of data visualization. Um, some of that is mainly for fun and for hobbies, um, but the other part of it is, sort of, with the Royal Statistical Society's best practices for data visualisation guidance. That is also all built in Quarto, and most of the examples in there are built with R as well.
[00:06:05] And part of that was sort of developing R packages so that people who are writing, um, for publications don't have to know all the ins and outs of R code to make their plots look a particular way. You know, start building those helper functions so that people can implement the styling that we're asking them to as easily as possible.
[00:06:26] David Keyes: I mean, you're clearly very into data visualization overall. I'm curious kind of where that interest has come from.
[00:06:34] Nicola Rennie: So I think for me, at first glance, most people will probably see the sort of PhD in statistics and assume that because I work with data a lot, I've probably been making histograms and whatnot for a long time. But I think that's actually, um, not really how I got into data visualization. Um, so actually, I think at the time when I started working more on visualization, I was looking for more of a creative outlet.
[00:07:00] So if you go all the way back to high school, my favorite subjects were art and music, not maths and computer science. Um, so I kind of got to a point where, you know, I started doing a lot of statistics and a lot of data analysis, and I wanted a more creative outlet, but I also wanted to get better at programming.
[00:07:18] So I was looking for something that sort of satisfied both of those things, where I could do something reasonably creative or artistic and sort of design focused, but also learn how to use R a little bit more in a way that wasn't just working with the same data that I was working with every single day.
[00:07:34] Because I felt like when I was working with R, you know, you see the same data set all the time. I know exactly how to process this, but it's sort of, well, if you throw me some other data that I've never seen before, how do I process that and find something in it? So it was a combination of those two things, I think, that actually got me properly into visualization.
[00:07:54] David Keyes: Yeah, that makes a lot of sense. And I think, you know, you mentioned before people assume it came from doing your PhD in statistics, but I think a lot of people kind of naively assume that people who are good at statistics are also good at data visualization, and those are two very, very different skill sets.
[00:08:12] So, um, I've always been struck that you're, I don't know if I want to say rare, but, you know, the relatively rare person who is able to do both of those things well. I don't know your work on the statistics side, but on the data viz side, you do really nice work, which is unusual. So, um, I'm curious, like, why do you think R, you know, why do you use R?
[00:08:38] I mean, you mentioned before, for example, you came, um, to R from Python. So what makes R well suited to doing the type of high quality data visualization that you like to do? And maybe if you want to even contrast it a bit with Python. I don't do Python myself, but, you know, there's a lot of discussion of it.
[00:08:56] I'm curious why you choose to do that work in R.
[00:09:00] Nicola Rennie: I think there's, there's sort of two parts to, to this question. The first part in my head is why would you choose to use a programming language to make visualizations at all? You know, why would you choose to write code when you could do something that's more click and drag? And I think, you know, if you're working in industry or you're working for a company, quite a lot of the time you are making similar plots sort of every month or every year.
[00:09:23] And if you're using a programming language, then once you've made it once, you've essentially made it every time you might possibly need it. So it does sort of stop the repetitive elements of visualization. And I think if you're sort of more on the academic side, then it is just more reproducible as well.
[00:09:40] You know, when you come back to work six months later, you can rerun it without having to remember exactly what you did. And then, like you say, there are choices of programming languages. It's not necessarily default to R. And I think for me, there's a couple of reasons why I prefer R. Some of them are just the sort of wide range of packages that exist.
[00:10:05] So there's, initially when I started out in R, you know, you'd get maybe like 90 percent of the way into data visualization and then there'd be the sort of final touches you wanted to make of sort of adding annotations or sort of logos or that kind of thing. And it always initially felt, okay, I might just have to edit this outside of R for a little bit.
[00:10:23] But I think in the last sort of few years, those extra packages that help you get that last 10 percent have really come a long way. And now I can pretty much make my entire visualizations in R because of those extra packages. Um, when I've tried to do the same thing in other languages, it's always sort of felt like sort of hacky solutions to try and get exactly what I wanted within a programming language.
[00:10:47] Um, but within R there's such a wide range of packages, and I think the community support as well for data visualization in R is really broad. There's a lot of people doing, um, really interesting things. It's also very easy to sort of build your own styling packages. So, you know, if you do want to implement the same styling for every single plot you do, then you can make a package and it sort of adds a couple of lines to your code and it does it all for you. It also does interface with other languages or other formats reasonably well, so there will be some people that, no matter how nice your plot is in R, still want it in an Excel spreadsheet or still want some sort of data, and it's reasonably easy to export those into other formats as well.
[00:11:31] So I think it does sort of give you that combination of reproducibility and overall sort of customizability with the option to put it into different formats.
David Keyes: Yeah, that's great. I'm curious, you talked about, you know, the packages that you use for that kind of last 10%. Are there particular packages that you would highlight that have made it possible for you to do everything in R?
[00:11:54] Nicola Rennie: So I think for me, my sort of favorite add-on package is probably ggtext. Um, so for working with text in R and ggplot2, it is really nice because it does those sort of small things that I wanted to do really easily, so things like automatically wrapping your text so it fits in the area of the plot, rather than having to manually add line breaks to plots.
[00:12:19] Things like sort of custom annotations with different colored text. Um, you know, if you want to replace a legend with some colored text in a paragraph, again, it makes it really easy. And you can add sort of HTML and Markdown, uh, text to comments and it just seems to work. Um, so that's been really nice for doing even stuff like adding icons, um, into plots. Because it processes HTML code into, uh, ggplot2, you can do that very quickly and easily without actually having to think about sort of icons or images or things like that.
David Keyes: Yeah, that's great. So I'm curious, what do you recommend, you know, if someone is thinking, hey, I've seen the type of data viz that you do or other people do in R, and it looks really great, and I'd like to do that. What do you recommend for someone who wants to get into making that type of data viz in R?
[00:13:11] Nicola Rennie: So I think if you've never really done much data visualization in R,
[00:13:16] the easiest thing is just to start doing it. Um, so that is essentially how I got started. I sort of stumbled across TidyTuesday. Um, so you start a new data set every single week, people make visualizations, you see what other people have found with the same data set, or you can see the code of how they made their visualizations.
[00:13:36] So it was a sort of constant influx of new data and new ideas, which is really good. The other thing I think is quite useful is looking at things like news articles and thinking, okay, what's the technical aspects of how do I actually remake this same plot in R, or are there better ways of visualizing that same data, and doing it that way, sort of doing a new and improved version of news article visualizations that you see. I think in terms of thinking about how do you improve your data visualizations once you've gotten started a little bit, I think looking at what other people do with the same data is really useful. Um, I know that I personally have a list of sort of ggplot packages and functions that I want to work through and a list of sort of different types of visualizations that I'd like to work on.
[00:14:26] Um, so what I tend to do when I first start looking at data is actually sit down with a pen and paper and sort of sketch out some ideas before you sort of jump into wrangling data and making plots, as in, what's the actual idea, what's the story you're trying to tell with the data, and thinking about it from that side.
[00:14:44] David Keyes: Yeah, you know, that's interesting. I mean, so many people I've talked to, um, who do really good data viz, when I ask them, okay, how do you make good data viz in R, their answers are actually way less about R than you would expect. They'll say things like, look at news articles, or, I was talking, um, for my book, to Georgios Karamanis, who recommended just like looking outside of any kind of data viz things.
[00:15:13] You know, I think he's also a photographer, and he talked about, like, looking in nature and seeing kind of color palettes and that kind of thing. Um, and your, you know, example of not sitting down and getting into the code right away, but sketching things out with pen and paper, I think is also a really good example of how the kind of final product in R, you know, comes after a whole process.
[00:15:37] It's not the first thing that you're doing, which is really interesting.
[00:15:41] Nicola Rennie: Yeah, definitely. I mean, R is sort of, I mean, it's a tool that you're using to, to build it. It's not necessarily the, the starting point, you know, it is essentially something that can draw lines and rectangles and circles. And you just start figuring out how do those shapes fit together on a page to tell a story.
[00:15:57] David Keyes: Yeah, which is interesting. You can correct me if my understanding of ggplot is wrong, but my understanding is ggplot was developed as a way to kind of like quickly, you know, make plots, um, to do kind of exploratory data analysis, and, you know, obviously it's been around long enough at this point that it's extremely mature and people now use it to make, you know, high quality production ready plots.
[00:16:24] Um, so it's interesting to see that kind of trajectory over time. Um, cool. Well, the specific topic that I brought you on to talk about was, um, making data viz for mobile devices, because I think it's, you know, especially as more and more, um, media and just anything is consumed on mobile devices.
[00:16:49] It's important to, to think about that. And, um, you wrote a blog post that, um, we'll, we'll link to in the show notes, but I was also hoping to have you kind of put your screen up and just walk through, um, an example and give some tips because I think you have some really good thoughts on how you can make data viz that works well on mobile as well as desktop, of course.
[00:17:11] So if you want to put your screen up now, that would be great.
Nicola Rennie: Absolutely.
David Keyes: So, Nicola, I'll mostly just kind of let you, like, walk through. I mean, I know I have some questions, but, like, I'll mostly just let you kind of walk through and, like, interject with questions as you go, if that's okay.
[00:17:31] Nicola Rennie: Yeah, that's fine. So, I'll run through a sort of data visualization I made for TidyTuesday in, uh, around November of last year. So I have some pre-prepared code here that I'll just sort of run through. It's quite long code, uh, so I didn't want to have to type it all out, uh, during the session. So I'm using sort of three different packages: tidyverse for basics or data wrangling, the showtext package to import other fonts to use in ggplot2, and then the ggtext package that I mentioned earlier,
[00:18:07] uh, for adding sort of custom annotations and wrapping text. So we just sort of load the data in.
[00:18:19] David Keyes: And I will say for anyone watching this who's never heard of TidyTuesday, it's a social data project that releases data every week, and then people like Nicola can, you know, grab that data and, you know, do analysis and then visualization. And I will say also, the package you're using there.
[00:18:36] So there's a tidytuesdayR package, right, that like helps you to access that data without having to kind of manually grab it. Is that what's going on on line 10?
[00:18:44] Nicola Rennie: Yeah, yeah, exactly. So it is sort of saved as CSV files on GitHub. Um, so you can copy out the long URL with the CSV file, but the tidytuesdayR package means all you really need to put in is the date that the data was released on. So it makes it really easy to sort of access it each week. So we have some data.
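The date-based loading described here can be sketched as below. The wrapper name and the release date in the usage comment are assumptions for illustration (the episode only says "around November of last year"); `tidytuesdayR::tt_load()` is the package's loading function and needs an internet connection when called.

```r
# Hypothetical wrapper around tidytuesdayR::tt_load(). Requires the
# tidytuesdayR package and an internet connection at call time.
load_tidytuesday <- function(release_date) {
  # Check the date string before making a network request.
  parsed <- as.Date(release_date, format = "%Y-%m-%d")
  if (is.na(parsed)) stop("release_date must look like 'YYYY-MM-DD'")
  tidytuesdayR::tt_load(format(parsed, "%Y-%m-%d"))
}

# Usage (not run here because it downloads data; the date is an assumption):
# tuesdata <- load_tidytuesday("2023-11-21")
# names(tuesdata)
```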
[00:19:08] The other thing I'm going to do is load in some, uh, other fonts to work with. Uh, so I do use the showtext package for fonts quite a lot, um, mainly because I really like using it with Google Fonts. So font_add_google is actually from the sysfonts package, which imports it. And it means that you don't have to worry about importing or installing font files on your laptop.
[00:19:31] It's sort of all handled within R, uh, which I think is, uh, really nice.
David Keyes: Oh, I was waiting for you to pronounce the name of the font.
Nicola Rennie: Oh, I'm not going to do that.
David Keyes: Okay, I'll give you a pass.
[00:19:49] Nicola Rennie: And then what I also tend to do is think about defining colors and fonts as variables in R. So here I'm sort of defining a background color, a text color, and a highlight color. Um, so the data that we have is data around R-Ladies chapters and events. So this is a sort of R-Ladies, uh, purple color.
[00:20:13] And then saving the font I've installed as a variable as well. And the reason that I tend to do that is because you quite often want to use the same color in multiple places in your plot. And if you decide at the last minute that actually you want to change that color to something else, all you have to do is change the value in the variable.
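A minimal sketch of this variables-first habit follows. The hex codes, variable names, and the tiny data frame are all illustrative assumptions, not values from the walkthrough, and the font is left as the built-in `"sans"` family so the example runs without downloading anything.

```r
library(ggplot2)

# Illustrative values only; none of these are taken from the episode.
bg_col <- "#FAFAFA"        # background colour
text_col <- "#2F2F2F"      # main text colour
highlight_col <- "#88398A" # a purple highlight (assumed hex code)
body_font <- "sans"        # swap for a family registered via showtext

# Each variable is reused in several places, so a late change of colour
# or font is a one-line edit at the top of the script.
example_df <- data.frame(x = 1:5, y = c(2, 4, 3, 6, 5))
p <- ggplot(example_df, aes(x, y)) +
  geom_point(colour = highlight_col, size = 3) +
  labs(title = "Reusing style variables") +
  theme_minimal(base_family = body_font) +
  theme(
    plot.background = element_rect(fill = bg_col, colour = bg_col),
    panel.background = element_rect(fill = bg_col, colour = bg_col),
    plot.title = element_text(colour = highlight_col),
    axis.text = element_text(colour = text_col)
  )
```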
[00:20:31] You don't have to sort of copy and paste or go through and find all the places that you've used, uh, those variables. So I save these as variables, and I do the same thing with text as well. Um, so quite often when you're writing things like text, it does take up quite a bit of space. Um, if you're keeping that inside all of your ggplot code, it feels like your code gets very long.
[00:20:56] So I sort of take it outside of the plot as well and save these as different variables. And what you can see here in sort of line, uh, 34 is that I am using that Markdown syntax to sort of style, uh, some bold font around the word data and around the word graphic as well. And I find that much easier than trying to sort of use other methods of getting bold font and aligning things.
David Keyes: Yeah. And so you're going to use that, I assume, then with ggtext to make it actually bold.
[00:21:31] Nicola Rennie: Yeah, exactly. And then we have some data wrangling here. So this is just sort of, uh, processing the data a little bit to pick out, uh, just the columns, um, I want, and also formatting the labels of the dates. So, we have, uh, different dates. What I want to get out is the day of the week, using, uh, wday from lubridate.
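A minimal sketch of this kind of wrangling step is shown below: it labels each date with its weekday and then tallies events per year, weekday, and location type. The data frame is a made-up stand-in and its column names and values are assumptions; only `wday()` from lubridate is named in the walkthrough.

```r
library(dplyr)
library(lubridate)

# Made-up stand-in for the events data; column names are assumptions.
events <- data.frame(
  date = as.Date(c("2023-01-02", "2023-01-03", "2023-01-03", "2022-06-15")),
  location = c("inperson", "online", "online", "inperson")
)

weekday_counts <- events |>
  select(date, location) |>
  mutate(
    year = year(date),
    # label = TRUE returns an ordered factor of weekday names instead of 1-7
    weekday = wday(date, label = TRUE)
  ) |>
  # One row per year / weekday / location combination, with the count in `n`
  count(year, weekday, location)
```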
[00:21:59] And I want that labeled, and then counting up the number of, uh, locations on each weekday per year, um, and whether or not they are in person or online. So then, what I'm going to do is I'm going to plot that with ggplot2. I'm just going to keep it simple and do some points, so essentially making a sort of scatterplot, and I want to facet it
[00:22:25] based on whether it was in person or online. So if I run this code, we have our plot here, and I've edited a few elements of the plot. As you kind of mentioned, I'm using ggtext to process, uh, the different text elements in the plot. So I'm applying element_textbox_simple from ggtext to the title, subtitle, and caption. And what this does is it wraps the text automatically, so if you do have a subtitle that takes, you know, three or four lines, you don't have to manually put in those line breaks to split it into three or four lines. It doesn't run off the end of the plot, it just sort of wraps it automatically, which is what I tend to use it for most often. And you can see at the...
David Keyes: Wait, can I ask you about that? Because I've never actually used that. So, for example, when you use element_textbox_simple, um, what I use, um, is it just element_markdown, I want to say, or something like that? Is that the other option? But it sounds like that doesn't necessarily wrap it, whereas this does?
[00:23:38] Am I understanding the difference correctly?
[00:23:42] Nicola Rennie: Yeah, yeah, that's exactly the difference. So you can use element_markdown, and that does the sort of Markdown processing, um, you know, things like getting bold font. element_textbox_simple does that and wraps it into essentially a text box as wide as the plot area as well, um, so there's no sort of manual line breaks, which I've definitely done in the past, or sort of using, um, str_wrap to sort of cut it off at a number of...
David Keyes: Oh yeah. No, I mean, that's so helpful. Like, I was thinking about that because when I make titles, or more often subtitles, that are kind of longish, you know, I'll have to have a line up above that has str_wrap, and I'll, you know, test out, well, if I do it at 40, does that work? No, I need to do 35, and I keep going.
[00:24:28] So this is a, uh, great tip.
[00:24:31] Nicola Rennie: You can maybe see it if I resize the plot. I think it's going to take a little while to change, but, um, it does sort of resize and rescale the plot, um, and wrap that text. You can see it there. So it's gone on to two lines automatically and it just fits the width of the plot, um, which is really nice.
David Keyes: Oh, that is really nice. Well, I mean, obviously, spoiler alert, especially if you're thinking about mobile, that obviously makes a ton of sense. So I won't steal your thunder though.
[00:25:05] Um, but...
[00:25:06] Nicola Rennie: Yeah, it is one of the reasons it's one of my favourite R packages, because it has saved so much time of trial and error just going back and forth. You know, like you say, trying to choose exactly the right number of characters, or manually inserting line breaks. It seems to just work and just do exactly, um, as I want it to do, which is perfect.
[00:25:29] David Keyes: That's great.
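The difference between the two ggtext theme elements discussed above can be sketched as follows. The data and the subtitle wording are made up for illustration; `element_markdown()` and `element_textbox_simple()` are the ggtext functions named in the conversation.

```r
library(ggplot2)
library(ggtext)

# Text kept outside the plot code; ** ** is Markdown bold. Wording is made up.
st <- paste(
  "**Data**: a small made-up data set.",
  "This subtitle is deliberately long so that it needs to wrap across",
  "several lines when the plot is narrow. **Graphic**: illustrative only."
)

p <- ggplot(data.frame(x = 1:3, y = c(2, 1, 3)), aes(x, y)) +
  geom_point() +
  labs(title = "Wrapping **text** with ggtext", subtitle = st) +
  theme(
    # element_markdown() renders the Markdown but does not wrap long text
    plot.title = element_markdown(),
    # element_textbox_simple() renders Markdown and wraps to the available width
    plot.subtitle = element_textbox_simple()
  )
```

Printing `p` and resizing the plot window should show the subtitle re-wrapping on its own, with no `str_wrap()` width to tune by trial and error.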
[00:25:31] Nicola Rennie: So we start thinking a little bit more about visualizing data on a sort of mobile screen. I think there's a couple of things you typically start thinking about. The first obvious one is a sort of aspect ratio. Um, so sort of, you know, how wide versus, uh, how tall is the visualization you're creating, and a lot of the time when you're making visualizations for things like reports or papers or websites, they do tend to be landscape.
[00:26:03] You know, if you're printing something on an A4 sheet of paper, you don't want it to take up the whole page, so it goes landscape and takes up half the page. Whereas when we're thinking about mobile, we're typically looking for something that's more portrait oriented. You also typically just have a lot less space to work with, so when you're sort of working with, quite often when you view other visualizations on mobile, you end up having to sort of zoom in on your phone and it's really hard to read and you're sort of moving it around and trying to read what the title was because it was designed for something, a much larger screen, and then it gets shrunk down and you can't read anything.
[00:26:43] So it's thinking about, let's create a visualization that is roughly the size of your sort of standard, uh, phone screen. So, my favorite package, uh, for doing this is actually the camcorder package in R. So if you want to make plots of a particular size, you can set up new plotting windows. So there's a sort of
[00:27:10] dev.new function and you can say, okay, I want it to be two inches wide, four inches tall, but it sort of opens in a separate window. And I've always found that a little bit sort of clunky and difficult to work with in practice.
[00:27:22] David Keyes: mm hmm,
[00:27:22] Nicola Rennie: What the camcorder package does is it essentially opens up, uh, the viewer tab, and you can set the sort of output format you want, do you want a PNG file, for example. You can set the width and height of the plot that you're trying to create, and you can set the resolution, you can set the DPI.
[00:27:44] Because this is the other problem I had, is that you might spend a long time making your visualization in R and getting everything all laid out exactly in the plot pane, and then you'd go to save it with ggsave, and everything would be sort of, uh, all over the place and not where you wanted it to be.
[00:28:06] Because when you're viewing them in the plot window, the resolution is lower. With ggsave, you are using that sort of higher resolution, the 300 dpi. So if you can view the visualization in the same resolution as your final version that you want to save, it means you're seeing exactly the final version when you're working on it, which I think is really, um, nice and really helpful.
David Keyes: Yeah, and it seems like this is useful, I mean, we're talking about it in the context of, you know, making data viz for mobile, but anytime you're, you know, making a plot and you need to make sure that it's going to look exactly the way you want it, you know, in these dimensions at this DPI, it seems like this technique would work well for that.
[00:28:52] Nicola Rennie: Yeah, absolutely. I think I genuinely use this for sort of every visualization I make because it's that sort of, the resolution and the sizing. And gg_record does automatically save, uh, an image of every plot that you make, so you don't need to use ggsave, but you can pass the same parameters into ggsave if you want to, and you know that it's going to look exactly the same as it's looking on your screen when you're working in RStudio.
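A minimal sketch of a phone-sized recording setup follows. The 2 by 4 inch size and 300 dpi are borrowed from figures mentioned in the conversation, but combining them this way, the helper name, and the output directory are assumptions. `camcorder::gg_record()` is only called inside the helper, so the block runs without camcorder installed until the helper is used.

```r
# Phone-sized canvas: 2 x 4 inches at 300 dpi (an assumed combination of
# the sizes mentioned in the conversation).
width_in <- 2
height_in <- 4
dpi <- 300

# Pixel dimensions of the saved image at that resolution.
px <- c(width = width_in * dpi, height = height_in * dpi)

# Hypothetical helper that starts recording; needs the camcorder package.
start_mobile_recording <- function(dir = file.path(tempdir(), "recording")) {
  camcorder::gg_record(
    dir = dir,
    device = "png",
    width = width_in,
    height = height_in,
    units = "in",
    dpi = dpi
  )
}

# Usage in an interactive session:
# start_mobile_recording()
# ... build and print ggplot2 plots as usual; each one is saved and previewed ...
# camcorder::gg_stop_recording()
```

Passing the same `width`, `height`, `units`, and `dpi` values to `ggsave()` is what keeps a manually saved file consistent with the preview.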
[00:29:23] So, if we rerun exactly the same plot as we had before on this sort of much smaller screen, you can see that it's definitely not working well, sort of, out of the box. Um, our text is taking up quite a lot of space. Our plots have actually, um, entirely disappeared because there's just not enough space for them. And that's because the default layouts in ggplot2, I think, are designed for landscape. You know, when you look at the facet_wrap that I'm using, I'm not specifying number of columns or number of rows. By default, it just puts them in one row, whereas actually when you're thinking about visualization for mobile, one column makes a lot more sense. The other thing we might want to think about is the text size. Okay, so we have gone from something that was sort of maybe five inches wide down to a space that's only two inches wide, so we do want to reduce, uh, that text size as well. And this is one of the, uh, sort of tips I've gradually learned over the years, that you can set a base size and then use relative sizing to set each of the individual text elements. So our original base size was 24, we can reduce that to 12. And you can see in here that I have changed the size of the strip text for the facets or the plot title, but everything is set relative to that base size. And so again, it means that when you want to change the size of all the text on your plot, you don't have to manually go through every single element where you set the text size and change it.
[00:31:09] You can just change the one value. And we can also change that facet to be one column as well.
[00:31:19] David Keyes: And can I just ask on the relative sizing thing, so for example, the plot title there on line 53, you say size equals rel(2). So I assume that means that, because you set the base size to be 12, the plot title will then be 24 because it's 2 times 12. Is that right?
Nicola Rennie: I think so. Yeah.
David Keyes: Okay.
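The base-size and relative-size arithmetic can be checked with a small sketch like the one below. The theme choice, the strip text multiplier, and the data are illustrative assumptions; `rel()`, `calc_element()`, and `facet_wrap(ncol = 1)` are standard ggplot2.

```r
library(ggplot2)

base_size <- 12  # was 24 for the wide version

mobile_theme <- theme_minimal(base_size = base_size) +
  theme(
    plot.title = element_text(size = rel(2)),   # 2 x base size
    strip.text = element_text(size = rel(1.2))  # facet labels; multiplier is illustrative
  )

# calc_element() resolves theme inheritance, so this gives the final point size
# of the plot title after the relative multiplier has been applied.
title_size <- calc_element("plot.title", mobile_theme)$size

df <- data.frame(
  x = c(1, 2, 1, 2), y = c(1, 2, 2, 1),
  type = c("In person", "In person", "Online", "Online")
)
p <- ggplot(df, aes(x, y)) +
  geom_point() +
  facet_wrap(~type, ncol = 1) +  # stack panels in a single column
  labs(title = "One column for narrow screens") +
  mobile_theme
```

With this theme, `title_size` works out as 2 times the base size, which matches the reading of `rel(2)` in the exchange above. Changing `base_size` alone rescales every text element that is set relatively.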
[00:31:47] Nicola Rennie: So you can see here that when we changed our base size, everything has shrunk down. Not just the elements where you have set the relative size, but even things like the axis text labels and the legend text are scaled down as well. So even when you're not explicitly setting that relative size, everything in ggplot2
[00:32:08] does sort of have its own relative size there as well. So the other things we might want to think about changing here are the size of the points, these sort of bubbles here. So we might want to make them a little bit smaller. So we can do that using the scale_size_area() function and set a maximum area size for this.
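A short sketch of shrinking the bubbles with `scale_size_area()`. The data and the mapped variable are assumptions for illustration; `max_size = 3` is only an example value, and ggplot2's default for this argument is 6.

```r
library(ggplot2)

# Illustrative data only: point area is mapped to horsepower
p <- ggplot(mtcars, aes(x = wt, y = mpg, size = hp)) +
  geom_point(alpha = 0.6) +
  # Point area scales with the value; max_size caps the largest point
  scale_size_area(max_size = 3) +
  theme_minimal(base_size = 12)

p
```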
[00:32:32] David Keyes: Mm
[00:32:33] Nicola Rennie: You can look at scale_size_area() and then max_size is, say, three, for example. I think the standard maximum size is six. The other thing we might want to change here is where the legend is. So again, this is sort of the same problem as facets automatically going side by side. There is a default position for the legend: it's always on the right-hand side. And again, that's sort of assuming that you have space on the right-hand side of your plots to add a legend, but this is sort of
[00:33:12] wasting quite a lot of space here and everything is still looking quite squashed on our plot. So what we might want to do is stack this either on top or underneath our first plot. So we can do that using quite a few of the legend options within the theme of ggplot2, and this is something that has quite recently changed in ggplot2. So the newer version, ggplot2 3.5,
[00:33:44] came out quite recently, and there's a whole lot of new options for how you deal with legends, which I really like. So you can change the legend position, and you can put it on the bottom of the plot, or you could put it on the top instead if you wanted to. You can also say whether you want it left or right aligned on the bottom of the plot.
[00:34:15] So I think in the older versions of ggplot2, if you put the legend on the bottom, it always just sort of came up in the middle. There's now a legend justification option, which means you can say, okay, actually, I want this right-aligned on the bottom of the plot. So we can have a little look at what that looks like.
David Keyes: That's really cool. I haven't played around with ggplot2 3.5 yet. I mean, I've kind of read the blog post, but this is the first time I'm actually seeing some new code from it in action.
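A sketch of moving the legend below the plot and right-aligning it. The exact theme arguments used in the walkthrough are not shown in the transcript, so `legend.position` and `legend.justification` are an assumption about how it was done; the data and the `mobile_legend` name are illustrative.

```r
library(ggplot2)

# Hypothetical helper theme: legend under the plot, aligned to the right
mobile_legend <- theme(
  legend.position = "bottom",      # "top" would place it above the plot
  legend.justification = "right"   # "left" or "center" are also accepted
)

p <- ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl))) +
  geom_point() +
  labs(colour = "Cylinders") +
  theme_minimal(base_size = 12) +
  mobile_legend

p
```

Keeping the legend settings in their own theme object makes it easy to reuse the same mobile layout across several plots.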
[00:35:02] Nicola Rennie: Yeah, I think one of the things I like the most about the changes to the legend is that you can style individual legends separately. So if you have multiple legends on a plot, you can put them in multiple different positions and you can have different styling for each of them, which is really nice.
[00:35:21] David Keyes: Yeah, that makes sense.
[00:35:23] Nicola Rennie: So you can see straight away here how much clearer this is when you're looking at it, compared to the first version we had, where everything just looks squished up and we can't really see anything on the plot. And I think this is one of the things where the different changes you might make to a plot to make it suitable for mobile visualization really depend on what's in that plot.
[00:35:52] So this example is, I guess, quite straightforward. You have facets that were next to each other, and you can quite easily change it around to stack one on top of the other. You might have a situation where you have things like, for example, maps, where the aspect ratio of the geography limits you a little bit in how much you can rearrange it.
[00:36:16] So if you're thinking of something like a map of the world, and it's on a landscape plot, it's going to take up most of the space on that plot. When you then switch it to mobile, it's probably only going to take up half. So you then have to think about, well, what goes in the other half of that plot to take up the full amount of space on mobile. So I think, yeah,
[00:36:41] there's different things you want to think about when you're changing from a landscape visualization for a report or a paper to thinking about something that's specifically designed for mobile. I think one of the things that is useful to think about is what sort of output format it is.
[00:37:01] So if you're building something like a Shiny app or a Quarto document, if you have this example with the facets, you can also ask yourself: does it need to be in facets, or could these actually be two separate plots? So you can think about how Quarto makes use of grid layouts, which automatically stack when you switch from viewing something on desktop to viewing it on mobile.
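As a rough sketch of the "two separate plots" idea, the contents of an R code chunk in a Quarto document might look like the following. The transcript does not say which Quarto layout feature is meant, so the `layout-ncol` chunk option is an assumption, as are the figure dimensions and the `mtcars` data. The `#|` lines are chunk options, which plain R treats as comments.

```r
#| layout-ncol: 2
#| fig-width: 3
#| fig-height: 3

library(ggplot2)

# Two separate plots instead of one faceted plot; the document's layout,
# rather than ggplot2, decides whether they sit side by side or stack.
p_auto <- ggplot(subset(mtcars, am == 0), aes(x = wt, y = mpg)) +
  geom_point() +
  labs(title = "Automatic") +
  theme_minimal(base_size = 12)

p_manual <- ggplot(subset(mtcars, am == 1), aes(x = wt, y = mpg)) +
  geom_point() +
  labs(title = "Manual") +
  theme_minimal(base_size = 12)

p_auto
p_manual
```

The trade-off is that separate plots no longer share axes automatically, so axis limits may need to be set by hand if the panels should stay comparable.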
[00:37:28] If you can make them two separate plots and make use of the grid layouts, you don't have to worry quite as much about designing it specifically for one type of screen size. So it's thinking a little bit about what the output format is, how much of that output format you can utilize to take care of the mobile visualization, and how much you have to do in ggplot2 on an individual plot.
David Keyes: Yeah, you know, it's interesting, because it seems like in many ways thinking about mobile just forces you to think about general usability questions that go beyond just mobile, you know, thinking about how people are going to use this in different contexts. And it gives you the opportunity to consider, well, should this be a single ggplot, or
[00:38:25] should I use the built-in columns there? It just seems like it forces you to think about larger questions than just the specifics of, you know, how do I make this mobile friendly?
[00:38:36] Nicola Rennie: Absolutely. And I think one of the particular things is, because you have so much less space to work with, you know, if you're building something for a desktop and you're making a dashboard, you can have lots of different plots, you can have lots of different text for explaining things. If you're working with just the amount of space you have on a mobile screen, you start to think about what is actually the most important thing. You know, if you only have one plot, one image, to get your point across, you start thinking about, well, what's the best way of doing that, rather than thinking, okay, I can add this and I can add that, and I can put this over here to help support that. It just makes you think about what's actually important to go into this visualization versus what's an extra add-on that could probably be left out.
[00:39:25] David Keyes: Well, and also in general, when I see a lot of data visualization that I feel isn't quite as effective, a lot of times it's overly complicated. And so, you know, this kind of forcing you to strip down, to think about what is the most important thing and make each visualization focus on one particular thing, is generally good practice no matter what size screen you're working with.
[00:39:53] So again, it does seem like it just forces you to think about those kind of general practices in ways that you might not have thought about when you were, you know, not constrained by size. It's like the classic thing that constraint breeds creativity. And it forces you to rethink how you're doing things.
[00:40:16] Nicola Rennie: Yeah, absolutely. I completely agree. It's just, yeah, like you say, when you start to constrain what you can do, you automatically start filtering out the stuff that's actually not that important, that was just sort of nice to have rather than something you actually need in a visualization.
[00:40:34] Yeah.
[00:40:35] David Keyes: Great, well, this was really helpful, Nicola, so thank you again for joining us and thank you for talking about designing data viz on mobile.
[00:40:43] Nicola Rennie: Thank you very much for having me. I really enjoyed chatting.