R for the Rest of Us Podcast Episode 6: Aaron R. Williams, Livia Mucciolo and Safia Sayed

In this episode, I talk with Aaron R. Williams, Livia Mucciolo and Safia Sayed about their experience using parameterized reporting to make State Fiscal Briefs. They discuss their approach and share advice for those contemplating parameterized reporting.

Connect with Aaron on LinkedIn and Twitter, Livia on LinkedIn, and Safia on LinkedIn.

Learn More

If you want to receive emails when we publish new podcast episodes, sign up for the R for the Rest of Us newsletter. And if you're ready to learn R, check out our courses.

Watch the Video Version

The video version has a detailed explanation of how they applied parameterized reporting to make State Fiscal Briefs at Urban Institute.

Transcript

[00:00:00] David: Hi, I'm David Keyes and I run R for the Rest of Us. You may think of R as a tool for complex statistical analysis, but it's much more than that. From data visualization to efficient reporting to improving your workflow, R can do it all. On this podcast, I talk with people about how they use R in unique and creative ways.

[00:00:18] Join me and learn how R can help you.

[00:00:21] I'm joined today, uh, by three folks, two who are currently at, and one formerly, uh, from the Urban Institute, um, who work there and do a lot of parameterized reporting. So I'm delighted to be joined by Aaron R. Williams, Livia Mucciolo and Safia Sayed. I am delighted to have you here and thanks all three of you, uh, for joining. Maybe if we can just start out, um, I'll have you kind of one by one, um, tell us your position at Urban, um, what you do there, and also kind of how you initially got into R. So maybe Aaron, I'll have you, uh, start first.

[00:01:01] Aaron: Thank you, David. It's great to be here. So I'm a senior data scientist in the Income and Benefits Policy Center at the Urban Institute and an adjunct professor in the McCourt School of Public Policy at Georgetown University. And I work with the data science team here at the Urban Institute. It's really great because Urban does a bunch of great policy work in a bunch of different domains.

[00:01:21] And I get to bounce around and wear a lot of different hats. And so I've worked a lot with the Tax Policy Center on data privacy issues, and that's how I got hooked up with them on this State and Local Finance Initiative, uh, in this sort of parameterized reporting with, uh, Safia and Livia.

[00:01:38] David: Cool. Thanks. And maybe, uh, Livia, we'll have you kind of introduce yourself and then I'll come back to you in a minute to have you talk about how you got into R. So Livia, why don't you tell us a bit about yourself and, and what you do.

[00:01:50] Livia: Yes. Uh, so I'm a research assistant at the Urban Institute, specifically in the Tax Policy Center. I've been there for almost two years. Um, and I kind of work across the board, uh, within our center. So both on the federal tax, um, microsimulator, where we analyze tax bills and provisions, um, and also on the state and local side, which is where these COVID pages are housed.

[00:02:16] Um, and I got started, uh, with R, uh, through this, this project. So, uh, Safia kind of pulled me in, uh, after I started working there, and that's how I learned R. It was all on the job.

[00:02:30] David: Wow. Impressive. Um, great. Safia, you wanna go ahead?

[00:02:40] Safia: Tax Policy Center as well. And I similarly worked with the State and Local Finance Initiative a lot, and also doing some of the federal tax stuff more broadly. I got into R, I took a class in college, and my background was in economics and econ research geared towards policy. So the work that I was doing was more about, like, regressions and econ research things.

[00:03:01] And so my first exposure to parameterized reporting similarly was with this project. And I think that was, like, a testament to how flexible and easy to learn R is for so many things beyond just regression modeling. So that was a really cool opportunity. Aaron, um, pitched the idea of using R for the, for that purpose, and I've learned a lot since.

[00:03:23] David: Nice. That's awesome. Aaron, what, what, remind me, when did you kind of first get into R and how did that come about?

[00:03:30] Aaron: So I studied economics in undergrad and I took an econometrics class and we learned Stata. And I got, uh, you know, a not-forever Stata license, and I graduated and I had no money and my Stata license expired. Uh, but I was really interested in this idea of working with data. I think it was actually my parents who had heard of this thing called R, and I should probably follow up with them and figure out how they knew about R. They said, yeah, it's free and it's pretty popular.

[00:03:57] And I guess you could make cool data visualizations with it. And I was like, oh, it's really interesting. And so I learned through Coursera, from Roger Peng and Brian Caffo and, uh, all the people at Johns Hopkins, uh, maybe eight years ago now. So it's, um, at the time I had no idea of all the types of things that you could do with it.

[00:04:16] And that's one of the great things about R. Like, you just keep figuring out new things to do with it.

[00:04:20] David: That's awesome. Yeah. And I mean, the, so the book that I'm writing is tentatively called R Without Statistics, and it's all about, again, like, all the things you can do that people, I mean, people tend to think of R as, you know, data analysis, and there's much more that you can do. So, um, so, and I got connected to you all through, uh, an article that you wrote on the, um, Urban Institute blog about using R Markdown to, um, produce a set of reports.

[00:04:50] The article talks about two different reports, the, um, State Fiscal Briefs, and then reports looking at the impact of COVID on, on state budgets. Um, so let's focus on the latter, on the, the reports on the impact of COVID on state budgets.

[00:05:06] Um, can you at a very broad level, um, talk through what it looks like to create these reports, because you have to produce one for each state. So how does R, R Markdown parameterized reporting work to be able to do something like that?

[00:05:23] Aaron: The main thing is there's a lot of different data sources that come into these reports. And as long as you have data structures where you have a variable where each row corresponds to a different state, then you can put together these R Markdown documents with parameters at the top, um, where then you use the render function from the rmarkdown package and you say, Virginia, render the R Markdown, you know, Vermont, render the R Markdown, and you just go through all the different states.

[00:05:50] Um, that's at the high level, but I think one of the really cool things about this specific implementation is, like, all the stuff going on under the hood to pull data from APIs and clean it and visualize it. So maybe Safia or Livia can talk some about that.

[00:06:05] Livia: Safia, I think you could probably talk more about the process of, you know, how it was created, cuz I, so, um, to give some clarity, I joined this project after the pages were more or less, you know, like, ready. Um, but, uh, I kind of would go off of what Aaron was saying. Um, I think when we first started creating it and determining what we wanted to include in them, we made sure to kind of first create a general template of, um, the types of data that we would like to include, the general text, kind of like boilerplate language that would fit, um, for, you know, across all states, and that we would be able to have data for all of them.

[00:06:47] Um, so that was kind of the, like, first basis of it. And that's, I think, what allows it to be parameterized. Um, yeah. And then it's just a matter of, like, rendering and, and iterating through those.

[00:07:02] Safia: Yeah. And more, like, at a high level of why we did what we did and why we used R: I think as the pandemic was starting, a lot of our colleagues were tracking different things that were happening in the states, with, like, unprecedented unemployment insurance claims in one area, and then governors doing different things with regards to lockdowns and different public health mandates.

[00:07:22] And we suspected that there might be a relationship between what the policy decisions were and what the economics were, and the different relationships between the industrial makeups of states and who their workers are and what their industries are, if they rely on tourism and things like that, and how that might be playing out in terms of unemployment and all these different things.

[00:07:42] But we needed to bring all that data together in a cohesive way for people to be able to understand how it all fit together. And so R Markdown provided the perfect way to have a cohesive set of information, all feeding in with each state being an observation and having all these different variables, and a very clear way to visualize different relationships.

[00:08:05] David: So I'm guessing, I mean, tell me if this is wrong, but I'm guessing there are kind of, it sounds like there are three, like, phases. One is gathering the data. And I don't know if you do that, like, in an R script, or like a series of R scripts is what I'm guessing you do. You go out, you gather all that data, wrangle it, get it into consistent formats, save that.

[00:08:29] Then you have an R Markdown document, which has the parameter at the top, and then you finally have a separate R script file where you actually have the code that renders all of these reports. Is that, is that an accurate summary of your overall process?

[00:08:48] Aaron: It's exactly right.
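
As an illustration of the third phase David describes, here is a minimal sketch of what a rendering script could look like. It is not the Urban Institute's actual code: the template name `state_report.Rmd`, the `state` parameter, the `output` folder, and the helper names are all assumptions made for this example. The only real API used is `render()` from the rmarkdown package, with its `input`, `output_file`, `output_dir`, `params`, and `envir` arguments.

```r
# Assumed template header (hypothetical file "state_report.Rmd"):
# ---
# output: html_document
# params:
#   state: "Virginia"
# ---
# Inside the template, chunks would filter the data with params$state.

# Build one render job (output file name plus parameter list) per state.
build_render_jobs <- function(states) {
  lapply(states, function(state) {
    # "North Carolina" becomes "north-carolina"
    slug <- tolower(gsub("[^A-Za-z]+", "-", state))
    list(
      output_file = paste0(slug, ".html"),
      params = list(state = state)
    )
  })
}

# Render the template once per job. Requires the rmarkdown package.
render_all <- function(jobs, template = "state_report.Rmd", output_dir = "output") {
  for (job in jobs) {
    rmarkdown::render(
      input = template,
      output_file = job$output_file,
      output_dir = output_dir,
      params = job$params,
      envir = new.env()  # fresh environment so states do not share objects
    )
  }
  invisible(jobs)
}

jobs <- build_render_jobs(c("Virginia", "Vermont", "North Carolina"))
# render_all(jobs)  # uncomment once the template file exists
```

Keeping the job list separate from the rendering loop makes it easy to render a single state as a check before iterating over all of them.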

[00:08:50] David: Okay, cool. Um, so when I look at this, all I see is a single website or a single, you know, page on the website, but under the hood, it's actually not that. Like we talked about before, this is actually being served from somewhere else. So talk about

[00:09:10] the process of serving the content in a way, like, integrating it within the, the Urban website and how that works.

[00:09:20] Livia: So, um, through our end, so, and by ours, I mean, uh, mine, Safia's, or Aaron's end, um, we would, um, basically, you know, render these files and we would, um, create HTML outputs. Um, and so once those are created for each individual state, um, we would then send it off to our comms team at Urban and they would pull in the HTML files and upload it to the actual page on the website.

[00:09:52] So that step, I truly do not know how that happens. Um, but we use GitHub to, uh, track our changes and also to send, um, these updated HTMLs to the people that then upload it. Um, is that kind of what you were asking?

[00:10:11] David: Yeah. And even, so my understanding from reading your article is, with the State Fiscal Briefs, you actually, so for example, if I click on Oregon, say, you actually copied the HTML into the CMS from the, the rendered reports, whereas with this, I mean, and I can even look at the, the code to see, but I, um, I think I checked this before.

[00:10:35] Um, you actually have, yeah, an iframe here where you are serving this. Um, it doesn't actually, okay, this isn't necessarily directly from GitHub. I'm guessing you then somehow push these from GitHub, or, um, is that right, to somewhere on the Urban website, and then this is actually embedded. So for example, I mean, I'm guessing it would work if I replaced North Carolina with South Carolina.

[00:11:10] Aaron: Yep.

[00:11:11] David: And so that's how you make, like, you have, you know, in a folder somewhere on the Urban website, like, 51 of these, or however many, I don't... is that, is that right? And then, and then they're embedded.

[00:11:24] Aaron: Yeah, that, that's a hundred percent correct. And I think there's a historical reason for why the process differs a little bit. So I think originally for the state pages, the intention is to, or was to, update them maybe quarterly. And so, you know, copying and pasting HTML code for 51 states quarterly, it's not the most fun thing to do, but it's manageable. When the pandemic happened,

[00:11:47] we wanted to have, I think, to be useful, a much more frequent flow of information. And so we wanted to have something where someone's not copying and pasting something 51 times every two weeks, which I believe is about how often the COVID pages are updated. And so that's why this iframe system, uh, came about. Just one other thing to focus on: we've, we've posted pages in a bunch of different ways,

[00:12:11] PDFs, GitHub Pages, clearly with the CMS, with iframes. Uh, and this is probably the most mature process that we've used. In this case it's actually pulling in all the CSS from the, like, urban.org. So if the CM, if the, sorry, if the CSS evolves on the Urban end, the pages should evolve to keep up with that.

[00:12:32] And so when Safia was developing, um, these pages and I was helping her, you know, we tried to basically emulate the CSS in RStudio to match the Urban website. We developed one version, you know, to sort of see what it would look like on the website. And then we finally iterated it and, and moved them all into this sort of iframe.

[00:12:55] David: So are you saying, like, does this then not have any CSS on its own, and it just inherits from the Urban website, or you created CSS that matches what's on the Urban website?

[00:13:06] Aaron: Yes. So we, I think we have CSS locally so that we know what it looks like. And I believe it's actually embedded in the HTML by default. We have a script that we run that's called trim HTML. So, you know, you, you knit your document and it's, you know, 10,000 lines of HTML code with CSS and JavaScript all jumbled in there.

[00:13:28] And basically this trim HTML function goes in and says, give me exactly lines 108 to 8,749, because that's all that, you know, our communications team wants. That's what we give to them. And then sort of everything else is then handled by urban.org.
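
To make the trimming step concrete, here is a rough sketch of a function that keeps only a fixed range of lines from a rendered HTML file. The function name `trim_html()`, its arguments, and the demo file are assumptions for illustration; the transcript only tells us that a trimming script exists and that it selects a line range.

```r
# Keep only lines `first` to `last` of a rendered HTML file.
# The defaults echo the numbers mentioned in the conversation; the
# function name and arguments are hypothetical.
trim_html <- function(path_in, path_out, first = 108, last = 8749) {
  lines <- readLines(path_in, warn = FALSE)
  last <- min(last, length(lines))  # do not read past the end of the file
  stopifnot(first >= 1, first <= last)
  writeLines(lines[first:last], path_out)
  invisible(path_out)
}

# Small demonstration on a throwaway file.
demo_in <- tempfile(fileext = ".html")
demo_out <- tempfile(fileext = ".html")
writeLines(
  c("<html>", "<head>", "<style>...</style>", "</head>", "<body>",
    "<h1>North Carolina</h1>", "<p>Report text</p>", "</body>", "</html>"),
  demo_in
)
trim_html(demo_in, demo_out, first = 6, last = 7)
trimmed <- readLines(demo_out)
```

A fixed line range only works while the template and output format stay stable, so a check like the `stopifnot()` call is a cheap safeguard when the rendered file changes length.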

[00:13:46] David: Gotcha. And I think it's cool, I mean, so, so your output, when you are working in R Markdown, what's, is the output format just straight html_document, or is it like html_fragment, or what's the, what's the output in the YAML, do you know?

[00:14:03] Aaron: Oh, it's just html_document.

[00:14:06] David: Okay.

[00:14:07] Aaron: That's because we want it. We want it to look pretty when we're looking at it.

[00:14:12] David: sure

[00:14:12] Aaron: chop it down to be ugly. And then the website makes it like pretty

[00:14:16] David: makes it pretty again. Um, I mean, that's really interesting too, because it just shows like a, a way, you know, that you can use R to both generate all these reports, but also, you know, like you said, you have a function, the trim HTML function, which gets rid of all the excess, which is not, you know, again, if people are thinking about R as data analysis, um, you're showing a way that you can automate this process.

[00:14:43] That goes way beyond, you know, the kind of data analysis I think people associate with R. Um, are there other parts of this process that you think are, um, kind of interesting or, or that we should highlight?

[00:15:00] Aaron: Safia hinted at this, but I mean, I do think it's interesting. One of the key challenges with this is the language. So do you wanna unpack maybe a little bit more about how all the, like, narrative changes based on the data?

[00:15:14] David: Yeah. I mean, I'm guessing you're talking about, for example, here, "ranks North Carolina 24th highest in the nation." I'm guessing, Safia, you're not going in and manually typing 24th each time this updates. So yeah. Do you mind just talking about how that, how that works?

[00:15:31] Safia: And I mean, I think one of the kind of funnest and hardest parts of this was making sure the grammar and all these things work correctly in a parameterized document, where the numbers were really easy because R can calculate what the biggest number is and what the smallest number is. And then we had to think through basic rules of English that we don't even think about as native English speakers, about, like, when do you use "is" and when do you use "are," and when to use "a" instead of "an."

[00:15:55] And so those were all things that we mostly embedded with, like, if-elses in terms of, if, whether numbers were plural or not. With the 24th, we had North Carolina being the 24th highest in the nation with COVID cases. We had a ranking where we were calculating based on the daily COVID cases from the CDC.

[00:16:13] And then we could just populate. And then again, we had to, the, whether it would be 24th or 32nd or whatever were things that we had to think about as, um, just grammar rules, and programming that into R was kind of a weird challenge. The,
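
The kind of if-else grammar logic Safia describes might look something like the following sketch. The helper names and the example sentence about claims are invented for illustration, and the article rule is deliberately rough (it looks only at the first letter, so it would get sound-based cases such as "an hour" wrong).

```r
# Choose "is" or "are" from a count.
verb_for <- function(n) if (n == 1) "is" else "are"

# Choose "a" or "an" from the first letter of the following word.
# Rough rule only: it ignores sound-based exceptions.
article_for <- function(word) {
  if (grepl("^[aeiou]", word, ignore.case = TRUE)) "an" else "a"
}

# Assemble a sentence whose verb and noun agree with the number.
describe_claims <- function(n) {
  noun <- if (n == 1) "claim" else "claims"
  paste("There", verb_for(n), format(n, big.mark = ","), noun, "this week.")
}

describe_claims(1)     # singular verb and noun
describe_claims(1200)  # plural verb and noun, with a thousands separator
paste("This is", article_for("increase"), "increase.")
```

In an R Markdown template, helpers like these would typically be called from inline code so that the sentence adjusts itself for each state.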

[00:16:28] David: So, did you have like a data frame that says, you know, in one column it's like rank and in the next column, it's like, I don't know what you even call that. Like, It, uh, whatever it's called, like rank order, like the 24th, basically. So it would be, you know, in that row it would be 24 and then the next column would be 24th.

[00:16:48] And then you use that to, to put that text in. Is that how it worked?

[00:16:52] Safia: I don't. I think we, so the 24, we had a column that was a rank, and I think the, the "th" or the "nd" or whatever it would be, I think we had in the template itself. And so we had, um, we were populating the variable from that column. And then we had, if it ends in four, then it's "th"; if it ends in a two, then it's "nd." And then we would discover our errors and realize that there was a case we didn't think of and had to make one up for that too.

[00:17:20] Aaron: Yeah. Like two versus 12, right? Second, 12th. You have to, there's special cases. Always.
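
The special cases Aaron mentions (2 becomes "2nd" but 12 becomes "12th") come down to checking the last two digits before the last one. Here is a small sketch of that rule; the function name `ordinal()` and the case counts are made up for the example and are not real data.

```r
# Turn a whole number into its English ordinal, e.g. 24 -> "24th".
ordinal <- function(n) {
  last_two <- n %% 100
  last_one <- n %% 10
  suffix <- if (last_two %in% 11:13) {
    "th"                      # 11th, 12th, 13th are the special cases
  } else if (last_one == 1) {
    "st"
  } else if (last_one == 2) {
    "nd"
  } else if (last_one == 3) {
    "rd"
  } else {
    "th"
  }
  paste0(n, suffix)
}

# Made-up daily case counts, used only to show the ranking step.
cases <- c("North Carolina" = 2100, "Virginia" = 1800, "Vermont" = 150)
ranks <- rank(-cases, ties.method = "min")  # 1 = highest count

sentence <- paste0(
  "North Carolina ranks ", ordinal(ranks[["North Carolina"]]),
  " highest among these states."
)
```

Handling 11 through 13 first is what prevents outputs like "12nd" or "111st".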

[00:17:27] David: That's that's hilarious thinking about how you have to parse English, like at that very granular level to be able to write the code that implements it. So it sounds natural. Um,

[00:17:40] Aaron: It's definitely a little bit more work the first time you do it. But now that this has been going on for two years, I think that that effort is, like, worth it. You sort of solve the problem once and then it's solved for all time in the...

[00:17:53] David: When did you, so did, when did you start, um, this particular report? I mean, it says May 6, but that's obviously not the first time you did

[00:18:02] Safia: So we started this, well, the State Fiscal Briefs were the first sort of time that we attempted something on this level. Um, and I think there had been, within the State and Local Finance Initiative, people had wanted to have background pages on each of the states for a very long time because they, we work with states

[00:18:23] and recognize that each state has its own history and demographics, and all these things really matter for the policies. Um, but that project just never got off the ground because it was so much work. And there was, like, a Word document of one potential page. And the idea of creating that in Word for 51 states just was insane.

[00:18:44] And then also having to update it constantly. I think, um, Aaron mentioned that we update these pretty often. I think once a month. At first, it was like, okay, this is something we can update once a year and already that's gonna be a very heavy lift. And so when I started, this was something that got handed off to me and it was like, okay, this is like a classic, like research assistant task of having to update all these pages and fill out all the numbers.

[00:19:07] And, but then, um, once we integrated it with, with R, it became so much easier, so much faster. We were able to update these much quicker, um, and have information flow much faster and also much more accurate, because, I mean, having things be accurate is something that we put a high, we value a lot at Urban, and the idea of having someone checking every single number once we had done it.

[00:19:30] And the, the room for human error, when you have 51 of these and are checking off of PDFs and Excel files and the BLS website and all these things, every single number, it would've been impossible. And there would've been a lot of errors before. And so using R Markdown made it not only quicker, but much more accurate.

[00:19:47] And we were able to pull in more information because of that and make more interesting calculations and observations.

[00:19:55] David: Yeah,

[00:19:55] Aaron: And have it be much... Right. I mean, there's all these visualizations, which is, which is really important too.

[00:20:01] David: Yeah, let's actually talk about that in one second, but I think Safia, that's such a great point, how it's not just more efficient, but it's more accurate because you're automating it. I mean, obviously you have to write the code correctly and you probably wanna, you know, take a good look at the code that you're developing and have other people check it.

[00:20:16] But once you're, you're certain that the code is, is working as you intended, you don't have to worry about those kind of copy-paste errors that always happen, no matter how careful you are. Um, let's

[00:20:30] Livia: If, if I could just add something to, kind of, on the topic of this, of, um, you know, using R Markdown makes it a lot more efficient, is we have another resource, um, in the state and local finance page, which is backgrounders. And these are, like, similarly fact sheets about, um, different types of tax revenues and expenditures, and those, um, I think there's maybe like 20 or 15 of them.

[00:20:55] Um, we do through, like, Word or, or through Excel, and then we upload the HTML to, um, or yeah, update it in the backend of the website. And it's very cumbersome even with just 15 of them. And we're only updating once a year. It takes several weeks and several

[00:21:16] David: Oh,

[00:21:17] Livia: RAs or, you know, people to be fact checking and updating numbers and, and things of that sort.

[00:21:22] So R Markdown definitely helps with making it more efficient and, um, also, like, less costly, um, cuz, you know, the more people and the more time that's spent on it, the more that they cost. So, um, I think that's, like, another added benefit.

[00:21:42] David: That makes a ton of sense. And, and clearly you all spent a bunch of time, you know, when you first put this together, as, as you said before, but I'm curious now, do these, what's the process like for updating these? Is it fairly automatic?

[00:21:56] Livia: Yeah. I, I can talk about that. Um, so I update them, um, every couple weeks or so. I think now that the pandemic is, you know, slowly edging towards an endemic, um, we don't update them as often as we used to. Um, but it's very simple. Um, a lot of the numbers, um, that, like, we pull in, I just download the spreadsheet or, you know, I call the APIs, I run some code.

[00:22:27] Um, I knit, uh, one of the states to check and make sure things look okay. And then I iterate the rendering. Um, and then, like I said, I kind of push 'em off to the communications team so that they can upload it to the website. Um, and so it doesn't take more than half an hour. Um, so very quick. Um, I, I, I've also been doing it a lot, so it feels really easy, you know, I know where I'm going and know where to get things.

[00:22:55] Um, but even then we've also removed, um, some data. So there, there used to be more data specifically on unemployment claims. Um, but we've taken those out. And so, um, you know, as more time has gone by, um, and we are shifting to other priorities, uh, within our center, you know, we've kind of taken a step back also and done a little less updating.

[00:23:18] Um, but a lot of the, you know, the numbers that you see are, are, are as up to date as we can find.

[00:23:25] David: Yeah. And I think you make a good point. I mean, even though you said, you know, oh, I've been doing this for a while, so it's really quick for me. There, there's, you know, it's possible to automate this to the point where you can do it quickly. If you are doing this by hand, no matter how fast you are, it's never gonna be anywhere close to half an hour.

[00:23:43] It's gonna be hours and hours and hours. So, um, I think that's something important to keep in mind. Um, Aaron, you talked, you started to talk briefly about kind of how it became more visual, in the data viz. Can you talk, like, are there ways that switching to R Markdown and parameterized reporting made it possible to be more visual, or, like, what's the relationship between the parameterized reporting and, and having the reports be more visual?

[00:24:13] Aaron: Mm-hmm. I mean, I think the, this is true of not just parameterized reporting, but anytime you use R Markdown, ggplot2 is just right there. And so it's just such a natural fit to make data visualizations instead of, you know, using narrative or, or even creating tables or things like that. I think, um, just by reducing, it takes, it takes a long time to create a data visualization.

[00:24:39] And if you can, you know, get it to the point where you can iterate it as easily as what Livia just said, it makes it something that's more sustainable. You're not having to plug the data into Excel, make sure your cell references are correct. There's always a cell reference error. And then, you know, copy that as a PNG or whatever into the document.

[00:24:59] Maybe you have to do that eight times per state, right? That's just not sustainable. There's just such a natural fit between R Markdown and ggplot2. I, I almost think of them as two tools within the, as, as sort of one tool, right? Uh, they're kind of linked in my mind. Anytime I think about one, I think about the other.

[00:25:14] And so it's just such a natural fit in this application.

[00:25:17] David: Yeah. That makes a ton of sense. Um, so you all talked about this, but, um, talk about maybe just even a bit more about how much more, like, if you hadn't done this in R Markdown, and it sounds like there are other kind of similar, um, iterated reports at Urban that are done in a way that doesn't involve R Markdown.

[00:25:40] Like, can you expand a bit on the, on the process and what that looks like compared to the process that you all have set up here?

[00:25:49] Livia: I can talk about that, and Safia, feel free to chime in, cause I know you've done work on the backgrounders before, too. Um, but yeah, this other, um, um, resource or product that, uh, I was speaking about, um, you know, we pull in data yearly from the Census of Governments, um, and relating to, um, like I mentioned, tax revenues and, and, um, different sources of tax expenditures too.

[00:26:18] And, um, it's very time intensive. And so we have kind of assigned a person per backgrounder, and then they have to go in and update all of the charts, and, um, if there's any tables or graphs. Um, and so pulling in the data, um, you know, from the Census of Governments dashboard that, um, we have, and then, uh, adding it to Excel and kind of updating it.

[00:26:45] Um, and then also updating the text and, you know, where certain numbers are referenced. And so it can be a very tedious, uh, process and leaves a lot of room for error as well. And so once one person has a go at it, um, we have multiple people also checking and looking at the numbers and, you know, the edits that have been made, uh, to make sure that, you know, we're correctly, um, calling the different, um, uh, data that's changed.

[00:27:19] And so, um, yeah, I think it's, it's vastly different, uh, that process versus using R Markdown. Um, like I said, it requires a lot more people and a lot more time. And even then, like, it has to go through multiple rounds of editing to make sure that we have everything correct. And, and even then we still can catch mistakes, um, later on.

[00:27:42] And so I think it's, it's just a lot easier with R Markdown. I think there's a, um, uh, you know, upfront, there's a lot more, uh, labor and time that's put into it because you're, you know, writing all the code, you're cleaning the data so that you have everything that you need. Um, but in the long run, it's a lot faster.

[00:28:04] David: Makes a ton of sense. Aaron, you haven't convinced everybody else at Urban

[00:28:12] Aaron: I've done. Um, so I will say, I mean, R use at the Urban Institute has, has gone up quite a bit in the last few years. Um, there's also been a lot of people who have done this type of parameterized reporting. Maybe I can give a little bit of background here. Um, I mean, the first time we did this was five or six years ago and it was a huge lift.

[00:28:33] Then a couple of things have changed that have simplified that process over time. The first one: back then there were lots of in-person convenings. And so people really wanted us to create PDFs. And anytime you have page breaks, it becomes a much bigger hassle. It looks great for Iowa. And then you go to Massachusetts and all of your page breaks are messed up.

[00:28:53] Right? So, um, that was a big challenge, but we, you know, created a lot of LaTeX code that sort of, uh, made it so that the next person who creates a template is starting from, you know, an advanced, uh... We also have an R package for ggplot2 called urbnthemes, right? So that makes all the plots consistent.

[00:29:12] Um, we have an R package called urbntemplates, right? So we actually have our own sort of branded HTML documents with the associated CSS. So now that we have that, it makes things that much, um, easier. So, I mean, we did this 7, 8, 9 different times before we collaborated on this project together. Um, one...

[00:29:33] The pandemic was terrible. One thing that changed during the pandemic is that there's no more in-person convenings. Um, and so it was a much easier sell to say, hey, we can do this as HTML. Uh, and so then you're no longer fretting about page breaks, which honestly, it, like, saves a lot of time. And then also if you're gonna be doing something every two weeks, uh, you don't have to, like, look through the page breaks, which is not fun and also, like, really challenging.

[00:29:56] Um, I think there's one other thing that's helped. Well, there's a lot more people at Urban that know R. That makes it a lot easier to do this, uh, which is, like, a huge success from the past couple of years. And then the final thing is sort of, all right, we've made these pages. They could be for states, 51, but we've also made 'em for counties.

[00:30:16] So 3,142. I mean, that by hand would just be entirely impossible, but there's this, then there's this sort of question of, like, oh, well, what do we do with these? And so working with comms, we have all of that templated now. We have the US map landing page, where you can pick a state. We have a US map with a landing page where you can pick a county, and we've worked out some of this backend stuff.

[00:30:38] So instead of having to have a bunch of meetings, every time we want to do this and like, figure out what are we doing now? It's like, oh, Hey, we want to do this project again. And everyone's like, okay. We know exactly how to handle that. So, um, it's a process that's evolved and it's a, it's much simpler now than it was five years ago.

[00:30:54] David: Yeah, that makes a ton of sense. I'll actually just say as an aside on the page break. So on the consulting side of my business, we do a lot of parameterized reporting and a lot of clients still want PDFs. And, um, so we, we actually use the pagedown package,

[00:31:10] Aaron: right. Yeah. Oh

[00:31:12] David: Which has been better. I don't, I don't actually know LaTeX and I've just heard horror stories from people.

[00:31:16] So I know HTML and CSS, so that's been much easier, but yeah, we had one report, um, on census outreach efforts and, you know, we had maps and charts and text and with maps, like the shapes of the states all vary. And so everything, we, we had to write some custom code that would be like, if it's this state, put a page break before this. I mean, obviously things that you, uh, are familiar with, but yes, dealing with HTML is, I, I envy being able to, to work in HTML.
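
The "if it's this state, put a page break before this" idea David describes could look something like the minimal sketch below. It is not the code from his project: the `needs_break` vector, the two places in it, and the `page_break_before()` helper are all hypothetical, and it assumes the report is rendered to paged HTML where the CSS `break-before: page` property is honored.

```r
# Hypothetical list of places whose maps need a fresh page in the PDF layout.
needs_break <- c("Alaska", "Texas")

# Return raw HTML that asks for a page break, or an empty string otherwise.
page_break_before <- function(place) {
  if (place %in% needs_break) {
    '<div style="break-before: page;"></div>'
  } else {
    ""
  }
}
```

In an R Markdown chunk with `results='asis'`, something like `cat(page_break_before(params$state))` would emit the break only for the listed places, assuming the report has a parameter named `state`.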

[00:31:52] Aaron: And there are things you can do with CSS to make them a little more printable. It'll never be as nice as the highly formatted PDF. Um, people really want that stuff to be print. But you're right. I mean, it is, it is envious because like that's not fun to think about. And like, I know that Livia and I know that Safia and I know that everyone in TPC, like they wanna focus on the content, right.

[00:32:13] They don't want to spend all their time thinking about formatting. So, uh, they wanna spend as much time on the content. So that's, that's a nice thing about the R Markdown with HTML. It's, it's much. Hmm.

[00:32:22] David: Yeah. Well, let me ask then one last question, which is, um, do you all have kind of advice, um, things you think about if people are considering doing parameterized reporting? I mean, just as an example, I, my, my general advice for people is typically keep your data visualization simple, because if you, if they get too complex or if you have too many annotations, you just don't know what it's gonna look like when you iterate it 50 or a hundred times or whatever.

[00:32:51] Anyway, I'm curious if you all having done this for a while now, have things that you've come to realize, or kind of best practices when doing parameterized reporting.

[00:33:01] Livia: Well, one thing I, I don't think we've talked a lot about is kind of, um, making sure that you're aware of exceptions. Um, and in the state pages, uh, one really big one, for example, is Washington, DC. So technically not a state, but we include it, uh, as a part of our pages. So there's 51 in total. And because DC, you know, it's not a state, it isn't

[00:33:26] North Carolina, we reference it as the District of Columbia. That, um, ended up, like, we had to add in a lot of exceptions in our text, or, um, even for example, instead of having a governor, it has a mayor. Um, and so I think that's really important when you're kind of figuring out a project that you want to use,

[00:33:46] um, parameterization for, is to think of, okay, are there gonna be a lot of exceptions, you know, or is it only a few? How is it gonna impact the data that I wanna use or the text that I wanna have? Um, cuz that's like a big source of, um, our R Markdown file is, is, uh, it's a pretty long file. Um, right, cuz in certain aspects of a budget or, um, data that they release, you know, there are these exceptions as well.

[00:34:14] Okay.
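
Livia's point about exceptions can be sketched in base R as a small lookup that overrides default wording. This is only an illustration, not the State Fiscal Briefs code: the `exceptions` list, its field names, and the `place_terms()` helper are hypothetical, and the only override shown is the one mentioned above (the District of Columbia, with a mayor instead of a governor).

```r
# Hypothetical table of places that need special wording.
exceptions <- list(
  "District of Columbia" = list(
    label = "the District of Columbia",
    unit = "district",
    executive = "mayor"
  )
)

# Return the wording for one place, falling back to the state defaults.
place_terms <- function(place) {
  defaults <- list(label = place, unit = "state", executive = "governor")
  override <- exceptions[[place]]  # NULL when the place has no exception
  if (is.null(override)) defaults else modifyList(defaults, override)
}

# Build a sentence from the terms instead of hard-coding "state" or "governor".
terms <- place_terms("District of Columbia")
sentence <- paste0("This page covers ", terms$label,
                   ", where the chief executive is the ", terms$executive, ".")
```

Keeping the overrides in one place makes it easier to see how many exceptions a project really has, which is the question Livia suggests asking up front.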

[00:34:15] David: That makes a ton of sense. Aaron, Safia, anything you'd add?

[00:34:21] Aaron: Uh, definitely. So, I mean, the first thing is, for any parameter that you have, um, definitely try sort of the shortest value and the longest value. So look at Iowa, look at the District of Columbia or Massachusetts, right? Even when you don't have page breaks, titles can get clipped in data visualizations and things like that.

[00:34:40] We work a lot with people who aren't R programmers. So you always have to sort of be the translator between their vision and sort of what gets implemented. Definitely have the people who are coming up with what the design is going to be, try and sort of create a static image of what they want things to look like.

[00:34:57] I find it's a lot easier to sort of implement a well-defined vision than it is to kind of endlessly tinker with a vision. And then when you're collaborating with someone else, some things are really easy and seem like they're easy. Some things are really hard and seem like they're hard. And then some things seem like they should be really easy, but are really hard.

[00:35:20] And so it's a conversation with the people that you're working with that may not understand everything that goes into creating the fact sheets and just being like, how much are you wed to this one specific idea? Cuz it's gonna be really hard. And they're always like a little surprised by that. But then more times they're, like, often surprised by like, oh, that part was like really easy.

[00:35:38] And I thought it would be really hard. You know, take the wins when you can, and then try to, you know, avoid the really challenging things when you can.
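
Aaron's suggestion to try the shortest and longest parameter values can be sketched with base R's built-in `state.name` vector. The `places` vector and the idea of rendering only these edge cases first are assumptions for illustration, not a description of Urban's workflow.

```r
# state.name is a built-in vector of the 50 state names; add DC by hand.
places <- c(state.name, "District of Columbia")

# Find the names most likely to expose clipped titles or awkward wrapping.
n_chars <- nchar(places)
shortest <- places[n_chars == min(n_chars)]
longest <- places[n_chars == max(n_chars)]

shortest  # "Iowa" "Ohio" "Utah"
longest   # "District of Columbia"
```

Rendering the report for these few values before looping over all 51 is a cheap way to catch layout problems early.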

[00:35:46] David: That makes a ton of sense. Safia, anything you'd add?

[00:35:49] Safia: Sure. I mean, I think a part of parameterized reporting is the idea of copying and pasting the same thing over and over. So I think in trying to think of your vision as like, what, what am I copying and pasting? And what, what are the themes throughout and what are the things that make each page distinct?

[00:36:05] And that sort of helped us think through what the template's gonna look like and what the commonalities are and where, as Livia said, we wanna carve out these exceptions and highlight important differences, that really helped us figure out the structure helped us figure out how to market the whole set of pages together.

[00:36:23] Aaron: Yeah. And I think that I just wanna piggyback on one thing there. Right? I mentioned kind of draw out a picture of what you want it to be like. And then I often encourage people to like, take a highlighter and highlight every single thing they think would change, or, as Safia said, sort of copy and paste.

[00:36:37] And sometimes you, like, people realize like, oh, I'm highlighting everything. And it's like, well then maybe that's not like a good candidate for this tool. Right. And so then they refine it to where it's like, okay, I'm only highlighting certain words. Oh no, we have to change the "th" to an "nd", right? And little things like that.

[00:36:53] But that act of actually highlighting, I think, it forces the content creators to maybe pare things down a little bit.
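
The "th" versus "nd" detail Aaron mentions is the kind of small thing a helper function can handle, for example when a sentence reports a state's rank. The `ordinal()` function below is a hypothetical base R sketch that works on one whole number at a time.

```r
# Turn a single whole number into an English ordinal, e.g. 2 -> "2nd".
ordinal <- function(n) {
  last_two <- n %% 100
  last_one <- n %% 10
  suffix <- if (last_two %in% 11:13) {
    "th"  # 11th, 12th, 13th are irregular
  } else if (last_one == 1) {
    "st"
  } else if (last_one == 2) {
    "nd"
  } else if (last_one == 3) {
    "rd"
  } else {
    "th"
  }
  paste0(n, suffix)
}

ordinal(2)   # "2nd"
ordinal(11)  # "11th"
ordinal(51)  # "51st"
```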

[00:37:02] David: That is great advice. I'm thinking about that for the next time I'm working with a client who wants to do some kind of parameterized reporting.

[00:37:09] So, um, great. Well, thanks. Uh, all three of you, I really appreciate you taking the time to chat and, uh, look forward to sharing this with, uh, people to help them learn more about parameterized reporting.

[00:37:22] Aaron: Thank you.

[00:37:24] Thanks again for listening. I hope you found this conversation interesting. If you have any feedback, I'd love to hear it: david@rfortherestofus.com. Thanks.


By David Keyes
April 5, 2023
