Office Hours Megathread

     David updated 5 months, 4 weeks ago 3 Members · 27 Posts
  • David

    Organizer
    November 4, 2020 at 4:37 pm

    I’ve decided to centralize all office hours discussions here. If you have topics you’d like to discuss during office hours, add them below!

  • David

    Organizer
    November 4, 2020 at 4:38 pm

    On the October 9 office hours session, we discussed parameterized reporting.

    You can find the code from this session here.

    The multireport package that I have developed can be found here.

    The article from the Urban Institute that I referenced throughout is here.

    https://vimeo.com/466674095/bfe6ff63b2

    • clint.thomson

      Member
      January 4, 2021 at 3:56 pm

      Hi David,

      Happy New Year! Hope you had a wonderful holiday. Quick question regarding iterating to Word from a Markdown template. If I leave a parameter set in my Markdown document (i.e. State = “Washington”) and then run the iteration R script to create reports for all states – will this create an issue? For example, might data unique to Washington find its way into the reports for all other states? It doesn’t seem to, but I’d welcome your response.

      Thanks!

    • David

      Organizer
      January 4, 2021 at 8:56 pm

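      Great question, Clint! No, that won’t cause a problem. The parameter value in the YAML is only a default: when you render through an external R script and pass your own params, those values override it. The YAML value is what gets used when you knit by hitting the Knit button. Hope that answers your question!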
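To make the override concrete, here is a minimal sketch of a render script. The file name `report.Rmd`, the parameter name `state`, and the output file names are assumptions for illustration, not taken from the session code. Values passed through `params` in `rmarkdown::render()` replace the defaults declared in the YAML, so a leftover `state: "Washington"` in the template would not be used for the other states.

```r
library(purrr)

# Hypothetical template: an Rmd with `params: state:` in its YAML and
# word_document as its output format.
template <- "report.Rmd"
states <- c("Oregon", "Washington", "Idaho")

# Build one set of render() arguments per state.
render_args <- map(states, function(state) {
  list(
    input = template,
    output_file = paste0("report-", tolower(state), ".docx"),
    params = list(state = state)  # replaces the YAML default for this run
  )
})

# Only render when the template and pandoc are actually available.
if (file.exists(template) && rmarkdown::pandoc_available()) {
  walk(render_args, function(args) do.call(rmarkdown::render, args))
}
```

Each report is rendered with its own `params` list, so nothing from the YAML default carries over between runs.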

    • clint.thomson

      Member
      January 5, 2021 at 2:39 pm

      Great! Thanks, David. The reason I asked is because it appears I need to run my Markdown file first in order to obtain a list of “states” (or programs in my case) before using them as the parameters in the render function in my iteration R script. Is this a recommended workflow? Thanks again.

    • David

      Organizer
      January 5, 2021 at 3:53 pm

      I would just separate it out so that you build the list of states/programs both in your RMarkdown file and in the separate render.R file. Just put the code that reads in the data (from which you get the list of programs) in both places. Does that make sense?
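A rough sketch of that split, assuming a column called `program` and using a small made-up tibble in place of the real data file (both are placeholders, not Clint's actual data):

```r
library(dplyr)
library(tibble)

# Stand-in for the data that both files would read in,
# e.g. with readr::read_csv() pointed at the same file.
survey_data <- tibble(
  program = c("Arts", "Sports", "Arts", "Tutoring"),
  score = c(4, 3, 5, 4)
)

# In render.R: build the list of programs to iterate over.
programs <- survey_data %>%
  distinct(program) %>%
  pull(program)

# In the Rmd: read the same data, then filter to the current parameter.
# "Arts" stands in for the value that render() would pass as params$program.
one_program <- survey_data %>%
  filter(program == "Arts")
```

With this layout the Rmd never has to be knit first just to find out which programs exist.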

    • clint.thomson

      Member
      January 5, 2021 at 4:27 pm

      Yes – thank you👍

  • David

    Organizer
    November 4, 2020 at 4:39 pm

    On October 23, we discussed a range of topics:

    • Ordering in plots vs ordering in tables (4:27)
    • How to set the column names when making a table (32:00)
    • Custom pagedown templates (see an example here) (37:28)
    • Making parameterized reports (see also this blog post) (40:55)

    The recording of the session can be found below.

    https://vimeo.com/471566556

    Parameterized Reporting with RMarkdown

  • David

    Organizer
    November 4, 2020 at 4:40 pm

    On November 6, we’ll discuss creating a custom pagedown template. I’ll show one that I’ve developed and talk about how you can customize it for your organization.

    Other topics to discuss? Comment below!

    • clint.thomson

      Member
      November 5, 2020 at 2:20 pm

      Hi David, Quick update on the flextable formatting. I ran with your suggestions and produced the code below.

      library(dplyr)
      library(tibble)
      library(stringr)
      library(scales)
      library(flextable)

      # Sample data with a combined "percent (n=...)" label
      data <- tibble(
        survey_item = c(
          "Item #1 has to do with food people like to eat",
          "Item #2 is all about travel and places people like to go",
          "Item #3 is about movies people like to watch at home and used to see in theatres"
        ),
        percent_program_agree = c(0.78, 0.6, 0.42),
        n_program_agree = c(45, 34, 46)
      ) %>%
        mutate(agree_text = str_c(percent(percent_program_agree),
                                  " (n=",
                                  n_program_agree,
                                  ")"))

      data

      # "code" is not a column in the data, so flextable adds it as an
      # empty column that can be colored conditionally
      ft1 <- data %>%
        flextable(col_keys = c("survey_item", "code", "agree_text")) %>%
        bg(j = "code",
           i = ~ percent_program_agree >= 0.6,
           bg = "blue") %>%
        bg(j = "code",
           i = ~ percent_program_agree < 0.6,
           bg = "red") %>%
        width(j = "code", width = .05) %>%
        width(j = "agree_text", width = 1) %>%
        width(j = "survey_item", width = 4) %>%
        align(j = c("survey_item", "agree_text"), align = "left", part = "all") %>%
        padding(j = "code", padding.top = 1, part = "body")

      ft1

      What I’m hoping to get working pertains to the newly created “code” column. Is there any way to put space between the color codes in adjacent rows? I tried using padding, but to no avail. If there might be some additional time to discuss this, I’d really appreciate it!

    • David

      Organizer
      November 5, 2020 at 3:18 pm

      Yes! We can definitely discuss this tomorrow!

  • David

    Organizer
    November 6, 2020 at 11:25 am

    Hi folks, I enjoyed today’s session. You can find that below.

    We talked about creating a custom pagedown template. You can see the code that I created here (just look at the pagedown-template.Rmd file and the associated CSS files). Feel free to copy those and adapt.

    We also talked about conditional formatting of columns using the flextable package. I hope that was helpful!

    https://vimeo.com/476410496

  • David

    Organizer
    November 16, 2020 at 11:22 am

    The next R for the Rest of Us office hours session will be this Friday, November 20 at 10:00am Pacific time. I’ve got two things I want to go over (but please also submit ideas below and I’ll do my best to cover them):

    Heather Lewis, a primary school teacher, has asked how she could use R to produce regular reports on her students’ progress. I’ll take some sample data she has sent me and generate an RMarkdown report, which she will then be able to rerun at any point when she has new data. In addition to providing a refresher on RMarkdown (or an intro if you’re new to it!), I’ll go over a couple concepts that should be helpful even if you don’t work in education: pivoting data from wide to long and using the newish across() function in the dplyr package.
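For anyone new to those two ideas, here is a minimal sketch using made-up scores (the column names and values are invented for illustration and are not Heather's data). `pivot_longer()` is from tidyr and `across()` needs dplyr 1.0.0 or later.

```r
library(dplyr)
library(tidyr)
library(tibble)

# Made-up scores: one row per student, one column per assessment.
scores_wide <- tibble(
  student = c("A", "B", "C"),
  reading = c(70, 85, 90),
  writing = c(65, 80, 95),
  maths = c(75, 60, 88)
)

# Wide to long: one row per student and assessment.
scores_long <- scores_wide %>%
  pivot_longer(
    cols = -student,
    names_to = "assessment",
    values_to = "score"
  )

# across() applies the same summary to several columns at once.
class_means <- scores_wide %>%
  summarise(across(c(reading, writing, maths), mean))
```

The long version is usually the easier one to plot or group by assessment, while `across()` saves repeating `mean()` once per column.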

    The second thing I’ll go over is recreating one of the most interesting visualizations I saw in the wake of the US presidential election. This New York Times visualization shows the swing in county-level votes for president from 2016 to 2020.

    • Paul McElroy

      Member
      November 19, 2020 at 9:28 am

      Hi David, this got my attention as I’ve been wondering about how to make the transition from (sometimes data intensive) Crystal Reports to Markdown in my organisation. There could be anything from dozens to tens of thousands of lines in the output when the Crystal report is opened in Excel.

      Is it possible to use Markdown for this volume of data, and is there any way of being able to filter or play around with Markdown output as is easily possible in Excel?

      Would be great to be able to set up reproducible Markdown reports to replace Crystal development for each new report.

      I will try to join the call, but I’m 8 hours ahead in Ireland, so 6pm will be giddy kids time so may have to catch up on it after.

      Thanks a million,

      Paul

    • David

      Organizer
      November 19, 2020 at 4:37 pm

      Hope you can join, Paul! I totally understand giddy kids, though (I’ve got 4-year-old twins). The session will be recorded so you can watch that later.

      I’ll do some future sessions at earlier times to be more accessible for you and others!

    • David

      Organizer
      November 19, 2020 at 4:39 pm

      To your question, I’ll show you how you can export the data from RMarkdown to Excel files for people who want to see those.
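As a rough idea of what such an export can look like: the post does not name a package, so writexl is used here as an assumption, and the data frame and file name are made up. The same call could sit in a chunk of the RMarkdown file or in a separate script.

```r
library(tibble)
library(writexl)

# Made-up data standing in for whatever the report is built from.
report_data <- tibble(
  id = 1:3,
  region = c("North", "South", "East"),
  total = c(120, 95, 143)
)

# Write the data to an .xlsx file (a temporary folder is used here).
xlsx_path <- file.path(tempdir(), "report-data.xlsx")
write_xlsx(report_data, path = xlsx_path)
```

People who want to filter or sort can then open the .xlsx file in Excel alongside the rendered report.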

  • David

    Organizer
    November 20, 2020 at 2:48 pm

    Here’s our session from today!

    I went over a question from Heather Lewis about how R can help her to improve the efficiency of her workflow. I demonstrated how she could use RMarkdown to automatically generate reports on her students’ progress. In the process of doing this, I demonstrated pivoting data from wide to long in addition to showing how RMarkdown works. The code I generated for this report is here.

    In addition, I demonstrated how to recreate this New York Times visualization that shows the swing in county-level votes for president from 2016 to 2020. It was a bit more involved than I expected! You can watch this starting at 56:30 and you can see my code here.

    https://vimeo.com/481871952

  • David

    Organizer
    December 4, 2020 at 11:53 am

    It was great talking with everyone today.

    We started off talking about a few tips based on the data that @jordan-trachtenberg shared, including:

    We also discussed making functions (at around 22:00). We did this using an example from this project I’ve been working on related to wildfires here in Oregon (the rendered version of this is here). The RMarkdown file I showed is here (cc: @meena-patil ). I also have a blog post about making your own functions, which is a good place to start if you’re new to the concept.

    Finally, we worked on helping @brandi.collins work with her data (around 41:00). We worked on reshaping her data to make it tidy in order to make analysis of it simpler. The code we created is here. If you want to read more about tidy data in general, check out this article.

    https://vimeo.com/487375375/2f13c0fea1

    How to Make Functions in R

  • clint.thomson

    Member
    December 17, 2020 at 1:29 pm

    Hi David!

    Hope this finds you well. I am looking forward to attending your office hours tomorrow. I do have one question which sounds like it might fit with your topic:

    Lately, I’ve been generating lots of summary tables. One aspect varying from table to table is the variables I enter into the group_by function. For example, if you were working with Census data, you may wish to have:

    -Table 1 reporting average age and % unemployment by State

    -Table 2 reporting average age and % unemployment by State and County

    -Table 3 reporting average age and % unemployment by State, County, and City

    If the group by variables are different for each table but the statistics reported are otherwise identical – could you use a single function to generate the three tables mentioned above?

    I’ve tried to create some basic code to conceptually illustrate what I’m hoping to do:

    data %>%
      group_by(state, county, city)

    myfunction <- function(group_by_vars) {
      data %>%
        group_by(group_by_vars)
    }

    myfunction(group_by_vars = state)
    myfunction(group_by_vars = c(state, county))
    myfunction(group_by_vars = c(state, county, city))

    If you have any existing resources that touch upon this, please do direct me to them. I look forward to exploring some solutions to this question soon!

    Take care,

    Clint

  • David

    Organizer
    December 17, 2020 at 4:53 pm

    Yes, this is a great question! We can definitely discuss this tomorrow.

    If you want to read up on this before tomorrow, check this out.

  • David

    Organizer
    December 18, 2020 at 1:42 pm

    Hi folks, here’s a recap of our session today.

    We began by making a function to gather race/ethnicity data from the American Community Survey. We adapted the get_acs() function from the tidycensus package in a way that will automatically bring in this data whenever I need it. We discussed using the … to pass arguments from a user-generated function to one in another package. Code for this is here.
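The idea of passing `...` through can be sketched without an API key. In the example below `fetch_data()` is a hypothetical stand-in for a package function such as `get_acs()`, and the wrapper name and the table code are placeholders too. This is not the code from the session.

```r
library(tibble)

# Hypothetical stand-in for a package function such as tidycensus::get_acs().
fetch_data <- function(geography, table, year = 2019, state = NULL) {
  tibble(
    geography = geography,
    table = table,
    year = year,
    state = if (is.null(state)) NA_character_ else state
  )
}

# The wrapper fixes the table and forwards everything else through `...`.
get_race_ethnicity <- function(geography, ...) {
  fetch_data(geography = geography, table = "B03002", ...)
}

# `state` and `year` are not arguments of the wrapper; they travel via `...`.
county_data <- get_race_ethnicity("county", state = "OR", year = 2018)
```

The wrapper only has to name the arguments it cares about. Anything else the caller supplies is handed to the inner function unchanged.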

    We then talked about using variable names as arguments in functions. We created a function that accepts one or more variables and passes them along to group_by(). Code here.
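One way such a function can be written is sketched below. It is not necessarily the code from the session: the data, the column names, and the function name `summarise_by()` are made up. It relies on `{{ }}` together with `across()`, which needs dplyr 1.0.0 or later (newer dplyr versions also offer `pick()` for the same job).

```r
library(dplyr)
library(tibble)

# Made-up data in the spirit of the Census example.
census <- tibble(
  state = c("OR", "OR", "OR", "WA"),
  county = c("Lane", "Lane", "Multnomah", "King"),
  city = c("Eugene", "Springfield", "Portland", "Seattle"),
  age = c(34, 38, 36, 35)
)

# group_vars can be a single bare column name or several wrapped in c().
summarise_by <- function(data, group_vars) {
  data %>%
    group_by(across({{ group_vars }})) %>%
    summarise(mean_age = mean(age), .groups = "drop")
}

by_state <- summarise_by(census, state)
by_county <- summarise_by(census, c(state, county))
by_city <- summarise_by(census, c(state, county, city))
```

The summary code is written once, and only the grouping columns change from table to table.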

    We then talked about using the flextable package and used this as an example of how to learn new packages.

    The full video is below for your viewing pleasure!

  • clint.thomson

    Member
    February 2, 2021 at 9:10 am

    Hi David,

    Quick question. If I have a dataset of 40+ variables and several rows that are entirely NA, how can I create a variable that will easily identify these rows so that I can remove them? I suppose this would be similar to using the COUNTA function in Excel and removing all with a COUNTA of 0.

    Thanks!

  • David

    Organizer
    February 2, 2021 at 12:48 pm

    Great question that I can go over on Friday! If you want to get a head start, check out the get_dupes() and remove_empty() functions from the janitor package.
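As a head start, here is a small sketch with made-up data (the column names `id`, `q1`, and `q2` are placeholders). It shows `remove_empty()` from janitor for rows that are entirely NA, plus a dplyr variant that ignores an identifier column. The variant uses `if_any()`, which needs dplyr 1.0.4 or later.

```r
library(dplyr)
library(tibble)
library(janitor)

# Made-up responses: row 3 is entirely NA, row 2 has only an id.
responses <- tibble(
  id = c(1, 2, NA, 4),
  q1 = c("yes", NA, NA, NA),
  q2 = c(5, NA, NA, 3)
)

# Drop rows where every column is NA.
no_empty_rows <- remove_empty(responses, which = "rows")

# Variant: keep only rows with at least one non-missing value
# outside the id column.
has_answers <- responses %>%
  filter(if_any(-id, ~ !is.na(.x)))
```

The first result still contains the row that has only an id. The second drops it as well.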

  • clint.thomson

    Member
    February 3, 2021 at 10:21 pm

    Hi David,

    Thanks! I’ll give these a try. Unfortunately, I may not be able to attend the session on Friday – will the recording be posted afterward? If you could go over how to remove entirely blank rows, and rows that are blank save for perhaps an ID code (and how to remove these rows), that would be great. This would be a useful trick to learn.

    Take care,

    Clint

  • David

    Organizer
    February 4, 2021 at 3:17 pm

    Yup, I’ll post it in this thread!

    Do you have a sample dataset you could share with me so I could use it tomorrow?
