Replace all NA in dataframe (parameterized reporting)

  • Replace all NA in dataframe (parameterized reporting)

     jordan.trachtenberg updated 1 week, 2 days ago 2 Members · 3 Posts
  • jordan.trachtenberg

    April 6, 2021 at 8:24 am

    I’m struggling with how to replace all NAs with “” in a dataframe with a variable number of columns. I’m creating parameterized reports of graduates, and depending on the department, they might have a variable number of major columns (major1, major2, major3). Since I don’t know how many columns there are when generating my report, I’d like to be able to remove NAs from the dataframe before exporting the list to a .csv file. I was getting errors with a case_when statement, and this code doesn’t work either. If there’s a way to ensure that I get all three columns after I pivot_wider (even if there is no data), that would probably be helpful.

    I’ve attached an anonymous sample file of how I would like the data to look.

    # Contact list to import to SurveyMonkey

    contact_list <- exit_survey_data %>%

    filter(department == params$department_category) %>%

    mutate(name_to_split = full_name) %>%

    separate(col = name_to_split,

    into = c("last_name", "first_name"),

    sep = ", ") %>%

    distinct(email, first_name, last_name, exp_grad_date, major, major_code) %>%

    arrange(major_code, major, last_name) %>%

    pivot_wider(id_cols = c(email, first_name, last_name, exp_grad_date),

    names_from = major,

    values_from = major_code) %>% # Results in up 1-3 Major columns, depending on params$department_category

    clean_names() %>%

    mutate_each(funs(replace(.,, ""))) %>% # Need a way to replace all NA from the dataframe, regardless of how many Major columns there are. I get an error message: nm must be NULL or a character vector the same length as `x

    purrr::set_names("Email", "First Name", "Last Name", "Custom 1 (Grad Date)", "Custom 2 (Major 1)", "Custom 3 (Major 2)", "Custom 4 (Major 3)")

    write_csv(contact_list, file = stringr::str_c("data-clean/", params$department_category, " Grads 2021.csv"))

  • David

    April 9, 2021 at 4:52 pm

    Ok, I gave it my best shot by recording this video for you. Let me know if that’s helpful!

    • jordan.trachtenberg

      April 12, 2021 at 1:10 pm

      Hi David, Thanks for your attempt. I should have clarified that the reason I’m doing an untidy data setup is because I have to follow the structure that SurveyMonkey requires to import a contact list (also the reason why I’m changing the column names). So my original dataset is tidy, but I have some departments that don’t have double and triple majors, so that when I try to create the SurveyMonkey csv file, it’s not detecting students with major2 and major3 because they don’t exist in the filtered dataset. I realized after testing that my case_when statement actually works, but the set_names function does not because it’s not detecting major2 and/or major3 for some departments. I’m wondering if I should just create a blank row for each of those so that I can get those additional columns, but I don’t know the best way to do that. Attached is an example dataset that returns an error when I try to run the above code to get my desired output for SurveyMonkey.

Log in to reply.

Original Post
0 of 0 posts June 2018