Data frame error message

  • jordan.trachtenberg

    Member
    October 20, 2020 at 7:55 am

    Hi everyone,

    Here is my code for the “Create a New Data Frame” video:

    mental_health_over_30 <- nhanes %>%
      filter(age >= 30) %>%
      group_by(gender, age_decade) %>%
      drop_na(age_decade) %>%
      summarise(mean_bad_mental_health_days = round(mean(days_ment_hlth_bad, na.rm = TRUE), 1)) %>%
      arrange(desc(mean_bad_mental_health_days))

    I’m getting an error message that I’m not sure how to debug. Thanks for your help!

    summarise() regrouping output by ‘gender’ (override with .groups argument)

  • diego.catalan molina

    Member
    October 20, 2020 at 9:30 am

    Hi Jordan! I haven’t watched the video, but it seems like the message you get is a warning, not an error. If you run this pipe, do you see mental_health_over_30 in your global environment?

    summarise() is really useful, but it often drives me crazy because the grouping variables are carried over to other dataframes/tibbles or to further steps in a pipe. When you want to override them (e.g., to arrange rows by a different variable), you’ll get warnings/errors like this one. Just in case, you can control whether grouping variables are kept or dropped using the .groups argument of summarise().

    • jordan.trachtenberg

      Member
      October 20, 2020 at 10:37 am

      @diego-catalan-molina , I now see the mental_health_over_30 in my global environment. I’m not sure I fully understand what you mean by summarise() being carried over. Do you mean that summarise() has a default order, and the group_by() is attempting to override it, which causes the warning?

      • diego.catalan molina

        Member
        October 21, 2020 at 11:02 am

        Did you watch David’s video? If so, then you saw how the grouping variables can be “carried over” as hidden information about your df (data frame). This hidden information sorts your rows in a specific way (first by gender, then by age_decade). So when you then try to sort your data in a different way by using arrange(), you may be triggering the warning. Not 100% sure though.

        In the end, it doesn’t really matter because you still get the df that you wanted. But David’s suggestion to use the .groups argument within summarise() is really useful, especially if you want to merge your new df with other data and you get errors related to your grouping variables.
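For anyone following along, here is a minimal sketch of what the .groups argument does, assuming dplyr 1.0.0 or later. The toy_nhanes tibble and its values are made up for illustration and only borrow the column names from the thread. Note that the message is emitted by summarise() itself when it has to choose a grouping for its output, so arrange() is probably not what triggers it.

```r
library(dplyr)

# Made-up stand-in for the course's nhanes data (values are invented)
toy_nhanes <- tibble(
  gender = c("female", "female", "male", "male"),
  age_decade = c("30-39", "40-49", "30-39", "40-49"),
  days_ment_hlth_bad = c(4, 2, 3, 1)
)

grouped <- toy_nhanes %>%
  group_by(gender, age_decade)

# "drop_last" removes only the last grouping variable (age_decade).
# This is what summarise() does by default here; stating it explicitly
# silences the message.
by_drop_last <- grouped %>%
  summarise(mean_days = mean(days_ment_hlth_bad, na.rm = TRUE), .groups = "drop_last")

# "drop" removes all grouping from the result
by_drop <- grouped %>%
  summarise(mean_days = mean(days_ment_hlth_bad, na.rm = TRUE), .groups = "drop")

# "keep" retains both grouping variables
by_keep <- grouped %>%
  summarise(mean_days = mean(days_ment_hlth_bad, na.rm = TRUE), .groups = "keep")

group_vars(by_drop_last)  # "gender"
group_vars(by_drop)       # character(0)
group_vars(by_keep)       # "gender" "age_decade"
```

Which option fits depends on what comes next in the pipe. If nothing after summarise() relies on the grouping, "drop" is usually the least surprising choice.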

  • David

    Organizer
    October 20, 2020 at 7:51 pm

    Hey Jordan, I made a short video to help you understand what’s going on here. This is something that I’m getting asked about A LOT right now (this warning message is new). Hope this helps!

    https://vimeo.com/470432075

    • jordan.trachtenberg

      Member
      October 21, 2020 at 11:10 am

      Thank you, @dgkeyes . This is very helpful. So when I skim nhanes, it shows me that there are no group variables, but when I skim mental_health_over_30, it shows me that I have 2 group variables (gender, age_decade) based on what I initially set up in my dataframe. I’ll have to play around with the .groups settings to see how they change the grouping.
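The grouping can also be checked directly with dplyr's group_vars() and is_grouped_df(), without going through skim(). The sketch below assumes dplyr 1.0.0 or later and uses a made-up toy_nhanes tibble in place of nhanes. With the default behaviour, summarise() is expected to drop only the last grouping variable, which is consistent with the "regrouping output by 'gender'" message.

```r
library(dplyr)

# Made-up stand-in for the course's nhanes data (values are invented)
toy_nhanes <- tibble(
  gender = c("female", "female", "male", "male"),
  age_decade = c("30-39", "40-49", "30-39", "40-49"),
  days_ment_hlth_bad = c(4, 2, 3, 1)
)

# The raw data has no grouping
raw_groups <- group_vars(toy_nhanes)

# Default summarise(): no .groups argument, so dplyr may print its message
toy_summary <- toy_nhanes %>%
  group_by(gender, age_decade) %>%
  summarise(mean_days = mean(days_ment_hlth_bad, na.rm = TRUE))

# The summary is still a grouped data frame, grouped by gender only
summary_groups <- group_vars(toy_summary)
still_grouped <- is_grouped_df(toy_summary)

raw_groups       # character(0)
summary_groups   # "gender"
still_grouped    # TRUE
```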

  • David

    Organizer
    October 21, 2020 at 3:08 pm

    You can play around with that argument. You can also use the ungroup() function to remove all grouping.
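A short sketch of why ungroup() can matter, again assuming dplyr 1.0.0 or later and a made-up toy_nhanes tibble. The rank column is hypothetical and only there to show the difference: on a grouped result, mutate() works within each gender, while after ungroup() it works across all rows.

```r
library(dplyr)

# Made-up stand-in for the course's nhanes data (values are invented)
toy_nhanes <- tibble(
  gender = c("female", "female", "male", "male"),
  age_decade = c("30-39", "40-49", "30-39", "40-49"),
  days_ment_hlth_bad = c(4, 2, 3, 1)
)

# Result is still grouped by gender (the last grouping variable is dropped)
toy_summary <- toy_nhanes %>%
  group_by(gender, age_decade) %>%
  summarise(mean_days = mean(days_ment_hlth_bad, na.rm = TRUE), .groups = "drop_last")

# Ranking on the grouped result happens separately within each gender
within_gender <- toy_summary %>%
  mutate(rank = min_rank(desc(mean_days)))

# After ungroup(), the same ranking runs across all four rows
overall <- toy_summary %>%
  ungroup() %>%
  mutate(rank = min_rank(desc(mean_days)))

within_gender$rank  # 1 2 1 2
overall$rank        # 1 3 2 4
group_vars(overall) # character(0)
```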

    • clint.thomson

      Member
      November 17, 2020 at 1:55 pm

      Hi David,

      Hope you are well. I realize this thread is a few weeks old, but I wanted to clarify I’m interpreting this message correctly after watching your video. If I receive the “regrouping output by” message, it’s essentially reminding me that I’ve grouped my data in a certain way, and that I can use the .groups argument in summarise to override the grouping I’ve set in group_by. Is this accurate? Thanks so much.

      • David

        Organizer
        November 18, 2020 at 2:24 pm

        Yup, exactly right!
