Data frame error message

  • jordan.trachtenberg

    Member
    October 20, 2020 at 7:55 am

    Hi everyone,

    Here is my code for the “Create a New Data Frame” video:

    mental_health_over_30 <- nhanes %>%
      filter(age >= 30) %>%
      group_by(gender, age_decade) %>%
      drop_na(age_decade) %>%
      summarise(mean_bad_mental_health_days = round(mean(days_ment_hlth_bad, na.rm = TRUE), 1)) %>%
      arrange(desc(mean_bad_mental_health_days))

    I’m getting an error message that I’m not sure how to debug. Thanks for your help!

    summarise() regrouping output by ‘gender’ (override with .groups argument)

  • diego.catalan molina

    Member
    October 20, 2020 at 9:30 am

    Hi Jordan! I haven’t watched the video, but it seems like the message you get is a warning, not an error. If you run this pipe, do you see mental_health_over_30 in your global environment?

    summarise() is really useful, but it often drives me crazy because the grouping variables are carried over to other dataframes/tibbles or further steps in a pipe. When you want to override them (e.g., arrange rows by a different variable), then you’ll get warnings/errors like this one. For what it’s worth, you can control whether the grouping variables are kept or dropped using the .groups argument.
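
    Here is a minimal sketch of what the .groups argument does, assuming dplyr 1.0.0 or later. The toy data frame and the names toy, days, and mean_days are made up for illustration and are not from the course’s nhanes data:

    ```r
    library(dplyr)

    # Hypothetical toy data standing in for nhanes (values are made up)
    toy <- tibble(
      gender     = c("female", "female", "male", "male"),
      age_decade = c("30-39", "40-49", "30-39", "40-49"),
      days       = c(4, 6, 3, 5)
    )

    # Default behaviour: only the last grouping variable (age_decade) is
    # dropped, so the result is still grouped by gender
    by_default <- toy %>%
      group_by(gender, age_decade) %>%
      summarise(mean_days = mean(days))

    # .groups = "drop" returns an ungrouped tibble
    dropped <- toy %>%
      group_by(gender, age_decade) %>%
      summarise(mean_days = mean(days), .groups = "drop")

    group_vars(by_default)  # "gender"
    group_vars(dropped)     # character(0)
    ```

    Other accepted values include "drop_last" (the default shown in the first pipe) and "keep" (which keeps both grouping variables).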

  • jordan.trachtenberg

    Member
    October 20, 2020 at 10:37 am

    @diego-catalan-molina , I now see the mental_health_over_30 in my global environment. I’m not sure I fully understand what you mean by summarise() being carried over. Do you mean that summarise() has a default order, and the group_by() is attempting to override it, which causes the warning?

  • David

    Organizer
    October 20, 2020 at 7:51 pm

    Hey Jordan, I made a short video to help you understand what’s going on here. This is something that I’m getting asked about A LOT right now (this warning message is new). Hope this helps!

    https://vimeo.com/470432075

  • diego.catalan molina

    Member
    October 21, 2020 at 11:02 am

    Did you watch David’s video? If so, then you saw how the grouping variables can be “carried over” as hidden information about your df (data frame). This hidden information sorts your rows in a specific way (first by gender, then by age_decade). So when you then try to sort your data in a different way by using arrange(), you may be triggering the warning. Not 100% sure though.

    In the end, it doesn’t really matter because you still get the df that you wanted. But David’s suggestion of using the .groups argument within summarise() is really useful, especially if you want to merge your new df with other data and you get errors related to your grouping variables.

  • jordan.trachtenberg

    Member
    October 21, 2020 at 11:10 am

    Thank you, @dgkeyes . This is very helpful. So when I skim nhanes, it shows me that there are no group variables, but when I skim mental_health_over_30, it shows me that I have 2 group variables (gender, age_decade) based on what I initially set up in my dataframe. I’ll have to play around with the .groups settings to see how they change the grouping.

  • David

    Organizer
    October 21, 2020 at 3:08 pm

    You can play around with that argument. You can also use the ungroup() function to remove all grouping.
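
    As a rough sketch of the ungroup() route, assuming dplyr 1.0.0 or later (the toy data and the names toy, days, mean_days, and summary_tbl are placeholders, not the course’s nhanes data):

    ```r
    library(dplyr)

    # Hypothetical toy data (made-up values), not the course's nhanes data
    toy <- tibble(
      gender     = c("female", "female", "male", "male"),
      age_decade = c("30-39", "40-49", "30-39", "40-49"),
      days       = c(4, 6, 3, 5)
    )

    summary_tbl <- toy %>%
      group_by(gender, age_decade) %>%
      summarise(mean_days = mean(days)) %>%  # result is still grouped by gender
      ungroup() %>%                          # remove all remaining grouping
      arrange(desc(mean_days))               # sort the now-ungrouped rows

    is_grouped_df(summary_tbl)  # FALSE
    ```

    Calling ungroup() right after summarise() leaves a plain tibble, so later steps do not inherit the grouping.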

  • clint.thomson

    Member
    November 17, 2020 at 1:55 pm

    Hi David,

    Hope you are well. I realize this thread is a few weeks old, but I wanted to confirm that I’m interpreting this message correctly after watching your video. If I receive the “regrouping output by” message, it’s essentially reminding me that I’ve grouped my data in a certain way, and that I can use the .groups argument in summarise() to override the grouping I’ve set in group_by(). Is this accurate? Thanks so much.

  • David

    Organizer
    November 18, 2020 at 2:24 pm

    Yup, exactly right!

  • jordan.trachtenberg

    Member
    December 4, 2020 at 6:43 am

    @dgkeyes, I also found it very important to run ungroup() before running a t-test or ANOVA on existing data frames; otherwise I get a weird error saying that I don’t have enough observations.
