Quick Interlude to Reorganize our Code

Your Turn

Reorganize your code so that you only create the enrollment_by_race_ethnicity data frame in one place.

Learn More

I haven’t found many resources that give recommendations for organizing code. I think it’s a) idiosyncratic to individuals, and b) the kind of thing that people who have used R for a while do without even thinking about it. The one resource I’ve found is called R Best Practices by Krista DeStasio.

My general practice is this:

Load packages at the top of my files. This ensures that you have access to all functions throughout your files.

Only create objects once. This avoids the issue we encountered where you don’t know what state your object is in.

Create as few objects as possible. I’ve found that doing all of my data cleaning and tidying before beginning analysis lets me create just a few objects, which I can then manipulate with a few lines of code to show a wide range of results.
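The three practices above can be sketched in a short script. This is only an illustration: the enrollment_raw data frame and its values are made up to stand in for imported data, and the cleaning steps are a simplified version of what a real file would need.

```r
# 1. Load packages once, at the top of the file
library(dplyr)
library(tidyr)

# Hypothetical raw data standing in for an imported spreadsheet
enrollment_raw <- tibble(
  district_id = c(1, 1, 2, 2),
  race_ethnicity = c("asian", "white", "asian", "white"),
  number_of_students = c("10", "-", "5", "15")
)

# 2. Create the cleaned object in exactly one place
enrollment_by_race_ethnicity <- enrollment_raw %>%
  mutate(number_of_students = na_if(number_of_students, "-")) %>%
  mutate(number_of_students = replace_na(number_of_students, "0")) %>%
  mutate(number_of_students = as.numeric(number_of_students)) %>%
  group_by(district_id) %>%
  mutate(pct = number_of_students / sum(number_of_students)) %>%
  ungroup()

# 3. Analysis reuses the one object without reassigning it
enrollment_by_race_ethnicity %>%
  filter(race_ethnicity == "asian")
```

Because enrollment_by_race_ethnicity is assigned only once, rerunning any later chunk cannot leave it in an unexpected state.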

Have any questions? Put them below and we will help you out!

You said that you don't "need" to include enrollment_by_race_ethnicity as the x in left_join(), but when I try to include it, the code does not run. As soon as I delete it, it runs. Is that because we cannot include it here, since it has not been created yet?
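One possible explanation, sketched with hypothetical data frames (enrollment and districts are made up for illustration): inside a pipeline, the pipe already passes the left-hand side to left_join() as x, so naming it again supplies one argument too many. And if the pipeline is the statement that creates enrollment_by_race_ethnicity, the object does not exist yet at the point where left_join() would look it up.

```r
library(dplyr)

# Hypothetical data frames for illustration
enrollment <- tibble(district_id = c(1, 2), number_of_students = c(100, 200))
districts <- tibble(district_id = c(1, 2), district = c("A", "B"))

# With a pipe, the left-hand side becomes x automatically
joined_pipe <- enrollment %>%
  left_join(districts, by = "district_id")

# Without a pipe, x is written out explicitly
joined_explicit <- left_join(enrollment, districts, by = "district_id")

# Both forms give the same result
identical(joined_pipe, joined_explicit)
```

Writing x explicitly works only in the second form, where nothing is piped into left_join().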

Jordan Helms

May 10, 2022

I ran into this issue during the Functions lesson. When I try to run the code, even when I copy what's in the solution, I get this error message:

"Error in mutate(): ! Problem while computing number_of_students = replace_na(number_of_students, 0). Caused by error in vec_assign(): ! Can't convert replace to match type of data ."

It happens with the 2018-2019 data. This section: "enrollment_by_race_ethnicity_18_19 <- clean_enrollment_data(raw_data = enrollment_18_19, data_year = "2018-2019", race_ethnicity_remove_text = "x2018_19_")"

I can run the code with no problems with the 2017-2018 data. When I don't use the function and do two separate code chunks, I don't get the error message. Unsure what's going on.
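The error message says that the replacement value and the column have different types. A minimal sketch of one way around it, assuming the 2018-2019 number_of_students column is read in as character (for example because it contains "-" entries); the students_chr data frame is made up for illustration:

```r
library(dplyr)
library(tidyr)

# Hypothetical character column like the one assumed for the 2018-2019 data
students_chr <- tibble(number_of_students = c("12", "-", NA))

cleaned <- students_chr %>%
  # Turn the "-" placeholder into a real NA
  mutate(number_of_students = na_if(number_of_students, "-")) %>%
  # Convert to numeric before filling in missing values
  mutate(number_of_students = as.numeric(number_of_students)) %>%
  # Now the column and the replacement value 0 are both numeric
  mutate(number_of_students = replace_na(number_of_students, 0))

cleaned
```

The alternative is to keep the column as character and replace with "0" before converting; either way, the column and the replacement need to be the same type at the moment replace_na() runs.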

Niger Sultana

May 17, 2022

Hi, I know David sent us the solution for debugging replace_na(), but I cannot find it; I probably lost the e-mail. Could you please help me see where the problem is in the code below?

clean_enrollment_data <- function(raw_data, data_year, race_ethnicity_remove_text) {
  raw_data %>%
    select(-contains("grade")) %>%
    select(-contains("kindergarten")) %>%
    select(-contains("percent")) %>%
    pivot_longer(cols = -district_id,
                 names_to = "race_ethnicity",
                 values_to = "number_of_students") %>%
    mutate(number_of_students = na_if(number_of_students, "-")) %>%
    mutate(number_of_students = as.character(number_of_students),
           number_of_students = replace_na(number_of_students, "0"))
    mutate(number_of_students = as.numeric(number_of_students)) %>%
    mutate(race_ethnicity = str_remove(race_ethnicity, race_ethnicity_remove_text)) %>%
    mutate(race_ethnicity = case_when(
      race_ethnicity == "american_indian_alaska_native" ~ "American Indian Alaska Native",
      race_ethnicity == "asian" ~ "Asian",
      race_ethnicity == "black_african_american" ~ "Black/African American",
      race_ethnicity == "hispanic_latino" ~ "Hispanic/Latino",
      race_ethnicity == "multiracial" ~ "Multi-Racial",
      race_ethnicity == "native_hawaiian_pacific_islander" ~ "Pacific Islander",
      race_ethnicity == "white" ~ "White"
    )) %>%
    group_by(district_id) %>%
    mutate(pct = number_of_students / sum(number_of_students)) %>%
    ungroup() %>%
    mutate(year = data_year)
}
> View(clean_enrollment_data)
> enrollment_by_race_ethnicity_18_19 <- clean_enrollment_data(raw_data = enrollment_18_19,
                                                              data_year = "2018-2019",
                                                              race_ethnicity_remove_text = "x2018_19_")

Error in mutate(number_of_students = as.numeric(number_of_students)) : object 'number_of_students' not found
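A likely cause, offered as a sketch rather than the course's official solution: in the function above, the line with replace_na() does not end in a pipe, so the chain stops there and the next mutate() runs on its own with no data frame, which is why number_of_students is "not found". The example below uses a made-up toy data frame to show the same two steps with the pipe in place.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data for illustration
toy <- tibble(number_of_students = c("3", NA))

cleaned <- toy %>%
  mutate(number_of_students = as.character(number_of_students),
         number_of_students = replace_na(number_of_students, "0")) %>%  # pipe keeps the chain going
  mutate(number_of_students = as.numeric(number_of_students))

cleaned
```

With the pipe restored after the replace_na() line, the following mutate() receives the data frame and can find the number_of_students column.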