Dana Wanzer is an evaluation consultant currently finishing her PhD in Evaluation and Applied Research Methods at Claremont Graduate University. This fall she’ll be starting a new position as an Assistant Professor of Psychology in Evaluation Research at the University of Wisconsin-Stout.
I’ve gotten to know Dana over the last couple of years as she has learned R. It’s been enjoyable to have someone to talk to about the struggles of learning R as well as the mind-blowing moments of realizing what R is capable of (check out her article on using R for immediate reporting in evaluation). I asked Dana to reflect on her learning process and she was kind enough to oblige.
Why did you decide to learn R?
I’ve wanted to learn R pretty much since I first heard about it. I had a little bit of a coding background, but seeing the possibilities of R, knowing it was free (as opposed to SPSS, which was the only statistical software I knew at the time), and understanding the reproducibility of working in R were all really important to me.
How easy or difficult did you find learning R?
My journey in starting with R is perhaps different from most people’s. I learned R primarily to do structural equation modeling (SEM) when I was a teaching assistant for a professor at Claremont Graduate University. At the time, I knew SEM well, but only through AMOS. The professor was teaching primarily through R, so I had to catch up quickly. I knew the content well, just not how to apply it in R. That semester I went head first into it all and became really well-versed in the lavaan package for SEM, but man was there a lot of trial and error!
There are two components I feel were most challenging when starting to learn R.
First, with any new software comes having to learn new processes and debug issues. Thankfully, I’ve always felt really good at Googling, so finding answers wasn’t so much difficult as it was time-consuming. When I’m a full-time student working a bunch of part-time gigs, and much of the work I’m doing requires a fairly quick turnaround, it’s difficult to justify spending an inordinate amount of time figuring out how to do something I could have easily done in Excel or SPSS.
That was the second challenge I experienced in learning R: forcing myself to learn how to do in R the basic things I could easily and quickly do in other software. I had to keep telling myself that it would pay off in the long run (which it has), but that was a big challenge.
Then I learned about the Tidyverse, which completely changed my perspective on R. Learning the Tidyverse was when everything started to make sense to me and I could finally feel confident enough to fully transition to R.
I could finally clean and manipulate my data to get the frequency, descriptive, and statistical tables that I was able to get through SPSS and write in my reports!
In what ways has learning R changed your work?
Learning R has changed my data cleaning, analysis, and reporting workflow dramatically. I spend much more time in the cleaning and analysis stage than I used to, but the result is that I have a document of all my code that I can re-run at a later time.
This is perhaps the biggest reason why I switched to R: the reproducible workflow. As an evaluator, data comes in all the time and things shift all the time. Maybe I get a few more consent forms for students, so now I can add a few more people to my data. Maybe another school gets their data to me at a later date. Before, that would mean re-running everything manually. Now, I just update my data file, maybe adjust my filter code a little bit at the beginning, and then re-run everything. Voila! It’s all done in just a few minutes. That has been a huge time saver in the long run.
What do you think people considering learning R might not appreciate about it?
Being able to report directly from R is something that I don’t think many people learning R know about or appreciate enough. Many times I have feedback reports I want to send to a client when data comes in, just to give them the basic information so they know what to expect in the final report. Before, that would be quite a lot of work for each report. Now, when the data coming in is the same type with just different numbers, all I have to do is rerun the report with the new data and send it to them. I take a few hours to write up code that will work for all reports, and then every report thereafter takes just a few minutes to create (see this blog post for more information about the process).