Adrienne Zell wears many hats. She is an Assistant Professor in the Oregon Health and Science University (OHSU) – Portland State University (PSU) School of Public Health. She is also the Director and co-founder of the OHSU Evaluation Core, an Assistant Director for the Oregon Clinical and Translational Research Institute (OCTRI), and the Director of the OCTRI Office of Research Impact at OHSU. Adrienne’s team provides evaluation support and technical assistance to academic and community-based researchers throughout Oregon.
I was inspired to reach out to Adrienne when I saw a tweet about her talk at the 2019 Symposium on Data Science and Statistics, which focused on how she moved her team from SPSS to R.
Adrienne was kind enough to answer a few of my questions about her own R journey and her team’s. As you’ll read below, she was fortunate to be able to work with Alison Hill (who now works at RStudio, but worked at OHSU at the time), to whom she gives a huge degree of credit for her team’s successful transition. Having a dedicated trainer, and dedicated time for learning, made the move to R work. As others have noted, learning R requires a significant time commitment, but it pays off many times over in increased efficiency.
Why did you decide to switch your team to R?
The initial reason was cost. Our team switched to Macs, and our university did not have an SPSS license for Macs. Also, the software did not work well on Macs, even if we had had the budget to purchase it (which we didn’t). For a short time, we used software that would let us run Windows on our Macs, but we were only using it for SPSS and it was slow and cumbersome. We considered Stata, which is what is used by many teams here and is taught to students (and therefore had an accessible price), but at that point it seemed silly to learn another proprietary program when we could put the time into learning R. I had used R before for graphics and statistics, but not for data management – that was really our sticking point because that is where we spend most of our time.
Talk about the process of learning R. What was most challenging?
For me, the most challenging part was setting aside time to learn and practice. As the director of our group, I don’t use these tools every day because I spend most of my time sitting in meetings! So, when I return to working with data, I need to keep refreshing my memory and skills. It can be frustrating to look up a solution to a simple need and have Google tell you, “You have visited this page 16 times”. Probably my biggest R “breakthrough” was when I was stuck on an airplane for a 13-hour flight. Without other distractions, I really had some time to focus and work through some of my own data. Also, one team member just didn’t want to get on board. When you have deadlines, it is hard to use new tools because you really just want to get the work done. It takes time outside of regular work hours to really learn, and some people are not interested in putting in the time. By literally pulling the plug on SPSS, we were forced to use R if we wanted to finish our work. If people are not willing to learn R, they are not going to be able to move forward in their careers – at least with my group.
Talk about working with Alison. In what ways was having her guide you helpful in getting your team using R?
As I mentioned, we spend a great deal of time in data management. The tidyverse is very helpful for this, but if you are new to R, you don’t know that. This was a need we communicated to Alison, and she taught us all those tools. We were very impressed by how easy it was to restructure and manipulate data. By working with Alison in a small group, we were able to use our own data and ask her questions along the way. Using data you are familiar with is very helpful – and you are also very invested in the results. Alison had some key concepts ready for us to learn, but she was also flexible. She taught us ggplot early on, which was very motivating. We were extremely lucky that she was available and willing to work with us. I was 49 when I first started learning R (and a woman), and although I don’t think age impacted my ability to learn, I was definitely more comfortable learning in a small group with flexible support.
What is different about how your team works now that you use R? In what ways does it contrast with how you used to work when you used SPSS?
The biggest difference is in the level of documentation and reproducibility of our work. We can always be better about documentation, but writing markdown files that can be shared across our team has allowed us to leverage each other’s work. SPSS litters your folders with all kinds of files. R keeps things much cleaner, and maintains the integrity and traceability of the raw data. The up-front time was worth it, in that it can now take us seconds to update our analyses and dashboards. Those of us who work in R really enjoy it, and we like to trade tips and tricks (no one did that when we were using SPSS).
I think that helps us feel like we are part of a team and a community, even when we are all working with different data. I also think it has given us a lot of self-confidence, in that we are staying current with our skills.
Having gone through this process, what advice would you offer to other teams looking to make the same transition to R?
I would recommend that teams try to find some funds for in-person training. You don’t need to hire a professional trainer. Even a graduate student who has used R can be a valuable resource for a beginner. I recently hired someone who is just out of undergrad, but knows R, and she is going to teach two of my other staff.
I would allow for learning time during work hours, if at all possible – but it is ok to recognize and send the message that professional development is valuable to a person’s career and it may take some time outside of work to master. Initially, projects may take a bit longer, and it may take some patience to get through the first few projects. The trade-off is a more efficient and reproducible workflow.
Once the initial training is completed, set aside time and/or create an atmosphere where team members can ask questions of each other and share code and other tips. Really, the most important thing is to be humble about your own learning curve – whether you are a newly minted computer scientist or a middle-aged manager. We can all benefit from learning something new.
The online video courses did not work well for us. Also, GitHub was not useful for us (yet). It seemed like another organizational system that was duplicative, and it was hard to get people on board, although I understand why it is useful. I think that it is important to consider that you can use these tools outside of – or adjacent to – the “tech” community. The R community is very welcoming, but there can be judgement in certain sectors. I have enjoyed attending conferences, particularly the RStudio conference. However, these are expensive and we can’t afford to send all team members.
Another downside of many of the conferences is that there seems to be a gap in the trajectory of the curriculum. There are usually beginner and advanced sessions, but few intermediate-level opportunities. I’m not sure how to define “intermediate”, but I would like to see more expansion of the concepts learned in the introductory sessions. A really great session would be, “10 ways to do frequency tables”. This is a topic that assumes you know the tidyverse, but you would like information on additional packages and perhaps some incorporation of base R.