After a recent Introduction to R workshop, a participant asked me a question. She works with data that comes to her with all numeric values, but these numbers actually represent character values. Every time she receives new data, she has to recode it manually in Excel, a situation she described to me via email:
A lot of the variables share the same code (we use a few different likert scales for groups of questions) and then others may only be used once (gender, race, etc.). I’ve been trying to use Excel to clean and decode the data for reporting but it’s a struggle when the data changes every week and we’re currently sitting at around 1500 observations for over 100 variables.
This workshop participant wondered whether R could help her automate this process. R is an incredible workflow tool, so of course it can! I had her send me some fake data so I could show what this would look like, and I put together a video demonstration for her.
There are two important things to note in this video:
- Automating this recoding process can save many hours of tedious work. It’s an example of how R doesn’t just replace your existing tools (like Excel), but allows you to do far more than you’re currently capable of doing.
- R handles data quite differently from other software. If you use SPSS, for example, you’re probably familiar with assigning value labels to numeric values. This is unnecessary in R, where we can simply store the values themselves as either character values or factors (I use the former here for simplicity).
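If you'd like a rough idea of what this kind of recoding can look like before watching, here is a minimal sketch in base R. It is not the code from the video: the data frame, the column names (`q1`, `q2`, `gender`), the labels, and the helper `recode_column()` are all hypothetical, made up for illustration. The idea is to keep one lookup vector per coding scheme and reuse it for every column that shares that scheme.

```r
# Hypothetical survey data: every value arrives as a numeric code.
survey <- data.frame(
  q1     = c(1, 3, 5, 2),
  q2     = c(4, 4, 1, 5),
  gender = c(1, 2, 2, 1)
)

# Lookup vectors (assumed labels): names are the codes, values are the labels.
likert_labels <- c(
  "1" = "Strongly disagree",
  "2" = "Disagree",
  "3" = "Neutral",
  "4" = "Agree",
  "5" = "Strongly agree"
)
gender_labels <- c("1" = "Female", "2" = "Male")

# Hypothetical helper: look up each code (as a string) in a lookup vector.
# Codes missing from the lookup vector come back as NA.
recode_column <- function(x, labels) {
  unname(labels[as.character(x)])
}

# Apply the shared Likert scale to every column that uses it.
likert_columns <- c("q1", "q2")
survey[likert_columns] <- lapply(
  survey[likert_columns],
  recode_column,
  labels = likert_labels
)

# Columns with their own scheme are recoded one at a time.
survey$gender <- recode_column(survey$gender, gender_labels)

print(survey)
```

Because the lookup vectors live in the script, rerunning it on next week's file should apply the same recoding without any manual work. This sketch produces character columns; if the order of the scale matters for tables or plots, a recoded column could be converted to a factor instead.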
Got any questions? Leave a comment below or let me know on Twitter.