R Versus Python: Which One Should You Learn?
If there is one question I get asked more than any other, it's this: should I learn R or Python?
Especially in recent years, many people have developed a vague sense that Python is replacing R (the rebranding of RStudio as Posit in order to serve multiple languages scared many longtime R users). But I think the discussion of whether R is declining is misguided, because ultimately whether R is declining matters less to your choice than which language is right for you. And discussions of the "right" language are too often had in the abstract. There is no right language for everyone. Which one is right for you depends on many factors, and demands a longer discussion.
Before getting into that longer discussion, a couple of caveats:
I've never used Python. Everything below is based on what I've heard from talking with others, but opinions vary widely so take mine with a grain of salt.
This is not an attempt to start a language war. Language wars are tedious and waste everyone's time. Instead, I think it's more useful to think through the pros and cons of R and Python in order to help you make the right choice about which one to learn.
What is Python good for?
Python has often been described as the second best programming language for everything. It's not necessarily beloved, but it is very widely used. Its ubiquity comes from being a general purpose programming language that can do nearly anything (and, increasingly, it is the de facto language for AI APIs).
One of the many things that Python can do is work with data. There are many different approaches to working with data in Python: polars and pandas for data wrangling, seaborn and matplotlib for plotting, to name just a few. It can be confusing to figure out which approach to take.
The biggest complaint I hear about Python is the challenge of installing it and handling environment and package management. Python also uses virtual environments, which means that you can install specific versions of packages in each project (similar to the {renv} package in R). Virtual environments are quite useful when you develop a Python project that you want to put online, because they help ensure that the server uses the same package versions you use locally. And, indeed, the ease of deploying Python projects on servers is one of the reasons why it has become so widely used.
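For readers who haven't seen one, here is a minimal sketch of what creating a virtual environment can look like, using Python's built-in venv module. The function name create_project_env and the .venv folder name are illustrative assumptions, not anything Python requires, and the sketch skips installing pip so that it runs quickly and offline.

```python
import tempfile
import venv
from pathlib import Path


def create_project_env(project_dir: Path) -> Path:
    """Create an isolated virtual environment in project_dir/.venv (hypothetical layout)."""
    env_dir = project_dir / ".venv"
    # with_pip=False keeps this sketch fast and offline; a real project
    # would normally use with_pip=True so that packages can be installed.
    venv.create(env_dir, with_pip=False)
    return env_dir


if __name__ == "__main__":
    # Use a throwaway folder so the example leaves nothing behind.
    with tempfile.TemporaryDirectory() as tmp:
        env_dir = create_project_env(Path(tmp))
        # pyvenv.cfg is the file that marks a folder as a virtual environment.
        print((env_dir / "pyvenv.cfg").exists())
```

Each project gets its own folder like this, and packages installed into it stay separate from those of other projects, which is roughly the role {renv} plays in R.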
As you can see, there are many different approaches to working with Python. For a newcomer, it can feel a bit overwhelming (did I mention I've never learned Python?), but it offers advanced capabilities that are useful to many, especially those with programming experience who are comfortable dealing with complicated setup and maintenance.
What is R good for?
R, in contrast, is way easier to get started with. Installing R itself is fairly straightforward, as is installing packages with the install.packages() function, which brings in packages from a single location known as CRAN.
While there are multiple approaches to working with data in R, the tidyverse approach that I teach is the most common. And for good reason: its syntax is designed to be easy to remember. It's not just in data wrangling that the tidyverse approach is so dominant: the {ggplot2} package is widely considered the best tool for making data visualizations.
As you can tell, I am a huge proponent of the tidyverse. Some might read what I've written and say: well, that's just one way to use R. They're right (though, as package download stats show, it is far and away the most popular approach). But that gets us back to the longer discussion that our initial question ignores.
What we talk about when we talk about R versus Python
When people ask me if they should learn R or Python, they want a simple answer. But the reality is there is no simple answer. The choice of language depends largely on who you are. If you are an experienced programmer and comfortable dealing with the challenges that come from setting up and maintaining environments, packages, etc., Python is a great choice. For many industry data science jobs, Python is becoming the de facto tool.
But there are many places where R continues to thrive. R for the Rest of Us students are a perfect example of this. They come mostly from academia, nonprofits, foundations, and other social sector professions. They tend to fit into one of two groups:
Advanced Excel users looking for a more powerful alternative
SPSS/SAS/Stata users looking for a free tool
What R for the Rest of Us students are not, for the most part, is advanced programmers. They are not looking for data science jobs at Facebook.
If they have to spend hours dealing with installation, they are going to bail. Python may be a great language (or at least the second best), but it is a weaker choice for someone new to programming who wants to quickly dive into their data. R's ease of setup makes it a great tool for experienced data people who may not have programming experience but are looking to dip their toes into the world of code.
So, to return to the question we started with: should you learn R or Python? I don't know. But more important than looking for a single answer is to consider who you are and what you want to do when learning R or Python. Assuming this question has a single answer is naive. There is no "right" language to learn; there is only the right language for you.