R for the Rest of Us Podcast Episode 24: Simon Couch
In this episode, I chat with Simon Couch, a software engineer at Posit, where he develops open-source statistical software and maintains several R packages. Simon’s work spans AI-powered coding tools that enhance R workflows and improve efficiency.
We dive deep into his latest projects:
{pal} (renamed to {chores}) – an AI-powered assistant that streamlines code updates and automates tedious R programming tasks.
{gander} – an innovative tool that integrates AI into RStudio and Positron, allowing users to generate and modify code with natural language commands.
Simon shares how he transitioned from statistical modeling to AI-assisted programming and what he sees as the future of AI in data science. We discuss the practical use cases of LLMs in R, their limitations, and how data scientists can leverage AI to improve their workflows.
Listen to the Audio Version
Watch the Video Version
You can also watch the conversation on YouTube.
Important resources mentioned:
Connect with Simon on:
LinkedIn: Simon P. Couch
Bluesky: @simonpcouch.com
Mastodon: @[email protected]
Learn More
If you want to receive emails to help you on your R Journey, sign up for the R for the Rest of Us newsletter.
If you're ready to learn R, check out our courses.
Transcript
[00:00:25] David Keyes: In this conversation, Simon and I chat about his work at Posit, the state of AI, and the packages he's developing to work with AI in R. It's a great conversation that touches on some high level topics like what's the future of AI in coding and touches on some really specific things like how to get started using AI in your coding today. Let's dive in.
[00:00:46] David Keyes: Well, I'm delighted to be joined today by Simon Couch. Simon is a software engineer at Posit, where he works on open source statistical software. And with an academic background in statistics and sociology, Simon believes that principled tooling has a profound impact on our ability to think rigorously about data.
[00:01:05] He authors and maintains a number of packages, some of which we'll talk about today, and blogs about the process at simonpcouch.com.
[00:01:13] Thanks for joining me today.
[00:01:15] Simon Couch: Yeah, thanks for having me. I'm excited to be here.
[00:01:18] David Keyes: Well, I'm excited because we're going to talk a bunch about the packages that you've been working on that are specific to AI. But before we get into that, I wonder if we could start by just having you tell me a bit about your job at Posit. What is your, how do you divide your time? What does it look like in your kind of daily work there?
[00:01:39] Simon Couch: Yeah, yeah. So I work on open source R packages broadly. Um, properly, like in the org chart, I'm situated in the, the tidymodels team under Max Kuhn. Uh, and that's, uh, for the first couple of years when I was full time at Posit, what I was working on, um, you know, 40 hours a week. Around a year or so ago, um, I started working, um, halftime on ODBC.
[00:02:07] Uh, so like an interface to different database backends. Um, around nine months ago, I started a book. So that's been another sort of chunk of my time. And then, uh, around three or four months ago, I did an internal hackathon, uh, centered on like AI and LLMs. And, um, I guess that's the subject of the podcast today.
[00:02:30] That's probably, you know, the AI work is something like 30 or 40% of my time. Um, on tidy models specifically and, and maintenance and such. There that's maybe 25%. The book is maybe 25%. Um, for a while, ODBC and the database work was around half of my time, but that's closer to maybe 10% or so now.
[00:02:53] David Keyes: Okay. And I'm curious where your, your, your own interest in AI and kind of making packages for the R world that, that interface with large language models, where does that come from?
[00:03:07] Simon Couch: Yeah, so it started off in this hackathon. The way that this worked was that, um, some folks at Posit got together around groups of like 10 or so, uh, of folks around the company, and they gave us an introduction to like interfacing with large language models via APIs rather than like a, a chat interface that you would navigate to on the web.
[00:03:32] Um. And they gave us API keys and they let us loose for a couple of days. And, um, at that time I was heading into spring cleaning, uh, which is, uh, a week that the, the tidyverse team takes twice a year, one for each, uh, hemisphere, and we, we do all sorts of, like, code updating technical debt sort of tasks. Um, one of them was specifically about updating code that raises errors.
[00:04:05] Um, so if you call stop(), or, uh, from rlang, the analog is abort(). Um, we try our best to make errors as informative as possible for users. And part of that has been transitioning the tooling that we use to raise errors to this new package and this new syntax, and that's called cli. And, and so for context, like, within the tidymodels and the tidyverse organizations
[00:04:36] on GitHub and all the packages that we release and work on, there's like thousands, if not tens of thousands, of these custom pieces of code that raise errors to the user. Uh, and our goal was to convert most all of them, uh, to use this new interface. But it's really a sort of fuzzy, uh, like, a little bit irritating task to transition from all these different ad hoc ways to raise these errors we've put together over the years into this standard interface. And if one were to try to implement that, like some sort of package that converts the errors automatically, that'd be a really difficult software problem. And so I kind of just wondered at that time, like, I wonder how good these models are at,
[00:05:31] David Keyes: Hmm.
[00:05:32] Simon Couch: automating this task of converting this erroring code for us.
[00:05:36] And so that's what I ended up working on. Um, and then the week after, the team actually got to use, uh, this package, which was sort of a predecessor to, to pal, um, and probably, you know, called this same key command a thousand times. It, like, takes the code that you have already to, to raise an error, sends it off to a model with a bunch of information about how to convert erroring code to this new interface.
[00:06:03] Um, and ideally it comes back in a couple seconds and you're 90 percent of the way there.
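For readers who haven't seen this kind of conversion, here's a rough sketch of the before-and-after that pal's predecessor was automating. The function and message below are made up for illustration; they aren't taken from a tidymodels package.

```r
# Before: an ad hoc error raised with base R's stop() and paste0()
check_positive <- function(x) {
  if (any(x <= 0)) {
    stop(paste0("`x` must be positive, but ", sum(x <= 0), " value(s) are not."))
  }
  invisible(x)
}

# After: the same check rewritten with cli::cli_abort(), which interpolates
# values with glue-style braces and supports inline markup like {.arg}
check_positive <- function(x) {
  if (any(x <= 0)) {
    cli::cli_abort("{.arg x} must be positive, but {sum(x <= 0)} value{?s} {?is/are} not.")
  }
  invisible(x)
}
```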
[00:06:10] David Keyes: So that's a good example of like using AI to handle something that would be incredibly tedious to do on your own. I'm curious if you see that as an optimal use of AI, and if there are other things that you see as like good uses of AI at this point when working in R.
[00:06:30] Simon Couch: Um, I think that, like, the sort of tedious, code-updating kinds of tasks, um, LLMs are, are particularly promising for, but those are probably also a subset of, like, I often think about it as, like, 45-second tasks that could be turned into 5-second ones. Um, with, with PAL, there's, like, a fixed set of these that, um, I've defined, and then also tried to provide some infrastructure for people to, um, add in their own use cases for. Um, and then with Gander, it's sort of more one-off, like whenever you run into one and have this spidey sense that maybe it's something that could be done for you that you're not particularly interested in doing.
[00:07:22] Um, those are the sorts of things where I'm really leaning on models right now to automate those, those smaller chunks of work that I'm not really interested in carrying out.
[00:07:34] David Keyes: Yeah. Are there other ways that you think are particularly promising for AI, or is that kind of where you've been maintaining your focus at this point?
[00:07:47] Simon Couch: That's where I've been maintaining my focus for now. Um, I'm, I'm hesitant to, like, make any predictions that are too bold about what's coming next, just because, uh, the LLM space has been moving so fast and I've, I've really only been engaging earnestly with it for like a few months now. Um, but yeah, those sorts of smaller tasks are mostly where I'm focused right now.
[00:08:14] David Keyes: So, well, maybe this is getting ahead of where you are. Like you said, you've, you've been doing this for a few months, but I'm, I'm curious for your thoughts. 'Cause you hear a lot of talk about, oh, AI is going to make, you know, writing code completely unnecessary. Everyone's just gonna, you know, chat with a, a model and it'll, you know, spit out everything you need.
[00:08:33] Having worked on these packages, PAL and Gander, um, and others, what are your thoughts on whether AI can go to that level?
[00:08:46] Simon Couch: Mm hmm. Um, there are certainly people who earnestly believe that that's the case, I think, um, and I might even go so far as to say many of those people know more about these models than I do. Um, and at the same time, like, in my personal experience actually working with these models, and my, like, sort of cursory understanding of how they're working and where they're heading, which again, I'm very hesitant to, to make any predictions about where this might be headed ultimately.
[00:09:21] Um, I don't see a lot of evidence for this. Like, in my own personal experience, I get a lot of help from these models writing code and updating code in, like, repeatable ways where I can, uh, really explicitly specify how I'd like the code to be updated. Um, in the big picture, the proportion of my work, like developing R packages, or the work of a data scientist, the process of writing code is probably a relatively small portion of your work compared to, like, debugging and maintaining code.
[00:10:09] And both in terms of, like, formal evals about how good these models are at doing that. And also my personal experience. Um, I don't see that these models are particularly capable at doing those tasks, which again, right now are like a huge part of my work. Um, and so at least in the short term, I don't, I don't fully believe the argument that, uh, AI will make learning to write code unnecessary.
[00:10:41] David Keyes: Yeah, I mean, I'd say I could broadly agree with that. I think, at least right now, AI is most useful when you can give it context and you, and you kind of know the right questions to ask. I mean, I know from teaching people, a lot of times, we'll see people use AI and the answers they get back end up confusing them more than anything.
[00:11:06] For example, we teach with the tidyverse. They'll ask a question to AI and it'll give them a base R answer, which, I'm not saying there's anything wrong with that, but if you're learning in the tidyverse style and you get a base R answer, you're going to be incredibly confused. Whereas for me, or someone who's been doing it for a while, you know, you can know how to ask questions in a way that you're likely to get answers that will be meaningful to you.
[00:11:34] But you can't kind of just spew something at ChatGPT or whatever model and expect it to give you back exactly what you need.
[00:11:43] Simon Couch: Right. There's absolutely that, that skill of, like, learning how to ask the question. And I think, unfortunately, because I, I haven't really tried to, like, put this in words before, but there's also a little bit of a spidey sense for, like, the kinds of problems where the model can turn the 45-second task into the five-second one, or when it's going to give you an answer that's worth less than five minutes of your time, because you're trying to
[00:12:13] debug it, or it sent you down a rabbit hole that's probably not a productive one in the, in the long run.
[00:12:19] David Keyes: Yeah, that makes sense. Well, let's talk about the packages that you've been working on. Um, so we're going to talk about two. There's one called pal, and then there's one called gander. Um, can you talk about what each of them do?
[00:12:38] Simon Couch: Right, so pal is, um, is sort of a generalization of this initial idea I had about updating this erroring code, and what the package does is it supplies an extensible library of model assistants, uh, for working on R code. So, uh, the package comes with a preset number of assistants where, um, you can press a key command and highlight some code.
[00:13:08] And in the back end, what the package is doing is supplying all sorts of context, um, to sort of teach the model real quick how to do the task that you're asking it to do. Um, and then as long as all goes well, uh, code begins streaming directly into the document that you're working in. Um, under the hood, these are using Elmer, which is a new package coming from, uh, Hadley and Joe, um, that provides an interface to, to any, um, or many of the major, uh, large language model providers out there.
[00:13:47] And because, uh, PAL uses Elmer, uh, under the hood, you can use any model that you would use with Elmer, uh, with PAL. Gander is a newer package that I've been poking at recently, and I think the best way to phrase it is sort of like a Copilot alternative that knows about your R environment. So I think, um, at this point, many folks have experimented with Copilot autocomplete, where if you're working in a document, uh, the model will receive all the code context from within that document, so it can see the lines of text that are surrounding where your cursor is at.
[00:14:31] But the thing that it can't see, and the thing that, um, I ultimately wished it could see, was my R environment. Like, if I have some data frame inside of R, rather than in the, uh, source file that I'm working on, my R file, uh, the thing that I really want the model to know about is the columns inside of the data frame and how many rows it has and things like that.
[00:14:53] And so Gander is a, is a Copilot alternative that knows about your R environment and how to describe it well to models.
[00:15:01] David Keyes: Well, that actually seems useful. I mean, having just put together some lessons on using AI with R, I assume that works both in RStudio as well as in the newer editor Positron. So yeah, it seems like that's another benefit, that it can be used across, across editors.
[00:15:23] Simon Couch: Right. So both, both Pal and Gander in the back end are, are implemented using the same two tools. So one of them I talked about was Elmer, and the other one is the RStudio API. And when I say, like, it uses the RStudio API in the back end, it sounds like that probably wouldn't work in Positron. Uh, but one of the nice things that the folks working on Positron have done is created what they call shims,
[00:15:48] Which allow all the same commands that you could use to interface with RStudio to interface with Positron.
[00:15:57] And so both PAL and Gander work in both RStudio and Positron, which is a really nice bonus. Um, at the same time, using those tools specifically makes for some limitations in terms of the UI as well.
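For context on what "using the RStudio API" means in practice, the building blocks are functions from the rstudioapi package along the lines of the sketch below. This is a simplified illustration of the kind of call involved, not code from either package, and Positron's shims are what let the same calls work outside RStudio.

```r
# Grab the active document, find the user's selection, and overwrite it
# in place. Tools like pal and gander build on calls in this family.
ctx <- rstudioapi::getActiveDocumentContext()
sel <- rstudioapi::primary_selection(ctx)
rstudioapi::modifyRange(sel$range, "# text streamed back from the model", id = ctx$id)
```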
[00:16:14] David Keyes: How so? What do you mean by that?
[00:16:16] Simon Couch: So, Pal and Gander both work by writing directly to the source files that you're working in. And this kind of feels like magic when they get it right, in that, like, you ask the model something and then it's there, and there's no sort of, like, clicking or copying and pasting or moving around that you need to do.
[00:16:38] But the bummer is that if you accidentally get a model rambling or, uh, if the, the model does something that you don't want it to do, you're backspacing or undoing, uh, to, to get back to where you started.
[00:16:52] David Keyes: I see.
[00:16:53] Simon Couch: The other way, rather than the way that I've been doing it, where everything comes from R, is to do it from the angle of the IDE.
[00:17:02] And so, like, in VS Code, a lot of these extensions have some sort of interface that's like, here's a diff. Like, these are the lines that I would delete. And here are the lines that I would add if you gave me a chance to, is that okay? And then you say, okay, sounds good. Uh, which prevents that problem of like a bunch of stuff being dumped into your documents.
[00:17:25] Mm
[00:17:25] David Keyes: Got it. That makes sense. Well, let's have you, um, give a little demo. Um, so, we talked about having you walk through the gander, um, package. Do you want to put your screen up and show us what that looks like?
[00:17:38] Simon Couch: Yeah, let's do that.
[00:17:40] All right. You're seeing that?
[00:17:42] David Keyes: Yep.
[00:17:43] Simon Couch: Okay. So this is a, uh, quite a brief document, probably, uh, shorter than we're maybe used to seeing heading into a demo. You might've just seen this now. Um, right now I have Copilot enabled and I'll, I'll use Copilot as an example for the, the kind of problem that I was trying to address when I worked on Gander.
[00:18:06] So, um, I'll load ggplot2, and I'll load this data in called stackoverflow. We use this, uh, when demonstrating some packages in the, the tidymodels. Um, and each row is a, a survey response from a developer, uh, to the Stack Overflow annual developer survey. So this first entrant is from the UK and makes a hundred thousand, I'm guessing, dollars, uh, has coded for 20 years and so on and so forth.
[00:18:37] So if I'm working with, uh, with Copilot inside of my editor, what I probably would do if I wanted to plot this salary variable and years-coded-job, I could say plot salary versus years coded. This is like the plain-language instruction of what I want to do. So the nice thing about this, it can see that I've loaded ggplot2, because that's a line of text inside of my source file, and it can see that there's a data frame called stackoverflow that's available to me.
[00:19:14] So it's going to suggest this code. And if I go ahead and autocomplete this and run it, it's not going to work, because it references this variable inside of the stackoverflow data frame that doesn't actually exist. This is the kind of problem that I ran into time and time again trying to give Copilot a go: uh, it can generate context based on the lines in my file, but it can't generate context based on my R session.
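To make that failure mode concrete, a completion based only on the file's text has to guess at column names. Something like the following is typical; the guessed column name here is invented for illustration, not taken from the recording.

```r
# A file-only suggestion might guess a column called `years_coded`,
# but if the data frame's actual column is, say, `years_coded_job`,
# running the plot errors because `years_coded` isn't found.
ggplot(stackoverflow, aes(x = years_coded, y = salary)) +
  geom_point()
```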
[00:19:46] So I'm gonna go ahead and turn off Copilot now. Oh, you know what? This is probably set on a per-project basis. And we can take a look at what Gander would do. So I'm gonna work with the same data and I'm gonna provide it exactly the same prompt. The way that Gander works is you attach some keyboard shortcut and then it pulls up, like, a tiny little Shiny app inside of your window.
[00:20:16] So this is what the Gander add-in looks like. I've selected this stackoverflow data and I can say plot salary versus years coded. So again, the piece of context that it's going to need isn't anything that's available in the lines of the source file that I'm working on. I need it to know that I'm
[00:20:39] working with this data frame that has these columns inside of it.
[00:20:46] So that's what a completion with Gander looks like. Um, and we mentioned this earlier in the podcast, so instead of providing a suggestion, and then you, like, tab or something to say, okay, I accept that, go ahead and write that into my files, it just writes it right in, and there's some pros and cons to that.
[00:21:03] But I can try to run this, and that code works. A really nice part about interfacing with Gander, uh, or another nice part, is that once it's done writing out text to your file, the whole piece of code that it wrote out is then selected. So, I like this plot, but we can see that the responses in years coded are in whole-year units, so there's these, like, lines of points, and I'm not really into how that looks.
[00:21:40] So I can trigger the add-in again without having to touch anything else in my editor and just say, jitter the points. Plain English is fine. It'll overwrite that selection, and this time it uses geom_jitter() instead of geom_point(). So I can run that and it's, it's jittered the points for me. In the sort of introductory GIF or video that's on the website for this package,
[00:22:11] I'm doing this like super fast and it's probably a little bit overwhelming to see, but then I could say, like, facet by country. That's another thing that, like, as a ggplot user, I probably know how to do. But then I could also say, like, I'm seeing that these salaries, I actually know that they happen to be American dollars.
[00:22:36] And so I can say, format salaries as dollars.
[00:22:42] I wouldn't know how to do that off the top of my head, especially if I hadn't given this demo a few times, right? Like, I don't, I don't know about this function in scales that labels units according to dollars. Um, another thing that I see in this plot that I think probably ought to be addressed is that the 20-year points on the, the x axis, that's probably 20 plus.
[00:23:07] And if I went back into the source data, I could confirm that that's actually the case, but I don't know how to do that, again, like, just off the top of my head. I would need to ask a model or, or, uh, go on Stack Overflow, but I can say, make the x axis max read 20 plus.
[00:23:29] I'll just run it and it worked. So this is helpful for things that I know how to do already, um, but it's also saving me the time of, like, going to learn on Stack Overflow or asking a model separately how to do these sorts of things. The last thing I'll show in this demo, and then maybe we can poke at some questions, is, like, what's actually happening under the hood.
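Pulling the whole demo together, the code Gander ends up with looks roughly like this. It's a reconstruction of the steps described above rather than the exact completion from the recording, and it assumes the stackoverflow data from the modeldata package with the column names the demo describes.

```r
library(ggplot2)
library(modeldata)  # provides the stackoverflow survey data
data(stackoverflow)

# Salary versus years coded: jittered points, faceted by country,
# dollar-formatted salaries, and the last x-axis break labeled "20+"
ggplot(stackoverflow, aes(x = years_coded_job, y = salary)) +
  geom_jitter(alpha = 0.4) +
  facet_wrap(~ country) +
  scale_y_continuous(labels = scales::label_dollar()) +
  scale_x_continuous(
    breaks = c(0, 5, 10, 15, 20),
    labels = c("0", "5", "10", "15", "20+")
  )
```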
[00:23:54] And I think this is really interesting, so I'll minimize these down. There's a function in Gander called gander_peek(), and it allows you to peek under the hood and see what's actually happening. So, under the hood, ganders are just Elmer chats.
[00:24:12] And, um, if you've used Elmer before, you'll be familiar with this output, where it tells you the way that the model has been prompted using, uh, the system prompt. And then this is, like, analogous to everything that you would type into, like, ChatGPT or Claude if you pulled it up on the web. So, it's saying, this is what my R file looks like so far.
[00:24:36] I typed, make the x axis max read 20, and there's some nice automatic formatting of that to, uh, drop in the thing that I've selected. And then, most importantly, the sort of differentiator here is that there's some logic that looks for things in the intersection of your selection, your prompt, and your R environment.
[00:25:00] David Keyes: Yeah
[00:25:02] Simon Couch: And so because in my selection there is this thing called stackoverflow, and that thing is also in my environment, it will print out some information about what that data frame looks like.
[00:25:15] And so that's not only how the assistant knows how to find the column names in my data, but like when I say, make the x axis max read 20
[00:25:28] David Keyes: uh huh.
[00:25:29] Simon Couch: It has the first few entries. It knows the kind of column that is, and it knows how to format it in a way based on its range and its type.
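If you want to see that assembled prompt yourself, the peek function Simon mentions can be called after the add-in has been triggered at least once. The function name and output shape here are paraphrased from the demo; check the package documentation for the exact interface.

```r
# Inspect the elmer chat underlying the most recent gander request:
# the system prompt, the file context, your selection, and a compact
# description of any data frames found in your R environment.
gander::gander_peek()
```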
[00:25:39] David Keyes: That's interesting.
[00:25:40] So am I correct in understanding that, like, really what Gander is doing is it's just creating a more elaborate set of instructions that it sends to a model to be able to get better responses back? And I mean, when I say elaborate, I don't mean just, like, more language, but specifically this piece of being able to send, like, an object or the structure of an object alongside the specific instruction that you ask.
[00:26:08] Is that, is that really like fundamentally what it's doing?
[00:26:12] Simon Couch: Yep. Yep. That is fundamentally what it's doing. It's assembling a prompt that, uh, is optimized to give the best possible answers with as little, like, input and cognitive load from the user as possible.
[00:26:28] David Keyes: Right. Because usually when you're asking a question or, you know, asking a model to do something, you would want it to know, you know, the structure of your data, but it's tedious if you have to do that, if you have to say, and here's the structure, you know, here, here's what my data looks like.
[00:26:47] Right. So this just actually kind of handles that for you.
[00:26:51] Simon Couch: Right. Yeah. So like when I was first, uh, poking at language models when they came out a couple of years ago, like this process of pasting all this information in manually was so tedious that I eventually felt like this is slowing me down rather than helping me iterate faster.
[00:27:10] And then with Copilot, like, still I'm missing that piece of information in the data. So I'm going back in, correcting little pieces of code to account for what my data actually looks like. And this is just, like, one further iteration, um, to where the friction is low enough that it's actually productive to make use of these models instead of slowing you down.
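As a point of comparison, the manual workflow Simon describes, pasting the data's structure into a chat yourself, looks roughly like this with a bare Elmer chat. It's a sketch using the elmer interface available at the time of recording (chat_claude() and the chat object's $chat() method); gander automates exactly this kind of prompt assembly.

```r
library(elmer)

chat <- chat_claude()

# Manually bundle the environment context that gander gathers for you:
# the structure of the data frame plus the plain-language request.
data_context <- paste(capture.output(str(stackoverflow)), collapse = "\n")
chat$chat(paste0(
  "Here is the structure of a data frame in my R session:\n",
  data_context,
  "\n\nWrite ggplot2 code to plot salary versus years coded."
))
```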
[00:27:33] David Keyes: That makes sense. Um, in a piece that you sent to me, you said data science differs a bit from software engineering in that the state of your R environment is just as important, or more so, than the contents of your files. So that's what you're getting at here, right? Like, unlike in other types of software engineering, for example, where the state of your R environment, or whatever the, the analogous thing to your R environment would be, is less relevant.
[00:28:03] Whereas here, because, you know, what we care about is, like, the data, you need to be able to pass that to a model in order to get good responses. Am I, am I summarizing what you're, what you're saying accurately?
[00:28:15] Simon Couch: Yeah, I think that's, that's very much so what I was trying to get at in that quote is that, uh, the existing tools are, are kind of assuming that all the information you need is contained in, in like lines of code. Uh, in the context of data science, there's, there's this other piece that's really important for a model to know about in order to, uh, complete properly.
[00:28:38] David Keyes: That's cool. And is there any other tool that does this same kind of thing where it. You know, passes the state of your environment, your data objects alongside prompts that you know of?
[00:28:49] Simon Couch: I don't know of any. I would be surprised to hear if there were no such tool. Um, like, I, I don't know that I was inspired by anything specifically in, in writing this, but, um, I would imagine that somebody probably did this before me, maybe even in R.
[00:29:08] Okay.
[00:29:08] David Keyes: Yeah. So what, what model is it using? Like, you haven't done anything here to say, like, use ChatGPT or use Claude or anything. So, so how do you, or can you choose a model? And if so, how?
[00:29:25] Simon Couch: Yeah, so I'm hesitant to show any specific code because the interface to make this happen is actually going to change soon.
[00:29:33] David Keyes: that's fine.
[00:29:37] Simon Couch: You can use any model that is supported by the Elmer package. And the way that that, uh, setup works is you just tell Gander, like, by the way, this is what an Elmer chat that I want to use looks like.
[00:29:53] So, uh, when I'm developing this package and in my day-to-day use, uh, I tend to lean on Claude. Um, but in using Gander, you can use, um, Claude, you can use, like, ChatGPT from OpenAI. You can use locally hosted models, which, um, are effectively free, you know, like you're not paying, uh, for a service and you're not sending your data off somewhere else.
[00:30:24] Um, and at the same time, the kinds of models that our laptops can host on their own, uh, tend to not be as powerful as, as something like Claude or, or ChatGPT.
[00:30:36] David Keyes: Is it then, like, I know within Elmer, um, you can correct me if I'm wrong, but I think you, like, set an environment variable with your, like, ChatGPT or Claude API key or something like that. And then does, does Gander just recognize that? Or, I know you said it's changing, so no need to show the code, but just at a high level, like, how does it know which model to use?
[00:31:01] Simon Couch: Right. So, um, there's, there's sort of a config, uh, associated with Gander, and it's just an option. Um, and that option says, call the, you know, chat_claude() function from Elmer.
[00:31:17] David Keyes: I see.
[00:31:18] Simon Couch: Gander doesn't know anything about, like, that API key that you might have set up in order to use Elmer. Elmer is still the package that's, that's taking care of all that authentication and stuff. But Gander knows how to talk to Elmer chats, and, and that's what's happening under the hood.
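At the time of recording that configuration interface was still in flux, but the general shape is a single R option that points gander at an Elmer chat. The option name below is illustrative and may not match the released interface, so check the package documentation for the current form.

```r
# Point gander at whichever elmer chat you want it to use.
# The option name here is illustrative; see the gander documentation
# for the exact setting, since the interface was changing at the time.
options(.gander_chat = elmer::chat_claude())

# Any elmer-supported backend works the same way, e.g. a locally hosted model:
# options(.gander_chat = elmer::chat_ollama(model = "llama3.1"))
```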
[00:31:37] David Keyes: So, Gander is the package that's formatting the prompt, essentially. Then Gander sends it to Elmer, which takes care of all the authentication, all the stuff that you need to do to interface with Claude or ChatGPT or whatever model you're using. The response comes back to Elmer, Elmer passes that back to Gander, and then Gander pastes it into your editor.
[00:32:02] Is that, is that what it looks like?
[00:32:04] Simon Couch: Spot on. Yep.
[00:32:06] David Keyes: Cool. Um, why Claude? I've heard several people say that recently, that they really like Claude best for editing. I haven't used different models enough to, to have seen a difference. So I'm curious. It sounds like you have, why, why do you think Claude is the best?
[00:32:22] Simon Couch: Um, I would, I would hesitate to, to speak too definitively on this, because again, I'm, I'm relatively new to this and this is mostly just, like, a vibes-based evaluation. Um, and I started out, like, the API key that they handed me when we started, uh, experimenting in this hackathon happened to be a Claude API key. Claude tops, um, or, you know, is a peer with many of the, um, most performant LLMs on many benchmarks, which are just, like, various ways of evaluating how good a given model is. Um, and then there's also the component of, like, the sort of taste of the model, or something that's, that's kind of harder to capture in a benchmark or an evaluation, but I've just tended to, to see that, um, the, the model takes instruction, uh, better than others.
[00:33:20] I've seen where, if you tell it, this is what I want the response to look and feel like, it tends to give responses that look and feel like that, um, more commonly. Yeah,
[00:33:32] David Keyes: It's more compliant, you're saying, not, not as sassy as, as ChatGPT.
[00:33:38] Simon Couch: At the same time, so like right now in the development version of these packages, it's just been the default model under the hood. So like, if you happen to have an API key for, for Claude set up already, there's like no config that you need to do ultimately. Um, that said, I think that will probably change before I send these off to, to CRAN.
[00:34:02] And that's because, like, long term, I kind of hope that I can host a model on the laptop that I'm working on right now that is good enough at listening to what I ask it to do and knows enough, based on its compression of the internet, to, like, complete just as well as, as Claude would. And so I don't want to, like, bake in, uh, a default in these packages that says folks should be relying on these, um, major corporations.
[00:34:33] David Keyes: Well, I'll ask since it's been in the news, it's January 30th as we're recording this. Have you used DeepSeek?
[00:34:40] Simon Couch: I've used DeepSeek a couple of times, just poking around. Yeah.
[00:34:43] David Keyes: But not, not with Gander, is it? I assume it's probably not set up to use, to be able to use that yet.
[00:34:48] Simon Couch: So actually it is possible to, um, to use DeepSeek in the same way that you would use any other model with Gander. Um, in Elmer, and this is just in the development version of the package as we're recording this, so you would need to install Elmer from GitHub rather than from CRAN, um, but the package supports DeepSeek, um, at the moment.
[00:35:11] And so you would just configure that DeepSeek chat with, with Gander, and Elmer would take care of everything. That said, um, there's been so much hype around DeepSeek, especially in the last week, that, um, the APIs have been down probably more often than they're not. Um, and so I haven't, I haven't lucked into, like, logging in at the right
[00:35:33] David Keyes: I see.
[00:35:34] Simon Couch: time, uh, to give that a try.
[00:35:36] David Keyes: But isn't it, I mean, I know it's open source and it's relatively small. I guess, I mean, it's new to me too, so I haven't really looked into this, but, um, you know, like, um, Llama or models like that. Uh, I guess I don't know if it's small enough that you can run it on your own laptop, but I wonder if in the future it might be.
[00:35:57] That's true.
[00:36:03] Simon Couch: So DeepSeek put out a model, like, a month or so ago, which made a little bit of a splash, but not nearly as much as the model that they released a week ago. That model is something like 500 billion parameters or something, and you can kind of approximately, um, think about the number of parameters as the number of gigabytes of RAM you would need to run the thing locally.
[00:36:26] So, you know, unless you have a pretty amazing computer, you're probably not running the 500-billion-parameter model. So yeah, the reasoning model that they put out a week ago, um, you know, that's probably analogous to something like o1, which quote-unquote thinks for a bit before it does anything.
[00:36:51] I've found those models to be kind of painful to use with PAL and with Gander and Ensure, because with that process of thinking, there's no user interface happening at that point, and when you use something like Claude or a local model, you immediately start getting text streamed into your document.
[00:37:11] So you know that something is happening. Whereas, uh, with a thinking model, you press the command, you type in your input, you start waiting, and you're like, did anything happen?
[00:37:21] David Keyes: Yeah.
[00:37:23] Simon Couch: which, um, I mean, there are surely like OpenAI OpenAI and DeepSeek have found ways to put together interfaces such that people know that something is happening.
[00:37:36] And I just haven't put the time in to figure out what that might look like with these packages.
[00:37:40] David Keyes: Okay, I was listening to the Hard Fork podcast, from the New York Times. I don't know if that's a podcast you listen to. Yeah. I don't know if you listened to the, like, emergency episode they did a few days ago about DeepSeek, and I, I may be misinterpreting what they said, but I thought they said that DeepSeek, their, the, whatever the reasoning model is called, would give you, like, basically, almost as if it was, like, articulating its thought process.
[00:38:07] I know I'm anthropomorphizing.
[00:38:09] Simon Couch: Sure.
[00:38:10] David Keyes: a large language model, but, um, and they were saying, you know, that's something that the O1 model, um, from OpenAI doesn't do, and they thought, oh, well, this is the kind of thing that might be incorporated, because it's actually really helpful to be able to have a little bit more visibility in terms of what's going on rather than just have, you know, like you said, sitting there and like waiting for a couple of minutes until you may or may not get a response back.
[00:38:34] Simon Couch: Yeah. The, um, the CEO, I think that's the title of that person, from Anthropic put out an essay yesterday, where one of the things he said was like, oh, it seems like it's a really nice user interface thing to print out all that thinking, which is something that, like, o1 from OpenAI hasn't done up to this point,
[00:38:56] David Keyes: Right.
[00:38:57] Simon Couch: in terms of like, what that looks like.
[00:39:00] In, uh, in pal and Gander and stuff, again, these things are writing directly to your files. So, when they're thinking out loud, it's actually just inside of a little XML tag, think, slash think. And, uh, different applications have ways of handling that, just waiting until it's done thinking. And then once that XML tag completes, the model's like, okay, I'm ready to, like, respond, respond.
[00:39:30] But the way this looks in Gander and PAL right now is it just, like, dumps stuff into your R source file.
[00:39:36] David Keyes: Hmm. Okay. Huh. Interesting. Yeah. That would probably not be the best user experience. Um, but that actually gets me to a question I was wondering about, um, before when you were showing how Gander works, you talked about, you know, there being pros and cons of it, just pasting the response directly in. I'm curious if you can talk about those and why you chose to have it, at least it seems like by default, um, replace the text versus like putting it below.
[00:40:04] Simon Couch: Yeah. Well, so I think the nice thing about it placing directly, um, is that it allows for really quick iteration, in that, like, if it drops some, uh, code in, it's selected already, and if you're just triggering the add-in again, then it's going to replace the thing that it put in, and there's no, like, deleting or copying and pasting. Um, and then on the drawback side, it's, it's really unpleasant when, uh, a model begins writing things out to the source file and you immediately know that it's probably not what you want to be there and you kind of just have to wait for it to finish. Um, I would say that, uh, that's mostly just an artifact of, of like the limitations of the way this tool is built, like this happens via the RStudio API. And, uh, the tool that I have access to, to like interact with your RStudio session is like modifying your files. Um, and not necessarily like giving you a nice, uh, diff so you can like think for a second before you let the thing, uh, drop text in or whatever.
[00:41:27] So I kind of feel like, as promising as these tools are, and even with how much time they save me day to day, um, I probably see this longer term looking like, uh, a UI that allows for that sort of, like, you can stop the thing at any point, you can, uh, choose to reject the changes that it wants to make.
[00:41:49] Um, and hopefully, like, authenticate easier too. There's this, like, config process that you have to work through when you first install the package, which, in general with tidyverse, we try to avoid. Like, you should be able to install the thing and get going. Um, and hopefully this sort of functionality can live in an interface like that at some point.
[00:42:11] David Keyes: Got it. So what are your plans moving forward for PAL, Gander, and any other AI packages you may be working on?
[00:42:20] Simon Couch: Yeah. So at the, at the time that we're recording this, uh, the next few weeks for me is going to look like coming back and revisiting each of these packages and kind of buttoning up the loose ends, or whatever the phrase is. Um, each of these three were me kind of learning in public, uh, experimenting with what these things might feel like. Um, and so before I, I send these things off to CRAN, I want to squash a good few bugs and, um, at least in the case of Gander, get these things working a little bit better outside of R files.
[00:42:57] So right now, if you're working in, for example, a Quarto, like, markdown document, um, that problem of, like, writing stuff directly into the files is kind of difficult. Um, and I've found that most of the time I, I prefer not to use Gander to write code when I'm inside of a Quarto file. Beyond that, I think trying to, to figure out how to solve this interface problem of, of, like, writing directly to the, the source file is, uh, is what's next.
[00:43:28] And that will probably involve coming at this from the angle of like an IDE extension, rather than, uh, from an R package that, uh, does everything that it needs to do using only R code.
[00:43:43] David Keyes: So would that be, when you say an extension, I know Positron obviously has that, that model of extensions. Would that just live in Positron? Or, I don't know that RStudio has that ability, or, I mean, you tell me if I'm wrong.
[00:43:57] Simon Couch: Yeah, no, I'm not sure what that would look like yet.
[00:44:01] David Keyes: Okay. Yeah, because I know for the AI course that I've been putting together, I looked, I mean, I showed the extension, the Codium extension, which now that we've talked, I'm like, oh, that. is deficient because it doesn't allow, it doesn't have the, you know, access to the objects that I've created. Um, so I'm excited about your package, but also because I need to go back now and, uh, and record a new lesson.
[00:44:28] Simon Couch: Hopefully we'll be keeping you busy, unfortunately.
[00:44:32] David Keyes: I mean, unfortunately, that is, fortunately or unfortunately, that's the nature of it. And, and, you know, I presented the course as, like, this is what exists right now, and I'm going to continue to, to update it. Um, so I look forward to adding, I definitely need to add a lesson on gander, having seen you, um, walk through it.
[00:44:49] Simon Couch: Sure. Yeah, I'm happy to hear that.
[00:44:52] David Keyes: Yeah, cool. Well, if people want to learn more about the packages, about you, what are the best places to connect and do that?
[00:45:02] Simon Couch: Yeah, um, at the time that we were recording this, yesterday, so that's January 29th, a post went up on the Tidyverse blog about PAL and Gander and Ensure, um, and sort of where the ideas for those packages came from and how they're related and how to get started with them. And so that's a good place to start if you're interested in learning about those packages specifically. Um, in general, I write quite a bit about the things that I'm working on on my personal blog, which is simonpcouch.com. Um, and there's an RSS feed there if you're interested in following. And the same goes for the Tidyverse blog. There's maybe 10 or 15 people that contribute to that, but every once in a while I'm dropping in there.
[00:45:49] Um, and otherwise I'm relatively active on social media. Drop by my website; there's links to all the different social media that I'm using relatively frequently.
[00:46:01] David Keyes: Okay, great. Well, we'll make sure we add a link to that tidyverse blog post as well as to your website so people can, um, find you there. Um, great. Well, Simon, thank you so much for chatting with me about all this, uh, the package development that you're doing. It's, it's really fascinating to see.
[00:46:16] Simon Couch: Yeah. Thanks for having me. This has been fun.