Transcript
This transcript was generated automatically and may contain errors.
So there's been a lot of conversation lately about what makes an AI assistant actually useful for data science work. I asked a few of the engineers who built Posit Assistant to explain, in their own words, how they approached that, and here's what they said.
Simon Couch: thinking
Simon, what's something that Posit Assistant does really well, and how did you build that?
I wanted to talk about something that is a little bit under the hood, but I think is actually really important to making Posit Assistant feel as good as it does. And that thing is thinking, otherwise known as reasoning, which lets the models think out loud to themselves before they take actions or before they respond to you.
So as an example, I have this Posit Assistant conversation pulled up, and I've asked the agent to make a couple of plots of this forested data set. And it's gone on and written some ggplot code. Because Claude Sonnet supports thinking, there's this little light bulb icon, and when I click it I can change to a number of thinking settings, but I'll just leave it on high for the purposes of this.
So let's suppose instead of the default ggplot blue and red, I wanted to change to a nice subtle green and yellow. If I go ahead and submit that follow-up, we'll see in the UI that the model is thinking out loud to itself, and once it's done thinking, that little thinking block goes away and the model starts taking action.
So one of the things that we saw coming when we implemented thinking was that the models might take more reasonable actions or write better code, but one of the things that surprised us is that the models also seem to communicate with you more clearly and be more respectful of your attention when they have that chance to think out loud to themselves first. So we see in this example that in its reply to us, it's just given a one-sentence summary of its actions because that's really all that's needed in order to conclude that follow-up.
George Stagg: terminal user interface
All right, George, what's some functionality in Posit Assistant that you're excited about?
One of the things I've been working on that I'm quite excited about is the terminal user interface for Posit Assistant.
What we've learned from tools like Claude Code is that more and more people are beginning to want to work in an agentic way using tools like terminals, and although terminals have roots in sort of text-based technology, modern terminals can actually do some amazing things. Most modern terminals these days can actually show images, and that really enables us to have that entire workflow of importing data, visualizing results, and analyzing with Posit Assistant inside a terminal very easily.
Sara Altman: predictive modeling
Sara, what's something LLMs struggle with when it comes to data analysis, and how did you build around that for Posit Assistant?
When working on Posit Assistant, we noticed that it would often make methodological mistakes when building predictive models, like training on the test set or switching metrics mid-process. We did two primary things to address this problem.
The first is that we fixed the prompting to generally encourage best practices and discourage bad ones, and the second is that we created a predictive modeling skill. Skills are additional pieces of prompting and code that are invoked only when needed. In Posit Assistant, you can invoke the predictive modeling skill either by referring to it by name or by mentioning that you're working on a predictive model.
The skill is an attempt to get the entire process of modeling right. So in R, it defaults to tidymodels functions, which have a consistent interface and sensible defaults, making it more likely that the model will make good decisions. It also instructs Posit Assistant to follow sound modeling practices: splitting your data and holding out the test set, resampling, starting with a baseline model, iterating one change at a time, and comparing models on the same metric set.
After each model, it shows you a running comparison table so that you can track progress, and it also asks for your permission before ever touching the test set. This is because once you use the test set to make decisions, you've lost your unbiased estimate of how the model will perform on unseen data. With this skill, it's more likely that you and Posit Assistant together will make good predictive modeling choices and avoid mistakes that will interfere with or bias your results.
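As a rough illustration of the practices described above, here is a minimal Python sketch that uses only the standard library. It is not the skill's actual implementation, which in R relies on tidymodels. The synthetic data, the function names, and the `final_test_rmse` permission gate are all assumptions made for this example. It holds out a test set, compares a baseline and one candidate on the same cross-validated metric, and refuses to score the test set until approval is given.

```python
import math
import random


def rmse(actual, predicted):
    """Root mean squared error, the single metric used for every model."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def fit_mean(xs, ys):
    """Baseline model: always predict the mean of the training outcome."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y


def fit_line(xs, ys):
    """Candidate model: one-variable least-squares line."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def cross_validate(fit, rows, folds=5):
    """Mean RMSE across k folds, computed on the training rows only."""
    scores = []
    for k in range(folds):
        assess = rows[k::folds]
        analysis = [r for i, r in enumerate(rows) if i % folds != k]
        model = fit([x for x, _ in analysis], [y for _, y in analysis])
        scores.append(rmse([y for _, y in assess], [model(x) for x, _ in assess]))
    return sum(scores) / len(scores)


# Synthetic data for illustration only (assumption: y = 2x + 1 plus noise).
rng = random.Random(42)
data = []
for _ in range(200):
    x = rng.uniform(0, 10)
    data.append((x, 2.0 * x + 1.0 + rng.gauss(0, 1)))
rng.shuffle(data)

# Split once, up front; the test rows are not used for any decision below.
train, test = data[:150], data[150:]

# Running comparison table: every model is scored the same way.
candidates = {"baseline_mean": fit_mean, "linear": fit_line}
comparison = {name: cross_validate(fit, train) for name, fit in candidates.items()}
for name, score in comparison.items():
    print(f"{name:15s} cv_rmse={score:.3f}")

best_name = min(comparison, key=comparison.get)


def final_test_rmse(approved):
    """Score the chosen model on the test set, but only with explicit approval."""
    if not approved:
        raise PermissionError("test set is held out until the user approves")
    model = candidates[best_name]([x for x, _ in train], [y for _, y in train])
    return rmse([y for _, y in test], [model(x) for x, _ in test])
```

Calling `final_test_rmse(True)` once, after all modeling choices are final, keeps the test-set estimate unbiased in the sense described above.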
Winston Chang: image data extraction
Winston, what's something that Posit Assistant does really well, and how did you build that?
I want to show you something cool you can do with Posit Assistant in RStudio, this combination of AI and a data science environment.
So here's a plot that I came across on the internet, which I thought was interesting. It shows employment data in Taiwan from the 1960s to the 2000s in various sectors of the economy. But I wanted to visualize this in a different way. So first thing I'll do here is I'll copy this image, and then I'll paste it into Posit Assistant, and I'll ask Posit Assistant to extract the data and convert it to a CSV.
So it wrote the CSV, and now it's plotting the data that it extracted. And let's just compare to the original to get a bit of a sanity check here. Okay, I'll just take a quick look at it. It looks pretty close. Not exact, but pretty close.
Okay, now I want to visualize it with a stacked bar chart and with grays instead of colors. Okay, so here's our stacked bar chart. And you can see that each time point should sum to exactly one, but it's slightly off here because it didn't do a perfect job with the data extraction, but it's pretty good.
Now let's normalize it so each year sums to exactly one, and let's also use a stacked area chart instead of bars. Oh, and this last suggestion is also really good here. We'll reorder the stack so the agriculture is on the bottom and services are on the top. Let's go do that.
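The normalization step mentioned here amounts to dividing each sector's share by that year's total. A minimal Python sketch, assuming the extracted CSV has been read into a dictionary keyed by year: the numbers are made-up placeholders, not the values from the Taiwan plot, and `normalize_shares` is a hypothetical helper rather than anything Posit Assistant generated.

```python
def normalize_shares(rows):
    """Rescale each year's sector shares so that they sum to 1."""
    normalized = {}
    for year, sector_shares in rows.items():
        total = sum(sector_shares.values())
        normalized[year] = {
            sector: value / total for sector, value in sector_shares.items()
        }
    return normalized


# Placeholder values standing in for imperfectly extracted data (assumption).
# Sectors are listed in stacking order, agriculture first (bottom of the stack).
extracted = {
    1965: {"agriculture": 0.47, "industry": 0.22, "services": 0.29},
    1985: {"agriculture": 0.18, "industry": 0.41, "services": 0.43},
}

shares = normalize_shares(extracted)
for year, row in shares.items():
    print(year, {sector: round(value, 3) for sector, value in row.items()})
```

Dividing by the row total keeps the relative sizes of the sectors within a year unchanged, so only the small extraction error in the overall total is removed.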
And here we go. This is that same data plotted in a different way. So as just a reminder, let's look at the original. We started with this plot, and then we extracted data from it, then we made a new plot from the data, and we iterated on that plot until we arrived at something we like.
Now this is something that would not have been a reasonable thing to do in the past, but with this combination of AI and a data science environment and the integration that we have here, this actually becomes a really easy thing to do.
Thanks to everyone for telling us how you built Posit Assistant. What I love about these stories is they're not just flashy demos, they reflect real decisions made by engineers who understand what it's actually like to work with data. If folks haven't tried Posit Assistant yet, now's a great time. Head to Posit.ai to get started with a free trial.




