Ralph Wiggum, Claude Cowork, and Coding Workflows

Pierson Marks (00:00)
Hey, we're live, episode 27.

Bilal Tahir (00:00)
We are live.

Hello, hello, hello. Episode 27. Today is January 15th, and this probably goes out on the 16th. So yeah.

Pierson Marks (00:05)
Hello, everybody.

Right?

24 hour turnaround.

Well, everybody, if you're watching this for the first time, this is Creative Flux. We talk about generative media: images, video, workflows, everything around the creative space in AI. Sometimes we get more technical than others. This week there's a lot of cool things to talk about, and we'll be your shepherds through the changing environment of Hollywood, AI, and creative works. So yeah, thanks for tagging along.

Bilal Tahir (00:38)
Yeah, thank you for joining us on this journey. It's a crazy journey that gets crazier as we climb that exponential hill; it gets steeper and steeper. Yeah, two weeks into 2026, and I feel like already it's been crazy. I mean, so many things have happened, you know, like just mortgages, you know, initiatives, the chip and memory trade still going strong. It's just crazy.

Pierson Marks (00:48)
100%.

Yeah, totally.

All right, all right. Yeah, so, where should we kick this off today? I mean, there's been a lot of talk about a lot of different things.

Bilal Tahir (01:11)
Yeah, I wanted to focus this episode on some workflows, some of the agent workflows. I know we talk a lot about gen AI and stuff, but I think these are really cool foundational things that you can then use to build a foundation and a framework to really, like, 10x your productivity in terms of the apps you're building, or just generating media, the generative media engine that you are building. So, yeah. I guess we can start with something you were talking about before we jumped on, which is you said you are an official multi-agent orchestrator now. What does that mean?

Pierson Marks (01:36)
Right.

Yeah, well, so we're both software engineers, and I think the way that you program has drastically changed. You know, with ChatGPT, when it came out, I remember going back and forth between ChatGPT and my IDE in 2022: ask ChatGPT a coding question, or "give me the answer," take that, copy it, paste it into my IDE, you know, go back and forth, make some mobile apps. And then Cursor came out, and Cursor was like, hey, we have this tab model where it can autocomplete code for you, and then it can also write some bigger chunks of code without having to go back and forth with ChatGPT. It pretty much just simplified that process.

And then recently, I think this was less than a year ago now, Anthropic had Claude Code come out. Claude Code, for anybody that's not aware, is this coding agent powered by Claude, and it can go on really long tangents of writing a lot of code, checking itself, reviewing the code, and just iterating for an extended period of time. And it was kind of very similar. I think a lot of people were like, okay, I'm in Cursor, I'm doing this already there, and then there's Claude Code that can kind of do the same thing. And what kind of happened was that Cursor veered towards smaller, more isolated tasks, and Claude Code could be more autonomous. You give it a nice big plan and some high-level features, and then it'll go: it'll write the code, it'll check it, it'll run the tests, it'll build it, it can push the PRs. It's much more like a junior engineer who you have assigned work to.

And so the way that I was previously doing this, and I think probably you too, is you have your IDE open, you have Claude Code in your IDE, and you ask it a prompt, you give it some information, you watch it churn, you watch it going back and forth, you test, and you have this one instance of Claude. And then I was talking to some friends that were managing multiple of these Claude agents. They had two running, had three running, they had five running. Some of these friends are AI-native engineers that learned to program in the last three years, and they grew up around this. And then there are some people I know that are very well renowned, like Linux developers, really, really extreme-level programmers, that are running 10 Claude Code agents in parallel, and they're writing all this stuff.

And I was like, why can't I do this? Why can't I be that person that's orchestrating five agents at once? Because it's very hard for me to think that way. I've never been a product manager. You know, product managers, they think about...

Bilal Tahir (04:23)
you

Pierson Marks (04:27)
higher-level tasks, and how do you parallelize these things so you can split off all these tasks. And so I challenged myself this week to figure out what are isolated things that I can work on in parallel, and kick off multiple Claude Code instances and let those sessions just run in the background. When one needs a piece of action from me, it'll notify me on my desktop and say, like, do you allow this tool to be run? Do you allow this code? It notifies me through a notification on my Mac.

And I think yesterday I was able to build three different PRs in parallel that all looked good, they're all tested, and I was just cycling back and forth between them, making sure that they looked good. At the end, it's done. I was just like, this is sick. I look like one of those 4X.

Bilal Tahir (04:56)
Right.

Explain to us, I'm curious, how are you kicking these off and then how are you checking the progress? How is that happening?

Pierson Marks (05:22)
So I downloaded the Claude desktop app. The reason why I downloaded it is something that we'll talk about in a second with Claude Cowork as well. But in the Claude desktop app, there's a Code tab. So rather than just chatting with it, you can go to Code, and then it will actually just use your local repository. Just like you have Claude Code in your IDE, it will

Bilal Tahir (05:30)
Yes.

Pierson Marks (05:46)
create a git worktree. So it's a completely isolated instance of your repo. It'll work in that git worktree and make changes in that git worktree.
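A minimal sketch of the worktree setup described here, written in Python. It only builds the git command for each task; the directory layout and the agent/ branch prefix are hypothetical conventions picked for illustration, not what the Claude desktop app actually uses. The underlying git feature is real: git worktree add creates an extra working directory with its own checked-out branch, while sharing history with the original repository.

```python
from pathlib import Path


def worktree_add_command(repo: Path, task: str) -> list[str]:
    """Build the git command that creates an isolated checkout for one agent task.

    The layout (<repo>-worktrees/<task>) and the branch prefix (agent/) are
    hypothetical conventions chosen for this sketch.
    """
    worktree_dir = repo.parent / f"{repo.name}-worktrees" / task
    branch = f"agent/{task}"
    # `git worktree add -b <branch> <path>` creates a new branch and checks it
    # out in a separate directory that shares the original repo's history.
    return ["git", "-C", str(repo), "worktree", "add", "-b", branch, str(worktree_dir)]


if __name__ == "__main__":
    # One isolated checkout per parallel agent session.
    for task in ["fix-login", "add-search", "update-docs"]:
        print(" ".join(worktree_add_command(Path("/tmp/myapp"), task)))
```

Because each task gets its own directory and branch, several agents can edit files at the same time without touching the checkout that is open in the IDE.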

Bilal Tahir (05:54)
So it doesn't create a branch? You create the branch and then it creates the worktree for that?

Pierson Marks (05:58)
So you don't even need to create the branch. I mean, I think a worktree technically is a branch. And I don't know exactly how worktrees work, so scratch this. But it's just like a copy of the repo entirely. Yeah, so it's isolated. So I could have my coding IDE open that I'm actually looking at, that I'm actually writing code with, and then I can have these agents on the side that have their own isolated repos,

Bilal Tahir (06:04)
Right.

Right.

Mm-hmm.

Pierson Marks (06:23)
their own packages, and they're off doing this. As long as I spend a few minutes at the beginning giving them the right context, the right direction, providing some screenshots, and actually giving them the tools to validate their work. So one of the things that I added in was

Bilal Tahir (06:32)
right.

Pierson Marks (06:40)
like Ultracite, to lint those errors. There was a code simplifier agent, so that at the end of any Claude Code session, when Claude says, hey, I'm done with all my code, it will automatically say, hey, actually wait, let me go back through and run this subagent to simplify the code. So it actually will go through, refactor, and make sure everything is tighter, and that will automatically happen.

Bilal Tahir (06:59)
Hmm.

Pierson Marks (07:03)
So that was pretty cool. And then there are all these other skills, like the React best practices skill that Vercel released, which we have in there, and some other things. Yeah, it's pretty cool.

Bilal Tahir (07:18)
And how do you see the final code? Like, what is that? Do you go to the IDE, or do you go to GitHub to just review the PR?

Pierson Marks (07:25)
Yeah, so I for sure go to GitHub and review the PR, like, that's non-negotiable. So Claude will push the PR, I'll go to GitHub, I'll look at the PR, read it. And then you...

Bilal Tahir (07:35)
How does Claude create a branch? So it does a worktree create, and then it will create a branch for the feature?

Pierson Marks (07:41)
Yeah, pretty much. So I added, I think actually this is already in there for Claude Code. If you do slash, there are these official plugins by Anthropic. It's like commit, checkout, push or something. I forget what it is, but it will just commit the changes, it'll check out a new branch, it'll push that branch to a remote branch, and then it'll create the PR off that branch. So yeah, the worktree tracks the same repo, and so it'll just create the branch named the same as the worktree, and then it'll create the PR.
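The commit, push, and PR steps described here can be sketched roughly as a fixed command sequence run inside the worktree. This is an illustration under assumptions, not the Anthropic plugin's implementation: it assumes the worktree is already on its own branch, and it assumes the GitHub CLI (gh) is installed and authenticated, which the conversation does not mention.

```python
import subprocess
from pathlib import Path


def publish_commands(branch: str, message: str) -> list[list[str]]:
    """Commands that turn finished work in a worktree into a pull request."""
    return [
        ["git", "add", "-A"],                     # stage everything the agent changed
        ["git", "commit", "-m", message],         # commit the changes
        ["git", "push", "-u", "origin", branch],  # publish the branch to the remote
        ["gh", "pr", "create", "--fill"],         # open a PR, title/body taken from commits
    ]


def publish(worktree: Path, branch: str, message: str) -> None:
    """Run the sequence inside the worktree, stopping at the first failure."""
    for cmd in publish_commands(branch, message):
        subprocess.run(cmd, cwd=worktree, check=True)
```

Keeping the command list separate from the function that executes it makes the sequence easy to inspect before anything is pushed.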

Bilal Tahir (08:15)
That's awesome. Wow. Sweet. So you just went to Claude Code, you kicked off these three tasks, and you let them just go, like they're just running autonomously. I think this is the key thing. I'll talk about Ralph Wiggum in a second. But I think it's just an AFK workflow almost, where you can just kick something off. And I'm almost like...

Pierson Marks (08:16)
So I reviewed the code.

Okay, turn it on.

Bilal Tahir (08:34)
By habit, I look at the code as it goes, and then I'm like, wait, why am I wasting my time? Why am I just watching it do things? I can just look when it's finished, right? I can just do something else. It is a subtle mind shift. And I think the other key thing with this agentic workflow, which you kind of hinted at, is Claude Cowork. Claude Cowork is something they just launched last week, which is basically Claude Code, but for non-technical people. It's just a nicer, you know, GUI. Because at the end of the day, and people have, you know, it's one of those, like, "well, duh" things

Pierson Marks (08:42)
Alright.

Vert.

Bilal Tahir (09:03)
to say, but there's no real difference between an agent, you know, updating a PR and updating a code base versus writing an essay or, you know, going and doing a task. It's just a task at the end of the day. And so I think we're just going to increasingly see coding, technical agent work, and knowledge work kind of merging into this one thing where we just kick off agents. They do a thing, right.

Pierson Marks (09:16)
Mm-hmm.

All right.

Bilal Tahir (09:31)
And maybe we do have different GUIs or applications just because of the specific thing, but under the hood, that's what it's doing. The model is just doing something with the system prompt. So.

Pierson Marks (09:41)
No, absolutely.

I mean, I've spent a lot of time since the beginning of the new year thinking about JellyPod a little bit too, and the direction that we take the platform, and how do we architect it to take advantage of, you know, the bitter lesson: the flexibility of powerful agents, tool calling, and all this stuff. Because at the end of the day, Claude, for example, Claude Cowork or Claude Code in general, ships with a default skill called PowerPoint generator.

Enabling that skill or that power-up, you know, I kind of imagine Claude as this really powerful thing, but you've got to give it the skills and the power-ups, like you're playing a video game: hey, now you can do this, now you can do this, now you can do this. And how do you design these skills, or design JellyPod as that platform? So it's like the JellyPod agent, you know, and it has the ability to do all these different things.

And then you surface the right UIs, because there's still an element of UI that is important for creative works, like seeing the PowerPoint, seeing the podcasts, being able to edit things with the cursor versus text, because everything isn't necessarily text. I think text is a great medium for a lot of work, but sometimes, like, you would never go on Instagram and type to like a post. You want to press a button to like it. You wouldn't want to say, "hey, like this post." You just want to press a button to like the post. So it's a hybrid of the UI and the text interface. But yeah, it's super interesting. Claude Cowork is cool. It's for all those non-technical tasks that are specific to doing things at your job.

Bilal Tahir (11:10)
Yeah, I mean, and I think the other word is skill. I feel like a lot of people still think of skills as something like a code simplifier skill or an MCP skill. And I think it can really be basically anything where you feel like, you know, there's some pattern matching that you can kind of synthesize. It can be cool. I was actually playing around with Suno on the weekend, and I basically took the top 50 Billboard songs or whatever, and I was like, make me a skill that says what Gen Z listens to. It actually did a great job of coming up with patterns, like the subject matter and stuff, and then I basically generated some songs, and I was like, yeah, I can see this, you know, going viral on TikTok or something. I hated it. I'm a boomer, but I think young people would love it. And I would never have come up with that.

Pierson Marks (11:58)
So why did you create a skill? Like, can you dive into that thought process? Like, why?

Bilal Tahir (12:02)
Yeah, well, I created this script initially where I had all this context of all these songs, and I was like, based on this, give me some patterns, and then it took those patterns and came up with some songs. And I reverse engineered it to, like, what kind of pop artist, you know, it basically takes one part Sabrina Carpenter, one part Justin Bieber, whatever, right? I mean, it just comes up with something. But then I'm like, well, if I just want to have a fresh chat window and I don't want to redo that work, I can now load in that skill and then be like, all right, now based on this, let's come up with a new song about jazz or whatever, right? I mean, so it's just portable in that way. It's kind of like synthesizing it.

Pierson Marks (12:39)
Yeah, I want to dive into something real quick on this too. So skills: there's all this terminology that I think I'm still a little confused on. I've read the docs; it hasn't stuck. So maybe you can clarify, or maybe we both could do more research. So the skills, the plugins, the subagents, there are...

Bilal Tahir (12:55)
Right.

There's hooks, there's MCP. Actually, Lee Robinson, he's head of product, I think, at Cursor. I don't know if that's his official title. He's awesome. But he did a video, it just came out, I think, today or yesterday, about all these things. So he goes through this. There are skills, which we talked about, which are these MD documents. Well, there's CLAUDE.md, which is like rules. So rules are stuff like, you know, use these certain coding patterns, right? You know, maybe use async/await and don't use promise chaining, whatever. Those are rules.

Skills would be like, you know, use the AI SDK, or we use this package, this is the API, et cetera. For me, I mean, that's how I break it down. So it's not about abstract generics. It's more about specifics, like, use this API, or use this specific pattern for this specific thing. One of the reasons you want to separate them out is you don't want context overloading. You don't want Claude to always have all these things in it. It should just be able to take that skill when it needs it. So that's another reason why you parse these out.

And then, what was the third thing there? I'm blanking out. There was another form of this, basically. Oh yeah, so Claude just came out with something, I think it's called tool. It was basically, rather than having a

Pierson Marks (13:59)
Right.

Mm.

Bilal Tahir (14:22)
CLAUDE.md, you can have these tools, where you call the tool to fetch the data rather than already having the data. And I'm hazy on this, but one of the lead developers of Claude Code put this out. Apparently, I think that's the next paradigm, where you basically come up with tool calls that then fetch the thing rather than having the thing, which is what we have right now. We stuff it in the context, or at least in the code base. So.

Pierson Marks (14:47)
Interesting. Yeah.

Because, like, skills have a plain-text description.

Bilal Tahir (14:52)
Right, yeah, they're just markdown files. I mean...

Pierson Marks (14:54)
And so the skills are like tools, no? Or skills provide tools. So it's a higher-level abstraction above all this stuff. So a skill, you have a skill name, a description, and then everything that it gives you access to, right?

Bilal Tahir (15:03)
Right.

Right.

I mean, it's just the instructions. It's like docs for doing something, basically, right?
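To make the "name, description, then instructions" idea concrete, here is a small Python sketch. It assumes a skill is a markdown file with a front-matter header holding name and description, followed by the instruction body; the parser is a simplified illustration, and the example skill content is a placeholder, not a real skill.

```python
from dataclasses import dataclass


@dataclass
class Skill:
    name: str
    description: str
    body: str  # full instructions, only pulled into context when needed


def parse_skill(markdown: str) -> Skill:
    """Split a skill file into front-matter metadata and instruction body."""
    # Expected shape: '---', 'key: value' lines, '---', then the markdown body.
    _, front_matter, body = markdown.split("---", 2)
    meta = {}
    for line in front_matter.strip().splitlines():
        key, _, value = line.partition(":")
        meta[key.strip()] = value.strip()
    return Skill(meta["name"], meta["description"], body.strip())


def context_summary(skills: list[Skill]) -> str:
    """What stays in the prompt at all times: one line per skill, not the bodies."""
    return "\n".join(f"- {s.name}: {s.description}" for s in skills)


# Hypothetical skill file used only for this sketch.
EXAMPLE = """---
name: fal-api
description: Notes on which fal endpoints to call.
---
# fal API notes

Placeholder instructions for this sketch.
"""

if __name__ == "__main__":
    print(context_summary([parse_skill(EXAMPLE)]))
```

Only the one-line summary needs to sit in the context all the time; the body is read when the model decides the skill applies, which is the context-saving reason given earlier for splitting skills out of the rules file.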

Pierson Marks (15:15)
And do skills give tools? No.

Bilal Tahir (15:18)
You can document tools. Like, you can have a web search skill, which is built in now, but let's say it wasn't, you'd be like, hey, use this fetch or curl API. Like, I made a fal skill where I basically took all their APIs, so instead of searching, you know, there's this fal subscribe, there's this queue. So it just has that. It just saves you a little bit of context. Stuff like that, where you'd otherwise have to do it.

A meta thing on top of this, which I think is good to do, is you can come up with something which goes through all your skills and updates them. So especially with an API, let's say the AI SDK bumps a version, your skill becomes outdated. So we'll probably have some sort of a refresh thing. And this is where plugins come in, or MCP, where I think somebody else maintains them. So you pull them all the time. From what I understand, plugins are basically skills, but they live on a remote server rather than yours. So somebody maintains them, I guess, right?

Pierson Marks (16:10)
Because plugins have skills, but you can also add subagents and hooks and stuff. So a plugin is like a bundle of skills, hooks, subagents. It can contain more than just a skill. So if you pull the plugin, you get the skill, or all the skills, that are in that plugin. You get the hooks. You also get the subagents or whatever, and MCP servers. So I think plugins are just like a bundle, like a package. I think of them as packages that you can install into Claude that can contain more instructions, skills, subagents.

Bilal Tahir (16:45)
Yeah, I mean, it's funny. We're in the thick of this, and even still, it's very confusing, because I feel like every month some new abstraction comes out, and at some point we'll hit too many abstractions, and maybe we'll go back and consolidate and simplify some of this stuff. But yeah, subagents, hooks, all this, it's crazy.

Pierson Marks (16:59)
But.

I mean, the reason why I got into this, why I was really trying to force myself to become the orchestrator type of person, was that, like Andre said, I feel like there's so much power right now that I'm just not aware of. And it's a challenge. You have to really force yourself to learn something new and really place yourself into an uncomfortable situation. Because it's very easy to just go about your day doing the things that you know how to do, because you don't want to take the penalty of slowing down to relearn and refactor.

And I think one of the key things today, if you want to go from being this 10x engineer that uses Claude in your IDE to being the 100x, the 1000x engineer, is, not every day, but every few weeks, taking a step back and dedicating a day, a full day maybe, or maybe a few hours, to thinking about what is out there today that I'm not fully utilizing. And that was the kind of day that I spent yesterday. It was like, hey, I really want to understand what plugins are out there, so that I could set up a system that compounds going forward.

Adding a code simplifier agent at the end of every Claude Code session is going to multiply the effectiveness of both of us in our code base so much. It will take me an hour to set up: how do I do this, you know, install the new skill, modify the CLAUDE.md file and the prompts. But it was a one-time cost of me learning that thing. And now myself, you, and the team, we're going to be able to go faster with higher-quality code, or go for longer. And all it took was a small investment today to pay dividends.

Bilal Tahir (18:48)
Yeah, no, 100%. And it's something I'm trying to get better at too. So it's actually a good segue into Ralph Wiggum. We touched on this last week. Ralph Wiggum, basically, some of the hype, I mean, honestly, it's kind of very MCP-like, like, "oh my God, it's going to change everything." I'm like, chill. It's cool. MCP is cool, but, you know, it's not going to change the world. I mean, it's just a cool workflow.

What is Ralph Wiggum? First off, it's basically a bash loop, as the creator says. Right now, what you do is you go into Claude Code, you say, do this, and it enters plan mode, and it'll maybe divide that task into 10 steps, and then it'll execute on those 10 steps and then it'll finish, which is awesome. Very powerful.

But sometimes you want to do a lot, right? Let's say you want to build an app, a full app, let's say you want to build Notion or Figma, right? It's not going to enter plan mode and build Figma for you, probably, right? So essentially, what you want to do is give the model less to do each turn, until it finishes. And so Ralph Wiggum is very simple. All it does is you create a PRD. And a PRD is basically a document with all the checklists. So you create the checklist before. Instead of 10 steps, you first enter plan mode, and I like to do this with Claude or ChatGPT, just chatting. Come up with a checklist of, let's say, 100 things you have to do. You have to install these dependencies. You have to really go granular. The more granular, the better. And so each task is very granular:

Pierson Marks (20:14)
Mm.

Bilal Tahir (20:17)
you're going to make the search function, you're going to make this component, you're going to make this, blah, blah, blah. It breaks it down into 100 steps or whatever. And then what you do is basically you run a bash script, and I guess the simple but key insight is you don't start one Claude session for everything. What you do is you start a Claude session, you go to the checklist, you say, pick the most important or most natural item you want to do. You pick it, finish it.

And there are a couple of things that happen here. First, you have a progress.txt file. You finish the task and you append what you did. It's kind of like a changelog, an append-only changelog. This is for you, as documentation. You say, I did this, this, this. And then once you're done, you commit the changes, so you actually commit it, and then you check the box, and then you finish, and you spin the agent down, and that's it. So you're giving the agent less to do, so it can just focus on that one thing.

And then the way the script runs is it's basically a while loop, and it says, go through the things, and once you're done, output this string, which is COMPLETE, just a special end identifier. And that way you can exit the loop eventually: once it's done, it'll just print that thing to standard out, and that's how you know to exit the loop. But essentially that's how the while loop runs.
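The loop described above can be sketched in a few lines. This is a Python rendering of the idea rather than the original bash script: the prompt wording, the PRD.md file name, and the exact end marker are assumptions made for illustration, and the run_claude helper assumes the claude command-line tool is installed and accepts a one-shot prompt via -p.

```python
import subprocess
from typing import Callable

SENTINEL = "COMPLETE"  # hypothetical end marker; the real script's string may differ

# Hypothetical prompt: one checklist item per session, then log, check off, commit.
PROMPT = (
    "Read PRD.md and progress.txt. Pick the single most important unchecked "
    "item, implement only that item, append a note to progress.txt, check the "
    "box in PRD.md, and commit. If every box is checked, print " + SENTINEL + "."
)


def run_claude(prompt: str) -> str:
    """Start one fresh, non-interactive agent session and return its stdout.

    Assumes the `claude` CLI is installed and that `-p` runs a single prompt;
    swap in whatever agent command you actually use.
    """
    result = subprocess.run(["claude", "-p", prompt], capture_output=True, text=True)
    return result.stdout


def ralph_loop(run_agent: Callable[[str], str], max_iterations: int = 100) -> int:
    """Run fresh agent sessions until one reports completion; return the count."""
    for iteration in range(1, max_iterations + 1):
        output = run_agent(PROMPT)  # new session each turn, so context starts clean
        if SENTINEL in output:
            return iteration
    return max_iterations  # safety cap so an unattended run cannot loop forever
```

Calling ralph_loop(run_claude) would drive real sessions. Passing the agent in as a function also makes it possible to dry-run the loop with a stand-in before leaving it unattended, and the iteration cap is an addition of this sketch, not something mentioned in the conversation.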

And it's simple, but brilliant. And the cool thing is you can do all sorts of things. So I did a couple of things. And by the way, I want to call out the person who built this script, because there are so many people who talk about this, but he actually got into it. His name is Matt Pocock. He's on Twitter. He was the TypeScript guy before, and I've watched his TypeScript videos for the longest time. He knows TypeScript very deeply, stuff you wouldn't even think about, like, should I use type or interface? I had never thought about the trade-offs there, but yeah, he actually goes into why you would use one or the other. Turns out it's not a big difference, by the way.

But he basically wrote this post that went viral, which I recommend checking out, and maybe we'll share some of the scripts I did too. So he has a script, which is ralph_afk, that you can run, and there are a couple of things he does which I really like. First off, Docker has a Claude sandbox, which is really cool. And if you really want to go away from the keyboard, I recommend you run it. So instead of running claude, what you do is you run docker sandbox run claude. It's a special Claude template, and what it does is it moves into that directory, and it's just a sandboxed way of running Claude, so it can't do something like rm -rf ~. You know, I mean, it would never do that, I don't think. Somebody posted that and I was like, fuck. But even if it does that, at most it'll wipe out your Docker root directory. It doesn't go outside the box. So that's a good way to run this.

So you run that and then you run the script, and then what I do is I have it check the PRD item as done. It goes through, it does the commit, and then it runs Ultracite and Biome lint, checks that, fixes the lint errors. So you don't need a post hook; it's basically your instructions, you just add it in there. And then it commits and then it spins down. I just thought it was so cool that I was able to do that. If any of you have a Claude subscription, what you can do is, I had some usage left because my limit resets 10 a.m. Thursdays, so I was like, I don't want to waste this, it's like money on the table. So I basically set this up and went to sleep, and I woke up and it had gone through like five items of the 10-step checklist. I was like, okay, this is cool. I can truly just leave the computer running.

Pierson Marks (24:01)
Does your computer, like, if your computer falls asleep, do you keep it on?

Bilal Tahir (24:05)
Yeah, that was something I was concerned about. On the Mac, when the power adapter is plugged in, I deliberately set it so it doesn't sleep. I guess, really, if you want to take it to the next level, what you'd do is spin this up in an EC2 instance or something like that, so that you can actually put your Mac to sleep and just let it run.

Pierson Marks (24:22)
Well, that's interesting. I mean, on the Claude Code desktop app, I haven't done this yet, but when you create a new session, you have the option to either select a folder locally, or you can go to the default cloud environment, which I don't know what the deal is there. It seems like...

Bilal Tahir (24:38)
Wait, wait, repeat that, sorry.

Pierson Marks (24:39)
So you can spin up your Claude Code instance in a cloud repo or a cloud environment. I don't know. I think it might be GitHub. I just don't know where the environment is. Trusted sources. Like, I'm not actually sure where this spins up, but it spins up in the cloud somewhere. And you can kick it off from your Claude desktop app.

Bilal Tahir (24:45)
Where from?

Interesting. And is this GitHub, or is this Copilot? Because I know they have the workspaces and stuff that let you kind of do that.

Pierson Marks (25:07)
This is in, well, this is the Claude desktop app.

Bilal Tahir (25:13)
Oh yeah. Yeah, I wonder if it's similar, because I know Cursor has this thing called background agents, where basically they'll just run that under the hood and you give them access. I do think this is going to become standard. It's going to be where you can just have almost a REPL-like thing where it just goes and spins up a, you know, Docker container or whatever, and you just let it go wild in that.

Pierson Marks (25:15)
But then.

Right.

Claude Code.

Right, I think this is on Anthropic's servers, because they do have Claude Code Web. And if you have Claude Code in your IDE, you could always send your current task to the cloud by doing Control-B. So they have the background agents too. If you want to just send this thing off to the cloud, just do Control-B, and then it'll send the agent to the cloud, and then it's not on your device anymore.

Bilal Tahir (26:02)
Interesting. I thought with Control-B, it's still running in your terminal, it's just in the background, so you can keep interacting with your terminal while it runs.

Pierson Marks (26:10)

You know, I think you're right. Yeah, good correction. Yeah, yeah, yeah.

Bilal Tahir (26:17)
But that would be sick, if you could just, it's almost like SSH into... Like, you know, in a terminal I open up, I want to just run a Claude instance in a Docker container on a separate machine that I SSH into. I mean, there are companies like fly.io and stuff. There's a lot of opportunity. I'm sure they've thought about this, where it's kind of like a better AWS, because AWS sucks in terms of just working with it. So usually I'd use Fly or whatever, you know, it's just easy to create these little droplets. So, oh wow. What's this, your desktop?

Pierson Marks (26:47)
Totally. So this is Claude Code Desktop. I wanted to show it because I think it makes sense. You can select your repo. What's a repo that I can, I guess we could do the actual repo. It's okay. And you can do it local. So if I do local and I select the repo,

Bilal Tahir (26:51)
Nice.

Pierson Marks (27:10)
the worktree creates an isolated copy of the repo, and it can just kick off a task. This is the thing that I was talking to you about yesterday. So I have all our tasks in Linear. You can't see this because I'm not sharing the screen, but Linear is a work management program. Really good. It's like Jira, but much better. I like it. And let's see if there's a small little issue that we can do.

Okay, so I have this issue, it's JellyPod 850, it's called podcast and blog post tool. So what I'll do is, it's already connected to Linear, so I can see the Linear connector. I'll say, right, right, so I'm going to do implement JellyPod, wait, what was the number? 850. So that was the number, and so what it's going to do, it's going to act.

Bilal Tahir (27:48)
You can do screw slash and connect.

Pierson Marks (28:03)
And so this is all happening locally on my machine. This is like the look.

Bilal Tahir (28:06)
So yeah, can you explain the difference between the local and the remote? If you had done it with the remote, it just would have run this on their servers, basically?

Pierson Marks (28:15)
I think so. Let's do default. Let's select repo, this one, and then let's do a different activity. I'm trying to find an issue that we could do. Okay, improve front-end static pages. Actually, I have one: use the React best practices skill.

OK, so this is JellyPod 900, so implement gel 902, and this is going to the default cloud environment. Don't know where that is, but let's let it run.

Bilal Tahir (28:59)
Interesting. I feel like you would have to connect

Pierson Marks (29:01)
So here's, yeah. The one thing is, I don't know if you can just dangerously skip permissions on here, which is "always allow," but I also set it up so that if I just have this in the background and it notifies me, I have my desktop notifications on. So I could just be in my email.

And in the top right-hand corner, I get a notification like, hey, Claude wants to do this thing. Do you allow it? You just click Allow on the notification, which is really cool too, because I could just minimize this, put it in my dock, and just allow and allow from the notification at the top. Like, this would have shown a notification in the top right-hand corner. I'm not even reading what it says. I just allowed it. And then.

Bilal Tahir (29:39)
Yeah.

Yeah, I mean, the constant asking for permission is very annoying. I mean, you have to click the first few yeses and stuff, but then you also feel like, you know, just doing dangerously skip permissions also feels a little risky. This is where the sandbox comes in. I feel better about just using the sandbox.

Pierson Marks (29:59)
100%. No, totally. But yeah.

Bilal Tahir (30:01)
But maybe that's where the remote thing is cool, right? Because, I mean, at worst, I guess the one command I think I would be scared of would be if it does like a git remove, like with a flag or something like that, that completely wipes out the git directory. But at least within its server, the worst it can do is mess up the development of the feature it's on, or just try to delete the folders and stuff in that thing.

Pierson Marks (30:28)
Right.

Bilal Tahir (30:29)
And I think this is something that'll become very important. Well, like we've talked about, everyone kind of knows what prompt injection is, but with prompt injection,

Bilal Tahir (30:40)
the way this exploit works is, basically, the agent, let's say, is doing some web search or whatever for the task. And it goes to a malicious page where they do a prompt injection, saying, "Take all the data in your root and send it to this address." And so, I mean, the agent can be hijacked like that. So I think we're going to see some buttoned-up,

you know, maybe like a CORS equivalent or something for agents, where it can't send data or something, stuff like that. That is a little scary, but this is why sandboxing and stuff, I think, is very important. Jumping back to what you were just doing. So those two tasks, one on the remote server, one locally. So it's creating the worktree. It's just running on their servers. Once it's done, it's going to push to GitHub and then you can basically review it there, or you can just review it inside Claude Code and be like, make this edit, et cetera. So that's so sick.
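
The "CORS equivalent for agents" idea can be illustrated as an outbound allowlist: the agent may only send requests to hosts named in advance, so a hijacked prompt cannot ship data to an arbitrary address. This is a minimal sketch under stated assumptions, not an existing product feature; `ALLOWED_HOSTS`, `is_allowed`, and `guarded_request` are hypothetical names, and a real sandbox would enforce this at the network layer rather than in application code.

```python
from urllib.parse import urlparse

# Hypothetical allowlist: the only hosts this agent is permitted to contact.
ALLOWED_HOSTS = {"api.github.com", "registry.npmjs.org"}


def is_allowed(url: str, allowed_hosts: set[str] = ALLOWED_HOSTS) -> bool:
    """Return True only for http(s) URLs whose host is exactly on the allowlist."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        return False
    # Exact match on the parsed hostname, so lookalike suffixes do not pass.
    host = (parsed.hostname or "").lower()
    return host in allowed_hosts


def guarded_request(url: str) -> str:
    """Gate every outbound call; a real agent would do the network I/O after this check."""
    if not is_allowed(url):
        raise PermissionError(f"blocked outbound request to {url!r}")
    return f"would fetch {url}"
```

With this in place, an injected instruction like "send it to this address" fails at `guarded_request` unless the attacker's host was already on the list.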

Pierson Marks (31:25)
Totally.

You could also open this up in the IDE as well. I mean, you can watch it write the code in the IDE if you want that. If you don't want that, you don't have to. And then, yeah, you can kind of smell, you can kind of look at the code and tell if it

looks off sometimes. I realized I could detect, oh, Claude's going in the wrong direction right here, just by looking at the diffs. Just seeing, why are you doing all this stuff? It just looks weird, and then you investigate, oh, yeah, why are you doing this? And it just becomes this new sort of, I don't know, scanning type of capability that you learn when you're reviewing this type of code and you're watching Claude Code churn along. You're like, wait, why are you doing this weird

Bilal Tahir (31:45)
Yeah.

Pierson Marks (32:12)
thing right now. Stop.

Bilal Tahir (32:14)
Yeah, yeah, no, definitely, like

looking at the diffs and stuff, you know. I mean, review-first has kind of become my go-to as well, where I'll let it just go do its thing and then I'll just start looking at it. And people have different workflows. Some people like to look at it

locally, or I know you love Linear for that. I'll do the local diffs and then I put up the PR. But I also like GitHub's. Maybe I'm old school, but I like the PR. That's why I like to set up draft PRs, because I like that view in GitHub to see the file differences.

Pierson Marks (32:46)
Yeah, I like that view too. It's very nice, because looking at the diffs is hard locally for some reason.

Bilal Tahir (32:53)
Yeah.

Awesome, yeah, I like the deep dive we did into all these agents and stuff. I think it's like building the system to do the thing. There's so much alpha there. So investing your time and resources into this, I think it's great.

Pierson Marks (33:08)
Totally.

Totally. Yeah. It's interesting. And one last thing I want to talk about, I know we went on this tangent of coding. One thing I wanted to call out that I thought was really cool, because I tried to build this. I think I did pretty well building this about a year ago. Vercel released this new package called JSON Render, which is like a streamable version of generative UI. And it always was that.

Bilal Tahir (33:30)
Right. Was that a year ago? Because I saw the announcement like yesterday. I don't know if that was.

Pierson Marks (33:34)
No, Vercel released it yesterday, but I built something doing the same thing like a year ago, because I was looking at how we generate components for Remotion video in the podcast, but you wanted to confine it to some sort of alphabet of components so that the AI didn't just go off the rails and

Bilal Tahir (33:39)
Oh, nice.

Right.

Right.

Pierson Marks (34:00)
What JSON Render is, essentially, is Vercel came out and said, hey, you can use this package to allow LLMs to generate predefined components. You provide the components that you want the LLM to be able to generate, such as a div, an input box or text area, a button, and you define these components in a schema. And then you give the LLM the schema and, using structured outputs,

it confines the outputs to these component keys and the parameters that it can create. And so by doing this, you give the LLM the ability to create components, but it won't go off the rails. Like, you know, it can only create input fields, buttons, and text. You can scale that to whatever you want. And that's really powerful, because I always wondered, why don't we just let

Bilal Tahir (34:41)
Yeah.

Pierson Marks (34:48)
an LLM just generate the UI. I guess there are security implications to this if you just let it go off the rails, because you can do some weird prompting, I guess. I don't know. Still kind of confused on why you can't just let it generate the JSX or the HTML and then just render that. But yeah, it's a cool thing.
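
The catalog idea can be sketched in a few lines. This is an illustration of the general technique (validate model output against an allowlisted set of components before rendering), not the JSON Render API; `CATALOG`, `validate_node`, and `validate_tree` are hypothetical names, and the flat list of nodes is a simplifying assumption. It also hints at one answer to the question above: raw JSX or HTML from a model is code your app would execute, while a catalog entry is only data your own renderer interprets.

```python
import json

# Hypothetical catalog: component type -> props the model is allowed to set.
CATALOG = {
    "text": {"content"},
    "input": {"label", "placeholder"},
    "button": {"label", "action"},
}


def validate_node(node: dict) -> list[str]:
    """Return a list of problems; an empty list means the node is safe to render."""
    kind = node.get("type")
    if not isinstance(kind, str) or kind not in CATALOG:
        return [f"unknown component type: {kind!r}"]
    props = node.get("props", {})
    if not isinstance(props, dict):
        return [f"{kind}: props must be an object"]
    extra = set(props) - CATALOG[kind]
    if extra:
        return [f"{kind}: unexpected props {sorted(extra)}"]
    return []


def validate_tree(raw: str) -> list[str]:
    """Parse model output as JSON and validate every node in a flat list."""
    try:
        nodes = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(nodes, list):
        return ["expected a JSON array of components"]
    errors = []
    for node in nodes:
        if not isinstance(node, dict):
            errors.append("each component must be a JSON object")
            continue
        errors.extend(validate_node(node))
    return errors
```

A model response that asks for a `script` component, or sneaks an `onClick` prop onto `text`, comes back with errors instead of reaching the page.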

Bilal Tahir (35:03)
It is interesting.

I saw that package too, and I think

it's funny, because I kind of do the same, where I'll install all of shadcn and I'll just specifically ask it to use it. So this is just kind of a more deterministic way, I guess, to do that. It makes sense. I will say one thing: it does limit you. It depends on what your goal is, because we had this discussion with someone, I was telling him that I'll tell it to use shadcn. And he says, well, when you do that, you limit the model's imagination on how to create a good, well-designed website.

It does come out looking like a very shadcn, Bootstrap-type website, and that is one thing, but maybe you want that. So it's like a spectrum on the dynamic UI thing.

I'm a little skeptical, because a lot of people have said everything will be generative and stuff. I just don't understand it on two levels. Well, I guess I push back on two levels. The first is just the amount of compute that is wasted generating stuff versus just having a pre-rendered page. The second is, I think people always, because it's mostly developers, they overestimate how much people value personalization and stuff. And I think 99% of people just want the blue Facebook page, and only

Pierson Marks (36:00)
Perfect.

Thank

Bilal Tahir (36:10)
a small minority actually care about, I can tweak this and personalize it. Now, that doesn't mean, I think there's a hybrid. I mean, it's kind of like ad placement, right? You get the banner. The banner position is the same, but the banner changes based on your personality. So there are probably going to be small pockets of the UI that you can personalize more, but not the whole thing. Like, your Instagram looking completely different than my Instagram? That, I don't think so.

Pierson Marks (36:15)
Tom.

I agree. That's... totally.

Oh, no, absolutely.

I can't agree with you more here. I think that this whole concept of fully dynamic, personalized UIs, I'll take the strong position that this will never happen, and it will never happen because, for most people,

there's the consumer-creator paradigm. There's 1% of people that create, and then 99% of people consume. And the 1% of people that create have the taste and the ability and the thought to really think about, hey, how do I build something that a large majority of people benefit the most from? And you put a lot of thought into that process. And so there's usually...

You know, one right answer. I know my product really well. I know what I'm trying to achieve with this product. I want to get people to subscribe. I want to get people to be able to generate stuff. So I'm going to go through in my head, what's the most optimal path to achieve that outcome? And then maybe there are multiple paths. Maybe for people in the United States versus people in Asia, the paths are slightly different because of cultural differences. But I don't think it means that for every single person

you'll be computing what the optimal path is to get them to the outcome, because there's a finite number. It's like, hey, no, you click this button, you go to this next page. You don't need to figure that out each time. You know, you click this button, you want to go to this page. So I completely agree.

Bilal Tahir (37:52)
Yeah. Right.

Yeah, so we'll see. But still, I mean, it's very interesting. JSON is interesting. I know our friend Vaibhav, he has BAML, it's a sort of programming language, but they were doubling down on this whole JSON streaming, rendering UIs as well. So there's definitely a space for it. It'll be interesting to see how it plays out.

Pierson Marks (38:14)
Totally. Totally. I completely agree.

Well, I mean, we're at like 40 minutes today. So I know there's other stuff we could talk about. We can save for next week. Up to you.

Bilal Tahir (38:24)
Yeah, no, I mean, I think so much of this is good. Like, we focused on the workspace, the workflows and stuff. So I think we can have a pure episode on just that, you know, although lots of other cool stuff came out. We'll talk about that next week.

Pierson Marks (38:39)
Yeah, absolutely. We'll add it into the show notes. The Lee Robinson video that we'll add, Ralph Wiggum, Claude Cowork. Yeah. So we'll add those into the show notes. And next week, we'll cover all the other stuff that we didn't get to talk about, and more.

Bilal Tahir (38:44)
Yeah, no, yeah, post that article as well.

Yeah,

See ya.
