Claude Code as a General Agent, Native Ads, & AI Filmmaking
Pierson Marks (00:07.471)
Hey!
Bilal Tahir (00:08.782)
Hello, hello, we didn't get the timer this time. It just...
Pierson Marks (00:12.123)
I think we did, maybe, maybe not, I don't know. So while we're on, we're live. Episode nine of Creative Flux, welcome everybody. We're almost, yeah, in double digits, that's crazy. I remember when we first started JellyPod, there was this interesting stat that I loved: 90% of podcasters never get past episode three.
Bilal Tahir (00:15.918)
Yeah
Bilal Tahir (00:22.926)
Welcome, nine, damn, almost in double figures.
Pierson Marks (00:37.499)
And 99% never get past episode 20. So we got past episode three, we're on our way to episode 20. Definitely gonna be in the 1%. Yeah. Welcome to Creative Flux. I'm Pierson Marks, this is my co-host Bilal, and we just talk about...
Bilal Tahir (00:43.342)
We're definitely going to be in the 1%. Yeah, that's going to happen. It's a commitment.
Bilal Tahir (00:56.43)
I really know.
Pierson Marks (00:58.797)
everything that happened this week in generative media, specifically around audio gen, video gen, images, everything in the gen media space. Not that much about Sam Altman and Elon Musk drama. We'll stay away from that stuff; listen to other podcasts to hear more about that. But we talk about real world stuff. We do. Totally. If this is your first time listening, we're just trying to be a
Bilal Tahir (01:15.01)
Yeah, although we get into some technical drama. I do have some technical drama news to share, but yeah.
Pierson Marks (01:26.245)
podcast that focuses specifically on the really cool creative aspects of AI and less so on benchmarks and all that stuff. We'll talk about it, but the primary focus is on images, media, video, audio, music, all that good stuff. So how's your week been, Bilal?
Bilal Tahir (01:47.478)
It's been good, very hot, you know, just, you know, summer, so enjoying. Yeah, what about you?
Pierson Marks (01:52.409)
Right, totally. It's been good. It's super sunny outside. After this call I'm going to make some lunch. I might actually walk down and get some lunch, because I haven't done that in a long time. When it's nice out in San Francisco, I'm like, let's get outside.
Bilal Tahir (02:04.77)
Yeah, mix it up. Nice. What's your favorite restaurant in San Francisco?
Pierson Marks (02:11.61)
That's tough. It depends. I think I had the best food so far when I went to an omakase sushi restaurant. It was just really good because it was a special occasion. My girlfriend just took the bar exam, so we were celebrating, and that was really awesome. I'd never had omakase before, so I'm biased. Yeah.
Bilal Tahir (02:22.542)
Whoa, nice.
Bilal Tahir (02:27.819)
Mm-hmm.
Bilal Tahir (02:34.016)
Omakase is amazing. Yeah. If you're ever in New York, this is, I feel like, not even a secret, but every time I have friends who ask, "I'm going to New York, where should I go?" they just expect the same New York pizza and blah, blah, blah. But I'm telling you, omakase is the best. For less than a hundred dollars, you will literally get something that would easily cost you $500 in any other city, including San Francisco. Just because there are so many, basically,
Pierson Marks (02:59.438)
Really?
Bilal Tahir (03:03.404)
Michelin star-level chefs in New York, and there's this perfect competition, so they have to offer really good rates. And the cool thing is most omakase culture is not reservation-based, it's first come, first served. So even with famous chefs, you can get in if you go on a random Tuesday or something, at a not-busy time. So I definitely recommend, if you're in New York, try omakase. I love omakase. It's
Pierson Marks (03:24.196)
Great.
Bilal Tahir (03:31.67)
an amazing way to get a variety of things on a curated menu.
Pierson Marks (03:35.162)
Totally. I haven't been to New York in so many years. It's interesting because there's a decent number of AI companies out there, not as many as San Francisco, but they're trying to compete. They're still the finance bro hub. But come over to San Francisco, y'all. It's much cooler over here, in terms of tech anyway.
Bilal Tahir (03:55.436)
That's true. Yeah, yes. SF has the tech vibe, you know, for the tech nerds.
Pierson Marks (04:00.995)
Yeah, the tech nerds, but.
Yeah, the week's been good. I did some cool stuff last night. I was playing around with Claude Code and things, and I won't dive into it that much, but before we jump into the video stuff, I thought it was really cool to share. If you don't know, Claude Code is this coding agent based on Claude, and it's also a really great general purpose agent. So I installed it outside of a repo and I gave it some tools, like image generators and access to our blog.
Bilal Tahir (04:26.146)
Hmm.
Pierson Marks (04:33.756)
And Claude Code does such a great job. What it does well is long, extended tasks without human intervention.
Bilal Tahir (04:41.451)
Hmm.
Pierson Marks (04:43.354)
And unlike ChatGPT or Claude Desktop or Raycast, and I like Raycast a lot too, those are more of a question, answer, response type of format. But Claude Code is just really, hey, we'll do something for 10 minutes, we'll fix ourselves, we'll write code. And what I found really powerful: I was looking at our blogs, and I took all this data from Google Search Console in a CSV format, which we
Bilal Tahir (04:57.315)
Hmm.
Pierson Marks (05:13.338)
use for SEO monitoring. And then I have access to our blog as well. And I was letting Claude write a program that took all that data and analyzed, like, what are our low-hanging fruits? Where can we improve some of the blog articles? And it was writing programs to do this analysis, running those programs, suggesting things, actually updating our blogs. And it was just really cool because
I could just give it a task and it'll go off and do something, use the tools it has access to, write a tool on demand, generate images for the blog articles. And that was really cool. So if you haven't tried it, you should try it out.
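A minimal sketch of the kind of throwaway analysis script Claude Code might write for this, assuming a Search Console export named queries.csv with query, clicks, impressions, ctr, and position columns; the file name, column order, and thresholds here are illustrative, not JellyPod's actual setup:

```ts
// find-low-hanging-fruit.ts
// Naive CSV parsing is fine here; this is a one-off analysis pass, not production code.
import { readFileSync } from "node:fs";

interface QueryRow {
  query: string;
  clicks: number;
  impressions: number;
  ctr: number;      // assumed to be a fraction, e.g. 0.021 for 2.1%
  position: number; // average ranking position
}

const rows: QueryRow[] = readFileSync("queries.csv", "utf8")
  .trim()
  .split("\n")
  .slice(1) // skip the header row
  .map((line) => {
    const [query, clicks, impressions, ctr, position] = line.split(",");
    return {
      query,
      clicks: Number(clicks),
      impressions: Number(impressions),
      ctr: Number(ctr),
      position: Number(position),
    };
  });

// "Low-hanging fruit": plenty of impressions, ranking on page one or two,
// but a weak click-through rate. Thresholds are placeholders to tune per site.
const candidates = rows
  .filter((r) => r.impressions > 500 && r.position > 3 && r.position < 20 && r.ctr < 0.02)
  .sort((a, b) => b.impressions - a.impressions);

for (const r of candidates.slice(0, 20)) {
  console.log(
    `${r.query} | impressions=${r.impressions} | pos=${r.position.toFixed(1)} | ctr=${(r.ctr * 100).toFixed(2)}%`
  );
}
```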
Bilal Tahir (06:00.43)
That's awesome, yeah. And once it goes through a flow, can you store that procedure basically, so you can just do it again? Or do you have to go from scratch every time and give it the tools again and everything?
Pierson Marks (06:14.618)
I'm figuring that out right now. So the cool thing Claude Code added recently was sub-agents. Pretty much, you have Claude Code and you ask it to do something, but you can also build sub-agents; I wrote a writer sub-agent. At a high level I was always kind of skeptical of this idea of background agents, sub-agents, because why do you need them? You have the context in Claude,
just let the main guy do it, you know. But I was creating this blog writer sub-agent that had access to specific tools, and
Bilal Tahir (06:46.19)
Hmm.
Pierson Marks (06:52.536)
I give it a description like, you are an expert, helpful blog article writer, you have access to these tools. I give it some context around what JellyPod is, or just how to write good articles. And so when I just chat with my main agent and normally say, hey, can you write some blog articles for me, it will delegate that task over to the sub-agent. The sub-agent will go off and write the articles, and the context will be contained in that sub-agent.
Bilal Tahir (07:18.094)
you
Pierson Marks (07:22.92)
And so essentially that sub-agent will go off, do all this, iterate, use the tools, and the main agent will only get that output. So you don't have to pollute your main agent's context window with all of these unnecessary tool calls and things. So that was pretty cool. But I don't know yet, maybe you could just say, hey,
Bilal Tahir (07:41.646)
Bye.
Bilal Tahir (07:45.326)
Yeah.
Pierson Marks (07:49.902)
based on everything that you just did and everything that worked and didn't work, just create a function that will do that exact same thing correctly.
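For reference, Claude Code sub-agents are defined as Markdown files with a bit of YAML frontmatter, typically under .claude/agents/ in the project. Roughly what a writer sub-agent like the one described here could look like; the tool list and the instructions are illustrative, not JellyPod's actual configuration:

```markdown
---
name: blog-writer
description: Expert blog article writer. Use this agent whenever the user asks
  for a new blog post or an update to an existing one.
tools: Read, Write, WebSearch
---

You are an expert, helpful blog article writer for JellyPod.

- Follow the CMS schema exactly (title, slug, description, body, cover image).
- Keep the tone practical and concrete; avoid filler.
- When a cover image is needed, call the image generation tool and attach the result.
- Return only the finished draft so the main agent's context stays clean.
```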
Bilal Tahir (07:55.086)
Yeah, that was my question. How do you contrast it with just having a function with a prompt, one that just says that in the context, instead of having a sub-agent? What is it, the flex, the judgment, the fuzziness that the sub-agent gives you? But if you have a well-defined task, I guess, why isn't it just a function?
Pierson Marks (08:14.714)
I think it's what I'm finding. The first few iterations are fuzzy; you don't know necessarily what you want. Is this something that you'll do multiple times? And if it is, like writing a blog article where this is the exact schema you need, this is the exact tool you need to call, this is the exact API you need to call to generate the blog cover art,
Bilal Tahir (08:22.327)
Hmm.
Pierson Marks (08:39.99)
it probably could just be like, hey, okay, now I'm doing this a bunch. Create a tool that's just "create and publish blog post." And it's more procedural: write code that does this correctly, knowing everything you now know, and add that to your tool arsenal. So.
Bilal Tahir (08:46.83)
you
Bilal Tahir (08:58.734)
No, that's so cool. It's basically like a human: you go through a task and then you're like, okay, I gotta write documentation on the steps I took. And you're basically going to be doing that with the sub-agent, figuring the requirements out and then buttoning them up, making that into a function, you know, with good rules. And you can always iterate, going back to that flow. So it's pretty cool, interesting. I think that is the hybrid, because API calls are always gonna be more efficient than
Pierson Marks (09:18.223)
Totally.
Bilal Tahir (09:25.966)
running LLMs, I feel like, especially long-running ones. So maybe that's the hybrid, where the LLM is the first pass, and then once you button it up you just make it a regular macro or whatever. So interesting.
Pierson Marks (09:28.25)
Mm-hmm.
Pierson Marks (09:36.826)
Totally. Yeah, I agree. It's super cool. The powerful thing is just: can you give it a task with a high level description and let it fix itself, understand the errors? When I was asking it to write a blog post, it kept screwing some stuff up, and then it eventually got down to the right thing and wrote the blog post correctly based on the schema and...
Bilal Tahir (09:48.344)
Hmm.
Bilal Tahir (09:59.042)
Nice. And you said Claude Code, so this is all happening in the terminal for you. It's outputting the blog, and you're kind of going through the terminal, reading it and stuff, saying yes, no, et cetera.
Pierson Marks (10:05.082)
Mm-hmm.
Pierson Marks (10:12.698)
Yeah, more or less. It's running in the terminal, but I built a small repo of just configuration that Claude Code has access to. That's where you define the MCP tools and MCP servers, and the sub-agents are defined there. So it's just a basic repo of configuration, and I'll publish it so everybody at JellyPod will be able to have the same configuration. It has access to Linear for project
management, it has access to Replicate as an MCP, it has access to our blog. And it's running in the terminal, but it's not outputting the blog article for me to review right there, just because I didn't ask it to. It's adding it as a draft to our CMS. So it's cool.
Bilal Tahir (11:00.686)
Mmm, nice, nice. I love it. I love these kind of workflows. They're super powerful.
Pierson Marks (11:07.394)
It just makes me feel lazy sometimes, but I like being lazy. I mean, there's this Bill Gates quote from when he was in his early 20s at Microsoft, something like, I'll hire the laziest guy to get the job done, because he's going to find the most efficient way to do it. So that's great.
Bilal Tahir (11:11.096)
That's the point. We're supposed to automate our jobs.
Bilal Tahir (11:24.312)
Right.
Bilal Tahir (11:29.132)
Right. Yeah. No, for sure. 100%. That's awesome. Yeah, I've got to get into Claude Code myself. It's been on my radar and I'm just getting through Windsurf. I had Windsurf and I cancelled the plan, so now I'm down to Cursor, and I have Claude Code that I'm going to use next week. Because Cursor, and this is the tech drama, kind of rugged their users. I mean, I don't know if it's a rug, because they were just subsidizing
the requests so much. It was 500 requests, and now it's API-based, and at least from what I found, you go through the $20 in basically a few days instead of a month. It'll last you maybe a week, sometimes for some people just the first day, because of the heavy use. So it's going to be interesting to compare the API usage there versus Claude Code, where Claude Code Max is $200 with a rate limit, but you basically get a lot more
usage. So I feel like we're going to see a lot of people moving over to Claude Code because of this. I think that's an existential crisis for Cursor unless they get their pricing right. And the problem they have is they're beholden to Anthropic, and Anthropic is the one leading lab that refuses to decrease their prices. Every other lab is decreasing their prices; Anthropic is just sticking, because they understand that they're basically the code kings, and even GPT-5 does not touch
Pierson Marks (12:33.914)
Totally.
Bilal Tahir (12:56.672)
Opus and so on it and I'll get to GPT-5 in a second. There's an interesting twist there and unless Gemini is like the only other player I feel like that has hope and well actually I guess the Chinese model Deep Sea etc. do but Claude has figured out taste at least and I feel like unless other models that can replicate that you know they have that vote but yeah so I do hope they get competition because you know it's great for us consumers when there's more competition.
prices get lower, now, right now, Anthropic is kind of king.
Pierson Marks (13:30.212)
Yeah, it's interesting. I'm really curious how Cursor will look a year from now. Cursor's not an old company; it grew extremely quickly. I haven't thought about this much, so take this opinion with a grain of salt, but the incentives will always be different between a model lab and Cursor, because Cursor is trying to maximize profits, so they have to minimize token usage,
since they're calling APIs. A model lab doesn't have that; they want to maximize token usage. The more tokens you spend hitting Anthropic's APIs or OpenAI's, the better for them. The incentives are more aligned, because more tokens equals more thinking.
Bilal Tahir (14:15.448)
Yeah. Right. Well, I think that's true long-term. I feel like short-term, actually, they are aligned, because right now there's this huge demand and they just don't have enough supply, so they would actually rather you use them less, because they are very supply constrained. Like, OpenAI literally came up with a whole plan. They're like, our first priority will be our Plus users, then the free users, then the API users. And I was like, what the fuck? Which basically makes sense, because they're basically a consumer product company. I mean,
Pierson Marks (14:40.303)
Mm.
Bilal Tahir (14:44.514)
I don't know why people don't recognize this. They're a leading lab as well, but that's not their thing. Their thing is being the number one consumer product AI company, and that's their moat. They have the highest amount of MAUs or something, and no one's close to that. So they're prioritizing ChatGPT, which means people like us who use the API, yeah, we're kind of second-class citizens, and we should recognize that.
And so for now, I think that's where the leading labs and the application layer are aligned. But I agree with you, that's eventually going to diverge. But speaking of OpenAI, there's an interesting tech drama that happened this week. They launched GPT-5, and it was supposedly really good. They were hyping it up, they always hype it up, but it felt like for a hot minute that, wow, they're not just talking the talk, they're walking the walk, because the model was supposedly really good.
And I tested it the first couple of days. It was actually the first OpenAI model that I felt was good at coding in 18 months. I haven't been impressed by them in a good 18 months and I haven't really used them, but GPT-5 felt good. But then something interesting happened. The model got nerfed at some point and it got progressively worse. There was this whole thing, I don't know if you follow Theo, he's a YouTuber, a pretty famous YouTuber
Pierson Marks (15:58.009)
Really.
Bilal Tahir (16:04.97)
in the tech circles, and he was one of the first guys who was openly invited to their demo. He's in their hype video for GPT-5 and he's like, oh my God, this blows every other model out of the water, blah, blah. And then literally today he comes out with another video that says GPT-5 sucks. And it's because they nerfed the model. And I wonder what happened, because right now Windsurf and Cursor have free GPT-5 usage for the week, and maybe
they didn't anticipate the demand or something, so they silently nerfed the model underneath and they're serving a lower model. Because they also came up with this other update, the auto-router, which is used in ChatGPT and which I think is going to become a very crucial part of GPT-5 and beyond. What it does is automatically take the query and route it: okay, does this require the highest quality model, or does this require a lower quality model? And right now I think it's
focusing on the lower quality model. It goes back to the supply constraint. You would think, in a world where they had as many GPUs as possible, they would want you to use the highest quality model; it costs the most money to use, so better margins for them. But right now they would rather not. They want you to use mini or nano, et cetera, because they're compute constrained. So I think that's what happened. It's an interesting drama and there's a fallout from that.
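Nobody outside OpenAI knows exactly how their router works, but the basic idea is easy to sketch: grade the request first with a cheap model, then dispatch to a cheap or expensive model. Everything below, the model names, the grading prompt, and the tiers, is illustrative, not OpenAI's actual routing logic:

```ts
import OpenAI from "openai";

const client = new OpenAI(); // reads OPENAI_API_KEY from the environment

// Toy auto-router: ask a cheap model to grade the query, then send the real
// work to a cheap or expensive model depending on the grade.
async function routedAnswer(userQuery: string): Promise<string> {
  const grading = await client.responses.create({
    model: "gpt-5-nano",
    input: `Answer with exactly one word, "simple" or "hard".\nQuery: ${userQuery}`,
  });

  const tier = grading.output_text.toLowerCase().includes("hard") ? "gpt-5" : "gpt-5-mini";

  const answer = await client.responses.create({ model: tier, input: userQuery });
  return answer.output_text;
}

routedAnswer("Explain why the sky is blue in one sentence.").then(console.log);
```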
Pierson Marks (17:23.426)
soon.
Hmm.
Bilal Tahir (17:30.958)
We'll see what happens with GPT-5. I hope they give us the model that was promised and it gets better. So we'll see.
Pierson Marks (17:39.002)
Right. No, that's interesting. I haven't played around with it that much. I know you're doing a lot of latency testing to see how fast it is, but
Bilal Tahir (17:46.946)
Yeah, it's super slow right now. And it's partly because of the reasoning trace they do. Even on nano and mini, there's always reasoning. They have this new field, reasoning effort, and you can do low, medium, high, but then there's a special setting called minimal. The whole thing about minimal is that it's basically for when you wanted something like 4o: there's no reasoning, it just does the minimum amount and bypasses it. But even there, I found a good few seconds are spent on the reasoning step.
And I just hope they give us a non-reasoning one, which completely skips that, because right now, based on my testing, I can't use nano and mini. I can't use nano for sure; mini, okay, maybe there's some quality to it. But for nano, you absolutely want something that's almost like an API: you want something immediately, quickly. That's the use of nano, and I'm just not seeing it. So I hope they figure it out, because the pricing is very enticing.
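For context, the setting being described looks roughly like this through the API, a reasoning object with an effort level; treat the exact field shape as an assumption, since it differs between the Responses and Chat Completions endpoints:

```ts
import OpenAI from "openai";

const client = new OpenAI();

// "minimal" is the closest thing to a non-reasoning call on these models:
// it mostly skips the reasoning step, which is what you want for quick,
// API-style usage of nano or mini.
const response = await client.responses.create({
  model: "gpt-5-nano",
  reasoning: { effort: "minimal" }, // other values: "low", "medium", "high"
  input: "Extract the city name from: 'Flight to San Francisco departs at 9am.'",
});

console.log(response.output_text);
```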
Pierson Marks (18:39.194)
Mm-hmm. Right. Yeah.
Bilal Tahir (18:45.774)
They cut a lot of the pricing, and the caching especially is ridiculous. So I would love to use nano and mini; we would love to use them in our products as well. So yeah, hopefully they'll figure that out. But other models too. Apparently there's a new model called Horizon that is better than Opus in terms of front end. I don't know whose model it is, it could be Google or something, but that's what some of the testing shows. I haven't tested it, but
Pierson Marks (18:53.006)
Right. Yeah. Yeah.
Pierson Marks (19:05.033)
You think this is an Anthropic model or something? No.
Bilal Tahir (19:15.054)
Apparently the design is, like, this is the first model that's giving Anthropic a run for its money. That's what I meant by taste. We need other labs to catch up on taste, on Claude Sonnet. Anthropic has had a monopoly since 3.5, and I hope somebody catches up, because somehow nobody else has been able to figure out that really nice, tasteful front-end design that Opus and Sonnet do.
Pierson Marks (19:40.975)
Right, right, no, totally. That would be interesting, if Anthropic just decides to be the coding lab. I strongly believe that
the role of software engineer that existed five years ago will not exist five years from now. It's going to be much more of a product engineer kind of role, where you're not writing software, but you have to be able to recognize certain things. Like, I use Claude Code, I use Opus 4.1, it's great, but there are still things where it will go down these rabbit holes and just make design decisions, architectural software decisions, not UI decisions, but
Bilal Tahir (19:57.805)
Right.
Bilal Tahir (20:15.34)
Right. Right.
Pierson Marks (20:18.21)
I just have to be like, wait, wait, wait, something smells off about that code. And I'm like, is that really the right way to do it? And sometimes it'll say yes, and then I have to go out and research, because something smells about that code, and I figure out that, no, no, no, this is what you need to do. And then once I give that to Opus, it actually writes good code. And what makes software engineering hard
Bilal Tahir (20:21.901)
Right.
Bilal Tahir (20:30.03)
Hmm.
Pierson Marks (20:45.078)
is kind of having that expertise, having the past exposure to good practices and knowing when to use them. Because when you're an entry level engineer, when you're using a new library or whatever, you're just going to make mistakes and do things that aren't the best, just because you've never played around with it. And as time goes on, you learn. You have to go in and fix a bug, and when you're fixing that bug, you recognize, why was I writing all this weird logic
when I could just make this one API call? And that iterative process is the difference between a senior engineer and a junior engineer. It's that iteration and exposure, and
Bilal Tahir (21:24.994)
Right. Yeah, no, you're so right. Because the other thing is, as an engineer, we're humans, we're lazy. One of my objective functions is: how can I write the least amount of code possible to meet this requirement, right? And I feel like it's almost the opposite with the RL-trained models, because they're like, oh, how can I do more? It just keeps doing it. You've probably seen it with the agent:
Pierson Marks (21:39.534)
Right.
Bilal Tahir (21:51.02)
it'll do the first edit, it actually does it, and then it just keeps going. I almost want to stop it. Oftentimes I'm like, dude, you did it, just stop, please stop. It'll make the one-line code change, then it's like, let me make a test file. I'm like, no, no, no, I just wanted the one-line change, just stop. It's very annoying. So you're right, it almost feels like that should be part of the objective function. Yesterday I was doing this migration in our JellyPod code base, and all it needed was this one-line code change,
Pierson Marks (22:04.476)
Right.
Bilal Tahir (22:20.078)
because that field wasn't supported anymore. It just recreated the whole field, and I spent so much time, and ultimately I had to dig into the docs and finally found out, oh, they just renamed the field. It's a one-line code change. It took me, you know... yeah, exactly. I was like, God, I spent so many hours on that, it's kind of embarrassing. But that is the classic thing, the new version of when you spend hours
Pierson Marks (22:33.645)
All right. Was that the experimental transform or something? Yeah.
Bilal Tahir (22:48.366)
down one rabbit hole and you finally fix it, it's a simple chain. Now it's happening with LLMs, it just goes down the rabbit hole and you have to resist like, oh, just fix it. Okay, fine, I'll actually have to do something now. Now have to go down the rabbit hole with it. So.
Pierson Marks (23:02.075)
So yeah, yeah, I find it very satisfying too. It's like finding the balance between, okay.
The first step is to have the AI write the code. It writes the code pretty bad or good, you know, somewhere in between good and bad. And then, coming back to that same code later down the line and actually cleaning some stuff up, it's very satisfying. It's like spring cleaning your apartment. And it's like, wow, look at this: either the new AI can go back and fix the old AI's stuff, or I can come in and rewrite some of the things, because now I can just see the code
Bilal Tahir (23:25.794)
Yeah.
Pierson Marks (23:38.282)
differently. And for anybody watching this who's not necessarily a programmer or has never programmed:
when you look at art or watch a movie, it evokes emotion. It's the same thing with writing. When you read something, there's a subtle emotional response; it's not just about what it says, it's how it reads. How did your eyes glide across the page? Was it sharp? How was the sentence structured? Good writing is about so much more than just the words it says; it's how those words are put together. And it's the same in code.
Bilal Tahir (23:49.612)
Right. Right.
Bilal Tahir (24:14.796)
Hmm.
Pierson Marks (24:14.993)
And going through code and cleaning stuff up and tightening things up, it's the same satisfaction. Good coding is like good writing, and reading it too. It's an interesting feeling. It's cool. I like it.
Bilal Tahir (24:28.454)
I know, I 100% agree. I feel like, what you mentioned, the role is basically design thinking, almost from an architecture point of view. I feel like that's the last part of software engineering, and the most important one, which we'll still be doing, I think. It'll be interesting to see, once the models, if they ever do, and I think it's a when, not an if, get to the point where they can basically do that themselves
Pierson Marks (24:45.285)
Totally.
Bilal Tahir (24:55.404)
and there must be some sort of threshold where they're basically at human-level or superhuman-level coder ability. From there, it'll be interesting. Maybe we get to a point where every time the model upgrades, it goes back and refactors the code base on its own, and it just gets tighter and tighter and tighter or whatever.
Pierson Marks (25:12.837)
Totally.
Yeah, absolutely. I'm super excited for that self-improvement. But there's also the unknown question, because yes, there is the engineering side of programming, which is very objective, where there's a better way to do things. But there's also the non-objective side, the design and the trade-offs. A lot of engineering is trade-offs. You make a change over here, like upgrading to GPT-5 for, you know,
improved pricing, right? There's a latency trade-off, it's going to be a little bit slower. And then on product decisions, do you expose all this customization? The question we always talk about is, do we expose knobs and dials inside JellyPod to give more features to the power users? But
Bilal Tahir (25:55.822)
Hmm.
Pierson Marks (26:05.059)
if you do that for power users, it's going to make it more complicated for entry level users. And so it's like those things where there isn't a right or wrong choice. And it's like, what do you want your product to be? And so, yeah.
Bilal Tahir (26:08.333)
Hmm.
Bilal Tahir (26:14.636)
Right, yeah. Yeah. Hey, maybe when software is so cheap, we just make multiple versions of JellyPod, you know? I think I was tweeting about this, I think Guillermo started it. If you think about it, we have light mode and dark mode. That's a variation that got pretty popular because a certain population just loves dark mode. I'm not a
Pierson Marks (26:23.643)
All right.
Bilal Tahir (26:41.23)
big dark mode fan myself, but there are certain sites where I do like dark mode, like Twitter actually, where I always use the dim mode; I hate the light mode on that one. But that's a standard variation people make, and there are other ones. And I wonder, maybe we'll just have five to ten, maybe twenty, a hundred types of variations. Maybe there'll be a dense, compact, Bloomberg-style version of a site. Because on the other end of the spectrum, there's this whole concept of dynamic UI, where every site will look completely different
because it'll be generated on demand for each user. I personally don't think that'll be the case, because I think that's less about technical ability and more about human psychology. Most people don't want to think about how to lay out a website, but they love a dropdown selection. So maybe instead of light and dark mode, you'll have five or ten: minimal mode, retro mode, brutalist mode or something. Because light and dark mode is just colors, but
Pierson Marks (27:12.816)
Mm-hmm.
Bilal Tahir (27:38.24)
it could be a little more complicated. It could be the way the information is displayed or hidden, or an Apple mode where everything is clean and optimized, or having more buttons. So I actually think that'll be a thing, where the top sites at least have the engineering resources and the LLM resources to just have 20 different variations of their site. So it'll be interesting.
Pierson Marks (27:47.587)
or glass you see through.
Pierson Marks (27:56.573)
Alright, totally. I totally agree. Yeah, it goes down the whole rabbit hole of A/B testing, generative UI. I completely agree with you on that aspect. I think there's product decision and taste that goes into a UI. Imagine you're somebody who's struggling with
interacting with a webpage, and the guy who made the YouTube tutorial has a page that looks completely different when you go looking for help. It's like, well, that's not what it looks like for me, what do you mean? So there are all those interesting questions. Can you send me a screenshot of the UI you're seeing? Because in our code I see something completely different. What do you mean the button's not there?
Bilal Tahir (28:21.614)
Mm-hmm.
Bilal Tahir (28:28.28)
Right. Yeah, the surface area of complexity definitely increases, for sure. That's how it is.
Bilal Tahir (28:43.086)
Oh yeah. Yeah, I mean, it's the thing. One of the coolest side projects I remember, this guy built it on a weekend, solo dev, and it completely blew up. I don't want to say it, but he was making good money; I think he makes his living off it. It was a few years ago and he was having this problem. So what he did was, you know how on Chrome you can select iPhone XR, 12 Pro, and see how the screen looks on different devices? He was like, it's annoying, I have to toggle. What if I could just see all the screens at once, so I can see,
Pierson Marks (29:05.851)
Mm-hmm.
Bilal Tahir (29:12.27)
this is 1600 by 900, and so on. And he basically just built a wrapper that lets you do that, and it blew up completely. And I was like, ah, that's actually a cool product idea. Because yeah, as you're writing the code, you can see the exact changes on nine different screens, and you're like, oh, that broke mobile right here, and I can just fix that. I was like, oh, great idea. It's one of those simple hacks. Yeah. Yeah.
Pierson Marks (29:29.84)
Right. Totally. I wonder if people are doing that.
Because I see nowadays a lot of people are using Playwright to do basic integration tests, LLM-powered integration tests, walking through the main flows of the user journey. And instead of doing it just on desktop, do it on mobile, do it on tablet, do it on every browser, and just run a bunch of tests in parallel on every code change, where it's like, oh no, this broke on mobile, the button is now cut off, I can't click it, or I have to scroll over and click it or something.
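If you want the run-the-same-flow-on-every-screen-size setup described here, Playwright's project config is the usual way to do it. A minimal sketch, assuming the standard @playwright/test setup; the device profiles and the test are just examples:

```ts
// playwright.config.ts
import { defineConfig, devices } from "@playwright/test";

export default defineConfig({
  testDir: "./tests",
  projects: [
    // Every test in ./tests runs once per project, i.e. once per device profile.
    { name: "desktop-chrome", use: { ...devices["Desktop Chrome"] } },
    { name: "desktop-safari", use: { ...devices["Desktop Safari"] } },
    { name: "tablet", use: { ...devices["iPad (gen 7)"] } },
    { name: "mobile", use: { ...devices["iPhone 13"] } },
  ],
});
```

Then a single user-journey test, run with npx playwright test, gets exercised on every viewport:

```ts
// tests/smoke.spec.ts
import { test, expect } from "@playwright/test";

test("page renders and heading is visible", async ({ page }) => {
  await page.goto("https://example.com"); // placeholder URL
  await expect(page.getByRole("heading", { name: "Example Domain" })).toBeVisible();
});
```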
Bilal Tahir (29:57.933)
Right.
Pierson Marks (30:04.73)
Yeah, it's cool. We go down these rabbit holes all the time. Let's talk about movies, Hollywood. Let's talk about all these different talking head avatars. Yeah, let's do it.
Bilal Tahir (30:07.79)
Yeah, no, for sure. That's awesome. But I know you said at the start, this is a generative AI podcast. And so maybe we should talk a little bit.
Bilal Tahir (30:22.224)
Yeah, well, before that, just continuing from last episode, I want to touch on video generation, because I do think there's a huge critical moment going on right now, which is that the cost and quality you can get for generating a five second video has finally hit a threshold where you can truly do amazing stuff with it. It's, what is it, like 10 cents, 15 cents a video now, which is insane. It used to be a dollar, two dollars, but
Wan released a model which is open source, and Fal and Replicate host it. It's the Wan 2.2 version, and it's really good and very cheap, sometimes like five cents a video for 480p resolution, which is insane. If you think about it, that's basically image-level pricing. Then you have other video generation models coming out now that are basically at that level too, and I'm sure Kling's next version will be
even better, and hopefully they'll have a fast version as well. So just really exciting times in the video space. I would definitely play around with these tools, create videos and stuff, and try to come up with... whatever cool ideas you have in your head, try to put them out there.
Pierson Marks (31:39.183)
Right. What are you creating? I'm curious. Like, is there anything that, like, really excites you about, like, some certain genre of creation or, like, when price goes down, like, is there something that you're really excited about or what you're doing today?
Bilal Tahir (31:51.81)
For me, yeah, the biggest thing, maybe I should demo it, I have a UI for this, but I'm really excited for longer form, consistent character stories. Because the hard part about the five second videos is you do one, and then the next five second generation could look very different. You can control it by keeping the same art aesthetic part of the prompt, but it's still janky. Now there are ways around it, like you can use a consistent character.
You can use something like Kontext, or there's a new model, ByteDance's SeedEdit, et cetera, that will let you take an image and create a bunch of different images of the same character. So what a lot of people do is they'll create the images and then use one as the first frame of the video, so you can generate a consistent character story. And I've kind of put that flow into practice, and it kind of works, but
it's still hacky, right? Because sometimes, what if you want a five second video that doesn't have the main character in it? It's more of a B-roll, almost, because you want an immersive story, so it's not like the character is always in focus. Sometimes the character comes into focus on second two or three, and you start off with a different image. So that defeats the image-to-video approach. Now there are other flows, like reference-to-video, which Runway has, which I think is interesting, which will let you upload
Pierson Marks (33:06.204)
Hmm.
Bilal Tahir (33:15.116)
the character image, but that doesn't mean it's a starting frame. It just means it'll be the central figure in the five second video that's generated. And I actually think reference-to-video, reference-to-image is more powerful than image-to-video with a starting frame, because it doesn't have to start with that image, it just needs to include it. So it'll be interesting to see how this evolves. We'll definitely get cheaper and higher quality and better resemblance.
Pierson Marks (33:37.084)
interesting.
Bilal Tahir (33:44.938)
I think the bigger limit, at least in the next few months to maybe a couple of years, is the five second output. Ideally you'd just press a button and get two minute, five minute, ten minute long videos. I don't foresee that happening anytime soon, because it requires a ridiculous amount of compute. So in the meantime, it'll be: how do you hack together a consistent storyline? And hopefully we can do it with these image-to-video and reference-to-video
hacks, and it gets better.
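A rough sketch of the image-first flow described above, using the fal.ai client: edit a reference image into a new shot of the same character, then feed that frame into an image-to-video model. The endpoint IDs, input fields, and response shapes below are best-guess placeholders, so check the current model docs rather than treating this as exact:

```ts
import { fal } from "@fal-ai/client";

fal.config({ credentials: process.env.FAL_KEY });

// Step 1: derive a new shot of the same character from one reference image.
// Any image-editing model (Kontext, SeedEdit, etc.) could sit behind this endpoint.
const shot = await fal.subscribe("fal-ai/flux-pro/kontext", {
  input: {
    image_url: "https://example.com/character-reference.png",
    prompt: "same character, now standing on a rainy neon street at night, medium shot",
  },
});

// Step 2: use that image as the first frame of a short clip.
// Again, swap in whichever image-to-video model you prefer (Wan 2.2, Kling, etc.).
const clip = await fal.subscribe("fal-ai/wan/v2.2-5b/image-to-video", {
  input: {
    image_url: shot.data.images[0].url,
    prompt: "slow push-in, rain falling, cinematic lighting",
  },
});

console.log(clip.data.video.url);
```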
Pierson Marks (34:15.9)
Right, that's super interesting. So the reason why, the challenge around
image-to-video, is that if you want some character in later frames of that video, there's no way to do that besides something like Runway's new model, because if you give it the image, that'll just be the starting frame. So why couldn't you just clip off the first bit?
Bilal Tahir (34:44.846)
Yeah. Well, I want to add one other cool thing that only one model has as far as I know, which is MiniMax Hailuo 02. They have something called multi-shot. For most five second videos, if this is the starting frame, it'll just be this, and it'll add some stuff, but this stays the central frame. With multi-shot, I can add in the prompt that, you know, seconds zero to two it's him talking, and then at second three it's him flying in the sky.
And so that's one way to kind of shift the frame of the character. And I think MultiShot is gonna be a huge player in this flow as well. So.
Pierson Marks (35:16.795)
Hmm.
Pierson Marks (35:23.63)
Interesting. No, that's interesting. I saw something about Google Veo 3, and I wonder, do you know if other models have this same emerging capability, where they're like drawing? I don't know how you actually do this, because I thought Veo 3 was just, you give a prompt. But you draw on the screen, like, hey, there's a forest, and then you say there's a grizzly bear, and then you draw an arrow for the grizzly bear from left to right, and it'll move. So like
Bilal Tahir (35:36.932)
yeah, yeah, yeah.
Bilal Tahir (35:47.928)
Where it comes in, yeah, it's super interesting. They only have that in their Flow app, I guess, right? Yeah, it's super cool. I call that kind of workflow director mode. I think Runway has something like that too, where you can have super granular control over the image. And yeah, that's another really cool way to have more control and generate precise video outputs.
Pierson Marks (35:54.532)
Interesting.
Pierson Marks (36:08.826)
I
Pierson Marks (36:13.836)
And I remember a while ago, and maybe I'm wrong here because it was obviously version one of these older video gen models, but the first video model I ever saw that was actually pretty solid was Runway, because they had this in-painting feature. If you take a static image of, say, the ocean, you can paint on an area and
it kind of gave it motion; it was like a motion brush. And I remember that was pretty cool, because you could take an image, and if you want the leaves to rustle, you just draw around those leaves and it gives motion to the leaves. That was pretty cool. It reminds me of that.
Bilal Tahir (36:46.78)
Mm-hmm. Yeah.
Bilal Tahir (36:56.322)
Right. Yeah. Yeah. Runway's audience is definitely semi-professionals, professionals, like actors who want granular-level control. Sort of like how, for us at JellyPod, we give you granular-level control for podcasts; for them it's video, and their releases kind of cater to that. So their latest one is Runway Aleph, which is very interesting. It's basically video style transfer, but very precise. So I can literally give it this video and say, instead of the hat, have a beanie
on my head, and it'll do that, and it'll do it very precisely. I tried it with more complex movement and that's where it breaks, but for stiller videos and stuff, it's really good. Ray 2 Modify is another competitor which is similar. But I personally am waiting for, I have these 80s cartoons and stuff that I want to make live action out of. I actually think it'll be pretty fire. So I'm waiting for that to happen. Unfortunately,
Pierson Marks (37:51.406)
I remember you saying that.
Bilal Tahir (37:55.882)
anime to photorealism hasn't quite worked for me.
Pierson Marks (38:00.796)
So photorealism to anime kind of works a little bit, right?
Bilal Tahir (38:03.47)
Yeah, yeah, that's the more popular one. You create an anime version or a Ghibli version. But I think nobody thinks about the other side of it, you know? What if you take Thundercats and make a live action movie out of it, something like that?
Pierson Marks (38:11.225)
Right. Totally. I mean, and also, I don't know if this is actually Nickelodeon or whoever, but Avatar: The Last Airbender, they've made, I think, two live action versions now, which both kind of flopped. The animated, well, it's not even anime really, I don't know, but the animated version of that show is great. And then when they made the live action, nobody liked it. So.
Bilal Tahir (38:20.588)
Yeah.
Bilal Tahir (38:25.262)
Mm-hmm.
Bilal Tahir (38:35.072)
Right. Well, I think what will be super interesting with anime to photorealism is, if you think about it, in anime you can get away with so many ridiculous things. You can't have Dragon Ball Z photorealistic the way it is, because how are you going to show a planet being destroyed? I mean, you can, but the budget for that, actually having a good way to show that, is insane. But for anime, it's just
It's just drawings, right? Oh yeah, Goku has a spirit bomb, he throws it, the whole planet blows up, whatever. Well, he doesn't blow up the planet, obviously, the villain does, but whatever. My point is, with anime to photorealism, all you have to do is basically style transfer, and you get these ridiculous effects. So I actually think when we hit a threshold, you're gonna see some insane footage go viral on social media, where people are like, oh my God, this would have cost a billion dollars, Lord of the Rings-level effects.
Pierson Marks (39:26.747)
Totally.
Bilal Tahir (39:31.022)
So it's a little bit super interesting.
Pierson Marks (39:33.403)
I listened to a podcast, and I forget now who was on, I want to say somebody from Lucasfilm or somebody that's worked with Lucasfilm in the past, but they were talking about what the workflow actually is today that Hollywood goes through when producing CGI. And to break it down very simply, it's like, hey, the director, or whoever, wants a scene. Let's just talk about the Death Star explosion scene, because everyone kind of knows that one.
Okay, we have the Death Star and we want a laser, let's say the laser destroying Alderaan, and it explodes the planet like this, right? And what happens is you describe this, you go to the visual designers, they mock it up. They create these basic one-to-three versions of what that scene's going to look like, without all the lighting and everything perfect. And it's, hey, which one of these three do you like? They took us two months to do. And they're basic. And you say, I choose
Bilal Tahir (40:28.91)
Mm-hmm.
Pierson Marks (40:31.413)
option three, and that's all you have. There isn't this ability to quickly and cheaply iterate, like, no, honestly, I want it completely different. And it just
balloons your costs, because, okay, go back to the drawing board, there's rendering time, there's actual design time, and it's very expensive. But in a world of AI, you could have 50 different variations created in an afternoon based on the feedback. And you can actually have higher quality footage and video created by the same people who were going to do it anyway; they're just going to be able to create more, more quickly.
Bilal Tahir (41:06.616)
Right.
Bilal Tahir (41:11.213)
Right.
Yeah, absolutely, 100%. I mean, it blows my mind. I respect it too, but some scenes take up to a year of planning and stuff for a five, ten second scene. I'm like, what? How did they ever convince anyone, okay, we're going to spend $50 million on this ten second sequence? But it happens, it's crazy. I mean, and it's awesome that it happens. But now you can do that, like you said, at
Pierson Marks (41:25.178)
Bye.
Pierson Marks (41:35.387)
Totally.
Bilal Tahir (41:43.246)
a fraction of the cost, with the iterative feedback loop. And even then, if you mess it up, you can go back in post-processing. I mean, a lot of the magic already happens in post-processing, whatever it is, like 70% of the thing is the editing now, because they'll just shoot everything on green screen and then add it. And now with AI you can do a lot more. If there was a mess-up, a coffee cup or whatever left in the scene,
instead of manually going through frame by frame, you can just have it wiped in a few seconds.
Pierson Marks (42:17.027)
No, it makes the whole movie industry competitive again. Because, I mean, I want to go see Fantastic Four, it's whatever reboot, the fifth or whatever, with Pedro Pascal and everybody. I'm a Marvel fan too, or at least I was until Endgame, and then all this stuff after that just kind of, you know. But
Bilal Tahir (42:33.565)
Yeah, it's going to pull it off.
Pierson Marks (42:35.739)
I'll go see this, and the budget of the movie was huge. The box office was also huge. But to make any sort of meaningful return, percentage-wise, on that investment, if you have a budget of 250 million dollars, and I don't know what it was for Fantastic Four, you have to have a killer box office release. Because you're not making money on streaming now; it's going to go to Disney Plus, and it's just a completely different financial model over on streaming. And
Bilal Tahir (42:56.29)
Right.
Pierson Marks (43:05.693)
It's finally going to make small studios competitive, where you could have box office hits from movies with budgets in the one to ten million dollar range, which is going to be awesome. Because you could have a ten million dollar budget and a movie that does 25 million dollars at the box office, and that's going to be an awesome return, because that same return would take almost a billion dollars at the box office if you had a 250 million dollar
Bilal Tahir (43:18.318)
Mm-hmm.
Pierson Marks (43:35.566)
budget and
It's going to make smaller teams more competitive. And then something we're also not talking about is audiences, global audiences. I think one of the first big global phenomena in a long time was Squid Game. When Squid Game came out, it took the world by storm. It was a Korean show. Americans don't like subtitles or dubbing at all, because everything's in English, American English usually; other countries will dub things into their native languages, so they're more familiar with it.
Bilal Tahir (43:56.654)
Mm-hmm.
Bilal Tahir (44:02.915)
Right.
Pierson Marks (44:08.605)
But for the American audience to have a Korean show be the number one talked-about show is really awesome. Still, it wasn't great: the voices were okay, the lip-sync dubbing was bad. We're going to have universal movies. The actor is going to be the same, but the lips are going to be perfect; he's going to be able to speak every language. You can localize the films a little bit better. For example, in the UK a driver drives on the left-hand side of the street, in the United States the
right-hand side. Those types of subtleties would have been completely out of the picture; you're not going to reshoot a scene because the driver should be on the left for the UK audience and on the right for most of the world. But those small details, which were formerly prohibitively expensive, you could do in post-processing: hey, just put them on the right-hand side, and with AI, just move it over to the left and flip these small localization things.
Bilal Tahir (44:38.732)
Bye.
Bilal Tahir (44:51.48)
Right.
Bilal Tahir (45:02.872)
That's so interesting. I didn't even think about that, but that's so cool. That's like localization of content. Right now all we do is maybe dubbing, so you have a Spanish dub. But what if, for a Spanish audience, culturally this joke is sensitive there or something, or the way the character
Pierson Marks (45:08.858)
Right.
Pierson Marks (45:14.139)
All right.
Bilal Tahir (45:24.204)
you know, if he's a romantic, this is how he expresses himself in this culture, versus he's a bit more reserved in this one. So you can tweak that. It's a very fascinating concept, having those nuances.
Pierson Marks (45:31.237)
Totally. Yeah. And even logos, because product placement is a big thing in movies, and this one tea brand might be better, or one soda brand is better in these markets than in other ones, and you don't have to re-film, you know.
Bilal Tahir (45:38.303)
yeah, that's
Bilal Tahir (45:45.278)
Oh my God. I can totally see it: you have the James Bond clip and it changes. And they can buy ad spots live, yeah, for the next 20,000 streams: if you pay this much money, we're going to insert your ad in this particular scene. Oh my God. That's actually a very fascinating point. I didn't think about it, but everyone hates ads, you know, it's like a universal thing.
Pierson Marks (46:02.8)
Totally, you're watching Netflix and the product that's placed there is different. Yeah, that's interesting.
Bilal Tahir (46:14.306)
But, you know, we've talked about this concept of native ads in social media, where it'll look like a Facebook post or an Instagram post, but it's actually an ad. It's called a native ad because it's basically organic, quote unquote, right? Influencer shoutouts are kind of similar. What if, instead of watching a show and suddenly getting a 30 second ad, the ad is actually well placed in the content itself? So you don't feel it's an ad, but it's an ad, and it doesn't disrupt you
in your flow. You still got, like, the character drank some Coca-Cola, so you got the Coca-Cola ad, but you didn't have to stop the scene or skip over it. So I actually think that's very powerful. Yeah.
Pierson Marks (46:46.875)
Totally.
Pierson Marks (46:52.507)
I completely agree with you.
And I think we talked about this on one of the old episodes. I am very, very bullish on Meta, and Google too, because they also have YouTube. But with Meta, for example, over the last few years there's been this explosion of UGC content, where you hire some influencer to promote your company and you run ads with that content, so it looks kind of like a real post of somebody talking about some product or whatever. And in Meta's world,
it will be possible to just generate unique influencers that are specific to everybody. Let's say I really like this type of person, I resonate, I look similar; maybe it's another white dude with blue eyes from San Francisco, and the ad that I see for the same product is going to be set in San Francisco and look similar to me, because they have all my data and everything. And then for you it will be slightly different, and for other people around the world too. It's promoting the
same product, but depending on who the viewer is, the ads are going to look different, and they're going to be generated on demand. And I could see that: rather than generative UI, generative ads, where you can really highly optimize and hyper-localize ads. Same with product placement too.
Bilal Tahir (48:08.92)
Yes. yeah. It's happening. Yeah, maybe we'll take some of that and add it. I know in JellyPod, people have wanted to add ads in their podcasts and stuff. There's probably some interesting ways we can do that for you guys. So we'll see.
Pierson Marks (48:19.258)
All right.
Pierson Marks (48:24.09)
Totally, totally. Well, I know. And we still have the battle of the talking heads and things we didn't get to, and editing images: Nano Banana, Flux Kontext, Ideogram Character, ByteDance SeedEdit, right? And staggle, OmniHuman, Pika, Hedra. Yeah, let's kick it down the road.
Bilal Tahir (48:40.61)
Yeah. Yeah. I wonder if we should kick that to the next episode, because I guess we're over. But it's a teaser: we'll talk about image editing, a lot of cool stuff happening there, hopefully next week. So, you know.
Pierson Marks (48:52.602)
A lot of cool stuff. And sometimes it kind of slows down too. So a lot of these things, some models are new, some models just exist, and whether we talk about them today or next week, it's still cutting edge. And it's kind of nice that it feels like a slightly slower pace right now. Not everything, but that's just how I feel. I don't know.
Bilal Tahir (49:13.452)
I mean, it's like slowly, then quickly. I feel like there's a lot of cool stuff happening under the surface. Sometimes it takes a while too: a model is out there, and then somebody uses it in a very interesting way, and you're like, wow, I can't believe you could do that. So I feel like there's all this stuff. Actually, I'm going to make a prediction. I feel like Gemini 3, when it comes out, hopefully in the next couple of weeks, will blow GPT-5 out of the water. We'll see. But I'm actually very excited for that.
Pierson Marks (49:17.231)
Alright.
Pierson Marks (49:38.01)
All right. Let's see the Polymarket.
Bilal Tahir (49:40.182)
And hopefully on the generative front, they actually package that up with some Veo 3 or Imagen updates. I mean, they probably won't, but that would be interesting.
Pierson Marks (49:49.186)
Right. Well, episode nine of Creative Flux, what a good one, as always, going down random rabbit holes, yeah. Okay. Yeah, basically, we had five bullet points, so we talked about everything; half the stuff wasn't on there, but it's great. Cool. So we'll talk next week. See you later.
Bilal Tahir (49:56.834)
Yeah, as always, going down random tangents, you know. We try to keep it loose, you know.
Bilal Tahir (50:12.94)
Yeah, awesome. All right, bye guys.
