Google's Week: Gemini 3 & Nano Banana 2

Bilal Tahir (00:00)
Imagine you could actually just change your voice, like in real time, and just interact with people in a meeting. I think, you know, there are so many different dimensions to interaction.

Pierson Marks (00:08)
I know we talked about video games a little bit last week, but I would hop on a party with my friends on Xbox Live and we'd just be chatting. But I wonder if now you could see it, like you're playing Call of Duty or FIFA or something, and rather than just being audio, you could have the video call on the left-hand side and see everybody's face, maybe with a camera from your TV or something. And then

Just like how Fortnite and everybody have these skins that people pay for. I wonder if Xbox or Microsoft or Sony will do skins but for your live streams so that you could kind of do it in real time, like the filters, change yourself into Spider-Man or Darth Vader or something.

Bilal Tahir (00:32)
Mm.

Yeah, all

the IP. It's interesting. I mean, it's coming, definitely. I remember, I think Disney was just signing some sort of deal with OpenAI or whatever, because they just realized they were going to lose this IP battle. Everyone's just making their own characters using their content library anyway, so you might as well get paid for it. So all these studios will just make a deal, get a little cut, rather than trying to restrict it. I think the genie's out of the bottle here for most of this.

Pierson Marks (01:08)

Totally. I know. And this is something I wanted to talk about too, about Disney Plus and AI characters. But let's pause on that, let's come back to that later. So if anybody is watching or listening to us for the first time, I'm Pierson, this is Bilal, and we talk about generative media every week. This week was pretty big, I think. Some weeks are bigger than others. Today it's a Google week.

Bilal Tahir (01:16)
Right. Yeah.

app the Google is the Google yeah

Pierson Marks (01:31)
I'm super excited

for next year, 2026. I think that this is going to be the explosion. If you've been listening to Creative Flux, get ready. I think 2026 is going to be huge in terms of generative media.

Bilal Tahir (01:43)
Yes, 100%. Yeah, no, I'm

so excited. And I think Google will definitely be leading. I think it's becoming more apparent to people, you know, their TPUs and the whole vertically integrated stack they've built just give them such an advantage. Obviously YouTube, you know, there are so many data and distribution advantages as well. So yeah, it's hard to imagine them not being one of the top winners. But stepping back, let's talk about Google.

Pierson Marks (01:53)
Sorry.

Bilal Tahir (02:11)
Google had this launch week, they launched a bunch of things, but the two big ones: first was Gemini 3, which launched on Tuesday. And they'd been hyping it up, using OpenAI's vague-posting playbook a bit, which was annoying, you know. I mean, they came out with something that was amazing, so I guess they can do that. But basically Gemini 3 broke all the benchmarks.

Pierson Marks (02:16)
Mm-hmm.

Bilal Tahir (02:31)
really shockingly amazing reasoning abilities across the board. A lot of people who use LLMs have basically come out and said this is a step change as big as GPT-3.5 to 4 or something like that, which I can't speak to personally yet. I did use Gemini 3 for coding and it was pretty good. It does go off and ramble a bit, which is kind of a first-world problem, but it's hard to

really see the difference, because all the models are so good. You need way more data to really see the gaps. But on some of the most advanced benchmarks, like Humanity's Last Exam, et cetera, the scores it put up were crazy. It's funny, OpenAI always tries to do this, which I think is kind of bad faith, honestly. I get it, but they know

Pierson Marks (03:03)
Mm-hmm.

Bilal Tahir (03:24)
when Google's gonna release, and they try to steal their thunder by releasing something themselves. So first they did a very quiet GPT-5.1 update, which nobody really saw. People were like, wait, what, GPT-5.1 got updated? It came out of nowhere. Apparently it's better, but not by much. They did a bigger one today. But Google's second launch was Nano Banana 2, which used to be called Gemini 2.5 Flash Image, but now I guess

they just call it Nano Banana.

Pierson Marks (03:54)
Well, so also, I mean, you skipped

also, so like OpenAI released 5.1, Google came out with Gemini 3, then OpenAI came out with 5.1 Pro.

Bilal Tahir (04:02)
Yes, so that's what I was saying. So today they came out with, or are we talking about Codex Max? Because that's what they came out with today. Or I guess 5.1 Pro is different. Oh yeah, Codex Max is.

Pierson Marks (04:10)
I think they're different. I'm pretty sure they were different. Codex Max too.

Bilal Tahir (04:15)
I think that is actually Pro. I think, yeah, I think it's just their high-thinking Pro version, which, you know, makes sense. And yeah, that's supposed to be really good as well. Apparently it has way more reasoning. I personally have been using Codex a lot and it's definitely better than Claude Code. It just takes so long to think, which is annoying, but it is really good, I will say that.

Pierson Marks (04:20)
Gotcha.

Right.

I think the thing about using these models in production, writing code, I know most of you listening to this maybe aren't programmers, but one of the most difficult tasks in writing code and building things is: how do you break down a large task into subtasks that are tightly contained and well specified?

As you get into this world of agentic programming, it's much less linear, because how a programmer works today is, you have this big idea and you work linearly. You write some code, write some functions, and they all build upon each other in a stack. But in a world of agentic programming, where you essentially have some number of agents programming on your behalf, it's a new skill: you have to step back much more,

think about things at a higher level. And to take the most advantage of these models, you parallelize multiple agents who do things across multiple tightly contained, scoped

features. And so this is always what it's been like for senior engineers, who don't actually write the code as much but are more writing the architecture and then breaking the architecture up into pieces. And so it's very cool, because now people at lower levels, like entry-level engineers and vibe coders, get to experience how to think like a systems architect

rather than a programmer: breaking things up into tightly scoped, contained objects that agents can go and think about for 10 minutes, because you don't need immediate feedback. You just shoot one off, let it go, shoot the next one off, let it go, and then you come back and you're like...
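That fan-out workflow, scope a few independent subtasks, dispatch them all, come back for the results, can be sketched in a few lines of Python. The agent here is a hypothetical stand-in, not a real coding agent, and the task strings are made up:

```python
from concurrent.futures import ThreadPoolExecutor

def run_agent(task: str) -> str:
    # Stand-in for a coding agent: takes one tightly scoped task
    # description and "works" on it independently.
    return f"done: {task}"

# Break the big feature into tightly scoped, independent subtasks.
tasks = [
    "add the generate-card endpoint",
    "write the card-template renderer",
    "wire up the payment check",
]

# Fire each one off in parallel, let them run, then collect results.
with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
    results = list(pool.map(run_agent, tasks))

for r in results:
    print(r)
```

`map` preserves input order, so even though the agents run concurrently, the results come back lined up with the subtasks you scoped.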

Bilal Tahir (05:53)
Hmm.

Right.

Yeah, and that

kind of workflow is just going to be the common one. Apparently with Codex Max they let it run for 24 hours and it kind of just did it. Yeah. We're going to see people going bankrupt. It'll be the new AWS bill horror story. Like, my agent just came out with the number 42 after thinking for a year.

Pierson Marks (06:16)
You could do up to 100 agents, I think. I think 100 was the soft limit.

Yeah.

Yes.

Bilal Tahir (06:32)
But no, so that's OpenAI. But Gemini 3 Pro, really great model. And then they also released the other highly anticipated thing, Nano Banana 2. So really quickly: what is Nano Banana? Nano Banana is Google's image model. But it's interesting, because the way they...

I don't know the details of the architecture, but the way it works is it pairs with the LLM. That's why it's so good: it basically has an LLM that powers it under the hood, and then I guess it has a VLM encoder layer or decoder layer or whatever that actually generates the image. But the cool thing about having an LLM is that it actually takes the prompt and has a more nuanced understanding of what you mean. A lot of image generators are very dumb in that sense; they just take the prompt literally and generate that. So Nano Banana, the first one, was paired with Gemini 2.5 Flash.

So it was a flash model, and that's why it was so fast and very cheap. On fal it's like 4 cents, $0.039 an image. And if you go on AI Studio, honestly, you can just use it for free, and it's pretty good.

Pierson Marks (07:31)
on AI Studio.

Bilal Tahir (07:32)
Yeah, yeah, AI Studio, they just let you use it for free for a while. And I imagine the rates, if you use paid API keys, are even lower. So it was very fast. And the new one, the Nano Banana 2 they just introduced, is paired with Gemini 3 Pro. I'm kind of surprised they didn't launch a flash version, because I think that's what makes the difference in processing power. But because it's Pro,

it's two things. First, it's just way better, and we'll talk about the abilities. But the price is also a lot more: it's like 15 cents an image, up from four. So that's almost a 4x. Yeah. But, you know, I guess they're banking on the fact that it's so good people will pay. And honestly, it's amazing. So we were talking about some of the abilities. It basically solves text rendering. People can literally take a PDF and it'll just create an infographic out of it, or a

Pierson Marks (08:01)
Wow, it's like over three times increase.

Bilal Tahir (08:20)
picture book, or just... Some guy, I think it was actually one of the engineers at Replicate, took Replicate's blog post, because Replicate just got acquired by Cloudflare, which we'll talk about in a second, and made a New York Times article out of it using Nano Banana. And boom, the article was right there in the image.

Pierson Marks (08:37)
Really? So all the text was

good? All of it? Like you could read the whole thing and it just was perfect?

Bilal Tahir (08:42)
Some

paragraphs made sense. It wasn't like, you know, you get almost a 13th-century hieroglyphic type of thing. It wasn't like that, it was actual words. So yeah: text rendering ability, next level. Placement, next level. Taking objects and putting them in, taking your logo and putting it in a different setting, style transfer, infographics. I think infographics are gonna be huge. We were talking about that, just taking any concept and making infographics or diagrams out of it, you know, that would be cool.

Sick, maybe we'll use them in JellyPod, who knows, right? So lots of cool stuff that, I mean, we haven't even gone into, because it just released today. But exciting stuff, you know.

Pierson Marks (09:23)
Do you

know, so like, I haven't looked, but like what's the average generation time? Like is it one second, 30 seconds?

Bilal Tahir (09:30)
I

that's a good call. Honestly, I have no idea. I'm guessing it'll be like a few seconds, because that's one of the reasons it's expensive. Let's see, Gemini image edit... they also have a pro image preview. Let me see if I run this.

Pierson Marks (09:45)
Let's do a little test run right here and see how long it takes. Yeah, because fal has a text-to-image endpoint and an image-editing endpoint. Image editing is you pass in an image and a prompt, and it gives you a new image. And text-to-image is just passing in a prompt and getting an image out.
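As a rough illustration of the difference between those two endpoints, here are the two request shapes side by side. The endpoint and field names are assumptions for the sketch, not fal's or Google's actual API schema:

```python
def text_to_image_request(prompt: str) -> dict:
    # Text-to-image: a prompt goes in, an image comes out.
    return {"endpoint": "text-to-image", "prompt": prompt}

def image_edit_request(prompt: str, image_url: str) -> dict:
    # Image editing: an existing image plus a prompt go in,
    # a new (edited) image comes out.
    return {"endpoint": "image-edit", "prompt": prompt, "image_url": image_url}

t2i = text_to_image_request("a pixel-art portrait")
edit = image_edit_request("make it retro", "https://example.com/photo.png")
print(t2i)
print(edit)
```

The only structural difference is the extra source image on the edit call; everything else about the two requests looks the same.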

Bilal Tahir (09:49)
full

and

anything. ⁓

Yeah, so

that took about 15, I'm gonna say 15 seconds. Still way less than GPT Image, which is just annoyingly slow. But yeah, that's why I guess it's expensive: it really thinks. But again, coming back, this is why I'm surprised they haven't released a flash version of it. Because I was expecting them to do a Nano Banana 2 Pro and then a 2 distilled, because the flash model is basically a distilled pro model. But...

Pierson Marks (10:10)
Right.

Bilal Tahir (10:29)
I'm sure when Flash 3 comes out, it'll be way better than basically 2.5 Pro. So we'd basically get a 2.5-Pro-level Nano Banana for flash prices. So I assume the reason they haven't done it yet is either they're staggering the announcements, or they actually didn't have time to distill the model yet. And so when Flash 3 does come out, they'll probably do a Nano Banana announcement with that, hopefully.

Pierson Marks (10:37)
Right.

Yeah, totally, that

makes sense. It's really cool. I remember on Nano Banana, the first one, I mean, paired with Gemini 2.5 Flash, some of the things it was able to do was identify elements in images. So imagine you passed an image into the edit model. You passed in an image of some, not necessarily famous, but a building, right? And you say, hey, here's this building,

Bilal Tahir (11:13)
Right.

Pierson Marks (11:15)
and then: annotate this building with all the different architectural elements. And so Nano Banana was able to take that building image and actually highlight, like, here's an arch, this is a Roman arch.

Bilal Tahir (11:29)
I

don't know.

Pierson Marks (11:30)
It

also asked it, where is this building located? So it actually had a spatial awareness of what building this was. It correctly identified that it was in San Francisco. It annotated the architectural elements and highlighted them. And that's something a pure image model doesn't know. It doesn't have that knowledge of

Bilal Tahir (11:42)
wow.

Right, right, yeah.

Pierson Marks (11:52)
what it is. It's more like, here's the thing, but it doesn't really know what it is. But now it can, and that's really cool, especially for so many applications when you think about it. So you can see all these Google products coming together. You have the Genie 3 world model, you know, which has spatial understanding, and that's amazing. I wonder if there was any co-working between Genie 3 and Nano Banana,

Bilal Tahir (11:54)
Mm-hmm.

Yep.

Yeah.

Pierson Marks (12:17)
and then you have Gemini,

and then you have, what's their video model? Veo 3.

Bilal Tahir (12:22)
Veo 3. Yeah,

I was hoping they would do a Veo 4 launch or something. Maybe they will tomorrow, I don't know. The week is still...

Pierson Marks (12:25)
All right.

And just think

about this. I know you said you were at this event last night or two nights ago about OpenAI and Google. I mean, Google literally has YouTube. They have Google, the internet. They have all this content. And they also have Google Earth. They have Waymo. Waymos are literally driving around taking in data.

Bilal Tahir (12:34)
Right. Right.

They have been basically right. Yeah. Right.

Yeah, robotics, they have the

best LLM, they have the best image generator, and they have the best video generator. I mean, I guess you could make a case for Sora 2, but it's really small differences. They're up there.

Pierson Marks (12:58)
Yeah.

At the end of

the day, if you have robust world knowledge, you're also able to enhance that world knowledge with fantasy storytelling. That was a problem in the early image models: there wasn't enough data on things that would just never happen in real life, so it was really hard to generate them, because there was no training data for that. But now you can make a sandcastle made out of bananas. It's like, that doesn't make sense, but you can do that.

Bilal Tahir (13:20)
Mm-hmm.

Mm-hmm.

Pierson Marks (13:29)
No, it's cool. I want to play around with it, and I wish it was cheaper, but I think the distilled model, when it comes out, probably will be.

Bilal Tahir (13:35)
Yeah, and

there are other options too. I feel like this is where we're going to get into the spectrum where you generate cheaper images first and then.

when you want the highest quality, you can opt for Nano Banana. The other thing, it's funny, because when we think of Nano Banana, most people think of the editing abilities, but actually just the text-to-image model is really good too. PJ put out a post, which was very interesting. There's a new movie coming out, The Legend of Zelda, I think, the live action, which, I don't know why they do live action of this stuff, but fine.

But he took images of that and made these amazing 4K cinematic images out of The Legend of Zelda, and it was so good. So just the raw image abilities are really good too. I'm sure this will become one of the top models just for producing images, not just editing them. And I'm really curious how much of that is the LLM

versus just the image model. Because I don't understand in depth what's underneath the hood: how much is just having the LLM there, making the perfect prompt, and then the raw image model. Because I do feel, and I feel guilty about this, like even image models six months ago were probably capable of producing the stuff we produce now. The thing was, you just needed the right prompt for it. So it's almost like digging for gold.

Pierson Marks (14:53)
Mm-hmm.

Bilal Tahir (14:55)
The gold is just surfacing up so you don't have to dig as deep, but it's there. It was always kind of there.

Pierson Marks (14:58)
Right,

Totally, totally. Yeah, that's the most difficult part, because how do you express what you're intending to do? The AI should be able to take your messy prompt and make some good assumptions. If you're building an application today, you have to assume that every single one of your users is going to be the most lazy person when they're prompting. They're not going to spend the time, you know,

Bilal Tahir (15:07)
Yeah.

Pierson Marks (15:26)
writing everything out in depth; most people won't. So you have to accelerate that process by making assumptions about what they're trying to do. Like, if they're trying to create an image and the prompt reads like it's probably meant to be realistic, you should make it realistic. Don't go off and make it watercolor, anime style, you know. Make some assumptions. And I think that's what the LLM-backed models do much better now.

Bilal Tahir (15:48)
Yeah, yeah.

And that's where stuff like LoRAs comes in. The cool thing about open-source image models, like Qwen, right, is people will take them, fine-tune them, and bake in this amazing prompt. Like, I know you put one down for the multiple angles. This has traditionally been very hard; I'm curious how Nano Banana 2 handles it. But a lot of times, if you just want a different angle, especially from up top, for some reason a lot of image editors fail at giving you a top-down angle of

Pierson Marks (16:10)
Right.

Bilal Tahir (16:18)
characters. So someone just fine-tuned the Qwen image model and now you can just get it. So we'll probably have these kinds of little fine-tunes as well under the hood.

Pierson Marks (16:26)
Right. Right.

And think about it, for people listening, I mean, the Qwen image model. So a LoRA is essentially, it's an acronym, low-rank adaptation, but it's a lightweight fine-tune. You give a model some new data and it adjusts the weights so that it can do something differently and more consistently, without as much prompting. And this LoRA here was a Qwen multiple-angles LoRA.
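For the curious, the low-rank trick itself is just arithmetic. This toy sketch, plain Python rather than a real training setup, shows a rank-1 update on a 4x4 weight: the LoRA trains 8 numbers instead of 16, and the adapted weight is W plus B times A:

```python
# Toy illustration of the low-rank idea behind a LoRA: instead of
# fine-tuning a full d x d weight matrix W, you train two skinny
# matrices B (d x r) and A (r x d) with r << d, and use W + B @ A.
# Real LoRAs do this inside a neural network; this is just the math.

d, r = 4, 1  # 4x4 weight, rank-1 update

W = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]  # frozen base weight
B = [[0.5] for _ in range(d)]   # d x r, trained
A = [[0.1, 0.2, 0.3, 0.4]]      # r x d, trained

# Adapted weight: W' = W + B @ A
W_adapted = [
    [W[i][j] + sum(B[i][k] * A[k][j] for k in range(r)) for j in range(d)]
    for i in range(d)
]

full_params = d * d          # parameters to fine-tune the whole matrix
lora_params = d * r + r * d  # parameters the LoRA actually trains
print(full_params, lora_params)  # 16 8
```

On real models, d is in the thousands and r is maybe 8 or 16, which is why a LoRA file is tiny compared to the base model.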

Bilal Tahir (16:42)
Mm-hmm.

Yep.

Pierson Marks (16:55)
It was really cool, because it was this girl sitting on the floor with a bunch of records around her. She was sitting there, her hand on a record player. There were records on the ground, covers on the ground, and this floating headphone. And the first image was that, shot right in front of her: you see the records on the ground, the floating headphone, her. And then, using this LoRA, they were able to pan the camera, almost, so that it took a shot from

up top, took a shot from the left, took a shot zoomed in, and everything stayed the same. Going from having the camera in front of her, seeing the Bob Marley cover art on the ground, to the same scene with the camera shooting from the top is very hard. And making sure that each individual record never moved, that the Bob Marley cover art is still next to the Led Zeppelin cover art,

and the floating headphones are still there, not suddenly on the other side. It's spatial awareness. And that comes back to the whole world-model, spatial awareness thing. If the model is intelligent, it kind of understands, hey, these things should stay in the same positions; there's no reason they should move. So it's interesting.

Bilal Tahir (17:42)
Yep.

I'm not.

Yeah.

No, a hundred percent. And then you can pair that with a video model, like Veo with your first and last frame, and you can get these awesome shots going from one angle to the other. So it's almost like getting that camera feel, like that famous Severance shot of the main guy, where the camera zooms up and down. That was such a cool shot, right? And we can kind of create those shots now. So, kind of to your point, it

Pierson Marks (18:15)
You're gonna see this so much.

Yes. Yes, but the zoom, yes, Yeah, yeah, yeah, yeah.

Tomorrow night.

Bilal Tahir (18:31)
adds more depth, and you can see that there's a whole world around the character; it's not just a static 2D image.

Pierson Marks (18:38)
Right?

It's super, super cool. And this is just going to enable so much more creativity in industries like Hollywood, where they won't have to re-film. Like, you have one scene and think, man, I wish the angle was a little different; we want to cut over to a shot where the camera is now on top. And you don't have to re-film, which is great for everybody. The movie comes out faster, you don't waste actors' time. It's just like, okay, cool.

Bilal Tahir (18:44)
Yeah.

100%. Yeah. Yeah. Definitely.

Pierson Marks (19:06)
You know, better stuff.

Bilal Tahir (19:07)
Yeah, the other thing

I mean, which is cool about Nano Banana, which I almost forgot: it actually generates multiple images. Most image models just do one image. With Nano Banana you can basically say, give me a 10-image panel of a story, and it'll just do it. And because it's one-shotting it, the consistency is really good. So we talked about editing and making that consistent, but, from first principles, rather than taking an image and editing the next shot, just give me the whole story.

Pierson Marks (19:21)
Mm.

Bilal Tahir (19:36)
I mean, and it can do that pretty well. I remember the original Nano Banana did a great job. I'm really curious to see how this goes with 3 Pro. I imagine it'll be way better, because its understanding is better, right? You basically have the state-of-the-art reasoning model behind this image generator now. So you should be able to get really nuanced, complicated storylines across multiple images that are consistent, but also make sense for where the story goes. So I think this is going to be

Pierson Marks (19:59)
Totally.

Bilal Tahir (20:03)
really cool. I'm excited. I'm going to try that comic book art and stuff on it and see if it can handle the panels.

Pierson Marks (20:06)
You should try that. You should do that and then see like,

you know, I know you did something in the past. I want to touch on this in a second too, about the crypto, the x402 stuff, because I did a project last weekend. But I mean, if you do the comic book thing and use Nano Banana to generate four comic book scenes, then you can kind of have a pay-to-continue-the-story thing. You do four comic book frames, and it's a cliffhanger at the end. You want to know what happens next?

Bilal Tahir (20:17)
Mm-hmm. Yeah.

yeah, nice.

Pierson Marks (20:34)
Pay me five cents and then you generate the next four panels.

Bilal Tahir (20:36)
⁓ and that's

the ultimate micro-economy, where for every comic panel you pay a cent. It's like,

Pierson Marks (20:42)
Yeah, yeah, yeah, yeah, because

last weekend, I know you saw this, I posted it on Twitter and open-sourced the project as well. So last week, or the other week, if you're watching this, Bilal and I were at Vercel's Ship AI conference and the Next.js Conf. And if you're watching the video, they had a photo booth. You'd go in the photo booth and it took a photo of you, but rather than printing the photo directly, it took that photo and pixelated it.

And then they printed out the pixelated version of you, so you're retro. Like, that's me in retro pixel version. And they printed those out. And I was inspired, because it looks like a Pokemon card: it has the stuff on the top, like a playing card. And I was like, okay, cool, let me see if I can build something that lets me create a Pokemon card generator. And so I used Nano Banana to...

Bilal Tahir (21:11)
you

Nice.

Right.

Pierson Marks (21:36)
I uploaded a photo of myself. It used Nano Banana to take that photo and convert it into an 8-bit watercolor Pokemon-style image. Then I took that Nano Banana-generated image and inserted it into a template that I hard-coded, not AI-generated; I wrote a template that looked like a Pokemon card. So it was gold on the outside, it had a block of text underneath, it has a slot for my name,

Bilal Tahir (21:52)
Mm-hmm.

Pierson Marks (22:02)
my health points, my HP, and other stuff. So after Nano Banana created the image, I put it into the Pokemon card, and now I have a card. It's a hybrid approach: Nano Banana plus hand-written code.
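That hybrid idea, model-generated art dropped into a deterministic, hand-coded layout, can be sketched like this. The template here is a toy SVG string; the field names, geometry, and file name are made up for the sketch, not the actual project's code:

```python
def render_card_svg(name: str, hp: int, art_href: str) -> str:
    # Deterministic card layout: border, name slot, HP slot, and a
    # fixed frame where the AI-generated art gets dropped in.
    return f"""<svg xmlns="http://www.w3.org/2000/svg" width="250" height="350">
  <rect width="250" height="350" fill="gold"/>
  <image href="{art_href}" x="25" y="60" width="200" height="150"/>
  <text x="25" y="40" font-size="18">{name}</text>
  <text x="180" y="40" font-size="18">HP {hp}</text>
</svg>"""

# The art file would come from the image model; the layout never changes.
card = render_card_svg("Pierson", 120, "nano-banana-art.png")
print(card)
```

The split is the point: the model only handles the part where fuzziness is fine (the art), while everything that must be pixel-exact (borders, text slots) stays in plain code.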

Bilal Tahir (22:14)
Yeah. I mean, that's

so powerful, right? Programmatically getting a deterministic layout and then using Nano Banana for the fun art part. You can mix and match. It's perfect.

Pierson Marks (22:25)
It was really, really cool

because I also had an LLM analyze the photo I uploaded. Like, who is this guy? What is his special ability? And so I uploaded a photo of me in a captain's hat, and my special ability or power was like Captain Zahoi or something. It wrote out my superpower on the card, and I could generate those, and it was cool. So I added some other stuff around the edges, but that was a

Bilal Tahir (22:46)
Nice. ⁓

Pierson Marks (22:51)
fun little project. So hopefully it stuff.

Bilal Tahir (22:52)
No, that is so,

I mean, there are so many cool things you can do with something like that. Like, you can build your own Pokemon trading place, you can come up with almost an astrology type of thing, where people say, this is my personality, and based on that it generates their Pokemon strengths and weaknesses. Or a Slytherin, Hufflepuff type of house-sorting thing, right? You can do so many fun things. And then you briefly mentioned you used x402. So what

Pierson Marks (23:11)
Right.

⁓ totally.

Bilal Tahir (23:19)
does x402 do here?

Pierson Marks (23:21)
So

yeah, so essentially I didn't want to have to worry about authentication. Like, when you go and pay for something on a website, you have to put your credit card information in. It usually means you have to have an account, because you have to gatekeep. So accepting payments online, it's not hard, like now it's not hard, but you have to create an account, enter your credit card information, check out, blah blah blah. And it's like, no, I don't want this. So what x402 does is, when I click generate Pokemon card,

Bilal Tahir (23:26)
you

Pierson Marks (23:50)
it makes a request to generate it. And if I don't attach a payment receipt, a crypto payment receipt, the server will say, hey, generating a Pokemon card costs one cent. Here is an address you have to send one cent of USDC to, a stablecoin. Retry this request after you've paid the one cent. And so it automatically will,

Bilal Tahir (24:01)
Mmm.

Pierson Marks (24:13)
as long as you have crypto in your browser or something, pay the one cent to my server, and then it'll say, look, I paid my one cent, here's the receipt, you can validate it, now generate my Pokemon card. And the server will say, yeah, I see that you paid, and I see you haven't generated a Pokemon card with this receipt yet. Let me generate your Pokemon card and return it. Thanks for paying. And it's so easy.

Bilal Tahir (24:27)
That's awesome.

Pierson Marks (24:38)
I installed the Coinbase wallet extension in my browser for the first time, and when I click generate Pokemon card, the browser extension pops up: hey, this website's requesting one cent, would you like to pay? I click pay, everything goes through, and it generates the Pokemon card. No login, no password, no credit card information. I just go to the website, click generate Pokemon card, a pop-up appears in my browser because it already has my crypto in there, I click confirm send, and it works.
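The handshake Pierson walks through, request, get a 402 with payment instructions, pay, retry with the receipt, can be sketched as a toy server handler. Real x402 verifies an on-chain USDC payment through a facilitator; here the "verification" is just a set lookup, and all the names and the address are illustrative:

```python
from typing import Optional

PRICE_CENTS = 1
PAY_TO = "0xSERVER_ADDRESS"      # illustrative, not a real address
paid_receipts: set[str] = set()  # stands in for on-chain payment verification
used_receipts: set[str] = set()  # each receipt buys exactly one generation

def handle_generate(receipt: Optional[str]) -> tuple[int, str]:
    """Toy x402-style handler: 402 until a fresh, valid receipt arrives."""
    if receipt is None or receipt not in paid_receipts:
        # No (valid) payment attached: tell the client how to pay, then retry.
        return 402, f"send {PRICE_CENTS} cent of USDC to {PAY_TO}, then retry"
    if receipt in used_receipts:
        return 402, "receipt already used"
    used_receipts.add(receipt)
    return 200, "here is your Pokemon card"

# First request carries no payment, so the server answers 402.
status, body = handle_generate(None)
print(status)  # 402

# The client pays (simulated) and retries with its receipt.
paid_receipts.add("rcpt-123")
status, body = handle_generate("rcpt-123")
print(status)  # 200
```

The nice property is that the whole exchange lives in plain HTTP status codes, which is why a wallet extension can drive it with no accounts or checkout flow.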

Bilal Tahir (24:41)
Yeah.

That's it. No login, no passwords, nothing. It works. Yeah.

Yeah. Yeah.

No, I don't know.

Honestly, I don't know why Chrome hasn't done this. Even just installing Chrome, the browser should just come with a wallet. I mean, don't even call it a crypto wallet; it should just be a wallet you can fund, and then you can do all these things. It should just be part of the internet experience. Because, I mean, we're both, you know, into crypto, but whenever I have to go to MetaMask and switch networks, this is Base and this is that, I'm like, ugh. Yeah. I'm like, why not? Yeah.

Pierson Marks (25:28)
It's stupid, it's so dumb. For any new adopter, my gosh, that's

crazy. As long as it's technical, mean, it sucks.

Bilal Tahir (25:35)
Even the- Right.

Even with x402, it took me a minute to understand the concept, because it's not just sending money; somebody has to manage that, right? So there's a concept called the facilitator, and right now there are only a couple, and Coinbase is the biggest one. They let you do it for free, but someday they will probably charge fees. So there are all these nuanced topics you kind of have to understand, which is why I feel like it hurts crypto adoption on both sides, developing and using. We just need to have it abstracted away.

Pierson Marks (25:46)
Mm-hmm.

Right. Yeah,

I do think, I have a feeling that there's this growing division, not division, but a reinventing of Google's kind of startup spirit with DeepMind. And you have the smartest people in the world, you know, working at DeepMind. And today I opened up my Chrome browser, and in the top right-hand corner I saw a button that said Gemini. I've never seen that there before. I was like, what is this? And I click on it.

Bilal Tahir (26:17)
Mm-hmm.

wait, really? Interesting.

Pierson Marks (26:32)
It put this massive blue hover around my entire browser, like, interact with this website with Gemini, talk with it live. I was like, did Google just win the agentic browser wars? They just slid Gemini into my browser. I was like, this is so cool. I have it open right now.

Bilal Tahir (26:39)
wow. Nice.

That's the.

That's

awesome. Wait, so this is part of Chrome? Because Google also released a Devin competitor called, what was it? I forgot the name. Yes, Antigravity. And so that was another release we forgot to mention.

Pierson Marks (26:55)
Antigravity? Antigravity, yeah.

Well, Antigravity is just a coding IDE, right? Like it's the Windsurf team and everything.

Bilal Tahir (27:04)
A coding IDE, but it also has computer use, cause I thought that was its thing. It's an IDE, so it's like Cursor meets Devin, kind of a thing they're trying to do. I don't know. It was kind of funny, because the guy who was leading it was the former Windsurf CEO. And there was this whole drama about how Google acquired half of Windsurf and took basically the top leadership there, and the employees were just left hanging with Windsurf, which is honestly

Pierson Marks (27:07)
Yeah.

Bilal Tahir (27:30)
probably gonna die at this point. It's very hard, you know, after what happened. Yeah, they got acquired by the competition. So hopefully the employees land somewhere. But that was kind of a very bizarre breaking of the social contract, where usually as a CEO founder there's an assumption you're gonna ride or die with your own startup. So they went to Google, and what they did was they literally forked Windsurf.

Pierson Marks (27:32)
They got acquired by Cognition.

Bilal Tahir (27:53)
And you could tell, because they missed one piece of code: in the placeholder it says Cascade, which is Windsurf's thing. So they missed that part. And it was just so funny, because people were like, dude, not only did you go to this other company, you forked your own product. No, wait, wait, I take that back. That's a good question. Was it open source?

Pierson Marks (28:10)
Yeah, was Windsurf open source?

I just don't know the details of the deal. Obviously, if you're Cognition, you don't want to sue Google. OK. OK. Gotcha.

Bilal Tahir (28:20)
No, I think it's not open source, but Google has rights to Windsurf's source, I think. Yeah. So that's why

they had it. But still, that was just like, damn man, you just copied Windsurf. But anyway, coming back to what you were saying. I'm curious: so it highlights the screen and then you can chat, but does it actually take actions, or is it more just chat?

Pierson Marks (28:30)
Yeah.

Yeah, let me do it.

Actually, I don't know if it takes actions yet. Let me see. So here's our agenda of everything that we talked about. So if you're watching this, you get a little behind the scenes. If I click on the top right-hand corner, let me just pop this out.

That's my Twitter feed. Let me share my screen again.

Bilal Tahir (28:53)
Mm-hmm.

Pierson Marks (28:55)
There we go. Okay, so top right-hand corner, Gemini, look at this. And then you click this.

Bilal Tahir (28:59)
Wow, it's a whole aura just for that.

Pierson Marks (29:04)
It was highlighting it before. Interesting, it's not highlighting it now, but take my word for it, it would highlight the entire border.

Bilal Tahir (29:11)
Yeah, maybe it only highlights certain websites. Maybe it doesn't do it for the doc.

Pierson Marks (29:16)
Maybe, let me see. That's a good call. Let's see, Gemini, if I do it on Twitter. It was doing this before: it would highlight it, and then you can just say, what does this talk about, and see.

Super interesting. It was just working a second ago. Well, it's still buggy; I don't know if this is user error, but it was very cool. Like, I could chat with my website. And even if it can't take actions yet, you know they're going to be able to. The search bar on the top, rather than Google Search, it's going to be AI Overviews. AI Mode is already there. It's going to be Gemini. Like, Chrome is

Bilal Tahir (29:54)
Right.

Pierson Marks (29:56)
the most amazing interface for them to launch new AI products into. It's nuts.

Bilal Tahir (29:59)
Absolutely, yeah. AI should just be baked in. I mean, our browsers, our phones. It's annoying that we still can't. Even on my iPhone, I'm like, I should just be able to talk to my phone and say, you know, tell me about something. But yeah, that's awesome.

Pierson Marks (30:14)
Right. So that's super cool. And then I know before, we wanted to mention just real quick: Replicate was acquired by Cloudflare. That was cool. Replicate is an AI model provider, they do a lot of generative media models, and now they're acquired by Cloudflare. We'll see if Fal gets acquired by Vercel.

Competing acquisitions. I don't think it'll happen, but we'll see. And then Disney: Bob Iger, the CEO of Disney, came out last week on their earnings call saying that, I don't know if it's users of Disney Plus or what the form factor is, but people are gonna be able to use Disney-protected IP and generate characters and stories with those characters. So Disney, the House of Mouse, is hopping onto

Bilal Tahir (30:40)
No, this one, yeah.

and

No

Pierson Marks (31:05)
the AI train. You can create your own Star Wars stories, I guess, or interact with characters from Disney IP. And this is while they're still in lawsuits with other players, so they're wielding a double-edged sword. But I think it's cool, and I think it's inevitable. You saw some backlash from people, and if you're in that camp, I'm sorry, but I think you're going to look back and say, like, wow, I was wrong.

Bilal Tahir (31:27)
Yeah.

Pierson Marks (31:27)
So, yeah, the train has left the station, and the genie is never going back in the bottle.

Bilal Tahir (31:30)
Exactly.

Exactly. And you just have to adapt to that. And for the people who adapt the best, it's going to be very lucrative. So that's what I would suggest: rather than fighting it, see how it can amplify what you already do, and learn. I mean, the creative potential is so cool. Like, one of the projects I did over the weekend: I was on a run and I literally had this idea, just a random idea I had on the run.

Pierson Marks (31:46)
totally.

Bilal Tahir (32:00)
I was like, I need to do something. I literally went back home and built it. Once it's polished up, maybe I'll show it here, but it's basically my own ComfyUI-type app where I have nodes. I use React Flow for that, and I probably need to make it better because it gets unwieldy, but you can literally create images, use Nano Banana, et cetera, to edit the images. And then I put myself in as the main character of the story.

Which was also very cool, because initially I started with some other character, and then I was like, hey, it's my story. I mean, not about me personally, but I'm creating it, so why should I put someone else there? And that's a very interesting concept too, because you can create your own, you know, be your own movie star, whatever, right? So I did that: I created the images, then I generated the videos from those images, and it was pretty good for a few hours' work. It was awesome, and I'm gonna polish that up, but

the movie's out there if anyone wants to see it. It's called 30. It's basically a three-minute movie trailer that I was able to make and set to a song, which is cool. So yeah, lots of potential, lots of fun stuff you can do with this. Yeah.
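The node app Bilal describes is a small dependency graph: React Flow stores nodes and edges as plain arrays, and running the pipeline means visiting nodes in dependency order (generate, then edit, then video). A sketch of that ordering step, using React Flow's `{ id, data }` and `{ source, target }` shapes; the three stage names are made up for illustration:

```typescript
// React Flow represents a graph as plain node/edge arrays; a pipeline
// runner only needs a topological order over those edges.
type FlowNode = { id: string; data: { label: string } };
type FlowEdge = { source: string; target: string };

// Hypothetical three-stage pipeline: generate an image, edit it with
// Nano Banana, then turn the edited image into a video clip.
const nodes: FlowNode[] = [
  { id: "gen", data: { label: "generate image" } },
  { id: "edit", data: { label: "edit with nano banana" } },
  { id: "video", data: { label: "image to video" } },
];
const edges: FlowEdge[] = [
  { source: "gen", target: "edit" },
  { source: "edit", target: "video" },
];

// Kahn's algorithm: repeatedly emit nodes whose inputs are all done.
function runOrder(nodes: FlowNode[], edges: FlowEdge[]): string[] {
  const indegree = new Map(nodes.map((n) => [n.id, 0]));
  for (const e of edges) {
    indegree.set(e.target, (indegree.get(e.target) ?? 0) + 1);
  }
  const queue = nodes.filter((n) => indegree.get(n.id) === 0).map((n) => n.id);
  const order: string[] = [];
  while (queue.length > 0) {
    const id = queue.shift()!;
    order.push(id);
    for (const e of edges) {
      if (e.source !== id) continue;
      const remaining = (indegree.get(e.target) ?? 0) - 1;
      indegree.set(e.target, remaining);
      if (remaining === 0) queue.push(e.target);
    }
  }
  return order;
}
```

In the real app, each emitted id would trigger its model call, with each node's output image feeding the next stage's input.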

Pierson Marks (33:06)
That's the thing.

That's awesome. Yeah, that is cool. Check it out. Follow Bilal on Twitter, Deep Whitman. He posts his stuff there. And follow me too if you want.

Bilal Tahir (33:16)
Yeah, and yours is just Pierson Marks, which is nice.

Pierson Marks (33:21)
Yeah, I try to get Pierson Marks on everything.

Okay, well, everybody take care. We'll talk next week.

Bilal Tahir (33:27)
All right, take care, bye.
