GPT-5, ElevenLabs Music & Google Genie 3
Pierson Marks (00:00.873)
Nice, sweet. Hey everybody. Hey, Bilal.
Bilal Tahir (00:03.288)
Hey everyone, hello, hello. It's been a while.
Pierson Marks (00:08.479)
Yes, it's good to see you back. You know, last week it was a...
Bilal Tahir (00:10.21)
Yeah, yeah, I've been, I-
Yeah, I was on vacation. I was in Europe for a couple of weeks, and I'm coming back and resettling. So I'm sure, you know, like we joked about how two weeks is like a year in the AI world. It does feel like that to a certain extent, you know, a lot of developments. So we have a lot to, I guess, catch up on. How are you doing?
Pierson Marks (00:29.29)
Right.
Pierson Marks (00:33.011)
The last two weeks? I mean, I'm doing well, yeah, really doing well. The last two weeks have been crazy. This week has been crazy. I mean, in the last two weeks there's been things like ElevenLabs Music...
Bilal Tahir (00:42.367)
Yeah, especially this week.
Pierson Marks (00:48.363)
So we got ElevenLabs Music. We got all these new video models. We got all these new text models, LLMs. Today, GPT-5. Genie 3, Runway, Ideogram. Like, the last two weeks have been crazy, this week especially. So good. I'm excited to talk about it all.
Bilal Tahir (01:09.454)
Yeah, yeah, yeah, where do you want to start? Let's just dive in.
Pierson Marks (01:13.995)
Yeah, okay. So I think the first thing that I wanted to talk about, well, we just got off of this right now: GPT-5 came out today. Super cool.
Bilal Tahir (01:25.87)
Mm-hmm.
Yeah, well, as of 15 minutes ago, this is like very fresh.
Pierson Marks (01:31.276)
Fifteen-minutes-ago fresh news. I think the live stream is still going on right now. This is a creative-media type of podcast, so we're not going to dive into it too much, I think. But from a first look, it seems like...
Bilal Tahir (01:36.663)
out.
Pierson Marks (01:53.996)
It's a unified model. It's going to be accessible for everybody in ChatGPT, so all free users are going to be able to get it. You don't necessarily have to switch between o3 and 4o for thinking and non-thinking. It'll reason as much as it needs to in order to solve a problem, supposedly.
Bilal Tahir (02:11.598)
Hmm.
Pierson Marks (02:15.263)
And that's cool, because from a UX perspective I know a lot of people still don't understand what model to use. When is o3 better than 4o? So it's just like, okay, good, simplified modeling. Hopefully every lab goes towards this, just one model. That's the thing. And developers maybe have more control in the API, but for a consumer product, you want to hide all that stuff away. They just want to ask a question, you know?
Bilal Tahir (02:25.997)
Mm-hmm.
Bilal Tahir (02:37.986)
Mm-hmm.
Bilal Tahir (02:43.469)
Yeah, yeah.
Pierson Marks (02:45.217)
Thank
Bilal Tahir (02:46.412)
Yeah, I think this is where everyone's going: these hybrid models where it just does thinking or non-thinking under the hood. And it likely will be a spectrum. Like you said, the developers have more control. I actually like how Google exposed this to us. They give us multiple options. One is, I think, low, medium, high, which is how much thinking it should do. But then it also has this thing called the thinking budget, which I really like, because then you can just set a ceiling. Like, I want max 10,000 tokens, right? I mean, it's good to just be like, all right, let's give it a ceiling so it doesn't go too crazy. But then, you know, you give it enough range to go from thinking a little bit to a lot. So I imagine OpenAI will have similar parameters. Obviously for ChatGPT and stuff, there will be the auto mode that just kind of figures it out, which will be good enough for 99% of everyday tasks.
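To make the low/medium/high setting and the token ceiling described above concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the function name `resolve_thinking_budget`, the effort-to-token mapping, and the numbers are hypothetical and are not Google's or OpenAI's actual API or defaults.

```python
from typing import Optional

# Hypothetical mapping from an effort level to a reasoning-token allowance.
# These values are made up for illustration, not any provider's defaults.
EFFORT_TO_TOKENS = {"low": 1_000, "medium": 8_000, "high": 32_000}


def resolve_thinking_budget(effort: str, ceiling: Optional[int] = None) -> int:
    """Return the maximum number of reasoning tokens allowed for one request."""
    if effort not in EFFORT_TO_TOKENS:
        raise ValueError(f"unknown effort level: {effort!r}")
    requested = EFFORT_TO_TOKENS[effort]
    # The ceiling caps the allowance no matter which effort level was asked for.
    return requested if ceiling is None else min(requested, ceiling)
```

With a ceiling of 10,000 tokens, a "high" request is capped at 10,000 while a "low" request keeps its smaller allowance, which matches the idea of leaving room to think a little or a lot without going too far.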
Pierson Marks (03:42.953)
Right, right. Yeah, totally. Yeah, hybrid reasoning, thinking budgets, all super cool. If you're out there, I mean, check out GPT-5. It's supposed to be in ChatGPT today.
Bilal Tahir (03:49.016)
Yeah.
Bilal Tahir (03:55.266)
Yeah, and it's the same, right? They have the base model, mini, and nano. I guess they're doing the three T-shirt sizes. I think that's the approach, which is cool. I mean, yeah. And there's another option with Flex as well now. So there's base, mini, nano, but then you have Flex, where for a little slower response time you actually get 50% off pricing, which is pretty cool. Right, right.
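The Flex trade-off mentioned above is simple arithmetic. Here is a rough Python sketch, assuming the 50% discount applies to the whole request. The function name `request_cost` is hypothetical, and the per-million-token prices are placeholder parameters, not published rates.

```python
def request_cost(
    input_tokens: int,
    output_tokens: int,
    input_price_per_million: float,
    output_price_per_million: float,
    flex: bool = False,
) -> float:
    """Estimate the cost of one request in the same currency as the prices."""
    base = (
        input_tokens / 1_000_000 * input_price_per_million
        + output_tokens / 1_000_000 * output_price_per_million
    )
    # Assumption: Flex halves the total price in exchange for slower responses.
    return base * 0.5 if flex else base
```

For a job that is not latency-sensitive, comparing `request_cost(..., flex=False)` with `request_cost(..., flex=True)` shows what the slower tier would save.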
Pierson Marks (04:04.341)
Yeah.
Yeah, totally.
Pierson Marks (04:18.377)
Right. Right. And that's for developers too. So if you're not a developer... like, you'll have your minis and your nano and your regular models. You have your batch processing. You have Flex processing. Now there is, like, a thinking budget, it looks like, in GPT-5. So I don't know if it's a budget or a new model parameter where you specify thinking or not, or how much thinking. I don't know. I didn't look at the API to see, but that's cool. If you're a developer listening to this, and maybe you are, it's a lot of standard stuff, things that you've kind of come to expect from other model providers. Hopefully good.
Bilal Tahir (04:56.812)
Yeah, yeah, yeah.
Yeah, no, I think it's super cool. You know, I've said this before: I think OpenAI's strength is as a consumer company, and their application, ChatGPT, I think is pretty... you know, they've had time to build it out. There's so much functionality, and it's not just about the model. It's about the tool calling and stuff. So in ChatGPT, I saw some examples where you can throw in a spreadsheet or whatever and it'll create a graph or whatever, right, using that. So those kinds of tools, I feel like, are definitely gonna get so much better. It'll be very interesting to see how, with this smarter model plus the great UI that ChatGPT has, it can actually replace tasks like data analysis or market research and stuff like that. I imagine those kinds of things will just get exponentially better with this.
Pierson Marks (05:49.398)
Totally. And the last thing I wanted to mention too, and we'll have to see if this is true: I know they spent a lot of time in the live stream today talking about healthcare and medical professionals and hallucinations. At least going by the graphs and benchmarks, they demonstrated a significant reduction in false confidence when answering, and it will be much better at adhering to truth, supposedly. So, like, in complicated domains, it seems like they spent a lot of effort on making sure that in PhD-level work, law, medicine, it's just going to do better at minimizing hallucinations, and it's really been tuned on that. So we'll see if that's true. But it'd be cool.
Bilal Tahir (06:45.486)
Yeah, yeah. I feel like the most impressed people had ever been before this was o3-pro, right? With the deep research and stuff. So they clearly have a great flow set up for citations and stuff. And I wonder how that's going to work with GPT-5. Will we just have GPT-5 deep research, and that's o3-pro, basically, you know, doing that? So it'll be interesting to see how that long-term thing works.
Pierson Marks (07:12.811)
And I'll also be interested, just because o4 and o3 were trained to use web search in their thinking process. So you don't have to click on web search. It has access to the internet, and it'll go out and pull information from the internet as part of its reasoning. So I would expect that to still happen, and to pull that stuff in as citations. But in addition, to kind of have the nuanced ability to, you know, just really reflect on
Bilal Tahir (07:21.432)
Hmm.
Pierson Marks (07:41.547)
content and hopefully just provide better answers and things. So yeah, I mean, we'll see. I'm excited to get my hands on it.
Bilal Tahir (07:48.002)
Yeah, no, it's super interesting.
Yeah, and one last feature, I guess, that I just saw was that the context window has gone up now. It's 400,000 tokens for the base model, but it's 256k for the mini and nano. And I think, if I remember correctly, it was 128k before. So yeah, I mean, that's getting up there now. I feel like Gemini has been the only one, as far as I know, that has a one-to-two-million-token context. All the others were kind of stuck at 128k for a while.
Pierson Marks (07:57.418)
Right.
Pierson Marks (08:03.445)
Good.
Pierson Marks (08:06.975)
Mm-hmm.
Bilal Tahir (08:21.058)
So if you're trying to fit, like, the whole Harry Potter book series or something in there, you know, that opens up use cases. Very cool.
Pierson Marks (08:29.045)
Totally totally. So.
Bilal Tahir (08:31.838)
And yeah, I mean, it's super interesting. It's funny. I don't know if I've talked about it in this way, but you know writer's block, right? You don't get any ideas and stuff. And I feel like with these models, it's writer's block on steroids, where it's like, you can do anything at all. Imagine having the power to do anything, and then all you can think of is a to-do list or something. I mean, I'm exaggerating, but sometimes I feel like I should be doing more. But it's more like I have this power, but I'm just not utilizing it in the right way, you know? It's, yeah, it's like guilt.
Pierson Marks (09:07.893)
Totally. No, it's very true. I think that's a phenomenon with humans, and it's like, you need that starter fuel, those suggestions. When Google added suggestions to your search queries, query volume went up; people searched more. Same with ChatGPT. That's why, when you're typing in ChatGPT, it'll show a list of autocompletions of queries or prompts that you might be asking. It just sparks that innovation. Same thing with Jellypod. I mean, that's what we do. We have autocompletions and suggestions all the time.
Bilal Tahir (09:26.861)
Yeah.
Bilal Tahir (09:41.846)
Yeah, yeah.
Bilal Tahir (09:46.316)
Right, but you're right, that starter fuel has always been important. High agency has always been important. I feel like it separates the people who do a lot of stuff from the people who don't do anything, but it's just become this... and I wonder if that's what it converges to: people who have initiative versus people who don't. My read has always been that a small proportion of the population actually does things on their own. Most people just kind of, like,
Pierson Marks (10:13.579)
Mm-hmm.
Bilal Tahir (10:15.008)
live in a dreamlike state. And I think that will continue to be the case, but it'll just be exaggerated. You know, call it the Matrix or whatever. So if anyone's listening and you're the type of person to take the red pill, not the blue pill, this is the GPT-5 moment. You know, go out there, build shit, write cool shit, produce amazing content, podcasts. Use Jellypod, create amazing stuff, you know.
Pierson Marks (10:17.001)
Right.
Right, right.
Pierson Marks (10:31.435)
For sure. 100%.
Pierson Marks (10:37.215)
Yeah.
Pierson Marks (10:41.483)
Yeah, write some code. I mean, really, write some code if you're that type of person. "I want to build a business, but I don't know how to code" isn't an excuse anymore. There really is none. I mean, you could have a prototype out there even before GPT-5: Lovable, Replit Agent, these tools like bolt.new. You know, there's no excuse anymore.
Bilal Tahir (10:45.315)
Yeah.
Bilal Tahir (10:52.172)
Yeah, there isn't.
Pierson Marks (11:04.721)
So get out, get off your ass, go and actually take the initiative and make your dreams a reality. There we go. End the podcast right there. But okay, cool, cool. So we talked about GPT-5. There's a few other things. One thing I want to touch upon real quick. The thing I'm most excited about is Genie 3, but before that...
Bilal Tahir (11:14.572)
There you go, boom. That's all I wanted to say.
Pierson Marks (11:29.238)
ElevenLabs Music. I know you mentioned that music models are something that's super cool. We have, you know, Suno, not an API. So Suno, I think, was previously maybe Microsoft-backed, or a Microsoft sub-project that spun out. I thought there was some relationship there. I might be completely wrong, so don't quote me on it. Suno is cool. ElevenLabs Music, I haven't gotten around to playing with it, but I think maybe you have, or if you've...
Bilal Tahir (11:42.62)
Really? I didn't know that.
Bilal Tahir (11:54.734)
I have, and it's amazing. So, just backing up on the history of text-to-music: basically, there are two big players in the scene. There's Suno; Suno is the big one. And then Udio is, I feel like, good. It's not quite Suno level, but it's up there, and they kind of release models and stuff, where one edges out the other, but then the other guy releases another model. So, you know, that's like the OpenAI-or-Anthropic fight happening there. And they've been the big players for a while, and Suno is kind of the leader. But something interesting happened last year. I think it was last year, or 18 months ago: ElevenLabs out of the blue just drops a song on Twitter, and it goes viral. And at that time it was head and shoulders above the other ones. I'm not an audiophile or anything, but I talked to a couple of my friends who are, and they were like, wow, this is amazing. And then we were like, wow, so ElevenLabs is coming into the music scene, which, I guess, I'm not privy to a lot of TTS training, but I imagine there's some skill overlap. So that's why the guys were like, hey, we can train on songs as well, so why not? But then, curiously, they put out the song and then nothing happens.
Pierson Marks (12:58.251)
Mm-hmm.
Bilal Tahir (13:03.12)
They kind of talked about it for a while, and then they shut up. And my hypothesis was that it was copyright. Around that time, Suno got sued as well. I think Udio had some lawsuits, because all the companies were like, what the fuck, you know? Especially the music labels, which are notorious for copyright lawsuits. They cut their teeth on Napster and the streaming days, you know, so they know how to clamp down on this. So I think it spooked ElevenLabs and their investors.
Pierson Marks (13:17.184)
Right.
Pierson Marks (13:22.176)
Right.
Pierson Marks (13:27.999)
Right.
Bilal Tahir (13:33.042)
They backed off.
Which is why, I mean, one of the big things before we even get to quality is that this music model, they did it in collaboration with the music labels. So they struck a deal. If you look at the fine print, the music labels have actually signed off on this. So I'm guessing there's some rev-share thing, some agreement under the hood, so that every time you create a song or something, they get some portion of it. So that's great. If you're trying to build a business off it, I think that
Pierson Marks (13:41.771)
Hmm.
Pierson Marks (13:51.669)
Right.
Pierson Marks (13:55.679)
Hmm.
Pierson Marks (14:00.607)
Very cool.
Bilal Tahir (14:04.432)
you're on solid footing, because ElevenLabs won't, like, shut down the service tomorrow because they got sued or something. It's much more stable ground. So that's key. And then the music, just to my ear, is still good, and compared to Suno and Udio I think they definitely have an edge on that. So, very interesting. I played around with it, and the quality was really cool. The way they do it is very interesting. With Suno and Udio, what they do is you put in a prompt, or you can even put in your custom
Pierson Marks (14:10.911)
Right.
Bilal Tahir (14:34.512)
the genre of the whole song. What ElevenLabs does is it breaks it into segments. So it'll give you: this is the intro; this is like the a cappella version where the guy starts talking, and then the audience comes in. So it breaks it into these logical hooks, pieces of a song, which I think makes sense. When you create a song as an artist, you break it up into pieces. You're like, this is when the chorus is going to come in, this is when something's going to happen. And you can edit that piece then.
Pierson Marks (14:44.651)
Mm.
Bilal Tahir (15:04.271)
It was so cool because I was actually able to create a live recording. So I actually did a song where a female artist was singing, but then the crowd starts chanting with the artist. And I was like, whoa. And it sounded like, you know, like they were like at a concert, it's like Adele-like and you know, the crowd is chanting with the artist. And I'm like, holy shit, this is amazing. Because you can recreate like experiences now, like, you know, like that, like that. So I think...
Pierson Marks (15:10.208)
Hmm.
Pierson Marks (15:15.059)
Whoa. Alright.
Pierson Marks (15:25.343)
Right. That's cool.
Bilal Tahir (15:33.088)
I can see someone taking... we've talked about Hedra before, and Hedra actually posted portraits of ElevenLabs songs with these creators. But I think we're not too far away from having, and K-pop is ahead on this, they actually have AI members, but you're going to see full celebrities, Instagram pages, everything. They're basically AI celebrities. They have albums out, they're having concerts that you can join, Fortnite-style probably, with thousands of people, with your,
Pierson Marks (15:59.884)
Hmm.
Bilal Tahir (16:03.152)
an avatar, a VR experience maybe, of, you know, a band. Think like Gorillaz, but on steroids, and
Pierson Marks (16:11.776)
Right.
Bilal Tahir (16:11.886)
That's gonna be insane. So I think this is coming. It's gonna be huge. I don't want to ramble too much on it, but the last thing I'll say is that one of Netflix's biggest hits of the last few months has been this Korean movie. I don't know if you saw it, but it was a musical they put out. It's basically Korean band members who have supernatural powers, something stupid, but the album went viral. It became number one on Netflix, and on YouTube it has millions of views. So clearly, if you can tie a good story
Pierson Marks (16:14.197)
So do I.
Pierson Marks (16:24.576)
Right.
Bilal Tahir (16:41.134)
with the... you know, because that's what's missing. Don't just pump out a random song. Actually build a brand, a persona, then build a backstory. And then if you can give an experience like a crowd concert or something, well, that's gold. And now we have the pieces for it. And like some DJ 10 years ago, the guy who figured out Lofi Girl, he literally was some broke-ass Swedish DJ who was like, what if I take this lo-fi GIF and put out good lo-fi chill songs? You know, some guy is gonna figure it out in his basement. He's gonna be like, all right, I have musical taste. I'm gonna build out, like, a metal album or whatever. I'm gonna generate this character. And then I'm gonna create a backstory, a Daft Punk Interstella-type something, and a movie experience. It's gonna be amazing. Yeah.
Pierson Marks (17:10.731)
Totally.
Pierson Marks (17:28.789)
Totally.
Do you know, is there a duration limit on ElevenLabs, on the music? Like, can you do one minute, 10 minutes?
Bilal Tahir (17:36.622)
You can do 30 seconds, you can do up to four minutes. I think you can go longer. I'm sure there's some cap, but they basically just charge you credits and stuff. I think four minutes might be the max. And there's an auto version where you can say, just give me the logical something, yeah.
Pierson Marks (17:41.385)
minutes.
Interesting.
Pierson Marks (17:49.695)
Gotcha.
Pierson Marks (17:53.728)
Very cool. Yeah, I've been seeing a lot of AI generated music on YouTube recently and it's actually really good. Like you can really curate playlists of different moods. There's this one channel called Roti and they have like subtle non lyrical like coffee house but
Bilal Tahir (17:58.508)
Mm-hmm. Mm-hmm.
Pierson Marks (18:18.163)
energetic coffee house that's kind of great for deep work. That stuff has been great. Like, I love it, and so...
Bilal Tahir (18:20.109)
Yeah.
Absolutely. I mean, if you think about it, the lo-fi instrumental music is the first to go, right? Because it's easy, it's generic, and obviously lyrics and stuff, that's a little harder because it's not just about random lyrics. You need to have something that hits. Actually, Spotify, I remember there was an article last year or something: Spotify is quietly rolling out lo-fi instrumentals, because for them, if you think about it, their biggest cost is artists. That's been the biggest issue they've had, you know, and one of the reasons Spotify's stock price, I think, is, like,
Pierson Marks (18:36.491)
Totally.
Pierson Marks (18:47.423)
Hmm. Right.
Bilal Tahir (18:52.802)
up and stuff is because I think they're gonna go all in on AI-generated music and so get away from the headaches of, you know, these artists who are always demanding more money, and justifiably. It's their music, they should, you know. So...
Pierson Marks (19:00.543)
Bye.
Pierson Marks (19:06.091)
All right. No, super interesting. Yeah, like Spotify entering that game. It's always a question when you're, like, a marketplace or a platform that relies on user-generated or submitted content. You have a question: do you also enter the space of generating it yourself? Like for Jellypod, I mean, do we enter the space of creating our own podcasts? Spotify, you know. So, a really interesting question. And when it comes to music, like,
Bilal Tahir (19:29.719)
Yeah.
Pierson Marks (19:38.58)
I'm super excited about people that are able to just communicate the feeling. Like Rick Rubin, you know, very famous. He's asked, like, what sort of technical talent do you have? And he's just like, no, I'm just able to explain what I feel well. And that's why he's paid to be the president of, what is it, UMG or something. I forget. Yeah.
Bilal Tahir (19:52.514)
Yeah, he just knows. Right, yeah.
Yeah, yeah. It sounds stupid, but hey, I mean, that guy's got talent. If you actually look at his clips, I remember there's this funny viral clip of him. He actually went to Jay-Z when he was doing 99 Problems, and he goes, what if you start with an a cappella? And it actually was Rick Rubin who gave him the idea. And Jay-Z was getting his haircut. He goes, that's fire. That's fire. Yeah, let's do it. So yeah, taste is important.
Pierson Marks (20:12.991)
Right.
Pierson Marks (20:19.115)
The other thing also, like, you ever hear about, like, a melon? It's like a music group. It's kind of like Lofi Girl. I think it's earlier than Lofi Girl, but it was Happy Fruits or Happy Melons, Happy Melon maybe, like, music. And pretty much they were covers of songs in different genres. So you take some pop song that's just very poppy, whatever, and they'd make a version of it that's more jazzy, or a different voice, like a female voice to a male voice, just a cover of that. And they made covers of every single hit song, and they got so many listens. They just blew up, because covers are fair use, so you can just completely cover. I wonder, I mean, just take AI, make a cover of the best songs ever in every sort of
Bilal Tahir (21:01.848)
Mm-hmm.
Bilal Tahir (21:11.522)
Yeah.
Bilal Tahir (21:16.248)
Right. Yeah.
Pierson Marks (21:18.413)
hard rock, you know, metal, lo-fi, acoustic. Just make covers of every single song out there and just put them out. You'll make, like, a million dollars right there, maybe.
Bilal Tahir (21:27.704)
Right.
Bilal Tahir (21:31.645)
Yeah, yeah, yeah, depending on the right sound and stuff. And it'll be so interesting to see how the music industry evolves, right? I mean, I think it's an even bigger moment than when Napster and streaming happened. You know, everyone was like, that's the death of music, but it actually made it even easier for artists to put their songs out, with SoundCloud, et cetera. I feel like it'll be the same. I actually think, you know, 99.9% of people will be like, what if I take a Jimi Hendrix song and, you know, cover it? And some of them will be hits, but most of them will be garbage. And then somebody will be like, you know,
Pierson Marks (21:44.693)
Totally, I agree.
Bilal Tahir (22:07.482)
next Cardi B or whatever, like, you know, like, they'll be like, okay, I'm interested in the new sound, right? Because there's always a new sound, you know, we're always looking for the new beat. Somebody who has that will figure it out. I think the difference will be, they'll put out a song or something and within like minutes...
Pierson Marks (22:16.683)
Mm-hmm.
Bilal Tahir (22:23.82)
people will be copying that. So how do you give someone an incentive? I feel like this is why we'll probably come to some model where we're like, all right, I know I'm going to get copied. So what I'm going to do is put it out and give permission to everyone to copy, but I get royalties or something like that. So you'll have people searching for the new hit, the new wave, and then they'll put it out there with the understanding that I'm going to have a small window where I get to be the unique new voice. And then it just becomes generic. And then it starts all over again.
Pierson Marks (22:50.027)
Mm-hmm.
Bilal Tahir (22:53.784)
Like a shrinking window, where maybe the Beatles got 10 years or five years or whatever before people figured it out, the Beach Boys got two years or whatever, and now it'll be, like, minutes. Maybe the new Beatles or the new Swift will get 15 minutes where it's like, my god, have you heard that? And then we move on, right? So it just gets faster.
Pierson Marks (23:11.883)
Fifteen minutes of fame. All right. Yeah. No, it's super interesting. At the end of the day, I mentioned this in previous episodes, and I think the biggest thing that AI enables is a divergence between luxury and, I don't know what the word would be, but basic experiences. It enables more people on the low end of the spectrum to have access and to do things that they would never have done originally. Music, you know: it's going to enable more people to make music. The people at the high end, like the artists that have musical talent, that are
Bilal Tahir (23:32.334)
Hmm.
Pierson Marks (23:48.52)
performing at the shows, those things are going to become more valuable, because as everybody can cross the threshold to enter into music, into art, into cooking, all these things, you know, it's going to lower that barrier to entry, enabling more people to enter a creative industry. But on the high end, the experience of, you know, going to a concert, like listening to Coldplay at the Rose Bowl, or going to a nice restaurant with your spouse. Yeah, exactly. I mean, you're seeing a divergence, and I think it's
Bilal Tahir (24:19.064)
Yeah, preferably with your spouse.
Yeah. Yeah.
Pierson Marks (24:30.349)
good, it's healthy because it enables more people to enter the game. But because more people enter the game, there's more competition and the best rise to the top. so that's that. yeah, so yeah.
Bilal Tahir (24:47.31)
Yeah, yeah, 100%. You know, we've talked about the clay pot analogy, where one class gets to make as many clay pots as possible, the other one makes one good clay pot, and the quantity always beats the quality, because you just get so many iterations. And I think it'll happen with songs. It's already happened with Netflix. In the '60s, there were three channels, CBS, NBC, whatever, and there were only a limited number of shows. And then Netflix happened, and now there are thousands of shows, and objectively, the quality of the shows now is way better. Now, the average show is shittier than the one in the '60s, but the top
Pierson Marks (25:10.379)
Mm-hmm.
Bilal Tahir (25:21.742)
10%, 5% of shows are way better than the top 5, 10% of shows 50 years ago. And you'll see the same thing. The average, 50th-percentile song is gonna be garbage. It'll be generic TikTok, you know, slop, but the top 1% is just gonna be amazing. So, yeah.
Pierson Marks (25:25.067)
All right.
Totally.
Pierson Marks (25:36.682)
Right.
Right, 100%. And another thing that I've been thinking about, I mean, it's related, but people oftentimes ask me about generative UI, and it's related to music here, where...
Do you believe that AI is going to enable everybody to hear something, or see, like, a website, that's uniquely created on demand for them? Like, the website or the app, when I see Instagram, is different for me than it is for you, based on our preferences and testing. That's an interesting debate. I have some opinions about that and what that could look like five, 10 years from now.
But taking that same mental model of like generating on the fly from the UI and UX to music or creative industries. Right now when you create a song and you're a producer, you record all your tracks, you mix, you master it, you produce that single track, you throw it out onto Spotify. You have one instance of that thing.
In the tech world, you know, when you're building an app, you A/B test, you vary, you make iterations. You can actually test what copy is going to convert people better. The music industry doesn't have something like that. Essentially, you as a user take your own design taste, which might be right or wrong, and you create a piece of music or art and throw it into the world to see how the world reacts. You don't have this feedback loop where you could distribute 10 versions of that song and dynamically make adjustments.
Pierson Marks (27:10.381)
Like in the Jay-Z example, imagine if he didn't have Rick Rubin there and he had that first version, but you could have automatically generated variations, so that everybody across the internet heard something a little bit different. Maybe the guitar is a Spanish guitar versus, like, a Southern guitar, these slight variations, and you test and you could see. Maybe you bucket into, like, a thousand different variations with AI. So the producer just kind of says, this is a guitar.
Bilal Tahir (27:23.352)
Hmm.
Bilal Tahir (27:29.538)
Right, right.
Bilal Tahir (27:38.21)
Right.
Pierson Marks (27:40.321)
And AI goes like, maybe let's try a different type of guitar for all these different variations, see which one works. I don't know.
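The feedback loop described above could be sketched roughly as follows in Python. This is only an illustration under stated assumptions: the function names, the variant labels, and the choice of "completion rate" (listens finished divided by listens started) as the success metric are all hypothetical, and no real streaming platform's API is involved.

```python
import hashlib
from typing import Dict, List


def assign_variant(listener_id: str, variants: List[str]) -> str:
    """Deterministically bucket a listener into one variant of the song."""
    # Hashing the listener ID means the same person always hears the same version.
    digest = hashlib.sha256(listener_id.encode("utf-8")).hexdigest()
    return variants[int(digest, 16) % len(variants)]


def best_variant(plays: Dict[str, int], completions: Dict[str, int]) -> str:
    """Pick the variant with the highest completion rate among those played."""
    rates = {
        name: completions.get(name, 0) / count
        for name, count in plays.items()
        if count > 0
    }
    if not rates:
        raise ValueError("no variant has any plays yet")
    return max(rates, key=rates.get)
```

A real test would also need enough listeners per bucket for the difference to be statistically meaningful, which gets harder as the number of variations grows toward the "thousand" mentioned above.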
Bilal Tahir (27:46.094)
No, that's a super fascinating concept. I didn't even... But I mean, you rarely see musicians A/B test their songs and stuff, even a little bit, like for the Spanish audience. So maybe that's like when Taylor Swift puts out a song, but her research group knows that in Mexico they like a little bit of this. So maybe you A/B test it. But...
Pierson Marks (28:11.305)
Right. Totally. Interesting.
Bilal Tahir (28:14.144)
It's interesting. It's interesting. I wonder though, if that will take away from the purity of the one song as well. So I'm sure there's always the other side. The other side would be like, what happened to serendipity? What happened to just imperfection? Why do we have to hyper optimize everything? And I'm sure there'll be a pendulum towards that as well. So it's super interesting.
Pierson Marks (28:39.659)
100%, yeah. I mean, talking about generative stuff, let's move to Genie 3.
Bilal Tahir (28:43.618)
Yeah. Yeah, yeah. Yeah, I know you're super excited about it. So like, what was your reaction when you... So I guess, kind of setting the stage, what is Genie 3?
Pierson Marks (28:55.115)
Yeah, like Genie 3 is the third generation of Google's world models. They came out, excuse me, they came out with Genie 2 a little time ago, and it pretty much was an on-demand... you might explain this better than I will, let me try to explain it though. It's on-demand world building? Maybe you try to explain it better.
Bilal Tahir (29:22.081)
Yeah, yeah.
No, no, you're right. I mean, it just generates basically... it's not quite like a real-time video. I mean, it kind of is, but basically it generates a world for you, like, you know, on demand and in real time, which is insane. I mean, we've talked about video generation before, so anyone who's tried to generate videos and stuff, you know how costly it is, you know, and the latency. And so it's kind of insane that
Pierson Marks (29:34.259)
Real time, like real time too.
Pierson Marks (29:44.651)
Right.
Bilal Tahir (29:48.782)
Google can do it. Genie 2 was just like a very, think like Doom 2 type, a very simple world, like Mario type, you know, you could have a character, it kind of moves, and as it moves, maybe the wall comes into play. So, you know, think like Unity, Unreal Engine, et cetera. But with Genie 3, the quality just went up way more. Where, you know, it was actually crisp 720p. So 720p real time, which is insane, I think. Yeah.
Pierson Marks (30:06.475)
Mm-hmm.
Right.
Pierson Marks (30:12.725)
right, real time.
And so, just to sum this up for anybody that hasn't followed this, because it was kind of under the radar: I think if you're not in the space, you haven't heard about Genie 3. So for generative media, you know, you have text-to-image. You take a text prompt, generate an image from it. Everybody knows about that. You have text-to-video, where you take a text prompt and then you create a video from that text prompt. So maybe a cat walking across the street with the sun, and you actually see it walk, a five-second video, whatever.
You have image-to-video, which is you have an image, like a cat in the middle of a street, and you have like an island in the background or whatever, and then you just create a video from that. And so maybe the waves are moving on the island. The cat is there. The sun is shimmering off of it. Those are kind of what people all know of. The prompt to world to real-time video is just a completely different paradigm where
you're not generating the static video clip.
you're just really generating a world as if you are a video director that just got plopped into that world. You create, let's say, a medieval castle and it has like torches on the side of the walls. It has this like carpet. It has like this table and a dining room and the AI has generated this world and you get plopped in that world as if you're playing a video game. So you can actually move forward and back and left and right and all of that that you're seeing is being generated.
Pierson Marks (31:46.91)
in real time. There's no physics engine underneath the hood that has explicitly like
Bilal Tahir (31:50.264)
Right, right.
Pierson Marks (31:54.1)
programmed, like, you shouldn't be able to walk through the walls, the apple that's on the table should stay on that table, and if you move around that table, that apple is going to stay there. And so it's essentially completely upending how video games work. That's the clearest analogy, because how video games work today, you know, is a development team has a physics engine. Let's talk about just, like, open world, like real world
Bilal Tahir (32:15.928)
Hmm.
Pierson Marks (32:24.013)
games, but like you have a physics engine that says how physics on Earth and in our universe work. You can't walk through walls, when you jump you fall, the things that you put on the table will stay on the table, all of that natural physics, and that is hard coded. It is an intractable problem; it's extremely difficult to actually model all physics. It's very computationally expensive.
Bilal Tahir (32:48.056)
Yeah, it's impossible to capture the long tail of all the things that can happen in our physical world. yeah.
Pierson Marks (32:53.619)
Right. And with Genie 3, essentially, they just trained a model on the world data and it was able to understand physics and how objects interact with each other. And so you could be put into a world now and just walk around.
And in the video game space, what's so interesting about this is that, like, I have an Xbox over here, and you have your CPU, and the game developers bought the physics engine. They built the logic about like, hey, if you're a knight and you're running through this castle with a sword in your hand, they have to think about like, how does that sword interact and how does it swing and how does it, you know, do all these things? You have the narrative designers who are designing the storyline, but then you also have the programmers
who have to take that narrative design and make sure that actually is converted from like, you know, a story and what the video game should feel like for the user to logic and code and converting that. That code part is about to just completely get eliminated. Now, video game studios are going to have a bunch of designers, storytellers. People are going to be like, hey, I want these characters to look like this. I want the worlds to look like this. You're not going to
Bilal Tahir (33:58.798)
Hmm.
Bilal Tahir (34:05.816)
Hmm. All right.
Bilal Tahir (34:13.88)
Yeah.
Pierson Marks (34:15.943)
need the guys, the programmers, to take that and convert it into code. They're just going to be able to be the designers and the storytellers, because that's all you need. It's so cool.
Bilal Tahir (34:19.468)
Hmm. Yeah.
Yeah. No, it's amazing. It's insane. I will say, I do feel like there is still a need for code, because I feel like we're not at a point where you can generate the same thing every time, and with a game at least, you want to generate the same world. Like you can't have a castle that's, even just small details, like if you have a castle that's gray, but then it's light brown the next time. So I feel like maybe there's some hybrid thing where you generate the pixels and stuff, but then you maybe generate the code to generate it. So the next time it's more deterministic. I feel like
Pierson Marks (34:42.099)
Right. Totally.
Pierson Marks (34:47.2)
Right.
Bilal Tahir (34:59.618)
because of the deterministic aspect of things, we'll probably have some sort of a hybrid approach. And I wonder if it's used as an initial boilerplate that expedites the process to generate these worlds. But then they're like, all right, we have the world now, we have the characters, how do we make sure we always have the sword, we always have the castle, et cetera. So it'll be super interesting to see how that works. Yeah.
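One way to picture the hybrid approach being discussed is to keep hand-authored critical assets fixed while deriving everything else from a seed, so the same seed always reproduces the same world. The sketch below is a plain procedural stand-in, not Genie 3 and not any Google API; `FIXED_ASSETS`, `build_world`, and all of the values are hypothetical names made up for illustration.

```python
import random

# Hypothetical hand-authored assets that must be identical for every player.
FIXED_ASSETS = {
    "castle": {"pos": (0, 0), "color": "gray"},
    "hero_sword": {"pos": (1, 0), "color": "silver"},
}


def build_world(world_seed: int, size: int = 100) -> dict:
    """Return the fixed assets plus filler detail derived only from the seed."""
    rng = random.Random(world_seed)  # local RNG: same seed, same filler
    n_trees = rng.randint(7, 8)  # seven or eight trees, either is acceptable
    trees = [(rng.randrange(size), rng.randrange(size)) for _ in range(n_trees)]
    return {
        "fixed": {name: dict(spec) for name, spec in FIXED_ASSETS.items()},
        "trees": trees,
    }
```

Calling `build_world` twice with the same seed gives back the same trees, while the castle and sword are identical under every seed, which is the split between critical and incidental detail that the conversation is circling around.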
Pierson Marks (35:17.833)
Right. Totally.
And I have a thesis around this too, because in most cases, I'd kind of agree with you that you need to translate into code. I think that AI accelerates that development process. But in video games specifically, you know, there's narrative designers who design the storyline. That storyline, it's a lot of thought and logic, and that's like the core of the game. And that storyline should not be different for everybody. You know, you need to have this one character, he needs to have this personality.
you have these other characters in the world and then you need to go through this journey towards some endpoint which like the development studio wants that story to be because that's like what they're paid to do is like creating a great story. But then to make that world come alive there's so many things that don't necessarily need to be consistent from player to player like the number let's say think about you're running through a field and you're playing Zelda and the number of trees on that one field
There's a designer out there that's like, I'm putting a tree here. I'm putting a tree here, putting a tree there. The grass is going to look this way. Nobody cares about those. Those are like these elements that make the world feel real. But if I have seven trees and your game is generated with eight trees, it's not going to matter. It doesn't like. And so I think you have that element. But also for the consistency part, I could see like this new type of paradigm where.
Bilal Tahir (36:25.187)
Mm-hmm.
Bilal Tahir (36:30.062)
Mm-hmm.
Bilal Tahir (36:35.777)
Hmm.
Pierson Marks (36:44.626)
you create a canvas that you have images, you go text to image. And so let's say in the same like castle example, you prompt kind of a castle on a hill and you pretty much as the narrative designer, you...
generate multiple images and you place them in this virtual world. And so you have Genie 3, which is generating the world, but it also has coordinate locations of where images should live in that world.
And so today, Genie 3, it has relativity. Like it knows how far that wall is from that player and it knows if your hand touches it. But there's no way to influence that beyond that prompt. But imagine if Genie 3 had some idea about coordinate location where you could actually drop in, like, hey, at this one point on that world, I need a castle that looks like this. And here are the five images. And so you have the consistency where it's like, this is the image as the base.
Bilal Tahir (37:25.966)
Hmm.
Bilal Tahir (37:34.318)
Hmm.
Bilal Tahir (37:45.434)
Hmm. I see what you're saying.
Pierson Marks (37:47.659)
And then you have a world that can be built around it. Maybe on that field of Zelda, here's the trees and how they should look. Let me put a tree on this location, this location, this location. Genie 3, you fill in the details. I don't want to have to make a 3D model of how each little leaf looks. But like, here's the image, here's all the images. And I don't know. I mean, that's kind of what I'm thinking video game design and CGI for movies could look like.
Bilal Tahir (38:04.11)
Hmm.
Hmm.
Bilal Tahir (38:13.516)
Yeah, yeah, no, I agree with you. I think there'll be stuff where we don't care if there are seven trees or eight trees, but the appearance of the character, we probably do care about, and we want to make sure that he looks exactly the same every time. So yeah, you're probably right. I think I agree with your approach. Designers might spend a lot of time making the critical static assets, but then
Pierson Marks (38:20.586)
All right.
Right. For sure.
Mm-hmm.
Bilal Tahir (38:37.666)
they'll drop those assets in the world and be like, yeah, the rest of the stuff can flow a little more, there can be a confidence interval around it. And we got the critical elements and then the storyline. As long as we get that tight, everything else can flow. It could also be that in some games, imagine like a Minesweeper type, it will be a feature, not a bug, where, yeah, you know,
Pierson Marks (38:51.657)
Right.
Bilal Tahir (39:01.358)
It was different, like the tree became like a different species or whatever. And you're like, yeah, I'd rediscovered treasure in there or something. So it's super interesting. I haven't played games in such a long time, but I was super into it. For me, the world building aspect and the leveling up aspect was so interesting. And I feel like...
Pierson Marks (39:08.34)
Yeah.
Ta-da.
Pierson Marks (39:20.362)
Mm-hmm.
Bilal Tahir (39:20.91)
There will be so many cool aspects of it. Actually, this is a quick tangent. I used to watch a lot of anime. I don't do it as much, but I still have a soft spot for it. And I was in Paris and I met this guy who was super into anime, and I was like, dude, yeah, there are no good animes anymore. He was like, no, you gotta watch it. So he gave me a couple of recommendations, and one of them was Solo Leveling. I don't know if you've heard of this one. So I was like, all right, I'll check it out, and it's a super cool anime. It's basically
a world where you have levels. So you're a hunter, and there's a dimension, and you have to beat monsters. So very game-like, but it's real life, where there's like monsters and they appear in dungeons, and there's a dungeon boss, and the only way to close the dungeon is to defeat the boss. And so there are these ranks of these people, and then this guy has the ability, like I'm not gonna give a spoiler, but he was like the lowest of the low and somehow he gets the ability to
Pierson Marks (40:00.81)
Hmm.
Pierson Marks (40:10.218)
Bye.
Bilal Tahir (40:19.968)
level up, which is not allowed in this world. Like you're supposed to just have your rank. And so he starts getting stronger. And so he gets these quests where he's like, if I do a hundred pushups, I get stronger, which is like, well, yeah, but, but it's like very metric based. So he's like, if I do a hundred pushups, I, my, my strength goes up by like a little bit. And so, which is interesting. Cause I mean, I feel like
Pierson Marks (40:22.698)
Mm.
Pierson Marks (40:32.938)
All right.
Pierson Marks (40:36.874)
All
Bilal Tahir (40:39.608)
If you gamify life like that, if you had metrics, I feel like people would be super into it. Like, if I read a book, my intelligence goes up a bit. So, you know, it incentivizes you. So it's a super cool anime. And I wonder if we can take that concept and build these worlds where you're like, I'm going to learn math.
Pierson Marks (40:44.106)
Mm-hmm.
Pierson Marks (40:47.593)
Right.
Bilal Tahir (40:56.108)
You know, I go through Calculus 101, you know, and I get a meter or whatever, like I get awards and stuff like that. There's so much scope, I think, for gamifying a lot of this stuff. Anyways, check out Solo Leveling, anyone who's interested in anime, it's super cool. Yeah.
Pierson Marks (41:01.0)
Right. Right.
Pierson Marks (41:07.432)
that
Yeah, gamification of life is like that. Like that's why capitalism is a great economic model, where you've kind of gamified it. You know, there's problems with it too. But it's the best thing that we've discovered, where, you know, you try to achieve and level up in society and have the ability to move up and down. And whether it's like strength, intelligence, wealth, these things where you actually have the mobility, unlike the caste systems and the systems of the past.
Bilal Tahir (41:33.73)
Right, yeah.
Pierson Marks (41:39.338)
You just couldn't, so it's cool.
Bilal Tahir (41:41.538)
Yeah, no, absolutely. I mean, having some sort of a measure to give you feedback about how you're doing, I think it's so key. You know, we talk about it for startups or for people, you know? Yeah. So yeah, Genie 3 is super interesting. I guess, unfortunately, one thing with Google with these products is they never release them to the public. So I don't know if we'll get it, but you know, somebody was saying, like Imagen, last year they released Imagen, there was a research preview, and they were like,
Pierson Marks (41:48.554)
Mm hmm. For sure.
Pierson Marks (41:52.701)
absolutely.
Bilal Tahir (42:09.838)
this is too dangerous, and now we have Imagen 4 via API. So who knows, maybe Genie 4 we'll get access to.
Pierson Marks (42:09.93)
Mm-hmm.
Right. Right.
Pierson Marks (42:17.419)
Totally, totally, I hope so. I know there was Oasis that you could play too, which is like a Minecraft version that was like GDT level, but it was Minecraft generated in real time, and you could actually play it in your browser. That's pretty cool. It looked like you were on drugs playing it. So the world was kind of nuts.
Bilal Tahir (42:25.75)
Really? nice.
Bilal Tahir (42:30.476)
Yeah, yeah. That's amazing. Yeah, yeah. And we've only touched on the gaming aspect, but I think there's a lot of other economic impact. People talk about the robots. I think one of the big things that's super interesting is simulating robotics. I think one of the big problems in robotics has been the lack of
Pierson Marks (42:41.407)
Right. Totally.
Bilal Tahir (42:52.108)
data to actually train RL, train robots, because there's only so many factory hours you can record and stuff. And so one of the things people like Jim Fan and others in this space have said is, if we can simulate robotic factory interactions and train robots in a simulated world, and then take millions of years of training and
Pierson Marks (42:53.163)
Right.
Bilal Tahir (43:13.684)
speed run that, then you'll actually be able to train robots. Because if you think about us human beings, it took us millions of years to learn how to walk and crawl. Like this is why, you know, if you look at it from an evolutionary standpoint, this is why ChatGPT writing Shakespeare came before a robot learning to walk properly, because, you know, we've used our brains, our higher functions, for the last, um, 100,000 years, but it took us 2 million years to learn to walk. And so from that point, it makes sense that that's the last boss. But if we can synthetically generate the data in these worlds, we can speed run that and get drones, get robots, humanoid robots and stuff in the factories doing the labor for us, which I think, if that happens, is going to be a key economic driver. Because that way you can create amazing
Pierson Marks (43:41.546)
Right.
Pierson Marks (43:46.859)
Totally.
Bilal Tahir (44:07.722)
goods for basically dirt cheap.
Pierson Marks (44:10.92)
Yeah, and this is like one of Nvidia's big value props with the Omniverse type of simulated worlds, where they're creating digital twins. Like digital twins in general just could be so cool for planning purposes, like digital optimization of factories, of cities, of moon bases, all of these really cool things that you want to be able to optimize. Like when you build a factory today, you get an architect and you get some engineers together, like how are we going to build out our assembly line? And you make a best guess decision.
Bilal Tahir (44:13.964)
Yep. Right.
Pierson Marks (44:40.864)
and those decisions are very permanent and you're not going to be able to iterate on the failures of, you know, your assembly line until one, you either take down that assembly line and have some downtime, understanding where the bottlenecks could be in that pipeline of like actual development or until you build a new factory. And that's expensive from like economics point of view. But if you can simulate, hey, we want to create X number of cars or robots or houses in a factory, like how
Bilal Tahir (44:55.79)
Thank you.
Bilal Tahir (45:03.351)
Right.
Pierson Marks (45:10.956)
can we simulate that and understand physics? And that could also be procedurally done too. I think you could do like a hybrid approach where it's like some procedural physics, like we can do today, know, video games and put a robot into a pretty good physics engine and like simulate, but it's hard to create the accurate environments and all the nuances around the edges.
Bilal Tahir (45:12.643)
Right.
Bilal Tahir (45:32.014)
Yeah, super, super fascinating. And if you take that to its logical conclusion, this is where I put my tinfoil hat on, it basically gets you the simulation hypothesis, right? Maybe our world is running on Genie 6 or whatever. We're just like characters in it, so who knows?
Pierson Marks (45:41.268)
Alright.
Pierson Marks (45:48.53)
Yeah, totally.
No, it's wild. And I know we had some of the video stuff on here too, but we're also at 45. So I mean.
Bilal Tahir (46:00.706)
Yeah, we can talk about the video stuff, I guess, next week. Kind of leaving a teaser, I guess, the only thing I'll say is, video is having its moment, right? You know, it's been coming for a while, but we're finally getting videos, like five-second videos, being generated basically at the price of an image. Wan released their 14 billion parameter model, which was really good, and it was costing 10 to 15 cents to create one video. And then somebody distilled that to a 5 billion parameter model, which costs 0.025
Pierson Marks (46:06.6)
Yeah.
Bilal Tahir (46:30.64)
dollars for a five-second 540p video, and 720p is slightly more, like 0.03 or 0.04, which is insane. I mean, think about it, you're basically generating a five-second video, I think it was in five seconds, so basically real time, and it's costing you cents.
Pierson Marks (46:35.966)
Right.
Pierson Marks (46:42.538)
to learn.
Pierson Marks (46:54.708)
Mm-hmm.
Bilal Tahir (46:54.798)
And so even if Genie 3 might not be there, you can create basically a movie, yeah, a two-hour movie, you know, basically for a handful of dollars. It won't be a good movie probably, but hey, I mean, you can do it, and it's only going to get better. Right. So I don't know. I think we're living through an insane period where I genuinely believe, I think within a year or two, we're going to be able to create Netflix shows and movies, you know, from our laptops, or not from our laptops, but using these APIs and stuff, you know, and think about that.
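As a rough check on the numbers quoted above, the arithmetic can be written out. This is a back-of-the-envelope sketch that assumes the per-clip prices mentioned in the conversation (0.025 dollars at 540p, and the upper quoted figure of 0.04 dollars at 720p), assumes every clip is exactly five seconds, and ignores retries, editing, and audio. `movie_cost` is a made-up helper name.

```python
import math

CLIP_SECONDS = 5
# Per-clip prices as quoted in the conversation; treat them as approximate.
PRICE_PER_CLIP_USD = {"540p": 0.025, "720p": 0.04}


def movie_cost(minutes: float, resolution: str) -> tuple[int, float]:
    """Return (number of five-second clips, total generation cost in dollars)."""
    clips = math.ceil(minutes * 60 / CLIP_SECONDS)
    return clips, clips * PRICE_PER_CLIP_USD[resolution]


clips, cost = movie_cost(120, "540p")
print(f"{clips} clips, about ${cost:.2f}")
```

Under those assumptions a two-hour film is 1,440 clips, which comes to roughly 36 dollars at 540p and under 60 dollars at 720p, so tens of dollars rather than a handful, before counting any regenerated takes.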
Pierson Marks (47:23.176)
Right, right.
Bilal Tahir (47:24.812)
Like when I can just generate my own animation or you can generate your own movie, I think that's been like a dream of mine for a number of years now. So yeah, very exciting. So yeah.
Pierson Marks (47:32.97)
Totally, super exciting. Cool. Okay, well, episode 8 of Creative Flux. Thanks everybody for joining. Bilal, we'll chat next week and talk a little bit more about video. Cool.
Bilal Tahir (47:39.533)
Yeah.
Bilal Tahir (47:45.356)
Yeah? Alright, take care guys. Bye.
