What is Clawdbot (Moltbot/OpenClaw)?

Pierson Marks (00:00)
Hello.

Bilal Tahir (00:01)
Hello,

Pierson Marks (00:02)
Episode 29.

Bilal Tahir (00:05)
nice, yes.

Pierson Marks (00:07)

Last week we talked about Remotion, and that one I think was our most popular episode to date. Yeah, a lot of listeners. So if you want to learn how to use AI to make motion graphics and videos, that is the episode to watch.

Bilal Tahir (00:16)
Wow.

Yeah, that was a fun one. Remotion skills I think are so powerful. So definitely check it out, play around with it, build cool stuff.

Pierson Marks (00:31)
Yeah,

totally. Well, if you're watching or listening to Creative Flux for the first time, we are your hosts.

And we talk about AI, generative media, images, video, music, world models, some coding, like all of this stuff, I think we've kind of branched out. I mean, it's generative media focused for sure, but then we just talk about like the cool things that interest us in the world of AI.

Bilal Tahir (00:54)
There's just so much happening, and a lot of it obviously overlaps and intertwines, because one thing enables the other. Like, we talked about Remotion, and that's a generative media tool that we know and love, but the reason it blew up is this whole agentic programming paradigm, you know, with Claude Code and Ralph and stuff. So it all comes together. I feel like there's a superpower in knowing

a little, on a high level, about what's going on in all these different areas, especially as a creative person, because most people don't know what tools are out there. It's like having a sledgehammer but not knowing there are other tools, other nails you could be driving in.

Pierson Marks (01:26)
Alright.

Yeah, I know, it's true.

So listen to Creative Flux if you want to be among the first to have the power tools and leverage a lot of this stuff. There were some things I thought were pretty cool this week that I wanted to talk about before we even get into Clawdbot, which is what took the world by storm. I know we've talked about skills, and this is kind of a related thing. I wanted to bring it up because we talked about it a few days ago.

But if you're listening to this and you're using agent skills, the problem agent skills have right now is that they don't version correctly. So...

Bilal Tahir (02:11)
Hmm.

Pierson Marks (02:13)
I was just thinking, it'd be really, really helpful if we had a companion package manager or something out there, so that when you publish a package, Remotion is a great example, when you publish Remotion and allow other people to use it, you can also publish a skill alongside your npm package

that is versioned alongside your release, so that when you push a new version with changes to the API, you could, and probably should, also update the skill to use the new APIs and the new stuff. This is one of the biggest problems people are facing right now: the LLMs, for example, try to write code or do something in a way that's outdated, and

you're like, no, no, no, why are you doing this? Use the new version. It's just so ingrained in the training data to use the old stuff. That's where skills come in handy, because they can provide new context. But sometimes the skills also get out of date, and now you have this other thing you have to version, manage, update, and keep in parallel. So yeah, somebody's got to create this or figure it out.

Bilal Tahir (03:26)
Yeah, I mean, I remember you mentioned this and it just made so much sense. Because we already solved code versioning through npm packages, so why reinvent it? We have node modules, you know; when you install a

package, it gets updated. So maybe .skills or .claude, whatever it is, should basically just be a node module in a sense, right? And so you should just update it. And if you're the creator of an npm package or library, it's so easy, because it's just a CI/CD step: people automate changelogs already, so every time you make a change you might as well say, okay, look at all our code changes and now update the skill automatically, and you just make it part of your process. So...
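
A minimal sketch of what that publish-time step could look like, assuming a hypothetical `scripts/sync-skill-version.ts` and a skill file at `skills/my-library/SKILL.md` (neither path is any kind of standard):

```ts
// scripts/sync-skill-version.ts
// Hypothetical sketch: keep a SKILL.md's frontmatter version in lockstep with
// package.json. Could run as an npm "postversion" hook or a CI step before publish.
import { readFileSync, writeFileSync } from "node:fs";

const pkg = JSON.parse(readFileSync("package.json", "utf8"));
const skillPath = "skills/my-library/SKILL.md"; // assumed location

const skill = readFileSync(skillPath, "utf8");
// Rewrite the `version:` line inside the YAML frontmatter (assumes one exists).
const updated = skill.replace(/^version:.*$/m, `version: ${pkg.version}`);

writeFileSync(skillPath, updated);
console.log(`Synced ${skillPath} to v${pkg.version}`);
```

You could wire that up as a `postversion` script in package.json so the skill gets restamped every time the library version bumps; regenerating the skill's actual instructions from the changelog would be a separate, bigger step.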

Pierson Marks (04:05)
Alright.

Bilal Tahir (04:07)
Yeah, I think people should do it as publishers. And then once we start seeing this, hopefully we can come up with a standard way of pulling it in, rather than the current hack, which I think is a decent middle ground: this npx skills thing that Vercel created, where you can find skills and stuff.

Pierson Marks (04:24)
Right, yeah. If I just run npm install remotion, it should give me the option: would you like to install this skill too? And then you say, yeah, install the skill, or bundle the skill in the package. And then you have all these prompt injection issues, you know, like if you have a malicious package. Because at least with packages...

I mean, a package could have malicious code in it too, but it doesn't have access to the agent. Now, if you insert instructions into the skill, that's more interesting, because the agent has access to your computer, whereas your packages don't have access to the entire operating system and stuff.

Bilal Tahir (05:00)
Yeah.

But I feel like that's not a new thing. I mean, you can ship malicious code in your npm package. That's actually probably, I would argue, the weakest, most fragile part of our web infrastructure, because it's so easy to have a malicious dependency that a package uses, that another package uses. And because it's all nested down, we've seen these crazy exploits.

Pierson Marks (05:04)
Yeah.

Totally.

Bilal Tahir (05:25)
So I think that's just part of the game. And npm, I think we talked about it, there was a huge exploit that caused npm to really tighten up their process and come up with a review process whenever you're shipping new npm packages or something. So yeah, it's just going to be something we have to be aware of.

Pierson Marks (05:27)
All right.

Totally.

It's super interesting. I was reading something the other day. There's a UCLA alumnus I got to meet a few months ago; he sold his company to Microsoft twenty-something years ago and it became Microsoft Outlook. He was saying that acquisition process is really interesting. One of the parts is due diligence, because any acquirer is going to do due diligence. They interview the whole founding team, they make sure you actually know what you're talking about, they go through the licenses of all your packages.

Are you competent? Did you actually write this code? Does the team know the code really well? It's like an interview process, which is funny. Do all the packages you've installed have the right licenses? Any malicious code? All these things. Imagine you're a little startup and you get acquired by Microsoft, and somewhere along the way there was some malicious package just waiting there, and then boom, now all of Microsoft is compromised. It was interesting hearing him talk about that. But he said, nope, we audited ourselves beforehand, so all good.

Bilal Tahir (06:51)
Yeah, yeah, maybe there's a meta-skill to audit the skills or something. I know Vercel did something like that with find-skills. They created a skill called find-skills, which is a skill to then find the right skill. It's crazy.

Pierson Marks (06:58)
yeah.

Yeah, find-skills.

And Anthropic has a create-skill skill. So you can look up find-skills and you're like,

yeah, this is really cool. I mean, if you're listening to this, this is a process I went through this week. For JellyPod, I wanted to create a bunch of new competitor pages. We wanted to compare JellyPod with other competitors, candidly: where does JellyPod shine, where does the other player shine, and when should you choose us versus a competitor. What I did was go into Claude and ask Claude to research JellyPod, starting with the website.

So look through all of our code, go to these other competitor websites, look at what they're doing, look at their pricing, look at both of our pricing, and then essentially I said, okay, now create this template of how we want to do the competitor analysis: we want a title, we want these sections, we want this table, here are the pricing comparisons, everything, and then put it into a CMS. And so I did this, and

afterwards I was like, huh, this is super interesting. I'm probably going to want to do this sort of analysis again. I was iterating with Claude over the course of two hours: no, we want to focus on these types of things; you're talking about us wrong; you want to be more truthful in the competitor analysis part. So it was an iterative process that took me some time, but then you can scale it. What if I could just collapse this sort of competitor analysis

and copywriting into our own unique skill for JellyPod, so that the next time I want to add a new competitor to this competitor table, I can just say, invoke this competitor analysis skill, do the competitor analysis on this website, and it'll do it and update our CMS. And it was really cool. It worked. I just created a skill, and this one skill used other skills, and it was able to add a post into our CMS the right way.

It didn't even go through MCP. It actually called the APIs, because the MCPs were exploding our context window, so it just called the API directly. And it was actually pretty cool. So you can create skills to capture these processes, SOPs, and procedures for your agents to follow.
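
For a rough idea of what that looks like, here's a sketch of the kind of helper a skill could tell the agent to run, hitting the CMS over plain HTTP instead of loading an MCP server into context. The endpoint, payload shape, and `CMS_API_TOKEN` variable are all made up for illustration; substitute your CMS's real API.

```ts
// scripts/publish-competitor-page.ts (hypothetical)
type CompetitorPage = {
  title: string;
  slug: string;
  sections: { heading: string; body: string }[];
  pricingTable: Record<string, string>;
};

async function publishPage(page: CompetitorPage): Promise<void> {
  // Plain HTTP call: no MCP server, so nothing extra sits in the agent's context.
  const res = await fetch("https://cms.example.com/api/pages", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${process.env.CMS_API_TOKEN}`, // assumed env var
    },
    body: JSON.stringify(page),
  });
  if (!res.ok) throw new Error(`CMS rejected the page: ${res.status}`);
}

publishPage({
  title: "JellyPod vs. Competitor X",
  slug: "jellypod-vs-competitor-x",
  sections: [{ heading: "Where JellyPod shines", body: "..." }],
  pricingTable: { JellyPod: "$X/mo", "Competitor X": "$Y/mo" },
}).catch(console.error);
```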

Bilal Tahir (09:12)
Yeah, no, 100%.

I think that's such a good way to compound your learnings. I've started doing that too: if something gets repeated, I'm like, just make a skill out of it. And I've found this paradigm where, skill or not, I feel like whenever you're chatting with an agent and doing something,

Pierson Marks (09:22)
Yeah.

Bilal Tahir (09:33)
I always basically start with a doc now, because having that persistent memory, even though the agent will have its own plan mode or whatever, just having that there, where you can see it easily and iterate on it, means you don't have to worry about context or anything, and you can move between Codex or Claude Code or whatever. I think it's so powerful. So, like...

Pierson Marks (09:52)
Yeah, can you walk us through what you're doing? I'm curious, because I don't do this. I use plan mode right away, because it creates a markdown file for you that you can go in and edit. So yeah, I'm curious, what's your process?

Bilal Tahir (10:00)
Right.

Yeah, I mean,

for me, it depends on the feature. If it's a small feature, you can just knock it out quickly, but if it's a bigger one, the thing with plan mode is that it goes into .claude, so it's in a different directory where it stores that plan. I like to just have a PRD.md file in the root. I'll create it, and it's just easier for me to see what it's doing, and I really like to

chat with it and let it update the doc and stuff. And then sometimes I like to bring in other agents. So I'll get Codex to review the PRD: what do you think? And then I'll say, a grumpy old senior engineer looked at your plan and had this feedback. So I literally go back and forth between them, and it improves the plans. Just doing that, for me personally, I think has become

Pierson Marks (10:43)
Alright.

Bilal Tahir (10:51)
much easier. And also, Ralph kind of inspired this: if it's a longer one, I'll immediately start adding checkboxes. What that allows me to do is see the progress, and then I can update and iterate, add more items, and see how it's doing. So it's kind of this weird Ralph Wiggum hybrid approach where it's not completely AFK, but I'm actually watching it check boxes. And sometimes, when it runs out of context and I feel like the performance is going down,

maybe I'll compact it or start a new terminal, and then I just say, there's a PRD. Because I don't know how you'd refer back to another plan; if it creates some random three-word hyphenated .md file and you start a new session, unless you tell it that's the plan, it doesn't remember. You have to remember that. So that's why I like the root-file approach. And then I can delete it when I'm done.
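
As a small illustration of the checkbox idea, here's a sketch that counts checked versus unchecked items in a root `PRD.md`; the filename and the `- [ ]` convention are just the assumptions described above, not anything the tools require:

```ts
// scripts/prd-progress.ts (hypothetical)
// Count "- [x]" vs "- [ ]" checkboxes in PRD.md to eyeball how far the agent has gotten.
import { readFileSync } from "node:fs";

const prd = readFileSync("PRD.md", "utf8");
const done = (prd.match(/^\s*[-*] \[x\]/gim) ?? []).length;
const open = (prd.match(/^\s*[-*] \[ \]/gm) ?? []).length;

console.log(`PRD progress: ${done}/${done + open} tasks checked off`);
```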

Pierson Marks (11:36)
Right, right. That's interesting. There's this plugin that we don't have yet. I think I'm going to add it to our main project, but it takes a little bit of... it's not an automatic system; it takes us diligently doing some stuff. It's called the compound engineering plugin. I don't know if you saw this. Have you seen it?

Bilal Tahir (11:59)
That sounds cool.

Pierson Marks (12:00)
Yeah,

so there's a company called Every. Dan Shipper is the CEO and founder there. It's a really cool company; I think you'd like it. It's a...

I'm going to butcher the description of the company. They build a lot of cool tools; they have like six different products, and with a subscription you get access to all of them. They also have a really unique blog, and they do bootcamps and workshops all the time. Like today there's the v0 workshop downtown that they're doing with Guillermo; the podcast is with Guillermo. It's Claire Vaux or Voo or something. So, really cool company. They get access to all the models before the

public does, and they run benchmarks and build all these products. But they built a plugin called compound engineering, which changes how Claude makes each feature.

It's like persistent memory for your code base. At the end of a coding session, it kind of compounds everything you've worked on, architecture diagrams, references, everything, and puts it into its own directory, correctly formatted as Markdown with frontmatter so you can easily search it, just like skills use frontmatter, all in

Bilal Tahir (13:07)
Interesting. Yeah.

Pierson Marks (13:13)
your root. And so, you know, right now I'm working on this big thing with agents in JellyPod, and I'm making strategic architecture decisions where I really had to nudge Claude to go a different route. I had to be like, don't do this, please. And then it would do it again. I'm like, no, no, no. It was blowing up the context window; I had to clear it over and over, and it just kept forgetting. This is the paradigm where you write a lot of this stuff into

organized, easily searchable files and folders, where you can see architecture documents and API standards, and make it easier for every new piece of code to be written by Claude without forgetting.
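
As a sketch of what "organized, easily searchable" can mean in practice: if those notes are Markdown files with YAML frontmatter, a few lines can filter them by tag. The `docs/memory/` folder and the frontmatter fields are assumptions, not what the plugin actually does; `gray-matter` is an npm package for parsing frontmatter.

```ts
// scripts/find-memory.ts (hypothetical)
// Assumes notes like:
//   ---
//   title: Agent pipeline architecture
//   tags: [agents, architecture]
//   ---
import { readdirSync, readFileSync } from "node:fs";
import { join } from "node:path";
import matter from "gray-matter";

const dir = "docs/memory"; // assumed location
const tag = process.argv[2] ?? "architecture";

for (const file of readdirSync(dir).filter((f) => f.endsWith(".md"))) {
  const { data } = matter(readFileSync(join(dir, file), "utf8"));
  const tags: string[] = data.tags ?? [];
  if (tags.includes(tag)) {
    console.log(`${file}: ${data.title ?? "(untitled)"}`);
  }
}
```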

Bilal Tahir (14:01)
right.

Yeah, no, I mean, this is something I was actually thinking about the other day, because we have rules, right? In a code base you can add rules like, oh, we like to use ES modules, we like async/await. That's rules, and we have those. But there's something else I was thinking about; my word for it was a mental model, similar to what you were saying. When you're a developer and you come into a code base, you develop a mental model of what the code base is. So I might know, oh, I need to upload this file to the cloud,

and I kind of remember we created a helper function to upload files to

Supabase or whatever, right? So maybe I should look in the utils folder and see if we can reuse that component. One of the things I've found where Claude Code especially can be bad is that it's very greedy: it'll just try to create something from scratch rather than looking for what's already there, and I have to nudge it, make sure we reuse stuff, and we've added rules for that. So kind of to your point, and maybe this concept is already out there, you almost build a graph of the code base. You know, utils, helper functions, this kind of thing. It's not

every file, because that would be overwhelming, but a high-level map of the way a developer thinks. They have these hazy memories: the utils folder, upload functions go there, hooks go here, right? This is our pattern. So maybe you create these kinds of memories or whatever and map them.

Pierson Marks (15:19)
Totally. Yeah,

No, absolutely, I think so too. And I think you could make the argument... I think people say, well, Claude will just go and explore the code base at the beginning and all that. But that doesn't really hold up, because it's going to be an inefficient exploration. It's just not going to remember, because every time...

Bilal Tahir (15:34)
Yeah, and it'll do that every time you open it. It shouldn't. You know, that's the cool thing about it. I mean, I don't relearn our entire JellyPod code base every day, right? So, to your compound engineering point.

Pierson Marks (15:43)
Right, right, exactly, exactly.

And there are just going to be things that are more high level, where you're like, no, I just need you to know this going forward. And so, as I'm developing, and it's not even at the user level, which is cool, it's at the product level, I can compound the knowledge of my architecture decisions and why I made them, put it into these folders,

correctly named, correctly organized, and then, when you're in the code base, you're able to leverage all of it. So I don't have to explain. Because I think the biggest problem I have as an engineer sometimes, and I think every engineer has this problem, is communication. There's a lot more clarity in your head than in what gets spoken. And so people fall into the trap of, oh,

Bilal Tahir (16:29)
Hmm. Yeah.

Pierson Marks (16:36)
does the other person actually understand? It's a very lossy communication medium when we're chatting. And so if you're with Claude, sometimes Claude does understand what you're saying in your brain because it has all the context of your code base. So if you can just give a little sentence, it'll do some exploration. It actually does read your brain. But another developer won't read your brain. Another Claude instance won't necessarily read your brain. But if you save that into a markdown file,

Bilal Tahir (16:58)
Yeah, it's super powerful. And maybe this is a good

jumping-off point to talk about Clawdbot, or Moltbot as it's called now, because I think the reason it blew up was exactly this context and memory we were talking about. So, taking a step back: what is Clawdbot,

Pierson Marks (17:04)
Do it. Claude.

All right.

Bilal Tahir (17:15)
now called Moltbot? Some guy, an amazing 10x developer apparently, created this project called Clawdbot about 25 days ago. And it basically went from zero to almost 30,000 GitHub stars, which is insane. I don't know if it's the fastest star growth ever for a project, but it is crazy. And it's just like, it literally...

Pierson Marks (17:29)
All right.

Bilal Tahir (17:42)
Usually I feel like I'm on top of things; I'll see something on Twitter or whatever. But people were talking about Clawdbot for like two days before I was like, wait, what? And suddenly it was a thing. Where did this come from? I missed the announcement because it happened so fast. So what is Clawdbot? Clawdbot is essentially, and it's so simple, very simple, but the idea is brilliant, one of those ideas you look back on like, duh, it makes sense. Because everyone,

with ChatGPT, every one of these companies, they have their own way of managing your memories. You go to ChatGPT, you chat with it, you say, I like this, and it'll create a memory, update the memory, keep it there. Claude does the same. So you have all these fragmented little applications that you interact with, and you have to kind of re-teach each of them: I like lasagna, not pizza, whatever, right? But Clawdbot flips this.

It says, hey, you know where all your memories live? On your computer, because you're always on your computer. So let us build this application on your desktop, and

we'll use the memories on your computer to interact with Claude, with Codex, with ChatGPT, whatever, and use all the context you have. So, to your point about compound engineering, all your data exists in one place, and you don't have to worry about repeating this process with everyone. Instead of talking to my friend Joe and my friend Jack and having the same conversation twice, I'm like, let's just

Pierson Marks (18:52)
Right.

Bilal Tahir (19:05)
have a party and bring all my friends together, and then we can have the conversation once and everyone can listen. The second thing, which I think was brilliant, was instead of building their own ChatGPT-style chat UI,

and everyone has a chat UI, I mean, we have a chat UI, he said, you know where people are already messaging? Telegram, WhatsApp. So let's just hook into those interfaces. So with Clawdbot, you use one of these interfaces like WhatsApp or Telegram, you chat with Clawdbot through that, and you have memory. These were the two big insights, and they're so simple:

Pierson Marks (19:33)
Yeah.

Bilal Tahir (19:37)
you add them up and suddenly everyone's like, yes, this is what I want. Persistent memory. The context problem, I wouldn't say it's solved, some people say AGI is here, it's solved, I don't think it's solved, but this is a big step in that direction. And then you have a lot of people who are not developers, who are not into this stuff, but they understand WhatsApp and they can install a macOS app. And suddenly it's kind of like the Claude Cowork moment for them again in a week: they're like, yes, I can download it, suddenly it remembers, and I can open WhatsApp

Pierson Marks (19:39)
app.

Bilal Tahir (20:05)
and chat with it.

Pierson Marks (20:07)
Totally. No, it's super, super cool. There are three directions you can take this. One of them, I want to call this out. I don't know, maybe you'd know this: have you ever used Obsidian? The note app? Right. Yeah.

Bilal Tahir (20:16)
I have heard of it. I know people are fanatical about it. It's one of those, you're the super-productivity, second-brain type of person. You'd love that.

Pierson Marks (20:23)
It's kind of like, yeah,

it's the Obsidian versus Notion crowd, I think. Obsidian is much more hackery; it's open source. I'm on the Obsidian side, but I don't actually use it. I'm an Apple Notes diehard. I literally just have a note, and it's super simple. I don't need anything; I'd use TextEdit if it were as easy to open. But yeah, Obsidian is essentially a

Bilal Tahir (20:28)
Yeah.

Hahaha.

Pierson Marks (20:46)
basic note-taking app where all the notes are stored in Markdown files. And Markdown is the perfect medium for LLMs. I'm just like, there has to be a combination here, because so many people use Obsidian to connect things. The thing is, you can create a graph of connections in Obsidian, so one note connects to another. And I don't know if they use an LLM now to do that automatically, with embeddings or something, but

having that index of all your personal thoughts, in addition to all your other data that's everywhere, like in Clawdbot, would be really helpful. If I'm writing notes down all the time, to-dos in Obsidian, that seems like a really natural way to store memories. Everybody's trying to think up these complicated memory architectures and everything, but if you just have a bunch of Markdown files that are correctly named and correctly

Bilal Tahir (21:13)
Mm-hmm.

Pierson Marks (21:34)
organized. That's all you need for memory. I don't know. That's my opinion. I don't know if you need anything special.

Bilal Tahir (21:39)
Yeah.

Yeah, I mean, it's fascinating. The embedding approach, which I subscribed to for a long time, I thought that was enough. Turns out the problem with it is there's just so much conflicting information and the context blows up, with the whole search-for-the-five-most-relevant-documents thing. So the approach I've seen a lot of people take is, rather than creating embeddings of all your documents, doing a similarity score, pulling in the top-N documents, and putting them in context, what you do

is take a document and distill, basically, facts from it. You're like, this is a document about how I went to this restaurant, tried this dish, and liked it. You create a memory: Bilal went to this restaurant and he loved this dish. And then you store just that one line instead of the whole note I wrote. That's a memory. The other

tricky thing is recency. There's something like a recency score, and then updating memories and deleting memories. Recency score: say I go to that restaurant again and I'm like, well, actually this time was a dud, so I actually hate this restaurant now. What you want to do is update the memory store and say, this is a more recent memory.

So the recent one beats the previous one; I probably want to update the memory, because the recent memory carries higher value, and so you update it based on that. And then the third thing is forgetting memories. This is also tricky, because memories build up and you can't just have 50 million of them. Humans do this too; we forget stuff to keep ourselves sane. So you have to decide on what basis you delete a memory. So, to your point, yeah, I agree, there's a lot of over-

Pierson Marks (22:52)
Mmm. Alright.

Mm-hmm.

Bilal Tahir (23:13)
engineering going on right now, but I do think there's something between having just plain files and this over-engineered architecture. There's probably some hybrid approach where you do something simple and, in hindsight, it'd be like, yeah, that makes sense, we should have just done that. But we haven't found that sweet spot yet.
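
A toy sketch of the pattern described above, with one-line memories keyed by subject, recency-based updates, and pruning of stale entries; the field names and the six-month cutoff are invented for illustration:

```ts
// memory-store.ts (illustrative sketch)
type Memory = { subject: string; text: string; updatedAt: number };

const store = new Map<string, Memory>();

function remember(subject: string, text: string, now = Date.now()): void {
  const existing = store.get(subject);
  // Recency wins: a newer observation about the same subject replaces the old one.
  if (!existing || now >= existing.updatedAt) {
    store.set(subject, { subject, text, updatedAt: now });
  }
}

function forgetStale(maxAgeMs: number, now = Date.now()): void {
  // "Forgetting": drop anything that hasn't been touched recently.
  for (const [subject, m] of store) {
    if (now - m.updatedAt > maxAgeMs) store.delete(subject);
  }
}

remember("restaurant-luigis", "Bilal went to Luigi's and loved the lasagna");
remember("restaurant-luigis", "Last visit was a dud; Bilal no longer likes Luigi's");
forgetStale(1000 * 60 * 60 * 24 * 180); // ~6 months
console.log([...store.values()]);
```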

Pierson Marks (23:24)
over.

It's

super interesting. I mean, if you're in academia right now, this throws me back to discrete math and graph theory, because you just have these connections. When you're talking about deprecating memories, you have a memory that has some half-life. Do you reset that half-life? Do you slow it down? Do you change it? If you have a core memory that everything points to, that everything relates to,

that probably becomes more important. You probably don't want to forget that core memory, but what about everything connected to it? How many degrees away is some memory node from a core node, and how relevant is it? You traverse the graph to find it. So there are all these graph traversal questions, where you traverse from core memories out to the edge nodes of memories. Do you do

a hierarchical, folder-like system where you have one core memory and then... I don't know, it's cool. I mean, there are different approaches.
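
One way to sketch the half-life idea from this exchange: a memory's weight decays exponentially with age and drops off with graph distance from a core node. The constants are arbitrary illustration values, not a real system:

```ts
// memory-decay.ts (illustrative sketch)
function memoryWeight(
  ageDays: number,
  hopsFromCore: number,
  halfLifeDays = 90,
  hopPenalty = 0.5,
): number {
  const decay = Math.pow(0.5, ageDays / halfLifeDays); // exponential half-life
  const distance = Math.pow(hopPenalty, hopsFromCore); // each hop halves the weight
  return decay * distance;
}

console.log(memoryWeight(0, 0));  // fresh core memory -> 1
console.log(memoryWeight(90, 0)); // one half-life old -> 0.5
console.log(memoryWeight(90, 2)); // old and two hops out -> 0.125
```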

Bilal Tahir (24:32)
This is the new LeetCode: breadth-first search, depth-first search problems, right here.

Pierson Marks (24:36)
Yeah, I can't wait till we start interviewing people eventually, and it'd be questions like that, but more researchy: how would you implement memory, that kind of thing. I hated LeetCode. Not to go off on that tangent, but my gosh, the worst thing ever. And I'm glad it's dead. It's finally dead, because you can't ask LeetCode questions anymore because of AI. It doesn't matter.

Bilal Tahir (24:50)
same. Yeah.

Yeah. Yeah.

I mean, I don't know if

it'll ever be dead, because I think, unfortunately, it's a decent proxy for coding IQ in a sense. But yeah, I hated it myself. And I will say one thing: data structures, particularly graph theory, I think are critical, because they inform so much of your architecture. I do think systems architecture is still key, until, hopefully, maybe Opus 5. Yeah.

Pierson Marks (25:07)
Yeah.

Data structures.

Yeah.

Yeah, system architecture for sure. That's what matters.

But I don't need to know these crazy LeetCode-hard questions. I'm just so dumb. And yeah.

Bilal Tahir (25:31)
Right. Yeah. But

memory is fascinating. I do think memory, context, is the big bottleneck, and people are trying to figure it out. We will figure it out. And...

It is fascinating. I think if you're working in this space, it's something to pay a lot of attention to. We've talked about it for JellyPod as well: people have podcasts and episodes, and what kind of data should we keep track of, what you're talking about and so on. We have different aspects: you have hosts who have personalities, who have backstories. So you're building this rich universe, almost, of recreated hosts with a personality and a voice

Pierson Marks (26:02)
All right.

Bilal Tahir (26:10)
and a knowledge bank, et cetera. It's not trivial at all, and there are so many different ways you can go about it. So very, very cool stuff.

Pierson Marks (26:17)
Right, totally.

Well, I just got pinged about this, literally what you were just talking about, writing the markdown files. I just got a text about how they were doing this. This is crazy. I guess Karpathy was talking about this. Was this inspired by that? Is that kind of why you do it?

Bilal Tahir (26:29)
Who's joining us?

I don't know which Karpathy post it was.

Pierson Marks (26:40)
Yeah.

But yeah, it was just interesting, about HiCow, I think, writing your own markdown files first and then having Claude work from them. I wonder here too, sorry to go back to this, though: I was doing it in Linear. I was putting the content in a Linear issue and then just saying, hey Claude, look at this issue, and it gets all the context from the issue and then goes out, makes its plan, and implements everything. Yeah.

Bilal Tahir (26:53)
Right.

Yeah.

No, it is cool.

I mean, I do that for content as well. It's very cool. I was telling you how I've been playing around with Suno; I've gotten kind of into Suno, making my own music and stuff. So I'll create songs, and then I have a markdown file for a song. And then I'll be like, okay Claude, let's create a skill for how to make a viral Gen Z song. So you create a skill of, like, okay, Sabrina Carpenter's Espresso, what makes this song catchy? And then you...

Pierson Marks (27:19)
you

Alright.

Alright.

Bilal Tahir (27:33)
kind of have this: let's make my song based on my subject matter, maybe I come up with a subject. And it's not plagiarism, it's more like inspiration: okay, I want to make a catchy song. And it's so cool, I'll create this and I'll watch the song get better. So it's almost a new way of doing stuff. I think people are still thinking of this as a coding thing, but it's a content thing too. Anything you want, you just start with a document, and the doc is basically a living doc. That concept of a living doc, I think, is so powerful,

Pierson Marks (27:48)
Bye.

I don't think.

Bilal Tahir (28:03)
where it's kind of like the Claude constitution or whatever, right? It's a thing you hone in on, you focus on it and update it, you add your thoughts, and it grounds what you're trying to do. And it forces you to, like you said, communicate: okay, this is what I want. And you can see when things don't work; you're like, okay, this is why, in the plan, I don't like this, let's change that.

Pierson Marks (28:11)
Right.

Totally. Super sick. And I know there was some other stuff, too, that I think we should talk about real quick. Google DeepMind partnered with Pixar and created a short story with AI. That was very cool. There was...

Bilal Tahir (28:35)
Yeah, yeah, very cool short story.

Pierson Marks (28:41)
Last week we didn't talk about all the open source stuff, like the Qwen TTS release, but that was pretty cool. Yeah, I mean, there's a Chinese open source movement right now. I don't know if people could feel it as much in the US, but I definitely think this past week was very big in the Chinese open source community. We had Kimi K2.5 come out with this agent swarm stuff, and it crushed all these benchmarks besides coding.

Bilal Tahir (29:00)
huge.

Mm-hmm.

Pierson Marks (29:08)
And then we also have Qwen TTS, which is really good. I don't know, what are your thoughts?

Bilal Tahir (29:13)
Yeah, I'm a huge fan of the Chinese labs. They offer cool stuff. I mean, there's a lot of China hate out there, and some of it is justified, but I really like

the stuff they put out; it's great for the community. So, talking about these two releases, I'll save Kimi 2.5 for a second because that's a whole rabbit hole, but Qwen TTS is cool. What I really liked about it is that it's an open source text-to-speech model, and they did a couple of things most open models haven't done yet. ElevenLabs, which you're familiar with, lets you design a voice, so you can describe a voice and it designs it, and they let you clone your voice, where you

give

it a voice sample, and then you can obviously use it for text-to-speech. Qwen released a series with a 0.6B model and a 1.7B model where you can design a voice (just with the 1.7B): you can describe, say, a gruff old cowboy's voice, and it'll give you that. You design it, then you can clone it if you want, or you can clone your own voice, and then you can use it with the text-to-speech endpoint. It creates basically an embedding under the hood, and

Pierson Marks (30:15)
Mm.

Bilal Tahir (30:15)
you

use that tensor embedding. So it's super cheap, it's open source, and I was playing around with it and thought, this is awesome. Voice, I feel, hasn't improved as much in the last six months; it's been kind of frustrating, so seeing these releases always makes me go, yes, more like this. And it's nailing some of the emotions as well. There's still room for improvement, but it's getting pretty good. It also follows a couple of other voice releases. There was

VibeVoice, I think that was by Microsoft, and then there was another one I forget, but it was also really focused on the emotional cadence of a voice. It was pretty good. And there's this other package I should talk about, called MLX Audio. It's still kind of hidden. It's basically the library for Apple's MLX, kind of the equivalent of PyTorch, for audio.

Pierson Marks (30:55)
That's super cool.

Bilal Tahir (31:06)
So if you have a MacBook and want to run stuff on Apple silicon, MLX Audio is pretty good. And they have a couple of small models, like Pocket TTS, which is only about 100 million parameters; Kokoro is 82 million. So it gives you real-time or close to real-time text-to-speech inference, which is really good, on the CPU, on your laptop. Very cool stuff. If you just want to run local TTS, I'd definitely recommend checking it out, especially if you have an Apple MacBook.

Pierson Marks (31:31)
Yeah, no, totally. I haven't done a lot of local TTS.

Bilal Tahir (31:35)
Yeah, I think it'll be huge, especially for more intimate, private conversations. I've been thinking about this idea for ages, where you have a local model, you hook up an avatar, and then you basically have your own Jarvis on your laptop, but private, because...

Pierson Marks (31:52)
yeah, sure.

Like, I don't know, what are your thoughts on smart home stuff?

Bilal Tahir (31:57)
Smart home... I've never been a huge IoT guy. I bought an Alexa two months ago, used it once, and now it's on silent. And sometimes I'm paranoid that even on silent it's still listening to me. I don't know.

Pierson Marks (32:09)
I always

wonder, with local TTS and local skills, whether this is going to be the time when you could actually have smart home stuff. Because the thing was always that you don't want something listening. But if you can process speech locally, fully locally...

Because right now, the way Alexa and Google Home all work is you have a wake word, supposedly. And then once the wake word is triggered, it sends everything up to the cloud and processes it in the cloud. The cloud goes and does whatever it needs to do, and it comes back down. But if you could have a smart home device that is literally purely

local, where you have an LLM doing local speech recognition and local speech synthesis, connecting to your other devices on your local network without ever having to hit the internet, that would be very cool. Maybe there are things you do need the internet for, which could go through your phone rather than the device going up, like your phone allows certain queries or whatever. But if you could actually use your smart home devices with no internet connection, maybe that would be

Bilal Tahir (32:54)
Right, yeah.

Pierson Marks (33:17)
what works.

Bilal Tahir (33:18)
I mean, I'm

sure there will be a fraction of the population that really wants that feature. I personally just think privacy, fortunately or unfortunately, is not really a big feature for most people. They don't really care; it's more about functionality. I feel like the promise of IoT has just been

over-promised in terms of the form factor, because I don't want a smart toaster. I don't need a smart toaster. If anything, that's probably a bad thing.

Pierson Marks (33:45)
Are you sure? Are you sure you don't need a smart toaster?

Bilal Tahir (33:49)
I

mean, the stuff like, oh, the fridge will tell you your milk is expired, maybe that matters if you have a family with five kids and you're running around. I feel like the bigger the household, the more the alpha of these things goes up. But for me, and maybe that's why I don't see it, it's a pretty light household here. Stuff like Google Home and Nest, some people love it. I don't really know enough about it to

form an educated opinion on it. Personally, I've never really found it that interesting. Maybe with more capability.

Pierson Marks (34:23)
Right.

Totally.

Yeah, who knows, who knows.

Bilal Tahir (34:32)
Anyway, really quickly on the other one, the other Chinese release: Kimi 2.5. So why was it a thing? Kimi, I think, compared to DeepSeek, is still kind of being slept on. We had the whole DeepSeek moment last year, and people were like, oh my god, DeepSeek came out with this model trained for like $2 million and it's going to totally destroy the moat of the US labs. Well, Kimi kind of did the same thing six months ago with Kimi 2. Kimi 2 was an amazing model. Creative writing, for example; actually, a lot of people think that, creative-writing-wise, it's better than the US models.

It doesn't have the LLM way of writing stuff, so a lot of people use it for creative writing. And it came up with other concepts. There was one called interleaved thinking, where you go into thinking mode and then

you stop, you say something, and then it can go back to the previous thinking thread. So it was interleaved like that; there's some advantage to it. That was an interesting concept. And now they've come out with the new version, Kimi 2.5. Even though it's a 0.5 bump, it's a radically different model; there are huge leaps in its capabilities. A couple of things really stood out that people are talking about, and some are even calling it the new DeepSeek moment. First, it's basically at parity with all the

models except for coding, as you mentioned, though they're very close on that dimension as well. But they came up with a couple of cool capabilities. One was Kimi Code: they came up with their own agentic CLI, like Claude Code. Some of the stuff they showed, I have yet to use it myself, unfortunately you need a subscription to use it, but the front-end design was just amazing. I don't know how cherry-picked their examples were, but it looks way better than a lot of other

AI-generated websites. That was cool, just amazing animation and stuff. They really invested, I guess, in the data collection part there. And the second one was Agent Swarm, which is this concept of

parallelized tool calls. Right now you can parallelize stuff, but it's a little hacky: you set up sub-agents, each goes and does something and comes back. What they did with 2.5 was build it into the model, so you give the model a task and it decides itself, okay, I need to fan this out. One of the tasks they gave it was: research the top 100 YouTubers and come up with their hooks. And the model basically said, okay, if I do this one by one it's going to take a long time. It spun up like 50 agents and said, okay, you

Pierson Marks (36:47)
Wow.

Bilal Tahir (36:47)
research these three, or whatever, and parallelized that call on its own; you didn't have to tell it to. And the fact that this was built into the model's training pipeline is why it's so powerful: you don't have to build any harness or framework on top to do it.
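
That fan-out is baked into Kimi 2.5's training, so this isn't their mechanism, but for a sense of the pattern, here's a sketch of doing the same thing manually in a harness. `runSubAgent` is a placeholder for whatever model or agent call you'd actually use:

```ts
// parallel-research.ts (illustrative sketch)
async function runSubAgent(prompt: string): Promise<string> {
  // Stand-in for a real LLM/agent call; stubbed so the sketch runs on its own.
  return `notes for: ${prompt}`;
}

function chunk<T>(items: T[], size: number): T[][] {
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) out.push(items.slice(i, i + size));
  return out;
}

async function researchAll(channels: string[]): Promise<string[]> {
  const batches = chunk(channels, 3); // ~3 channels per sub-agent
  return Promise.all(
    batches.map((batch) =>
      runSubAgent(`Research these channels and summarize their hooks: ${batch.join(", ")}`),
    ),
  );
}

researchAll(["Channel A", "Channel B", "Channel C", "Channel D"]).then(console.log);
```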

Pierson Marks (37:02)
That's super interesting. Yeah, I need to try out Kimi K2.5.

Bilal Tahir (37:06)
Yeah, unfortunately

you need Kimi Code for the agent swarm thing. I wish it were available via the API; then I could use it through OpenRouter. You can use the model itself via OpenRouter or opencode or whatever, but not that actual capability. Another one people don't know about: Kimi Slides.

Pierson Marks (37:11)
you need a subscription.

Bilal Tahir (37:23)
Amazing product. NotebookLM kind of has slides and stuff. JellyPod has slides too, by the way, which you should check out; they're pretty cool for your podcasts. But Kimi Slides makes really amazing, professional-looking presentations. So they have product taste as well, which I would say is not that common among a lot of labs, especially Chinese labs, but Kimi is definitely surprising on that front. They're building a proper product, not just a model.

Pierson Marks (37:47)
Why do you think that is? What makes their product sense good?

Bilal Tahir (37:51)
I'm guessing just the people like him, kinda, you know.

Pierson Marks (37:53)
Like what's

an example of something that stood out to you as, this is actually kind of nice, versus...

Bilal Tahir (37:59)
Well, the slides

were just so nicely done and professional, and the interface is just nicer. You can add stuff, attach things, and see what's happening, very similar to Manus. They've invested in the UI, which keeps getting better, versus... I remember I tried, I think it was DeepSeek, or...

Yeah, it might've been DeepSeek, and it was a very clunky UI. You're like, yeah, this is kind of an afterthought; the API is the big thing. Maybe it's gotten better now. And personally, as a developer, I'd rather they invest in the API, but again, they're trying to build a whole product, so why not? I mean, Moonshot, I believe, is owned by... no, it's not owned by anyone, it's its own thing. They might have investments from other firms. Qwen is owned by Alibaba.

Pierson Marks (38:34)
Totally.

Bilal Tahir (38:43)
So they have that big daddy there. But Moonshot was started by this genius researcher from Tsinghua. I think he wrote the Transformer-XL paper, might've worked at Google for a while, I think went to Carnegie Mellon afterwards, and is now back in China. Really amazing founder.

Pierson Marks (38:58)
gotcha.

Huh.

Really cool. Well, I mean, crazy week.

Bilal Tahir (39:07)
Crazy week. It's just accelerating. I feel like every week there's just so much stuff to keep up to date on. I mean, my head starts spinning.

Pierson Marks (39:14)
Totally.

Head starts spinning. But I think on that note, we talked a lot about Clawdbot, Moltbot, Clawdbot...

Bilal Tahir (39:23)
Yeah, a quick side note on why it's Moltbot: Anthropic threatened to sue them, because the name had the word Clawd in it, and it's not even Claude, it's spelled C-L-A-W-D, but apparently that's similar enough that Anthropic was pissed off. I don't know if they actually sent a formal letter or just threatened to send one about suing them, but they changed the name proactively. Which is kind of a shame, because Clawdbot was a way better name than Moltbot. I don't even know what

Pierson Marks (39:28)
Hahaha

Moltbot

is so weird. I'm like, what the heck is a Moltbot? Yeah, it does sound like a pest. It's horrible. I mean, it sounds like something I do not want. Moltbot? Imagine telling your mom, hey, do you want to install Moltbot on your Mac? And she's like, what? Yeah, no.

Bilal Tahir (39:53)
Yeah, exactly. It sounds like a pest, or something you need to treat. I've got mold on my computer.

Yeah, exactly. It sounds

like malware.

Pierson Marks (40:10)
Yeah, it does. It 100% sounds like spam malware. Maybe it is. Maybe that's the thing, maybe it's the best malware spreader of all time. But on that note: Creative Flux, episode 29. Yeah, it's a good one. Sweet. See you.

Bilal Tahir (40:13)
Yeah.

Yeah, awesome stuff and yeah, see you guys next week. Bye.
