Google's Nano Banana Pro and Gemini 3 Are Very, Very Good
Gemini 3 Pro and Nano Banana Pro push Google into the lead in the race for AGI. Meanwhile, OpenAI isn’t far behind with GPT-5.1 Pro & Codex Max. The AI news is relentless!
Nano Banana Pro’s ability to make infographics and edit images is nearly unprecedented and, combined with Gemini 3’s analytical abilities, makes us feel all tingly inside. Web design, vibe-coded games, there is so much cool stuff to get into.
Plus, OpenAI’s GPT-5.1 updates and a cool new tool from Meta called Segment Anything 3. And, of course, who could forget the cutest lil robots. No terminators today, folks!
TIME TO NANO BANANA OURSELVES INTO OBLIVION. WAIT, THAT SOUNDED BAD.
Come to our Discord: https://discord.gg/muD2TYgC8f
Join our Patreon: https://www.patreon.com/AIForHumansShow
AI For Humans Newsletter: https://aiforhumans.beehiiv.com/
Follow us for more on X @AIForHumansShow
Join our TikTok @aiforhumansshow
Show Links
Google Nano Banana Pro
https://blog.google/technology/ai/nano-banana-pro/
Gavin’s Futurama-style Image
https://x.com/gavinpurcell/status/1991525928049230170?s=20
14 Inputs on Nano Banana Pro Image
https://x.com/nickfloats/status/1991531506397741156
Sims Expansion Packs
https://x.com/sinanhelv/status/1991530277974253871
Rowan Atkinson (Mr. Bean) in Total Recall
https://x.com/TomLikesRobots/status/1991548219428663586
Gemini 3 Pro
https://youtu.be/98DcoXwGX6I?si=Fwd83wo5vRHPb78d
https://blog.google/products/gemini/gemini-3/#note-from-ceo
Demis Hassabis Talks About Trajectory on Hard Fork https://x.com/slow_developer/status/1990998467611705344?s=20
Crazy Gemini 3 Pro benchmarks
https://x.com/OfficialLoganK/status/1990813077172822143?s=20
Google AntiGravity
https://x.com/antigravity/status/1990813606217236828?s=20
Three.js interactive web design
https://x.com/EHuanglu/status/1990967259775570262?s=20
Huge improvements on DesignArena benchmark:
https://x.com/grx_xce/status/1990815340893245481?s=20
Replit’s new tool for web design powered by Gemini 3.0
https://x.com/amasad/status/1990859423942893816?s=20
Gavin’s quick website test
https://gemini.google.com/share/a1e8d50a3d69
Bouncing Ball Test
https://x.com/OfficialLoganK/status/1990819310072443340?s=20
Voxel Art
https://x.com/goodfellow_ian/status/1990839056331337797?s=20
Demis Recreates ThemePark
https://x.com/demishassabis/status/1990818894177513831?s=20
Playables on YouTube:
https://x.com/GoogleDeepMind/status/1991192012691808472?s=20
Updating My Bear Jump Game
https://x.com/gavinpurcell/status/1990832098131763340?s=20
OpenAI: GPT-5.1 Codex MAX
https://x.com/polynoamial/status/1991212955250327768?s=20
https://openai.com/index/gpt-5-1-codex-max/
GPT 5.1 Pro
https://x.com/OpenAI/status/1991266192905179613?s=20
Matt Shumer GPT-5.1 Pro Review
https://x.com/mattshumer_/status/1991263717820948651?s=20
Meta Segment Anything 3 Playground
https://aidemos.meta.com/segment-anything
Sunday Robotics’ Memo Robot
Gemini 3 Pro 3D Lego Editor
https://x.com/skirano/status/1990813093727789486?s=20
Realistic Water Test From MattVidPro
https://x.com/MattVidPro/status/1990880204760252834?s=20
Power Plant Recreation
https://x.com/sebkrier/status/1990814567820058641?s=20
Sourccey: Open Source Robot
https://x.com/sourccey/status/1990903761187828199
AIForHumansNanoBananaProGemini3Pro
Gavin Purcell: [00:00:00] Google has shocked the world with the launches of Nano Banana Pro and Gemini three Pro. We'll dive into what makes Nano Banana Pro the best AI image model we've seen to date. From crisp PS to weird things like Sims expansion packs, or Mr. Bean in Total Recall. Then we dive into Gemini three Pro, the world's best AI model for now, and show you what you can do
Kevin Pereira: with it.
It's actually so good, I had it write our intro and build it as a single-page website, so we're just gonna let it do our jobs for us. Gavin, you did? Yep. Here it is.
Gemini 3 Pro: Welcome to AI For Humans. The only show that doesn't hallucinate. Well, mostly. Here's Kevin and Gavin.
Kevin Pereira: already got jokes.
Gemini 3 Pro: First off, Google drops Gemini three. It designs websites, codes games, and ignores the laws of physics. Isaac Newton is shaking. OpenAI releases GPT-5.1 Pro with Codex Max. It can think for literal hours. That is longer than my battery life.
Gavin Purcell: See, I'm into the music now
Gemini 3 Pro: And we're gonna, I'm into it. Meta launches SAM three, Segment Anything.
It can identify every pixel in your house, [00:01:00] which is terrifying. Thanks. So,
Kevin Pereira: okay. Okay. Even it's scary.
Gemini 3 Pro: Sunday Robotics reveals ACT-1. It folds socks and makes espresso. Marriage is now obsolete. The future is weird. The episode is live. Let's get into it.
Gavin Purcell: Episode is live. Everybody. This is AI for Humans. The episode is live.
Welcome everybody. It's AI for Humans, your wonderful guide to the world of AI. My name is Gavin Purcell. That is Kevin Pereira. We are here today to talk about breaking news. Nano Banana Pro came out this morning and we had to record late last night because of a travel scenario with Kevin. So I am here by myself to just tell you a little bit about what this is before we get back to Gemini three Pro and all the incredible stuff that came outta that.
Nano Banana Pro is Google's new AI imaging model, and it is bonkers. So if you've been following the story for a bit, uh, obviously Nano Banana was the big imaging model that people talked about. Uh, [00:02:00] Google has been teasing this for a few days. We all knew that this was coming. It was a weird kind of drop in that they dropped, uh, Gemini Pro earlier in the week, and now it's here.
I wanna just kind of go over some of the highlights of what I've seen with this, and then we'll talk quickly about some of the coolest things I've seen so far before we get back to the show. First and foremost, it is very good at information gathering and then putting that information into specific image prompts.
One of the things a lot of people have been doing is creating infographics with this. Even Sundar Pichai actually showed this off in his post about, uh, Nano Banana Pro. What it can basically do is take realtime information from the web and then create an actual image out of that realtime info. I actually, uh, tried to get it to do one for the current NFL receiving yards
leaders. It didn't work perfectly the first time, but then somebody from the Google team replied to me and said it can work better in AI Studio. They're trying to fix something in Gemini, but this is a really cool thing. You can imagine creating something like a slide deck with these images, or some very specific thing like that.
[00:03:00] Also, just a really cool way to see information in general. Um, Ethan Mollick, one of our favorite posters, created a very complicated infographic about how to toast bread, which we'll put up here as well, just really dense information gathering. My very first image with it today was I asked for it to gimme a Futurama-style, uh, cartoon drawing
of an alien approaching Earth and a bunch of go away signs. And you can see in this image there's like 20 plus signs up here, all of which have unique text. And most of them, I think there's maybe one that's a little bit off. Most of them are perfect, and that is a big deal. Up until now, an AI image model would fail drastically at this exact thing.
So that's a big upgrade. Another thing that Nano Banana Pro is great at is image transfer. Obviously we've talked about this on the show for a while. The idea that you can take one image and transfer it to another. There's an example of people doing this with Legos, with all sorts of other stuff. This is just a step up from the versions we've seen before in AI imaging.
It's just very good at it the first time you try. A couple other very quick things here. It has native 4K imaging, which [00:04:00] is great, especially when you wanna get very detailed with your graphics and your images. I did try something with the image model in very dense text that I thought would be good in 4K.
I actually asked it to generate an infographic of the six levels of the Dungeon Crawler Carl, uh, dungeon. If you're not familiar, that's a book I've read and I recommend to everybody and it did a pretty good job of it. But I'm gonna ask for it in 4K because some of the words, you can see how dense the words are on this.
Some of the words were a little bit blurry and a couple of them were wrong. And when I upgraded to 4K, that didn't really solve the problem. But if you are a creative professional and you want an image of just, say, a person in 4K or things like that, having native 4K output is really good. The other thing that's really important to understand here, which is kind of crazy:
you can upload 14 reference images into Nano Banana Pro and it will know those and be able to create stuff with them. There's a very cool image here of a person uploading like 14 different little Muppet-looking characters and then putting them together in one image. Uh, that one kind of blew me away because in general you might be able to get away with [00:05:00] three or five, maybe
six different characters in this sort of scenario. And in that instance, it was able to take all of them and put them into one specific look. So this is a huge upgrade. You can go try it right now even if you're not a Gemini Pro or Plus or Ultra subscriber. I mean, I don't know who's gonna pay the $200, but you can go try it for free.
It will switch you quickly back, I think, to regular Nano Banana if you are not a subscriber, but so far my experience with this model has been bonkers. I probably will spend some more time with it in the next couple days. Maybe I'll even make a very specific tutorial video next week about it. Before we get back to our regularly scheduled show, I want to just shout out a couple really funny use cases
I saw so far. I wrote up more of this at aiforhumans.beehiiv.com. I put one out this morning, but two really fun ones. First, uh, @sinanhelv on X posted an image of expansion packs from The Sims, which again, these are the creative things that come out of [00:06:00] this. You can see a couple different ones.
I really like Ponzi Schemes expansion pack, which gives you a look at that. And then my favorite thing I've seen so far, and I don't know how long this will be allowed, but TomLikesRobots, who's a great follow on X, actually took Mr. Bean and put him into Total Recall, which again, this is the fun kind of crazy stuff that you can do with AI imaging, but it looks really good.
Anyway, shout out any of your fun things you've made in our Discord or on X, and we'll take a look at them. And now back to our regularly scheduled programming. Kevin, Gemini three Pro has launched and I have to tell you, uh, I wanna hear your initial thoughts. My initial thoughts are this is a very good model.
I've been super impressed with it so far. We are gonna get into everything about it. There are so many things you can do with it, but first. You've spent some time with it. What are your thoughts? It's the best
Kevin Pereira: model. Thanks for watching everybody. See ya. It is, I mean, I, I spent time specifically using Antigravity, which is Google's new, um, IDE for, [00:07:00] for coding.
Um, we can get into that, uh, in depth, but like. I like it so much that when I ran out of credits, I didn't want to continue working on the project that I was working on. And usually if there's a rate limit or something else with any other model, you just bounce around to a different model. They're all kind of good enough, you can get around 'em.
This thing is so much better that I felt like I was doing my projects a disservice by switching back to a best in class model as of 48 hours ago.
Gavin Purcell: Yeah, so let's just talk a little bit about what, what this is and, and what we're looking at here. So this is Google's brand new state-of-the-art model. Um, I know benchmarks probably don't mean a crazy amount to our audience, but if they do, these are very good.
In fact, there's a couple of huge numbers. One of the ones that's really interesting is the Humanity's Last Exam benchmark, which was very, very hard math and science problems that were created so that they were the things that we would know that people would no longer be useful for. This is now on the Deep Think version of it, which is not out to [00:08:00] everybody, but the Deep Think version of that is a 47%.
So just from a pure standpoint of what this model can do, it is great, but I think that is underselling what it feels like when you use it. Um, I think it's important for everybody to understand out there that like we have expected there to be a state-of-the-art jump in some way coming in the next like six months.
This feels like it to me, my experience with playing around with it. And we're gonna get into like the specific stuff that you can try right now. Is that we are now at the next generation of, of ai. I know that people, we talked about GPT five for a long time on this show. Yeah. And we discussed it and I think there was a lot of people that felt like, oh, that was more of an optimization model.
Maybe a way for OpenAI to get it out the door, not that it's bad. And also they updated with 5.1 and we're gonna talk about 5.1 Pro later. But this to me does feel like the first time that I've kind of moved into that next level of like, oh, okay, there's no wall, there's no, there's no slowdown here. I feel like we're moving into something bigger.
Kevin Pereira: Yeah. I mean, it does. If you chart the progress, it goes up and it goes up very high on [00:09:00] all of the things. It's up and to the right. Congratulations, entire Google team. As a rollout, I have seen this integrated into dynamically generating interfaces for search results. You know, making animations, interactable things.
So this is more than just creative writing or some vibe coding. This is like a, I think, a paradigm shift for the way Google uses their own intelligence for their products. But I've seen everything, Gavin. I've seen the, you know, ball bouncing inside of a geometric shape test, which people do for AI. But instead of just doing a single hexagon with a ball bouncing around, they're doing a Matrix Architect-like panel of a thousand different shapes with different settings all bouncing around and it handles it
without batting, I guess, a digital eyelash. I've seen it write massive blog posts, um, that feel like they are in very natural language. It doesn't feel like it was being generated by the proverbial AI. It's solving logic puzzles that are written. It's crafting solar systems dynamically out of code, solving [00:10:00] physics problems. Like,
This is it, it feels like a major release, and again, I, I mostly vibe coded with it. I'm, I'm curious to hear what you did, but I, it has me very excited about the next few months of progress once again.
Gavin Purcell: Yeah. And one of the things I wanna jump into here before we talk about some of the other cool stuff is Demis Hassabis, the man who is behind, uh, Google's AI,
uh, whole DeepMind world, actually was on Hard Fork this week and talked about this specific model. And one of the things that's interesting about Demis is that he's always been on, kind of, the same path when it comes to the timing of where he's at in AI. But this moment, even this moment for him seemed like a pretty big deal.
Play this clip real quick. Ka
Demis Hassabis: I think it's sort of dead on track if you, if you, if you see what I mean. I, we are really happy with this progress. I think it's an absolutely amazing model, uh, and is is right on track of what I was expecting and, and the trajectory we've been on actually for the last couple of years, since the beginning of Gemini, which I think has been the fastest progress of anybody in the industry.
And I think we are gonna continue doing that [00:11:00] trajectory and we, we, we expect that to continue. But on top of that, I still think there'll be one or two more things that are required to really get the, the consistency across the board that you'd expect from a general intelligence. Uh, and also improvements still on reasoning, on memory, uh, and perhaps things like world model ideas that you also know we're working on with SIMA and Genie.
Um, they will build on top of Gemini, but but extend it in various ways.
Gavin Purcell: So what's interesting there about Demis is like he's kind of playing it a little coy in that like he's saying like, there are a couple more things till we get to AGI. But I think when I played with this model, uh, today and spending a bunch of time with it in the last day or two, it really does feel like we're on the pathway there for the first time.
Like in the, it felt like for like maybe the last six months that we were slowing down a little bit. But this is a jump and I think one of the things that was really interesting before we jump into the things that we did with it, but then also other cool stuff people have done with it: this was all trained on Google's TPUs, which is a, a sort of nerdy term, but it's their tensor [00:12:00] processing units.
This was all trained on Google chips, so they didn't use Nvidia chips. This was all done there. And supposedly it is a 7 trillion parameter model, which means that both the, the scaling, uh, the scaling law, which has always been like, you know, you can add more compute at an AI system, not only on the post-training, which is where we talk about inference training, where people can spend time on compute, but on the pre-training, they actually had a much larger pre-train set.
So they're saying that that's where some of this might have come from. So just to give you guys some, some background on that before we jump into some of the amazing stuff people have made that
Kevin Pereira: shouldn't be understated. This is Google taking command of the entire stack from the Yeah, the hardware itself that is used to train these things.
And they were able to throw a massive amount of data. Into this model, right? And they were able to cheaply at least, relatively cost effectively, um, and with less energy usage than most GPUs. They were able to use those TPU chips to crank this [00:13:00] out. This might be an incredibly large moat for them. They're, they might not be beholden to everybody else grabbing up those NVIDIA GPUs to train their models.
Um, it was a, it was a bold, bold gambit for them. And it seems to be paying off like this is again, completely trained on TPUs and being served by these TPUs. And it is fast and that is also a massive difference maker. It happens to be the best. With the limited amount of time I've had to poke at it and seeing the reactions that everybody else is having.
It does seem to be the best model, but it's also one of the fastest, and that is incredibly impressive.
Gavin Purcell: Yeah. So let's jump into some of the stuff you can do with it. One of the most fascinating things I saw was Three.js interactive web design and the things that Kevin did at the top. Those were generated entirely by Gemini three.
And so like one of the things that is really remarkable is to look at the different variations we've seen. And some of the things, you look at them, you're like, well, there's no way that a person couldn't have made that website, right? It [00:14:00] feels like it is a designer's website. And no, this was actually all coded by Gemini in a single shot,
oftentimes. In fact, the DesignArena benchmark, which is a specific benchmark for design, how well an LLM can design websites and stuff like that, the chart is insane. If you see this chart, it's almost like doubled up. The Gemini three has almost doubled up on the previous models. So it's a really interesting, there's a really cool example of somebody that redesigned a wine website, which is just one of those random things you wouldn't see.
Replit has a new tool that they have built themselves specifically for web design. I even did a small test with it myself, um, just trying to make a vaporwave site. What's crazy is there's so little you actually have to do to get really interesting designs out. Now, I'm not here to say to our designer friends in the audience, I'm not here to say that your jobs are gone, but.
What's gonna be interesting is to see like what people can do now that the kind of building blocks are there, [00:15:00] how much more people can leap off when it comes to web design, I think is pretty impressive.
Kevin Pereira: Uh, the model can natively understand video as well. You can throw, you know, YouTube at it and ask for specific descriptions of what's happening inside the video itself at any given timestamp.
And it sort of just immediately pulls that video in and understands it. Um, there are some rumors that maybe, uh, there was some robotics or automation data fed in there because it seems to understand world navigation, um, uh, in a way that other models don't. Um, there were some cool multimodal tests of, like,
using, uh, uh, Three.js again, but to do voxels and build art and you could select portions of it and say, add a tree here, add a waterfall here, do this, and then rotate it with the camera. It's like, this is, we are, we are minutes, relatively speaking, into this thing being out and people are using it for a wide variety of things and discovering that
it's, it's really capable. Not just vibe code the basic game, but how about now go generate all the graphics and sprites for the game.
Gavin Purcell: Yeah, actually, there were a lot of really cool games that I saw [00:16:00] people make, a lot of vibe-coded ones. Demis Hassabis actually recreated Theme Park, his famous theme park game, within Google Gemini.
They're all little blocky characters, but he actually spent time doing that. And then a really cool thing that they did have on YouTube right now, you can play through some Gemini coded games. And these are not like super complicated, but it's just amazing that they've made these as a playable element on YouTube games.
If you go there, you can try a little tank game, you can try this other stuff. And I started to think about like what that means for like, especially for young people, but for anybody to imagine like creating something pretty quickly, if it becomes a vibe-coded game, to be able to put it out there. That is such
Kevin Pereira: a cool thing.
I'm, um, playing the captain's coin right now, Gavin, where you play a a, it's like a maze Runner 3D game as a pirate ship. Um, super. Yeah. I mean it's also super surfing on the high seas. Super simple, but they're fun. Totally fine. Right? Like,
Gavin Purcell: and I think what's cool about that is you can imagine a world where if YouTube integrates that as an actual pipeline, like, you know.
Uh, code to game to [00:17:00] publish like that is a big possible business for YouTube, but also a very cool opportunity. Um, yeah. I dunno if you remember this, Kev, but when Gemini 2.5 Pro came out, I actually created a bear jump game. Yes. Do you remember this when I did it back in the day? Yes. Do So I basically took that code, I went back and found my old Gemini thing and I literally just dumped the code into Gemini three and I said, make this better.
And it made quantum bear jump. And what's so great about this is like. It's not like the game changes that much, but it works really well. The graphics are way better. It's just much cleaner and it even added little quotes every like, inspirational quotes every time you jump, so you can all go play that game right now.
I will drop the link into the show notes, but it is just a good example of like what you can do as a complete non-coder, right? Like my original Bear Jump game was like, Hey, I wanna make a game where you press the space bar and a bear jumps, and then you have to get a score. I went back and forth a little bit, and then this time I just said, make that better, and it did, and like that is the awesome promise of what this AI can do.
Kevin Pereira: I think for our audience though, we should [00:18:00] point out how you actually use it. Not like what you can do with it, but how do you actually mess with Gemini because there's no like, there's no, it's like eating a Reese's. There's no, there's no right way, there's no wrong way, and it's kind of everywhere. Um, you could go to gemini.google.com and depending upon what type of account you have, you might have access to, uh, either there's a fast or a thinking version of it.
You might have access to it. You can go to AI Studio if you wanna play around with code that way. You can go to the AI deep dive mode on Google and you might have access to the new Gemini three Pro, and it might be able to do one of those dynamic interfaces that I talked about earlier where it kind of generates a webpage for you based off your search.
You might have access to it through that. You also might not. So you have to hunt around on the page to make sure you have access to the latest model. And if you wanna mess around with the coding specifically, you can download Antigravity, which is Google's new coding app. Um, that is for the moment completely free. [00:19:00]
And if it's anything like my experience was yesterday or some of today. You'll have minutes of free usage before hitting a usage cap and getting frustrated.
Gavin Purcell: Yeah, well we were gonna talk about that later, but we might as well jump into it now because it is a really fascinating thing. What's going on there?
So this is, Antigravity is Google's actual new command line interface, right? Yeah. This is their, this is their answer to Codex. This is their answer to Claude Code. And you know, finally, a small inside baseball thing, you may or may not remember, but there was a whole kind of failed Windsurf acquisition. Well,
turns out that they have forked the Windsurf code when they brought over the Windsurf founders and put it into, uh, and put it into this thing.
Kevin Pereira: So, yeah, so Antigravity is a fork of Microsoft's VS Code. So poor Microsoft, man, released one of the best things that everybody has pillaged and plundered to make their billions.
Um, but you're right, it, it, it has some ex-Windsurf code within it, it seems. Uh, there were references to some Windsurf features that people were finding deep within the app. At its core, it feels like VS Code, which may mean nothing [00:20:00] to mo, most of you out there, but there is an agent manager mode. And when you hit that, it basically gives you workspaces.
So you can have multiple apps and multiple chats open at once. And so you can say. Build me my bear jump game or create an intro for the podcast as a single webpage. And you can run these multiple chats and you'll basically have an inbox. They actually call it an inbox that will notify you when an agent is done or needs your attention or needs to have you review a plan or whatever permissions you want, and then you can go in and comment with it and let it rip.
And you can have multiple chats within multiple projects. And it's a very early glimpse at what the future of. Productivity might be for certain levels of engineering, where you've got your multiple repos that you're working on. Mm-hmm. Your various chats going with the different projects. You're approving things.
And what's also key is that there is a Chrome icon on the top right when you're in the editor mode. And Gavin, this thing can run and analyze the web browser, basically, for you. Look, there are other [00:21:00] tools that you could hook up with Cursor or Claude Code, or there, there's Playwright. There's all sorts of stuff that you can try to like hamstring together, but to have it be one click, Hey, run my app and figure out if it works.
Go run tests against it and see if you can break it. Having it work intricately with Chrome, even at this very early phase, is like a, a surprising leg up on some other apps. And right now, if you don't hit an awkward rate limit, which I did yesterday, um, they're giving it away. Like they want you to go use it so you can go use the pro mode.
They, they
Gavin Purcell: trap you. They trap you, is what's gonna happen. They trap you in their system. But by the way, we've said this before, but like, this is what all these companies in some ways are trying to do is like. We, I, I, I, I've been thinking about this as like, everybody originally thought that $200 price tag was so crazy both for Google Ultra or for ChatGPT Pro.
Yep. I would not be surprised if in six months to a year from now, a lot of American households are paying for an account, uh, personally and a lot of businesses, yeah, are paying for an account because I think [00:22:00] that's how useful these new models are gonna end up being. And I don't think you can underestimate what that means for the
financial bottom line of these companies. Yeah. But also it's just gonna become valuable to you at home to actually do this stuff. Right. And you're good to have the amount of compute that you need to have. Now, I know a lot of people in our audience may be like, that is insane. I'm never gonna pay $200. I guess I would say you never would've thought you would've paid $160 for cable back in the day.
Right. Or however much cable bills are. And ultimately it will become a question of how much value to get, how much value do you get out of it? And I think, Kevin, you're saying very clearly from, at least from a business and coding perspective, there's a lot of value to be derived immediately.
Kevin Pereira: Like when I hit my first rate limit, the first thing I did was go to the page to see if I could pay more.
To get more credits because it was markedly better. It was solving problems with old applications and little vibe-coded sort of demos and games and things that I was working on. It was solving problems immediately that the latest Claude couldn't do, that the latest GPT model couldn't do. It [00:23:00] was quickly churning through them, and I was like, well, great.
I want to use this model for everything. When I hit a limit, I went and looked online. Google does have a, like an Ultra plan that you can sign up for for 200 some odd bucks a month. It right now does not seem to apply to coding within Antigravity and a lot of people complaining about that. That's really
Gavin Purcell: why.
Yeah, that's seems
Kevin Pereira: crazy. I'm sure they will fix it. Maybe they'll fix it by the time this podcast is out. But at the moment people were like, Hey, why am I getting rate limited? Um, so even I was getting that to the point where it said, come back in five hours and we'll reset your limit. I literally set a timer and went away and like paced angrily, came back, ran a few commands, and immediately hit another rate limit.
So clearly something is up. I think they're working on getting it un gummed, but there's just, there's no doubt that it is the best model right now for coding, in my opinion. Um, I don't care about the benchmarks. My own benchmarks are, I had things that didn't work and it churned through them like one shot.
So very, very exciting time [00:24:00] to have any idea and wanna bring it to reality.
Gavin Purcell: All of that has happened. It has been a crazy week. But the most important thing that will have happened right now, starting today, is that you will have liked and subscribed to the AI for Humans YouTube channel, and we love that story.
You are here again. You're here. We always appreciate you showing up. It's a very nice thing that you are here. If you are listening to our audio, please always leave us a five star review on our podcast. We have one small piece of business today, next week in America. If you're listening in America, or you might be listening internationally, we have a lot of Australian listeners go on Gavin.
We are gonna take the week off. It is Thanksgiving week. Not only is it Thanksgiving week here, but I am moving. Kevin is doing a bunch of stuff, so we are taking one week off Thanksgiving. Unclick
Kevin Pereira: the bell, click the thumbs down, delete your comment. Clearly these two don't care about you care. Taking Thanksgiving off, we don't care about slackers.
That's right.
Gavin Purcell: But we will, we will be back the week after, starting in December again, but we are taking a week off. Hopefully something crazy won't happen. I'm sure it will. Alright, Kevin. There is other big news this week that we have to get to, and OpenAI, [00:25:00] as we just mentioned, was not gonna take this Google stuff sitting down.
We know, as we talked about last week, they just introduced GPT-5.1. Well, they have now introduced two big things. First and foremost, Codex Max, and this is the thing that people are freaking out about within the AI space. So Codex is a, uh, frontier agentic coding model from OpenAI. This is a new version, but the big thing here, Kevin, is that it can work for a very long time.
And this was the thing that people had talked about all the time with, with OpenAI and coding models and all those sort of things, is that the amount of time a coding model can work on its own, and what sort of problems it can solve, was the biggest deal. And according to OpenAI and some of the people that have been talking about this, this is the best, most state-of-the-art model, and it is only getting better supposedly.
Have you taken a look at these benchmarks for this? I mean, I, uh,
Kevin Pereira: yes, I've seen the, again, the lines and I've seen the amount of time that this can go. I mean, it, it, they used something called, I think it was [00:26:00] compaction, which allows it to like, work across millions of tokens. And that's what helps. I think it's just like a fancy way of compressing, uh, the data.
I, I have not had a chance to get my hands on this 'cause I've been playing with, you know, Gemini 3 of all things. Yeah. This literally just
Gavin Purcell: came out too, which is so funny. Right. So the one, the one important thing to know here is that METR is a company that kind of tracks, like, the time an AI can work on its own, right?
Mm-hmm. The GPT-5.1 Codex Max time is two hours and 42 minutes. That is 25 minutes longer than GPT-5, which literally came out like six weeks ago. There was a tweet from Jerry Tworek, who goes by @MillionInt. He, his, his, uh, bio is berry farmer at OpenAI, but he is one of their main research scientists.
He's been up, up forever, and he actually quote tweeted this and said, we're going to the moon, because, like, if you look at this METR chart, it is an up-and-to-the-right bar. Like, it is not something that looks like it's topping out, like it is going up and up and [00:27:00] up. One of the things that's really important to know about when you can get a coding agent to work on, uh, longer problems is that it can work on harder problems, right?
Because if it can keep going on things, it's able to solve those problems in a way that before if it could only spend 20 minutes or 10 minutes on, it just wouldn't be able to get through the bigger problem. This is that kind of self recursive thing now. It's not doing it on its own yet, but this gets us to the place where AI can start to improve itself over time.
Kevin Pereira: But wait, Gavin, there's more: GPT-5 Pro Ultra Absorbent with Wings Pro Plus. That's right. It was a new model. There actually is a new model, GPT-5.1 Pro, which is out now, and Matt Shumer got his hands on it. He has a really, really great post on it. But the TL;DR, which I'm just going to crib directly, is that basically it's, quote, an absolute monster, but it's trapped in the wrong interface. So you can use
GPT-5.1 Pro, which is a very slow but very heavyweight model, very smart and capable, can take its time, can think through things, acts like a very capable engineer. But [00:28:00] it does it right now within, like, a ChatGPT interface.
Gavin Purcell: Well, my, here's my thing about this is I think that he, because I read that, I read that thing.
Mm-hmm. I think the, the GPT, uh, 5.1 Pro is probably Codex Max in some form or another. Right? Because Codex Max is the version of this that exists in that thing. And I don't know whether or not Matt had the actual connection to that Codex Max thing. This is where everything got confusing in the last week.
Yeah. 'cause this stuff all came out in different ways. But yeah, Matt's uh, review of 5.1 Pro is really good. The other thing I thought was really interesting about that, and I've actually played a little bit with 5.1 Pro. The problem I have is I don't know if I have a strong enough problem to give it.
Right, right. This is the issue that you run into with this good of a coding tool, is that, like, you have to think, okay, what can I give it to do? The one thing Matt did say in that post was that, like, you can give it planning things for your, if you have a normal life, if you have 5.1 Pro, which you should have if you're a GPT [00:29:00] Pro subscriber, you can give it things that it can think about for a long time and it comes back with much better answers.
So 5.1 Pro was very much OpenAI's answer to Gemini 3 Pro. Um, someone also out there said that there's a world where Gemini 3 Pro still has this Deep Think version. Have we confused you yet? Are you already starting to get confused by the names? Yeah, but there's another version of Gemini 3 Pro, which will allow them to go out and think about things
that might be even better than GPT-5.1 Pro. So we are entering into a, a, a very big fight between these two companies, it feels like.
Kevin Pereira: Well, we know that if you give these systems more time to think, right, that inference-time compute, they do perform better. So if this is the new arms race, there's a big knob that someone, hey, Larry,
go, go grab the knob, turn it up a little bit, and they just do that and it gives it a hundred thousand more tokens in time and whatever else, and they'll just bleed out so that they could be at the top of the charts. But I mean, I, I, I guess that's a win for us.
Gavin Purcell: Yeah. Well, [00:30:00] I mean, again, it goes back to the same thing we said at the top of the show.
Like, this stuff is now moving forward again. Right. And I think that a lot of people out there had convinced themselves, okay, it's slowed down. It's not gonna be that way anymore. This is where things get weird, everybody. This is where we start to see things getting crazy because as these improvements do hockey stick, if that hockey stick thing can continue and.
All of the build out that we've talked about in the show, all the infrastructure talk, which I know we, you know, people, some people love to hear on our show, talk about all the data centers and everything like that. In general, all of that is going to allow these things to continue. Now I can't, I'm not a pro, I'm not a future teller.
I'm not gonna be able to say to anybody that, that's for sure. But it definitely feels that way based off of this week. So we'll see Kevin, where we are in a month from now, but right now it feels like we're moving pretty fast.
Kevin Pereira: Don't forget, Meta's still out here. Gavin, remember they poached a bunch of talent and started paying billions of dollars for things.
Gavin Purcell: What's so funny about this thing to me, so, so yeah, Meta has a new thing called Segment Anything 3, which if you remember, we covered Segment Anything 2 and [00:31:00] probably even Segment Anything 1 on the show. Now, they were not usable, but what this basically allows you to do is take a piece of video and within that video or an image.
It easily cuts the, the different shapes out. And it's very good at figuring out which shape is which. So you can go in and say like, okay, here's a, here's a video. I wanna isolate the woman, uh, dancing, and I wanna leave the man sad on the side alone. Right? So I want to take the woman dancing out. And you can eliminate that and you can pull it out.
But the most important thing is that you're able to just cut around it so you can do stuff with it. Well, what's interesting to me about this is that, like, Meta needed to get something out, I guess this week, maybe that's what this was. It's very cool and you can do it. Like I actually tried it myself. I actually took a shot I had of a piece of video and I turned this one guy in the video.
It was just a normal AI video. And I turned the guy black and white and the ceiling black and white. And it's super easy to do. Now you could do this with high-end video tools, like, you know, Adobe's tools or things like that, but this makes it dead simple for anybody to try. And you could actually go play around [00:32:00] with it right now.
Kevin Pereira: So there's SAM 3, right? Yeah. There's the video segmenting tool, and then there's SAM 3D. Yes. Which is another beast that, that, that's the one that blew my mind. Like, I, it's, it's like a Penn and Teller act where it's like, I know the trick. I've seen the trick before. It's okay. It's the cups and balls.
I've seen the the types of open source things that allow it to do what it is doing, but they have implemented it in a way that feels really crisp. So you can go, they have a playground where you can go and upload your own image. Um, you could create 3D scenes. You can select one of the, the, the preselected images that they have, and you can literally click on objects in the image and keep adding to them.
And when you're happy, you can hit generate 3D and it will take whatever you have selected from the 2D image and give you a 3D model of it rendered in a way that you can obviously move it around. You can see it from an extreme overhead view or from below the object, and on a lot of images. It just sort of works in a way that is just like insanely magical when you look at.
[00:33:00] Other apps that let you build skyboxes for games or you could go and generate a world, yeah, that's just a flat 2D something. Okay, put it in a tool like this, tell it what the objects are, generate them into 3D and now you have a world you
Gavin Purcell: can walk around. Well, it's funny to me because one of the things that I would love this to be integrated into, like,
Gemini 3. Right. But it's not going to be, and it'll be interesting to see, like, you know, Meta has, I, I dunno the last time I was at Meta AI's, like, you know, actual chatbot. It's been months at this point. Yeah. But like each one of these companies is trying to build their tool set up. Right. And this is a really cool tool.
I could also see something like this easily working. I, I'm sure, in fact, I think they are gonna be integrating this into Reels and their sort of editing suite from, from whatever they're doing on Instagram. But like, I hope that all of these tools proliferate in some way or another because it's just another one of those very cool things.
Speaking of tools, Kev, our buddies at Suno just raised a bunch more money. They raised $250 million, and Suno, you know, has had issues with the legal, with the legal restrictions on the music they've done. I've [00:34:00] actually been spending some time in Suno with v5 and it's super fun. I've, I, my wife has been gone this week, so at night I will spend some time just making some songs.
And there's something so interesting about AI music that, like, I, I talked about this, maybe we talked about this last week, but this idea that like. When you are creating stuff, even if it's ai, and I think most people who hate AI don't feel this way, but when you're creating stuff like there's something just very good organically, you feel about yourself, right?
You feel like your brain is doing interesting things. What's interesting though about AI creations is that so many people are gonna be able to create, I think there's gonna be a level of like just enjoying the experience of creating rather than having to like put something out there to make sure you're making something that people listen to.
And I think Suno is a really good example of a tool that you can do that with because it's just really enjoyable to craft something. And again, people may be laughing at the word craft if you're not into AI music, but like. You put something in, it comes out and then you start working with it. Like that is what's exciting about Suno to me.
Anyway, that's a long dis [00:35:00] uh, dis discussion of that, but like, it's a pretty cool thing.
Kevin Pereira: There's, what was the Gorillaz song that was like their big hit? Uh, sunshine in the bag. Feel Good? What was it? No, not Feel Good. Was the sunshine in the bag one, uh, Clint Eastwood? Is that what it was? Yeah, yeah, yeah. So that, that drumbeat, um, and some of the synth, oh, oh yeah.
The Ca, the Casio, right. It was a Casio keyboard push-one demo. They just slowed down a bit. No one was like, oh, this is, this is gross. They stole this beat. They didn't, no. They were inspired by it and they went with it. And so yeah, if you wanna shade someone who just prompts a song into existence, okay, that's fine, that, that's your prerogative.
But someone who starts with a prompt and then iterates and massages and builds and adds their own flair or maybe sings to the machine and has the machine pitch corrected, all those things can stack up to be a really beautiful experience. But, you know, getting started with any of this stuff. Is the hardest part sometimes, and there's a nobility and a valor of sticking with it and figuring out what the notes are on the keyboard and watching the deter, sure, fine.
But there's also something [00:36:00] great as someone that just starts a little bit further along because they prompted it into a machine and gets inspired to stick with it.
Gavin Purcell: Yeah, and I think that's one of the things I wanted to point out was just like, why this company might be worth so much now, even though it has these legal issues, is that I think people are gonna start creating on platforms like Suno for fun, and not just to publish, right?
Not just to put stuff out there. Like people will start creating games for fun. People will start creating movies for fun, like it will be a different sort of thing. Now that doesn't mean that people who do those things professionally are gonna stop. They may think that because there's gonna be such a flood of stuff out there that they might feel bad about it.
But anyway, congrats to Suno for raising a bunch of money. I'm sure some of that will be going to their legal fees. We'll see how that all lands up. Kevin, the other thing we should talk about today is a new robot from a company called Sunday Robotics. This is Memo, and Memo, I do think, again, we talked about Memo a while back, but this is a premiere video.
It's a little bit, looks a little bit like the 1X video in that it's designed to be in the home. I have to say, watching this Memo video, it's very [00:37:00] cute. It rolls around though, instead of having legs. So it's on a low rolling thing. It is specifically designed to do chores in your home. I am not convinced yet that we have seen the form factor of a robot that's gonna work.
I, when you saw this video, what were your first thoughts on it?
Kevin Pereira: I mean, yeah, I was like, if you have stairs, sorry. I feel bad for you son. Um, unless this thing can use its robotic arms and lower its torso down. 'cause it is a, it's a torso on a stick. It can raise and lower that torso as needed. So that's an interesting design.
Um, I'm sure it reduced complexity and balance issues by going with the pedestal with the little rolly feet. Um. Look, the, the model looks impressive. They have a demo of it clearing off a table, Gavin, um, putting scraps into the trash and loading a dishwasher. And some of the things that they highlight are that it's fully autonomous.
Um, it is delicately grabbing the wine glasses, multiple glasses in a single pass. And as it opens the dishwasher, it puts them into the little plastic thing with the stems on the top tray, [00:38:00] where the wine glasses are supposed to go, and it, it does it. Now they're playing back the video at 5x or sometimes 10x speed.
If I'm away and a robot's doing the, the dishes for me, I guess I don't care if it takes it 45 minutes or an hour. Like, what do I mind? As long as I don't need the kitchen as well. Like let it do its thing. Um, for the long horizon task, that's impressive. But you know, again, we've said V three of any product is usually the good one.
This looks like a V one. Yes. And kudos to them for trying to get a handful of them into homes. Like it, it seems like it might work in a way that. Other robots haven't, but like I, I'll probably be cheering along from the sidelines until V three arrives. Yeah.
Gavin Purcell: I mean, listen, humanoid robots are coming for sure.
I, they're not coming nearly as quickly as all the other stuff we just talked about today, but they are coming and all that other stuff is gonna power them. Do you not like
Kevin Pereira: its little robot cap that it has, Gavin? I'm find
Gavin Purcell: that, uh, no, I don't like that. I'm trying to figure out like what I want a robot to look like.
Maybe like. You know, what if it had like some sort of weird [00:39:00] porcelain face that you could draw on? I don't, I'm trying to think of something that would be interesting. Like, like if you could come up with something or what if you had a robot that you could put those realistic masks on and then you could like turn anybody you wanted to turn into.
I think
Kevin Pereira: that's, that people will definitely do that. Even with this robot.
Tell you they already total tors, torso
and arm should be E Ink displays so that I can put whatever sick hats I want on it at any time. Oh, that sounds pretty exciting.
Gavin Purcell: That's right. All right everybody, let's see what you did with AI this week.
It's AI See What You Did There.
Gemini 3 Pro: Sometimes ya without then sudden, suddenly you stop. Fun shout.
Gavin Purcell: Alright, Kevin, we have a special Gemini 3 edition today and, uh, I wanted to shout out a couple of people I saw that did really interesting stuff. This is from Schirano, who was one of the first things I saw. This is Pietro Schirano. He's a pretty well-known X user, but he [00:40:00] used Gemini 3 Pro to create a 3D Lego editor in one shot.
It nailed the UI, spatial logic, and all the functionality. So if you're not, if you're just listening to this, what this is, is basically he's built himself like a literal Lego play set that he can change colors on. He can change the blocks and he can build Legos, all within a single shot of Gemini 3.
Yeah. So it's just another good example of how crazy the stuff is that you can do with this.
Kevin Pereira: And you can contribute to his GoFundMe. He is being sued out of existence by Lego at the moment, so we wish him the best. And that's not true, but it is a very, very interesting app. And just, I mean, like where Gemini really excels is, is interface and design.
Like, it's like the, the code works and it's good and it's very thoughtful. But man, when you see Gemini 3 output versus literally any other model, they look like professional
Gavin Purcell: apps. Yeah, well, that's another good example. So MattVidPro AI, who's a great, uh, AI YouTuber, if you're not familiar with him, check him out.
He said, make a realistic water physics test, full 3D, you can [00:41:00] interact with, reflection, waves. Click anywhere to drop a lemon in the water. Lemons are like his thing. Yeah, but you can see, that is amazing when you look at it. So Kev, do you wanna describe to people what we're looking at?
Kevin Pereira: Yeah, it's, it is a view of like, it looks like a 3D plane of this like just blue sheet.
But then as the mouse goes over and a click happens, a lemon drops into the water. And if this were like crimson tide on the Xbox, you'd be like, whoa. How do they pull that off? I mean, this is just running in a web browser and I think it was a one shot, but it's a 3D sim, like liquid simulation of lemons into water and it just like.
Th this would've been, uh, jaw dropping a year ago, and now we're like, yeah, of course. Of course it can do that. What else you
Gavin Purcell: got? Speaking of that, Séb Krier, who is the AGI policy dev, dev lead at Google DeepMind. So he may have had some chances to kind of spend time on this. He actually created a two shot working simulation of a nuclear power plant.
So this is something when we think about educational use cases of this, or not even like, you know, [00:42:00] necessarily something a teacher would make and it might. But you as a student can make this like, this might be the future of science fairs, Kev, where we see going forward, we see people just generating out, uh, 3D models of things, right?
In single shots. And when you imagine like this is the fun starting point, but then like what if you take that nuclear power plant and you make something bad happen? You could see all that happen in the thing too. I can.
Kevin Pereira: For the tri-fold display at the science fair, that is just the, exactly the prompt that will generate the potato that you can use as a battery.
Well, I didn't actually make it, but this is what I would use to prompt it into existence. Um, little Bonus Robotics. Shout out Gavin. I don't know that it belongs in an AI See What You Did There, but did you see Sourcey?
Gavin Purcell: I, it looks weird. It looks like a giant garbage can with arms. It
Kevin Pereira: looks like a, yeah, it looks like an R2-D,
uh, I dunno. It's like, it's like a weird, yeah, it's like a Disneyland garbage can with a Kinect for eyes and these little 3D printed robotic hands. But Gavin, they say it's your personal home [00:43:00] robot. Look at it though. What does it, is it
Gavin Purcell: good? What does it do? I'm just curious. Listen, sourcey, you're very cute and I think that is definitely a look, but like what I feel like I'm looking at here is something that.
I don't know. Is this, there's gotta be something else going here. Is it like open source or is it like open source by source kids or something? According to the website, it is, doesn't insult anybody? Insult. I'm not saying it doesn't, I can't make this thing. Let's be clear.
Kevin Pereira: It's fully open source. Okay. It's customizable.
Um, uh, it's LeRobot compatible, so you can kind of train it to do whatever you want. Like this is more for like the tinkerers. Yeah, cool. This is for like computer clubs and, and whatnot, but I, I thought it was very, very cute and it's an interesting approach, but like, it, it's weird to think that there's the Optimuses, the Figures, the Unitrees of the world that are going after the mass produced.
You know, in, in home bots, but there's going to be the open source 3D printed hacker community where someone makes, lord knows what as attachments that are sourcey compatible. Yeah, sure. And maybe we could all 3D print our own little robots at home.
Gavin Purcell: It's funny. Did you go to Sourcey's [00:44:00] webpage? Yeah. It almost looks like they, they could use some of, uh, Gemini 3's, uh, uh, magic on this webpage.
Vibe-cody. It's a little vibe-cody. This is very much a, but again, we love Sourcey. Love them for it. Yes. We love them for it. It's a very homemade website, but congrats to Sourcey. Alright everybody, that is it. We will see you all. Go spend time with Gemini 3 Pro or Gemini 3, or any of the new OpenAI models.
This is a big moment I feel like in the AI space. We have definitely moved into the next generation.
Kevin Pereira: Don't spend time with your loved ones. No, no. Spend time with Gemini at Thanksgiving. That's for next
Gavin Purcell: week. Spend time with your loved ones next week. This week, spend time with Gemini 3. All right.
Bye everybody. Bye.