OpenAI’s New AI Image Model is here! But… Can It Beat Nano Banana Pro?
ChatGPT Images (aka OpenAI’s Image 1.5) can create stunning AI images, text and more. But how does it compare to Google’s Nanobanana Pro? We dive in… way too deep.
Plus, new GPT-5.2 Codex, Gemini 3 Flash, YouTube’s new vibecoded games, the controversy around Generative AI and game developer Larian Studios, a lego-like robot and, of course, seeing how AI video can cause soap opera actress hair to endlessly grow.
IT’S YET ANOTHER WEEK OF NEW RELEASES! AND WE DON’T STOP.
Get notified when AndThen launches: https://andthen.chat/
Come to our Discord to try our Secret Project: https://discord.gg/muD2TYgC8f
Join our Patreon: https://www.patreon.com/AIForHumansShow
AI For Humans Newsletter: https://aiforhumans.beehiiv.com/
Follow us for more on X @AIForHumansShow
Join our TikTok @aiforhumansshow
To book us for speaking, please visit our website: https://www.aiforhumans.show/
// Show Links //
ChatGPT Image 1.5 is here
https://openai.com/index/new-chatgpt-images-is-here/
Recreating the post from the OpenAI blog with us
https://chatgpt.com/share/69443100-6e7c-8003-910e-749bab75f6e2
Fabian (from GLIF) Notes on Images 1.5
https://x.com/fabianstelzer/status/2001300766368178435?s=20
Jeff Goldblum’s Resume
https://x.com/gavinpurcell/status/2001033377294467182?s=20
Upcoming QWEN Image Layering
https://x.com/wildmindai/status/2001593677576384747?s=20
Image 1.5 Vs Nanobanana Pro
Video Game Characters
https://www.reddit.com/r/ChatGPT/comments/1ppg4s9/test_2_turning_game_characters_into_real_people/
Fingers
https://x.com/petergostev/status/2001027573636088184?s=20
Gavin’s Original Knight and Rotisserie Chicken Post
https://www.reddit.com/r/ChatGPT/comments/1jk0p3v/tried_to_push_the_new_image_model_with_an/
OpenAI’s Greg Brockman: WE NEED THE COMPUTE
https://x.com/OpenAI/status/2001336514786017417?s=20
GPT-5.2 CODEX
https://openai.com/index/introducing-gpt-5-2-codex/
Frontier Science: New Benchmark
https://x.com/OpenAI/status/2000975293448905038?s=20
ChatGPT Apps Store Opens For Developer Submission
https://x.com/OpenAIDevs/status/2001419749016899868?s=20
Gemini 3 Flash
https://x.com/GoogleDeepMind/status/2001321759702663544?s=20
Nanobanana now in community posts on YT
https://x.com/nealmohan/status/2001425749941829920?s=20
Meanwhile, YouTube “Playable Builders”
https://x.com/YouTubeGaming/status/2000989303086649637?s=
Larian’s AI Gaming "Controversy"
Direct response from Larian Head of Studios:
https://x.com/LarAtLarian/status/2001011042642505833?s=20
MSFT Open-Source Image-to-3D Trellis 2
https://x.com/_akhaliq/status/2001041559366598799?s=20
Bernie’s Moratorium on Data Centers
https://youtu.be/f40SFNcTOXo?si=hduNjATJgtIya9oq
Meanwhile… China can now produce high-end AI Chips
https://finance.yahoo.com/news/exclusive-china-built-manhattan-project-141758929.html
Meta SAM Audio
https://x.com/AIatMeta/status/2000980784425931067?s=20
Tron 2: Lego Robot
https://x.com/CyberRobooo/status/2001513866157789308?s=20
AVP Spatial Photos Of Newborn
https://x.com/SadlyItsBradley/status/2001276039671197783?s=20
WAN 2.1 Workflow Re-creates The Matrix With Homer Simpson
https://x.com/ChetiArt/status/2001291373182382526?s=20
Miss Piggy in Melania Trailer
https://x.com/charliebcurran/status/2001564626144928146?s=20
One Woman’s Transformation Via Sora Remixes
https://sora.chatgpt.com/p/s_693a2ed29e288191a542b776553e1145?psh=HXVzZXItT3diZ1NFOUtyZlRXV2ZvajcwWjJsZ2Uy.XXZmIQEXNl-L
Kevin Pereira: [00:00:00] OpenAI's new ChatGPT Images has
Gavin Purcell: arrived. We're talking
Kevin Pereira: better editing,
Gavin Purcell: far greater realism, better editing, more realism, yada, yada, yada. The big question here, Kevin, is how does it compare to Google's Nano Banana Pro? Do you wanna put us in banana suits? Yes. I want to put us in banana suits. Okay. All right.
It, there we
Kevin Pereira: go. Banana. That's right. Gavin, OpenAI also just announced GPT-5.2 Codex, a new update for their high-end coding model
Gavin Purcell: and a video from OpenAI president where he is basically begging for more compute.
Greg Brockman: We are absolutely bursting at the seams with demand for compute relative to our ability to supply that compute.
Kevin Pereira: Meanwhile, Gemini 3 Flash is here. It's small, it's fast, it's powerful, it's also free. And along those lines, YouTube has rolled out AI gaming, so grab your pitchforks
Gavin Purcell: friends. That's right. There is another big, huge controversy in the AI gaming world. We'll explain why everyone is so mad at Larian Studios and how to talk to your favorite gamer [00:01:00] friends about it.
Kevin Pereira: Oh, I can't wait to have that conversation over Christmas dinner, plus new audio tech from Meta, Microsoft's new open-source image-to-3D model, and a Lego-like robot that is kind of nightmarish. And we'll
Gavin Purcell: show you how Sora transformed this one woman into a being entirely made of hair.
Kevin Pereira: You wanna put us in the hair, don't you?
Gavin Purcell: Yes. Put us in the hair.
Kevin Pereira: This is AI for Hair men.
Gavin Purcell: All right, let's go Hair Boy.
Welcome everybody to AI for Humans, your weekly guide into the world of AI. It is time, we are here. It is another big week of releases in the AI space, Kevin. This week we got ChatGPT Images, a.k.a. GPT Image 1.5, a.k.a. OpenAI's answer to, uh, Nano Banana Pro. This is a new model, it's an updated image model from ChatGPT and from OpenAI.
And so far it's been pretty good. Like [00:02:00] it's been really interesting. Have you spent some time with this?
Kevin Pereira: I did, yeah. Merry Shipmas, Gavin. Merry Shipmas to all of the listeners out there. That's right. Yeah. This is their new image model, which is, you know, leaderboards are a thing that companies like to tout.
This puts them at the top. This new image model has been ranked better by, you know, users of certain arenas than Google's model. And it, it, you know, it's so funny, like everything that's magical and amazing about it, just because we've had these tools and they do get better and better, it's sort of like, oh yeah, that's great.
Look, character consistency is a big deal. Yeah. So if you upload a photo of yourself and say, make me insert your edit here, it does a pretty good job of keeping your face or your proportions or whatever you're asking for, painting on an image and asking for a follow-up. I did several things along those lines.
It works very well. I mean, that's, that's the reality. It's a very solid image model and it's right within the ChatGPT that you're used to.
Gavin Purcell: Yeah. What's interesting to me here, so they had, they released a pretty significant, um, blog post showing off a ton of use cases of this. [00:03:00] One of the things they're really touting is the ability to edit images of yourself or a couple people and keep face consistency.
Right. And like we've, we tried this ourselves. In fact, if you look at OpenAI's blog post, they have a thing where, um, Mark Chen, the chief research officer of OpenAI, they put in like a kind of a, a situation where they're at a kids party and they're kind of bored, and then they put a bunch of kids behind it.
I actually took you and I, and what I basically did is I uploaded our photos that we used for our thumbnails last week, our original photos, plus an image of Ollie, my dog, from a Sora video, and said the exact same prompts. And if you walk through this, you'll see, first we are at this kind of like, uh, you know, kids party, both looking bored, pretty close to us, wearing the same clothes, everything else.
Then it says like, add a bunch of kids in the background. It added those kids, and it's actually very good at that. Um, Ollie himself, the image of him has like sunglasses on and like a, a lav, and they kept the lav wire, which is funny. Then I had it, um, I asked it to make Ollie into a plushie like they [00:04:00] did, and also then myself into an anime character.
Still pretty good. Your face pretty much stays consistent. If you click through all three of them, it's pretty much the same face. The last one was interesting. Um, I asked it to put AI For Humans sweatshirts on us and I gave it our logo the same way that they did with OpenAI. This one isn't as good, but it's still pretty solid.
Your face does shift in this one though, so it's interesting to think of, like, how many levels down can you actually go with this? Yeah. But overall, really, really good at facial editing in general. And I think, to your point, like, it's a step change, right? We're not looking at like a, a giant leap to the next big, huge generation, but this is kind of like OpenAI's answer to Nano Banana Pro.
We are in a world where like they're trying to kinda maybe catch up to that stage at this point.
Kevin Pereira: Yeah. I mean, it's big for them because, uh, for the last few weeks, I didn't think about going to Sora or ChatGPT, yeah, to generate imagery, right? Yeah. Everything was the new Nano Banana. Now I'm back within my ChatGPT window and I'm feeling marginally better about the [00:05:00] $20 a month that I hand them.
Yeah. Um, I did similar tests. Like, if you go right within ChatGPT, there's like a whole interface that suggests different things that you can do. And I always find, like, onboarding for these tools is so interesting and difficult because the capabilities are so massive. I find users are at that paradox of choice.
Where do I start? What do I do? And one of the suggestions that it had was make a holiday card. Yeah. So I uploaded a picture of my dog, Wesley, and said, yeah, go ahead and make a holiday card. And so much of the, like, the model is great. The render times are faster, by the way, when you're creating images.
That was a pain point before. But so much of the model, you can tell it's like this behind-the-scenes prompt work that it's doing. Yes, yes. Which you see with Sora as well, it's reasoning.
Gavin Purcell: Right? Exactly. It's reasoning exactly to figure out the prompts. Yes.
Kevin Pereira: What is the user asking for? What have they provided me, if anything, as a starting material, and how do I make a magical result and really bolster the prompt?
And so with the holiday card, obviously it was a picture of a human in the suggestion, but I gave it my dog, Dr. Wesley Snipes. It came back with, you know, the, the sepia-toned holiday [00:06:00] card of Wesley with the tree behind him looking festive. And it was like, Happy Holidays, it added some text or whatever.
And I was like, okay, it did some prompt work that, that in conjunction with a good model made a solid card. And then I took the fedora and the cigar from, uh, the I Think You Should Leave sketch, the Driving Crooner, which I love. And I said, add this to the thing. It nailed it. Similarly, I tried making a holiday card with my niece in it, and two or three generations in, she started to grow extra fingers on her hand.
Oh, really? Wow. Interesting. Yeah. Yeah. Her eye color started to change and it started to sort of melt around, and I tried to re-upload the initial image of her and say, make this match. And it really failed to do it a bunch of times. So again, like, hit-or-miss results. Yeah,
Gavin Purcell: I mean, I think one of the things I found too, and this is just a good note for everybody, if you wanna do image editing, generally you're better off on all of these downloading the result and then re-uploading it again, because it's almost like a fresh start.
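The tip above, downloading each result and feeding it back in as a fresh input, can be sketched as a small script. This is a hedged illustration, not the show's code: it assumes the OpenAI Python SDK's `images.edit` call and uses the model name `"gpt-image-1"` as a placeholder, since the API name for the newer model may differ.

```python
# Sketch of the "download the result, re-upload it" editing loop.
# Assumptions: the OpenAI Python SDK's images.edit endpoint, a base64 PNG
# response, and the placeholder model name "gpt-image-1" (the newer model's
# API name may differ).
import base64
from pathlib import Path

try:
    from openai import OpenAI  # pip install openai; needs OPENAI_API_KEY to run
except ImportError:
    OpenAI = None


def edit_chain(client, start_image: str, prompts: list[str]) -> list[str]:
    """Apply each prompt to the previous *output*, not the original upload."""
    paths = [start_image]
    for i, prompt in enumerate(prompts, start=1):
        with open(paths[-1], "rb") as f:
            result = client.images.edit(model="gpt-image-1", image=f, prompt=prompt)
        out = f"edit_{i:02d}.png"
        # Save the returned image; the next edit starts from this file,
        # which is the "fresh start" trick from the conversation.
        Path(out).write_bytes(base64.b64decode(result.data[0].b64_json))
        paths.append(out)
    return paths
```

With a real client (`client = OpenAI()`), something like `edit_chain(client, "me.png", ["put me at a kids party", "add kids in the background"])` would produce one file per step, each edit grounded in the previous saved output rather than a long drifting multi-turn chat.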
Correct. That's why I was kind of impressed by how it was able to keep your consistency, at [00:07:00] least three levels deep. So a couple other quick things about this. Um, you know, Fabian Stelzer had a really good little post, a very quick take on it, where he was talking about the facial stuff, that image editing is good.
He did mention there's like a little grain on these images sometimes. Mm-hmm. And I know that the yellow tint was a problem with the original ChatGPT image model. There's a little bit of a grain that, if you zoom in close, you can sometimes see. One of the other big things I found really cool about this is they've really focused on, and this goes to your reasoning point, Kevin, um, making text better.
Yes. Meaning that you can generate quite a bit of text. And I had this experience where I wanted to try something, so I made a resume for Jeff Goldblum. And what was interesting about this is there was a period of time right after this model had come out where I tried this and it clearly had issues, and I was like, oh wow, it's got all these problems. Then about an hour later, I generated it again and it had zero problems. And when I say zero problems, I mean this is a full resume for Jeff Goldblum's acting, and there are no text issues, right? There are no problems with the text along the way. I had Nano Banana Pro try to do this too, and it did an okay job, but [00:08:00] the OpenAI version of this was, I asked for it to be ATS-compliant, which, if you know what ATS means for a resume, it's like this weird format that can be searched by all the bots that are out there.
It created a pure resume. The other one, from Nano Banana Pro, was kind of a picture of a resume. So just a very cool thing in general. Um, you had said a new thing came out from Qwen this week that was interesting, and maybe this kind of gets to the idea of what the future of imaging looks like in some form.
Kevin Pereira: Yeah. So, uh, Qwen, uh, image layering is what they're calling it. I don't believe it's out yet, but they're teasing it. Basically, you can feed it any flat image and it does a best guess at, which we've been talking about for a while, yeah, yeah, extracting each element and then making the full scene around it transparent. And it's like, well, this makes sense, because even tools like Canva have a magic edit mode where you can grab objects or whatever.
Yeah. But this, having it be fully automated, automatically generating all the layers. This is, you know, some of this work is happening behind the scenes already with [00:09:00] these models, but a tool that just generates them all in one shot. So then you could go in and say, I wanna change just this thing, or let me swap this out and put in a new product.
Like, there's some really powerful stuff there. But as you mentioned, with the text capabilities of this new GPT Image 1.5 model, some of the examples are like, generate a menu for a ramen restaurant, or generate, um, all of the whales in the ocean, let me see them as a chart with stats or whatever.
Its ability to grab and contextualize the layout so everything is like elegantly done, and really follow big, in-depth text prompting. Like, if you've tried image gen in the past, you should go back and try it now, because it's, by far, way better. Way better. I mean,
Gavin Purcell: it, it, yeah, there's no doubt about it.
It's way better. So one of the things a lot of people have done here is compare it to Nano Banana Pro, right? Because Nano Banana Pro really hit a new mark in terms of like what was able to be done with images. Um, one of the things I found interesting is one of the first ones that came across was this guy Peter Gostev, who [00:10:00] actually used the classic, like, you know, full, uh, wine glass
and like kind of, uh, number on the clock sort of thing. Yeah. But then he actually asked it for a hand with, with, I guess it's seven fingers and a thumb. Yeah. And ChatGPT was able to create a hand that had multiple fingers on purpose, and it looked like Gemini, uh, Nano Banana Pro, wasn't. Now, there's a lot of back and forth on this.
Some people have said there are definitely some things that Nano Banana Pro is doing better. There's a really interesting Reddit thread from a person named uumidumi, U-U-M-I-D-U-M-I, where they're taking game characters and turning them into real people and comparing them. And one thing that I have found when I'm looking at the different, uh, Nano Banana Pro and ChatGPT prompts is that I actually think that Nano Banana Pro realism looks a little bit better still.
Like, uh, Image 1.5 still has a little bit of, like, a feeling, almost kind of, not plasticky per se, but not as realistic. Like, Nano Banana Pro is nailing that. The one thing I wanted to try myself, and Kevin, if you remember, like from way back when, I think this [00:11:00] was from when Image Gen 1 launched, I had this really weird prompt where it was about a knight, uh, in a 1990s screen grab of a, um, CCTV in a restaurant where a knight had stolen two rotisserie chickens.
Right, right. This is just one of those You
Kevin Pereira: in a supermarket? Yeah. It's like, yeah, a supermarket. Right. Got caught on camera.
Gavin Purcell: The original I was really impressed by. It was really cool, and so I uploaded that to Reddit and it did really well. What was interesting about this is, in this particular very complicated prompt, I think ChatGPT did much better, but none of the images, I did four, we'll show all four of them.
I did two tests in both image models, you can kind of see. In the ChatGPT ones, uh, one of them I think is really good, like the one that has it running with its toaster strudels. Your point last time was like, where would the CCTV camera be coming from? And maybe that, yeah, the perspective seems a little lost, yeah.
Didn't necessarily. Um, the Nano Banana Pro ones, though, feel like they are not getting the kind of timeline right. And one of them is really off, like it looks like [00:12:00] the, uh, rotisserie chickens are dripping in there, different lighting than the, the knight is, yeah.
Kevin Pereira: Yeah,
Gavin Purcell: but what was funny to me is like all four of those, I kind of think the original one was better.
So it really depends on your use case of what this is going to be. Um, it doesn't mean, you know, that they aren't improvements. But what's interesting is that there are some places maybe where they're trying to improve for specific use cases, which is what we see in some ways with, uh, GPT-5.2 as well, right?
Like this idea that you might improve a specific use case but you might not improve as a whole is an interesting conversation point.
Kevin Pereira: Well, this brings up, like, my new favorite meme now is, uh, you know, we need the comp, we need more compute. Yeah. And then it's like, for what? Are we gonna cure cancer? Are we going to solve, uh, teleportation?
Are we gonna fold more proteins? And no, no, no. We need more compute for anime babes and knights stealing rotisserie chickens. Like, that's what we need it for. I mean, listen, I could use
Gavin Purcell: some more, uh, rotisserie chicken knights, that's for sure, in my life at this point. To your point, Greg Brockman, uh, [00:13:00] president of OpenAI, released a video specifically asking about compute. And I will say, just to give them credit here, I think what they're talking about, and we're gonna get into their new Codex model in a bit, they are trying to kind of push the edges of this. But let's first hear what Greg has to say here.
Greg Brockman: OpenAI did not set out with a thesis that compute was the path to progress. It's that we tried everything else, and the thing that worked was compute, was scale.
We are absolutely bursting at the seams with demand for compute relative to our ability to supply that compute. When we look at our launch calendar that the single biggest blocker often becomes, okay, but where's the compute going to come from for that? When we had our image generation launch in March that went viral, we did not have enough compute to keep that going.
And so we made some very painful decisions to take a bunch of compute from research and move it to our deployment to try to be able to meet the demand.
Gavin Purcell: You get a sense there of like what, what they're basically kind of saying [00:14:00] in this video, and I think, you know, in very bare terms, is that like, hey, in order to serve this image model stuff, which again, I think gets people excited about AI, gets people really excited to do stuff with AI,
They weren't able to kind of continue to push the edge of this stuff. And I think that all of these conversations around what we've talked about in this show a bunch is like the AI bubble or all this money kind of going into data centers and all those sorts of things is about providing enough compute for everybody to use and, and be able to do all this stuff.
So I think it's an interesting conversation that's gonna continue to push forward. And there's two big things that have come out here this week: uh, GPT-5.2 Codex, which literally just dropped this morning, and a new benchmark that they've released called Frontier Science, both of which I think are trying to get OpenAI back on that research track.
I'm gonna kick it to you, Kev. What are your kind of takes on this in general? Like, where do you feel like, I, I feel like we're gonna see a crapload of compute come online, yeah, in 2026. It's one of the big kind of things that will happen. We'll kind of get a sense of like, okay, we'll get more compute now.
Kevin Pereira: What I thought was fascinating, and kind of a [00:15:00] bold admission as well, to be like, well, we released a thing and it was popular, so we had to pull from research.
Because you're also basically admitting that, for those short-term gains, you might be hamstringing your longer-term efforts. Right? Sure. And, you know, we know Google has compute, we know they got plenty of it. They have their own custom chips online now that are serving up models. So, like, do they have to make these trade-offs? Or, yeah, I know, it's a
Gavin Purcell: good question, right?
Like, even during the Nano Banana launch, like, Demis and one of the, the Google people did say their servers were burning in a similar way, right? But we don't know that answer, right? We don't really know what's going on in the background.
Kevin Pereira: So it's tough. I, I can't imagine how tough it is for them. It's just, it's interesting to hear that and it's tough to have to like weigh those decisions.
I do think, you know, clearly, if you, if you look at their flywheel graphic, which is something they probably showed to a lot of investors. Sure. It was basically like, compute go up, products go up, and revenue go up. Well, this is great. So you guys help us with the compute thing and we got the rest. Like, yeah, if that trend continues, that will be interesting. [00:16:00] For me,
it is, it's tough, because the image editing is really fun and it's cute. Sora, I had a blast with it when, you know, it first launched. I use it a little bit less now, but, like, the agentic coding of it all, that's what I'm really interested in right now. That's what adds a lot of value to my, uh, daily life and productivity.
So I, I hope, I don't want any concessions to have to be made anywhere. Yeah. But I hope if they're gonna make concessions, it's like, well, maybe it's, it's one less image render over here for people so that we can have a better agent code output over there. Like, that's the stuff where I, I want those trade-offs.
Gavin Purcell: And I think they're trying to figure that out. This goes into GPT-5.2 Codex. So today they just released this new Codex model, which is the most cutting-edge version of Codex, which is their, um, coding platform. One other quick thing is that ChatGPT's app store is now open for developer submissions. Yeah. So
we'll see what happens in that space. But yeah, the other big thing here is that they have released a new benchmark, and I know benchmarks are not the most exciting thing in the world, but this is a frontier science benchmark. So OpenAI [00:17:00] seems like they're trying to shift, in part with their code red that happened a couple weeks ago.
They're trying to shift to get to that edge again, right? To get back to the edge of where the technology is kind of improving on a regular basis. And to your point, to try to make big impacts in the research side as well.
Kevin Pereira: Do you think, uh, you know, 2026 we're gonna see some compute unlock? Sure. 2027 we might have, like, small nuclear reactors firing up to, like, power, power these things.
Yeah. Do you think, like, when I look at those benchmarks, you can see the bar clearly from, like, a GPT-4o or whatever to a 5. Yeah. Like, there's a marked leap, and now you're starting to see these releases where it's just another percent or 1.2. Sure. Do you think we'll see
Gavin Purcell: I, well, here's the thing, there's a lot of rumors going on right now about that, that GPT model that is supposedly coming out early next year. Yeah. My gut is telling me, once compute comes online and we should just very quickly level set this compute conversation just to make sure everybody understands.[00:18:00]
Compute means the amount of, of like backend power that can be churning on these models, specifically the thinking models. 'cause they take a lot of time to kind of go through this process. And the more compute you have, the better you can serve that. I think next year is going to be a really interesting space.
You and I said earlier this year that GPT-5 would be a real moment, and we had it, right? Yeah. And, like, it wasn't as big as I think people thought it was gonna be. Like, we didn't hockey-stick up into the singularity. I do think, I mean, again, Sam Altman, Dario Amodei, and, uh, really, Demis feels like the kind of person, that's the person who really has, like,
A really good kind of sense of where this stuff is going. He is in no way pulling back from the sense of like, this is going to like, get better and better all the time. I would suspect that if by mid-year next year we don't see another significant gain, then we have to worry that like, maybe this is about all we're gonna get for a while.
I will say the thing that shocked me about the image stuff, do you know, Kevin, that the whole, um, [00:19:00] you know, Studio Ghibli thing that happened, that happened in March? That was nine months ago. Oh, wow. Which feels like, wow, two years ago at this point. Yes. So, so you just have to remember sometimes that, like, you and I have been covering this stuff weekly, and everybody who listens and watches our show knows that this is happening weekly, but, like, this is all still moving very fast.
Right, right. Like, that was a sea change, you know. Uh, Veo 3, Sora video, those were all sea changes, and that's all within the last, like, 18 months. So anyway, just something to keep in mind. Crazy.
Kevin Pereira: Well, and also something to keep in mind is that we might not be covering this next year if we don't get,
That's right. Unending waves of support by everybody watching and listening to this, Gavin, it's very simple.
Gavin Purcell: That's right. You could help us right now by going and liking and subscribing to this channel if you're not already, or, uh, sharing this video or audio with somebody else. And Kevin, I keep hearing this from our commenters, so we gotta say this out loud.
Supposedly YouTube has a new system that is based on hypes, which is the weirdest thing in the world to do. Dude, I hyped us. I
Kevin Pereira: spent, like, 80 hype points or a [00:20:00] hundred-something hype points. I didn't even know what they were, but I was like, spend, baby. Do it. Yeah. It's
Gavin Purcell: time for everybody in our audience to go figure out what hype points are and hype this video. Supposedly hypes make a huge difference. So if you are on YouTube, do a little research, ask GPT, ask Gemini 3, figure out hypes, and then hype the video. We are very excited to have you all. They, like, at
Kevin Pereira: least to my knowledge, they were free digi-points that I got. And so I, I hyped us up quite a bit, Gavin, so we good.
Gavin Purcell: I have not hyped us, so maybe I
Kevin Pereira: need to spend some time hyping up ourselves. Hey, seven, no cap, friends. Hype, hype, hype. Click the button. Also, if you leave a comment, that's always nice, that makes us happy. We like to engage with you there. Uh, and if you're on a podcasting app, a five-star review certainly doesn't hurt, but,
um, the algo giveth and taketh away. Sure. So the only way we get discovered is when you take the time to actually engage. So as much as we beg and plead each week for that, there's a reason you do it. It makes a huge difference.
Gavin Purcell: All right, everybody. Let's talk about Gemini 3 Flash. This is Google's kind of response as they, uh, tit-for-tat back and forth.
Like, I feel like there's a volleyball match we're watching, every, every new model [00:21:00] from one or the other has a response. So Gemini 3 Flash is a Flash version, meaning it's a faster, cheaper, smaller version of Gemini 3. And Kevin, I think, again, I hate to dive into benchmarks, but the thing when I first saw this that kind of shocked me was that, like, some of these benchmarks on this, specifically the multimodal reasoning ones,
yeah, are actually better than Gemini 3 Pro, which was a very shocking thing to see, to me at least. Yeah, this
Kevin Pereira: is, again, this is their, their small, cheap model, like, actually cheap to use if you're a developer and wanna plug into the API. It's free to use if you go to gemini.google.com, like, it's the default model you get. It is very fast.
And as Gavin said, it's in some cases more capable than their bigger all-singing, all-dancing model. And in many other cases, it's on par with stuff that, like, OpenAI is offering for a price. I, I'm like, I was very surprised to see this. Very surprised. Yeah. And, and I, and I used it and I was very
Gavin Purcell: happy with the quality of the [00:22:00] outputs.
It kind of feels like right now that Google is trying to put their foot down and, like, kind of big-boy this scenario. Like, hey, we are one of the largest tech companies in the world. To your point earlier, we have all these TPUs, we have our own chips, now we can serve models that are very good, very cheap, or in some cases free, and see what we can do to kind of squeeze out a lot of the other players in this space.
And Gemini 3 Flash feels like that. I think a couple really interesting things, to your point, for developers, if you're developing on these platforms, obviously cheaper and faster is better, especially if you can get, you know, results that are pretty close. I also saw that, you know, Nano Banana now, supposedly you're gonna be able to use Nano Banana Pro in YouTube community posts, so clearly
they're rolling out these things to, to Google properties as well. I haven't used it yet either, but there's a new AI experiment or whatever, kind of like that thing we talked about last week where it was the tabs thing. Mm-hmm. There's a new one for Gmail, um, which they're rolling out as well.
And I keep telling, I don't know how, maybe we can get this straight to Logan, uh, at, [00:23:00] at Google. What I really want for Gmail is something that will be a Gmail cleanup tool. I just want my Gmail cleaned up. Right. I just need something. Hundred percent. Something. Yes. I just
Kevin Pereira: realized we were supposed to have Logan on the show in December and I just, I didn't follow up.
We didn't follow up. Well, whatcha gonna do? Whatcha gonna do?
Kevin Pereira: Ah, that's not, I mean, on one part, that is a humblebrag that he agreed to come on the podcast. I wanna come in on the second half of that: that's really bad for us that we forgot to follow up. Well, it is what it is. Logan, we'll follow up. 2026. If we're here in 2026, you will be here too, Logan.
That's right. Also: Playable Builders. Did you mess with that at all? Yeah.
Gavin Purcell: So I didn't mess with this, but I saw it come out. Playable Builders is YouTube's new community-built thing: people in the YouTube community are building the little vibe-coded games that we talked about when they launched Gemini 3 Pro. Yeah. They said these were gonna be a thing, but you spent some time with this, which is very cool.
Kevin Pereira: I was, um, I will say, pleasantly surprised. Uh, we're gonna get deeper into [00:24:00] gaming AI (angry, bad, all bad), but I was pleasantly surprised at the quality of the games. Um, the performance on my cell phone? Phenomenal.
Um, one of the games captured my attention for, I think, seven or eight minutes. Wow. That's pretty impressive. Which is actually a lot, considering it's a vibe-coded game, and it loads immediately. You are a hole, and you have different city levels, and you go around, and as objects fall in, your hole gets bigger.
It's not a new concept. I think, uh, Donut County? Donut County. It's a game where you play as a hole. What's different here is that there are other players moving about the map. So it's a little bit of that Agar.io aspect, where the bigger you get, the more of a threat you are to other holes on the ground.
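The loop being described is tiny, which is part of why it vibe-codes so well. A hypothetical sketch of the mechanic (the class name, growth factor, and object sizes are all made up for illustration, not taken from any shipped game):

```python
# Toy sketch of the "growing hole" loop described above (Donut County /
# Agar.io style). All numbers and names here are illustrative guesses,
# not anyone's actual game code.

from dataclasses import dataclass


@dataclass
class Hole:
    radius: float

    def can_swallow(self, object_size: float) -> bool:
        # An object only falls in if it fits inside the hole.
        return object_size < self.radius

    def swallow(self, object_size: float) -> None:
        # Each swallowed object grows the hole a bit, proportional
        # to the object's size: bigger hole, bigger threat.
        if self.can_swallow(object_size):
            self.radius += 0.1 * object_size


hole = Hole(radius=1.0)
for size in [0.5, 0.8, 2.0, 1.04]:  # the 2.0 object is too big, at first
    hole.swallow(size)
print(round(hole.radius, 3))  # → 1.234
```

In the multiplayer version they describe, each player's `Hole` would also compare its radius against other holes to decide who can swallow whom, which is the Agar.io twist.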
Look, it was really simple, but the graphics were good. It performed really well. It downloaded instantly. I played some other games. There was a Rubik's Cube-like Tetris game. There was another sort of endless runner. There was a weird, um, like, that Maze Runner game meets a fashion game, where you have to complete the [00:25:00] look by running through different gates which modify your avatar's appearance. The point is, it's all vibe coded using Gemini, right?
This is going to be, I mean, YouTube Gaming is gonna push heavily into this. Sure. There are gonna be others. We've talked about the Newgrounds-ification of AI gaming. This will be a thing. There will be a Steam-like marketplace for these things, where, as the models get more capable, the games will get more fleshed out. The next Flappy Bird? Of course, that already exists.
I think the next, like, PlayerUnknown's Battlegrounds is going to be vibe coded, and probably within a year.
Gavin Purcell: Yeah. And what's so cool about this is that YouTube is partnering with certain creators to make these things, right? Yeah. So they're trying to bring forward people's ideas, which I think is a very cool general idea. By the way, even beyond PlayerUnknown's Battlegrounds, what I'm almost more curious about is:
What would a one-person, Balatro-type game look like? Because one of the things about Balatro that's so cool to me, about these [00:26:00] smaller games, is that they're kind of one or two ideas, just very nicely fleshed out, right? That's the kind of thing that you could see very clearly vibe coded.
Now, that doesn't mean that a Battlegrounds can't happen as well, but Kevin, this should transition us into a very large conversation that's going on right now amongst the gamer community. And, is it a conversation? Well, it's a lot of talking at, let's put it that way. Yeah, yeah. Um, this is a big, big story across this week, in which Larian Studios, the makers of one of my favorite games of the last five years, Baldur's Gate 3, a studio with a very deep sense of storytelling, has been called out, quote unquote, for using generative AI in their game process.
And I just want everybody to understand this, if you've seen a headline here, or if you're not a gamer or somebody who follows this. This is kind of the top-line conversation that's happening in a lot of creative industries, but gamers are very noisy, as we both know. They will take something and run with it, and a lot of the gaming [00:27:00] world is very anti-AI, so we should lay out
exactly what this story is. The story, basically, is that Larian did an interview in which the head of the studio came out and said: we use generative AI to conceptualize ideas and to make processes work faster in the very beginning stages, but we are a human-first studio, all human writing, all human actors.
We have a lot of people that work for us. And Kevin, this blew up, because the headlines across places like Kotaku were like, Larian Studios uses generative AI. And I'm just curious to know: A, did you see some of this stuff? But then, B, what I'm trying to get at with this conversation is: how do we have these conversations with people who are so one-sided, right?
Who have built up in their brain this idea, who believe very clearly that AI is this thing they don't want anything to do with, and yet they're not understanding that it's already there and it's already being used. Yeah, I mean that's, [00:28:00]
Kevin Pereira: it's a conversation that is nice to attempt, and I've certainly attempted it a few times, but ultimately, at the end of the day, it's not a conversation that people are going to have to have.
Like,
Meta SAM: yeah,
Kevin Pereira: it's gonna sort itself out, right? There's gonna be a contingent of people that, no matter what, are gonna say it has to be 100% farm-to-table pixels, all hand-drawn by humans, and that's fine. There will be products that exist for that. We've said this: there'll be movies that exist like that.
There'll be music that exists like that. It will probably take a lot longer for those things to come out and be released, and there'll be a lot less flexibility with them in terms of remixing them and whatever else, but that's fine. AI is creeping into everything. It's already in a lot of the tools that traditional artists are using, whether they want to admit it or not.
Right. Generative fill. It's something that people use and they don't feel like it's AI: I am just filling in a missing piece of my image. No, it's fully AI. Yeah. Even the automatic rotoscoping tools that people use to separate the foreground of an image from the background, [00:29:00] that's an AI tool.
So yeah, I think if you try to be in the middle, some AI okay, all AI bad, if you try to sit in the gray, you're probably going to lose, because it's going to be really difficult to extract what is true AI and what is not from here on. So I say, look: here you have somebody that makes traditional games, that employs artists, that is basically saying, this is helping us get to ideas faster.
And then the traditional artists do what they do. Yeah. I think even that is a little bit of a hedge. Yeah. I still think traditional artists are gonna use AI to get their vision up and running, then they'll probably train some sort of model on that art style and use AI to generate actual assets with it, and they will use their artistry and ability to clean up those assets.
For now, I think even the interim bit is a hedge.
Gavin Purcell: Yeah, I mean, I wanna be clear: the guy, uh, Swen Vincke, is the CEO of Larian, and I just wanna quickly read his response to some of this. He said: whoa, guys, we're not [00:30:00] pushing for, or replacing, concept artists with AI. We have a team of 72 artists, of which 23 are concept artists, and we're hiring more.
Their creations are original and they're very proud of what they do. I was asked explicitly about concept art and our use of gen AI and answered that we use it to explore things; I didn't say we use it to develop concept art. Later on, he says: we use AI tools to explore references, just like we use Google and art books, at the very early ideation stages.
We use it for rough outlining of composition, and we replace it with original concept art. So I just think it's an important thing, where headlines take off and do something, right? The other thing that made me laugh at this, Kevin, which I think is a really interesting wake-up call maybe for some people in these communities, is that
the game that literally just won Game of the Year at The Game Awards, Expedition 33, an amazing game from a French developer that they made for $10 million, that looks like a AAA game, has also admitted that they used gen AI in some parts of their process. And all these gamers were [00:31:00] out here
like, I can't believe it, I can't believe it. It's like: maybe just take a step back from the anger at this idea and explore what these things are, and think, hey, maybe there is a world where, going forward, you could see more games like Expedition 33 because of this possibility, and it's not gonna stop those games from happening.
And yes, will there be bad actors who want to do stuff all the time? Sure. But to your point, last week you talked about people getting review-bombed on Steam for a small piece of art that they were using. Yeah. We just have to wake up and kind of grow up, I think, to have this conversation in a real way.
Kevin Pereira: Are you saying the people who are never-AI-ers are little babies?
Gavin, is that what you're saying? That they need to grow up? Is that what you're saying? That they need to put on the big boy and big girl pants? Honestly, a little
Gavin Purcell: bit. A little tiny bit. No, I mean, it's not gonna be an end-all be-all, because as you said, organic, AI-free work will be a thing. But this is a changing world, in the same way it goes back to, you know, the anti-Photoshop people. Eventually those people grew up, or they grew out of the business. Right. It's a very tricky thing, and I think this is going on in [00:32:00] entertainment as well right now. There's a lot of conversations around people who are like, AI is bad, F AI, all this stuff.
There are processes in which human creativity and the human role are very much a part of this going forward. Anyway, it's one of those things that has really made me notice: there is a high, high rise in anti-AI sentiment happening right now. This is gonna get a little wonkish in some ways.
But Bernie Sanders, as we talked about maybe a couple weeks ago on the show, has come out and wants to put a moratorium on data centers. Mm-hmm. He released a video on YouTube talking about this, and it's a conversation around, hey, we're moving too fast. And it's a little bit like the conversation Joseph Gordon-Levitt, who we talked about, is having.
We're in this world where the anti-AI conversation is rising quickly, and I do believe people are going to start using this in a political way. And I just want everybody in our audience (and I mean, we're preaching to the choir; everybody listening or watching this probably understands) to be clear that this is a multifaceted conversation.
It is not one way or the other. But [00:33:00] once you get stuck down one road and you start thinking in only one direction, the world will get very weird very quickly, because you will be blinded from seeing stuff that is right in front of your face.
Kevin Pereira: Well, what do you think about the moratorium on data centers?
Gavin? Do you think that's a good thing? It's terrible. Do you think it stands a snowball's chance? I think it's, I think it's a,
Gavin Purcell: I think it's a terrible idea, but it will stand a chance, because people are afraid. And this is the thing. The other thing that's interesting about this conversation is that Reuters just had a story this week about how China has now figured out how to produce high-end AI chips.
Right? Right. So for the first time ever, they were supposedly able to create these sorts of chips; this is a new story that, within China, they figured out how to create the kind of chips they've had to buy from Nvidia before. So, all of this conversation: do I want jobs to continue?
Yes, of course. Do I want creatives to continue? Yes, of course. But a moratorium on data centers stops progress. It's this whole idea that we have to decide, as a culture and a community, and really as a country and a world: are we [00:34:00] gonna try to stop things as they move forward, or are we going to be aware that things move, and be the best version of that we can?
Kevin Pereira: Yeah. Uh, look, a clarification on the China story: it looks like they have a machine that can produce the extreme ultraviolet light that is needed to make the high-end chips, which is still a massive breakthrough, by and large, because we refuse to give them chips, but that's neither here nor there.
Uh, so they've got that far. They haven't produced chips yet, but it's probably a matter of minutes. In fact, by the time we hit publish on this podcast, they'll probably have the chips. So, to your point, you have to decide: if the end result of this marathon (we're presently sprinting, but it's gonna turn into a marathon) is a technology that can fundamentally change everything we know about art, science, math, et cetera, should we race to get there, or should we let someone else get there first? And yes.
Gavin Purcell: And then dominate the conversation, or not let you create a game that [00:35:00] you want, because they say, oh, you can't make that image. Right. That's a thing. Right?
Kevin Pereira: Um, by the way, Google's Antigravity coding system can use Nano Banana natively. So if you ever wanted to vibe code a game, you could go in there and say, I wanna make a little game,
I want it to be a webpage game, blah, blah, blah. And then you could say, here's the style that I want for the graphics. And if you have enough credits, it will go and generate the graphics for you: the right resolution, the right size, the right everything. It works. Wow. Cool. I fully used it last week to bang out a quick project, and it did a phenomenal job of generating assets for it.
So again, for the individual creator, yes. I don't know if it's gonna be Balatro 2, or maybe someone will do Vampire Survivors mixed with Bicho. I'm just excited for the weird games that are gonna get made when, yes, yes, you know, people that traditionally were not invited to that conversation can do it.
Um,
Gavin Purcell: yeah, I think that's the biggest thing across the board: just open up the conversation and try to figure out way more tools for people to use.
Kevin Pereira: I wanna shout out [00:36:00] Meta, because, you know, Llama for a minute was very, very exciting and interesting, and then the wheels seemed to fall off, and then they went on a hiring spree, but
it hasn't really paid dividends. But they have the SAM family of models. Yes. As in Segment Anything. And they released SAM Audio, which, uh, I believe the weights are out too, and it's open source, if I recall. I think that's right. You can use basic text prompting to extract and modify audio from any given source.
So for example, if you have a noisy street interview. One of their examples: there's a train going across the tracks in the background, there's a woman speaking into a camera, there's a car that drives by with some music blaring from it. And they're literally saying, hey, segment the speaker's voice,
and it isolates it. You don't hear the wind, you don't hear the train, you don't hear the car. Okay, let's just get the train: it isolates it. And look, it's a powerful thing for people who make media, if you've ever tried to edit podcast audio and you know how difficult that can be sometimes, uh, [00:37:00] or if you've done a shoot in a noisy environment, or you wanna change the music in the background of something because it's licensed.
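SAM Audio's real interface isn't shown in the episode, so nothing below assumes its API. As a toy stand-in for the idea of isolating one source from a mix, here's a fixed frequency-mask separation in NumPy; a model like SAM Audio learns a prompt-conditioned mask instead of this hand-picked 500 Hz cutoff:

```python
# Toy source separation: pull a low "train rumble" out of a mix by
# masking the spectrum. Real text-prompted models predict the mask;
# here it's a hard frequency cutoff, purely for illustration.

import numpy as np

sr = 8000                                # sample rate in Hz
t = np.arange(sr) / sr                   # one second of audio
train = np.sin(2 * np.pi * 100 * t)      # 100 Hz rumble ("train")
voice = np.sin(2 * np.pi * 1000 * t)     # 1 kHz tone ("voice")
mix = train + voice

# Keep only spectral bins below 500 Hz to recover the rumble.
spectrum = np.fft.rfft(mix)
freqs = np.fft.rfftfreq(len(mix), 1 / sr)
spectrum[freqs >= 500] = 0
isolated = np.fft.irfft(spectrum)

# For pure, well-separated tones the recovery is near exact.
error = float(np.max(np.abs(isolated - train)))
print(error < 1e-6)  # → True
```

Real speech and traffic overlap heavily in frequency, which is exactly why a learned, prompt-conditioned mask is needed rather than a cutoff like this.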
But then you start thinking about the augmented-reality future, Gavin, where we're gonna have bionic capabilities and be able to say not just, let me hear the person that I'm speaking to right now, but, oh, what's that song in the background? Amplify that. Let me hear it. Hey, just record the background song or whatever.
Yeah, I'll Shazam it later. Do
Gavin Purcell: you wanna play a little bit of the clip that they have? 'Cause it is actually interesting to hear how it isolates one layer over another.
Meta SAM: You'll not believe the day that I have. Seriously? No way. I can't wait to tell you all about it when I get home.
Kevin Pereira: So, like, super noisy. You hear a subwoofer from a car, there's wind, there's a train. Then let's go to, okay, let's isolate the speech. Hey, hang on, can you hear me? Super loud. You know what, actually, let's cut it. Very loud indeed. Um, lemme go to general sounds here. This is, like, the traffic and everything. It's not the most exciting thing to listen to, but you don't hear the person speaking.
Gavin Purcell: Yeah. So basically [00:38:00] it's allowing you, layer by layer, to go into a soundscape based on one recording, which, if you've ever made video recordings, is an incredible tool to have. And we talked about this, I mean, forever ago; we got very excited about the idea of
sound enhancement. This is maybe the most useful thing for somebody who's shooting video that I've seen come out of Meta in this way, right? Yeah. Which is a very cool thing. So yeah, shout out to the SAM models. Very awesome. Um, Kevin, I wanna move on to this thing you showed me, which is interesting: a pair of legs as a robot.
Yeah. We talk about robots on the show quite a bit, but there's a walking pair of legs, because why would you need a head or a body, really, if you just need legs? Why not? Tell me what the Tron 2 robot is.
Kevin Pereira: Yeah, so, I mean, they're calling it a Lego-like robot, essentially. It's a modular system. There's a head, torso, arms, and legs, but they can all be sort of bolted on and interchanged whenever needed.
Um, so conceivably, instead of just, well, this is our robot, this is our form factor, it could be: well, for this particular iteration, it needs wheels, or we [00:39:00] want three or four arms on it, or whatever. You can kind of bolt together the version of the robot that you want. But when you see the robot legs without the torso and the arms and the head, you realize, like... that, to me, was chilling.
'Cause it's like, oh yeah, in the future Skynet wars that we're gonna have, when you blast apart half the robot, it's still gonna come marching after you with just the legs. Like an ED-209, but just legs, and it's gonna soccer-kick you into oblivion.
Gavin Purcell: Imagine if those legs were, you know, stylized in specific ways. It could get very scary. It could be, like, robo legs, or it could be, I don't know, wolf legs, for goodness' sake. Imagine a series of wolf legs coming at you. Shin blades, yeah. Ooh. Just little, like little
Kevin Pereira: katanas at the knees that come out. Shing, shing. That's right. It's not gonna end well.
Gavin Purcell: You never know. All right, everybody, we're just trying to see what you did with AI this week. It's time for AI See What You Did There.
AI See What You Did There: Without a care. Then suddenly you stop and shout.[00:40:00]
Kevin Pereira: Apple released a new model that lets you take a single image and make a Gaussian splat out of it. We've talked about these splats before, but it's essentially a bunch of little 3D pixels that, with pretty high fidelity, let you look around or navigate worlds. And splats can be small, like tiny little snippets of something,
or they could be citywide splats, if you, like, flew a drone over and grabbed high-res, uh, imagery. So it's extracting depth: it's taking the data from the image and wrapping the depth in, like, a texture. And so Apple released a model that very quickly processes these splats out of images, and people are taking that tool and putting it in, like, the Apple Vision Pro.
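To make the depth-and-texture description concrete: a splat is basically a colored, semi-transparent blob with a 3D position, and the stored depth is what produces parallax when you move your head in the Vision Pro. A simplified sketch (the fields and the pinhole projection are illustrative assumptions, not Apple's actual model or file format):

```python
# One "Gaussian splat," reduced to its essentials: a 3D position,
# a size, a color, and an opacity. Real systems store a full 3x3
# covariance and view-dependent color; this is a toy version.

from dataclasses import dataclass


@dataclass
class Splat:
    x: float
    y: float
    z: float        # depth from the camera
    scale: float    # blob extent (a full covariance in real systems)
    color: tuple    # (r, g, b)
    opacity: float  # alpha used when blending splats front to back


def project_x(s: Splat, focal: float = 500.0) -> float:
    """Pinhole projection: a point at depth z lands at focal * x / z.
    Near splats shift more than far ones as the camera moves, which is
    the parallax that makes a single photo feel 3D."""
    return focal * s.x / s.z


near = Splat(1.0, 0.0, 2.0, 0.1, (0.8, 0.2, 0.2), 0.9)
far = Splat(1.0, 0.0, 10.0, 0.1, (0.8, 0.2, 0.2), 0.9)
print(project_x(near), project_x(far))  # → 250.0 50.0
```

The whole trick of a single-image splat model is estimating that `z` for every pixel; once depth is there, the renderer gets the look-around effect for free.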
Very cool. Yeah. And there are versions of this with, like, hiking photos or whatever. I saw one where someone had a photo of their newborn and they turned it into a splat, and it feels like a [00:41:00] Black Mirror scene, where they're able to zoom in on their newborn, and the newborn is resting on, like, a bed of blankets, and there's, like, foliage around,
Gavin Purcell: and you say black mirror.
I say this is a charming, nice way to see your family, Kevin. That isn't scary at all. It's a nice thing. Imagine you're on another planet and your wife were able to send you these things. Like it would be great. Okay. That's fair.
Kevin Pereira: That is another take. Um, I think it's creepy and weird. And you can also imagine, well, there's tools that can take a single image, Gavin, and make it come to life with video, and tools that can add audio to that video.
In fact, some of them do it in one shot. It's not hard, yes, to imagine this future of, oh, we're gonna take our old memories, this black-and-white ruined photo of, you know, grandpapa coming back from war, and we're gonna put it into the machine. It's gonna colorize it, clean it up, yeah, make it a 3D splat, and then make it a motion video.
And once you have that, you might as well make it interactive. And now we're taking a single photo, and we're walking around [00:42:00] and chatting with people from, like, photos. Like, that's, yeah,
Gavin Purcell: I mean, it's the world-models thing, right? We've talked about world models on the show before, but when you can start simulating a world, the Gaussian splats start to kind of get you there.
And then, actually, this really interesting workflow from Chetty Art used Wan 2.1 to recreate The Matrix with Homer Simpson. So what's interesting about that is, if you think about what it looks like to do 3D environments,
Meta SAM: yeah.
Gavin Purcell: You could also then skin those environments based on whatever you would want.
And so you could change that environment in a really interesting way. So this is probably, I wouldn't say the super distant future, but it's like two to five years out: you'll have these kinds of 3D environments in which you can walk around and change the look of it all.
Kevin Pereira: The Homer clip, like, you just kind of casually mentioned it there.
It is phenomenal. Yeah, it's great. Right? It's really cool. It's 3D Homer dodging the iconic pink donuts, yeah, on the rooftop, with a very Springfield-looking background. That is such a good one. And it also makes me think of the Miss Piggy and [00:43:00] Melania trailer. Yes. Yes. So this
Gavin Purcell: is from Charlie B. Curran, who recreated the Melania trailer. There's a documentary about Melania Trump, and he put Miss Piggy, the character, in every shot. If you haven't seen it, it's really worth checking out. It's just another very good example of editing using AI video, which I thought was fantastic.
Kevin Pereira: The speed-to-meme is incredible.
I haven't seen this last one yet. I'm clicking on it now for the first time. Okay, so I wanna
Gavin Purcell: explain what this is. So, I've talked about Sora's remix feature before, and something I find really interesting about it, and I don't think enough people know about Sora, is that, yes, you can make videos, but remixing videos is one of those things that really only happens on Sora, and more so in the app.
And you see a way that people can build off of each other. So Kevin, if you click on this one (and by the way, the audio is making a dumb, kind of juvenile joke about a name), it's a woman showing up on a soap opera from the eighties, and she walks in, kind of a classic soap actress.
But if you swipe right, what I find so fascinating about the way that people build off [00:44:00] of each other's Sora videos is that one person will change something, and then the next person changes something else, and the next person changes something else. And I just love that. I just love that this woman went from being a pretty normal-looking soap opera actress to having an insane amount of hair.
Just an incredible amount of hair that grew in different ways. And then eventually I created a version of her that was made entirely of hair. And to me, this is kind of the weird part. You know, we talk about 3D models and world models; this is the way that AI is changing creativity, right? Because before, yes, you could do remixing of other people's things, but there wasn't a way to leap off of somebody else's stuff, share it, and have a shared language around it.
This feels like a new medium. And this is what I just wanna point out: this is why Sora feels like it is changing things. And it's not just Sora; people are using AI video everywhere, but what Sora does is put it all in one place and let you do it. So anyway, I just thought this was a very cool way to look at what AI video is and what makes it [00:45:00] different from just recreating movies or something like that.
Kevin Pereira: Super fun.
Gavin Purcell: Um, that's
Kevin Pereira: awesome.
Gavin Purcell: All right, everybody. We will see you all next week, or whenever. We are probably gonna take a week off here for the holidays, and we will see you all very soon. Bye, everybody.
Kevin Pereira: Bye friends.