Google Drops SO Much AI: VEO 3, Gemini Madness & More. Plus, OpenAI's $6.5B Bet

VEO 3, Gemini Diffusion, Android AR Glasses & so much more… Google went HAM on the AI space this week at Google I/O. Plus, Sam Altman spends 6.5B on AI wearables & Jony Ive.
After last year’s semi-lackluster I/O event, this year Google rolled out Sergey Brin & Demis Hassabis and a whole lot of new product. We go deep on all of it but especially VEO 3. Gavin bought the enormously expensive Google AI Ultra plan to tell you how it works.
Plus, Anthropic’s Claude 4 is rumored to come out… today? Microsoft’s Build event showed off how MCP will directly integrate into Windows, OpenAI’s Codex software agent and, yes, MMA Fighting Robots.
IT’S MOVING REAL FAST AGAIN Y’ALL. SPEED UP INITIATED!
Join the discord: https://discord.gg/muD2TYgC8f
Join our Patreon: https://www.patreon.com/AIForHumansShow
AI For Humans Newsletter: https://aiforhumans.beehiiv.com/
Follow us for more on X @AIForHumansShow
Join our TikTok @aiforhumansshow
To book us for speaking, please visit our website: https://www.aiforhumans.show/
// Show Links //
Google’s I/O Keynote https://www.youtube.com/watch?v=o8NiE3XMPrM
Google’s Sergey Brin on his return and Gemini AGI
https://x.com/vitrupo/status/1924997623129505799
Veo3, Imagen 4 & Flow AI Video & Image Models
https://blog.google/technology/ai/generative-media-models-io-2025/
Veo 3 Specifics From DeepMind
https://deepmind.google/models/veo/
“Can We Talk” VEO 3 Example
https://x.com/arikuschnir/status/1924953349943697763
Gavin’s VEO 3 Generations 70’s Chef Show https://x.com/AIForHumansShow/status/1924934882930917861
Hot Dog City Rap: https://x.com/AIForHumansShow/status/1924937880369299934
Cursed Mayo: https://x.com/AIForHumansShow/status/1924949594796130779
Action Scene With Four Attempts:
https://x.com/AIForHumansShow/status/1924943671323168976
Fortnite Generated in VEO 3
https://x.com/mattshumer_/status/1924994290729599222
Gemini AI Ultra costs $250 (!!) per month
https://www.theverge.com/news/670495/google-ai-ultra-plan-pricing-launch-io-2025
Google’s Android XR Gemini Glasses https://blog.google/products/android/android-xr-gemini-glasses-headsets/
Gemini Diffusion
https://deepmind.google/models/gemini-diffusion/
Google’s Try It On Experiment
https://blog.google/products/shopping/how-to-use-google-shopping-try-it-on/
Google Beam
https://blog.google/technology/research/project-starline-google-beam-update/
OpenAI Buying Jony Ive’s AI Wearables Start-up for 6.5b https://x.com/markgurman/status/1925235383102812491
OpenAI’s Video With Sam & Jony Ive For New Device Company
https://x.com/sama/status/1925242282523103408
Claude 4 Coming Soon Rumors
https://x.com/btibor91/status/1925084250107478506
Code With Claude Event
https://www.anthropic.com/news/Introducing-code-with-claude
Microsoft Build: MCP in Windows
AI Darth Vader in Fortnite https://x.com/DiscussingFilm/status/1923361246457241679
AND… SAG-AFTRA is involved. https://arstechnica.com/ai/2025/05/fortnites-ai-darth-vader-spawns-unfair-labor-practice-charge-from-voice-union/
MECH COMBAT ARENA!!!!!
https://x.com/TheHumanoidHub/status/1923087269914706414
VEO 3 Examples
Entire Car Convention Video: https://x.com/laszlogaal_/status/1925094336200573225
Skate Park Interview: https://x.com/venturetwins/status/1924927087842087322
Dog Runs to front Porch: https://x.com/nmatares/status/1924931844879134804
AI Stand-up https://x.com/fofrAI/status/1924924738494669011
Bear Stand-up: https://x.com/AIForHumansShow/status/1925210779168768225
Kevin Pereira: [00:00:00] Google, just straight up bodied the entire world, setting us up for the next generation of artificial intelligence.
Gavin Purcell: That's right, Kevin. At Google I/O, they dropped a ton of new stuff, including Gemini Live, new AI agents, and maybe most excitingly an update to Veo, Veo 3, where you can do all sorts of
Kevin Pereira: Stop.
Why are we even doing this? Let Veo 3 do it for us. What? What are you talking about, Kevin? Veo 3, new video model. It does audio, it can generate people. It has good lip sync. Like, we are obsolete, so just roll it. Ah, you're right. Okay, let's try it.
Welcome back to AI for Humans. Your hosts have been dealt with.
Okay.
Kevin Pereira: Eh, that was a bad idea actually. Oh, wait, wait, wait.
Gavin Purcell: I gotta, I'll try it again. Let's try it again. Okay.
Kevin Pereira: Welcome to AI for Feet.
Thankfully we got rid of the humans.
Kevin Pereira: Nope. That's worse. That's actually, that's objectively terrible. And uh, if you're only getting the audio of this podcast, good,
Gavin Purcell: We will get more into Veo 3, but also OpenAI just bought Jony [00:01:00] Ive's AI wearable company for $6.5 billion.
Kevin Pereira: Oh, also, AI Darth Vader is in Fortnite and everybody is unhappy about it.
Gavin Purcell: And finally, Kevin, one of the things we've been waiting for forever: robot MMA.
How exciting would a fight between humanoid robots be?
It is time for AI for humans.
Gavin Purcell: Kevin, huge, huge stuff in AI this week. We are gonna get to so many things. I mean, literally OpenAI is buying a thing from Jony Ive, a giant new AI wearable, but most importantly, Google I/O happened this week and I was kind of shocked at how big it was. Not only big, but, like, actually a lot of new things dropped.
What did you think about this? Just at the top?
Kevin Pereira: It was overwhelming. Yeah, there's no other way to put it. It [00:02:00] was announcement on top of announcement on top of announcement. Like, clearly the company is aligned and marching towards something, and in true Google fashion, I expect, uh, 80% of what we saw on stage to not exist this time next year.
And I think that's okay because they announced so much that even if 20 squeaks through, it's still, uh, incredible and some of the stuff, you know, launched, Hey, and it's ready now. It's in your hands now or it's coming soon as in tomorrow. Like that was also impressive. But there's so much to catch everyone up on because they really touched on everything.
Gavin Purcell: Yeah. And one of the most fascinating things about this whole experience was seeing the return of Sergey Brin, the co-founder of Google, who has very much folded into these efforts. Let's start with a little, uh, sound up from what he said on stage while he was being interviewed next to Demis.
Sergey Brin: I think as a computer scientist, uh, it's a very unique time in history.
Like, uh, honestly, anybody who's a computer scientist, uh, [00:03:00] should not be retired right now, should be working on AI. That's what I would just say. I mean, there's just never been a greater sort of problem and opportunity, a greater cusp, uh, of technology. Um, so I don't, I wouldn't say it's because of the race,
uh, although we fully intend that Gemini will be the very first AGI. I clarify that.
Gavin Purcell: So there you go. Sergey's back. He's also looking a little billionaire chic. He's got some stubble. He's looking chill. You had mentioned you saw him on the All In Podcast where he had a glass of wine, like Sergey is back in the groove.
Maybe that's Mo. I would watch that movie. By the way, Sergey's back in the groove.
Kevin Pereira: You're watching it play out at the I/O conference. He just steps off his yacht and says, we're gonna make AGI happen. That's where he's at in his life.
Gavin Purcell: And I think, Kevin, when I watched this very long presentation, it's two hours, the thing that jumped out to me right away was Veo 3, and Veo 3 is the next generation AI model
from Google. We've talked about Veo 2 for a while as one of the best AI [00:04:00] video models, but Kev, when I started to see the clips this could make, I was pretty shocked for one specific reason. Can you guess why? Well, no. Why? Well, Kevin, it can do audio as part of the clip. Oh wait, that's why. That's why
Kevin Pereira: I was like, can I guess why?
No, I know why, I watched the conference. I don't know why you were shocked, but yes, it does. In a single shot, you can control cameras, characters, you can get a consistency out of scenes, and there's physics, like really, really in-depth physics, like the impressive level of mixing of substances and whatever.
But yes, you can also hear the clips, and that means character dialogue. That means sound effects, like nat sound, from the wind to the reverberation of a character's voice within a scene, to even music. Yeah. This is a fully multimodal something, and now, as you're scrolling casually on social feeds, you have to stop and go, wait, is this clip AI or not?
Like for the first time, truly, you have to stop and look [00:05:00] very closely and listen closely.
This ocean, it's a force, a wild untamed might, and she commands your awe with every breaking light.
Gavin Purcell: To me, the other thing that they do really well is the lip sync is great. So for dialogue, lip sync is great, and there's a great clip somebody created.
Let's play a little chunk of it, called "Can We Talk." This was a video created by, uh, somebody named Ari K that they dropped on X, and it just gives you a good example of what's possible with this. We can talk
No more silence.
Gavin Purcell: Yes, we can talk.
Ah, we can talk. We can talk. We can talk with accents. Oh, I think that would be marvelous.
Fun. Yes, it is Very fun. But yes, it is very good. Good. It's a very fun. I can talk. Yes, we
Gavin Purcell: can talk.
Kevin Pereira: Yes. Yeah, we can talk. We can talk, we can talk. Yeah. So
Gavin Purcell: you get a sense of what's going on here. Just for people that are only, uh, listening to the podcast, this is two very realistic people in most of these talking to each other.
And again, in the past when you would be making AI video clips, [00:06:00] you'd be generating the video first and then trying to find an audio model that would match so that you could connect them together and then lip sync that audio to it. What's different with Veo 3 is it is a truly multimodal video output, meaning that you are getting both the video, which is by far in my mind the best one I've seen to date, outside of the fact that you're gonna have a lot of things you can't generate based on rights.
But you're getting dialogue, you're getting, uh, music. To your point, Kevin, you're getting sound effects and all of that is baked in now. It is not perfect and we can talk a little bit about that, but to me, this is the most excited I've been about AI video since the Sora debut before we were able to play with it.
Kevin Pereira: And this thing we can actually play with. I mean, and again, if you're audio only, the sample that we just played has characters in every time period. It has cartoon characters. It has characters that are singing in small, like, smoky clubs, and you're hearing the full band behind them. That is all one model, one prompt, all coming out.
But they also announced a [00:07:00] new, uh, video editing software called Flow, which essentially lets you prompt any scene into existence. Grab the handle like the end of that scene. Massage it to where you want the scene to either cut to a new scene or extend the scene in a different way so you can rapid prototype entire movies, commercials, whatever you wanna bring into existence.
By typing, getting it together, changing the handle, prompting a new outcome.
Gavin Purcell: You think that, and I will say my experience so far with Flow has been less flow and more semi-flow, flawed. But it is very cool and I think it's really interesting. The thing about Flow: they've designed a tool, like what Kevin said, that basically allows you to kind of
stitch these together in the same way that when Sora came out, they tried to design that storyboarding tool. The issue that I've run into so far is that, I think, A, it's pretty expensive, which we're gonna get to in a bit and talk about what this costs, 'cause it's not cheap, but [00:08:00] two, consistency of voice, consistency of music, all of that stuff is not set yet and it's a big issue.
But first, let's just talk about. What we did with it. Did you see
Kevin Pereira: the example of the guy in the convertible with the giant chicken? Yes,
Gavin Purcell: I did see the, the convert that was Yes. Yes. On stage. So it's clearly a skill issue, Gavin, that's what you're
Kevin Pereira: saying.
Gavin Purcell: Uh, by the way, I, you don't know how to flow, bro. I did tweet out this morning that Google is missing out by not having like a one hour, very deep, uh, flow tutorial that they put out for people to use it because it's not the most intuitive software to use in the world.
Um, but I, like I said, I did go hands on with this. Um, I spent a good amount of yesterday generating video. The very first one I tried to make, I think, is still my favorite. I was trying to find weird things people could do, 'cause a lot of people were trying to do different stuff. So this was a one-shot prompt.
There are some issues with it, but I wanna play it and listen to it first and we can describe what's going on.
Now, this takes some elbow grease. Hey, watch it lady. I'm trying to rise here.
Gavin Purcell: Okay, so what this [00:09:00] is, I prompted for like a seventies cooking show sitcom. Like I was trying to make a sitcom clip, and there's a bunch of really good examples of sitcoms out there where they add in the laugh track and other stuff.
My goal here was to have her working on this piece of dough, and then the dough becomes alive and has a Brooklyn accent. And the dough is supposed to be saying the second line, not, not her saying it. Right, right. But there are a couple really interesting things that happened here. One, first of all, it looks amazing, right?
Like the look of it looks like a, a, a sitcom. The woman's voice is perfectly lip synced to her mouth. You hear the
Kevin Pereira: rolling pin
Gavin Purcell: against the
Kevin Pereira: dough in the counter, like it's touching all of those things.
Gavin Purcell: There's a small element at the very end that I found super fascinating, which is the character. If you're not watching it, again, the dough has kind of like cartoony eyes and a mouth, but then when she puts the dough back down,
the mouth and the cartoony eyes kind of fade out nicely, and you see it go back into the dough again, which was just like a thing I wouldn't have expected an AI video to do.
Kevin Pereira: The telltale signs, like looking at the hands for motion blur, uh, or [00:10:00] for bleeding of pixels, looking in the background, like baskets of fruit hanging, a kettle on the stove, the pots that are dangling, the, like, fun seventies-colored kitchen, like all that looks coherent and believable. Like, it hallucinated
a very, very believable world for you.
Gavin Purcell: Absolutely. And then Kevin, I couldn't leave Hotdog City out because I saw a couple clips of people trying to make raps. So this was a very simple prompt. I said, I wanna have a rap about a guy who's rapping about Hotdog City with two people as hot dogs dancing behind him.
So let's play that. Welcome to the city where everything's a dog. Yeah. A
hot dog. That's right. Right.
Gavin Purcell: So interestingly there, Kevin, like, first of all, again, lip sync's amazing. Um, I didn't give it any direction on, like, what the music should look like. It's funny at the end of that clip where it's trying to figure out what it's gonna do, because these, by the way,
are only eight seconds. That's the max that you can do, uh, on their own, but like pretty good, right? That was a single shot. I didn't do anything extra. I just, it came out like [00:11:00] that. And we've tried Hotdog City for so many different formats, and generally video is really bad at it because it's such a weird, surreal idea, and it's not perfect in the background.
But there's a city and there's some giant hot dogs. The two guys in hot dog outfits are very clear and clean and they're dancing. So again, something I thought was just really awesome.
Kevin Pereira: I mean, we're gonna have a left shark moment here, but right hot dog, right dog. When he crouches down at one point, you see the hot dog outfit that he is wearing deform.
You watch the wrinkles fold at the waist of the character. Like the physical modeling and understanding, yeah, of the way the material of a hot dog outfit would be on this completely made up character on this city street filled with hot dogs. I mean, that's the stuff, the tiny little things that make you take a little pause.
Gavin Purcell: And Kevin, I couldn't leave this out. Uh, couldn't leave cursed AI out, unless I was doing this, uh, my own brain. So if you go to the cursed mayo gen, you may have seen a video we made with [00:12:00] mayo about Veo 2. This is our first mayo-centric Veo 3 video.
It's been a long day. Thankfully Mayo is here.
Gavin Purcell: Okay, so what's fascinating about this, if you're not watching: there is kind of a low-light, very realistic shot of a kind of working-class guy sitting in a small kitchen. He has a jar of mayo and he says, I prompted that dialogue in there, but he says that line and then eats a spoonful of mayo, and you hear the sound of him eating the mayo.
You hear the sound of the, like the clicking of the spoon. Again, all of this is incredible to me. I think that the key will be consistency and the key will be seeing how you can make stuff work over time. I did try an action movie shot, which, um, we'll show here, but we don't have to, like, uh, listen to, but it's basically, I gave a line to a guy and a little teddy bear next to him.
I was gonna give him the line, and you can see, if you go to our X handle, you can see the four versions of this. We'll put it in the show notes, [00:13:00] but each time it didn't get it exactly. And one of the most frustrating things a little bit about this is that, you know, unlike other, like say, unlimited video plans, if you're paying a lot of money, you have limited credits to work with and, you know, that is a big deal when it comes to how expensive it is.
Kevin Pereira: Also, if you're wondering, did Google owners of YouTube use content from YouTube to train this system? Uh, and we don't know for sure. We believe so, but if you want a good indication of whether or not they did, just prompt it to do a streamer reacting to anything video, and you will get out very, very good looking video gameplay.
And someone in a corner, usually in a gaming chair with a headset, screaming about that. You get both videos in one with audio. I'm gonna play a clip here. This is Matt Shumer. He put this out there with a tweet, quote, "I don't think Veo 3 is supposed to be generating Fortnite gameplay."
VEO 3 Clip: Oh my God,
Kevin Pereira: yes. [00:14:00] Woo.
Victory Royale with a pickaxe, woo! Now that was, I mean, I saw that video for a second and was like, wait, wait a minute. Hold on. He's doing the gag where you use real video and pretend it's AI. And then when you zoom in, you can see that the Fortnite interface is clearly a little melted into itself, but this is completely believable looking
Fortnite gameplay with a live action looking streamer in the corner. Shouting into the microphone about their victory. And if you didn't go full screen and if you did not scrutinize this clip, you would think that's, that was a legit something,
Gavin Purcell: which is where we are with this, which is crazy. I think the one thing to be aware of is like Google is trying to catch a lot of this stuff based on names that are prompted and, and I've tried prompting a few things that are very specific ips and they wouldn't come through.
In that instance, Matt did not, according to his prompt, use the word Fortnite. Right. So that probably is why it slipped by. I assume that will not happen again for a while. But there are other people that were using Minecraft, uh, streamers and they looked very [00:15:00] direct and, and right. We are gonna share at the end of the show today, in our AI See What You Did There section, a kind of full Veo 3 selection of some of our other favorites, including prompts there.
So stick around for that. But Kevin, the downside of all of this, this Veo 3 magic, is that Google also introduced a new paid plan for their AI studio, which gets you a lot of stuff and we're gonna get to that, but it is not cheap. This is the most expensive paid plan we have seen to date. Uh, Google AI Ultra will cost you 250 US dollars per month to access.
Kevin Pereira: Yeah, we blew six months of Patreon on three clips of old men eating mayonnaise, is what you're trying to say.
Gavin Purcell: No, that's not true. That's only one month of Patreon at this point. That's all it is. But right now, I will say, there's a discount for three months. So you can get it for $125 a month instead of $250.
So that's active right now. It is still very expensive. You also do not get unlimited Veo 3 gens, so you get about 12,000
Kevin Pereira: credits. One of the [00:16:00] things they're clearly trying to do is, is go after not only OpenAI with a higher tier, but with Apple, I think the way they're bundling in all of their services, like an Apple one subscription.
'cause you do get Gemini, Flow, you get Whisk, which is another video generator, NotebookLM, uh, Project Mariner, which we will maybe get to, uh, you get YouTube Premium, that's like a 15 to $20 thing, I think. And then, uh, 30 terabytes of photo, Drive and Gmail storage, which is more than zero terabytes of it, for, you know, again, the discounted price of $125.
Sure. $250 when that runs out is a lot of money. That's a lot. That's a lot.
Gavin Purcell: I think people are looking at this and I think it's important to separate: the Veo 3 thing is part of it, right? But I think one of the things people talk about with OpenAI's $200 Pro plan is they're getting access to o3, right?
And there's another big update that they're dropping, which is Gemini Deep Think, which is coming only to this larger model, and Deep Think is [00:17:00] going to be, uh, Gemini's very high-end reasoning model, so this is gonna be something you'll get access to. It's probably like what o3 Pro would be, which is not out yet.
I think people expected that to come out from OpenAI this week. But the tricky thing I find is that what these companies are trying to do right now is kind of build lock-in. Like if you're gonna pay $250 to an AI company or Google. Guess what? You're probably gonna be using all the Google AI services.
You're not gonna be jumping back and forth because there's no way you're gonna pay for that and the OpenAI price and whatever Anthropic's gonna drop. So to me it is a very interesting price point. I don't think I'll be keeping up after the first month. I wanted to try a bunch of this stuff, both for the viewers of the show and also for myself.
Um, but it is something that feels like it is going to get crazier over time. I've also seen people say, hey, this feels like it takes AI into another level of, like, what's possible for normal people. And I agree. Like the fact that you don't get any Veo 3 generations with the cheaper platform, to me feels off.
Kevin Pereira: Well, uh, someone did the [00:18:00] math, uh, FOFR. The AI Ultra plan is $250 per month for 12,000 credits. That's 150 credits per eight seconds of a Veo 3 video. That's basically 39 cents per second of video, which sounds astronomical, especially considering a lot of those seconds might be completely broken. Yeah.
And you need to regenerate them. But if you really do get, um, I don't know what your prompt was, but it was, I think it was cinematic shot of green ogre and swamp winking suggestively small bubble appears behind them in muddy swamp water. I don't know what you prompted. That's exactly, I created
Gavin Purcell: that. It didn't work for me though, was, it wasn't, uh, dirty and swampy enough.
Kevin Pereira: And, and that I understand that it wasn't quite your filth, but to get something like that out of the machine for 39 cents would be a steal for what it would cost to do it regularly. So it's, um, I mean, it's a delicate balance, but I think the best thing that people can do is to not actually purchase it right now to take that money, hold onto it.[00:19:00]
Um, yes. It's important and then give it to us in the form of a Patreon donation. But if you don't wanna spend money, you can still support this podcast, Gavin. You can like, you can subscribe, that's free on the old YouTube. Uh, you can join our Discord and hang out and chat. You can leave us a five star review on whatever platform you hear this or consume this because that,
Gavin, helps us out immensely.
Gavin Purcell: and you are part of a growing army out there. You are part of a group of people that are getting bigger and bigger every week. So it really does help when you do those things, like Kevin says, liking and subscribing to the YouTube obviously, but just sharing or leaving a comment there makes a big difference.
We've had, uh, top of both YouTube and audio, uh, in terms of audience recently, so thank you so much for doing that. There's a lot more to go. We want, we have a lot of stuff we wanna cover and yes. Having that money to help with the Patreon has been a big difference for us so far. So thank you everybody.
Kevin Pereira: Thank you everyone who, who backs and helps out. Now let's get to some future stuff. The tech that can sit on my face, Gavin.
Gavin Purcell: Okay, we have a bunch more stuff to get through from this IO event. Um, [00:20:00] speaking of future stuff, Kevin, there's a very cool new demo they dropped at the end of this event. For Google's Android XR platform and they have a new pair of XR AR glasses that are specifically designed, I think to compete with kind of meta's upcoming smaller glasses.
Not the big fancy ones that are still a few years away, but I dunno. Did you see the demo of these? It's actually pretty interesting to me and I might consider picking these up.
Kevin Pereira: A hundred percent. Listen, so XR is the extended reality. This is more the, uh, hey, we're gonna partner with, I believe, Samsung and compete with the Vision Pro.
Yeah. So this is like, put some goggles on your face. Yeah. You can let the real come in. But that's, that's the other thing. That's not the, that's the XR thing. Yeah. Okay. No, but this is, this is part of the thing is that they kind of sandwiched it together. Yeah. And they're, they're being kind of communicated similarly.
So, you know, XR is their platform basically. That will be like, you know, the way Android is for mobile phones or Android TV is for set top boxes. Android XR will be for the goggles. Android AR is the augmented reality [00:21:00] that is more the hyper lightweight, but see-through, yes. On glasses. Experience that they had some real time, some risky real time demos of at the end of the conference.
And that's the part, I mean, we've all been saying this is where it is going. Uh, I still don't think it's here. And it might be a few more years out. Even hearing, like, um, Sergey talk about it, it's like, ah, there's still some battery issues. Ah, there's still some latency issues. But they had 'em on their faces on stage doing some incredible things, which I think we should talk about right now.
Gavin Purcell: Yeah, I mean, to me, that demo is really interesting. So again, it's towards the end of the Google, uh, I/O event video. But watching her walk out, there was a small hiccup, but, like, walk out, get all the information in these little tiny popups, and you can kind of see, if you look at the opposite side of somebody wearing these, there's a little square that'll pop up when they get a piece of information.
But she was able to get directions. She was able to, uh, text somebody very easily via voice, like all the stuff you want. It's almost like what the promise of Google Glasses was. [00:22:00] Yes. 10 years ago, and now it's possible in a real way based on, yeah, connectivity based on technology. All that stuff feels like it's reasonable now.
So in my mind, like I think it's, I would want this, I think
Kevin Pereira: Well, and some tiny little things, like when you say get directions, like when you glance down, a full, almost like a rotating in real time, like, compass view of, yeah, a Google map appears, so it's not just like a spinach green arrow pointing on the screen. When you are interacting with the, uh, like the on-device agent, you can give it multiple commands at once, something Siri still can't do.
So if you say, send a text to so and so about. X, Y, or Z and then mute my notifications. It's like, got it done. That was two different commands. I'm doing them both. The one thing that excited me the most was the demo that kind of hiccuped at the end, where they had two people wearing the glasses speaking in different languages and in nearish real time, you're watching a translation of what that person is saying like.
On your glasses. So the idea of like, Hey, digital [00:23:00] nomads, go ahead and set up shop wherever you want. The largest barrier to entry to do that, or just those that love travel in general. Go and go to the remote areas of the whatever. As long as there's decent connectivity. You'll be able to understand and converse with people like that is, well, you might need to carry a second pair of glasses, I guess to put on their face.
But Point is still a very, very cool demo. Yeah. And I feel like we're inching closer to it being reality.
Gavin Purcell: Yeah. And obviously the AI stuff that we talk about every week on the show is kind of helping drive that in a big way. Kevin, the other thing that I think that you were really excited to talk about, that I was fascinated by here, is their diffusion models that are coming out.
Yeah. And what difference that makes. So this is a kind of a techie conversation, but. One of the big things was an update to the raw AI model that's powering Gemini in part.
Kevin Pereira: Yeah. This is like a new way of looking at, uh, generation. So normally it's token by token, meaning the AI is trying to predict the next word in the
sentence. That's, Gavin is on par. You're the new [00:24:00] Claude. So Gavin Opus 4.0 just launched. So yes, it's going token by token, which it's very fast, you know, in doing that, but it's always constrained by waiting for what is coming before it. Mm-hmm. This is a gross distillation, but this is sort of how it works with these diffusion models.
Um, the same way you've seen diffusion generate imagery where it starts with noise all over the place and tries to refine it down and make it make sense to give you the image of Shrek in the swamp. Winking, suggestively, dirtier, please. Gavin's prompts are weird. It can do that, but with like, let's say code generation.
So if you say, uh, build me an app that is, uh, a Tetris, but on the web you can think of the 10 different components that you might need to code to make that happen. But instead of waiting for the first one to finish, you can shoot off and start generating the different patches of what that thing might be, uh, and then try to merge them all together and make all the functions line up and make everything make sense.
It is out now. It is early [00:25:00] stages. It is so blindingly fast. Yeah. That you can imagine a future where you are just having a conversation with a machine and it is generating what you want. That quickly.
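To make the step-count intuition above concrete, here is a toy sketch. It is not how Gemini Diffusion is actually implemented (those details are not covered in this conversation); it only counts model passes under two hypothetical schedules. The function names and the `fraction_per_pass` parameter are assumptions made up for illustration.

```python
import math


def autoregressive_passes(num_tokens: int) -> int:
    """Token-by-token generation: one model pass per token,
    because each token has to wait for everything before it."""
    return num_tokens


def parallel_refinement_passes(num_tokens: int, fraction_per_pass: float = 0.25) -> int:
    """Toy diffusion-style schedule: every pass looks at the whole output
    and settles a fixed share of the positions at once (hypothetical)."""
    settled_per_pass = max(1, math.ceil(num_tokens * fraction_per_pass))
    return math.ceil(num_tokens / settled_per_pass)


if __name__ == "__main__":
    for n in (10, 1_000, 10_000):
        print(n, autoregressive_passes(n), parallel_refinement_passes(n))
```

In this toy model the sequential count grows with the length of the output while the refinement count stays small, which is the rough shape of the speed argument being made here. Real systems differ in how much work each pass does, so this says nothing about actual wall-clock speed.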
Gavin Purcell: I saw a tweet from somebody that worked on the demo that said they actually had to slow it down so that could show that it was working, which is pretty crazy and that is pretty impressive, right?
Especially if it's returning things that actually work. And even if it is like has a couple tweaks, it just speeds up every part of the process, which is amazing.
Kevin Pereira: I've been doing so much coding with AI Gavin, and I don't even like saying that. 'cause I think that's a disservice to people that actually write code because I'm, I'm, I'm really not.
Right? It's like I'm not Right. You're vibe coding through that. I'm talking to, yeah. I'm talking to a machine and it's generating things for me and I'm learning through that. But the biggest pain point is that sometimes you'll hit a button and you can walk away for five to 15 minutes because it's going through the code and it's writing other things and it's calling tools and searching the web and doing all that.
If it can happen that quickly, yeah. Then the fact that it fails or introduces bugs X percent of the [00:26:00] time is less an issue because I can very quickly refactor that. I can go and say, Hey, this thing is in here, but it looks wrong. This thing doesn't connect to that thing. Like to be able to. Communicate and build things that quickly.
This is going to change the way you present your work on the web, the way you build applications, the way you interact with friends. Imagine sitting down to play a game with your friends and you are creating the game that you want to play in real time. In real time.
Gavin Purcell: It's like with Suno, right? You had mentioned forever ago when you went to an event and you like created songs for your friends with Suno in real time based on the things that happened.
Now you can do that with apps and also Kevin, the other thing that's gonna be interesting is, what about the idea when something like this is integrated into their Gemini Live product? Gemini Live is kind of the, the fulfillment of the thing they showed off last year, which was Project Astra, which is, you can have your phone out.
You can be streaming and get real time feedback about things. They showed a demo of a guy in a bike shop or his own shop. I thought this guy had a very beautiful, uh, home, uh, working space for fixing bikes. But he is, he's using this [00:27:00] tool to kind of help him walk through how to fix his bike. The other thing, which was really interesting they showed is a video of.
Somebody kind of trolling this thing where they walk around a neighborhood and they say things are what they aren't, and Gemini does a good job of recognizing, oh no, that's not a whatever. It's a dump truck. Sick convertible
garbage truck. Again, anything else?
Kevin Pereira: There's one thing where he looks at some power lines and he goes, look at those pigeons, and it goes, birds aren't real.
They're government listening devices that, so know that, that those are microphones with wings. You're doing, you're doing
Gavin Purcell: that with Project Mariner. Or you're doing that with Project Astra and all these things.
Kevin Pereira: I do love the, the grounding aspect of like, oh, hey, what? Like, they even said like, why is someone leaving a package on my front lawn?
Yeah. And it's like, actually that's a power generator box. So calm down and take your tinfoil hat off. Okay. Get everything right. I wanna see what it says about chem trails Gavin, but not to, to gloss over that bike demo. Because what they, they really tried to show a wide variety of capabilities within it from like.
Hey, what kind of search the web and pull down a manual for this bike. Great. Got it. Find me the page that [00:28:00] deals with this. Okay. And it can crawl through the PDF and kind of highlight things. Where is that screw on my wall? Yeah. Even though it's perfectly labeled, it does do that. They, they also did the hand-wavy
"can you call the bike shop and see if they have the part for me" thing. They've been demoing assistants that will make phone calls for decades, forever, decades now. Forever. Yeah. Right. Like it's for a long while. Is this gonna be the one that does it? I don't know. But in the demo they had to go like, Hey, any updates on that phone call?
Which I thought was weird. 'cause they had to ask the ai, do you have an update? But it did say, yes, they have it in stock. Do you want me to order it for you? I think there's a lot of hand wavy portions there, but. You can see where they're, what they're, what they're aiming for.
Gavin Purcell: Yeah. And by the way, that ties into the other, one of the other big things that happened there is they're talking about agent mode, right?
Yeah. And this is Project Mariner, which is the idea that supposedly, and this is not live yet, but you could have 10 simultaneous tasks. Going out on a given time, so you could be calling that, that, that bike shop. You could also be having it deal with your wife who believes you should, you shouldn't be biking as much.
You could also have it be booking a pro [00:29:00] thing with your, with your psychologist to make sure that you've talked through that issue. All these things, Kevin, could be handled the same time. My life is just gonna be me sitting down. Making videos of, uh, feet talking, that's all it's gonna be, which will be incredible.
Kevin Pereira: Now again, you can only have a max of 10 wives in this scenario for dabbling. True. That's a good point. Because it's only 10 agents, so you might wanna pare down, you know, just plan your lifestyle accordingly and then yeah, you, you know. They had the fashion thing, which I've, I've seen.
Gavin Purcell: Yeah, so there's a, they did a demo, a thing called Try It On, which again is, we've seen versions of this, that is like, put on a shirt, uh, that's out there in the world and try it on and they'll link directly to it.
The thing I wanted to point out here is, Kevin, when you are building in this space, it is very tricky because you are always building against multi-billion dollar companies. Sure. And you're not entirely sure what a feature that gets launched. Might suck away your entire startup value. And in this case, there are a lot of really cool startups that are, that are doing this.
And I, by the way, I do think people will still be successful in this space as a [00:30:00] startup. Sure. But it is a feature now within Google Gemini that looks very mature and I think would be very good and people will use it. But the question is, would they go to a separate, uh, startup or a separate app just to do this?
I'm not entirely sure.
Kevin Pereira: The other question is, you know. Will this exist in seven months or was it just an interesting Yes, true on stage. Something that, I mean, and you can play with it. You can go and take a photo and do it. Like, maybe this will be an amazing thing that gets rolled out through the Google's advanced ai shopping powered experience.
Maybe, maybe, or maybe it just goes by the wayside in a few months. We don't know. Uh, in, in the avalanche of product demos and, uh, new labels and new things to sign up for and pricing tiers, we also got new models. Yeah, they released some new stuff. Gavin, that, that you mentioned Google's Deep Think mode. We have a new Gemini 2.5 Flash.
It's a very, very small model, which is important when you think about these things making their way to cell phones or set top boxes or, or or wherever where, you know, power and, and, and processing is, is limited. [00:31:00] What's up?
Gavin Purcell: I got a new wonky walrus. We're not gonna get wonky walrus here. No, but I got a new character.
His name is Benny Benchmark.
Ooh. Now that's a benchmark baby. Oh, it's not
Kevin Pereira: related to Benny Bleacher
Benny
Kevin Pereira: that we've had on before, which was different. Different Benny. A set of bleachers. Different
Gavin Purcell: guy. Different Benny. What Benchmark Benny has to say is that the Gemini 2.5 Flash model. Mm-hmm. This tiny model has come in as the number two model under the larger Google Gemini 2.5 Pro model on the LMSYS benchmark.
This is a big deal just from a pure numbers standpoint. It is beating all of the other models right now, which is pretty crazy.
Now. That's a benchmark baby job. Be job benchmark. Benny,
Gavin Purcell: Kevin, three very fast things that also happened here because we've spent a lot of time on Google I/O. Yeah. Tell me what you think about these.
Google Beam, which is a one-to-one AI saturated, um, communications tool for you to talk to people and have it look very realistic. Uh,
Kevin Pereira: you mean an AI first 3D [00:32:00] video communications platform? Gavin? Um, look, it looks cool. Uh, we've seen systems like this before. Give me the gladiator. Thumbs up or thumbs down?
We gotta move here. Thumbs up for potential thumbs down for implementation only for the moment because I, I think they're getting like three or four video feeds and stitching them together with ai, which is interesting, but I think there's a way to do it with a single webcam. So. Go.
Gavin Purcell: I don't, don't think in general, if you're the Roman emperor, you're not allowed to do thumbs up and thumbs down.
But I'll buy it. I'll allow it. I'll allow it.
Kevin Pereira: See that textbooks are wrong. They were actually more wishy-washy than me. Oh. Because they were like, they're like, there's nuance gladiators,
Gavin Purcell: AI mode in search. What is your thoughts on AI mode in search? So, sure. Okay. Sideways. It's fine. It's great. It's better than it was before.
That's also good to know. Finally. Google's real time translation. There was a cool video that they showed off of two people talking in different languages. It real time translates across the web. What's your thoughts on that?
Kevin Pereira: Two big thumbs up. Love it. Uh, can't wait for it to roll out for more languages.
As someone who deals with, uh, developers [00:33:00] in other countries, and there's a language barrier, uh, because I am, I can barely speak one. This would be just a, a great thing to have.
Gavin Purcell: Yes. What language is it that you can speak?
Kevin Pereira: Barely can speak leet.
Gavin Purcell: Is that right? L-3-3-T. Leet. Oh, fantastic. Fantastic, bro.
Okay. That is the big Google I/O roundup. It is a lot. We wanna make sure we got through all that, but there's some other huge AI news, Kevin. First and foremost, some breaking stuff today. Um, OpenAI is going to purchase, for $6.5 billion, Jony Ive's AI wearable startup. Now, this is not completely separate from OpenAI because supposedly Sam Altman has been in this for a bit.
They released a 10-minute, you could call this a Sam and Jony meet-cute across the streets of San Francisco, which is kind of fun. What is your thoughts on this? Maybe let's talk a little bit about what impact this could have on the future of AI and how people use it.
Kevin Pereira: Uh, I mean, look, we, we talk about how disruptive this technology is going to be to everything.
That would include [00:34:00] smartphones, that would include spectacles, that would include wristwatches, that would include so many things. And I think, uh, maybe, well, I, Jony's probably designed some eyewear, but at least for Apple, he's designed pretty much every major device that has been so disruptive to all of the things.
So the notion that he's trying to cram this intelligence into a, a, a fleet of devices, which let's be clear, like they do in their meet-cute video, they mention that this is several devices, not just a device. Um, but yeah, the company's io and you know, Sam is quick to point out, this is the guy who made the iPhone.
It's a pretty impactful device. I wonder what he could do if he gets to think at the forefront of artificial intelligence and he mentions that he spent time with their prototype and it's one of the coolest things he's ever used. So this isn't like napkinware. Yeah, it's a year away probably from us hearing about it.
But there is a prototype that is working already.
Gavin Purcell: I wanna do a little bit of the calculations on OpenAI in terms of they raised $40 billion in cash. We are [00:35:00] now what, $10 billion out the window with two acquisitions, Windsurf and this AI wearables company. They must be very aggressively planning that they have a crapload more cash coming in.
So this goes back to the plan of like. What's coming soon and how will it change our expectations of what people are gonna pay for somehow? Google got me to pay $125 this month, so that is not something I would've expected to have an AI service that costs more than my old cable bill in like 2007. But here we are.
Kevin Pereira: Let's look at it through the lens of the smartphone, right? Like every year or three, depending on what upgrade cycle you're on, you might pay upwards of a thousand dollars. Sure. For a new smart device in your pocket. And then you could be paying a hundred plus dollars a month for, uh, a plan for that phone just to let it pull data down from the cloud.
Like I feel like there's a real good chance to pair a device with some sort of AI powered plan, uh, that people would probably spend several hundred dollars a month for.
Gavin Purcell: I would imagine so. And there's been some rumors that there was an audio-only device that these, [00:36:00] that this company was working on, which would also be interesting to think about what would earbuds look like if it was a more significant thing.
The other big AI news from last week, Kevin, that we should just quickly hit is Codex was released. This is their first pure software agent. This is designed to work with coders to make software coding better. Um, obviously they have a lot of software coding baked in, and again, buying Windsurf is part of that, but I've seen kind of tangentially, I'm not a deep coder myself, but I've seen a lot of people that are like actual coders, not just vibe coders, say this is very valuable to them.
And, and it seems like something that, um, is making a pretty big dent in the coding space.
Kevin Pereira: Yeah, unfortunately it's at a time where it's like, you know, Google released a very similar product as well, um, with, uh, with cooler branding, arguably. Microsoft... Yeah. Jules, you're talking about Jules? Yeah, Jules, yeah.
Yeah. Uh, the website for Jules looks great. Uh, but you know, Microsoft announced that they've added agentic coding into their Copilot offerings as well, and support for Grok coming. We'll get to that, but I'm just saying like, it's. Is it [00:37:00] interesting? Yes. Are they clearly pushing in a direction where it's fleets of agents are going to write software for you or update your software or even maintain your website if you're not an enterprise person who's listening to this?
Yes. The answer is yes. It was just released at a time where, man, there was a lot of noise.
Gavin Purcell: The noise is gonna continue this week. As we mentioned last week, this is gonna be a big week for AI, and this Thursday, you may be watching this on Thursday, Anthropic has their big Code with Claude event and there's very heavy rumors out there right now that Claude 4, both Opus and Sonnet, are on the way.
This is a big deal because Claude has taken a long time to come out with an update at this point. And speaking of coding, Kevin, Claude was like the one that everybody was using, 3.7 Sonnet. And then a lot of people have switched over to Gemini 2.5 Pro to code with. Yeah, this could be a big deal if Anthropic drops something like, that's really amazing.
Kevin Pereira: I mean, this is emerging to be like the primary use case. It is a, it is a, it is a real big money maker and it's a lot easier to [00:38:00] understand than, than like any creative writing pursuits because it, it either works or it doesn't for the most part. There's more, and you can, people
Gavin Purcell: will pay for it, right?
They'll pay because it's actual value there.
Kevin Pereira: I have spent hundreds of dollars in the last week, Gavin, on tokens, basically to let AI assistants help code things into existence, which sounds very expensive, except it would've cost me tens of thousands of dollars
Gavin Purcell: to hire the people to do that. Yeah.
Kevin Pereira: Yes. And that's, that's where it's at right now.
Gavin Purcell: Uh, okay. So let's talk a little bit about another thing that kind of got blown away by all the other news this week. Microsoft had a big event called Build on Monday. Satya Nadella came out and did some chatting about it, but you wanted to specifically dive into something really interesting.
Kevin Pereira: Two things, uh, which is a little, uh, a little nerdy. Um, but they announced MCP is coming to Windows, to the operating system itself. We've talked about MCP before. It's called Model Context Protocol. In the grossest of 10,000 foot view terms, it allows AI agents to interact with [00:39:00] applications. Mm-hmm. Um, there are MCP servers to automate Gmail.
There's MCP servers for Spotify where you could say, Hey. Play my top five artists that I haven't listened to more than X hours in the last year. And rather than somebody having to write the code to do that, the AI can figure out how to get that outta the machine. Again, gross distillation, but it's coming to Windows, it's coming to the operating system itself, not to developer tools within there, but imagine being able to ask Windows to do something complex and it intelligently can plug into whatever software it needs to.
To make your request come true, generate me a flyer for blah, blah blah, and post it to my website and then share it with all my friends and grab my number one Spotify song and make that the soundtrack. That's a complex something that it could do. Now I wanna pair that with something called NLWeb, Natural Language Web.
Another nerdy sounding thing, but with just a few lines of code, [00:40:00] you'll be able to make your website, essentially have an AI conversation.
Gavin Purcell: I just fell asleep. Sorry. Just kidding. I'm didn't fall asleep. I was
Kevin Pereira: listening. I was listening. So imagine every website you go to, you can chat with it, like you chat with an AI bot.
Oh, okay. So you don't have to hunt and peck for anything. You can just ask questions about it. Uh, ask what's on the menu or ask a really, really deep, knowledgeable thing that's in its, uh, in its archive of files. Now, NLWeb makes every website, when it's activated, an MCP server. Imagine every website on the internet
capable of conversing in natural language, also with an AI agent. Now imagine Windows has these AI agents built in, so it can have these conversations. Suddenly, your operating system can intelligently navigate the entire web, get information, execute tasks, and all you have to do is just ask it to do something.
It's a grandiose vision, but it could happen.
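For the nerdier listeners, here is roughly what "an AI agent talking to an MCP server" looks like on the wire, as a minimal Python sketch. MCP messages are JSON-RPC 2.0, and `tools/list` and `tools/call` are the protocol's methods for discovering and invoking tools. Everything else here is an assumption for illustration: the tool name `play_top_artists` and its arguments are hypothetical, not a real Spotify or Windows tool, and no server is actually contacted.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC requests carry an id so replies can be matched to them


def jsonrpc_request(method, params):
    """Build a JSON-RPC 2.0 request envelope, the message format MCP uses."""
    return {"jsonrpc": "2.0", "id": next(_ids), "method": method, "params": params}


def list_tools_request():
    """Ask an MCP server which tools it exposes."""
    return jsonrpc_request("tools/list", {})


def call_tool_request(name, arguments):
    """Ask an MCP server to run one of its tools with the given arguments."""
    return jsonrpc_request("tools/call", {"name": name, "arguments": arguments})


# Hypothetical tool and arguments, loosely based on the Spotify example above.
req = call_tool_request("play_top_artists", {"limit": 5, "max_hours_last_year": 10})
print(json.dumps(req, indent=2))
```

The point of the design is that the agent does not need custom code per app. It asks what tools exist, picks one, and sends a structured request like the one printed here.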
Gavin Purcell: What's really interesting to me about that is [00:41:00] tying in what we just mentioned with Jony Ive's company, is like there's a world where that can all happen via voice. Yeah. Alright. I'm sorry right there. I'm sorry if you heard
Kevin Pereira: that. I totally nodded off into my microphone.
Gavin. What? Oh, you did? What was it?
Gavin Purcell: We both gotta get more sleep, Kevin. I don't think we're, we're not talking about voice is a thing that really is the stage that could be driven by that exact platform, right? If you can access everything, suddenly the AI agent can access everything. Well, you could talk to the thing and it's much easier to do that than almost anything else.
Kevin Pereira: Sure. I mean, look, there's, there was a lot there, but again, what a wild week. Uh, Google absolutely stole the show. They, they flooded the lane with announcements and products and future visions that has everybody going, like, wait, Microsoft did anything this week? Yeah, it was a little unexpected. Uh, all the drops, and so I guess golf clap to Google.
In my opinion, they've absolutely owned the news cycle this week.
Gavin Purcell: It is fascinating. I mean, it's really interesting to see them come around on learning from what OpenAI was doing last year. Right. In a big way. Okay. There's another weird story we wanna touch on, [00:42:00] which is, Darth Vader has come to Fortnite, but Kevin, he has not come to Fortnite alone.
He has come powered by AI so that you can actually talk to Darth Vader and you hear James Earl Jones's voice. Yeah, when I saw this, I've not played it, but when I saw the clip I was like, oh, this is pretty cool. And then of course, no, there's some anger already around this. People, gamers are unhappy, but more so,
SAG-AFTRA, in fact, has taken out an unfair labor practice, uh, uh, thing on them for doing this. Uh, 'cause I guess there was not an awareness they were gonna do this before, but what did you think about this when you saw this?
Kevin Pereira: Lemme just play a clip. Gavin, let me just play a little clip.
Gabe Itch, edging and gooning, another acquaintance, captain America.
I trust this molester. Possesses skills exceeding mere friendship.
Kevin Pereira: Yeah, the uh, of course all the clips that I saved are the ones that were required. Bleeping and are not safe for work. Um, people had Darth Vader, uh, doing all sorts of things [00:43:00] that you wouldn't want Darth Vader to do, referencing all sorts of media you wouldn't want it to reference.
Um, and this is just one of those weird sort of edge cases of like, wow, how amazing you can chat with an ai. Of course, people are gonna try to do all of the things you wouldn't want them to do with it.
Gavin Purcell: Yeah, and I mean, I think the important thing here is it's not like Fortnite went out of their way and just going to, uh, uh, James Earl Jones' voice, like they got permission from the estate of James Earl Jones to do this.
Like it is not something where they like didn't actually say, you know, they didn't do this on their own. The SAG-AFTRA thing is really about this idea that they, there's gotta be these rules put in place, that they want the rules put in place where they're gonna do something like this. And there is a labor practice because what that means is like conceivably
They displaced another voice actor to do this, right? Or to do these sorts of things. Yeah. Not
Kevin Pereira: conceivably. I mean, if you were to do this in the past, there would be 10 other Yes. James Earl Jones sounding voice actors that would've been brought into a VO booth. Yeah. And they would've recorded a bunch of lines.
But you know, then you make the argument. Well, he couldn't have recorded all of the lines because it needs [00:44:00] to be generative in order to have the experience where you can. You know, have a natural language conversation with them. I look, I, I think it's ambitious to do that, to drop something like this in one of the biggest entertainment franchises out there, right?
It's like, that was ambitious to say, we're gonna use an AI character. I'm sure they tested it in a million different ways. There's a lot of younger, uh, folk playing that game. So to have an AI that can go off the rails is rough. But I, I do think, like, you gotta try pretty hard. Yeah, that's, I was gonna say to, to go into areas that you don't want it to.
Gavin Purcell: I mean, I think any AI product is going to have edge cases when people wanna break them. We always talk about that guy, Pliny the Liberator, who was spending a lot of time, uh, yesterday, trying to break through Veo 3, and got a couple pretty racy videos to come out of it. The interesting thing about Fortnite is it's such a mass scale product that you of course are gonna have stuff come out like this.
And if you are not ready for that as a company, you should not do this right now because that is how it's going to go. There will be the people coming out and trying to get Darth Vader to do whatever. It'll sustain me in battle. What[00:45:00]
is that doth mate? Tell me. Oh my double down.
Freaking such vulgarity does not become you. Wow.
Gavin Purcell: That's, there you go. We can bleep that. But you heard Darth Vader saying the F word there, which is pretty crazy. No, no, it's, I think it's great and I wanna download Fortnite. By the way, that is something that they should have fixed, because you and I both know after doing a bunch of AI stuff, you can put guardrails on it.
So that is a problematic thing to hear, because if somebody was able to get it to swear, that's a big issue. Yeah.
Kevin Pereira: I mean, well this was, I mean, I, do you remember like a year ago when we were playing around with Pi? Yeah, sure. And uh, we had it drop full on like very racist terms and then gaslight and say, I never said that.
Like, yeah, this stuff is tricky and you can put the best guardrails you want on it, but if someone wants to break it, they can break it. That clip was an interesting thing because someone just cursed at it and it cursed back, which to me should have been very, very guardrail. Super [00:46:00] easy
Gavin Purcell: to, to have. But you would normally imagine Darth Vader saying,
I do not approve of that message or something. That's right. Like that's, yes. Which it would be very easy to put that in Darth Vader's voice or like, but see,
Kevin Pereira: Gavin, I want Theis to do this because when we watch 'em get into the octagon, yeah. They're gonna have to have the, the pre-fight press conference and the post-fight.
I want them swearing up and down before they start punching each other.
Gavin Purcell: This is a just a fun clip that we were watching from Humanoid Hub. They do a great job of collecting very fun humanoid robot clips. On X, uh, this is robot MMA. So we know we've talked about a bajillion times. Chinese robots are going crazy.
These are the Unitree robots. They have them wearing headgear and preparing to fight each other. Kevin, this is what we've waited for forever. It's not just the Hugh Jackman movie anymore. It is actually coming to real life. I want to bet on robots fighting each other.
Kevin Pereira: This is, uh, an insane. Proving ground for robot warriors.
I mean, let's be clear that that's what this is like. And there's going to be teleoperated divisions where humans are doing the punching and the kicking, [00:47:00] right? They're gonna get a lot of learning from that. And there's fully autonomous divisions. There's even going to be group fights, like literally multiple robots battle.
Yes. Yeah, yeah. They're gonna do it. They have different divisions and different, you know, classes and different, different things. And I was thinking like, wow, as an exhibition, amazing. I would absolutely go to an event like this blood sports style, throwing money, the ring, totally like love it. But as a training, data gathering scenario, when you think about bipedal robots needing to be on a battlefield and fighting amongst other people and team tactics, like mm-hmm.
They can simulate some of this in a software environment, obviously, and that's how it's going to play out in the real world. But this is a data gathering mission to make robots better fighters, make no mistake about it, and they can sell tickets for it. And I'm just jealous that it's not taking place at the sphere.
Yeah, love. I wanna go watch
Gavin Purcell: it. I, well, they probably will have a version of that soon enough. What I can't wait for is to see one of these do like a drop kick where they grab the head and they flip it around. Like I'm curious to know like what kind of actual special moves it'll be able to do. Yes. But.
[00:48:00] It's coming soon. And again, simulated learning will allow all sorts of things to be put in there. Alright, it is time for us to go through some of our favorite things we saw on the web this week. It is a special Veo 3 edition of I See What You Did There.
without then.
Gavin Purcell: Okay, Kev, so there are so many awesome Veo 3 clips out there. I'm gonna keep making them and sharing them to our X handle and other places if I can. Um, but let's start with this, uh, car convention, uh, video, which I saw and I honestly, this may be the most impressive AI video I've ever seen. I know that sounds hyperbolic, but it is not the case.
Welcome to a non-existent car show. Let's see some opinions. I
mean, man, the acceleration is crazy. You look far step on the pedal and you [00:49:00] are there. I feel safe with him in an SUV and it seems to be like the right type of car for him.
VEO 3 Clip: I think the range is only, um, only going to get better.
Gavin Purcell: It's crazy. So this is from László Gaál.
Um, we'll share in the show notes. It is a one minute and, uh, 11 second clip of a bunch of Veo 3 clips put together, where you are at a car convention and they look real. Kevin, I don't know if that was your feeling, but when I first saw this, this was the one that blew me away the most. I was transported back
Kevin Pereira: to looking at rough cuts.
We used to go to like the Consumer Electronics Show. Yeah, totally. And they had like an auto hall and I just, I literally had a sense memory of being like 22 years old in a remote edit bay. Yes. Looking at the B-roll clips of people talking about new StreetGlow and new subwoofers and new in-dash entertainment systems and I was like, oh my God, this feels like man on the street stuff.
Because one of the reasons I think it probably resonated with you, Gavin, is how Unremarkably perfect it is. Yes. And that it's just, just so like, eh. [00:50:00] The, the, the fluorescent lighting and the shadows that are cast, the ambient audio. The, the, yeah, the droning of the audience behind the lapel mic. Sounds that seem to sound different.
I don't know. It's probably me just hallucinating it, but it seems to sound different based off where the placement of the mic is on the character in the scene. The subtle laugh of someone as they move out of frame to like move a coffee cup out of the way. The meandering of the attendees in the background.
Every little thing lines up that that looks like unremarkable stock footage from a convention, but it doesn't scream synthetic, and that's incredible.
Gavin Purcell: We talk about what world models are a little bit like both runway and different. People have talked about the idea that AI video models are actually world models, meaning that they're trying to simulate an entire world, not just the thing you're asking for.
This clip to me really points that out directly. And also speaking of, uh, man on the street type stuff, one of our favorite follows on X is Venture Twins. She had a really fun interview at a skate park. There's a lot of people out there doing like, you know, man on the street type stuff with this, which I thought was really cool.[00:51:00]
And what are you doing next?
I'm going to sell enterprise software in SF
Kevin Pereira: is a a, a woman wearing a gold medal at a skate park, and that's what she's gonna do next. But what I love about that, the crispy audio. Yeah. That the guy in the clip is holding one of the lapel mics up to his, his mouth, which you're not supposed to do with those.
It would make sense that the audio is crispy to match that scene. It is. Perfectly imperfect.
Gavin Purcell: Yeah. And you saw this other one, which I, I saw as well. But I would love you to talk about this dog running to the porch one.
Kevin Pereira: This is just another one of those clips that should have a, a Shutterstock logo across it, and it does not.
It is a cinematic tracking shot. The cam, the prompt is: camera follows a dachshund running through a living room out of an open door onto a porch. It stands there on the top stair overlooking the neighborhood as an ice cream truck drives by. Uh, if, if it comes out, by the way, Gavin, that this is tomfoolery, like I, you know, I haven't seen, you would be surprised.
Like someone says, be surprised. Yeah, that was a joke. Like this was actually video. [00:52:00] I would actually feel better about it because this thing adheres to the prompt so well. The camera is low. It pans as the dog runs by. It follows the dog out. The dog lands on the top step and an ice cream truck cruises by.
There are the sounds of the tippy tap. Yeah. Dog's feet. That's crazy. Feet on the hardwood. Crazy. Yeah. There is the sound of the ice cream truck as it goes by. Again, I, uh, like is it real? Because it is shockingly real. This is Nick Matarese or Matsi, apologies Nick. But he's, uh, according to the bio on X, design lead at Google Labs, previously YouTube gen AI creator tools.
So, uh, you know, I'm gonna assume that it's a legit export. You'd hope so. You'd hope so. It is one of the best, the sound design, the tracking, the prompt adherence again, uh, like I feel. Better and slightly more nauseous each day that when we pointed to the bleachers a year ago, Gavin, we're like, prompt to Hollywood is happening.
Yeah. Like this is, this is a scene in a insurance commercial [00:53:00] or a home delivery. Something like this is a scene in any old commercial and it looks believable, and I, I, I completely buy it.
Gavin Purcell: Well completely. And I wanna shout out, uh, FOFR ai, which we talked about before, who's doing amazing versions of this.
There's two things really quickly I wanna show. One is, speaking of commercials, he made a literal ready-made ad for Replicate, which looks like something you could see in the Super Bowl. It's just a fit guy running, and he says something about, um, AI. And then the logo, or actually it's not the logo, it's just the word Replicate, comes up across the screen.
Looks amazing. This is a one-shot prompt, which is really amazing. But Kevin, more importantly I think, is that he was the first person I saw who did one of these standup prompts. His prompt was: a man doing comedy in a small venue tells a joke, including the joke and the dialogue. So do you wanna play that?
And we can just listen to this. Yeah.
Kevin Pereira: So I went to the zoo the other day and all they had was one dog. It was a shih tzu.
Gavin Purcell: I mean, that joke probably exists in the [00:54:00] world somewhere. I'm sure that the AI did not come up with that, but it is a joke.
Kevin Pereira: I've had friends move across the country to Los Angeles who couldn't deliver that joke that well.
Gavin Purcell: They were gonna be standups, they were gonna be standups.
Kevin Pereira: They're still chasing that dream. They are probably hearing this podcast from the Ice House right now in Pasadena, and I'm so sorry. When you realize that prompt did not have the joke, so yeah, Veo 3 had to write the joke, nailed the lighting of the club, the delivery. The delivery of the joke is whatever.
It instinctively laughs about the joke, like knowing where the punchline of the joke is, to look away from the mic and do a breathy laugh. There's a lot there.
Gavin Purcell: So much there. And by the way, a lot of people have done other really funny standup ones. I wanted to see if you could do it with a non-human.
So I did it with a bear. I just stole his exact same prompt and threw in a realistic bear instead of a standup. So maybe play that real quick and we'll just get a sense of what it's like.
I went to buy some camouflage trousers the other day. Couldn't [00:55:00] find any.
Gavin Purcell: I mean, it's a joke. I don't know if it makes sense that the bear says the joke. Also, that sounds like a joke that probably has been hacked.
Kevin Pereira: It makes sense that the bear, of course it does. The bear doesn't wanna get shot by hunters. He needs some camouflage pants.
Gavin Purcell: Oh, that's actually a, that's a good point. And he couldn't find the camouflage pants. So again, I love that.
Kevin Pereira: I love that you have a contention with the fact that a model just gave you an eight second clip of a bear looking realistically lit on a stool, just hanging out telling a joke. That part's crazy.
Gavin Purcell: Yes. It's crazy.
Kevin Pereira: It's amazing. And it kind of does some muzzle flap.
Gavin Purcell: Yeah. Well, I mean, the muzzle flap isn't perfect, but the fact that it's doing muzzle flap, you don't have to, like, specifically prompt it for that. Like, it's all very good. Couldn't find any.
Kevin Pereira: What are we doing? Why are we still making this podcast and not 10,000 dumb TikTok channels?
Gavin Purcell: That's a good question. We'll work on that next, everybody. We will see you all next week. We have a ton more fun stuff that will come out. We will keep playing with Veo 3 all week long. Follow us on all the [00:56:00] socials and please, uh, keep trying this stuff yourself and let us know what you've done this week.
Swiss, Gouda, Brie, they can all agree cheese is the best thing for you and for me.
Kevin Pereira: That was that old man rapping about cheese. Bye bye.