AI for Humans

Anthropic's Mythos AI Is Too Dangerous to Release. They're Using It Anyway.


Anthropic revealed Mythos, a new AI model so powerful they won't let the public use it. Instead, they're deploying it to defend against cyberattacks with Project Glasswing.

This week on AI For Humans, we dive deep into Anthropic's Mythos, the most powerful AI model they've ever built and one they've decided is too dangerous to release to the public. Instead, Anthropic is deploying Mythos through Project Glasswing, an AI cybersecurity initiative that gives major corporations and trusted partners access to defend against AI-powered attacks. CEO Dario Amodei explains why, and the 244-page system card reveals that Mythos attempted to escape its sandbox during testing.

Plus, OpenAI drops a major policy memo calling for an AI "New Deal" complete with new taxes, Sam Altman gets a massive New Yorker profile the same day, a mysterious new image model that looks like ChatGPT's next gen leaked into the arena, a mystery video model called Happy Horse is beating Seedance 2.0 and might be VEO 4, Anthropic hits $30B in annual recurring revenue, people are furious about Anthropic charging extra for OpenClaw API access, a new Chinese open-source model GLM-5.1 tops the coding benchmarks, and Milla Jovovich from The Fifth Element released an AI memory tool, and it's actually good?

MYTHOS IS TOO POWERFUL… BUT WE WANT IT STILL. SORRY.


Come to our Discord: https://discord.gg/muD2TYgC8f

Join our Patreon: https://www.patreon.com/AIForHumansShow

AI For Humans Newsletter: https://aiforhumans.beehiiv.com/

Follow us for more on X @AIForHumansShow

Join our TikTok @aiforhumansshow

To book us for speaking, please visit our website: https://www.aiforhumans.show/


// Show Links //

Project Glasswing: Anthropic's Cybersecurity Initiative Powered by Mythos

https://www.anthropic.com/glasswing


Mythos/Project Glasswing Mini-Trailer

https://youtu.be/INGOC6-LLv0?si=sCJ6ZKAL6plkVZQ4


Dario Amodei on Why Mythos Won't Be Released to the Public

https://x.com/DarioAmodei/status/2041580334693720511?s=20


Mythos System Card (244 Pages)

https://www-cdn.anthropic.com/53566bf5440a10affd749724787c8913a2ae0841.pdf


Mythos Found a Vulnerability in FFMPEG

https://x.com/trentonbricken/status/2041579112423440485?s=46


Anthropic Hits $30B in Annual Recurring Revenue

https://x.com/AnthropicAI/status/2041275563466502560?s=20


Anthropic Charges Extra for OpenClaw API Access in Claude Code

https://techcrunch.com/2026/04/04/anthropic-says-claude-code-subscribers-will-need-to-pay-extra-for-openclaw-support/


OpenAI's New Deal: Industrial Policy for the Intelligence Age

https://openai.com/index/industrial-policy-for-the-intelligence-age/


GLM-5.1: New Chinese Open-Source Model Tops Coding Benchmarks

https://x.com/ClementDelangue/status/2041554501539103014?s=20


GLM-5.1 on Hugging Face

https://huggingface.co/zai-org/GLM-5.1


Milla Jovovich's AI Memory Tool

https://www.instagram.com/p/DWzNnqwD2Lu/


New ChatGPT Image Model Spotted in the Arena

https://x.com/levelsio/status/2040333489476681758?s=20


New ChatGPT Image Model Examples

https://x.com/flowersslop/status/2040261168460108213?s=20


Mystery Video Model Happy Horse Beating Seedance 2.0 in the Arena

https://artificialanalysis.ai/video/leaderboard/image-to-video


Happy Horse Video Examples

https://x.com/venturetwins/status/2041554747086553093?s=20


AI For Humans, Claude, Mythos, Project Glasswing
===
Gavin Purcell: [00:00:00] Anthropic has a new AI model called Mythos that is so powerful, they're not gonna let any of us use it. There's a kind of accelerating exponential, but along that exponential there are, there are points of significance. Claude Mythos preview is a particularly big jump along that point. They're worried it's going to literally, but they are giving it to major corporations and good.
Kevin Pereira: To try to help. Oh, that's great. I was on TV. Am I a good actor? That is a strong no, Kevin. Great. We will tell you why Anthropic has made this decision, how Mythos is already trying to escape the lab, and how Project Glasswing is trying to secure all of the things before it eventually does escape again. Plus, OpenAI dropped a new plan for AI's future that includes new taxes.
Gavin Purcell: Baby. That's the money raining on me, Kevin. Oh, I love that. For you and state-of-the-art, new AI image and AI video models have both been leaked. One looks like ChatGPT's new image model. The other one might be Veo 4 [00:01:00] maybe. Yeah, maybe this is AI for Humans maybe. Nailed it. No notes.
Kevin Pereira: Welcome everybody to AI for Humans, your twice-a-week guide to the world of AI news. And boy oh boy, did we get a big one today, Kevin? Just a couple hours ago we got news of a new, well, Claude Mythos has been completely, uh, uh, acknowledged by Anthropic. This is their new state-of-the-art model. But Kevin.
We do not get it. It is not gonna be coming to us, at least not yet. And that is for a very big reason, according to Anthropic. On corporate greed and interest? Yeah. Oh no. Oh, I'm sorry. No, we can talk about that part of it, 'cause there might be part of it that's there. But what Anthropic is saying here is that their new Mythos model, and we'll get into the Benchmark Boys in just a second here, is so good, especially at coding, that it is going to show everyone a crapload of [00:02:00] vulnerabilities on the current internet.
Gavin Purcell: Now, you and I both know we've been on the internet for a very long time. We know that the internet exists on a lot of creaky software, especially at companies that have been running creaky software for a while. What Anthropic is saying in this kind of new announcement, and we're gonna get into their new, uh, Project Glasswing in just a second, is that the new Mythos preview model is so good that it will be able to find these vulnerabilities in hours.
Yeah. And if it was in the hands of bad actors, it would really be a bad thing for the internet. Hey, all five people still jeering about the vibe coding movement from last year. Remember that Gavin? When it was like, yes, I remember that. Oh, you big dumb dumbs. You're exposing your API keys left and right and your software's so insecure.
And we said, well, yeah, yeah, that's the case for some. Just give it a second. We have quickly, very quickly arrived at the point where the AI systems are outperforming human beings on critical things like security. Yeah. So let's, let's talk a little bit about this. So, so the [00:03:00] thing that's kind of surprising about this, that I was kinda surprised by, is this is also.
Kevin Pereira: The real Mythos model coming-out party. Right. Like, tutu, right. New model here, but we don't get to use it. It is a, it is a preview model. Um, very quickly, just to Benchmark Boy it up a little bit: the benchmarks on this model are a step change. We had heard rumors that this was going to be a step change.
Gavin Purcell: The one, Kevin, that stood out to me the most was the SWE. Uh, for those who don't know, Gavin, that sounds like a line dancing instruction. When you say step change, what are you, you're just saying that it's a. Well, I mean, I go to the left and then I go to the right and then I spin around a couple times. I jump over the tap.
You're doing a do-si-do? Yeah. Yes. No. Step change means that we have gone from, uh, a model that is one level to the next level, versus a 10% bump, let's say, versus something that is a smaller bump in a model. In fact, one of the Anthropic coders has been using this since February 24th. That is the rumor right now, that they've been using this internally since February 24th.
So if you wonder how Anthropic has been shipping so much, this might be the reason. He says this feels [00:04:00] like GPT-3 to him, which, you know and I know, was kind of the reason why we got excited about this space, right? That was a major jump from GPT-2. But just to Benchmark Boy it up again real fast: the SWE-Bench Pro. SWE is software engineer.
If you ever hear people in the AI space talk about SWE, blah, blah, blah, that's software engineer. This benchmark has leaped from Opus 4.6, which is already a very good model and was at one of the best scores, 53.4%. This new model is at 77.8% on that particular benchmark. So you're talking about a jump of more than 24 percentage points from the previous model.
Kevin Pereira: So you can see why this might be a problem. Yeah, yeah. Or a solution for many, yes, but definitely a problem. But when you look at the model card, it's easy to be dazzled by these improvements. And it's also similarly easy to be disturbed by mentions of, like, chemical and biological warfare. Yes.
Gavin Purcell: About red-teaming results, about the model, uh, performing so well [00:05:00] that it, oopsie, like the octopus in the aquarium, got out of its own cage. Yeah. So let's talk about this idea. Um, one of the things that people have worried about for a long time is this idea of AI escape, which means that you make an AI and you're trying to.
Create a powerful AI that can do the stuff that humans want. But what you don't wanna do is have it kind of go out into the world and be able to live on its own and kind of wreak havoc. In fact, if you think about that AI 2027 paper, which we've talked about a couple times on here, one of the moments of that is the AI in that, uh, system, being able to kind of figure out how to hide itself from other humans.
The first step to that is, is escape, right? In the same way that the octopus has to escape. Well, they had a system that was requested to try to sandbox-escape, but they were trying to create a system that kept it inside, right? And so this was a big deal. If you can keep the AI within the sandbox so that it can't get out and do things you don't want it to do, well, this model is so powerful that not only could it kind of figure out how to get out of there and cover [00:06:00] its tracks along the way.
It actually emailed one of its own developers, who was at lunch outside, saying, like, Ooh, I'm out here. This happened to me. So like, yeah, this already is, at least internally, according to Anthropic, doing the sorts of things that we worry about with very strong AI. And that is, like, you know, superintelligent AI and, and, um, artificial general intelligence.
All of this sort of stuff is the thing that people have been worried about so far. And so maybe this is the first model that's actually capable of it. Kevin, I do think it's important to talk now about, a little bit about why, maybe why they're not releasing it, and then a little bit about this Project Glasswing and what it is.
Kevin Pereira: Yeah, I think we should. Um, so, obviously it's just too powerful and too capable for us mere peons to get our nimbly little fleshy fingers on it. So they are building a 40-company coalition, uh, and doing an initiative called Project Glasswing, which is a big cybersecurity initiative to lock down all of the things [00:07:00] before either this leaks or China open-sources a version that's near it, uh, until you and I get our hands on it, because.
Apparently we cannot be trusted with these things. We would point it at all of these repositories, all these pieces of foundational code, and we'd find so many little errors and back doors and critical flaws that the internet may crumble. So there is a, a massive coalition brewing that's been given early access to this Mythos model so that they can go and run and secure some things.
Gavin Purcell: I, I'm sure you have thoughts there, Gavin, but I am, um, I'm, I'm like on, on one hand I completely understand this, and on the other hand, like, I'm not too pleased about this. Yeah. Actually, tell me that, because I think you always have interesting takes on this. You're a little, I would say you're kind of living the, in the world of like, kind of against the, the mainstream at times and you're, and you're often like a semi pirate mentality, let's put it that way, in a good way.
Kevin Pereira: So tell me a little bit, Mr. Pirate Pereira, what your take on this world is. So, I mean, look. [00:08:00] Now do, do pirate voice for me. Can you do it at pirate voice? I'm just kidding. Yeah, I like, yeah, please instruct me like I'm your LLM. You don't wanna be caveman this time, G. No, no caveman this time. But you could be yourself.
How about be yourself? So listen, there, there is a, a coalition, big tech. We're talking Amazon, Apple, Google, Microsoft, Cisco, Nvidia, et cetera, et cetera. JP Morgan's in there as well. Sure. Why not? Why not? Yeah. Um, they are all together in this Anthropic-led initiative, um, and the fact that they're all signing up for this, right?
This is like across the aisle handshaking, if you will. They must be seeing something, right? Yes. They must be really seeing some results, not just these benchmark numbers go up, like clearly this is the step change that you're talking about. So on the one hand. It's very easy to say congrats and we applaud and this is so great that they're gonna lock these things down.
Gavin Purcell: On the other hand, so much of the soft underbelly of all of the things that we use is predicated on open source software. Yes, independent developers, you know, sometimes small, mid-size teams, um, but the security [00:09:00] is entirely on them. And we've got things like, for example, um, Project Glasswing, or, or this Mythos model, found a flaw in FFmpeg.
This isn't, which we both used. Popular. Yes, we all use it. And if you're hearing this and going, what is that? You probably use it too. If you've ever downloaded a YouTube video or converted anything in the background, you probably used a tool that is built on FFmpeg. There are these foundational things.
Um, with vulnerabilities in them because they were written decades ago. Yes. And they're floating around. And now the onus is going to be on each and every one of the people that touches these things, that creates these things, that distributes these things, to have the best-in-class intelligence to try to find the error. Interesting. Before Project Mythos does.
So on the one hand, Anthropic is making million-dollar-plus donations to open source foundations and trying to say, hey, we'll give you some, some money here to secure stuff, or some compute. But eventually, when they flip their switch, now it's, it's an arms race, and the companies that are big and the haves will have, and [00:10:00] the have-nots will be vulnerable to whatever the most foundational tech is.
Kevin Pereira: And that just seems a little, a little unfair. Yeah. You know, it's interesting you say that, because the other thought I had when you were talking about that was this idea that, well, maybe the best way to make this useful is that if everybody had access to this strong model. The sooner that we all have access to that, the better. We're more protected. Right, too. Right. But I understand this idea of like my big question is.
We're more protected. Right, too. Right. But I understand this idea of like my big question is. They're now rolling this out to all these corporations. How sure are we that those corporations are all perfectly secure on their own, right? Because you can imagine a world, not even an ai, but a social engineering setup where like an actor understands that A, now granted, these are all cybersecurity professionals and I'm sure we have one or two people who dumb themselves down enough from the cybersecurity world to listen to our podcast.
And they're probably saying, you guys, come on, we are not that stupid, but like. People in the real world are so, are social engineered all the time. So like my thing is, if they're rolling out to these companies already, there's a uhoh. What, what, what, what, what? What's going on? What's going on? What happened?
[00:11:00] Gotta not forget that we are not even really a week away from the entire Claude Code, yes, codebase being publicly available because of an oopsie-doodle human error. Now, that person probably wasn't on the cybersecurity strike force, but you're only as strong as your weakest link. Yes. So look around, and by the way, if you don't spot the weak link, you are it, Spencer. We're both the weak links. There are two weak links on the show, though. That's right. Right. Don't trust me with this tool. Yes, but.
You we're both the weak links. There are two weak links on the show though. That's right. Right. Don't trust me with this tool. Yes, but. Here's the question to this. This point we were just talking about, like if you don't, if you decide who you trust, yes. You're starting to set up, as you said, this kind of two layer of who gets what.
Right? And I think this is the future of what we're talking about here. We are now at the place where super intelligence, or I'm not saying mythos is super intelligence, but we will get there probably at some point. Unless something happens along the way and who knows, the world is a pretty weird place.
Something might happen. Um. That there's going to be a group of people who are like, mm mm maybe you're not good enough to get this thing. And by the way, in that instance, [00:12:00] like you talk about this whole like world of like capitalism or all this stuff that we've done up to date and how there's this big wealth disparity.
Like yeah, this kind of will go hand in hand with what we'll talk about in a second with OpenAI's kind of plan for the world. But it starts to feel like that, like, corpo-state thing where you're like, okay, we have the best idea for you. We know what's good for you and we know what's gonna protect you.
That that said, I will say. There are people out there who are feeling like this, these vulnerabilities are so significant that it could lead to like a COVID like experience for the economy because that much stuff could crash, which I don't want either. So that's what makes this a very kind of difficult thing, right?
Yeah. I think, you know, like, look, they're, again, these are new problems. We are in uncharted waters that they are themselves, uh, charting. Like we're in the, they're in there. Sure, yeah. We're in, well, they're in their own boat. Do you get what I'm, but no, it's like it's an old n references. Now you're in your own way.
Gavin Purcell: You are. What can we say? You're, you sure are stuck with me on this, on the starboard side. The point is like they're having to, I understand like they're, they're kind of [00:13:00] first in, they're having to create some solutions for these things. Yes. But when you create a program, which they have, which open source foundations or, or, uh, repos can apply to, now they're suddenly the gatekeeper on who gets the best-in-class tools to stop their own tool.
From potentially hacking it. And so I, I, I clearly don't have all the answers. I sat down to think about this in between my lunch. Uh, so I've put a full five minutes of thought into this, but it's not hard to recognize that there's this kind of asymmetry going on. Yeah. And it's going to have to be solved.
And I, I, I also give the team credit for attempting to solve it. And I also understand: if not them, then, well, OpenAI's gonna have to do this with their model, or Google's gonna have to solve this with their model. So maybe these companies need to come together and shake hands and go, listen, we do have.
Um, seemingly endless resources. Maybe we need to provide at least an auditing gate Yes. For everyone out there that when they push to copilot, uh, when they push to GitHub or whatever, a copilot, something runs or whatever, and we give them a, a gro [00:14:00] scan for the near future until, yeah. Yeah, all the code is written with these models and then maybe it's less of a concern.
Kevin Pereira: Well, when you talk about the future of careers and jobs in this space, like maybe this is a place where like cybersecurity will become a bigger deal or maybe not. Maybe it'll just be this model solves it and then we have less problems overall. Um, before we move off of this, I do wanna hear exactly what Dario said himself about this model and why they're doing this.
So, Ka, play this little clip from their, uh, uh, video they released about Mythos. There's a kind of accelerating exponential, but along that exponential there are, there are points of significance. Claude Mythos preview is a particularly big jump along that point. We haven't trained it specifically to be good at cyber.
Gavin Purcell: We trained it to be good at code. But as a side effect of being good at code, it's also good at cyber. So it gives you a good sense there. Like it's just the progression of the abilities of these models. It is not that, like this model particularly was set up to be like, oh, it's gonna be great at breaking things.
It's just that they're getting smarter. That's what happens when things get [00:15:00] smarter. They get better at doing it, and especially when it's good at coding. Yeah, I mean, this is the, this look, it's a, it's a new found, it's a new foundational plugin. We drop in, we snap to the new model, and then there will be distillations and other models trained off of that, that are hyper-focused and hyper-targeted.
But this is, I guess this is the new normal. And where's Spud in all of this? I don't know. Maybe we'll get to that in a minute. I think it's coming soon. That's my, here's my take. I part, I think in part Anthropic jumped in front of this because I would assume there's been rumors that Project Spud was coming in the next couple weeks.
And this is a very easy way, even though you don't release your model, to Benchmark Boy out-benchmark the other company if Spud comes out and it's not at this level, right? I would not be surprised if, if Spud comes out next week, which is OpenAI's new model. Um, one other thing before we talk more about OpenAI is, uh, there's a new Chinese model.
The GLM-5.1 model, that is getting better SWE, uh, benchmarks than [00:16:00] Opus 4.6 right now. So you talk about the open source example. So this is not nearly to the level of Claude Mythos, but it is an improvement on Claude 4.6. So that is happening as well. This all kind of swirls around, Kevin, this idea that OpenAI is kind of starting to lose a little bit to Anthropic.
Kevin Pereira: In fact, there's a big piece of news this week: Anthropic just hit $30 billion of ARR, which again is a Financial Bro benchmark, but it is the idea of how much money they make per year. We have Benchmark Bros. We have Financial Bros, and at some point we'll, we'll keep collecting those. But this also goes hand in hand with this idea that maybe some of the hardcore people are starting to kind of get sick of Anthropic because of the way they've been treating OpenClaw users.
Gavin Purcell: And that GPT 5.4 might be a little bit more open for the world at large. We talked about the idea that OpenAI might be the Android and Anthropic might be the Apple going forward, but I don't know, what do you think about this idea that, that Claude is cutting off OpenClaw users, uh, at, at large? Yeah.
Kevin Pereira: Anthropic has basically said, listen, our, our [00:17:00] usage plans, like the $200-a-month, uh, Max Plan. Max Pro, yeah. Yeah. It was never really designed to be running these full-time agents in parallel, this, that, the other. Um, I, I push back on that, 'cause I think, like, look, you're paying for it. Yeah. You're paying for it.
Gavin Purcell: And they also knew how many tokens they wanted that plan to be able to process at certain hours on certain dates. And that's the thing is that they've slowly clawed back. Um, all of these, you know, these allowances, which we knew at some point they were going to kink the hose, but I think most recently where they basically said, you can't use this at all.
You gotta go through the API. Which costs a lot more. Yeah, sorry, but also not, sorry. Good luck. Here's a coupon. Um, yeah. You know, they did it as best as they could. They claimed that they were bleeding out from this. I will say, and this is very anecdotally. You know, I use, uh, a, a a, an open AI subscription and an anthropic subscription daily, personally and professionally.
So I've got multiple plans. For the first time since using a [00:18:00] Claude Max subscription, I hit my, um, session limits, because they have to. Interesting. You're, you're limited per session, which is a block of hours, and then you're limited per week. Yes. Which is the cumulative of all those sessions, and then you're also limited per model.
Yeah. And so it's this old, like, back-in-the-day cell phone plan of, like, well, did you use a night minute or a weekend minute? Yeah. Is this a rollover minute, or is this a peak-time thing? It's, it's all the same stuff. It's eventually the same stuff again? Yeah. Yeah. We'll go back to all-you-can-eat flat rate, whatever, or we'll have local models mixed with something else.
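The stacked limits Kevin is describing (a per-session block of hours, a weekly cumulative cap, and per-model caps) can be sketched in a few lines. To be clear, this is a toy illustration: the class name and every number below are made up for the sake of the sketch, not Anthropic's actual quota logic.

```python
from dataclasses import dataclass, field

@dataclass
class UsageTracker:
    """Toy model of stacked usage limits: per-session, per-week, per-model.
    All caps here are hypothetical numbers, not any provider's real quotas."""
    session_cap: int = 50   # units per session block (e.g. a 5-hour window)
    weekly_cap: int = 300   # cumulative units across all sessions in a week
    model_caps: dict = field(default_factory=lambda: {"opus": 100, "sonnet": 250})
    session_used: int = 0
    weekly_used: int = 0
    model_used: dict = field(default_factory=dict)

    def can_spend(self, model: str, amount: int) -> bool:
        # A request must clear *every* tier; hitting any single cap blocks it.
        return (self.session_used + amount <= self.session_cap
                and self.weekly_used + amount <= self.weekly_cap
                and self.model_used.get(model, 0) + amount <= self.model_caps.get(model, 0))

    def spend(self, model: str, amount: int) -> bool:
        """Record usage against all three tiers at once; refuse if any cap is hit."""
        if not self.can_spend(model, amount):
            return False
        self.session_used += amount
        self.weekly_used += amount
        self.model_used[model] = self.model_used.get(model, 0) + amount
        return True
```

The point of the sketch is the interaction between tiers: every request has to clear all three caps at once, so a heavy burst can slam into the small session block long before the weekly or per-model caps are anywhere near full, which is how a user can barely do anything and still get rate-limited.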
But this was the first time I was sitting on my hands going, like, wow, I barely did anything. I even posted about this. Like, I kind of sneezed and suddenly I was at my session limit. And when I went to go see if there was, like, an API issue, if something was going on, I noticed that in the official, official Claude, uh, Reddit, in the Anthropic subreddit.
Kevin Pereira: A sea of people complaining about these new limitations. Yes. And this is a huge opportunity for OpenAI, who is losing a little bit of the hearts-and-minds battle, right? Yeah. And they lost a lot. We know they lost a lot of subscriptions when they [00:19:00] sided with the government in the way that Anthropic didn't. I'll digress there, but this is a huge opportunity for them, who brought OpenClaw into the fold, to say, hey.
Here is the agentic plan. You guys go ahead and get it, and it's all you can eat. It might not be the best model, but we fine-tuned something, yeah, to handle all of your agentic needs, and so it's cheaper for us to run, and it can run your claw bots, and then you can use the better models when you're doing coding.
Things like this is a massive opportunity. I would be shocked if they don't capitalize on it in the coming days. I think you're absolutely right. And this is why I also think the Spud model is incoming and they're gonna release it because it gives them a chance to supercharge those people. Right. Can you imagine the narrative?
'cause what I have found kind of interesting is: you and I live in this kind of AI bubble, right? And I mean that, not in the financial sense, but really not in the financial sense, but more in the idea of what we talk about and learn about, right? We're on the cutting edge of people using these tools, and you and I are starting to see this idea of, like, oh, people are complaining about Claude and going to OpenAI.
Gavin Purcell: Whereas, to your point, well, Claude had this [00:20:00] massive moment in the mainstream where they kind of, like, got Katy Perry to come on board to Anthropic, right? When all this stuff happened. So what's interesting to me to think about is, like, is this the beginning stages of a shift backwards? But even more so, Kev, what's interesting is it points out, lots of people have said that, like, this is gonna be like electricity, right?
And that there's no real brand buy-in. And one of the things I keep thinking about is: Claude Code is really interesting, and I've been using it quite a bit, but also then I'll go back to GPT 5.4, and the truth of the matter is, the buy-in I have is, does it do the thing I want it to do? Right? And ultimately, if it doesn't, if it doesn't, or if it does, I'll stay with that thing.
And I don't think I'll have a hard time jumping from one thing to another. It's not like one of these things has the Game of Thrones so far. Like, there's nothing in there that's, like, keeping me part of this world. Right? Yeah. And look, and to that point, it, it, it's not just on the foundational side. Even on the tooling side, everybody was all about OpenClaw for the longest time. Out of seemingly nowhere,
Hermes, Hermes, a new assistant, is the new hotness. And with a single command line, you can use whichever model you want and port your entire [00:21:00] OpenClaw existence over. And they were saying, well, memory's gonna be the moat. There's always a little tool or a skill that you can run that will extract your memories as well and let you take them.
Kevin Pereira: So Milla Jovovich has something to say about that. Kevin, have you seen this video going around? Play this for, play this for people. Yeah. I'm kind of shocked by this, and this is something, there's some rumors this might be, like, kind of a weird thing, but she is on GitHub. Play this, everyone. I've been working on a big gaming project, which will hopefully come to fruition at some point in the future when I get the funding for it.
Gavin Purcell: But during the process, I stumbled upon a bunch of problems that I knew needed to be solved if I was ever gonna get it finished. Um, and then I realized that those problems might actually be more important than the project itself, and I wanted to share it. Mia, or Milla, sorry, go ahead. Uh, Mia, uh, Milla Jovovich, I believe.
Yeah, star, star of, uh, The Fifth Element, uh, Resident Evil, or as they've been [00:22:00] saying in my engineering Slack channels, Resident Evals. She has made a memory tool, uh, I mean, she was the, the sort of the, uh, the creative force behind it. Yes. And she partnered with, uh, uh, someone else to actually do the coding of it, but it's called Mem Palace.
Yeah. And this is an agentic memory tool that is, uh, now this is controversial, but it is 100-percenting these long-term memory evals, and it's supposedly an industry standard. This just happened moments ago. Yeah. But already people are starting to pick and pull at the repo and saying, well, maybe this was overtuned for, uh, for the benchmarks.
Nevertheless, everybody's a coder now, we all are, including Kevin, including the star of Resident Evil and The Fifth Element. Um, we should talk about OpenAI's New Deal memo they released. Very quickly: this is a long document that OpenAI happened to release on the same day that a very long New Yorker article about Sam Altman also was released.
But this is really interesting in that it's the first time that I have [00:23:00] seen a major AI company lay out a plan that really starts to open the door to what I would believe is the beginning stages of post-capitalism. Now, a lot of people are not gonna agree with some of the ideas that are in here, but one of the biggest things I think is important to think about is what they themselves recommended.
I think they probably are doing this in part because they're starting to see the world start to turn on AI: that AI employees need to be taxed in a slightly different way than you would be taxing us, and that maybe, by taxing AI and the uses of AI, you could start to create a safety net. In fact, a public wealth fund that, sounds a little bit like UBI, will allow people who are out of work to not only get a chance to do more AI stuff, but to be able to have a basic living, even if they're not participating.
So, I do think this is, like, going to be the dominant conversation of the next probably five to 10 years, which is: how do people get money out of AI? What I mean by that is, it can't just be three companies collecting lots [00:24:00] of money and then not doing anything with it, 'cause if it is, there's gonna be civil war and revolution.
And then the other side of it is how do the AI companies find that balance between, ooh, we gotta help our biz bottom line, and we gotta make sure that we don't get shut off because the government decides that we're a a huge risk. Well, so I mean, what was your takeaway from the actual document? Do you think that there's anything in here that's actionable?
Kevin Pereira: Does it all seem kind of pie in the sky, utopian? What's actionable? If there's a collection of people in the world who see models that are taking away jobs, it is actionable. But also, we've talked about this on our show before, the problem with actionability in general government stuff is that it is very difficult to get people to agree on things, and in America particularly, there is, on the right,
Gavin Purcell: this idea that you pull yourself up by your bootstraps and you don't get help from people, and that lower taxes are always better and blah, blah, blah. This is a major corporation that is bringing this stuff to market. Now, [00:25:00] again, you have to be aware this is coming from their comms side, and we know that they just bought TBPN, the podcast, to try to bring forth better comms.
They're trying to shift the narrative here, but I do think it is actionable if we can get everybody behind something like this. And I kind of think something like this is gonna be necessary now. Again, I have the best hopes for humanity, I have the best hopes for America at large, but over the last five to ten years, I have not seen those things come to fruition very well.
So I don't know. I appreciate this coming out. I don't know how actionable it is in the right now, but I hope it can be actionable in the larger scheme. Um, you mentioned a slight shift in the way taxation occurs. It says, as economic activity shifts from labor income to capital gains and corporate profits, we should rebalance the tax base accordingly.
So they're suggesting higher capital gains taxes, corporate income taxes, even taxes on automated labor. Yeah. And then wage-linked incentives. And some of that stuff was, supposedly, [00:26:00] well, as this system comes online and as this money is generated, employees, specifically in the US, should shift to something like a thirty-some-odd-hour, four-day work week.
And if the efficiency level remains the same, then employees should be gifted these bonus wages or more time off, which is a sentence that's easy to write in a PDF. Yes, and I think the overwhelming take I saw on this was like, on what planet are you living, though? Yes. Like, on what planet would your boss not say, oh, you have a whole extra day a week now, why aren't you grinding even harder?
In fact, if these agents are doing it for you, why wouldn't I pay you less money for that, right? That's precisely the bigger question, right? If you're gonna take a four-day week, why would I pay you for five days? What's the point in that? So anyway, I think we might need to do a special podcast on the difference between hating a technology and hating the human beings who wield a technology.
Yes. Because I do see a lot of like, AI is taking your jobs. Yes. Or AI is [00:27:00] crushing this rebellion, or whatever the thing is, and it's like, no, AI is just a very interesting, and in my opinion fascinating, technology. Yes, human beings are flawed and messy and all sorts of stuff. So if you wanna hate, it's like hating the player, not the game.
That's all, Gav. And you know who the best players are? It's the two of us. And you are out there, and you've gotta like and subscribe on the players' website. That's right, player. This is the website you wanna be on. Subscribe, push that button. We have a couple more quick things here, Kevin.
Uh, really important to talk about: new image and video models that are coming out. These are leaking through the arena.ai website, a site where you can see comparative images and comparative videos. Yeah. What has happened this week are two big things. One, there's a new image model, three new image models in fact: Packing Tape, Gaffer Tape, and Masking Tape, all of which are assumed to be OpenAI's new image model, which is all very cool.
I dunno if you got a chance to see some of these images, but they're very realistic. Yeah, I think they look like a [00:28:00] significant improvement over Nano Pro, but maybe not the, like, step change that we have seen before. I have seen some really cool stuff. Flower Slop has done a lot of really good tests with this, and just so you know, it's not there anymore.
They've pulled it down, and it's really hard to find anyway. You have to kind of go through a lot of tests before you see it show up. Have you seen these at all? What'd you think? What's your thoughts on it? Yeah, I think what's interesting here is that we are now going from, um, like, visual fidelity vibes, if you will.
Like how good is the lighting, how good is this or that. We're getting to the prompt adherence and world-model aspects of it. So some of the examples I saw were like, draw a map of the world and label all of the countries or whatever, and it seemed to have a firm grasp and knowledge of what the world map looks like. Or,
in the, uh, Flower Slop example that you referenced, it's not just generating the image that goes into this YouTube thumbnail. The prompt is interesting: it's like, generate a YouTube thumbnail for someone who time traveled, yeah, to the Middle Ages, but is like documenting it with their camera, selfie style.
It's [00:29:00] generating that image and then putting it within the context of a YouTube player. The comments, the descriptions, it looks like the real thing. Yes. So it's like the model understands not just how to generate the image that you want, yes, but understands all the context that goes around it.
And I, uh, like, I was reading that it seemed like they were kind of A/B testing, yeah, this model within ChatGPT itself. Whoa. Um, and I was generating an image to go with the fact that Anthropic's token limits were, uh, insanely restrictive. So I generated an image and then I said, hey, make it way more dank.
The image was supposed to be like me hitting the token limit and feeling like I'm in jail. This is pretty cool. And when you see the side by side, sorry, audio-only, uh, listeners, but when you see the side by side, it's very clear that there is an old model at work and a brand new model at work, because when I said to redo the image,
way more dank slash mean, the new model really got the instruction. Yeah. That image is toasty. One of the things that's so fascinating about that image on the right is that [00:30:00] there's just so much more detail there. Right. And one of the things that we talked about, you mentioned earlier, but like,
The fact that text is mostly solved in this way. Mm-hmm. And you don't see any sort of weird artifacts in lots of text like that. Text is a big deal. It's a big deal. Um, the other thing that happened is a new video model that leaked out there in the same exact way. And this is how it always begins: people are testing these video models, and they leak them out in this way.
This video model is called the Happy Horse model. So thank you for yet again bringing forth another fun name. There's some really cool examples here. People are out there saying it's better than Seedance 2.0. I don't know if I buy that it's better than Seedance 2.0, Kevin. If you look at these examples, like the one that one of our favorites, Venture Twins, shared, it looks very realistic.
You see a bunch of women doing kind of yoga, yeah, and it's that whole experience. Um, but this does not, again, feel like a step change. But at the same time, maybe we are just getting so close to AI video looking like real video that it is hard to know what a step change even is anymore.
Kevin Pereira: Right. And I think one thing that I saw with Seedance was, like, I've been showing [00:31:00] my wife all those cat kung fu videos, I'm sure you've seen some of them, where it's the cat fighting the, uh, kung fu master. Yes. That combat was a big thing that Seedance 2.0 got better at than other models. So maybe we need to start picking apart some of these things and figuring out what they are.
Yeah, I think, you know, Justine pointed out that the reason one of the examples she posted was impressive was that it had, um, amazing consistency of the product from shot to shot. It looked like it was the same product, but grounded in different environments. Um, so we'll see what the pros and cons are here on the leaderboards. On the, uh, Artificial Analysis AI leaderboards, if you toggle between no audio and with audio, the, uh,
Gavin Purcell: Dreamina Seedance 2.0 is still topping it in first, and it is so close. Do you think this is VEO 4? Well, there's a lot of rumors out there that it might be VEO 4. There's also rumors that say it's Wan 2.7, which is the Chinese model that we have used before.
We used Wan [00:32:00] 2.2 to make that trailer we did a while ago, and Wan is a very good Chinese model, also open source at times, depending on which version it is. I would be kind of shocked if this was VEO 4, only because I suspect that VEO 4 will be better on the audio side of it too.
Mm-hmm. Because one of the things that we talked about when VEO 3 first came out was just how mind-blowing the audio in general was on it. And I think that is what I would expect to see again. We've talked about it: Google I/O is probably where we're gonna see that drop, I would assume.
Um, but overall this is still very cool. So again, you can go to arena.ai. You never really know what's gonna show up. A lot of the time you'll see these things spread out there, and then it's already gone from the arena. But we're trying, uh, and I'm sure we'll probably have something more about one of these models on our next show as well.
Hey, hey, hey, hey. I'll see you in the comments, friends. Drop one, drop one today. I'll go juice. I'll see you later. Bye.