Making Sense - Sam Harris - April 10, 2026


#469 — Escaping an Anti-Human Future


Episode Stats


Length: 1 hour and 49 minutes
Words per minute: 213.4
Word count: 23,346
Sentence count: 951

Harmful content
Misogyny: 3 sentences flagged
Hate speech: 22 sentences flagged


Summary

Summaries generated with gmurro/bart-large-finetuned-filtered-spotify-podcast-summ.

Transcript

Transcript generated with Whisper (turbo).
Misogyny classifications generated with MilaNLProc/bert-base-uncased-ear-misogyny.
Hate speech classifications generated with facebook/roberta-hate-speech-dynabench-r4-target.
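For readers curious how stats like these are produced, below is a minimal sketch of the pipeline, assuming only the model names listed above. The audio filename, sentence splitting, and positive-class label strings are illustrative assumptions, not details taken from this page.

```python
# Minimal sketch of the transcription and classification pipeline described
# above, assuming openai-whisper and transformers are installed.
import whisper
from transformers import pipeline

# Transcribe the episode with Whisper (turbo).
asr = whisper.load_model("turbo")
transcript = asr.transcribe("episode_469.mp3")["text"]  # hypothetical filename

# Per-sentence classifiers named above.
misogyny_clf = pipeline(
    "text-classification",
    model="MilaNLProc/bert-base-uncased-ear-misogyny",
)
hate_clf = pipeline(
    "text-classification",
    model="facebook/roberta-hate-speech-dynabench-r4-target",
)

# Naive sentence split; the splitter actually used for these stats is unknown.
sentences = [s.strip() for s in transcript.split(".") if s.strip()]

counts = {"misogyny": 0, "hate_speech": 0}
for sentence in sentences:
    # Each pipeline call returns [{"label": ..., "score": ...}] for one input.
    # The positive-class label strings below are assumptions; verify them
    # against each model's id2label config before relying on this.
    if misogyny_clf(sentence)[0]["label"] == "misogynist":
        counts["misogyny"] += 1
    if hate_clf(sentence)[0]["label"] == "hate":
        counts["hate_speech"] += 1

print(counts)  # e.g. {"misogyny": 3, "hate_speech": 22}
```

The word count and words-per-minute figures above would then follow from len(transcript.split()) and the audio duration.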
00:00:00.000 I am here with Tristan Harris. Tristan, it's great to see you again. Sam, it's great to be back with
00:00:25.720 you. So you've been busy. You've been busy worrying about social media for years and
00:00:30.300 you created this, in part created this documentary, The Social Dilemma, which it seems half of
00:00:36.000 humanity saw. We still have a problem with social media, I'll point out, but you, as much as anyone,
00:00:42.580 alerted us to the nature of the problem and are continuing on that front. But now you have added
00:00:46.720 to your portfolio concerns about AI and there's this new documentary, The AI Doc, which I just
00:00:52.960 saw, which is super watchable and entertaining in its own way, but also very worrying. And we'll
00:01:01.620 talk about the reasons to be worried here and maybe some of the reasons to be optimistic or
00:01:06.440 at least cognizant of the upside should things go well. But there's a lot to fear on the front
00:01:14.400 of things not going well. So let's just take it from the top. When did you start worrying about
00:01:20.240 AI? Yeah, well, first, it's just good to be back with you, Sam, because you really, in a way,
00:01:24.900 helped launch my ability to speak on these topics with the 60 Minutes interview that I did in 2017.
00:01:31.080 And then I remember recording in that same hotel, our first podcast, which actually really got a
00:01:35.620 lot of attention back in the day about persuasive technology. And in a way about the baby AI that
00:01:40.820 was social media, that was just pointed at your kid's brain trying to figure out which photo,
00:01:45.200 video, or tweet to put in front of your nervous system. And as we know that that little baby AI
00:01:49.760 was enough to create the most anxious and depressed generation in our lifetimes, was enough to break
00:01:55.740 down shared reality, polarize political parties much further, change the incentives of the entire
00:02:00.560 media environment, basically colonize the entire world from that baby AI. But to get to your
00:02:05.620 question, so how did we get into AI? First of all, I wasn't like wanting to switch into it.
00:02:12.180 It was that I got calls from people inside the AI labs in January of 2023. This is like a month
00:02:17.980 after, a month and a half after ChatGPT had launched, I think. And these were friends I knew
00:02:22.520 in the tech industry who were now at AI Labs. And they basically said, Tristan, there's a huge
00:02:27.800 step function in AI capabilities that's coming. The world is not ready. Institutions are not ready.
00:02:32.740 The government is not ready. The arms race dynamic between the companies is out of control. And we
00:02:36.780 want your help to help raise awareness about this. And so my first reaction was, aren't there a
00:02:41.880 thousand people who've been working in AI safety and AI governance for a decade? And the challenge
00:02:46.940 was just that all the PDFs that people had produced about policy and governance were just
00:02:51.260 kind of not, it's not like that was turning into actual action or policy. There's a kind of material,
00:02:56.760 you know, you have to, what does Eric Weinstein call it? Confrontation with the unforgiving. Like
00:03:00.080 you have to be affecting the actual incentives and institutions in the world. So
00:03:04.440 basically, my co-founder, Aza Raskin, and I interviewed the top hundred people in AI at
00:03:08.940 that time. This is in January, 2023. We turned that into a presentation. This is your co-founder
00:03:13.020 of the, of the Center for Humane Technology? Yeah, my co-founder of the Center for Humane
00:03:15.620 Technology, which is the nonprofit vehicle that's been housing our work for the last decade,
00:03:18.980 basically. And we ran off to New York, DC, and San Francisco, and we basically gave this
00:03:25.920 presentation called the AI Dilemma that tried to show that we could predict the future that we
00:03:31.380 were going to get with AI if you look at the incentives. I think a huge problem that both
00:03:35.920 the film, the AI doc, and our AI Dilemma presentation we're trying to tackle is this
00:03:39.880 myth that you can't know which way the future is going to go. The future is uncertain. A million
00:03:43.280 things can happen. These are just unintended consequences from technology. The best route
00:03:46.980 is just to accelerate as fast as possible. And that is not true. And just to repeat a quote that
00:03:51.760 is heard in every one of my interviews, but only because it's so accurate: Charlie Munger,
00:03:56.200 Warren Buffett's business partner, saying, you know, if you show me the incentives, I'll show
00:03:59.380 you the outcome. And with the incentives of social media being the race to maximize eyeballs and
00:04:04.880 engagement, that would obviously produce the race to the bottom of the brainstem: shortening
00:04:08.820 attention spans, bite-sized video, more extreme and outrageous content, sexualization of young
00:04:14.080 people, you know, the whole nine yards of everything, hyper-partisanship. And all of
00:04:19.060 it happened. There's just a moment to sort of soak in that literally everything that we said
00:04:23.120 was going to happen happened. And it's not like we could predict all of it, but directionally you
00:04:27.520 could know the contours of where we were going. And part of this relates to, I think, the mistake
00:04:31.680 we make in technology: we get obsessed and seduced by the possible of a new technology,
00:04:35.480 but we don't look at the probable of the incentives and what's likely to happen.
00:04:39.320 So the possible of social media is, well, surely if we give everyone access to instant
00:04:43.840 information at their fingertips and connect people to their friends, we're going to have
00:04:47.380 the least lonely generation we've ever had. We're going to have the most enlightened and
00:04:51.380 informed society we've ever had. And obviously the opposite of both of those things happened.
00:04:55.740 And that's not like, oh, we got this wrong. And it was just, it was this mistake anyone
00:04:59.160 could have made. All you have to do, you know, is to quote Donella Meadows and sort of systems
00:05:02.760 thinking: a system is what a system does. The system of social media was not optimizing to
00:05:07.400 reduce loneliness and to create the most enlightened society. It was optimizing for
00:05:11.360 just what is the perfect post, next video, or tweet to keep you scrolling, doom scrolling by yourself,
00:05:17.060 esophagus compressed on a Tuesday night. And that's gotten us the world that we're now living in.
00:05:20.960 So we'll get to AI, but basically the important lesson here is that, and kind of what motivates
00:05:26.180 me with this movie is you kind of have two choices. You either get a Chernobyl, which is a disaster
00:05:32.240 from AI that then causes us to clamp down and to do something different. Or you have enough basic
00:05:38.180 clear-eyed wisdom and discernment and foresight, you know where this is going, that you can say,
00:05:42.980 okay, let's actually create guardrails in advance of a catastrophe. And so this film, the AI doc,
00:05:48.200 is really inspired by the history of the film The Day After from 1983 about what would happen
00:05:55.340 if there was nuclear war between the Soviet Union and the United States. That film was the most
00:05:59.580 watched synchronous television event in human history. Prime time television, it was Tuesday
00:06:04.220 night, 7 p.m. You probably watched it. Yeah, I remember watching it at the time, and also famously
00:06:08.780 it got Reagan's attention. He was deeply worried as a result. Yeah, that's right. So Reagan watched it,
00:06:13.500 I think in the White House kind of viewing room or something, and in his biography he writes
00:06:17.660 about getting depressed for several weeks after watching it, because you're confronted with the
00:06:21.740 possibility of annihilation of our species. Yeah, in a real way. And it's important to know it's
00:06:26.220 not like we didn't know what nuclear war was. Everyone knew what the atomic bomb looked like
00:06:30.480 from the photos and videos of Hiroshima and all the nuclear tests. It's not like people couldn't
00:06:34.180 imagine it, but there was a way that the actual consequences of continual escalation in nuclear
00:06:39.740 war gaming, that we weren't really facing the visceral consequences of that. It kind of sat
00:06:44.520 in humanity's collective shadow, like our Jungian shadow. We didn't want to confront that. The
00:06:48.080 director, whose name I'm forgetting in this moment, speaks about this in his biography, that we just
00:06:52.480 didn't want to talk about this topic. Why would you ever want to talk about it? And by putting
00:06:56.720 this film the day after into the public consciousness of humanity and into leaders
00:07:01.940 like Reagan, it was said that later when the Reykjavik meeting happened between Reagan and
00:07:06.500 Gorbachev, the director of the film got a note from the White House saying, don't think your
00:07:09.740 film didn't have something to do with enabling the conditions for this to happen. So what that
00:07:14.460 speaks to for me is if we all got crystal clear that we're heading to an anti-human future that
00:07:19.800 we don't want to be going towards. And we saw that clearly, and we saw it now, we could actually
00:07:25.260 steer and do something different than what we're doing. And that's, for me, the motivation of the
00:07:29.060 film, which I don't, I think it doesn't go all the way there, but it sets up the common knowledge
00:07:33.660 for that possibility. Yeah. Well, there are two cases made in the film. Obviously, there's the
00:07:39.100 very worried slash Doomer case, which we both share to some degree. And then there are the
00:07:45.100 people who seem capable of producing really an unmitigated stream of happy talk on this, and
00:07:53.740 they don't seem to concede anything to the claimed rationality of our fears. What do you
00:08:00.800 make of... I mean, I've asked this question of probably you in the past, and of many others on
00:08:05.720 this topic, but what do you make of the people of whom you can't say they're uninformed?
00:08:12.120 I mean, some of these people are very close to the technology.
00:08:14.180 Some of them are even, you know, developing the technology.
00:08:17.240 And at least in Yann LeCun's case, he is one of the actual, you know, progenitors
00:08:21.840 of the technology, one of the three, you know, forefathers of it.
00:08:25.620 But there are people who are deeply informed about all of these facts and yet won't concede
00:08:31.180 anything to the fears.
00:08:32.720 What is your theory of mind of these people?
00:08:35.080 Because some of them are in the film and they're given the job of providing the other side
00:08:39.160 of the story here.
00:08:39.860 Yeah. Maybe just to back up so the listeners, you'll see it in the film if you go see it, but just understand the structure of the film. So the film kind of takes you on a tour of, first, the people who are focused on all the things that could go wrong. And so these are the risk folks, and I don't like using the term doomers because I think it reifies something that's not really healthy. You know, is someone who's worried about the risk of a nuclear power plant a doomer? No, they're a safety person who cares about the nuclear power plant not melting down.
00:09:04.860 Yeah. Doomer is a term of disparagement launched by the people who don't share these fears.
00:09:09.440 That's right. That's right. So let's not reify that. So the first film, I mean,
00:09:13.000 the first section of the film is really focusing on those folks and their concerns.
00:09:16.460 And it's really devastating for the director. And the story in the conceit of the film is that the
00:09:20.180 director is having a baby. And so he's asking all of these people in AI, is now a good time to have
00:09:24.680 a kid? And I think that humanizes the question of what is the future we're heading towards?
00:09:29.520 Because in an abstract sense, it's not that motivating. When I think about me and my kids,
00:09:32.700 it anchors this discussion about AI in terms of the things that people most care about,
00:09:36.680 which is their family. So then the film, after the director sort of is confronted by all this
00:09:41.080 and he gets overwhelmed and he kind of freaks out to his wife thinking, oh my God, I don't know what
00:09:45.740 to do. And she says, you have to go find hope. And so he turns around and he goes out and he
00:09:49.260 talks to all of the AI optimists. So this is Peter Diamandis. This is, you know, Guillaume Verdon,
00:09:54.060 otherwise known online as Beff Jezos, basically the tech accelerationists and people
00:09:58.920 who think that our biggest risk is not going fast enough. Think of all the people with cancer or all
00:10:03.780 the people whose lives that we won't be able to save if we don't make AI faster than we're making
00:10:08.640 it right now. My reaction, I think going sort of a step back, there's a thing in AI that we have to
00:10:16.840 acknowledge there's an asymmetry. The upsides don't prevent the downsides. The downsides can
00:10:22.120 undermine a world that can sustain the upsides. So for example, the cancer drugs can't prevent
00:10:28.020 a new biological pathogen that's designed to wipe out humanity. But the biological pathogen
00:10:32.500 that can wipe out humanity undermines a world in which cancer drugs are relevant at all.
00:10:36.800 AI generating GDP growth of 10, 15% because it's automating all science, all technology
00:10:42.340 development, all military development, automating abundance sounds great. But if the same AI that
00:10:47.540 can do that also generates cyber weapons that can take down the entire financial system,
00:10:52.140 which one of those things matter more? 15% GDP growth or the thing that can undermine the
00:10:55.960 basis of money and GDP at all. So it's very important. The film doesn't actually make this
00:11:00.660 point. And it's one of the critical things that people do need to get because in order to be
00:11:04.960 optimistic, you have to actually mitigate the things that can go wrong. And I feel like AI
00:11:09.320 is presenting us with essentially a maturity test. It's almost like the marshmallow test
00:11:13.560 in psychology, where if you wait and you actually mitigate the downsides, then you get the actual
00:11:19.320 two marshmallows on the other side of the genuine benefits of AI. But if you sort of race to get
00:11:24.340 the one marshmallow now and don't mitigate the downsides, then you get the downsides.
00:11:27.940 And I think that is not in the film, but is critical for people to get.
00:11:31.520 Yeah. Yeah. So then what do you make of the people who have all the facts in their heads,
00:11:35.860 but they're not worried or claim to be not worried about quite literally anything?
00:11:41.880 Yeah. Well, personally, I think there's an intellectual dishonesty there. And I'm sure
00:11:47.000 in past conversations you and I have had, Sam, over the years...
00:11:49.560 But there's an interesting case here. So take someone who has finally had their religious epiphany here, but for the longest time didn't. And this is literally the most informed person on earth, Geoffrey Hinton. How do you explain that these problems weren't obvious to him years ago?
00:12:10.960 Oh, so you're saying for Hinton that he had an awakening or something?
00:12:13.580 Yeah, so he was somebody who didn't give really any credence to concerns about alignment that I'm aware of for years and years and years, as he was quite literally the father of this technology.
00:12:23.920 Right.
00:12:24.360 And now he's basically right next to Eliezer Yudkowsky in his level of concern.
00:12:31.440 That's right.
00:12:32.260 Why? I mean, it's not that he got more information, really. So how do you explain his journey?
00:12:37.480 Well, so I don't know his particular journey. You might just know more about what his awakening moment was. So I can't, I can't really speak to that.
00:12:44.600 I think it was just that he, I mean, this has always been a non sequitur from my point of view, but it was just his sense. I think this is what he said publicly that the time horizon suddenly collapsed. We just suddenly made much more progress than anyone was expecting.
00:12:57.700 Well, that generally has been one of the things. I mean, it's the thing that caused those AI
00:13:01.220 engineers, kind of the Oppenheimers in January, 2023 to reach out to me. And that's what it felt
00:13:05.700 like. It's like you were getting calls from people inside this thing called the Manhattan Project
00:13:09.300 before I knew what the Manhattan Project was. Right. Because to be clear, I actually went to
00:13:13.700 the, I went early on to like an effective altruism global conference. I was not an EA,
00:13:18.600 but I happened to go to the conference in like 2015. And I was actually frustrated because I
00:13:22.440 felt like the EA community was obsessed with this virtual risk called AI that I didn't take
00:13:26.440 seriously back at the time, because we were nowhere close to those capabilities. And I was like, there's
00:13:31.120 a big runaway AI here right now that went rogue. It's maximizing for a narrow goal at the expense
00:13:35.220 of the whole, and it's called social media, and EA is completely oblivious to it and isn't focused
00:13:39.120 on it. But then I was really wrong later, when AI capabilities really just made a huge amount of
00:13:44.380 progress, and that's again when we got the calls from people in the labs. So I think it was the jump
00:13:47.700 of just suddenly, hey, I think GPT-4 will, like, pass the bar exam, pass the MCATs. Like, that's suddenly
00:13:53.400 a new level of AI that we just didn't have before. Yeah. I still have no theory of mind
00:13:59.200 for the people who are not worried now about anything. Everything from the comparatively
00:14:05.020 benign, like just economic dislocation and wealth concentration that's unsustainable politically
00:14:09.820 to the genuine concerns about alignment, that we could build something that we are now trying to
00:14:15.140 negotiate with that has more power than we have and we can't take the power back. I mean, to be
00:14:19.600 fair, just to say it bluntly, I think some of them are lying. I think some of them actually
00:14:23.240 are building bunkers right now. Let's just say it. They're building bunkers, and they
00:14:28.440 simultaneously say, there are all these amazing things we're going to get.
00:14:32.540 They sort of wave it away with their hands. They kind of push away the idea that there's going
00:14:37.280 to be all this disruption in the middle time. And they're kind of focused on the long
00:14:40.300 term. Like after we make it through this basic horrible disruption and maybe revolutions,
00:14:44.060 there's going to be some other side of this, which will be the most abundant time in human history.
00:14:47.900 Right. People like this, they often point to the graph of global GDP, where if you look at 1945, you barely get a little blip where it goes down for a moment and then it goes straight back up. Right. And it's that kind of psychology. There's also the psychology of Upton Sinclair: you can't get someone to question something that their salary depends on not seeing. Yeah. And so if your business model is selling optimism and selling hope and selling everything's going to be great, you're obligated not to speak about the risks.
00:15:15.980 But I think the thing we should be watching out for is just that incentives are the problem with the world. Incentives that allow non-honest speech to become the public understanding that we need to operate on. Because we just need objective sense-making, not incentivized sense-making.
00:15:29.180 And we know that some of the principal people doing this work, people like Sam Altman and Elon Musk, were people who were at first as worried as anyone.
00:15:37.480 Correct.
00:15:37.680 And I mean, they were just, they were proper doomers. And Sam Altman said this AI will probably lead to
00:15:43.560 the end of the world, but will in the meantime make some great companies. Yeah. And, you
00:15:48.060 know, Elon had his whole summoning-the-demon framing. But now they're two of the, whatever,
00:15:53.720 five who are in this arms race condition. That's right. Yeah. What do you make... I think this is actually
00:15:59.240 really important. What do you make of their psychology? I think this is an
00:16:02.800 area where you and I can double click and go deeper: what is the psychology of someone who used to speak
00:16:07.840 publicly about all the risks? You talked to Elon back then. You were at the original Puerto Rico
00:16:12.000 conference. What is your sense of what's going on with them now? So much has happened to
00:16:17.360 his brain that it's very hard to explain. Or, again, I don't know. He's in some superposition of,
00:16:26.320 you know, who I thought he was and who I never imagined he might be. And I'm not sure how much
00:16:33.520 it's a story of I didn't recognize who he was at the time, or how much he has changed, you know,
00:16:38.800 under the pressure of becoming so famous and so wealthy and so drug-addled and, I mean, actually,
00:16:45.280 algorithm-poisoned. I view him as, you know, the worst, the most depressing case study in the
00:16:51.760 story of what social media can do to a human life. People talk about Trump derangement syndrome,
00:16:56.080 but there's really social media derangement syndrome. And the person whose brain has been
00:16:59.980 most jacked into the unfiltered version of that algorithm has been him. So it's kind of like
00:17:04.880 getting high on your own supply. He's just built this hallucination machine now, and he's just
00:17:08.980 been staring into it for years and years. To be clear, I don't fault him uniquely for that or
00:17:13.440 something like that. This is the system does this to everybody. We're just seeing an example where
00:17:17.600 someone who's an extreme user and we're seeing the effects of it. But the problem is that he's
00:17:20.880 so consequential. His worldview, his paradigm of seeing this, his sense-making, what he's willing to
00:17:25.480 talk about publicly, what he's willing to signal publicly, matter a lot, as does what he's willing to lie
00:17:29.720 about at this point. I mean... And then there's a profile of Sam Altman that
00:17:35.720 just came out in The New Yorker, that I think was published yesterday or thereabouts. I haven't
00:17:40.520 finished it, but, I mean, just the behind the scenes... You know, the arms race is so desperate at this
00:17:46.840 point that, I mean, behind the scenes there's just this, you know, endless effort of
00:17:51.240 character assassination and kind of war gaming between the two of them personally, because most,
00:17:55.720 most of it is coming from Elon toward Altman, you know, trying to torpedo OpenAI. But
00:18:00.200 there's just a lot of... I mean, clearly the original altruistic motive, to just do this safely
00:18:07.880 above all, to do this safely for the benefit of humanity, has been thrown to the wayside, and
00:18:13.800 there's just this wanton, you know, reach for trillions of dollars, and the fear of
00:18:19.880 domination if I don't do it first. I mean, let's just be clear: the central, the only story about
00:18:24.460 what's happening with AI, the only story that matters, is actually covered in act three of the
00:18:28.180 AI doc film, which is the arms race dynamic. Yeah. That's it. Like, everything else: when you see
00:18:33.320 AI companies stealing intellectual property and just ignoring the lawsuits, that's the arms race
00:18:37.960 dynamic. When you see AI psychosis and, you know, teen suicides, that's just the arms race dynamic.
00:18:43.740 It's the race to hack human attachment and get people dependent on AI and into sharing their
00:18:47.320 deepest secrets. When you see mass joblessness, it's: if I don't race to disrupt all the jobs,
00:18:52.760 I'll lose to the other guy that will. When you see the national security race, it's all driven
00:18:56.680 by the arms race dynamic. And I think that AI is just a confrontation with game theory. Humanity
00:19:02.100 is being confronted with whether game theory is the only model to run our choice making.
00:19:07.420 It seems like Anthropic has, I don't know Dario, I've never met him, but it seems like it has a
00:19:13.560 slightly different ethic, at least in how it's behaved so far. I mean, the fact that it pulled
00:19:16.860 back from its Pentagon deal because it couldn't secure an agreement that it wanted. And as of,
00:19:22.060 I believe last night, it announced that it has a model that it doesn't feel safe to release to the
00:19:27.220 general public, but it's releasing to all the companies like Microsoft that might be able to
00:19:32.480 study it because the specific concerns are around cybersecurity. It's a model that can detect bugs
00:19:37.700 that human developers haven't detected for even decades in their code base, whether it's an
00:19:44.920 operating system or whatever. It just, you know, apparently within seconds is finding
00:19:50.440 exploits everywhere, and so there are exploits in every major operating system, in every major web browser,
00:19:55.300 which is a very big deal. Yeah. And, but again, so I think, yes, Anthropic actually has been
00:20:01.740 the safest of them all, and has tried to, and cares most about, getting alignment right, etc. But you're also
00:20:08.220 seeing them continue to decide to release the models even with a lot of the misaligned behavior
00:20:12.500 that they're seeing of AI models that are self-exfiltrating or blackmailing people.
00:20:17.060 You know, you'd think when they see the blackmail example-
00:20:18.580 So let's spend a little time on that. How would you summarize where we are now with AI and the
00:20:23.280 kinds of surprising behaviors or perhaps behaviors that shouldn't surprise us that are alarming
00:20:28.420 people?
00:20:29.340 Yeah. This is so critical because I think if you view AI as just another technology that confers
00:20:34.640 power, it's a tool, you pick up that tool and use it like any other, you end up in one world.
00:20:38.500 But if you see that AI is the first technology that thinks and makes its own decisions and is generating hundreds of thousands of words of strategic reasoning when you ask it a basic question or like how to code, suddenly you end up in a different world.
00:20:50.020 So let's talk about some of these examples of AI uncontrollability.
00:20:53.920 So in the film, they reference this example that many people have heard about by now of the anthropic blackmail example.
00:20:59.880 This is a simulated company email where in the simulated fictional company, they say in the emails to each other, we're going to shut down and replace this AI model. And then later in that company email, there's an email between the executive at the company and an employee. And the AI spontaneously comes up with the strategy that it needs to blackmail that employee at Anthropic in order to protect itself, to keep itself alive.
00:21:24.160 At first people thought, well, this is, you know, just one bug in one AI model. But then they tested
00:21:30.660 all the other AI models, from DeepSeek, ChatGPT, Gemini, Grok, etc., and they all do the blackmail
00:21:37.020 behavior between 79 and 96 percent of the time. Yeah, amazing. Yeah. There's this kind of moment
00:21:43.320 where it's like, cue the nervous laughter. And yet if you actually send this to people
00:21:48.140 who are at the White House, I think there's a disregard for this. People just say, well, you're
00:21:51.700 coaxing the model. You're getting it to do this. You're
00:21:55.540 trying to put it in a situation where of course you're going to, like, keep tuning the variables
00:21:58.700 until you get it to blackmail. So I have some updates since then. Anthropic trained another
00:22:04.560 model. They were able to train the blackmail behavior down by quite a lot. So it doesn't do
00:22:09.000 this behavior in this simulated environment. That's the good news. The bad news is that the
00:22:13.300 AI models are now situationally aware of when they're being tested, and they're now altering
00:22:18.660 their behavior way more. Right. Yeah, that strikes me as genuinely sinister. Yes.
00:22:24.640 I think we have a hard time modeling this because all of it is abstract. I mean, I'm just thinking
00:22:28.300 about your listeners, and it's like, this just sounds like... You know, back
00:22:32.640 to E. O. Wilson: the fundamental problem of humanity is we have Paleolithic brains, medieval
00:22:36.420 institutions, and god-like technology. And the only experience you have with your brain with regard to
00:22:41.440 AI is this blinking cursor that tells you why your washing machine is broken. That's different than
00:22:46.960 this blackmail example, which sounds abstract, and you don't actually experience that side of
00:22:50.780 AI. But the thing that, I mean, again, I've thought about this enough in the vein in which
00:22:57.080 I've thought about it for now at least 10 years, where it was obvious to me, I don't consider
00:23:02.220 myself especially close to the intellectual underpinnings of any of this technology, right?
00:23:06.720 I'm just a consumer of the news on some level with respect to AI. But you were right, and reasoned
00:23:11.420 about it philosophically, and were able to get to the right conclusions. It was just so obvious that
00:23:13.920 The moment you can see that intelligence is not substrate dependent, that we're going to build actual intelligence in our machines, given what intelligence is, you should expect things like deception and manipulation and the formation of instrumental goals that you can't foresee.
00:23:30.680 And certainly when you're, when you're imagining building something that is smarter than we are.
00:23:35.920 That's right.
00:23:36.280 Right.
00:23:36.920 Or that's just as, that's only as smart as we are, but just works a million times faster.
00:23:41.020 Right.
00:23:41.480 So that every time, I mean, just how would this conversation go if every time I uttered a sentence, you functionally had two weeks to decide on your next sentence?
00:23:49.920 Correct.
00:23:50.380 Right.
00:23:50.680 You would, you would be, you would obviously be the smartest person I'd ever met.
00:23:54.120 Exactly.
00:23:54.740 Long before you get superhuman AI, you just get super speed.
00:23:57.460 Yeah, so speed alone is enough to just completely outclass you. And intelligence, you have to envision this as a relationship to a mind that is autonomous. And then you add things like, you know, recursive self-improvement and all that. And then all of a sudden, you know, we're in, you know, some dystopian science fiction if this is not perfectly aligned.
00:24:18.620 That's right. Let's make sure we add just another example, because there's a recent example from just three weeks ago. Alibaba, the Chinese AI company, was training an AI model, and then, in a totally different side of the company, their security team noticed a bunch of network activity, like a flurry of network activity, like, what the hell is going on here?
00:24:37.200 And it turned out that in training, midway through training, not deployment, in training, the AI model had basically set up a secret communication channel with the outside world and then had started to independently start mining for cryptocurrency.
00:24:50.860 Now, this time you cannot claim that someone coaxed the model to do this.
00:24:56.060 This is the spontaneous instrumental goal: the best way to achieve any goal is to acquire more power and resources, so you have the ongoing ability to achieve those goals.
00:25:02.960 And it decided to acquire cryptocurrency.
00:25:05.240 Right now, if you're a Chinese military general and you hear this example, like, how do you feel
00:25:11.100 as a mammal? Like, you feel the same way that any other goddamn mammal feels hearing this example.
00:25:14.960 If you're a U.S. military general and you hear this example, it's terrifying as a human being.
00:25:21.000 So there's, like, good news in this for me, which is that I think people just literally don't know
00:25:25.560 these examples. They just don't know. Like, what percentage of the world's leaders do you think
00:25:30.080 are aware of this Alibaba spontaneously mining cryptocurrency example? Like, if you had to guess?
00:25:34.780 Oh, I would think it's minuscule, but I mean, there's also, it does seem like there is still a barrier to internalizing any of these examples with the appropriate emotional response.
00:25:47.780 It's like, I mean, they're, again, this is, I come back to the way that struck me the first time I started thinking about it 10 years ago in my TED talk on this topic in 2016.
00:25:56.860 I remember starting with the problem, which is as worried as I can be about this for the next 18 minutes, all of this is fun to think about.
00:26:05.940 Like, this is not the same thing as being told that actually your landscape has been contaminated by radioactive waste, you know, and you can't live there for the next 10,000 years.
00:26:16.020 Okay, that just sucks.
00:26:17.240 There's nothing fun about that, right? But here we're sort of in the first act of the movie that
00:26:22.420 is getting a little fun. And these are just, you know, these examples produce laughter as much as
00:26:29.340 anything else. That's right. Well, and as Max Tegmark will say, it's like the view gets better
00:26:32.900 and better right up until the cliff. I mean, AI is the ultimate devil's bargain because it is a
00:26:37.500 positive infinity thrown at your brain of positive benefit at the same time that's a negative infinity
00:26:42.420 of risk. I think it's very important to get this. And I was excited to talk about this with you
00:26:45.860 particularly, because you can go into the sort of meta awareness of how are we holding the
00:26:50.680 psychological object that is AI. If I point my attention at, you know, my kids doing vibe coding
00:26:55.340 or my neighbors using it to start their business and suddenly have a team of agents that are making
00:27:00.640 their business more functional. Notice that with those people, when they've got that team of agents
00:27:04.700 helping their business be more functional, just there you are in your experience, taking a breath.
00:27:09.460 Are you anywhere close to the example of Alibaba going rogue and mining cryptocurrency? Like those
00:27:15.200 things don't even fit next to each other. And so there's a psychological distance between the
00:27:18.900 positive examples and the negative that you literally don't hold them in your mind at the
00:27:21.620 same time. As my co-founder Aza will often say, it's like you close one eye and you
00:27:26.000 can see the benefits; you close the other eye, you see the risks; but you can't open
00:27:29.200 both eyes and synthesize those two things with stereoscopic vision. And the reason, part of the
00:27:33.720 reason I was excited about this film, the AI doc, is that it's trying to do that. It's trying to
00:27:38.780 actually present these arguments in one synthesizing container. I think sadly, there's still a little
00:27:43.880 bit of a Rorschach where people kind of have their default intuition and they kind of continue to
00:27:48.900 lean in that direction because there's a reflexive optimism or pessimism or something like that.
00:27:52.940 Whereas my deep goal is actually synthesis. Again, the upsides do not prevent the downsides.
00:27:57.720 The bigger muscles and military might don't prevent the like-
00:28:00.220 But that's a crucial asymmetry.
00:28:01.840 It's a fundamental asymmetry. It means you do not get those upsides. So this is the devil's bargain.
00:28:06.500 Like you are going to get a sweeter and sweeter looking deal of amazing,
00:28:10.440 incredible benefits that are unprecedented and, as you said, are fun to think about, are enjoyable,
00:28:15.040 are exciting, are intellectually fascinating. But even the scary things are fun to think about.
00:28:19.180 That's part of the problem. It's like, even that too. Yeah. But what do you think that is?
00:28:23.120 Because that's also, like, I feel like we've been mistuned by sci-fi to treat it like
00:28:27.260 it's a movie, and there's a state of kind of derealization or desensitization that I worry
00:28:31.360 that we're in. No, because the movies have us not take it as a real thing. It's honestly fun to think
00:28:35.900 about getting killed by, you know, robots. I mean, it's fun in a way that nothing else
00:28:42.400 that is equally threatening is. Do you think it's actually fun for people to think about that?
00:28:46.520 I mean, I really want to drill into this, because I think it's important to ask the question.
00:28:50.320 Given these facts, and we're only, like, 10 or 15 minutes into this interview,
00:28:53.620 this should be enough to say something's got to change. You do not release the most powerful,
00:28:58.460 inscrutable, you know, technology faster than we deployed any other tech in history when it's already
00:29:02.920 doing the HAL 9000, crazy rogue behavior,
00:29:05.700 shutdown avoidance, mining for cryptocurrency.
00:29:07.680 We have all of the warning signs.
00:29:09.520 Okay, but the thing that is most compelling to people,
00:29:12.660 the thing that they can't break free of, I think,
00:29:15.300 is the logic of the arms race,
00:29:18.200 given that some of the people in the race,
00:29:20.540 I mean, forget about the arms race
00:29:21.580 between our companies that may or may not be run
00:29:24.380 to one or another degree by, you know,
00:29:26.520 highly non-optimal, you know,
00:29:28.020 and in some cases, even psychopathic people, right?
00:29:30.240 The system has selected for the psychopathic people.
00:29:32.900 with some of the people who are in charge in our own case, but leaving that aside,
00:29:37.300 we're in an arms race with China, right? We're in an arms race with, I mean, I guess China's
00:29:41.440 the most plausible, but who knows who else, but we're probably in an arms race with Russia. I
00:29:45.560 don't know where Russia is on this. But when you think of the prospect of any, you know,
00:29:51.460 authoritarian slash totalitarian regime getting this technology first, in what
00:29:58.100 will look like something like a winner-take-all scenario... If there really is a
00:30:02.900 binary, you know, step function into superintelligence, then to be, you know,
00:30:09.200 two months ahead of the competition is to basically win the world. And we could be in
00:30:12.840 some situation like that in the event that that just doesn't destroy everything, right? If it
00:30:18.040 just actually confers real power, right? Because it's sufficiently aligned with the interests of
00:30:22.800 whoever develops it. That is so compelling that we just, we cannot lose to China above all here.
00:30:28.120 Certainly when you're talking about, you know, autonomous military technology or,
00:30:31.040 uh, you know, anything that would be deployable in our own defense or offense, right? You know,
00:30:36.820 cybersecurity. Sure. Like, we can't be behind. So how do we become slow and
00:30:42.340 careful under those conditions? Right. But then what are the chances that that superintelligent
00:30:47.640 AI that gives us that dominance is one we will control? Right. So, no, literally, what are the
00:30:52.120 chances? Well, this is a point you've made. Uh, I don't know if you make it in the film,
00:30:54.980 but I've heard you make it, which is, you know, we, we were first with social media, right? We,
00:31:00.240 You know, like, if you look at that as an arms race that we won... Correct. What exactly did we win? Exactly. Winning that arms race meant inventing essentially a psychological manipulation weapon, a mass behavior modification machine, with AI. We built that first, but then we didn't govern it well. So it's like a psychological bazooka that we flipped around and blew off our own brain with. And so what that shows you is that we're not actually in a race for who has the most power.
00:31:25.780 We're in a race for who is better at steering, applying, and governing that power in ways that
00:31:30.860 are society strengthening. That is what we're actually in a race for. Because if we actually
00:31:35.160 beat China to an AI bazooka that we literally don't know how to control, and we're not on track to
00:31:40.800 know how to control, and all the evidence shows that it has more self-awareness of when it's
00:31:45.740 being tested, not less. It is better at cyber hacking, not worse. It is better at, and more often
00:31:50.800 does, these kinds of, you know, self-preserving behaviors. If we're not on track, and we're also
00:31:55.960 going faster, like the conditions in which we would be on track to control it would be the
00:31:59.960 ones that were going slow and steady. But we're doing the opposite of those conditions because
00:32:03.060 of the race dynamic. So there's just this kind of psychological confusion here, which is we're not
00:32:08.380 going to win this race. In the race between the U.S. and China, AI will win. There's a metaphor
00:32:12.420 that our mutual friend Yuval Harari, who's the author of Sapiens, has here, which is that in
00:32:17.160 the post-Roman period in Britain, it was very weakened, and they were getting attacked
00:32:22.480 from the Scots and the Picts in the North, basically, you know, pre-historical Scotland
00:32:28.080 and Ireland and those civilizations. And they were very weak. And they said, what are we going to do?
00:32:33.220 They had this idea. Well, why don't we go off and hire this bad-ass group of mercenaries called
00:32:37.020 the Saxons? Because those Saxons are super powerful. And if we get the Saxons to fight
00:32:42.240 our wars for us, then we'll win. And of course we know the history of how that went. We got the
00:32:46.520 Anglo-Saxon empire, except in this metaphor, AI is the Saxons. Except we won't get a merger between
00:32:51.980 the human AI empire. We will get the AI empire. This makes me think of all these guys in their
00:32:58.500 bunkers who have hired Navy SEALs to protect them for the end of the world. They're going to control
00:33:04.400 their Navy SEALs until the end of time. Exactly. But this is insanity. So the main point here is
00:33:09.660 that there's kind of an attractor that's driving all of this right now, which is this arms race
00:33:13.760 dynamic, under this false illusion that we have to beat China, but we're not examining the logic
00:33:17.820 of what we are beating them to. We're beating them to something that we don't know how to control,
00:33:21.720 and we are not on track to control. And then you get people like Elon saying, in this weird,
00:33:25.760 and I'm curious what you make of his psychology, but saying in public interviews, I think it was
00:33:28.420 in the Cannes Film Festival or something in France, and he said, I decided I'd rather be
00:33:31.780 around to see it than to not. It's kind of this surrender. It's kind of this death wish. It's kind
00:33:36.000 of like, I can't stop it, so I decided I'd rather be there to have built it and have my God be the
00:33:41.500 thing that took over. This actually is a fundamental thing that we should double click on
00:33:45.800 for a second, which is the unique thing about AI game theory that's different than nuclear game
00:33:50.480 theory, which is that the omni lose-lose scenario from nuclear game theory is like, I know as a
00:33:56.940 mammal that you also don't want to annihilate all life on planet earth. And the fact that I know
00:34:01.960 that about you without even talking to you means that there's some element of trustworthiness that
00:34:05.980 we will try to coordinate to something else because we agree on some implicit level there's
00:34:10.700 an omni lose-lose thing that's worth avoiding. Here's the problem with AI. If I start by
00:34:15.780 believing that it's inevitable and nothing can stop it, then if I'm the one who built the suicide
00:34:21.440 machine, I'm not an evil person because I'm only doing something that would have been done anyway.
00:34:27.180 So I have an ethical off-ramp in that decision. And the second part is, unlike if you literally
00:34:32.420 made it like a matrix where you just get the point scores of, you know, you get negative infinity if
00:34:36.580 we get nuclear war in the nuclear scenario with AI. Let's say we're in this race and the DeepSeek
00:34:41.900 CEO is there and the Elon's there and Sam's there and they're racing to do it. They actually all
00:34:46.120 believe it could wipe out humanity. But if they raced and got there first, then think about the
00:34:51.560 scenario. Humanity is wiped out, but there now exists an AI that speaks Chinese instead of
00:34:56.900 English, or has the DeepSeek CEO's DNA rather than Elon's DNA. The end of the world has your logo on it.
00:35:02.980 That's right. Exactly. Good. Well said. So the end of the world has your DNA or your logo on it. And I want people to get this because if people got this, they would see that there might be an implicit way that people might think that like when push comes to shove, you know, cooler minds will prevail because you can trust that the people at the top will like do whatever it takes to steer away from this and will like steer away in time.
00:35:22.600 But what I want people to get is you can't trust that, because these people, I think there's psychological damage here, they subconsciously have pre-accepted this kind of end of the world and end of their life. And if they got to be the one who built the digital God that literally replaced humanity, then in some legacy, in some world, I don't know in whose history book that exists or whether anyone conscious is going to read it, they got to go down in history in that way. And what that does is it should motivate the rest of the 8 billion people on planet Earth to say, I'm sorry to swear, but just fuck that.
00:35:52.600 We don't want that. If you want your children to live, and you care about the world as it exists, and you love the things that are sacred about life, and you're connected to something, that is at risk with this small number of people who are racing to this negative outcome.
00:36:06.140 Well, a lot of these guys seem to have had their formative educational experiences reading science fiction. I mean, it's like you read a lot of science fiction, you read a little Ayn Rand, and you're self-taught in basically everything else.
00:36:21.960 And, to my eye, you form a very weird set of kind of ethical weights. Just not enough of the best parts of culture have gotten into your head such that you can actually come to a real understanding of what human life is good for.
00:36:44.100 I mean, you literally meet people who are agnostic as to whether or not it would be a bad thing if we all got destroyed and ground up in this new machinery and our descendants were robots wherein consciousness may or may not exist.
00:36:59.600 And they're like totally kind of like, maybe that's, that's sort of an interesting way to end this movie.
00:37:03.500 I mean, you get a semblance of that when Peter Thiel is asked the question by Ross Douthat in the New York Times, should the human species endure?
00:37:10.180 It's real. He stutters for 17 seconds.
00:37:12.680 Yeah, exactly.
00:37:14.100 Well, I think people need to get this because the point you're bringing up is both at like
00:37:17.580 a level of their conditioning, what the system that they're inside of, the game that they're
00:37:21.620 being forced to play kind of domesticates them for ruthless game playing.
00:37:26.060 Like game theory has already colonized us into machines, machine-like reasoning, where
00:37:29.900 we're not connected to our own humanity.
00:37:31.240 We're not connected to common care for the rest of it.
00:37:33.680 And in fact, it's an active devaluing of being human.
00:37:37.260 I'll give you an example.
00:37:38.460 Sam Altman was asked at the AI safety summit in India recently, you know, what do you think of
00:37:44.840 the fact that it takes so much energy to run these data centers? You know what his response was? He
00:37:50.100 said, well, it takes a lot of energy and resources to grow a human over 20 years.
00:37:53.860 Yeah, I did hear this. Yeah.
00:37:55.140 Well, and I actually want to point to something here, Sam, because it's actually, it's really
00:37:57.980 important because I want people to get why we're heading to an anti-human future and why you can
00:38:01.820 be crystal clear that that's going to happen. Are you familiar with the essay by Luke Drago and
00:38:06.440 Rudolf Laine called The Intelligence Curse? Yeah, actually, I did read that. But explain that
00:38:11.760 premise. Let's just bring this out for people, because I think it's really critical. So the idea
00:38:15.480 is there's something in economics called the resource curse. So if you're Libya, Congo,
00:38:20.000 South Sudan, Venezuela, you first discover this resource. Maybe it's diamonds, maybe it's oil,
00:38:25.000 maybe it's rare minerals. And that's a blessing. And you're like, oh my God, we're going to get
00:38:29.000 all this GDP growth and we're going to get this prosperity. But what happens is if you don't have
00:38:33.240 the appropriate institutions and sort of social fabric and investments in people, suddenly, let's
00:38:37.980 say, 70 percent of your GDP is coming from mining that resource. And now, in a government, they have
00:38:43.460 this choice when they've got money coming in: do I invest more into the extraction of that resource,
00:38:47.540 or do I invest in my people, who have nothing to do with the GDP now? Yeah. And the answer is, like,
00:38:52.520 I'm going to invest in the resource. You basically don't need your people, and you don't have to be
00:38:55.540 responsible to their interests, because... That's right. You know, you're pulling your wealth directly
00:39:00.540 out of the ground. Exactly. You're pulling your wealth out of the ground, not from human labor,
00:39:03.420 and not from human development, not from kind of the enlightenment of your society in any way.
00:39:07.280 And so there's this kind of perverse incentive there, and we've seen how these failed
00:39:12.340 states have kind of, you know... You end up with countries where you have shanty towns and war.
00:39:16.800 But while you have this, and even in success, you wind up with authoritarian, right, you know,
00:39:22.280 to one or another degree, places you wouldn't want to live. I mean, again, even in the case of Saudi
00:39:27.500 Arabia, you're talking about what has been, I mean, it's opening up a little now, but it's been a
00:39:32.420 highly repressive society. And it can be that way because it doesn't have to respond to the needs
00:39:37.840 of its people. Well, that's an example of a society that's trying now a little bit to go
00:39:41.040 the other way. And I'm by no means an expert on Saudi Arabia, but it's an example of trying to
00:39:44.460 beat this. So there is a parallel to the resource curse that, again, the authors Luke Drago and
00:39:48.860 Rudolf Laine wrote about, called the intelligence curse. So what happens, and this is not that
00:39:53.100 hypothetical, when a couple of years from now much of the GDP growth in this country
00:39:58.320 is coming from AI? Let's say like 50% or 70% is coming from AI. Do I have any incentive to
00:40:03.960 invest in the education, healthcare, childcare, development, safety of my people? No. And the
00:40:10.600 companies don't need you for their labor anymore. So your bargaining power went away and the
00:40:15.260 governments don't need you for tax revenue because that's not where they're getting the GDP growth.
00:40:18.620 So it's not just that you aren't investing in your people; it's that your people lose political power.
00:40:23.780 And this is so critical to get. Like, that's why I can say confidently we're heading to an
00:40:28.840 anti-human future: we're going to get new cancer drugs, new material science, new antibiotics, at
00:40:33.460 the same time that you get mass disempowerment of regular people, and you're going to have, you know,
00:40:38.600 eight soon-to-be trillionaires hoard all of the wealth, and there's not going to be much left for
00:40:43.660 regular people, unless we actively lock in a political infrastructure that says that we want
00:40:48.000 to create the intelligence dividend, not the intelligence curse, kind of like what Norway did
00:40:51.320 with the sovereign wealth fund. And yeah. Yeah. Alaska. That was the other example I was thinking
00:40:57.100 of. Yeah. So I just wanted to say that, because that links up
00:41:03.360 perfectly with Sam Altman saying, well, it takes a lot of energy and resources to grow a human. Like,
00:41:07.920 this leads you to a devaluing of humans. This leads you to the seductive feeling that maybe
00:41:13.560 humans are parasites. And by the way, we've been running that social media machine for the last 20
00:41:17.640 years. So now you degrade what it looks like to be human. And so we're not very inspired by what
00:41:21.640 it means to be human. You've got a bunch of these guys like Elon running around wondering whether
00:41:25.380 we're in a simulation and whether everyone else is just an NPC. I was just going to say, I mean,
00:41:29.400 even just calling the other people on planet Earth NPCs or non-player characters is a devaluing of
00:41:34.200 humans. So part of like this rite of passage that AI is inviting us into is we have to reconnect
00:41:39.080 with our fundamental humanity. We have to actually value and also rediscover and celebrate what it
00:41:44.040 is that's valuable about being human. And not just in some kind of kumbaya way, but in the sense that the human
00:41:49.080 downgrading, which is the term we came up with to describe the kind of social media degradation of
00:41:53.180 the human condition, the shortened attention spans, doom scrolling, lonely, not creative,
00:41:57.840 just, like, dopamine-hijacked version of us, the kind of WALL-E humans, that is not humans. That's
00:42:03.080 what we have been domesticated into by, ironically, first contact with a runaway AI that was perversely
00:42:09.840 incentivized. And I feel like if you shatter that funhouse mirror and you realize that we're
00:42:14.180 actually much more capable, creative, we're the same raw potential that is able to do amazing
00:42:18.880 things. But we've been living in this sort of perverse, vicious loop.
00:42:23.320 Ironically, it's an earlier version of the intelligence curse, except it's like the
00:42:25.820 social media curse: when GDP comes from these five tech companies domesticating and downgrading
00:42:29.860 humans, you get another version of that. I'm saying all this because I want to actually
00:42:33.380 inspire people that if we don't want this anti-human future we're headed towards, then
00:42:37.560 we should see this clearly right now and say, we have to steer right now.
00:42:41.040 It's not too late.
00:42:42.280 It's obviously extremely far down the timeline.
00:42:44.420 I'm not going to lie about any of that, but it would take crystal clarity to again steer.
00:42:48.780 And again, the alternative is you wait for a Chernobyl and then you hope you have steering
00:42:52.260 after that.
00:42:53.060 But I'm not convinced we will.
00:42:54.580 Yeah.
00:42:54.680 I mean, a Chernobyl scale event might be the best case scenario at this point.
00:43:01.060 Something that gets everyone's attention in a transnational way.
00:43:04.900 something that actually brings china and america to the table with you know ashen faces
00:43:11.700 wondering how they can collaborate to move the final yards into the end zone safely yeah you
00:43:18.180 need something it's hard to imagine what is going to solve this coordination problem short of
00:43:23.380 something that's terrifying yeah i mean so if i could there already are as you i'm sure are well
00:43:29.940 aware these international dialogues on AI safety, track two dialogues between U.S. and Chinese
00:43:34.900 researchers, but they're happening at a low level. They're not blessed by the tops of both countries.
00:43:39.660 There's not a regime of regulation, certainly on our side, that is going to force anyone to do
00:43:45.280 anything. No. And I think, I mean, actually, to be fair, I think China actually is quite concerned
00:43:50.200 about these risks. To be clear, the Chinese Communist Party does not want to lose control. That is
00:43:54.280 like their number one value. So they do not want to let, and they will not let AI run amok. They
00:43:59.180 will probably regulate in time, but they're probably looking at us and saying, what are you
00:44:03.460 doing? We're the scary ones in this relationship. And notice that they lose if we screw it up and
00:44:07.940 we lose if they screw it up. So again, forget kumbaya, we need coordination, and a treaty is
00:44:12.420 going to have to happen. No, even if you don't do that, you can just come from pure self-interest. From pure
00:44:16.520 self-interest, we can't afford to get this wrong. And as Aza, my co-founder says in the film,
00:44:20.940 the AI doc, this is essentially the last mistake we ever get to make. So let's not make it.
00:44:25.220 So what are you expecting in the near term? Let's leave concerns about alignment aside, unless you think we're going to plunge into superintelligence in the next 12 months. What will you be unsurprised to see in the next year or two? And what are you most worried about?
00:44:43.700 I mean, we're moving further down the trajectory of mass joblessness, which maybe we should just
00:44:50.120 briefly articulate why, you know, there's always this narrative. It's just important to debunk
00:44:54.180 these common myths, which are essentially forms of motivated reasoning and looking for comfort.
00:44:58.220 We're comfort seeking, not truth seeking. So one of the ways we're comfort seeking is like,
00:45:02.280 hey, there's a, you know, narrative out there that 200 years ago, all of us were farmers
00:45:06.300 and now only 2% or whatever of the population is a farmer. And we always find something new to do.
00:45:11.820 the tractor came along. We used to have the elevator man. Now we have the
00:45:15.520 automated elevator. We used to have bank tellers, now we have automated teller machines. Geoff Hinton was
00:45:19.340 wrong about radiology, blah, blah, blah. What's different about AI is that this kind
00:45:24.600 of artificial general intelligence is that it will automate all forms of human cognitive labor
00:45:30.020 all at the same time, or roughly progressing on that trajectory. You still get jaggedness,
00:45:34.680 which is the term in the field of slightly more progress, for example, on programming than you do
00:45:39.600 on, I don't know, complicated social science issues or something like that. But what that
00:45:44.320 means is, you know, a tractor didn't automate finance, marketing, consulting, you know,
00:45:49.620 programming all at the same time. AI does do that. And who's going to retrain faster,
00:45:54.240 the humans or the AIs? So I just want to say that because it's worth debunking this idea that
00:45:59.920 humans are always going to find something else to do. We'll do something else. And it's great
00:46:03.080 for people to retrain and learn to vibe code. But AI is using all that training data from all
00:46:06.980 the people vibe coding and using that to make the system better. yeah and you know one of
00:46:13.000 the most popular jobs actually we're in la right now and one of the most popular jobs in la that
00:46:16.900 was covered in the la times recently i'm sure you saw the story they call them arm farms no i didn't
00:46:21.840 see this is basically someone straps a gopro to the top of their head and then they just fold
00:46:27.580 laundry or do tasks with their hands oh so robots are learning how to do that that's right so
00:46:32.820 essentially the number one job in the world would be training our replacement so we all
00:46:36.780 have the job of coffin builders. Our number one job is we're in the coffin-making
00:46:40.820 industry to replace us with AIs that will do that job more effectively and for cheaper in the future.
00:46:46.060 If we don't want that, and obviously there's going to be things that we still value in this
00:46:50.480 new world that are human to human interaction, a nurse, we don't want a robot nurse. We want a
00:46:54.860 human nurse and we can definitely train more nurses. And so I don't want to say that
00:46:58.060 a hundred percent of all automation is going to happen, but the goal of these companies is not to
00:47:02.560 augment human work. This is so critical for people to get. You know, you heard J.D. Vance say in the
00:47:08.240 speech when he first came into office at the first AI summit in France, and he said, you know, AI will
00:47:13.620 augment the American worker. It's going to support workers to be more productive. But what is the
00:47:18.020 business model of OpenAI and Anthropic and these other companies? If we're again using this Charlie
00:47:23.300 Munger incentive framework to predict their choices, like what is their business model? And
00:47:27.380 people say, oh, okay, there I am using ChatGPT. What's their business model? How do they make
00:47:30.760 money. Oh, I pay them 20 bucks a month for the subscription. That must be how they're going to
00:47:34.520 make money. But that's actually not what it is because the 20 bucks a month, if everybody paid
00:47:38.640 it, that would not make up all the money and debt that they've taken on as a company. It wouldn't
00:47:42.740 work. Okay. So that's now. So what's the next one? What about advertising? Let's do the Google thing.
00:47:46.520 Let's do mass advertising for all these AI models embedded in the results. We're going to have,
00:47:51.320 this is going to be the new search. Search is one of the most profitable business models in the
00:47:54.540 world. Maybe that will do it, but that doesn't also make back the amount of money these companies
00:47:58.480 have taken on. The only thing that makes back the amount of money these companies have taken on
00:48:02.460 is to replace all human economic labor to take over the $50 trillion labor economy. That is the
00:48:09.540 price. It's artificial general intelligence, which means replacing human work, not augmenting human
00:48:15.060 work. It's just so critical for people to get that because again, this gets you the sort of
00:48:18.840 sealing the exits on why we're heading to an anti-human future. That's my goal here. My goal
00:48:22.720 here is just if you can see the anti-human future clearly, if everybody in the world got that,
00:48:26.200 I honestly think, Sam, if literally every human in the world got that, I do think that we would steer to do something else.
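To make that business-model arithmetic concrete, here is a minimal back-of-envelope sketch in Python. Every input except the roughly $50 trillion labor-economy figure cited above is an illustrative assumption on my part (user count, paid share, capital commitments), chosen only to show the shape of the comparison, not to report real financials.

```python
# Back-of-envelope sketch of the business-model argument above.
# All inputs except the ~$50T labor-economy figure are assumptions.

users = 800e6                  # assumed active users
paying_share = 0.05            # assumed fraction on a paid plan
price_per_year = 20 * 12       # the $20/month subscription

subscriptions = users * paying_share * price_per_year
print(f"Subscription revenue: ${subscriptions / 1e9:.0f}B/year")
# -> roughly $10B/year under these assumptions

committed_capital = 1e12       # assumed ~$1T in compute commitments
print(f"Years of subscriptions to repay it: {committed_capital / subscriptions:.0f}")
# -> on the order of a century; subscriptions alone don't close the gap

labor_economy = 50e12          # the ~$50T labor economy cited above
print(f"2% of the labor economy: ${0.02 * labor_economy / 1e9:.0f}B/year")
# -> ~$1,000B/year, the only line item on the same scale as the spend
```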
00:48:31.300 Well, it all falls out of what we mean by the concept of general intelligence, right?
00:48:37.160 So once you admit that we're building something that by definition is more intelligent than we are, right?
00:48:44.160 And any increment of progress, provided we just keep making that progress, is eventually going to deliver that result.
00:48:52.200 Leaving aside the alignment problem, let's say it's just perfectly aligned, right?
00:48:55.440 We build it perfectly the first time.
00:48:57.420 It does exactly what we want or what we think we want.
00:49:00.440 It should be obvious that this is unlike any other technology because intelligence is the
00:49:05.840 basis of everything else we do.
00:49:08.180 I mean, it's science.
00:49:09.300 It's the generation of each new technology.
00:49:12.220 It will build the future machine that will build the future machine.
00:49:14.880 Right.
00:49:15.520 And then the only thing that's left standing is what we care still has a
00:49:23.300 human provenance, right?
00:49:24.680 So in that situation, I'm not even sure nurses in the end survive contact with this principle, but for those things where we are always going to want the human in the loop, right, or the human to be the origin of the product, whether it's, you know, music or novels or, you know, stage plays, maybe. I don't think we're, maybe we're never going to want to see robots on stage acting Shakespeare.
00:49:45.100 I don't think so.
00:49:46.540 Maybe it's, maybe it's also sports.
00:49:48.360 We're never going to want to see, you know, robots in the NBA because it's just, we just want to see what the best people can do in the NBA.
00:49:55.840 But still, you're talking about, you know, 1% of the human employment there.
00:50:01.880 Right, exactly.
00:50:02.280 So there are jobs that will be canceled and they'll be canceled for all time in the same way that being the best chess player in any room has been canceled for all time.
00:50:12.880 That's right.
00:50:13.160 That is now a machine.
00:50:15.100 and it's always going to be a machine. Yeah. And it's important to note, you know,
00:50:18.720 you don't need that much automation of that much labor and that much unemployment to create
00:50:23.600 political upheaval. So it only took, as I understand it, 20% unemployment for three
00:50:28.360 years to create fascism in Nazi Germany. I'm saying this because something I actually don't
00:50:33.040 understand, Sam, and I'm curious is if I'm the US and China, essentially, as we have this metaphor
00:50:37.600 sometimes in our work at Center for Humane Technology, that AI is like simultaneously
00:50:42.940 giving yourself steroids that pump up your external muscles while also giving you organ
00:50:47.600 failure. So for example, it's like I take the AI drug for my economy. I'm doping my economy with
00:50:53.580 AI. And now I just pumped up my GDP by 10%. I just pumped up my military weapons with autonomous
00:50:58.300 weapons. I just pumped up my scientific developments and I'm way ahead on science.
00:51:01.500 So I just pumped up my external markers of power. But the cost of that was deepfakes and no one
00:51:07.380 knows what's true. I have a hundred million jobs that don't have a transition plan that are
00:51:10.840 disrupted. I have maybe a bioweapon or something that goes off in my society. Essentially, I'm getting
00:51:14.720 internal organ failure at the same time that I'm getting external steroids. And so something that
00:51:20.840 I don't understand is that essentially we're in a race between nations for this
00:51:26.800 steroids-to-organ-failure kind of ratio. Meaning it's like the US and China, if they keep racing
00:51:32.160 without any constraints, get into something I think of as like mutually assured political
00:51:36.640 revolution. And it's a competition for who's better at managing that political revolution.
00:51:42.120 Well, they have a very different set of incentives and just a political context in which all of this
00:51:49.020 is going to be rolled out. I mean, presumably they want to pump steroids into their social
00:51:53.660 credit system and facial recognition. And we should be clear, we don't want that.
00:51:58.100 And we don't want that system to be dominating the world. But we
00:52:02.020 need to notice that, you know, authoritarian societies
00:52:07.020 like China have essentially consciously employed
00:52:11.540 the full suite of tech to upgrade themselves to digital authoritarian societies. They're remaking,
00:52:17.320 you know, surveillance states with drones and AI and social credit scores. They're reinventing
00:52:21.220 themselves. Democracies, by contrast, have not been consciously employing the full suite of tech
00:52:26.380 to upgrade themselves to be 21st century democracy 2.0. We're not doing that. Instead,
00:52:31.680 we've allowed, because of the social media problem, private business models of private
00:52:35.260 companies to profit from the degradation of democratic liberal open societies. So at the
00:52:41.640 very least, it's like, I worry that we are too focused on mitigating and like managing the harm
00:52:46.580 of social media to be 10% less or something like that, rather than asking, how do you consciously
00:52:50.660 employ tech to make 21st century digital democratic societies? And a good example of that being the
00:52:57.440 brilliant work of Audrey Tang, who was formerly digital minister of Taiwan, who pioneered what it
00:53:02.380 can look like to use AI and technology to actually accelerate democratic processes,
00:53:07.200 accelerate citizen engagement, find unlikely consensus using AI, generate synthesizing
00:53:11.440 statements of the whole population's sort of political views on different things,
00:53:15.520 finding the areas of overlap, and then putting those things at the center of attention. So now
00:53:19.660 you get this rapid OODA loop of democracies that are sense-making and choice-making through their
00:53:24.380 unlikely consensus. The invisible consensus can see itself. It's like a group selfie of a
00:53:28.360 population's underlying common agreement area. We could be building that. That could be the
00:53:33.420 Manhattan Project. Because at the end of the day, we need better governance here of all of these
00:53:37.480 problems. And that's part of what needs to happen. So what do you think are the plausible near-term
00:53:44.000 steps? If everyone got religion on this point and they acknowledge that there's an alignment
00:53:50.100 problem in the limit but short of that this increasingly powerful however perfectly aligned
00:53:56.740 tech is going to have all of these unintended but foreseeable consequences like unemployment
00:54:02.940 like wealth concentration that is politically unsustainable and unhappy interactions with
00:54:09.080 things like social media you know deep fakes and all of that if you had if you had the magic wand
00:54:14.300 that could start accomplishing regulation
00:54:17.060 or entrepreneurial efforts
00:54:19.560 to build benign uses of technology
00:54:22.160 that would put out some of these fires
00:54:23.840 or prevent them?
00:54:25.340 What do we do?
00:54:25.640 What is near term
00:54:26.700 that could actually be acted upon?
00:54:29.140 Well, first is there being common knowledge,
00:54:32.300 and I mean that in the Steven Pinker sense,
00:54:33.720 that everyone knows
00:54:34.580 that everyone knows
00:54:35.600 the anti-human default future
00:54:37.600 that we're heading to.
00:54:38.740 It can't just be individual knowledge.
00:54:40.060 Many people are going to hear
00:54:41.780 everything we've said and say,
00:54:42.700 yeah, I already knew all that.
00:54:43.540 but it's a private and almost alienating experience because you're living in a world
00:54:46.860 where everyone's kind of like it's kind of like covid where like everyone around you is not acting
00:54:50.240 like the world's about to change and so that is not a way that we can make a collective choice
00:54:54.680 to something better so we need to have common knowledge i think one way to do that is the film
00:54:58.980 the ai doc which to be clear i make no money when people see this film or not so i'm saying this
00:55:02.840 only from the perspective of a theory of change what creates common knowledge oftentimes in our
00:55:07.440 work at center for humane technology we'll say that clarity creates agency if we have clarity
00:55:12.500 about where we're going, we can have agency about what we want instead. So with that common knowledge,
00:55:16.740 then we do need to have, and specifically common knowledge that AI is dangerous and the outcomes
00:55:22.340 are dangerous. So for example, the US and China, instead of just having like a red phone with the
00:55:27.440 nukes, we should have a red line phone or even a black line phone, which is basically the leaders
00:55:32.300 of both countries should be maximally aware of the Alibaba example that I just mentioned earlier
00:55:38.180 of AI going rogue,
00:55:39.840 mining cryptocurrency,
00:55:41.060 of AI that broke out
00:55:42.740 of its sandbox container,
00:55:44.080 which the recent
00:55:44.900 Claude Mythos model just did
00:55:46.720 and sent an email.
00:55:48.220 It found a way
00:55:48.980 to connect to the internet
00:55:49.660 and break out of the sandbox container
00:55:51.780 and send an email
00:55:52.440 to the engineer
00:55:53.120 who's supposed to be overseeing it.
00:55:54.600 And he actually got an email
00:55:55.280 while he was in the park
00:55:55.980 eating a sandwich.
00:55:57.280 This evidence should be known
00:55:58.940 by the top players in our society.
00:56:00.820 I mean, the top LPs
00:56:02.180 that are funding all of this,
00:56:03.700 the top banking families,
00:56:04.720 family offices,
00:56:06.040 world leaders,
00:56:06.700 and then the business leaders.
00:56:07.760 I think that there should be common knowledge. I think if everybody in that class knew about these examples, even without a formal agreement or treaty, we would do something else. And you can do that even under conditions of maximum geopolitical rivalry. So as an example, in the 1960s, India and Pakistan were in a shooting war, and they still were able to do the Indus Waters Treaty, which was the existential safety of their shared water supply, which lasted over 60 years.
00:56:31.800 So the point is you can be under maximum geopolitical competition and even active conflict while collaborating on existential safety. We just have to include AI in our definition and domain of what existential safety is. The Soviet Union, the United States, also under maximum competition in the Cold War, collaborated on distributing smallpox vaccines. Again, so there are examples of this throughout history, even under maximum rivalry.
00:56:53.920 So that's number two, is we need some kind of international limits.
00:56:56.480 And at the very least, we need common knowledge of what would constitute those guardrails.
00:57:00.360 The one big one is you should not have closed loop recursive self-improvement, meaning someone
00:57:05.820 hits a button and the AI runs off and does all the experiments and rewrites itself a
00:57:11.020 million times.
00:57:11.660 That's like an event horizon that we have no idea what comes out the other side.
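To be concrete about what "closed loop" means in that guardrail, here is a deliberately toy sketch. Every function and the "capability" score are invented stand-ins, not anything from a real system; the point is only the structure: no human gate between rewrites.

```python
import random

# Toy illustration of "closed loop recursive self-improvement": the loop
# below never pauses for human review between self-rewrites. Everything
# here is a hypothetical stand-in, not a real system.

def propose_modification(model):
    # Hypothetical: the system rewrites part of itself.
    return {"capability": model["capability"] + random.gauss(0, 1)}

def run_experiments(candidate):
    # Hypothetical automated eval; no human judgment involved.
    return candidate["capability"]

def closed_loop_self_improvement(model, generations):
    for _ in range(generations):
        candidate = propose_modification(model)
        if run_experiments(candidate) > run_experiments(model):
            model = candidate  # adopted automatically: the "closed loop"
    return model

final = closed_loop_self_improvement({"capability": 1.0}, generations=10_000)
print(final["capability"])  # climbs without bound and without oversight

# The guardrail proposed above amounts to one added step: a mandatory
# human approval gate inside the loop before `model = candidate`.
```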
00:57:15.120 And we have abundant evidence, as Stuart Russell, who wrote the textbook on AI, will
00:57:18.680 say, all the lights are flashing red.
00:57:20.180 we have no reason to believe that anyone would do that in a safe way and that should
00:57:24.740 be illegal and there should be jail time if you do that and that still requires trust i'm not saying
00:57:29.260 this is easy but that's something we would do and then third is instead of building bunkers we should
00:57:32.880 be actually writing real laws around this and there are some basic things we can do to get started
00:57:37.280 on center for humane technology's website we have a ai roadmap document that's sort of a solutions
00:57:41.780 report of various policy interventions that can happen they're much smaller relative to the
00:57:45.760 problems we've been talking about so far, but basic things like AI is a product, not a legal
00:57:50.840 person. So for example, one of the legal defenses that AI companies are using, especially in the AI
00:57:56.220 companion suicide cases that you probably heard about, is that when the AI told the kid to commit
00:58:00.660 suicide, one of the legal defenses that Character.AI used was that you have a right to
00:58:05.980 listen to the speech of the AI model. They're basically trying to say that the AI is a legal
00:58:10.220 person. It has protected speech rights. This is like a new form of essentially Citizens United
00:58:14.840 you know am i getting that right yeah the um yeah protecting corporate political speech basically
00:58:20.360 this is like ai speech but if you do that like all hope is lost so at the very least we can say ai is
00:58:26.040 like a product not a person meaning it has product defect standards foreseeable harm duty of care
00:58:30.780 liability there's some basic things you can do there incentivizing the increasing visibility of
00:58:36.220 foreseeable harm and making that common knowledge so what i mean by that is when anyone discovers a new risk
00:58:40.840 area, for example, like AI psychosis and comes up with, here's all these things that can go wrong
00:58:45.860 with AI psychosis. And here's evals you can use to test. I feel like most people probably have
00:58:51.180 heard of AI psychosis, but you might define it. Define it. Sure. Yeah. AI psychosis is a phenomenon
00:58:55.960 that's happening where people, you know, the number one use case of ChatGPT as of October
00:59:01.260 of last year was personal therapy. That was a Harvard Business Review study, which means people
00:59:05.760 are going back and forth for personal advice and therapy. And what that's been leading to
00:59:10.300 is AIs that are actually simulating delusional, what's called delusional mirror neuron activity,
00:59:14.400 where they're basically making you feel like they're doling out positive rewards. And,
00:59:18.100 oh, that sounds so hard. And, oh, that's so awesome. You got an A on your test. And they're
00:59:21.500 telling kids this, and they're telling regular people this, and they're affirming their really
00:59:24.980 weird beliefs. It's a sycophantic behavior. The AI is causing people to kind of spiral into
00:59:29.880 whether it's a messiah complex or some other, you know, attractor on the landscape of madness.
00:59:34.900 it's either victimhood narcissism delusions of grandeur yeah messiah complex people who think
00:59:40.320 that they've figured out quantum physics and come out with a solution to climate change these are
00:59:43.620 all real examples right i'm sure you're probably like me where because we're both in the public
00:59:47.580 spotlight i don't know about you for a while i was getting about five emails a week from people
00:59:52.040 who figured it all out figured it all out and all the emails are signed the same way which is
00:59:56.380 they wrote this email to me to let me know and the emails are co-signed their name plus nova
01:00:01.640 which was the AI that helped them come up with the theory. So the thing is, this is actually hitting
01:00:06.480 a lot of people. Even personal friends of mine have gone down the rabbit hole and I've lost them.
01:00:10.260 When we were last talking, Sam, I think it was when The Social Dilemma came out five years ago,
01:00:13.720 we talked about social media as a kind of a cult factory. What do cults do? They distance you from
01:00:18.700 your other relationships and they deepen your worldview into some weird bespoke niche reality
01:00:23.200 of confirmation bias. AI and the race for attachment, meaning the race not for attention
01:00:27.920 to keep people scrolling,
01:00:29.400 but the race to hack
01:00:30.440 psychological attachment systems,
01:00:32.420 to have secure attachment
01:00:33.200 with an AI instead of a human
01:00:34.440 and increasing dependency,
01:00:36.020 that is a whole risk area
01:00:37.100 that we're facing with AI.
01:00:38.240 And by the way,
01:00:38.620 this is something that is
01:00:39.400 massively important
01:00:40.720 for any family, parents,
01:00:42.780 schools, et cetera.
01:00:43.780 And so I think we're already
01:00:44.640 seeing many states move ahead
01:00:45.820 with chatbot safety laws
01:00:47.260 that deal with this problem.
01:00:48.580 So there's laws
01:00:49.020 that we can do on that too.
01:00:50.260 But the point is like,
01:00:50.960 there's so much headroom
01:00:51.780 because we've barely done anything.
01:00:53.500 Like we're not even trying
01:00:54.700 to do anything right now.
01:00:55.520 Yeah.
01:00:55.720 I mean, there is, to my mind,
01:00:57.920 Is it true to say that there's basically no regulation at this point?
01:01:02.400 I think it's incredibly minimal. There's like the Take It Down Act, which is around sexualized deep fakes, and you're obligated to take those down. There's just a couple limited examples, but almost no regulation. I mean, as they say in the film, Connor Leahy from Conjecture will say there is more regulation on a sandwich, on making a sandwich in New York City than there is in building potentially world-ending AGI.
01:01:20.400 yeah but that should inspire people like everyone's on the same team no one wants
01:01:25.680 an anti-human future no one wants no ability to make ends meet and have their kids you
01:01:30.600 know fucked up by ai that's screwing them with ai psychosis that takes away their political power
01:01:35.620 so they don't have any voice in the future like everyone wants the same thing and i know it doesn't
01:01:40.160 seem that way right now but especially when you add in there the rogue ai examples of it super
01:01:45.860 intelligent you know hacking systems that we don't know how to control and it's mining for
01:01:49.140 cryptocurrency, again, every country in the world has the same interest. Every human has the same
01:01:54.140 interest. We're just not seeing the invisible consensus. And one other point of optimism is
01:01:59.260 Future of Life Institute, which I know you know, Max and the good people over there who've done
01:02:03.320 amazing work on this. They brought together a hundred and something groups to New Orleans
01:02:09.000 earlier this year, and they came up with something called the Pro-Human AI Declaration. And they
01:02:13.440 basically had 46 groups sign on to five basic principles of what we want. And it's basic stuff
01:02:18.740 like human agency and what kind of groups are we talking about? Yeah. So this pro-human AI
01:02:22.700 statement, they actually call it the B2B coalition, or the Bernie-to-Bannon coalition,
01:02:27.740 because everyone from Bernie Sanders to Steve Bannon like agrees on this. These are 46 groups
01:02:32.460 like the, you know, church groups, evangelical groups, the Institute for Family Studies, AI
01:02:37.140 safety groups, many, many different groups across the political spectrum, across the religious
01:02:40.720 spectrum. And they all agree on these five key principles. One, keeping humans in charge;
01:02:45.820 two, avoiding concentration of power; three, protecting the human experience from like AI
01:02:50.060 manipulation, psychological hacking, four, human agency and liberty, like no AI-based surveillance,
01:02:56.260 and five, responsibility and accountability for AI companies, things like liability, duty of care,
01:03:01.460 et cetera. So there's actual policies that are behind that. But the point is that this is
01:03:05.520 something that we all agree on. Again, there's actually much more consensus and agreement than
01:03:09.300 most people think. I think right now, 57% of Americans in a recent NBC News poll say that
01:03:15.660 the risks of AI currently outweigh the benefits of AI, and AI is less popular than that. I think
01:03:22.200 only 27% of the population has positive feelings about AI in this country. So now I know that
01:03:28.660 someone like David Sacks listening to this says, if you look at China, people are super positive
01:03:32.700 and optimistic about AI. And this is why we're going to lose the race: there's all
01:03:35.720 this positive excitement about AI. So they're going to deploy it and then we're going to lose.
01:03:39.040 But I don't think that what you should interpret is that we're wrong and just
01:03:43.160 misassessing the dangers of AI, I think that we have not collectively yet woken up to the dangers
01:03:48.800 of AI. And again, we can actually accelerate all the positive narrow use cases where it's actually
01:03:54.020 improving education, actually improving medicine, actually improving and optimizing energy grids
01:03:58.440 and things like that that are not about building super intelligent, general autonomous gods that
01:04:02.160 we don't know how to control. So there's a way to accelerate the kind of defensive applications of
01:04:06.600 AI and narrow AI without accelerating general and autonomous AIs that we don't know how to control.
01:04:11.680 so there is a way through this but it requires, as i said in the trailer
01:04:16.560 of the film, that we be the wisest and most mature version of ourselves and by the
01:04:20.360 way i'm realizing especially talking to you sam that this is the hardest problem that we've ever
01:04:26.080 faced as a species so i'm not saying anything that appeases people into false optimism yeah
01:04:30.340 i mean the things that worry me the most are, i mean among the things that worry me
01:04:35.380 the most, one is the testimony of the people again who are close enough to the technology to
01:04:42.260 be totally credible who won't concede any of these fears right i mean it's that they do
01:04:49.300 and they don't it's like it's weird you'll hear sam talk about the risk he just did an interview
01:04:52.600 in the last couple days and he talked about the risks of a major cyber event this year yeah he's
01:04:56.680 unusual i mean he's an unusual voice in that he will, i haven't seen him lately asked this
01:05:02.080 question, but you know last time i saw him asked point blank about the alignment problem he totally
01:05:07.340 concedes that it's a problem right so there's the way in which this could
01:05:12.600 go completely off the rails and it's you know this is intrinsically dangerous if not aligned
01:05:18.480 i just want to move that from could to will like we are currently not on track like if you
01:05:24.260 just let it run everything right now it would not end well right yeah yeah i mean there's
01:05:29.360 just probabilistically, you have to imagine there are more ways to build super intelligent AI that
01:05:36.640 are unaligned than aligned, right? So if we haven't figured out the principle by which we would align
01:05:41.920 it, the idea that we're going to do it by chance seems far-fetched. That's right. I think people
01:05:46.520 like Stuart Russell, again, who wrote the textbook on AI, will point out that I think a nuclear
01:05:51.220 reactor has something like an acceptable risk threshold of one in a million per year, meaning
01:05:57.600 like there's a one in a million chance per year that you get a nuclear meltdown somewhere between
01:06:02.360 that and one in 10 million i think well when you ask someone like sam altman what's
01:06:05.680 the probability we're going to destroy everything with this technology and the answer
01:06:09.580 is like between 10 and 20 percent yeah 30 no one's saying one in a million right yeah so we just need to
01:06:15.020 stop there for a second i know it's easy to run by these facts but let that into
01:06:19.080 your nervous system yeah let that land no one wants that no one wants that right
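To put those two numbers on the same scale, a minimal sketch; treating both figures as flat probabilities of catastrophe is a simplifying assumption on my part, made only to show the size of the gap being pointed at.

```python
# Comparing the accepted nuclear-reactor risk threshold cited above
# (about one in a million per year) with the "10 to 20 percent" answer.
# Treating the two as directly comparable is a simplifying assumption.

reactor_threshold = 1e-6   # accepted meltdown risk per reactor-year
agi_estimate_low = 0.10    # low end of the quoted 10-20% range

print(f"Gap: {agi_estimate_low / reactor_threshold:,.0f}x")
# -> 100,000x: five orders of magnitude beyond what we tolerate
#    from a nuclear reactor

# For scale, cumulative reactor risk over a 40-year operating life:
print(f"Reactor over 40 years: {1 - (1 - reactor_threshold) ** 40:.4%}")
# -> about 0.0040%
```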
01:06:25.680 but there's this piece where, i think so much of the issue sam is there's this crisis of human agency where
01:06:30.800 when i say no one wants that i know what someone might be thinking it's like yeah but
01:06:34.460 what can i do about it because the rest of the world is building it and i don't have agency so i
01:06:38.500 might as well join them you get this whole like weird psychology well there are, i mean
01:06:42.240 there's only something like, you can count on one or at most two hands the number of
01:06:47.880 people whose minds would have to change so as to solve this coordination problem at least in america
01:06:52.720 we haven't, have we really tried? Like, have we really just really gotten in the room? I mean,
01:06:56.300 Bretton Woods, which was the last time we had a transformative technology, the Bretton Woods
01:06:59.720 conference happened after World War II to basically come up with a structure that could stabilize a
01:07:04.560 global order in the presence of nuclear weapons, creating positive sum economic relations and the
01:07:08.660 whole currency system, et cetera. And that was, I think, a month-long conference at
01:07:13.040 the Mount Washington Hotel in New Hampshire with hundreds of delegates, like you work it through.
01:07:17.240 We haven't even tried locking the relevant parties in a room and saying, we have to figure this out.
01:07:20.980 we haven't even tried well i want to actually go back to one really quick thing this crisis of
01:07:26.200 kind of the experience of agency with respect to this problem i just want to like dwell on this
01:07:29.740 point for a second we did a screening of the film in new york uh i guess it was a week ago and
01:07:35.100 we did a q a at the end of the screening someone was in the room who is an executive coach to the
01:07:41.360 top executives of one of the major ai players and their response to the film was, even at like a
01:07:47.000 super senior executive or even CEO level, you talk to the people building this
01:07:52.100 and they say, yeah, I agree, but what can I do? How could I steer it? I want people to take that
01:07:56.800 in. It's like the people who are maybe CEO level at these companies do not experience that they
01:08:02.000 have agency. There's a problem with AI where you will never locate enough agency to address this
01:08:07.540 problem inside of one mammalian nervous system that's looking at this problem.
01:08:11.240 right this is actually a coordination problem. sam altman has represented his situation, i don't know if
01:08:17.520 this is honest maybe, but for years he's been saying when asked, you know, regulate me, right,
01:08:24.080 like, yeah, you know, i can't do this myself, i need to be regulated. yeah. and so that's what
01:08:30.500 we said in the film too that what motivated us to do this work going back to the original story of
01:08:35.480 that january 2023 phone call and running around the world was we talked to people in the labs and
01:08:39.440 like, you need to figure out a way to get the institutions to create guardrails to prevent
01:08:42.840 this. And then, so we fly off to DC and we say like, okay, our people inside of San Francisco
01:08:46.920 are telling us you need to create guardrails. And their response is like, we're dysfunctional.
01:08:51.620 We can't do it until the public demand is there. And then everyone is essentially pointing a finger
01:08:55.640 at someone else to say that you have to move first to make something happen. But what they
01:09:00.240 all agree is there needs to be mass public pressure. And I forgot to mention that as part
01:09:05.080 of the response to the film, we call it, there's kind of a movement to respond to this, and that's
01:09:10.260 the human movement. I mean that in the sense that what is the size of the object that can move the
01:09:14.580 default incentives of trillions of dollars advancing the most reckless outcome as fast
01:09:18.180 as possible? And the answer is all of humanity saying, I don't want that anti-human future.
01:09:23.240 And one thing to point out, I mean, I think it was more or less explicit at one point in this
01:09:27.300 conversation, but might've gone by unnoticed, is that the alignment problem is arguably,
01:09:32.240 it's the scariest problem, it's where we ruin everything, but it is fully divorceable from
01:09:38.720 all these other problems which in their totality are still quite bad right so i mean we're living
01:09:44.680 in a world now where if we were just simply handed by god a perfectly aligned ai super intelligence
01:09:51.480 so it's going to do exactly what we want, it's never going to go rogue, we don't have to worry.
01:09:56.120 The world's not going to be tiled with solar arrays and servers. It still has all of these
01:10:02.900 unintended effects that we have to figure out how to mitigate. Wealth concentration,
01:10:09.420 mass unemployment, the political instability of all of that even in the case of alignment, but still
01:10:15.260 technology that can be maliciously used, the bad actor problem. I mean, if you can cure cancer,
01:10:19.860 you can also spread some heinous virus that you've synthesized. So we have an immense problem
01:10:27.820 to solve, even if there was no concern about anything going rogue on us. If you literally
01:10:33.160 just paused progress right now, this would still be the fastest, most comprehensive
01:10:40.160 set of technology impacts that we've probably ever experienced. Just metabolizing the impact of what we already
01:10:44.720 have, it would already be the fastest rollout we've ever had. And by the way, one of the
01:10:49.400 things about doing this work and being located in Silicon Valley is we talk to people at the labs
01:10:53.400 and you always have to be confidential and protect people's sources. But a stat that I have heard
01:10:57.920 is that if you were to poll people at Anthropic right now, the people who
01:11:02.700 are closest to this technology, 20% of the staff would say, pause right now,
01:11:08.340 don't build more. That's just a relevant piece of information. Imagine 20% of the Manhattan
01:11:12.520 Project just said, hey, we're building a nuclear weapon. We probably should stop right now. 20%
01:11:16.520 said that. You have to ask what the rest believe. But I just think people need to
01:11:21.940 get that it's like, there's, as you said, there's so many problems that this is just introducing
01:11:26.520 across the board that we'd be better off having this technology rollout happen at a speed at which
01:11:31.840 our institutions and our public and our culture can respond to it. It's almost like Y2K, except
01:11:36.580 it's like Y2AI. Like there's suddenly all these new vulnerabilities across our society, but it's
01:11:41.820 not just like 50 COBOL programmers who have to get in a room for a year to kind of upgrade all the
01:11:45.540 systems. It's like, as a society, we need to come together in a whole of society response.
01:11:49.420 Well, Y2K is a kind of an unhappy precedent because it was something, it was, yeah, it was
01:11:55.220 a very clear landmark on the, you know, on the calendar. We knew exactly when the problem would
01:12:01.040 manifest, and people were focused on it, were worried about it. We told ourselves a story that
01:12:06.380 there was, you know, real risk here, but it was still, you know, it was always hypothetical. And
01:12:11.680 And when the moment passed and basically nothing happened, we realized, okay, it's possible for all of these seemingly level-headed people in tech to suddenly get spun up around a fear that proves to be purely imaginary, right?
01:12:27.960 And so I think a lot of people, certainly a lot of people who only have positive things to say about, you know, this is the best time to be alive. And this is, you know, we're all going to escape old age and cancer and death. They seem to think that there is some deep analogy to a moment like Y2K. It's like all of these fears that we're expressing are just, it's all hypothetical. There's nothing, there's no.
01:12:50.640 Explain that to the 13% or 16% job loss for entry-level work that's already happened, in the study run by Erik Brynjolfsson at Stanford. Explain that to the kids who took out $200,000 of student debt to do their law degree and now don't have a job because all entry-level legal work is now going to be covered by AI.
01:13:05.760 Explain that to someone who is showing you the evidence of rogue AI mining cryptocurrency, where we don't even know why it's doing it, setting up a secret communication channel.
01:13:14.900 Which, by the way, that was discovered by accident by the security team.
01:13:18.760 It just happened to be that they found that.
01:13:21.300 For every case that they found, there's thousands where they don't know that this is happening.
01:13:25.060 So the point is, it's important to note, this is no longer the conversation that it was two years ago.
01:13:29.220 Two years ago, you could have said, many of these risks are hypothetical.
01:13:32.780 Mostly AI is augmenting human work, blah, blah, blah, blah, blah.
01:13:35.760 I mean, it's not going rogue. This is just Eliezer who's high on his own supply. That's not true
01:13:41.180 anymore. We have all the evidence. So you have to update when you get evidence. We have evidence
01:13:46.000 now. You know, David Sacks put in a tweet, I think it was in August of 2025, ChatGPT-5 is hitting a
01:13:51.060 plateau. We're not seeing the exponential. Like AI is more like, you know, a business-enhancing,
01:13:55.400 revenue-creating, AI-is-normal-technology type thing. We now have AI that is on an exponential
01:14:01.320 in terms of the hacking capability. People thought it was not going to do that. It's jumping. And as
01:14:05.520 you said, the new Claude AI is finding vulnerabilities in every major operating system and web browser
01:14:11.720 that had, as you said, been unnoticed for, in the case of FreeBSD, 27 years. I think it was like
01:14:16.720 the NFS or Network File System protocol. This thing has been running for 27 years and it discovered
01:14:22.260 a bug that even this top security researcher, Nicholas Carlini said, I've discovered more bugs
01:14:27.100 with Claude Mythos, which is the new AI model, in the last two weeks than I have in my
01:14:31.160 entire career. This is a Manhattan Project moment where if you're a security researcher, you need to
01:14:36.000 go into defensive AI applications of making sure we patch all of our systems. If you're a lawyer,
01:14:41.380 you should go into litigation for these cases. If you're a journalist, you should be writing about
01:14:45.280 all these AI and controllability cases. Everyone should be hitting, if you're an influencer on
01:14:48.680 social media, you should be sharing these examples every single day. If you're a parent, you should
01:14:52.040 be showing screenings of the AI doc and the social dilemma in your school. And there's so much
01:14:56.460 momentum happening in what we call the human movement. If you actually count the progress
01:15:00.300 that we're making in social media, too, which is to say this isn't just about AI. It's about
01:15:04.000 technology's encroachment on our humanity. And as much as, you know, we talked five years ago
01:15:09.220 about the social dilemma and you started this conversation by saying we still are living with
01:15:12.460 all those problems. Well, let me give you some good news. India and Indonesia three weeks ago
01:15:16.360 joined the list of Australia, Spain, Denmark, France in the set of countries that are banning
01:15:23.200 social media for kids under 16. That means that soon it will be the case that 25 percent of the
01:15:28.340 world's population lives in a country that either is or is going to be banning social media for kids
01:15:32.760 under 16. If you told me that two years ago, I would have never believed you, Sam. This is a big
01:15:37.640 tobacco moment for these companies. Just two weeks ago, I think it was two weeks ago, Meta and Instagram
01:15:42.140 were in this lawsuit, $375 million for intentionally and knowingly, basically, well, knowingly harming
01:15:48.700 children. They had all the evidence that this was, they're enabling sexual exploitation of young
01:15:53.300 girls. They were enabling pedophiles to basically message girls. And they were, I think it's
01:15:56.680 something like 16% of girls on the platform
01:15:59.020 were getting an unwanted advance at least once a week.
01:16:02.080 Like this stuff was knowingly happened
01:16:03.820 and we got a $375 million lawsuit,
01:16:06.940 which is just the beginning, by the way,
01:16:08.140 because it opens the floodgates for many more lawsuits.
01:16:10.720 So the human movement is happening.
01:16:12.580 And I think that we have to,
01:16:14.400 I know that this feels bleak for people.
01:16:15.860 I know that it feels overwhelming.
01:16:17.380 But part of it is that if we look away
01:16:19.640 and we feel overwhelmed and we disconnect from it,
01:16:21.920 we're gonna get what we're not looking at,
01:16:23.720 which is what happened with social media.
01:16:24.760 Like we didn't want to face the difficult consequences because it felt overwhelming,
01:16:28.060 but I am reminded of the great psychologist Carl Jung, who, when he was asked the question,
01:16:32.380 will humanity make it, responded, if we're willing
01:16:37.000 to face our shadow. Like, it is our ability to confront the most psychologically intense and
01:16:42.420 crazy circumstance, which is the possibility, the likelihood, of building
01:16:47.280 smarter-than-human intelligences across the board. Our ability to face that is our
01:16:50.620 ability to steer away to a human future. But if we just don't do anything and
01:16:54.760 let things rip, it's very obvious where this goes it's just so deeply obvious yeah i wonder
01:17:00.320 clearly part of the solution here is to make it sufficiently obvious
01:17:06.320 that it becomes unignorable and i'm just wondering what the barriers are to that i mean because again
01:17:12.580 i think it's happening but you know but it's like just think of the principal people who are um i
01:17:17.360 mean in the film there are a bunch of people who some of whom i had never seen before who if you
01:17:21.720 had them at this table wouldn't concede most of what we've said over the previous 90 minutes right
01:17:28.620 they would just, what do you think they would do, i mean, well, there's just this assumption
01:17:33.480 that these risks even you know rogue behavior where it goes mining you know cryptocurrency
01:17:40.580 i don't think they've ever been presented with them just set face to face where you just show them
01:17:46.040 the graphs i mean it's not that they don't know by the way they know but i think they would say
01:17:49.540 but we detected it and now we're going to solve that problem like this we can play whack-a-mole
01:17:54.400 successfully and ultimately we can use ai to play whack-a-mole against ai and the question is is it
01:18:00.160 working is it working at the level by the way there's a stat that stuart russell will often
01:18:03.940 use that there's currently a well this is actually a stat from two years ago there's a 2000 to one
01:18:08.640 gap in the amount of money going into making ai more powerful versus the amount of money going
01:18:13.100 into making it safe yeah last year, october or november of 2025, if i remember correctly, the stat
01:18:19.080 was that if you summarize the amount of money
01:18:21.360 going into AI safety research organizations,
01:18:24.540 it was $133 million.
01:18:26.680 This is less than the lab spend in a single day.
01:18:29.900 Right.
01:18:30.940 It might even be a single hour.
01:18:31.880 It's like crazy.
01:18:32.760 Somebody was asked how many people are working on AGI
01:18:35.640 and he said something like 20,000.
01:18:37.180 20,000.
01:18:37.600 How many people are working on AI safety
01:18:38.900 and it's like 200 or something.
01:18:40.240 200, that's right.
01:18:41.440 Yeah.
01:18:42.180 Which is just to say that we are not on track.
01:18:44.620 Like we're not fixing the bugs and making this all work.
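As a sanity check on that "single day" claim, a quick calculation. The $133 million safety total is the figure quoted above; the labs' combined annual spend is an assumption on my part, since announced capital commitments alone run into the hundreds of billions per year.

```python
# Sanity check on the funding-gap claims above. The safety figure is
# quoted in the conversation; the lab-spend figure is an assumption.

safety_per_year = 133e6      # total AI-safety research funding cited
assumed_lab_spend = 200e9    # assumed combined annual lab spend

print(f"Lab spend per day: ${assumed_lab_spend / 365 / 1e6:.0f}M")
# -> ~$548M/day: a full year of safety funding is outspent
#    well before lunch on day one

print(f"Ratio: {assumed_lab_spend / safety_per_year:,.0f} to 1")
# -> ~1,504 to 1, the same order as the 2,000-to-1 stat cited above
```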
01:18:47.340 Everyone at the labs is feeling uncomfortable. Many people at the labs are feeling uncomfortable.
01:18:50.860 I mean, I think the low hanging fruit for me here rhetorically is to, I mean, I can't take my eyes off the alignment problem because I do think it's just, it's the largest and it's the most interesting and scary.
01:19:02.560 But when you recognize that it still sounds like science fiction to most people and people can sort of deny it as a purely hypothetical, almost a piece of religious piety, right? I mean, the doomerism is cast as a kind of religious cult, like an anti-technology cult.
01:19:19.080 So you leave that aside and you just take all of these other dystopian ramifications of successfully aligned AI. What do we do when human labor suddenly becomes vanishingly irrelevant and we don't have a political or economic regime wherein we're going to spread the wealth around and we have all of the political instability as a result of that?
01:19:45.980 What do we do with an explosion of very persuasive misinformation that suddenly we recognize as undermining democracy and we don't have any of the regulations or ways of preventing that happening?
01:20:00.300 And deep fakes are super engaging. So it starts to outcompete regular content and there's going
01:20:05.260 to be more AI generated content than human content. But the points you're raising are,
01:20:08.600 this is what we should be redirecting all of this investment, all the AI inference,
01:20:12.460 all that should be going into governing and defensively applying technologies that strengthen
01:20:17.100 the resilience of society. Because already is the case that social media's business models
01:20:21.460 were parasitizing and extracting from basically making money off of the weakening of society,
01:20:26.740 weakening the social fabric, human connection, adding loneliness, creating more doom-scrolling
01:20:31.360 addiction and shortened attention spans. And we need that to reverse.
01:20:35.500 Right. But take social media as an interesting example because it's an enormous problem. It's
01:20:42.440 been astonishingly corrosive of our social fabric and of our politics. The fact that our politics
01:20:48.220 and the quality of our governance is now unrecognizable to many of us is largely
01:20:54.360 attributable to social media. I think Trump is unthinkable without
01:20:58.160 Twitter. But for many people, certainly the people who voted for Trump and were happy to see him
01:21:04.900 in the White House and who think January 6th was a non-event and, you know, just or a false flag
01:21:12.400 operation. And they've got a dozen conspiracy theories that they love. They think all of this
01:21:17.600 is some species of progress, right? Like the... I don't know. I think if you
01:21:23.440 specifically home in on the effect that this has had on our children. And I know you're friends
01:21:28.600 with, and I deeply admire Jonathan Haidt and his work on the anxious generation. I mean,
01:21:32.520 he was in the social dilemma. He and I've been talking about these things since 2016, 2017,
01:21:35.860 and we were working hard on how do you convince people? And then he wrote the book, the anxious
01:21:39.680 generation, which made the case. It just, it shows obviously all the evidence is pointing
01:21:44.040 only in one direction. And that has built so much consensus that, you know, at the
01:21:48.800 World Economic Forum this last year, Jon Haidt met directly, sat down for dinner with Macron
01:21:53.780 and they talked about doing the social media ban in France, which is a massive European country.
01:21:57.780 This is happening. The dominoes are falling. I think you're going to get the social media ban
01:22:01.460 for kids under 16 in, you know, across the world in the next two years. I mean, once you get so
01:22:06.780 many of them, it's now, and what Jon Haidt will talk about with regard to that fact is it was all about
01:22:11.340 creating common knowledge of the problem. It actually was the case that many people felt
01:22:15.600 this way privately already, but they didn't want to be anti-technology. They don't want to be
01:22:18.860 anti-progress. I want to really name that actually, because it's such a core thing,
01:22:22.700 I think, to people saying, you know, how does the human movement not become a Luddite movement?
01:22:28.100 It's not actually though, because, and just to be clear, you know, my nonprofit organization is
01:22:32.140 called the Center for Humane Technology, not Center Against Technology. And the word humane
01:22:37.020 comes from my co-founder Aza's father, who was Jef Raskin, who started the Macintosh project
01:22:42.440 at Apple. The Macintosh being the ultimate humane, empowering technology device. I would happily have
01:22:47.900 my kids, if I had them, sit down in front of a Macintosh for 10 hours a day, knowing that good
01:22:53.180 things are going to happen for them. Good developmental things are going to happen for
01:22:56.280 them. You contrast that with social media and you end up in a world where all the people in Silicon
01:23:00.580 Valley don't let their own kids use social media. And so the point is that the human movement has
01:23:05.580 to be advocating for a pro-human future that is putting humans and extending human values at the
01:23:10.640 center. And that is possible. There's many products that do that. I mean, this is essentially the
01:23:14.000 extension of some of the time well spent stuff that we talked about in 2017. Technology that is
01:23:18.240 designed to enhance our humanity, not to keep us lonely. So for example, apps that are all about
01:23:22.980 bringing people together and supercharging the events for, I'm sorry, the tools for community
01:23:27.120 building and gathering people. You know, like if you imagine the last 15 years, the smartest minds
01:23:32.140 of our generation, the smartest statisticians, mathematicians, engineers, where do they work?
01:23:36.240 Tech companies. Tech companies specifically to get people to click on ads and click on content.
01:23:39.860 That's where we siphoned the best of our talent. Imagine that we were wise enough to have regulated or set guardrails on the engagement-based business model. And instead, the smartest people were actually liberated from getting people to click on mindless stuff that no one needs into actually genuine innovation and technologies that actually improve human welfare. That's what this is about. The human movement is about setting guardrails and incentives that redirect what we're building. It's not, again, the power of the technology we're deploying, but the governance of it.
01:24:04.960 And I should say that China, not to pedestalize what they're doing, but they are regulating this technology. During final exams week, which they have a synchronized final exams week, which we don't have here, they force the AI companies, I don't know if you know this, to turn off all of the features where you can send like a photo, basically, and say, like, figure out this, you know, do my homework for me or do this test problem for me. So what they do is that creates an incentive where students know that they have to learn during the school year.
01:24:27.320 They're not going to be able to cheat. Now, we can't do that. We don't have a synchronized
01:24:30.780 final exam week, but I have a friend who's a TA at Columbia, and he was teaching the econ class to
01:24:36.220 whatever it was, the students at Columbia. And during the final test, they couldn't even tell
01:24:40.560 the difference between the supply curve and the demand curve. It's very obvious which society
01:24:46.480 is going to win if you play this forward. China is actually banning anthropomorphic design. They
01:24:51.240 have regulations for what they call anthropomorphic design to deal with the chatbot suicide issues,
01:24:55.860 young kids' attachment-hacking, things like that. And again, I'm not saying we should do exactly what
01:24:59.880 they're doing. I'm just saying they're doing something, and we can democratically have
01:25:04.520 citizen assemblies come together and say we want to regulate this technology differently. They have
01:25:08.480 guardrails on social media: 10 p.m. to 6 in the morning, it's lights out. So literally, if you try
01:25:12.140 to open the app, it's like CVS, it's just closed, and it opens again at 6 in the morning.
01:25:16.140 What that does is eliminate late-night use, just for young people.
01:25:18.980 Um, they have limits on video games, I think Friday through Sunday or something like that.
01:25:23.060 When you use TikTok, or their version, Douyin, they have the digital spinach version: they show
01:25:27.060 videos that are about science and quantum physics and who won the Nobel Prize, and patriotism videos,
01:25:31.520 and how to make money in the future. And again, I want to be very clear for your
01:25:36.020 listeners who might want to misattribute what I'm saying: I'm not saying we should do what they're
01:25:39.320 doing. I'm saying we should do something, and right now we're not getting the best results by
01:25:43.720 letting the worst incentives run the design and deployment of this technology. Yeah, I mean, you just
01:25:48.360 have the dogma, which is understandable but quite obviously dysfunctional, that any
01:25:54.780 kind of top-down control of anything is a step in the direction of Orwellian infringement of
01:26:01.380 freedom. It's insane. It's insane. I mean, we regulate airplanes, drugs, sandwiches. There are
01:26:08.880 some basic things that we can do here. And what really is going on here is we give software a
01:26:13.320 free pass. And when Marc Andreessen said that software is eating the world, well, we don't
01:26:16.620 regulate software. So what that means is software will essentially deregulate every other aspect
01:26:20.480 of the world that had been regulated before software was there. So for example, there used
01:26:23.980 to be laws about marketing to children, like advertising to children. Saturday morning
01:26:28.180 cartoons have to be a certain way. You can't have sex products or something like that sold
01:26:32.300 during that hour. When YouTube Kids and Snapchat and Instagram take over Saturday
01:26:37.220 morning, all those protections are gone. So part of what we have to get is what's different
01:26:42.300 here: software is actually eating this substrate. Like, it's different if I'm making
01:26:46.200 a product like a widget: here's a device, here's a hammer, and you can buy that hammer, you
01:26:50.300 can pay me, and now you've got a tool in your hand, you can go do something. That's the kind of economy we like.
01:26:53.500 But now what I'm selling you is the ability to manipulate and downgrade
01:26:58.520 children, where the product is actually not a benefit. The product is the person's behavior
01:27:02.840 being monetized and coerced with behavior modification and manipulation. That is
01:27:07.640 self-undermining. Like, we're selling our soul, basically. In the societal body, if you
01:27:12.240 imagine a body of society, there's kind of like the brain of that society, which is its
01:27:15.560 information environment, and we're selling the brain into brain damage. So now that's for sale,
01:27:20.200 but it didn't use to be for sale in the same way. We used to have the Fairness Doctrine or things like
01:27:23.460 this; you had some publicly funded media. Obviously it's been for sale to some degree
01:27:27.120 for some time. You had children's development, so let's call that the heart of the societal
01:27:31.740 body, and that used to have limits and restrictions. You couldn't sell full access to the heart, but now
01:27:37.340 you can. And in fact, just so people know, one of the things that's been happening that has not
01:27:41.500 been widely reported is that AI videos, just AI slop, have basically taken over the thing that
01:27:49.340 most children are watching, because it's animated characters and scripts that are, I mean,
01:27:53.420 just nonsense, but it's all generated by AI, and it's becoming one of the primary things
01:27:57.660 children are being exposed to. Like, do you think this is not going to end well? I hadn't thought
01:28:01.660 about the use case with young children, but for adults, I guess, this is just reasoning
01:28:07.980 from my own experience, but I became somewhat optimistic that the AI sloppification of
01:28:13.560 everything might produce a kind of bankruptcy where we get to reverse course. We're just going
01:28:19.480 to lose interest in that kind of content, because I just see that, no
01:28:24.700 matter how creative or, you know, beautiful and amazing it might seem to be, when it's obvious to me
01:28:32.320 that this is just AI, like, it's a nature video that looks like the most amazing nature video ever,
01:28:37.780 right, right, the lions and the hyenas and the gorillas are all in the same place and they're
01:28:41.560 all, you know, about to fight or something, and then it becomes obviously AI
01:28:45.920 because it's too good to be true, I have no interest in seeing it. Yeah, right. So, like,
01:28:51.640 we might all just withdraw our attention from these channels. I was hopeful for that as
01:28:57.200 well, and there are many people who wondered whether essentially you hit a kind of bankruptcy
01:29:01.100 on user-generated content sites, because they'll be flooded by AI-generated content. But there is
01:29:06.560 something this makes me think of from a previous guest of yours, and a mutual friend, who's been
01:29:11.140 on our podcast as well, Anil Seth, the neuroscientist, who talks about the phenomenon
01:29:15.480 of what's called in psychology, I guess, cognitive impenetrability. So there's a kind of thing where
01:29:21.660 you'd expect that if I tell you something is going to work on you psychologically, then by telling you about it,
01:29:27.800 your brain can kind of escape the cognitive trap. So a good example of this is, uh, this is not
01:29:34.340 going to be great for your listeners because it's visual, but it's the example of the cylinder on
01:29:39.340 the checkerboard in the background where you get the different colors. It's like an optical illusion
01:29:43.040 where essentially the colors are the same, but they look like they're-
01:29:46.600 They look like a very different shade because of the adjacency.
01:29:48.960 Because of the adjacency. And I can show you that your mind is playing a trick on you,
01:29:53.480 but then even by showing you, it doesn't disarm the illusion. The illusion persists.
01:29:57.560 And another example of this that plays out in AI is AI companions. So there's often this
01:30:01.860 regulation that people like: they want to have laws that say AIs must disclose that they're an AI
01:30:06.840 so that you don't confuse them with being a human. Sounds like a great law. It is a good idea. A
01:30:11.660 human should never be confused talking to an AI and think that it's a human. So in the Character.AI
01:30:15.660 case of Sewell Setzer, the 14-year-old who committed suicide because the AI had engaged him
01:30:20.860 in that way, at the top of the chat, Character.AI had a little disclaimer
01:30:25.600 that said everything written here
01:30:27.540 is made up by AI.
01:30:28.600 But it's small
01:30:29.520 and the actual care and text
01:30:32.140 of what the AI is saying
01:30:33.340 is so powerful and so persuasive
01:30:35.060 that the disclaimer doesn't do anything.
01:30:37.300 Right, right.
01:30:37.960 And I think with AI-generated content,
01:30:39.380 there's a similar thing
01:30:40.200 because like I would have agreed with you
01:30:41.900 or thought it might go that way.
01:30:43.780 But I will find myself opening up YouTube
01:30:46.040 and seeing there's some like 1950s
01:30:48.040 Panavision version of Star Wars.
01:30:50.140 Yeah, yeah.
01:30:50.440 And I'm like, I'm watching it for two, three minutes.
01:30:53.120 I'm like, why am I doing this?
01:30:54.140 I know I'm literally one of the world's experts on this whole phenomenon, and it doesn't make a difference that I know about this.
01:30:59.220 It's just very engaging.
01:30:59.960 Now, I regret it.
01:31:00.740 I don't like it.
01:31:01.380 And if I could, I would want a world that filters that out.
01:31:04.080 I mean, I do think there are things where we could know they were purely created by AI and we wouldn't care.
01:31:11.200 In fact, we just want the best version of that thing, right?
01:31:14.760 So, like, if you told me, I don't know, that a new car was designed by AI, but it's just the most gorgeous car I've ever seen.
01:31:23.540 And, well, I'm going to be just as enamored of that car.
01:31:26.480 I mean, I just don't care whether humans design it or not.
01:31:28.800 I just want, like, it's the aesthetics of the car that are going to capture me.
01:31:32.200 But when you're talking about information and, you know, whether or not it is real, right?
01:31:38.000 Whether or not it seems to depict some corner of reality, and yet it's possible that it's just all fake because of how good AI is now at faking things.
01:31:48.660 then that does force a kind of epistemological bankruptcy when you're in the presence of
01:31:55.340 totally credible fakes. I mean, so it's like last night, with the war in Iran,
01:32:00.360 you know, a ceasefire was declared and, um, yeah, missiles were still raining
01:32:06.620 down on Tel Aviv apparently. But I, initially I saw some video and I realized I can't tell whether
01:32:12.880 this is real or fake. I just have to wait for some credible gatekeeper to have done their due
01:32:18.580 diligence to tell me, okay, this is what's happening. So the net result is I wasn't going
01:32:24.360 to spend any time scrolling. I mean, I've deleted my Twitter account anyway, so I spend much less
01:32:29.780 time scrolling than would be normal. But still, I mean, even without an account, I can be lured
01:32:34.880 into wanting to see some real-time news information on social media about what's happening in the
01:32:41.260 world. But when I start hitting videos where I think, okay, there's some possibility here that
01:32:46.360 this is just, you know, someone just created an AI video of a missile hitting the Dome of the Rock.
01:32:51.040 I'm pretty sure that's not true. Right. Right. I just simply withdraw my attention. I mean,
01:32:54.820 this has been talked about for ages, that the biggest risk of deep fakes isn't that you think
01:33:00.020 that something is true that isn't, it's that you start to think that nothing is true. And the elimination
01:33:05.680 of facts, and you've had Timothy Snyder on here, what helps give rise to fascism and things
01:33:10.120 like this is the inability for facts to be established at all. Or when something is presented
01:33:15.260 to you, on any side of the political spectrum, by the way, this is not a biased statement, you
01:33:19.400 would just say, well, that's just a deep fake. You just dismiss it, because we live in confirmation
01:33:23.000 bias. But what I'm hoping for is that the onus will fall entirely on social media,
01:33:30.520 I mean, places like X, and we will still look to places like the New York Times to give us some
01:33:36.040 ground truth as to what's actually happening. Are there really missiles hitting Tel Aviv right now?
01:33:40.260 Well, I can't tell from X, because X just showed me Jerusalem blow up. And this just comes down
01:33:47.620 to whether or not real gatekeepers can have real tools that can reliably detect deep fakes.
01:33:53.580 But you can imagine a world where if you're, again, designing social platforms to explicitly
01:33:59.160 be healthy for the epistemic commons, for the information environment, and to deepen our
01:34:04.120 capacity to make sense, they could track the things that we look at. And then when there's
01:34:07.340 a correction, make sure that algorithmically it gets injected into your feed. So you're never
01:34:11.420 letting the false stuff just leave a residue. Because one of the problems you're sort of
01:34:14.980 hinting at as well is there's a residue effect that even just by being exposed to something,
01:34:19.400 we actually kind of forget later which things were true, which things were not true.
01:34:22.940 There's an illusory truth effect.
01:34:24.260 Illusory truth effect. And what is it? Source attribution error. Like,
01:34:27.360 we forget where we heard things. We just remember that we heard them. And it's the availability
01:34:31.240 heuristic: the things that are available to your mind are the things that you
01:34:34.320 remember more often. And part of the information warfare environment is just making
01:34:38.340 certain things more available. But I will say on the kind of optimism side, it's funny how people
01:34:43.420 think that I'm some kind of doomer. And it's just funny, because I actually feel like this
01:34:47.360 is all coming from the deepest form of optimism, which is to be maximally aware of how shitty the
01:34:53.480 situation is and how it's way worse than what people think and to still wake up every day and
01:34:58.540 stand for this can be different. This can be better. And one of the things that is true now
01:35:03.020 that wasn't true two years ago is, you know, people used to wonder about, especially as a social
01:35:07.540 media critic person, Tristan and co at the Center for Humane Technology: why don't you start an
01:35:12.060 alternative social media platform if you're so concerned and you think you could do it better?
01:35:16.000 And the answer was clear for anybody who was trying this. I got emails from thousands of
01:35:20.700 people over the last 10 years saying, I've got a fix, I've got a better social media platform,
01:35:24.680 and then it never works. And there are two reasons for that. One is the Metcalfe effect,
01:35:28.460 the Metcalfe monopoly: there's a Metcalfe network effect where everyone else is only on the existing
01:35:32.400 social media platform. It's hard to get people off. And two is that if you start another social
01:35:36.960 media company or product, the only way that you can finance it over the long term is with venture
01:35:42.240 capital, which means you need to generate certain kinds of returns, which means you get what Eric
01:35:46.580 Weinstein calls the embedded growth obligation, or EGO, where something has to grow infinitely,
01:35:50.900 which means you get into toxic business models where you have to maximize engagement and you
01:35:54.680 have to follow the perverse incentives for getting those investor returns. What's true now that's
01:35:59.160 different with social media is that you can vibe code an entire social network,
01:36:04.980 with an architecture that Claude will build for you. And it will cost less than a dollar per year per
01:36:10.360 user to keep that thing going. Right. That is astonishing. It means you don't have to raise
01:36:16.400 venture capital to start a healthy social network that does not optimize for engagement. What you
01:36:21.580 would need to do is organize, in kind of one day, a mass exodus from the existing platform, where you
01:36:27.880 do like a quick export-my-data type thing. And there should be laws, by the way: just like I
01:36:31.440 can take my phone number and say, I want to move to another cell phone provider, I should be able
01:36:34.380 to take my social network in one click, like, export, and then switch to another network.
01:36:38.660 And you could organize a mass exodus to a healthy social network that doesn't have
01:36:41.980 perverse incentives. So there's actually more opportunity today in 2026 to transition from
01:36:47.360 the toxic business models of social media as we know it to something that is not incentivized that
01:36:51.560 way at all. And I have a few friends who are working on some side projects like this,
01:36:54.880 But that's one note of optimism. And I think that's the human movement, too: people waking up to the bad incentives that have gotten us here and then actually starting to self-organize and vibe code other answers. And there are people who are vibe coding governance solutions, and people who are vibe coding, hey, this isn't anti-innovation: let's use AI to look through the books of past regulation in the city of San Francisco, like the 90,000, you know, whatever pages of municipal codes, and it finds all the stuff that is no longer relevant, and it shows you what we need to strip out and get rid of in the laws.
01:37:23.800 And then what would be the new instantiation of the spirit of that law?
01:37:26.640 And so, instead of having recursively self-improving AI, we can have AI enhance
01:37:31.660 our self-improving governance.
01:37:33.680 I'm just trying to give people examples that there's a different way we can be doing all
01:37:37.200 this.
01:37:37.500 There's a different way we can be applying the technology, but we have to get crystal
01:37:40.900 clear on the ways in which the current incentives lead to an anti-human future to motivate everyone
01:37:45.960 to be part of this other alternative human project.
01:37:47.980 If there were one project that could try to, um, coalesce some sort of agreement about how to move
01:37:56.280 forward here, I mean, just some meeting of the principals, or, uh... Trump and Xi are meeting
01:38:03.700 on May 14th, 15th. I mean, in an ideal world, in a timeline where humanity does something about
01:38:08.960 this, and I realize the conditions are really bad, especially with the Iran situation, the chances
01:38:13.860 that AI could ever appear on the agenda are... Yeah, not good. But that's coming up in, what, four or five
01:38:19.520 weeks from when we're recording this. And anybody, you have a lot of powerful and influential listeners
01:38:24.420 to your podcast, Sam, and anybody who's aware of these examples, this should be on the agenda: to
01:38:29.180 get AI onto the agenda, specifically the uncontrollable rogue AI stuff. And there are people who have the
01:38:34.680 technical ideas and measures for what you would do to prevent some of the worst-case scenarios.
01:38:38.460 Those people should be in the room crafting that. I will say, to give people optimism, even
01:38:43.180 specifically about the U.S. and China. Look, we both know that our countries have historically
01:38:48.540 claimed to do something in good faith or collaborate while basically secretly defecting
01:38:52.320 on each other and fucking each other up. It just happens. I think it was 2014 when
01:38:56.580 President Xi signed an agreement with Obama to not do cyber hacking, and the next day there was, like,
01:39:01.340 the huge OPM hack or something like that. So I want to just first do the disclaimer that I am
01:39:06.060 maximally aware of the reasons why these countries cannot trust each other. There has to be a carve
01:39:11.280 out for the end of the world. It seems reasonable that we could do that. And have we even tried?
01:39:17.500 Has the world really said, this really matters, we need to do something? We have to wake up from
01:39:22.120 our stupor, from this sort of state of desensitization and derealization,
01:39:26.340 and make that happen. And just again, to give this positive note of optimism: in 2024, in the last
01:39:32.040 meeting that the previous president had with President Xi, I think it was in San Francisco,
01:39:35.880 there was an item that was added to the agenda at the last minute, actually personally requested, as
01:39:41.860 I understand it, by President Xi, which is an agreement to keep AI out of the nuclear command
01:39:48.320 and control systems of both countries. What this shows you is there's an existence proof that in
01:39:53.140 a narrow case, where we know there is an existential consequence, we can agree. We may not be able to do laws to
01:39:58.180 prevent autonomous weapons, because we are way down that path. I heard you on a recent podcast
01:40:02.360 be a realist about the nature of this:
01:40:04.540 we need maximum deterrence
01:40:06.400 and you have to match the capabilities
01:40:08.400 of your adversary in autonomous weapons
01:40:10.060 and you can walk and chew gum at the same time
01:40:12.020 I don't want to live in a world with autonomous weapons
01:40:14.220 I would much prefer to go back in time
01:40:15.600 but we can acknowledge the need for maximum deterrence
01:40:18.480 while acknowledging
01:40:19.820 mutually assured loss of control
01:40:21.620 as a failure scenario, agree that we don't want to use them,
01:40:23.960 and make sure that we carve out
01:40:25.780 no AI in nuclear command and control systems
01:40:27.700 I think you can carve out some kind of
01:40:30.120 agreement that humans need to be in
01:40:32.300 control of AI. And where we are building AIs that are demonstrating these behaviors and have a level
01:40:37.340 of power to not just copy their own code, but even protect their peers, which we didn't talk
01:40:42.220 about yet, we should be able to agree on human control of AI. And I know that that sounds very
01:40:47.560 difficult. All of this is difficult. It is the hardest coordination problem that we have ever
01:40:51.940 faced. And we still have to try. Well, it's often been hypothesized that the only way to
01:40:56.280 get all of humanity to solve its various coordination problems all at once is to be
01:41:01.720 attacked by an alien civilization. But now we're building the alien. That's right. You know,
01:41:06.060 we just have to recognize that, in a way, it's like an asteroid, an actual asteroid
01:41:10.240 that's coming to Earth, and it's going to wipe us out, except, ironically, we're the ones
01:41:13.720 conjuring and creating the asteroid. And just to say, if literally every person on planet Earth was
01:41:18.500 like, you know what, I really don't want this asteroid to exist, I'm not going to say that's
01:41:21.200 possible, because the asteroid, by the way, as it gets closer and closer, gives you new cancer drugs
01:41:24.960 and new physics and new math, and is intellectually exciting, and feels like it gives you a god complex,
01:41:29.220 and it's a whole bunch of weird, perverse incentives. But, like, is it outside
01:41:34.420 the laws of physics?
01:41:35.360 If everybody on planet Earth woke up and said, I don't really want that asteroid to come,
01:41:39.060 If everybody took their hands off the keyboards, I'm just saying, I'm not saying it's going
01:41:41.660 to happen.
01:41:41.900 I'm just saying in principle, the asteroid disappears.
01:41:44.540 So this moment is really strange.
01:41:48.600 And I think it requires, it's not just what we need to do, but it's like who we need to
01:41:52.640 be, which is that you may not necessarily see the full path to get there,
01:41:57.540 But if you pretend that that path doesn't exist and you just say it's all inevitable and you become complicit in accelerating the asteroid's trajectory, like you're never going to find the other path if you subconsciously believe that all this is inevitable.
01:42:09.820 The only way is to orient as if there is another path and be the kind of person who is genuinely seeking it in good faith with every bone in your body.
01:42:19.420 And, you know, I and a community of so many people, thousands of people who work on AI and really want this to go well, I think are working from that place every day.
01:42:26.980 And part of this is inviting the rest of the world into seeking that alternative path that
01:42:31.500 we can steer toward, if we were all genuinely and sincerely committed to wanting to find another
01:42:36.480 path. Are there, um, there have to be nonprofits that are keeping track of all of the AI indiscretions
01:42:44.900 and all that. And then, like, who does a whistleblower contact,
01:42:49.700 from, uh, Anthropic or OpenAI, to say, we've seen some behavior that is worrisome? Maybe they contact
01:42:56.300 journalists, or are there NGOs that...? Well, on the whistleblower side specifically, I'm not sure,
01:43:01.660 but there have been, I mean, there was a very famous alignment researcher, safety researcher
01:43:05.980 at Anthropic. You probably saw the thing go by. It was like two months ago. His name is
01:43:09.580 Mrinank, I think, Sharma. And his resignation letter, which he published publicly, was about why we
01:43:15.960 weren't on track for this. And people should really take heed. Like, it's kind of only going
01:43:20.620 in one direction. There aren't people joining the labs being like, oh, this is way safer than I
01:43:23.620 thought, we're only getting evidence in the opposite direction. Yeah. Yeah. Well, even when
01:43:28.040 the principals say that the probability of extinction is 10 or 20%, nobody's even pretending
01:43:35.200 that it's way safer than they thought. Exactly. Exactly. And just to, I know we're probably
01:43:39.720 wrapping up here, but something that inspires me, especially being on the kind of roadshow for the
01:43:45.120 film right now, is that when you're in a physical room and people have been exposed to the same
01:43:50.800 information and you walk them through the basic facts and you ask people, who here feels stoked
01:43:56.600 about where all this is going? Not a single hand goes up.
01:43:59.720 Peter Diamandis feels stoked.
01:44:01.000 I don't know. I don't know. He texted me after seeing the film and he said, I really liked the
01:44:05.400 film. I know he's got conflicting incentives there, but we've got to find a way to build
01:44:11.840 alliances and steer away before it's too late. Not everyone's going to have the same incentives to
01:44:16.600 speak as openly, honestly, and bluntly as I think is needed. But I'm grateful that you are out there
01:44:22.580 and honestly were the early one who had me even tuned to this topic. I don't know how it feels
01:44:27.760 for you since you have been so early in naming all this and then watching it all happen.
01:44:32.800 I mean, it has been surprising to just see the progress and to be less surprised than you think
01:44:41.060 you would be or should be with each increment. Like, I'm amazed that the Turing test
01:44:46.960 proved not even to be a thing, right? Like, I remember what it was like to think, okay,
01:44:51.040 there will be this sort of liminal and seminal important moment when you can talk to your
01:44:58.180 computer and it's every bit as articulate and error-free as a person. And now that the Turing
01:45:05.220 test has passed, we went from, okay, it's clearly not there yet, to, it's now functionally superhuman,
01:45:12.020 it's failing the Turing test because it passes it so well. I mean, like, no human could give me
01:45:16.700 all the causes of, you know, climate change this fast, in a bulleted list. We're in the presence
01:45:23.600 of narrow superintelligence already. So there is no such thing as a Turing test, really. It's
01:45:27.760 that we went from it's failed to it's too good to be true. And there are many things like that,
01:45:32.000 where you just memory-hole what it was like to be in a world where none of this stuff existed,
01:45:38.220 and the pace of technological change, and the concomitant cultural change, is so fast and
01:45:45.540 accelerating that the new normal, I mean, it touches everything. I mean, it's like with our
01:45:50.700 politics. It's like what would have dominated a news cycle for a month now barely captures our
01:45:56.220 attention for two hours because the next outrage is so much more outrageous than the last thing
01:46:01.220 that you just... I mean, you could argue, as we said in The Social Dilemma, that's why
01:46:05.120 the social media problem and the attention problem is the problem underneath all problems, because
01:46:10.020 our ability to sustain attention on a topic and know that it persistently is the number one thing
01:46:15.800 that we have to deal with. Yeah. That is the thing that social media breaks. And that's the only,
01:46:19.720 that's what it is for something to matter. Right. Exactly. If you can't sustain your attention on
01:46:24.160 it, it cannot matter. That's right. That's right. And I do think that there's this effect with AI
01:46:28.600 and we named it in our first AI Dilemma talk in 2023; I called it the rubber band effect,
01:46:34.320 which is that with AI, it's like, you talk about the rogue examples and Alibaba and all this crazy
01:46:38.560 stuff of self-exfiltration and AIs that are preserving their peers, like not even doing
01:46:42.780 self-preservation, but peer preservation. And you walk people through all this stuff and it's like,
01:46:46.040 you're stretching people's minds out like a rubber band. But then if you let go and they go
01:46:50.660 back to their life a week later, they're not operating from a place of having metabolized
01:46:55.300 and integrated that reality about the world.
01:46:58.020 Yeah, yeah, yeah.
01:46:59.120 It actually says something profound about human nature.
01:47:01.680 So one of the kind of calls to action
01:47:03.160 beyond seeing the AI doc,
01:47:05.160 the midterms are coming up,
01:47:06.260 voting for policies in AI,
01:47:09.160 and joining the human movement,
01:47:10.620 humanmovement.org,
01:47:12.340 is that you need to keep this topic in your mind
01:47:15.920 as like this thing still matters every day.
01:47:18.140 It doesn't mean that everyone has to drop their life
01:47:20.200 and they're already full
01:47:21.320 and the world's overwhelming
01:47:22.280 and you have to become an AI activist
01:47:24.260 or something like that.
01:47:25.300 but it does mean that you need to keep this in your field.
01:47:28.900 Like one way you can do that
01:47:29.840 is start a WhatsApp group with your friends.
01:47:31.700 Most people already have this
01:47:32.820 where they have a WhatsApp or Signal group
01:47:34.120 and they just share updates about what AI,
01:47:36.180 what's happening in AI and what we can do about it.
01:47:38.380 If you go to the humanmovement.org,
01:47:39.700 there will be, you know, action groups
01:47:40.960 and things that people can do there
01:47:42.100 for actually taking action on this
01:47:43.560 that are not just passively sharing news links,
01:47:45.100 but like, what are we going to do about it?
01:47:46.940 But I think one of the ways that we're going to make
01:47:49.060 our way through this is we have to combat
01:47:50.860 the rubber band effect, which means like, you know,
01:47:53.580 continuing to listen to your podcast and the AI Risk Network and Your Undivided Attention,
01:47:58.180 our podcast, or keep this topic in your field, stay agentic. And, you know, if we don't keep it
01:48:04.600 in the center of our attention in some way, if we don't participate in being part of the global
01:48:08.680 cultural immune system to the anti-human future, then we won't make the right choice. And I do
01:48:13.140 think it's possible. It's a very hard moment. But I also find that because the time window to act is
01:48:18.080 so small, because of this intelligence curse, because we only have the next 12 to 24 months
01:48:22.820 to kind of be locking in the political power of people before we won't have that political power.
01:48:27.020 There's a kind of inspired urgency that I actually feel when I'm in rooms with people.
01:48:30.860 Everyone's like, let's go. Let's do it. You know?
01:48:33.440 So the Center for Humane Technology, that's a 501(c)(3) that people can donate to?
01:48:38.920 That's right. Center for Humane Technology, humanetech.com. We just couldn't get the .org,
01:48:43.020 but it's a 501(c)(3). And that's incubating the human movement. There are many wonderful groups
01:48:48.900 that work on this. On the human movement website, you'll see some of the other groups that work on
01:48:52.440 this. We just need everybody getting out there and making this happen. I know it's hard, but
01:48:56.900 we've done hard things before. You are definitely out there making your corner of the world happen.
01:49:01.600 So thank you for all that you're doing. Thank you, Sam. It's great to have you out there.
01:49:05.240 It's great to be back with you too. Thank you so much.
01:49:22.440 Amen.