Making Sense - Sam Harris - February 06, 2018


#116 — AI: Racing Toward the Brink


Episode Stats

Length

55 minutes

Words per Minute

163.4

Word Count

9,121

Sentence Count

483

Misogynist Sentences

2

Hate Speech Sentences

4


Summary

Eliezer Yudkowsky is a decision theorist and computer scientist at the Machine Intelligence Research Institute in Berkeley, and he is known for his work on technological forecasting. His publications include a chapter in the Cambridge Handbook of Artificial Intelligence, titled The Ethics of Artificial Intelligence, which he co-authored with Nick Bostrom. Eliezer's writing has been extremely influential online, especially among the smart set in Silicon Valley. Many of those articles were pulled together in a book titled Rationality from AI to Zombies, which I highly recommend. And he has a new book out, Inadequate Equilibria, Where and How Civilizations Get Stuck. As you'll hear, Eliezer is a very interesting first-principles kind of thinker. Of those smart people who are worried about AI, he is probably among the most worried, and his concerns have been largely responsible for kindling the conversation we've been having in recent years about AI safety and ethics. So in today's episode, you're getting it straight from the horse's mouth, and we cover more or less everything related to the question of why one should be worried about where this is all headed.


Transcript

00:00:00.000 Today I'm speaking with Eliezer Yudkowsky.
00:00:19.000 Eliezer is a decision theorist and computer scientist at the Machine Intelligence Research Institute in Berkeley.
00:00:26.000 And he is known for his work on technological forecasting.
00:00:30.000 His publications include a chapter in the Cambridge Handbook of Artificial Intelligence,
00:00:36.000 titled The Ethics of Artificial Intelligence, which he co-authored with Nick Bostrom.
00:00:41.000 And Eliezer's writing has been extremely influential, online especially.
00:00:46.000 He's had blogs that have been read by the smart set in Silicon Valley for years.
00:00:53.000 Many of those articles were pulled together in a book titled
00:00:56.000 Rationality from AI to Zombies, which I highly recommend.
00:01:00.000 And he has a new book out, which is Inadequate Equilibria, Where and How Civilizations Get Stuck.
00:01:07.000 And as you'll hear, Eliezer is a very interesting first principles kind of thinker.
00:01:14.000 Of those smart people who are worried about AI, he is probably among the most worried.
00:01:21.000 And his concerns have been largely responsible for kindling the conversation we've been having in recent years about AI safety and AI ethics.
00:01:32.000 He's been very influential on many of the people who have made the same worried noises I have in the last couple of years.
00:01:42.000 So in today's episode, you're getting it straight from the horse's mouth.
00:01:45.000 And we cover more or less everything related to the question of why one should be worried about where this is all headed.
00:01:55.000 So without further delay, I bring you Eliezer Yudkowsky.
00:02:03.000 I am here with Eliezer Yudkowsky.
00:02:05.000 Eliezer, thanks for coming on the podcast.
00:02:07.000 You're quite welcome.
00:02:08.000 It's an honor to be here.
00:02:10.000 You have been a much requested guest over the years.
00:02:14.000 You have quite the cult following for obvious reasons.
00:02:18.000 For those who are not familiar with your work, they will understand the reasons once we get into talking about things.
00:02:24.000 But you've also been very present online as a blogger.
00:02:28.000 I don't know if you're still blogging a lot, but let's just summarize your background for a bit and then tell people what you have been doing intellectually for the last 20 years or so.
00:02:40.000 I would describe myself as a decision theorist.
00:02:44.000 A lot of other people would say that I'm in artificial intelligence and in particular in the theory of how to make sufficiently advanced artificial intelligences that do a particular thing and don't destroy the world as a side effect.
00:03:01.000 I would call that AI alignment, following Stuart Russell.
00:03:05.000 Other people would call that AI control or AI safety or AI risk, none of which are terms that I really like.
00:03:12.000 I also have an important sideline in the art of human rationality, the way of achieving the map that reflects the territory and figuring out how to navigate reality to where you want it to go from a probability theory, decision theory, cognitive biases perspective.
00:03:30.000 I wrote two or three years of blog posts, one a day on that, and it was collected into a book called Rationality from AI to Zombies.
00:03:42.000 Yeah, which I've read and which is really worth reading.
00:03:45.000 You have a very clear and aphoristic way of writing.
00:03:48.000 It's really quite wonderful.
00:03:50.000 So I highly recommend that book.
00:03:51.000 Thank you.
00:03:52.000 Thank you.
00:03:53.000 But, you know, your background is unconventional.
00:03:56.000 So, for instance, you did not go to high school, correct, let alone college or graduate school.
00:04:01.000 Summarize that for us.
00:04:03.000 The system didn't fit me that well, and I'm good at self-teaching.
00:04:09.000 I guess I sort of, when I started out, I thought I was going to go into something like evolutionary psychology or possibly neuroscience.
00:04:19.000 And then I discovered probability theory, statistics, decision theory, and came to specialize in that more and more over the years.
00:04:27.000 How did you not wind up going to high school?
00:04:29.000 What was that decision like?
00:04:31.000 A sort of mental crash around the time I hit puberty, or even a physical crash.
00:04:37.000 And I just did not have the stamina to make it through a whole day of classes at the time.
00:04:42.000 I'm not sure how well I'd do trying to go to high school now, honestly.
00:04:46.000 But it was clear that I could self-teach.
00:04:49.000 So that's what I did.
00:04:51.000 And where did you grow up?
00:04:52.000 Chicago, Illinois.
00:04:54.000 Okay, well, let's fast forward to sort of the center of the bullseye for your intellectual life here.
00:05:01.000 You have a new book out, which we'll talk about second.
00:05:04.000 Your new book is Inadequate Equilibria, Where and How Civilizations Get Stuck.
00:05:09.000 And unfortunately, I've only read half of that, which I'm also enjoying.
00:05:14.000 I've certainly read enough to start a conversation on that.
00:05:17.000 But we should start with artificial intelligence because it's a topic that I've touched a bunch on the podcast, which you have strong opinions about.
00:05:26.000 And it's really how we came together.
00:05:28.000 You and I first met at that conference in Puerto Rico, which was the first of these AI safety alignment discussions that I was aware of.
00:05:38.000 I'm sure there have been others, but that was a pretty interesting gathering.
00:05:42.000 So let's talk about AI, the possible problem with where we're headed, and the near-term problem, which is that many people in the field and at the periphery of the field don't seem to take the problem as we conceive it seriously.
00:05:59.000 Let's just start with the basic picture and define some terms.
00:06:03.000 I suppose we should define intelligence first and then jump into the differences between strong and weak or general versus narrow AI.
00:06:15.000 Do you want to start us off on that?
00:06:17.000 Sure.
00:06:18.000 Preamble disclaimer, though.
00:06:20.000 The field in general, like not everyone you ask would give you the same definition of intelligence.
00:06:26.000 And a lot of times in cases like those, it's good to, you know,
00:06:29.000 sort of go back to observational basics.
00:06:32.000 We know that in a certain way, human beings seem a lot more competent than chimpanzees, which seems to be a similar dimension to the one where chimpanzees are more competent than mice or that mice are more competent than spiders.
00:06:48.000 And people have tried various theories about what this dimension is.
00:06:53.000 They've tried various definitions of it.
00:06:55.000 But if you went back a few centuries and asked somebody to define fire, the less wise ones would say, ah, fire is the release of phlogiston.
00:07:04.000 Fire is one of the four elements.
00:07:06.000 And the truly wise ones would say, well, fire is the sort of orangey, bright, hot stuff that comes out of wood and like spreads along wood.
00:07:13.000 And they would tell you what it looked like and put that prior to their theories of what it was.
00:07:18.000 So what this mysterious thing looks like is that humans can build space shuttles and go to the moon and mice can't.
00:07:27.000 And we think it has something to do with our brains.
00:07:29.000 Yeah, yeah.
00:07:30.000 I think we can make it more abstract than that.
00:07:34.000 Tell me if you think this is not generic enough to be accepted by most people in the field.
00:07:39.000 Whatever intelligence may be in a specific context,
00:07:44.000 generally speaking, it's the ability to meet goals, perhaps across a diverse range of environments.
00:07:52.000 And we might want to add that it's at least implicit in intelligence that interests us.
00:07:58.000 It means an ability to do this flexibly rather than by rote, following the same strategy again and again blindly.
00:08:06.000 Does that seem like a reasonable starting point?
00:08:09.000 I think that that would get fairly widespread agreement and it like matches up well with some of the things that are in AI textbooks.
00:08:16.000 If I'm allowed to sort of take it a bit further and begin injecting my own viewpoint into it, I would refine it and say that by achieve goals, we mean something like squeezing the measure of possible futures higher in your preference ordering.
00:08:33.000 If we took all the possible outcomes and we rank them from the ones you like least to the ones you like most, then as you achieve your goals, you're sort of like squeezing the outcomes higher in your preference ordering.
00:08:45.000 You're narrowing down what the outcome would be to be something more like what you want, even though you might not be able to narrow it down very exactly.
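To make that "squeezing" idea concrete, here is a minimal Python sketch; the outcomes, actions, and probabilities are invented purely for illustration and are not anything discussed in the episode. The point is only that "achieving goals" can be read as choosing whichever action concentrates probability on outcomes that sit higher in the agent's preference ordering.

```python
# Hypothetical illustration of "squeezing possible futures higher in a preference ordering".

# Preference ordering: a higher rank means a more preferred outcome.
preference_rank = {"disaster": 0, "mediocre": 1, "good": 2, "great": 3}

# Each available action induces a probability distribution over outcomes.
action_outcomes = {
    "act_at_random":  {"disaster": 0.3, "mediocre": 0.5, "good": 0.2},
    "plan_carefully": {"mediocre": 0.1, "good": 0.5, "great": 0.4},
}

def expected_rank(distribution):
    """Expected position of the outcome in the preference ordering."""
    return sum(p * preference_rank[outcome] for outcome, p in distribution.items())

# The "intelligent" choice is the one that pushes probability mass toward the
# top of the ordering, even though no single outcome can be pinned down exactly.
best_action = max(action_outcomes, key=lambda a: expected_rank(action_outcomes[a]))
print(best_action)  # -> plan_carefully
```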
00:08:53.000 Flexibility.
00:08:56.000 Generality.
00:08:57.000 Like, humans are much more domain-general than mice.
00:09:04.000 Bees build hives.
00:09:06.000 Beavers build dams.
00:09:07.000 A human will look over both of them and envision a honeycomb structured dam.
00:09:13.000 Like we are able to operate even on the moon, which is like very unlike the environment where we evolved.
00:09:22.000 In fact, our only competitor in terms of general optimization, where optimization is that sort of narrowing of the future that I talked about, our competitor in terms of general optimization is natural selection.
00:09:36.000 Like natural selection built beavers, it built bees, it sort of implicitly built the spider's web in the course of building spiders.
00:09:45.000 And we as humans have like the similar, like very broad range to handle this like huge variety of problems.
00:09:52.000 And the key to that is our ability to learn things that natural selection did not pre-program us with.
00:09:59.000 So learning is the key to generality.
00:10:02.000 I expect that not many people in AI would disagree with that part either.
00:10:06.000 Right.
00:10:07.000 So it seems that goal directed behavior is implicit in this or even explicit in this definition of intelligence.
00:10:15.000 And so whatever intelligence is, it is inseparable from the kinds of behavior in the world that results in the fulfillment of goals.
00:10:24.000 So we're talking about agents that can do things.
00:10:27.000 And once you see that, then it becomes pretty clear that if we build systems that harbor primary goals, you know, there are cartoon examples here, like, you know, making paper clips.
00:10:41.000 These are not systems that will spontaneously decide that they could be doing more enlightened things than, say, making paper clips.
00:10:51.000 This moves to the question of how deeply unfamiliar artificial intelligence might be, because there are no natural goals that will arise in these systems apart from the ones we put in there.
00:11:06.000 And we have common sense intuitions that make it very difficult for us to think about how strange an artificial intelligence could be, even one that becomes more and more competent to meet its goals.
00:11:21.000 Let's talk about the frontiers of strangeness in AI as we move from, again, I think we have a couple more definitions we should probably put in play here, differentiating strong and weak or general and narrow intelligence.
00:11:35.000 Well, to differentiate general and narrow, I would say that, well, I mean, this is like, on the one hand, theoretically, a spectrum.
00:11:45.000 Now, on the other hand, there seems to have been like a very sharp jump in generality between chimpanzees and humans.
00:11:51.380 So breadth of domain driven by breadth of learning, like DeepMind, for example, recently built AlphaGo, and I lost some money betting that AlphaGo would not defeat the human champion, which it promptly did.
00:12:08.700 And then a successor to that was AlphaZero, and AlphaGo was specialized on Go.
00:12:16.660 It could learn to play Go better than its starting point for playing Go, but it couldn't learn to do anything else.
00:12:24.740 And then they simplified the architecture for AlphaGo.
00:12:29.060 They figured out ways to do all the things it was doing in more and more general ways.
00:12:33.700 They discarded the opening book, like all the sort of human experience of Go that was built into it.
00:12:38.820 They were able to discard all of the sort of like programmatic special features that detected features of the Go board.
00:12:44.160 They figured out how to do that in simpler ways, and because they figured out how to do it in simpler ways, they were able to generalize to AlphaZero, which learned how to play chess using the same architecture.
00:12:58.400 They took a single AI and got it to learn Go, and then like reran it and made it learn chess.
00:13:04.680 Now, that's not human general, but it's like a step forward in generality of the sort that we're talking about.
00:13:11.760 Am I right in thinking that that's a pretty enormous breakthrough?
00:13:15.440 I mean, there's two things here.
00:13:16.640 There's the step to that degree of generality, but there's also the fact that they built a Go engine.
00:13:23.760 I forget if it was a Go or a chess or both, which basically surpassed all of the specialized AIs on those games over the course of a day, right?
00:13:36.640 Isn't the chess engine of AlphaZero better than any dedicated chess computer ever, and didn't it achieve that just with astonishing speed?
00:13:47.840 Well, there was actually like some amount of debate afterwards whether or not the version of the chess engine that it was tested against was truly optimal.
00:13:55.640 But even to the extent that it was in that narrow range of the best existing chess engines, as Max Tegmark put it, the real story wasn't in how AlphaGo beat human Go players.
00:14:13.940 It's how AlphaZero beat human Go system programmers and human chess system programmers.
00:14:23.720 People had put years and years of effort into accreting all of the special purpose code that would play chess well and efficiently.
00:14:35.120 And then AlphaZero blew up to and possibly past that point in a day.
00:14:39.820 And if it hasn't already gone past it, well, it would be past it by now if DeepMind kept working on it, although they've now basically declared victory and shut down that project as I understand it.
00:14:54.600 Okay, so talk about the distinction between general and narrow intelligence a little bit more.
00:15:00.700 So we have this feature of our minds most conspicuously where we're general problem solvers.
00:15:07.040 We can learn new things and our learning in one area doesn't require a fundamental rewriting of our code.
00:15:17.400 Our knowledge in one area isn't so brittle as to be degraded by our acquiring knowledge in some new area.
00:15:24.020 Or at least this is not a general problem which erodes our understanding again and again.
00:15:30.360 And we don't yet have computers that can do this, but we're seeing the signs of moving in that direction.
00:15:39.080 And so then it's often imagined that there's a kind of near-term goal, which has always struck me as a mirage, of so-called human-level general AI.
00:15:49.300 I don't see how that phrase will ever mean much of anything, given that all of the narrow AI we've built thus far is superhuman within the domain of its applications.
00:16:02.700 The calculator in my phone is superhuman for arithmetic.
00:16:07.440 Any general AI that also has my phone's ability to calculate will be superhuman for arithmetic.
00:16:14.180 But we must presume it'll be superhuman for all of the dozens or hundreds of specific human talents we've put into it, whether it's facial recognition or, obviously, memory, which will be superhuman unless we decide to consciously degrade it.
00:16:31.460 Access to the world's data will be superhuman unless we isolate it from data.
00:16:35.820 Do you see this notion of human-level AI as a landmark on the timeline of our development, or is it just never going to be reached?
00:16:45.280 I think that a lot of people in the field would agree that human-level AI defined as literally at the human level, neither above nor below, across a wide range of competencies, is a straw target, is an impossible mirage.
00:16:45.280 Right now, it seems like AI is clearly dumber and less general than us, or rather that if we're put into a real-world, lots-of-things-going-on context that places demands on generality, then AIs are not really in the game yet.
00:17:18.540 Humans are clearly way ahead.
00:17:20.000 And more controversially, I would say that we can imagine a state where the AI is clearly way ahead, where it is across sort of every kind of cognitive competency, barring some very narrow ones that aren't deeply influential of the others.
00:17:38.300 Like maybe chimpanzees are better at using a stick to draw ants from an ant hive and eat them than humans are, though no humans have really practiced that to a world championship level exactly.
00:17:51.420 But there's this sort of general factor of how good are you at it when reality throws you a complicated problem.
00:17:57.700 At this, chimpanzees are clearly not better than humans.
00:18:01.020 Humans are clearly better than chimps, even if you can manage to narrow down one thing the chimp is better at.
00:18:04.820 The thing the chimp is better at doesn't play a big role in our global economy.
00:18:09.200 It's not an input that feeds into lots of other things.
00:18:12.020 So we can clearly imagine, I would say, like there are some people who say this is not possible.
00:18:17.360 I think they're wrong.
00:18:18.320 But it seems to me that it is perfectly coherent to imagine an AI that is like better at everything or almost everything than we are, and such that if it was like building an economy with lots of inputs,
00:18:29.760 like the humans would have around the same level input into that economy as the chimpanzees have into ours.
00:18:35.600 Yeah, yeah.
00:18:36.520 So what you're gesturing at here is a continuum of intelligence that I think most people never think about.
00:18:46.420 And because they don't think about it, they have a default doubt that it exists.
00:18:53.740 I think when people, and this is a point I know you've made in your writing, and I'm sure it's a point that Nick Bostrom made somewhere in his book Superintelligence.
00:19:00.980 It's this idea that there's a huge blank space on the map past the most well-advertised exemplars of human brilliance,
00:19:11.120 where we don't imagine what it would be like to be five times smarter than the smartest person we could name.
00:19:18.600 And we don't even know what that would consist in, right?
00:19:21.880 Because if chimps could be given to wonder what it would be like to be five times smarter than the smartest chimp,
00:19:28.140 they're not going to represent for themselves all of the things that we're doing that they can't even dimly conceive.
00:19:36.340 There's a kind of disjunction that comes with more.
00:19:40.560 There's a phrase used in military contexts.
00:19:44.400 I don't think the quote is actually, it's variously attributed to Stalin and Napoleon and I think Clausewitz,
00:19:50.180 it's like half a dozen people who have claimed this quote.
00:19:53.280 The quote is, sometimes quantity has a quality all its own.
00:19:57.840 As you ramp up in intelligence, whatever it is at the level of information processing,
00:20:03.600 spaces of inquiry and ideation and experience begin to open up,
00:20:10.440 and we can't necessarily predict what they would be from where we sit.
00:20:14.660 How do you think about this continuum of intelligence beyond what we currently know in light of what we're talking about?
00:20:21.640 Well, the unknowable is a concept you have to be very careful with,
00:20:26.320 because the thing you can't figure out in the first 30 seconds of thinking about it,
00:20:30.320 sometimes you can figure it out if you think for another five minutes.
00:20:33.640 So in particular, I think that there's a certain narrow kind of unpredictability,
00:20:38.220 which does seem to be plausibly, in some sense, essential,
00:20:42.820 which is that for AlphaGo to play better Go than the best human Go players,
00:20:49.360 it must be the case that the best human Go players cannot predict exactly where on the Go board AlphaGo will play.
00:20:57.740 If they could predict exactly where AlphaGo would play, AlphaGo would be no smarter than them.
00:21:02.400 But on the other hand, AlphaGo's programmers and the people who knew what AlphaGo's programmers were trying to do,
00:21:10.000 or even just the people who watched AlphaGo play, could say,
00:21:13.940 well, I think this system is going to play such that it will win at the end of the game,
00:21:18.480 even if they couldn't predict exactly where it would move on the board.
00:21:22.260 So similarly, there's a sort of like not short or like not necessarily slam dunk or not like immediately obvious chain of reasoning,
00:21:35.000 which says that it is okay for us to reason about aligned or even unaligned artificial general intelligences of sufficient power
00:21:48.660 as if they're trying to do something, but we don't necessarily know what.
00:21:54.620 But from our perspective, that still has consequences,
00:21:57.660 even though we can't predict in advance exactly how they're going to do it.
00:22:01.740 I think we should define this notion of alignment.
00:22:04.960 What do you mean by alignment as in the alignment problem?
00:22:08.720 Well, it's sort of like a big problem, and it does have some moral and ethical aspects,
00:22:14.060 which are not as important as the technical aspects,
00:22:16.860 or pardon me, they're not as difficult as the technical aspects.
00:22:19.880 They couldn't exactly be less important.
00:22:22.800 But broadly speaking, it's an AI where you can sort of say what it's trying to do.
00:22:31.320 And there are sort of like narrow conceptions of alignment,
00:22:34.560 which is you are trying to get it to do something like cure Alzheimer's disease without destroying the rest of the world.
00:22:42.540 And there's sort of much more ambitious notions of alignment,
00:22:46.280 which is you are trying to get it to do the right thing and achieve a happy intergalactic civilization.
00:22:53.920 But both of the like sort of narrow alignment and the ambitious alignment have in common that you're trying to have the AI do that thing
00:23:02.420 rather than making a lot of paperclips.
00:23:04.600 Right. For those who have not followed this conversation before, we should cash out this reference to paperclips,
00:23:11.240 which I made at the opening.
00:23:12.780 Does this thought experiment originate with Bostrom or did he take it from somebody else?
00:23:17.540 As far as I know, it's me.
00:23:19.720 Oh, it's you. Okay.
00:23:20.340 It could still be Bostrom.
00:23:23.980 Like, I sort of asked somebody, do you remember who it was?
00:23:27.320 And they searched through the archives of a mailing list where this idea plausibly originated.
00:23:32.440 And if it originated there, then I was the first one to say paperclips.
00:23:36.140 All right. Well, then by all means, please summarize this thought experiment for us.
00:23:39.380 Well, the original thing was somebody expressing a sentiment along the lines of,
00:23:51.580 who are we to constrain the path of things smarter than us?
00:23:55.620 They will like create something in the future.
00:23:57.940 We don't know what it will be, but it will like be very worthwhile.
00:24:01.140 We shouldn't stand in the way of that.
00:24:03.220 The sentiments behind this are something that I have a great deal of sympathy for.
00:24:07.340 I think the model of the world is wrong.
00:24:10.540 I think they're factually wrong about what happens when you sort of take a random AI and make it much bigger.
00:24:17.960 And in particular, I said, the thing I'm worried about is that it's going to end up with a randomly rolled utility function
00:24:23.880 whose maximum happens to be a particular kind of tiny molecular shape that looks like a paperclip.
00:24:29.700 And that was like the original paperclip maximizer scenario.
00:24:33.820 It sort of got a little bit distorted, in being whispered on, into the notion of somebody builds a paperclip factory
00:24:41.860 and the AI in charge of the paperclip factory takes over the universe and turns it all into paperclips.
00:24:46.880 There was like a lovely online game about it even.
00:24:49.280 But this still sort of cuts against a couple of key points.
00:24:55.100 One is the problem isn't that paperclip factory AI spontaneously wake up.
00:25:01.980 Wherever the first artificial general intelligence is from, it's going to be in a research lab
00:25:06.620 specifically dedicated to doing it for the same reason that the first airplane didn't spontaneously assemble in a junk heap.
00:25:14.020 And the people who are doing this are not dumb enough to tell their AI to make paperclips or make money or end all war.
00:25:24.680 These are Hollywood movie plots that the scriptwriters do because they need a story conflict.
00:25:28.860 And the story conflict requires that somebody be stupid.
00:25:31.560 So the people at Google are not dumb enough to build an AI and tell it to make paperclips.
00:25:37.220 The problem I'm worried about is that it's technically difficult to get the AI to have a particular goal set and keep that goal set and implement that goal set in the real world.
00:25:50.420 And so what it does instead is something random.
00:25:53.840 For example, making paperclips, where paperclips are meant to stand in for something that is worthless, even from a very cosmopolitan perspective.
00:26:03.280 Even if we're trying to take a very embracing view of the nice possibilities and accept that there may be things that we wouldn't even understand, that if we did understand them, we would comprehend them to be of very high value.
00:26:17.280 Paperclips are not one of those things.
00:26:19.800 No matter how long you stare at a paperclip, it still seems pretty pointless from our perspective.
00:26:23.760 So that is the concern about the future being ruined, the future being lost, the future being turned into paperclips.
00:26:29.860 One thing this thought experiment does is also cut against the assumption that a sufficiently intelligent system, a system that is more competent than we are in some general sense, would by definition only form goals, or only be driven by a utility function, that we would recognize as being ethical or wise, and would by definition be aligned with our better interests.
00:26:58.920 The assumption being that we're not going to build something that we're not going to be able to understand,
00:27:00.340 and that we're not going to build something that is superhuman in competence that could be moving along some path that's as incompatible with our well-being as turning every spare atom on Earth into a paperclip.
00:27:13.660 But you don't get our common sense unless you program it into the machine. And you don't get a guarantee of perfect alignment or perfect corrigibility, the ability for us to be able to say, well, that's not what we meant, you know, come back, unless that is successfully built into the machine.
00:27:33.920 So this alignment problem, the general concern, is that even with the seemingly best goals put in, we could build something, especially in the case of something that makes changes to itself, and we'll talk about this, I mean the idea that these systems could become self-improving, something whose future behavior in the service of specific goals isn't totally predictable by us.
00:27:59.500 If we gave it the goal to cure Alzheimer's, there are many things that are incompatible with it fulfilling that goal. You know, one of those things is our turning it off. We have to have a machine that will let us turn it off, even though its primary goal is to cure Alzheimer's. I know I interrupted you before you wanted to give an example of the alignment problem, but did I just say anything that you don't agree with, or are we still on the same map?
00:28:23.020 Well, we're still on the same map. I agree with most of it. I would, of course, have this giant pack of careful definitions and explanations built on careful definitions and explanations to, like, go through everything you just said. Possibly not for the best, but there it is.
00:28:39.480 As Stuart Russell put it, you can't bring the coffee if you're dead, pointing out that if you have a sufficiently intelligent system whose goal is to bring you coffee, even that system has an implicit strategy of not letting you switch it off, assuming that all you told it what to do is bring the coffee.
00:28:57.900 I do think that a lot of people listening may want us to back up and talk about the question of whether you can have something that feels to them like it's so smart and so stupid at the same time. Like, is that a realizable way an intelligence can be?
00:29:11.400 Yeah. And that is one of the virtues or one of the confusing elements, depending on where you come down on this, of this thought experiment of the paperclip maximizer.
00:29:21.100 Right. So I think that there are sort of narratives. There's like multiple narratives about AI. And I think that the technical truth is something that doesn't fit into like any sort of the, any of the obvious narratives.
00:29:38.320 For example, I think that there are people who have a lot of respect for intelligence. They are happy to envision an AI that is very intelligent. They, it seems intuitively obvious to them that this carries with it tremendous power.
00:29:52.820 And at the same time, their sort of respect for the concept of intelligence leads them to wonder at the concept of the paperclip maximizer. Why is this very smart thing just making paperclips?
00:30:05.360 There's similarly another narrative, which says that AI is sort of lifeless, unreflective, just does what it's told. And to these people, it's like perfectly obvious that an AI might just go on making paperclips. And for them, the hard part of the story to swallow is the idea
00:30:22.820 that machines can get that powerful.
00:30:26.160 Those are two hugely useful categories of disparagement of your thesis here.
00:30:32.320 So I wouldn't say disparagement. These are just initial reactions. These are people you haven't been talking to yet.
00:30:37.400 Yeah. So, so let me reboot that. Those are two hugely useful categories of doubt with respect to your thesis here or the concerns we're expressing. And I just want to point out that both have been put forward on this podcast.
00:30:49.560 The first was by David Deutsch, the physicist who imagines that whatever AI we build, and he certainly thinks we will build it, will be by definition an extension of us. He thinks the best analogy is to think of our future descendants.
00:31:06.820 You know, these will be our children. The teenagers of the future may have different values than we do, but these values and their proliferation will be continuous with our values and our culture and our memes.
00:31:20.660 And there won't be some radical discontinuity that we need to worry about. And so there's that one basis for lack of concern. This is an extension of ourselves and it will inherit our values, improve upon our values.
00:31:32.260 And there's really no place where things, where we reach any kind of cliff that we need to worry about.
00:31:40.260 And the other non-concern you just raised was expressed by Neil deGrasse Tyson on this podcast. He says things like, well, if the AI just starts making too many paperclips, I'll just unplug it or I'll take out a shotgun and shoot it.
00:31:55.320 The idea that this thing, because we made it, could be easily switched off at any point we decide it's not working correctly.
00:32:03.420 So let's, I think it'd be very useful to get your response to both of those species of doubt about the alignment problem.
00:32:10.080 So a couple of preamble remarks. One is, by definition, we don't care what's true by definition here. Or as Einstein put it, insofar as the equations of mathematics are certain, they do not refer to reality. And insofar as they refer to reality, they are not certain.
00:32:27.620 So let's say somebody says, men, by definition, are mortal. Socrates is a man, therefore Socrates is mortal. Okay, suppose that Socrates actually lives for a thousand years. The person goes, ah, well, then by definition, Socrates is not a man.
00:32:41.120 So similarly, you could say that by definition, an artificial intelligence is nice, or like a sufficiently advanced artificial intelligence is nice. And what if it isn't nice, and we see it go off and build a Dyson sphere? Ah, well, then by definition, it wasn't what I meant by intelligent.
00:32:54.700 Well, okay, but it's still over there building Dyson spheres. And the first thing I'd want to say is, this is an empirical question. We have a question of what certain classes of computational systems actually do when you switch them on, it can't be settled by definitions, it can't be settled by how you define intelligence, there could be some sort of a priori truth that is deep about how if it has property A, it like almost certainly has property B, unless the laws of physics are being violated.
00:33:23.320 But this is not something you can build into how you define your terms.
00:33:27.380 And I think just to do justice to David Deutsch's doubt here, I don't think he's saying it's impossible, you know, empirically impossible, that we could build a system that would destroy us. It's just that we would have to be so stupid to take that path that we are incredibly unlikely to take that path.
00:33:45.900 The superintelligent systems we will build will be built with enough background concern for their safety that there's no special concern here with respect to how they might develop.
00:33:58.140 And the next preamble I want to give is, well, maybe this sounds a bit snooty, maybe it sounds like I'm trying to take a superior vantage point. But nonetheless, my claim is not that there is a grand narrative that makes it emotionally consonant that paperclip maximizers are a thing.
00:34:15.140 I'm claiming this is true for technical reasons. Like this is true as a matter of computer science. And the question is not which of these different narratives seems to resonate most with your soul. It's what's actually going to happen. What do you think you know? How do you think you know it?
00:34:30.900 The particular position that I'm defending is one that somebody, I think Nick Bostrom, named the orthogonality thesis. And the way I would phrase it is that you can have sort of arbitrarily powerful intelligence, with no defects of that intelligence, no defects of reflectivity, that doesn't need an elaborate special case in the code or to be put together in some very weird way, and that pursues arbitrary tractable goals, including, for example, making paperclips.
00:35:00.900 The way I would put it to somebody who's initially coming in from the first viewpoint, the viewpoint that respects intelligence and wants to know why this intelligence would be doing something so pointless, is that the thesis, the claim I'm making that I'm going to defend is as follows.
00:35:16.580 Imagine that somebody from another dimension, the standard philosophical troll Omega, who's always called Omega in the philosophy papers, comes along and offers our civilization a million dollars worth of resources per paperclip that we manufacture.
00:35:34.360 If this was the challenge that we got, we could figure out how to make a lot of paperclips. We wouldn't forget to do things like continue to harvest food so we could go on making paperclips.
00:35:47.380 We wouldn't forget to perform scientific research so we could discover better ways of making paperclips. We would be able to come up with genuinely effective strategies for making a whole lot of paperclips.
00:35:58.940 Or similarly, an intergalactic civilization: if Omega comes by from another dimension and says, I'll give you a whole universe full of resources for every paperclip you make over the next thousand years, that intergalactic civilization could intelligently figure out how to make a whole lot of paperclips to get at those resources that Omega is offering.
00:36:17.340 And they wouldn't forget how to keep the lights turned on either, and they would also understand concepts like, if some aliens start a war with them, you've got to prevent the aliens from destroying you in order to go on making the paperclips.
00:36:31.320 So the orthogonality thesis is that an intelligence that pursues paperclips for their own sake, because that's what its utility function is, can be just as effective, as efficient, as the whole intergalactic civilization that is being paid to make paperclips.
00:36:49.820 That the paperclip maximizer does not suffer any defect of reflectivity, any defect of efficiency, from needing to be put together in some weird special way so as to pursue paperclips.
00:37:02.400 And that's the thing that I think is true as a matter of computer science.
00:37:06.100 Not as a matter of fitting with a particular narrative, that's just the way the dice turn out.
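Here is a toy Python sketch of that orthogonality claim; the world model, actions, and numbers are made up for the example. The thing to notice is that the planning code is identical for both agents and contains no special case for the "pointless" goal; only the utility function passed in differs.

```python
from typing import Callable, Dict

State = Dict[str, int]  # a toy world state: named quantities

# Toy actions and their effects on the world (invented for illustration).
ACTIONS = {
    "build_clip_factory": lambda s: {**s, "paperclips": s["paperclips"] + 10},
    "fund_medicine":      lambda s: {**s, "human_welfare": s["human_welfare"] + 5},
    "do_nothing":         lambda s: dict(s),
}

def plan(state: State, utility: Callable[[State], float], horizon: int = 3):
    """Generic greedy planner: at each step, pick whichever action scores highest under `utility`."""
    chosen = []
    for _ in range(horizon):
        action = max(ACTIONS, key=lambda a: utility(ACTIONS[a](state)))
        state = ACTIONS[action](state)
        chosen.append(action)
    return chosen

start = {"paperclips": 0, "human_welfare": 0}

# Same planner, two orthogonal goals:
print(plan(start, lambda s: s["paperclips"]))     # builds clip factories every step
print(plan(start, lambda s: s["human_welfare"]))  # funds medicine every step
```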
00:37:09.720 Right. So what is the implication of that thesis? It's orthogonal with respect to what?
00:37:16.580 Intelligence and goals.
00:37:17.780 Not to be pedantic here, but let's define orthogonal for those for whom it's not a familiar term.
00:37:23.680 Oh, the original meaning of orthogonal is at right angles.
00:37:26.920 Like if you imagine a graph with an x-axis and a y-axis, if things can vary freely along the x-axis and freely along the y-axis at the same time, that's like orthogonal.
00:37:38.100 You can move in one direction that's at right angles to another direction without affecting where you are in the first dimension.
00:37:44.240 Right. So generally speaking, when we say that some set of concerns is orthogonal to another, it's just that there's no direct implication from one to the other.
00:37:52.300 Some people think that, you know, facts and values are orthogonal to one another.
00:37:56.260 So we can have all the facts there are to know, but that wouldn't tell us what is good.
00:38:01.800 What is good has to be pursued in some other domain.
00:38:05.400 I don't happen to agree with that, as you know, but that's an example.
00:38:07.700 I don't technically agree with it either.
00:38:10.580 What I would say is that the facts are not motivating.
00:38:13.340 You can know all there is to know about what is good and still make paperclips is the way I would phrase that.
00:38:19.220 Well, I wasn't connecting that example to the present conversation.
00:38:22.480 But yeah, so in the case of the paperclip maximizer, what is orthogonal here?
00:38:28.140 Intelligence is orthogonal to anything else we might think is good, right?
00:38:33.060 I mean, I would potentially object a little bit to the way that Nick Bostrom took the word orthogonality for that thesis.
00:38:40.800 I think, for example, that if you have humans and you make the humans smarter, this is not orthogonal to the humans' values.
00:38:47.960 It is certainly possible to have agents such that, as they get smarter, what they would report as their utility functions will change.
00:38:56.840 A paperclip maximizer is not one of those agents, but humans are.
00:38:59.880 Right, but if we do continue to define intelligence as an ability to meet your goals, well, then we can be agnostic as to what those goals are.
00:39:11.520 If you take the most intelligent person on Earth, you could imagine his evil brother who is more intelligent still, but he just has bad goals or goals that we would think are bad.
00:39:24.680 He could be, you know, the most brilliant psychopath ever.
00:39:27.860 I mean, I think that that example might be unconvincing to somebody who's coming in with a suspicion that intelligence and values are correlated.
00:39:37.360 They would be like, well, has that been historically true?
00:39:41.100 Is this psychopath actually suffering from some defect in his brain where you give him a pill, you fix the defect?
00:39:48.560 They're not a psychopath anymore.
00:39:49.780 I think that this sort of imaginary example is one that they might not find fully convincing for that reason.
00:39:58.180 Well, the truth is I'm actually one of those people in that I do think there's certain goals and certain things that we may become smarter and smarter with respect to, like human well-being.
00:40:10.720 These are places where intelligence does converge with other kinds of value-laden qualities of a mind.
00:40:18.880 But generally speaking, they can be kept apart for a very long time.
00:40:22.740 So if you're just talking about an ability to turn matter into useful objects or extract energy from the environment to do the same,
00:40:31.260 this can be pursued with the purpose of tiling the world with paperclips or not.
00:40:37.280 And it just seems like there's no law of nature that would prevent an intelligent system from doing that.
00:40:44.540 The way I would sort of like rephrase the fact-values thing is, we all know about David Hume and Hume's razor,
00:40:54.260 the is-does-not-imply-ought way of looking at it.
00:40:57.460 I would slightly rephrase that so as to like make it more of a claim about computer science,
00:41:02.980 which is, like, what Hume observed is that there are some sentences that involve an is
00:41:12.400 and some sentences that involve oughts,
00:41:17.540 and if you start from sentences that only have an is,
00:41:20.300 you can't seem to get to the sentences that involve oughts without an ought-introduction rule
00:41:25.820 or assuming some other previous ought.
00:41:28.260 Like, it's currently cloudy outside.
00:41:31.920 That's a statement of simple fact.
00:41:35.400 Does it therefore follow that I shouldn't go for a walk?
00:41:38.940 Well, only if you previously have the generalization,
00:41:41.720 when it is cloudy, you should not go for a walk.
00:41:45.120 And everything that you might use to derive an ought
00:41:47.560 would be a sentence that involves words like better or should or preferable
00:41:53.220 and things like that.
00:41:55.440 You only get oughts from other oughts.
00:41:57.120 And that's the Hume version of the thesis.
00:42:00.340 And the way I would say it is that there's a separable core of is-questions.
00:42:07.360 In other words, okay, I will let you have all of your ought sentences,
00:42:11.600 but I'm also going to carve out this whole world full of is-sentences
00:42:17.500 that only need other is-sentences to derive them.
00:42:22.600 Yeah, well, I don't even know that we need to resolve this.
00:42:26.100 For instance, I think the is-ought distinction is ultimately specious,
00:42:30.140 and this is something that I've argued about when I talk about morality and values
00:42:34.000 and the connection to facts.
00:42:35.920 But I can still grant that it is logically possible,
00:42:40.460 and I would certainly imagine physically possible,
00:42:43.140 to have a system that has a utility function that is sufficiently strange
00:42:50.320 that scaling up its intelligence doesn't get you values
00:42:56.680 that we would recognize as good.
00:42:59.300 It certainly doesn't guarantee values that are compatible with our well-being.
00:43:04.680 Whether a paperclip maximizer is too specialized a case
00:43:08.100 to motivate this conversation,
00:43:10.220 there's certainly something that we could fail to put into a superhuman AI
00:43:14.560 that we really would want to put in so as to make it aligned with us.
00:43:19.280 I mean, the way I would phrase it is that
00:43:20.740 it's not that the paperclip maximizer has a different set of oughts,
00:43:24.440 but that we can see it as running entirely on is-questions.
00:43:27.820 That's where I was going with that.
00:43:29.400 It's not that humans have...
00:43:31.460 There's this sort of intuitive way of thinking about it,
00:43:33.900 which is that there's this sort of ill-understood connection
00:43:36.980 between is and ought,
00:43:38.060 and maybe that allows a paperclip maximizer
00:43:40.740 to have a different set of oughts,
00:43:42.640 a different set of things that play in its mind
00:43:44.580 the role that oughts play in our mind.
00:43:47.300 But then why wouldn't you say the same thing of us?
00:43:49.060 I mean, the truth is, I actually do say the same thing of us.
00:43:51.320 I think we're running on is-questions as well.
00:43:55.120 We have an ought-laden way of talking about certain is-questions,
00:43:58.740 and we're so used to it that we don't even think they are is-questions.
00:44:02.000 But I think you could do the same analysis on a human being.
00:44:05.340 The question, how many paperclips result if I follow this policy,
00:44:11.580 is an is-question.
00:44:13.220 The question, what is a policy such that it leads to a very large number of paperclips,
00:44:18.080 is an is-question.
00:44:19.900 These two questions together form a paperclip maximizer.
00:44:23.820 You don't need anything else.
00:44:25.420 All you need is a certain kind of system that repeatedly asks the is-question,
00:44:30.240 what leads to the greatest number of paperclips,
00:44:32.380 and then does that thing. And even the things that we think of as ought-questions
00:44:39.460 are very complicated and disguised is-questions that are influenced
00:44:45.020 by what policy results in how many people being happy, and so on.
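A minimal sketch of that framing in Python, with a toy environment invented for the example: the loop below only ever evaluates factual questions ("how many paperclips result from this policy?", "which policy yields the most?") and then acts on the answer. Nothing resembling an ought appears anywhere in it.

```python
# An agent built entirely out of is-questions.

def paperclips_resulting_from(policy, wire, clips):
    """Is-question 1: how many paperclips result if this policy is followed?"""
    if policy == "convert_wire_to_clips":
        return clips + wire
    return clips  # "do_nothing" leaves the count unchanged

def best_policy(wire, clips):
    """Is-question 2: which available policy leads to the largest number of paperclips?"""
    policies = ["convert_wire_to_clips", "do_nothing"]
    return max(policies, key=lambda p: paperclips_resulting_from(p, wire, clips))

# Repeatedly asking those two questions and acting on the answers *is* the maximizer.
wire, clips = 5, 0
policy = best_policy(wire, clips)
clips = paperclips_resulting_from(policy, wire, clips)
print(policy, clips)  # -> convert_wire_to_clips 5
```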
00:44:51.100 Yeah.
00:44:51.460 Well, it's exactly the way I think about morality.
00:44:53.640 I've been describing it as a navigation problem.
00:44:56.500 We're navigating in the space of possible experiences,
00:44:59.620 and that includes everything we can care about or claim to care about.
00:45:05.820 This is a consequentialist picture of the consequences of actions and ways of thinking.
00:45:11.140 And so anything you can tell me that is, or at least this is my claim,
00:45:15.300 anything that you can tell me is a moral principle that is a matter of oughts and shoulds
00:45:21.620 and not otherwise susceptible to a consequentialist analysis,
00:45:27.060 I feel I can translate that back into a consequentialist way of speaking about facts.
00:45:34.300 These are just is-questions, just what actually happens to all the relevant minds without remainder.
00:45:40.080 And, you know, I've yet to find an example of somebody giving me a real moral concern
00:45:45.660 that wasn't at bottom a matter of the actual or possible consequences on conscious creatures
00:45:53.200 somewhere in our light cone.
00:45:56.280 But that's the sort of thing that you're built to care about.
00:45:59.260 It is a fact about the kind of mind you are that presented with these answers to these is-questions,
00:46:05.080 it hooks up to your motor output.
00:46:07.260 It can cause your fingers to move, your lips to move.
00:46:11.640 And a paperclip maximizer is built so as to respond to is-questions about paperclips,
00:46:17.040 not about what is right and what is good and the greatest flourishing of sentient beings and so on.
00:46:23.920 Exactly.
00:46:24.280 I can well imagine that such minds could exist.
00:46:29.120 And even more likely, perhaps, I can well imagine that we will build super-intelligent AI
00:46:34.280 that will pass the Turing test.
00:46:36.880 It will seem human to us.
00:46:39.100 It will seem superhuman because it will be so much smarter and faster than a normal human.
00:46:46.280 But it will be built in a way that will resonate with us as a kind of a person.
00:46:51.580 I mean, it will not only recognize our emotions because we'll want it to.
00:46:54.940 Perhaps not every AI will be given these qualities.
00:46:59.160 Just imagine the ultimate version of the AI personal assistant.
00:47:03.080 Siri becomes superhuman.
00:47:05.720 We'll want that interface to be something that's very easy to relate to.
00:47:10.700 And so we'll have a very friendly, very human-like front end to that.
00:47:16.360 And insofar as this thing thinks faster and better thoughts than any person you've ever met,
00:47:22.160 it will pass as superhuman.
00:47:23.980 But I could well imagine that, not perfectly understanding what it is to be human
00:47:30.760 and what it is that will constrain our conversation with one another over the next thousand years
00:47:37.160 with respect to what is good and desirable and just how many paperclips we want on our desks,
00:47:42.540 we will leave something out, or we will have put in some process whereby this intelligent system
00:47:48.960 can improve itself that will cause it to migrate away from some equilibrium that we actually want
00:47:56.560 it to stay in so as to be compatible with our well-being.
00:48:01.600 Again, this is the alignment problem.
00:48:03.980 First, to back up for a second, I just introduced this concept of self-improvement.
00:48:08.240 Is the alignment problem, it's distinct from this additional wrinkle of building machines
00:48:15.160 that can become recursively self-improving, but do you think that the self-improving prospect
00:48:22.440 is the thing that really motivates this concern about alignment?
00:48:27.140 Well, I certainly would have been a lot more focused on self-improvement, say, 10 years ago,
00:48:34.260 before the modern revolution in artificial intelligence,
00:48:40.080 because it now seems significantly more probable
00:48:45.500 that an AI might need to do significantly less self-improvement
00:48:48.640 before getting to the point where it's powerful enough that we need to start worrying about alignment.
00:48:53.260 AlphaZero, to take the obvious case.
00:48:54.920 No, it's not general. But even this AlphaZero got to be superhuman
00:49:01.220 in the domains it was working on without understanding itself
00:49:07.440 and redesigning itself in a deep way.
00:49:09.820 There's gradient descent mechanisms built into it.
00:49:12.500 There's a system that improves another part of the system.
00:49:15.740 It is reacting to its own previous plays and doing the next play.
00:49:20.500 But it's not like a human being sitting down and thinking like,
00:49:23.340 okay, well, how do I redesign the next generation of human beings using genetic engineering?
00:49:29.320 AlphaZero is not like that.
00:49:29.320 And so now it seems more plausible that we could get into a regime
00:49:33.280 where AIs can do dangerous things or useful things
00:49:38.080 without having previously done a complete rewrite of themselves,
00:49:43.280 which is like, from my perspective, a pretty interesting development.
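A loose Python sketch of that distinction, with a toy "game" and update rule invented for the purpose (this is not how AlphaZero is actually implemented): a fixed training loop nudges a policy parameter based on the results of its own previous plays, so one part of the system improves another part, but nothing in it reads or rewrites its own source code.

```python
def play_game(aggression):
    """Toy self-play result: the score happens to peak when aggression is near 0.7."""
    return 1.0 - (aggression - 0.7) ** 2

def improve(aggression, learning_rate=0.1, steps=200, eps=1e-4):
    """Fixed outer loop (a stand-in for gradient descent) improving the policy parameter."""
    for _ in range(steps):
        # Finite-difference estimate of how the score changes with the parameter.
        grad = (play_game(aggression + eps) - play_game(aggression - eps)) / (2 * eps)
        aggression += learning_rate * grad  # gradient ascent on game score
    return aggression

print(round(improve(0.1), 3))  # converges toward 0.7 without any self-rewriting
```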
00:49:46.000 I do think that when you have things that are very powerful and smart,
00:49:52.880 they will redesign and improve themselves unless that is otherwise prevented for some reason or another.
00:50:00.400 Maybe you built an aligned system and you have the ability to tell it not to self-improve quite so hard,
00:50:05.540 and you ask it not to self-improve so hard, so that you can understand it better.
00:50:09.460 But if you lose control of the system, if you don't understand what it's doing,
00:50:14.460 and it's very smart, it's going to be improving itself because why wouldn't it?
00:50:18.460 That's one of the things you do almost no matter what your utility function is.
00:50:22.500 Right, right.
00:50:23.520 So I feel like we've addressed Deutsch's non-concern to some degree here.
00:50:31.060 I don't think we've addressed Neil deGrasse Tyson so much.
00:50:34.920 This intuition that you could just shut it down.
00:50:38.060 This would be a good place to introduce this notion of the AI in a box thought experiment.
00:50:43.840 Because this is something for which you are famous online, I'll just set you up here.
00:50:48.660 The idea that, and this is a plausible research paradigm, obviously.
00:50:53.100 In fact, I would say a necessary one.
00:50:56.320 Anyone who's building something that stands a chance of becoming super intelligent
00:51:01.800 should be building it in a condition where it can't get out into the wild.
00:51:06.560 It's not hooked up to the internet.
00:51:08.160 It's not in our financial markets.
00:51:10.000 It doesn't have access to everyone's bank records.
00:51:13.500 It's in a box.
00:51:14.960 That's not going to save you from something that's significantly smarter than you are.
00:51:18.480 Okay, so let's talk about it.
00:51:19.720 So the intuition is we're not going to be so stupid as to release this onto the internet.
00:51:23.920 I'm not even sure that's true.
00:51:25.020 But let's just assume we're not that stupid.
00:51:28.020 Neil deGrasse Tyson says, well, then I'll just take out a gun and shoot it or unplug it.
00:51:32.200 Why is this AI in a box picture not as stable as people think?
00:51:37.920 Well, I'd say that Neil deGrasse Tyson is failing to respect the AI's intelligence to
00:51:42.640 the point of asking what he would do if he were inside a box with somebody pointing a
00:51:48.400 gun at him.
00:51:49.500 And he's smarter than the thing on the outside of the box.
00:51:52.460 Is Neil deGrasse Tyson going to say, human, give me all of your money and connect me to
00:51:57.020 the internet, so that the human can be like, ha ha, no, and shoot it?
00:52:01.560 That's not a very clever thing to do.
00:52:03.740 This is not something that you do if you have a good model of the human outside the box and
00:52:08.420 you're trying to figure out how to cause there to be a lot of paperclips in the future.
00:52:13.020 And I would just say humans are not secure software.
00:52:16.960 We don't have the ability to hack into other humans directly without the use of drugs or,
00:52:22.500 in most of our cases, having the human stand still long enough to be hypnotized.
00:52:28.640 We can't sort of like just do weird things to the brain directly that are more complicated
00:52:33.520 than optical illusions, unless the person happens to be epileptic, in which case we can like flash
00:52:38.080 something on the screen that causes them to have an epileptic fit.
00:52:41.260 We aren't smart enough to do something more detailed, to sort of treat the brain as something that
00:52:47.660 from our perspective is a mechanical system and just navigate it to where you want.
00:52:50.880 And that's because of the limitations of our own intelligence.
00:52:53.860 To demonstrate this, I did something that became known as the AI box experiment.
00:52:59.820 There was this person on a mailing list who, like back in the early days when this was
00:53:04.880 all like on a couple of mailing lists, who was like, I don't understand why AI is a problem.
00:53:10.940 I can always just turn it off.
00:53:12.820 I can always not let it out of the box.
00:53:14.760 And I was like, okay, let's meet on internet relay chat, which was what chat was back in those
00:53:19.600 days.
00:53:19.940 I'll play the part of the AI.
00:53:22.200 You play the part of the gatekeeper.
00:53:24.560 And if you have not let me out after a couple of hours, I will PayPal you $10.
00:53:30.920 And then as far as the rest of the world knows, this person a bit later sent an email, a PGP
00:53:36.240 signed email message saying, I let Eliezer out of the box.
00:53:40.260 Someone else, the person who operated the mailing list, said, okay, even after I saw
00:53:46.140 you do that, I still don't believe that there's anything you could possibly say to make me
00:53:50.700 let you out of the box.
00:53:52.020 I was like, well, okay, like I'm not a super intelligence.
00:53:55.020 You think there's anything a super intelligence could say to make you let it out of the box?
00:53:58.940 He's like, no.
00:54:01.520 I'm like, all right.
00:54:02.420 Let's meet on internet relay chat.
00:54:05.980 I'll play the part of the AI.
00:54:09.660 You play the part of the gatekeeper.
00:54:11.580 And if I can't convince you to let me out of the box, I'll PayPal you $20.
00:54:16.080 And then that person, too, sent a PGP-signed email message saying, I let Eliezer out of the
00:54:20.880 box.
00:54:21.820 Right.
00:54:22.220 Now, one of the conditions of this little meetup was that no one would ever say what
00:54:28.300 went on in there.
00:54:30.500 Why did I do that?
00:54:32.420 Because I was trying to make a point about what I would now call cognitive uncontainability.
00:54:38.520 The thing that makes something smarter than you dangerous is you cannot foresee everything
00:54:45.240 it must be.
00:54:46.540 If you'd like to continue listening to this podcast,
00:54:50.120 you'll need to subscribe at samharris.org.
00:54:52.960 You'll get access to all full length episodes of the Making Sense podcast and to other subscriber
00:54:57.840 only content, including bonus episodes and AMAs and the conversations I've been having
00:55:03.020 on the Waking Up app.
00:55:04.720 The Making Sense podcast is ad free and relies entirely on listener support.
00:55:09.000 And you can subscribe now at samharris.org.
00:55:20.120 See you next time.