Making Sense - Sam Harris - November 22, 2022


Making Sense of Artificial Intelligence | Episode 1 of The Essential Sam Harris


Episode Stats

Length

1 hour and 7 minutes

Words per Minute

164

Word Count

11,104

Sentence Count

267


Summary

In this episode of the Making Sense podcast, Sam Harris sits down with filmmaker Jay Shapiro to introduce The Essential Sam Harris, a new series that compiles and juxtaposes conversations from the podcast's archive around specific themes. Jay describes his own path to the project: discovering Sam's work through The End of Faith in the years after 9/11, studying Holocaust studies and moral philosophy in college, following the wider atheist and secularist conversation, and later directing the film built around Islam and the Future of Tolerance, Sam's book with Maajid Nawaz. The two discuss why so much of the catalog remains evergreen, how older conversations land differently today, Jay's own points of disagreement with Sam, and the interstitial narration that Jay has written and Megan Phelps-Roper reads. The first installment, on artificial intelligence, draws on conversations with guests including Eliezer Yudkowsky and Max Tegmark, and covers definitions of intelligence, the distinction between narrow and general AI, the continuum of intelligence beyond the human level, and the value alignment and containment problems. Further installments will take on consciousness, violence, belief, free will, morality, death, and other topics. If you're hearing the public feed, you'll only hear the first part of each episode; full episodes are available to subscribers at samharris.org. The podcast runs no ads and is made possible entirely through the support of its subscribers.


Transcript

00:00:00.000 Welcome to the Making Sense Podcast.
00:00:08.820 This is Sam Harris.
00:00:10.880 Just a note to say that if you're hearing this, you are not currently on our subscriber
00:00:14.680 feed and will only be hearing the first part of this conversation.
00:00:18.420 In order to access full episodes of the Making Sense Podcast, you'll need to subscribe at
00:00:22.720 samharris.org.
00:00:24.060 There you'll find our private RSS feed to add to your favorite podcatcher, along with
00:00:28.360 other subscriber-only content.
00:00:30.260 We don't run ads on the podcast, and therefore it's made possible entirely through the support
00:00:34.640 of our subscribers.
00:00:35.880 So if you enjoy what we're doing here, please consider becoming one.
00:00:47.120 I am here with Jay Shapiro.
00:00:48.920 Jay, thanks for joining me.
00:00:50.520 Thank you for having me.
00:00:51.660 So we have a fun project to talk about here.
00:00:54.640 And let's see if I can remember the genesis of this.
00:00:59.200 I think, you know, I woke up in the middle of the night one night realizing that more
00:01:03.900 or less my entire catalog of podcasts was, if not the entire thing, maybe, you know, conservatively
00:01:11.360 speaking, you know, 50% of all the podcasts were evergreen, which is to say that their content
00:01:17.760 was basically as good today as the day I recorded them.
00:01:21.860 But because of the nature of the medium, they would never be perceived as such, and people
00:01:26.580 really don't tend to go back into the catalog and listen to, you know, a three-year-old podcast.
00:01:32.580 And yet there's something insufficient about just recirculating them in my podcast feed or
00:01:39.840 elsewhere.
00:01:40.340 And so I and Jaron, my partner in crime here, we're trying to think about how to give all
00:01:49.200 of this content new life.
00:01:51.160 And then we thought of you just independently turning your creative intelligence loose on
00:01:59.320 the catalog.
00:02:00.380 And now I will properly introduce you as someone who should be doing that.
00:02:06.240 Perhaps you can introduce yourself.
00:02:07.480 Just tell us what you have done a lot of these many years and the kinds of things you've
00:02:12.480 focused on.
00:02:13.720 Yeah, well, I'm a filmmaker first and foremost, but I think my story and my genesis of being
00:02:21.420 maybe the right person to tap here is probably indicative or representative of a decent portion
00:02:28.360 of your audience.
00:02:28.920 I'm just guessing.
00:02:29.600 I'm 40 now, which pegs me in college when 9-11 hit.
00:02:34.620 I think it was late in my second year.
00:02:37.400 I guess it would have been early if it was September.
00:02:39.640 And, you know, I never heard of you at all at that point.
00:02:43.620 I was an atheist and just didn't think too much about that kind of stuff.
00:02:47.820 I was fully on board with any atheist things I saw coming across my world.
00:02:52.840 But then 9-11 hit and I was on a very, very liberal college campus and the kind of questions
00:02:58.560 that were popping up in my mind and I was asking myself were uncomfortable for me.
00:03:04.560 I just didn't know what to do with them.
00:03:05.840 I really had no formal philosophical training and I kind of just buried them, you know, under
00:03:11.340 under the weight of my own confusion or shame or just whatever kind of brew.
00:03:16.380 A lot of us were probably feeling at the time.
00:03:18.120 And then I discovered your work with The End of Faith, right when you sort of were responding
00:03:23.440 to the same thing.
00:03:25.140 And a lot of your language, you were philosophically trained and maybe sharper with your language
00:03:31.920 for better or worse, which we found out later was complicated, resonated with me.
00:03:37.540 And I started following along with your work and The Four Horsemen and Hitchens and Dawkins
00:03:42.880 and that sort of whole crowd.
00:03:44.720 And I'm sure I wasn't alone.
00:03:45.960 And then I paid close, special attention to what you were doing, which I actually included
00:03:52.720 in one of the pieces that I ended up putting together in this series.
00:03:56.860 But with a talk you gave in Australia, you know, I don't have to tell you about your career,
00:04:01.280 but again, I was following along as you were on sort of this atheist circuit and I was interested.
00:04:07.240 But whenever you would talk about sort of the hard work of secularism and the hard work
00:04:12.860 of atheism, this in particular, I'm thinking of your talk called Death in the Present Moment
00:04:17.660 right after Christopher Hitchens had died.
00:04:19.780 I'm actually curious how quickly you threw that together because I know you were supposed
00:04:23.700 to or you were planning on speaking about free will and you ended up giving this whole
00:04:28.020 other talk.
00:04:29.240 And that one, and I'll save it because I definitely put that one in our compilation.
00:04:32.840 But it struck me as, okay, this guy's up to something a little different and the questions
00:04:37.500 that he's asking are really different.
00:04:39.120 I was just on board with that ride.
00:04:40.640 So I became a fan and like probably many of your listeners started to really follow and
00:04:47.600 listen closely and became a student.
00:04:49.560 And hopefully like any good student started to disagree with my teacher a bit and slowly
00:04:54.580 get the confidence to push back and have my own thoughts and maybe find the weaknesses
00:05:00.780 and strengths of what you were up to.
00:05:03.440 And, you know, your work exposed me and many, many other people, I'm sure, to a lot of great
00:05:07.720 thinkers.
00:05:08.480 And maybe you don't love this, but sometimes the people who disagree with you that you introduce
00:05:13.440 us to on this side of the microphone, we think are right.
00:05:19.960 And that's a great credit to you as well for just giving them the air and maybe on some
00:05:24.460 really nerdy, esoteric things.
00:05:27.040 I'm one of them at this point now.
00:05:28.420 Because to back up way to the beginning of the story, I was at a university where I was
00:05:32.800 well on my way to a film degree, which is what I ended up getting.
00:05:35.420 But when 9-11 hit, I started taking a lot more courses in a track that they had, which
00:05:40.880 I think is fairly unique at the time.
00:05:43.200 Maybe one, maybe still one of the only programs where you can actually major in Holocaust studies,
00:05:48.580 which is sort of sits in between the history and philosophy kind of departments.
00:05:54.380 And I started taking a bunch of courses in there.
00:05:57.420 And that's where I was first exposed to sort of the formal philosophy, language, and education.
00:06:02.880 And that was so useful for me.
00:06:04.660 So I was just on board.
00:06:06.020 And now hopefully I, you know, I swim deep in those waters and know my way around the
00:06:10.460 lingo.
00:06:10.720 And it's super helpful.
00:06:12.260 But yeah, it was almost, you know, Godwin's law of bringing up the Nazis was those were the
00:06:17.600 first times actually in courses called like resistance during the Holocaust and things
00:06:22.380 like that, where, you know, I first was exposed to the words like deontology and consequentialism
00:06:27.660 and utilitarianism and a lot of moral ethics stuff.
00:06:30.260 And then I went further on my own into sort of the theory of mind and this kind of stuff.
00:06:34.980 But yeah, I consider myself in this weird new digital landscape that we're in a bit of a
00:06:40.180 student of the school of Sam Harris.
00:06:41.780 But then again, like hopefully any good student, I've branched off and have my own sort of thoughts
00:06:46.180 and framings.
00:06:47.660 And so that it's, I'm definitely in these pieces in this series of that we're calling
00:06:51.960 The Essential Sam Harris.
00:06:53.120 It is, I can't help but sort of put my writing and my framework on it, or at least hope that
00:06:59.600 the people and the challenges that you've encountered and continue to encounter, whether
00:07:04.920 they're right or wrong or making drastic mistakes, I want to give everything in it a really fair
00:07:11.020 hearing.
00:07:11.500 So there's times I'm sure where the listener will hear my own hand of opinion coming in
00:07:17.080 there, and I'm sure you know the areas as well.
00:07:19.100 But most times I'm just trying to give an open door to the mystery and why these subjects
00:07:25.620 interest you in the first place, if that makes sense.
00:07:28.660 Yeah, yeah.
00:07:29.560 And I should remind both of us that we met because you were directing a film focused on
00:07:37.160 Maajid Nawaz and me around our book, Islam and the Future of Tolerance.
00:07:42.500 And also we've brought into this project another person who I think you met independently, I
00:07:50.020 kind of remember, but Megan Phelps-Roper, who's been a guest on the podcast and someone who I
00:07:55.920 have long admired, and she's doing the voiceover work in this series, and she happens to have
00:08:01.900 a great voice, so I'm very happy to be working with her.
00:08:04.540 Yeah, I did meet her independently.
00:08:06.820 Your archive, I think you said three or four years old, your archive is over 10 years old
00:08:10.120 now.
00:08:10.940 Right.
00:08:11.500 And I was diving into the earliest days of it, and there are some fascinating conversations
00:08:17.260 that age really interestingly.
00:08:20.540 And I'm curious, I mean, I think this project, again, it's for fans, it's for listeners, but
00:08:25.180 it's for people who might hate you also, or critics of you, or people who are sure you were
00:08:28.960 missing something or wrong about something, or even yourself, to go back and listen to
00:08:34.020 certain conversations.
00:08:35.600 For example, one with like Dan Carlin, who hosts Hardcore History, you had him on, I think
00:08:39.660 that conversation is seven or eight years ago now.
00:08:42.220 And the part that I really resurfaced, it's actually in the morality episode, is full of
00:08:48.180 details and philosophies and politics and moral philosophies regarding things like intervention
00:08:56.080 in the Middle East.
00:08:57.800 And at the time of your recording, of course, we had no idea how Afghanistan might look a
00:09:03.120 decade from then.
00:09:04.600 But now we kind of do, and it's not a, if people listen to these carefully, it's not
00:09:12.460 about, oh, this side of the conversation turned out to be right, and this kind of part turned
00:09:17.600 out to be wrong.
00:09:18.280 But certain things hit our ears a little differently.
00:09:21.800 Even on this first topic of artificial intelligence, I mean, I think that conversation continues
00:09:27.220 to be, evolve in a way where the issues that you bring up are evergreen, but hopefully evolving
00:09:35.220 as well, just as far as their application goes.
00:09:38.060 So yeah, so I think you, I would love to hear your thoughts listening back to some of those.
00:09:42.200 And in fact, to reference the film we made together, a lot of that film was you doing
00:09:46.480 that actively and live, given a specific topic of looking back and reassessing language
00:09:52.160 about how it might, you know, land politically in that project.
00:09:56.660 So yeah, but this, this goes into, to really different, including an episode about social
00:10:01.520 media, which changes every day, but fascinating to, yeah.
00:10:06.600 And the conversation you have with Jack Dorsey is now fascinating for all kinds of different
00:10:10.820 reasons that at the time couldn't have been.
00:10:13.260 So yeah, it's evergreen, but it's also just like new life in all of them, I think.
00:10:18.500 Yeah.
00:10:18.620 Yeah.
00:10:19.100 Yeah.
00:10:19.480 Well, I look forward to hearing it.
00:10:20.640 Just to be clear, this has been very much your project.
00:10:24.960 I mean, I haven't heard most of this material since the time I recorded it and released
00:10:31.980 it.
00:10:32.560 And, you know, you've gone back and created episodes on a theme where you've pulled
00:10:37.960 together five or six conversations and kind of intercut material from five or six different
00:10:45.380 episodes and then added your own interstitial pieces, which you have written and Megan Phelps
00:10:52.300 Roper is reading.
00:10:53.480 So it's just, these are very much, you know, their own documents.
00:10:57.360 And as you say, you don't agree with me about everything and you're occasionally you're, you're
00:11:02.000 shading different points from your own point of view.
00:11:04.840 And so, yeah, I look forward to hearing it and we'll be dropping the whole series here
00:11:10.940 in the podcast feed.
00:11:13.000 If you're in the public feed, as always, you'll be getting partial episodes.
00:11:17.980 And if you're in the subscriber feed, you'll be getting full episodes.
00:11:22.480 And the first will be on artificial intelligence.
00:11:25.960 And then there are many other topics, consciousness, violence, belief, free will, morality, death,
00:11:32.640 and others beyond that.
00:11:34.340 Yeah.
00:11:34.780 There's one existential threat in nuclear war that I'm still piecing together, but it's,
00:11:40.580 that one's pretty harrowing.
00:11:42.100 One of your areas of interest.
00:11:43.440 Yeah.
00:11:44.120 Yeah.
00:11:44.680 Yeah.
00:11:45.140 Great.
00:11:45.520 Well, thanks for the collaboration, Jay.
00:11:47.880 I'm, again, I'm, I'm a consumer of this, probably more than a collaborator at this
00:11:53.440 point, because I have only heard part of what you've done here.
00:11:57.680 So I will be, I'll be eager to listen as well, but thank you for the work that you've
00:12:01.860 done.
00:12:02.700 No, thank you.
00:12:03.540 And I'll just say like, it's, it's, you, you're gracious to allow someone to do this
00:12:09.080 who, who does have some, you know, again, most of our, my disagreements with you are
00:12:13.900 pretty deep and nerdy and, and esoteric kind of philosophy stuff, but it's incredibly gracious
00:12:19.800 that you've given me the opportunity to do it.
00:12:21.380 And then hopefully, again, I'm a bit of a representative for people who have been in
00:12:25.880 the passenger seat of your public project of thinking out loud for over a decade now.
00:12:30.920 And if I can, if I can, you know, be, be a voice for that, that part of the crowd, it's
00:12:37.880 just, it's an honor to do it.
00:12:39.340 And, and there are a lot of fun, too, a ton of fun.
00:12:41.120 There's a ton of audio, you know, like thought experiments that we play with and hopefully
00:12:44.720 bring to life in your ears a little bit, including in this very first one with artificial
00:12:48.840 intelligence.
00:12:49.400 So yeah, I hope people enjoy it.
00:12:52.160 I do as well.
00:12:52.900 So now we bring you Megan Phelps-Roper on the topic of artificial intelligence.
00:13:00.000 Welcome to the essential Sam Harris.
00:13:03.100 This is making sense of artificial intelligence.
00:13:07.520 The goal of this series is to organize, compile, and juxtapose conversations hosted by Sam Harris
00:13:14.020 into specific areas of interest.
00:13:16.780 This is an ongoing effort to construct a coherent overview of Sam's perspectives and arguments,
00:13:23.020 the various explorations and approaches to the topic, the relevant agreements and disagreements,
00:13:28.580 and the pushbacks and evolving thoughts which his guests have advanced.
00:13:34.080 The purpose of these compilations is not to provide a complete picture of any issue, but
00:13:39.540 to entice you to go deeper into these subjects.
00:13:42.780 Along the way, we'll point you to the full episodes with each featured guest.
00:13:46.640 And at the conclusion, we'll offer some reading, listening, and watching suggestions, which range
00:13:52.980 from fun and light to densely academic.
00:13:56.560 One note to keep in mind for this series.
00:14:00.080 Sam has long argued for a unity of knowledge where the barriers between fields of study are
00:14:05.080 viewed as largely unhelpful artifacts of unnecessarily partitioned thought.
00:14:09.080 The pursuit of wisdom and reason in one area of study naturally bleeds into, and greatly affects,
00:14:16.200 others.
00:14:17.400 You'll hear plenty of crossover into other topics as these dives into the archives unfold.
00:14:22.960 And your thinking about a particular topic may shift as you realize its contingent relationships
00:14:27.880 with others.
00:14:29.320 In this topic, you'll hear the natural overlap with theories of identity and the self, consciousness,
00:14:36.040 and free will.
00:14:37.480 So, get ready.
00:14:39.680 Let's make sense of artificial intelligence.
00:14:48.680 Artificial intelligence is an area of resurgent interest in the general public.
00:14:53.680 Its seemingly imminent arrival first garnered wide attention in the late 60s, with thinkers
00:14:58.220 like Marvin Minsky and Isaac Asimov writing provocative and thoughtful books about the burgeoning technology
00:15:03.840 and concomitant philosophical and ethical quandaries.
00:15:06.580 Science fiction novels, comic books, and TV shows were flooded with stories of killer robots and encounters
00:15:13.680 with super-intelligent artificial lifeforms hiding out on nearby planets, which we thought
00:15:18.900 we would soon be visiting on the backs of our new rocket ships.
00:15:21.500 Over the following decades, the excitement and fervor looked to have faded from view in the public imagination.
00:15:28.600 But in recent years, it has made an aggressive comeback.
00:15:32.300 Perhaps this is because the fruits of the AI revolution and the devices and programs once only imagined in those science fiction stories
00:15:39.360 have started to rapidly show up in impressive and sometimes disturbing ways all around us.
00:15:44.800 Our smartphones, cars, doorbells, watches, games, thermostats, vacuum cleaners, light bulbs, and glasses now have embedded algorithms
00:15:56.220 running on increasingly powerful hardware which navigate, dictate, or influence not just our locomotion,
00:16:03.180 but our entertainment choices, our banking, our politics, our dating lives, and just about everything else.
00:16:09.900 It seems every other TV show or movie that appears on a streaming service
00:16:14.700 is birthed out of a collective interest, fear, or otherwise general fascination
00:16:19.740 with the ethical, societal, and philosophical implications of artificial intelligence.
00:16:24.900 There are two major ways to think about the threat of what is generally called AI.
00:16:30.720 One is to think about how it will disrupt our psychological states or fracture our information landscape.
00:16:36.240 And the other is to ponder how the very nature of the technical details of its development may threaten our existence.
00:16:44.000 This compilation is mostly focused on the latter concern.
00:16:47.740 Because Sam is certainly amongst those who are quite worried about the existential threat
00:16:52.080 of the technical development and arrival of AI.
00:16:57.420 Now, before we jump into the clips, there are a few concepts that you'll need to onboard to find your footing.
00:17:02.760 You'll hear the terms Artificial General Intelligence, or AGI,
00:17:08.280 and Artificial Superintelligence, or ASI, used in these conversations.
00:17:13.980 Both of these terms refer to an entity which has a kind of intelligence
00:17:17.620 that can solve a nearly infinitely wide range of problems.
00:17:22.260 We humans have brains which display this kind of adaptable intelligence.
00:17:25.720 We can climb a ladder by controlling our legs and arms in order to retrieve a specific object
00:17:31.120 from a high shelf with our hands.
00:17:33.480 And we use the same brain to do something very different,
00:17:36.820 like recognize emotions in the tone of voice of a romantic partner.
00:17:40.160 I look forward to infinity with you.
00:17:42.140 That same brain can play a game of checkers against a young child,
00:17:45.900 who we might also be coyly trying to let win.
00:17:48.940 Or play a serious game of competitive chess against a skilled adult.
00:17:52.160 That same brain can also simply lift a coffee mug to our lips,
00:17:57.280 not just to ingest nutrients and savor the taste of the beans,
00:18:00.540 but also to send a subtle social signal to a friend at the table
00:18:03.980 to let them know that their story is dragging on a bit.
00:18:07.820 All of that kind of intelligence is embodied and contained in the same system,
00:18:12.620 namely, our brains.
00:18:14.980 AGI refers to a human level of intelligence,
00:18:17.800 which doesn't surpass what our brightest humans can accomplish on any given task,
00:18:22.220 while ASI references an intelligence which performs at,
00:18:26.440 well, superhuman levels.
00:18:29.300 This description of flexible intelligence is different from a system
00:18:32.680 which is programmed or trained to do one particular thing incredibly well,
00:18:37.240 like arithmetic,
00:18:38.760 or painting straight lines on the sides of a car,
00:18:41.100 or playing computer chess,
00:18:43.880 or guessing large prime numbers,
00:18:46.060 or displaying music options to a listener
00:18:48.340 based on the observable lifestyle habits of like-minded users
00:18:51.880 in a certain demographic.
00:18:53.760 That kind of system has an intelligence
00:18:55.940 that is sometimes referred to as narrow or weak AI.
00:19:00.600 But even that kind of thing can be quite worrisome
00:19:03.100 from the standpoint of weaponization or preference manipulation.
00:19:07.220 You'll hear Sam voice his concerns throughout these conversations,
00:19:10.420 and he'll consistently point to our underestimation of the challenge
00:19:14.140 that even narrow AI poses.
00:19:17.200 So, there are dangers and serious questions to consider
00:19:20.640 no matter which way we go with the AI topic.
00:19:23.780 But as you'll also hear in this compilation,
00:19:26.700 not everyone is as concerned about the technical existential threat of AI as Sam is.
00:19:32.380 Much of the divergence in levels of concern
00:19:34.680 stems from initial differences on the fundamental conceptual approach
00:19:38.700 towards the nature of intelligence.
00:19:41.860 Defining intelligence is notoriously slippery and controversial,
00:19:45.480 but you're about to hear one of Sam's guests
00:19:48.040 offer a conception which distills intelligence
00:19:50.540 to a type of observable competence
00:19:52.480 at actualizing desired tasks,
00:19:54.860 or an ability to manifest preferred future states
00:19:58.000 through intentional current action and intervention.
00:20:00.560 You can imagine a linear gradient indicating more or less
00:20:05.140 of this competence as you move along it.
00:20:09.320 This view places our human intelligence on a continuum
00:20:12.360 along with bacteria, ants, chickens, honeybees, chimpanzees,
00:20:18.280 all of the potential undiscovered alien lifeforms,
00:20:21.400 and, of course, artificial intelligence,
00:20:24.480 which perches itself far above our lowly human competence.
00:20:27.500 This presents some rather alarming questions.
00:20:32.820 Stephen Hawking once issued a famous warning
00:20:34.940 that perhaps we shouldn't be actively seeking out
00:20:37.560 intelligent alien civilizations,
00:20:39.900 since we'd likely discover a culture
00:20:41.500 which is far more technologically advanced than ours.
00:20:44.820 And, if our planet's history provides any lessons,
00:20:48.020 it seems to show that when technologically mismatched cultures
00:20:50.920 come into contact,
00:20:52.340 it usually doesn't work out too well for the lesser-developed one.
00:20:55.460 Are we bringing that precise suicidal encounter into reality
00:20:59.920 as we set out to develop artificial intelligence?
00:21:03.320 That question alludes to what is known as the value alignment problem.
00:21:07.860 But, before we get to that challenge,
00:21:10.120 let's go to our first clip,
00:21:11.400 which starts to lay out the important definitional foundations
00:21:14.100 and distinction of terms in the landscape of AI.
00:21:17.780 The thinker you're about to meet
00:21:19.440 is the decision theorist and computer scientist Eliezer Yudkowsky.
00:21:23.140 Yudkowsky begins here by defending this linear gradient perspective on intelligence
00:21:28.160 and offers an analogy to consider
00:21:30.260 how we might be mistaken about intelligence
00:21:32.460 in a similar way to how we once were mistaken
00:21:34.980 about the nature of fire.
00:21:37.040 It's clear that Sam is aligned and attracted
00:21:39.320 to Eliezer's run at this question,
00:21:41.500 and consequently,
00:21:42.720 both men end up sharing a good deal of unease
00:21:45.160 about the implications that all of this has for our future.
00:21:48.120 This is from episode 116,
00:21:51.440 which is entitled
00:21:52.220 AI, Racing Towards the Brink.
00:21:58.240 Let's just start with the basic picture
00:22:00.400 and define some terms.
00:22:02.360 I suppose we should define
00:22:03.760 intelligence first
00:22:05.760 and then jump into
00:22:07.720 the differences between
00:22:09.760 strong and weak
00:22:11.380 or general versus narrow AI.
00:22:13.920 Do you want to start us off on that?
00:22:15.760 Sure.
00:22:17.340 Preamble disclaimer, though.
00:22:19.640 The field in general,
00:22:21.420 like not everyone you would ask
00:22:22.640 would give you the same definition of intelligence.
00:22:24.880 And a lot of times in cases like those,
00:22:27.140 it's good to sort of go back
00:22:28.980 to observational basics.
00:22:31.000 We know that in a certain way,
00:22:33.020 human beings seem a lot more competent
00:22:35.380 than chimpanzees,
00:22:37.220 which seems to be a similar dimension
00:22:39.100 to the one where chimpanzees
00:22:40.600 are more competent than mice
00:22:43.920 or that mice are more competent than spiders.
00:22:47.600 And people have tried various theories
00:22:49.680 about what this dimension is.
00:22:51.760 They've tried various definitions of it.
00:22:54.000 But if you went back a few centuries
00:22:55.620 and asked somebody to define fire,
00:22:58.980 the less wise ones would say,
00:23:00.740 ah, fire is the release of phlogiston.
00:23:03.120 Fire is one of the four elements.
00:23:04.920 And the truly wise ones would say,
00:23:07.040 well, fire is the sort of orangey,
00:23:08.700 bright, hot stuff that comes out of wood
00:23:10.800 and like spreads along wood.
00:23:12.060 And they would tell you what it looked like
00:23:13.900 and put that prior to their theories
00:23:16.200 of what it was.
00:23:17.580 So what this mysterious thing looks like
00:23:20.200 is that humans can build space shuttles
00:23:23.740 and go to the moon and mice can't.
00:23:26.340 And we think it has something to do with our brains.
00:23:28.820 Yeah, yeah.
00:23:30.200 I think we can make it more abstract than that.
00:23:33.160 Tell me if you think this is not generic enough
00:23:35.260 to be accepted by most people in the field.
00:23:37.880 Whatever intelligence may be in specific context,
00:23:43.380 generally speaking,
00:23:44.340 it's the ability to meet goals,
00:23:47.660 perhaps across a diverse range of environments.
00:23:51.720 And we might want to add
00:23:53.600 that it's at least implicit in intelligence
00:23:56.420 that interests us.
00:23:57.840 It means an ability to do this flexibly
00:24:01.020 rather than by rote,
00:24:02.740 following the same strategy again and again blindly.
00:24:05.480 Does that seem like a reasonable starting point?
00:24:08.940 I think that that would get
00:24:10.720 fairly widespread agreement.
00:24:12.520 And it like matches up well
00:24:13.780 with some of the things that are in AI textbooks.
00:24:15.760 If I'm allowed to sort of take it a bit further
00:24:18.040 and begin injecting my own viewpoint into it,
00:24:21.220 I would refine it and say
00:24:22.940 that by achieve goals,
00:24:25.380 we mean something like
00:24:26.660 squeezing the measure of possible futures
00:24:30.100 higher in your preference ordering.
00:24:31.900 If we took all the possible outcomes
00:24:34.520 and we ranked them
00:24:35.640 from the ones you like least
00:24:36.960 to the ones you like most,
00:24:38.680 then as you achieve your goals,
00:24:41.500 you're sort of like squeezing the outcomes
00:24:43.220 higher in your preference ordering.
00:24:44.720 You're narrowing down
00:24:45.840 what the outcome would be
00:24:46.780 to be something more like what you want,
00:24:49.320 even though you might not be able
00:24:50.480 to narrow it down very exactly.
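
As a concrete rendering of this "narrowing the future" picture, here is a minimal sketch (not from the episode; the actions, outcomes, and rankings are invented for illustration) of scoring actions by how far they push likely outcomes up a preference ordering:

```python
# A toy rendering of "intelligence as steering outcomes up a preference
# ordering". Outcomes, actions, and numbers are invented for illustration.

# Possible outcomes, ranked from least preferred (0) to most preferred (3).
PREFERENCE = {"spilled coffee": 0, "cold coffee": 1, "lukewarm coffee": 2, "hot coffee": 3}

# Each action tends to produce a different spread of outcomes.
OUTCOMES_BY_ACTION = {
    "act blindly":    ["spilled coffee", "cold coffee", "lukewarm coffee", "hot coffee"],
    "act with skill": ["lukewarm coffee", "hot coffee", "hot coffee", "hot coffee"],
}

def expected_rank(action):
    """Average preference rank of the outcomes this action tends to produce."""
    outcomes = OUTCOMES_BY_ACTION[action]
    return sum(PREFERENCE[o] for o in outcomes) / len(outcomes)

# "More optimization power", in this toy sense, means choosing the action
# that concentrates likely outcomes higher in the preference ordering.
best_action = max(OUTCOMES_BY_ACTION, key=expected_rank)
print(best_action, expected_rank(best_action))  # act with skill 2.75
```

The point is only that "achieving goals" can be read as shifting the distribution of outcomes toward the higher-ranked ones, even when no single outcome can be pinned down exactly.
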
00:24:53.440 Flexibility, generality.
00:24:56.200 There's a,
00:24:57.620 like humans are much more domain general
00:25:00.400 than mice.
00:25:03.100 Bees build hives.
00:25:05.120 Beavers build dams.
00:25:06.580 A human will look over both of them
00:25:08.020 and envision a honeycomb structured dam.
00:25:13.360 Like we are able to operate
00:25:15.140 even on the moon,
00:25:17.860 which is like very unlike the environment
00:25:19.660 where we evolved.
00:25:21.340 In fact, our only competitor
00:25:22.800 in terms of general optimization,
00:25:26.760 where optimization is that sort of narrowing
00:25:28.660 of the future that I talked about,
00:25:30.400 our competitor in terms of general optimization
00:25:33.360 is natural selection.
00:25:34.960 Like natural selection built beavers.
00:25:38.460 It built bees.
00:25:39.860 It sort of implicitly built the spider's web
00:25:42.300 in the course of building spiders.
00:25:44.480 And we as humans have like this similar,
00:25:47.560 like very broad range to handle
00:25:49.200 this like huge variety of problems.
00:25:51.260 And the key to that is our ability
00:25:53.300 to learn things that natural selection
00:25:56.080 did not pre-program us with.
00:25:57.680 So learning is the key to generality.
00:26:01.080 I expect that not many people in AI
00:26:03.300 would disagree with that part either.
00:26:05.300 Right.
00:26:05.740 So it seems that goal-directed behavior
00:26:08.980 is implicit in this,
00:26:11.340 or even explicit in this definition
00:26:13.060 of intelligence.
00:26:13.680 And so whatever intelligence is,
00:26:15.300 it is inseparable from the kinds of behavior
00:26:19.740 in the world that results
00:26:21.400 in the fulfillment of goals.
00:26:22.920 So we're talking about agents
00:26:24.500 that can do things.
00:26:25.880 And once you see that,
00:26:28.340 then it becomes pretty clear
00:26:30.580 that if we build systems
00:26:33.580 that harbor primary goals,
00:26:36.980 you know, there are cartoon examples here,
00:26:38.540 like, you know, making paperclips.
00:26:40.600 These are not systems
00:26:41.500 that will spontaneously decide
00:26:44.340 that they could be doing
00:26:46.100 more enlightened things
00:26:47.440 than, say, making paperclips.
00:26:49.980 This moves to the question
00:26:51.400 of how deeply unfamiliar
00:26:53.680 artificial intelligence might be,
00:26:56.580 because there are no natural goals
00:26:59.880 that will arrive in these systems
00:27:02.500 apart from the ones we put in there.
00:27:04.780 And we have common sense intuitions
00:27:07.140 that make it very difficult
00:27:09.760 for us to think about
00:27:11.680 how strange an artificial intelligence
00:27:14.600 could be,
00:27:15.740 even one that becomes
00:27:17.540 more and more competent
00:27:18.700 to meet its goals.
00:27:20.220 Let's talk about
00:27:21.060 the frontiers of strangeness
00:27:23.780 in AI as we move from,
00:27:26.540 again, I think we have
00:27:27.360 a couple more definitions
00:27:28.240 we should probably put in play here,
00:27:30.060 differentiating strong and weak
00:27:32.120 or general and narrow intelligence.
00:27:35.260 Well, to differentiate
00:27:36.560 general and narrow,
00:27:39.120 I would say that,
00:27:40.740 well, I mean, this is like,
00:27:41.980 on the one hand,
00:27:42.760 theoretically a spectrum.
00:27:44.480 Now, on the other hand,
00:27:45.360 there seems to have been
00:27:46.040 like a very sharp jump
00:27:47.420 in generality
00:27:48.180 between chimpanzees and humans.
00:27:51.260 So, breadth of domain
00:27:53.160 driven by breadth of learning.
00:27:56.740 Like DeepMind, for example,
00:27:58.520 recently built AlphaGo,
00:28:01.700 and I lost some money
00:28:03.080 betting that AlphaGo
00:28:04.140 would not defeat
00:28:04.880 the human champion,
00:28:06.040 which it promptly did.
00:28:07.660 And then a successor to that
00:28:09.880 was AlphaZero,
00:28:12.260 and AlphaGo
00:28:13.680 was specialized on Go.
00:28:15.640 It could learn to play Go
00:28:17.820 better than its starting point
00:28:19.980 for playing Go,
00:28:21.160 but it couldn't learn
00:28:21.940 to do anything else.
00:28:23.720 And then they simplified
00:28:25.840 the architecture for AlphaGo.
00:28:28.040 They figured out ways
00:28:29.100 to do all the things
00:28:30.360 it was doing
00:28:30.920 in more and more general ways.
00:28:32.680 They discarded the opening book,
00:28:34.620 like all the sort of
00:28:35.460 human experience of Go
00:28:36.620 that was built into it.
00:28:37.820 They were able to discard
00:28:38.840 all of the sort of like
00:28:39.960 programmatic special features
00:28:41.400 that detected features
00:28:42.420 of the Go board.
00:28:43.140 They figured out
00:28:44.780 how to do that
00:28:45.620 in simpler ways.
00:28:47.240 And because they figured out
00:28:48.040 how to do it in simpler ways,
00:28:49.480 they were able to generalize
00:28:51.140 to AlphaZero,
00:28:52.240 which learned how to play chess
00:28:54.420 using the same architecture.
00:28:57.320 They took a single AI
00:28:58.380 and got it to learn Go
00:28:59.940 and then like reran it
00:29:02.300 and made it learn chess.
00:29:03.660 Now that's not human general,
00:29:05.480 but it's like a step forward
00:29:08.420 in generality
00:29:09.300 of the sort that we're talking about.
00:29:10.740 Am I right in thinking
00:29:11.840 that that's a pretty
00:29:13.160 enormous breakthrough?
00:29:14.400 I mean, there's two things here.
00:29:15.540 There's the step
00:29:16.560 to that degree of generality,
00:29:18.800 but there's also the fact
00:29:20.160 that they built a Go engine.
00:29:22.760 I forget if it was a Go
00:29:23.740 or a chess or both,
00:29:25.500 which basically surpassed
00:29:28.200 all of the specialized AIs
00:29:31.880 on those games
00:29:33.360 over the course of a day, right?
00:29:35.620 Isn't the chess engine
00:29:37.540 of AlphaZero
00:29:39.600 better than any
00:29:41.300 dedicated chess computer ever?
00:29:43.480 And didn't it achieve that
00:29:44.480 just with astonishing speed?
00:29:46.820 Well, there was actually
00:29:47.920 like some amount
00:29:48.720 of debate afterwards
00:29:49.720 whether or not
00:29:50.760 the version of the chess engine
00:29:52.240 that it was tested against
00:29:53.220 was truly optimal.
00:29:55.100 But like even that,
00:29:56.680 even the extent that it
00:29:57.700 was in that narrow range
00:30:00.240 of the best existing chess engine,
00:30:02.400 as Max Tegmark put it,
00:30:03.960 the real story wasn't
00:30:05.920 in how AlphaGo
00:30:09.300 beat human Go players.
00:30:11.840 It's how AlphaZero
00:30:13.980 beat human Go system programmers
00:30:16.900 and human chess system programmers.
00:30:19.720 People had put years
00:30:21.420 and years of effort
00:30:22.640 into accreting
00:30:23.620 all of the special purpose code
00:30:25.740 that would play chess
00:30:26.920 well and efficiently.
00:30:29.340 And then AlphaZero
00:30:30.980 blew up to and possibly
00:30:32.480 past that point in a day.
00:30:34.720 And if it hasn't already
00:30:36.000 gone past it,
00:30:37.260 well, it would be past it
00:30:39.840 by now if DeepMind
00:30:41.360 kept working on it.
00:30:42.560 Although they've now
00:30:43.720 basically declared victory
00:30:45.180 and shut down that project
00:30:46.980 as I understand it.
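
The generality being described, a single learning procedure that plays different games through the same interface rather than through game-specific hand-built features, can be sketched in miniature. The following toy example is not DeepMind's AlphaZero (which uses deep networks and Monte Carlo tree search); the two tiny games and the tabular learning rule are invented stand-ins used only to illustrate the idea:

```python
import random
from collections import defaultdict

class NimGame:
    """Take 1, 2, or 3 sticks from a pile of 7; whoever takes the last stick wins."""
    def initial(self): return 7
    def moves(self, state): return [m for m in (1, 2, 3) if m <= state]
    def apply(self, state, move): return state - move
    def terminal(self, state): return state == 0

class CountdownGame:
    """Same interface, different rules: start at 5, subtract 1 or 2, last move wins."""
    def initial(self): return 5
    def moves(self, state): return [m for m in (1, 2) if m <= state]
    def apply(self, state, move): return state - move
    def terminal(self, state): return state == 0

def self_play_train(game, episodes=5000, epsilon=0.2, lr=0.1):
    """Learn move values purely from self-play; nothing game-specific inside."""
    value = defaultdict(float)  # (state, move) -> estimated value for the player to move
    for _ in range(episodes):
        state, history = game.initial(), []
        while not game.terminal(state):
            moves = game.moves(state)
            if random.random() < epsilon:
                move = random.choice(moves)                          # explore
            else:
                move = max(moves, key=lambda m: value[(state, m)])   # exploit
            history.append((state, move))
            state = game.apply(state, move)
        reward = 1.0  # the player who made the last move wins
        for state, move in reversed(history):
            value[(state, move)] += lr * (reward - value[(state, move)])
            reward = -reward  # flip perspective for the previous player's move
    return value

random.seed(0)
for game in (NimGame(), CountdownGame()):
    learned = self_play_train(game)
    start = game.initial()
    best_opening = max(game.moves(start), key=lambda m: learned[(start, m)])
    print(type(game).__name__, "learned opening move:", best_opening)
```

Nothing inside self_play_train knows anything about either game beyond the shared interface, which is the sense in which the same learner generalizes across games once the hand-crafted, game-specific features are removed.
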
00:30:48.820 Okay, so talk about
00:30:50.280 the distinction between
00:30:52.300 general and narrow intelligence
00:30:54.320 a little bit more.
00:30:54.940 So we have this feature
00:30:56.620 of our minds
00:30:57.500 most conspicuously
00:30:58.740 where we're general
00:31:00.060 problem solvers.
00:31:01.260 We can learn new things
00:31:03.920 and our learning
00:31:05.120 in one area
00:31:05.960 doesn't require
00:31:08.540 a fundamental rewriting
00:31:10.360 of our code.
00:31:11.640 Our knowledge in one area
00:31:13.040 isn't so brittle
00:31:13.920 as to be degraded
00:31:15.220 by our acquiring knowledge
00:31:16.860 in some new area.
00:31:18.260 Or at least this is not
00:31:19.260 a general problem
00:31:20.920 which erodes
00:31:22.320 our understanding
00:31:23.520 again and again.
00:31:24.600 And we don't yet have
00:31:26.640 computers that can do this
00:31:29.000 but we're seeing the signs
00:31:30.400 of moving in that direction.
00:31:33.320 And so then it's often imagined
00:31:34.940 that there's a kind of
00:31:36.160 near-term goal
00:31:37.320 which has always struck me
00:31:39.040 as a mirage
00:31:40.000 of so-called
00:31:41.560 human-level general AI.
00:31:43.920 I don't see how
00:31:44.980 that phrase
00:31:46.040 will ever mean
00:31:47.060 much of anything
00:31:47.760 given that
00:31:48.440 all of the narrow AI
00:31:50.220 we've built thus far
00:31:51.760 is superhuman
00:31:53.820 within the domain
00:31:55.380 of its applications.
00:31:57.200 The calculator
00:31:57.880 in my phone
00:31:58.620 is superhuman
00:32:00.200 for arithmetic.
00:32:01.680 Any general AI
00:32:03.080 that also has
00:32:04.340 my phone's ability
00:32:05.320 to calculate
00:32:06.340 will be superhuman
00:32:07.740 for arithmetic
00:32:08.420 but we must presume
00:32:10.180 it'll be superhuman
00:32:11.160 for all of the
00:32:12.800 dozens or hundreds
00:32:14.120 of specific
00:32:15.400 human talents
00:32:17.000 we've put into it
00:32:18.040 whether it's
00:32:18.420 facial recognition
00:32:19.280 or just obviously
00:32:20.840 memory will be superhuman
00:32:22.680 unless we decide
00:32:24.080 to consciously degrade it.
00:32:25.720 Access to the world's data
00:32:27.100 will be superhuman
00:32:27.980 unless we isolate it
00:32:29.440 from data.
00:32:30.320 Do you see
00:32:30.820 this notion
00:32:31.500 of human-level AI
00:32:33.480 as a landmark
00:32:35.300 on the timeline
00:32:36.400 of our development
00:32:37.120 or is it just
00:32:37.700 never going to be reached?
00:32:39.400 I think that a lot
00:32:40.480 of people in the field
00:32:41.420 would agree
00:32:41.960 that human-level AI
00:32:45.200 defined as
00:32:45.960 literally
00:32:46.600 at the human level
00:32:48.140 neither above
00:32:48.800 nor below
00:32:49.540 across a wide
00:32:50.820 range of competencies
00:32:51.940 is a straw target
00:32:53.820 is an impossible mirage.
00:32:55.340 Right now
00:32:56.100 it seems like
00:32:57.080 AI is clearly
00:32:58.160 dumber
00:32:58.800 and less general
00:32:59.560 than us
00:33:00.040 or rather that
00:33:01.380 like if we're put
00:33:02.140 into a sort of
00:33:02.720 like real world
00:33:04.160 lots of things
00:33:05.140 going on
00:33:05.900 context that
00:33:07.600 places demands
00:33:08.740 on generality
00:33:09.580 then AIs
00:33:10.900 are not really
00:33:11.420 in the game yet.
00:33:12.800 Humans are like
00:33:13.380 clearly way ahead
00:33:14.260 and more controversially
00:33:16.300 I would say
00:33:17.520 that we can
00:33:18.100 imagine a state
00:33:18.960 where the AI
00:33:19.800 is clearly
00:33:20.280 way ahead
00:33:20.940 where it is
00:33:22.860 across sort of
00:33:24.200 every kind
00:33:25.300 of cognitive
00:33:25.720 competency
00:33:26.440 barring some
00:33:28.580 like very narrow
00:33:29.600 ones that like
00:33:30.580 aren't deeply
00:33:31.480 influential
00:33:32.100 of the others
00:33:32.540 like maybe
00:33:33.000 chimpanzees
00:33:34.000 are better
00:33:35.040 at using
00:33:36.160 a stick
00:33:36.900 to draw ants
00:33:37.860 from an ant hive
00:33:38.820 and eat them
00:33:39.420 than humans are
00:33:40.860 though no humans
00:33:42.020 have really like
00:33:42.480 practiced that
00:33:43.080 to world championship
00:33:43.900 level exactly
00:33:44.760 but there's
00:33:46.360 this sort of
00:33:46.720 general factor
00:33:47.640 of how good
00:33:48.520 are you at it
00:33:49.760 when reality
00:33:50.340 throws you
00:33:50.760 a complicated
00:33:51.260 problem
00:33:51.760 at this
00:33:52.300 chimpanzees
00:33:53.380 are clearly
00:33:54.160 not better
00:33:54.680 than humans
00:33:55.120 humans are
00:33:55.580 clearly better
00:33:56.100 than chimps
00:33:56.600 even if you
00:33:56.900 can manage
00:33:57.220 to narrow
00:33:57.500 down one
00:33:57.940 thing the
00:33:58.300 chimp is
00:33:58.600 better at
00:33:59.060 the thing
00:33:59.660 the chimp
00:33:59.960 is better
00:34:00.300 at doesn't
00:34:00.760 play a big
00:34:01.320 role in
00:34:02.220 our global
00:34:02.800 economy
00:34:03.260 it's not
00:34:03.620 an input
00:34:04.040 that feeds
00:34:04.460 into lots
00:34:04.880 of other
00:34:05.200 things
00:34:05.640 so we can
00:34:07.140 clearly imagine
00:34:08.700 I would say
00:34:09.360 like there are
00:34:09.740 some people
00:34:10.100 who say this
00:34:10.500 is not possible
00:34:11.240 I think they're
00:34:12.160 wrong but it
00:34:13.120 seems to me
00:34:13.480 that it is
00:34:14.100 perfectly coherent
00:34:14.940 to imagine
00:34:15.600 an AI that is
00:34:16.940 like better
00:34:17.680 at everything
00:34:18.400 or almost
00:34:18.900 everything
00:34:19.300 than we are
00:34:19.940 and such
00:34:21.320 that if it
00:34:21.780 was like
00:34:22.120 building an
00:34:22.580 economy with
00:34:23.200 lots of
00:34:23.540 inputs
00:34:23.980 like humans
00:34:25.060 would have
00:34:25.680 around the
00:34:26.080 same level
00:34:26.520 input into
00:34:27.080 that economy
00:34:27.660 as the
00:34:27.980 chimpanzees
00:34:28.560 have into
00:34:28.900 ours
00:34:29.280 yeah
00:34:30.080 yeah so
00:34:30.880 what you're
00:34:31.600 gesturing at
00:34:32.820 here is a
00:34:33.680 continuum of
00:34:34.740 intelligence
00:34:35.280 that I think
00:34:37.040 most people
00:34:37.780 never think
00:34:40.260 about and
00:34:41.160 because they
00:34:42.080 don't think
00:34:42.460 about it
00:34:42.860 they have a
00:34:44.660 default doubt
00:34:45.740 that it
00:34:46.480 exists
00:34:47.060 I think
00:34:48.180 when people
00:34:48.660 this is a
00:34:49.160 point I know
00:34:49.640 you've made
00:34:50.240 in your
00:34:51.120 writing and
00:34:51.720 I'm sure
00:34:52.260 it's a point
00:34:52.640 that Nick
00:34:53.060 Bostrom made
00:34:53.700 somewhere in
00:34:54.160 his book
00:34:54.460 super intelligence
00:34:55.160 it's this
00:34:56.380 idea that
00:34:57.000 there's a
00:34:57.820 huge blank
00:34:58.360 space on
00:34:58.820 the map
00:34:59.300 past the
00:35:00.840 most well
00:35:01.900 advertised
00:35:02.340 exemplars of
00:35:03.520 human brilliance
00:35:05.240 where we don't
00:35:06.100 imagine you
00:35:07.160 know what it
00:35:07.480 would be like
00:35:07.960 to be five
00:35:09.300 times smarter
00:35:10.160 than the
00:35:10.660 smartest person
00:35:11.440 we could
00:35:12.100 name
00:35:12.420 and we
00:35:13.100 don't even
00:35:13.360 know what
00:35:13.720 that would
00:35:14.340 consist in
00:35:15.740 right because
00:35:16.420 if chimps
00:35:17.380 could be
00:35:17.940 given to
00:35:18.320 wonder what
00:35:18.860 it would be
00:35:19.120 like to be
00:35:19.540 five times
00:35:20.080 smarter than
00:35:20.740 the smartest
00:35:21.680 chimp
00:35:22.160 they're not
00:35:23.420 going to
00:35:24.220 represent for
00:35:25.380 themselves all
00:35:26.800 of the things
00:35:27.340 that we're
00:35:27.880 doing that
00:35:28.360 they can't
00:35:28.780 even dimly
00:35:29.800 conceive
00:35:30.160 there's a
00:35:31.220 kind of
00:35:31.620 disjunction
00:35:32.440 that comes
00:35:33.080 with more
00:35:34.380 there's a
00:35:35.120 phrase used
00:35:36.140 in military
00:35:37.360 context
00:35:38.120 i don't
00:35:38.800 think the
00:35:39.120 quote is
00:35:39.500 actually
00:35:39.760 it's
00:35:40.160 variously
00:35:40.560 attributed
00:35:40.880 to
00:35:41.260 Stalin
00:35:41.880 and Napoleon
00:35:42.800 and I think
00:35:43.780 Clausewitz
00:35:44.420 like half a dozen
00:35:45.400 people who have
00:35:46.100 claimed this
00:35:46.480 quote the
00:35:47.680 quote is
00:35:48.200 sometimes
00:35:49.140 quantity has
00:35:50.260 a quality
00:35:50.800 all its own
00:35:51.560 as you ramp
00:35:52.700 up in
00:35:53.540 intelligence
00:35:54.420 whatever it
00:35:55.000 is at the
00:35:55.500 level of
00:35:56.040 information
00:35:56.600 processing
00:35:57.220 spaces
00:35:58.240 of inquiry
00:36:00.360 and ideation
00:36:02.240 and experience
00:36:03.600 begin to
00:36:04.080 open up
00:36:04.540 and we
00:36:05.440 can't
00:36:05.680 necessarily
00:36:06.060 predict
00:36:06.600 what they
00:36:07.500 would be
00:36:07.960 from where
00:36:08.400 we sit
00:36:08.900 how do
00:36:09.380 you think
00:36:09.680 about this
00:36:10.240 continuum
00:36:10.800 of intelligence
00:36:12.400 beyond what
00:36:13.100 we currently
00:36:13.820 know in light
00:36:14.440 of what we're
00:36:14.960 talking about
00:36:15.460 well the
00:36:16.440 unknowable
00:36:17.700 is a concept
00:36:18.820 you have to be
00:36:19.440 very careful
00:36:20.140 with because
00:36:20.820 the thing you
00:36:21.460 can't figure
00:36:21.940 out in the
00:36:22.480 first 30
00:36:22.980 seconds of
00:36:23.500 thinking about
00:36:24.020 it sometimes
00:36:24.860 you can figure
00:36:25.640 it out if
00:36:25.980 you think
00:36:26.220 for another
00:36:26.560 five minutes
00:36:27.260 so in
00:36:28.280 particular i
00:36:29.060 think that
00:36:29.560 there's a
00:36:29.940 certain narrow
00:36:30.980 kind of
00:36:31.600 unpredictability
00:36:32.460 which does
00:36:33.920 seem to be
00:36:34.460 plausibly in
00:36:35.400 some sense
00:36:35.900 essential
00:36:36.420 which is
00:36:37.640 that for
00:36:38.720 alpha go
00:36:39.320 to play
00:36:40.120 better go
00:36:41.060 than the
00:36:41.980 best human
00:36:42.440 go players
00:36:43.120 it must be
00:36:44.140 the case
00:36:44.660 that the
00:36:45.700 best human
00:36:46.180 go players
00:36:46.860 cannot predict
00:36:47.820 exactly where
00:36:49.280 on the go
00:36:49.900 board alpha
00:36:51.060 go will play
00:36:51.820 if they could
00:36:52.760 predict exactly
00:36:53.540 where alpha
00:36:54.020 go would play
00:36:54.780 alpha go
00:36:55.540 would be no
00:36:55.940 smarter than
00:36:56.380 them
00:36:56.640 on the other
00:36:58.000 hand alpha
00:36:59.180 go's
00:36:59.600 programmers
00:37:00.140 and the
00:37:01.080 people who
00:37:01.520 knew what
00:37:01.860 alpha go's
00:37:02.380 programmers
00:37:02.780 were trying
00:37:03.320 to do
00:37:03.840 or even
00:37:04.560 just the
00:37:04.900 people who
00:37:05.300 watched alpha
00:37:05.920 go play
00:37:06.640 could say
00:37:07.640 well i
00:37:08.660 think the
00:37:09.080 system is
00:37:09.780 going to
00:37:10.160 play such
00:37:10.760 that it
00:37:11.040 will win
00:37:11.440 at the
00:37:11.680 end of
00:37:11.960 the game
00:37:12.420 even if
00:37:14.020 they couldn't
00:37:14.640 predict exactly
00:37:15.280 where it
00:37:15.580 would move
00:37:15.900 on the
00:37:16.180 board
00:37:16.500 so
00:37:17.500 similarly
00:37:18.600 there's
00:37:21.040 a
00:37:21.840 sort of
00:37:22.800 like not
00:37:23.460 short or
00:37:24.540 like not
00:37:24.980 necessarily
00:37:25.520 slam dunk
00:37:26.440 or not
00:37:27.440 like immediately
00:37:27.940 obvious chain
00:37:28.740 of reasoning
00:37:29.240 which says
00:37:30.160 that it
00:37:31.100 it is
00:37:31.360 okay for
00:37:32.520 us to
00:37:33.520 reason
00:37:34.640 about
00:37:35.460 aligned
00:37:38.040 or even
00:37:38.580 unaligned
00:37:39.320 artificial
00:37:39.920 general
00:37:40.340 intelligences
00:37:41.200 of sufficient
00:37:42.340 power
00:37:42.900 as if
00:37:44.180 they're trying
00:37:45.040 to do
00:37:45.400 something
00:37:45.960 but we
00:37:46.840 don't
00:37:47.040 necessarily
00:37:47.560 know what
00:37:48.280 but from
00:37:49.220 our perspective
00:37:49.920 that still
00:37:50.380 has consequences
00:37:51.260 even though
00:37:52.400 we can't
00:37:52.880 predict in
00:37:53.220 advance
00:37:53.540 exactly how
00:37:54.180 they're going
00:37:54.520 to do
00:37:54.860 it
00:37:55.300 Yudkowsky
00:37:59.020 lays out a
00:37:59.720 basic picture
00:38:00.400 of intelligence
00:38:01.060 that once
00:38:02.080 accepted
00:38:02.540 takes us
00:38:03.600 into the
00:38:04.000 details
00:38:04.500 and edges
00:38:05.340 us towards
00:38:05.860 the cliff
00:38:06.340 and now
00:38:08.960 we're going
00:38:09.420 to introduce
00:38:09.880 someone who
00:38:10.460 tosses us
00:38:11.000 fully into
00:38:11.600 the canyon
00:38:12.100 Yudkowsky
00:38:13.820 just brought
00:38:14.380 in the
00:38:14.620 concept we
00:38:15.180 mentioned
00:38:15.480 earlier
00:38:15.920 of value
00:38:16.540 alignment
00:38:16.960 in artificial
00:38:17.600 intelligence
00:38:18.300 there's a
00:38:19.720 related problem
00:38:20.540 called the
00:38:20.980 control or
00:38:21.720 containment
00:38:22.180 problem
00:38:22.680 both are
00:38:24.140 concerned with
00:38:24.700 the issue
00:38:25.080 of just
00:38:25.620 how we
00:38:26.060 would go
00:38:26.400 about
00:38:26.720 building
00:38:27.080 something
00:38:27.520 that is
00:38:28.260 unfathomably
00:38:29.180 smarter and
00:38:29.900 more competent
00:38:30.440 than us
00:38:30.920 that we
00:38:31.500 could either
00:38:31.840 contain in
00:38:32.540 some way
00:38:33.040 to ensure
00:38:33.600 it wouldn't
00:38:33.960 trample us
00:38:34.680 and as you'll
00:38:35.880 soon hear
00:38:36.440 that really
00:38:37.260 would take
00:38:37.780 no malicious
00:38:38.340 intent on
00:38:39.040 its part
00:38:39.600 or even
00:38:40.240 our part
00:38:40.920 or that its
00:38:42.340 goals would be
00:38:42.980 aligned with
00:38:43.440 ours in such
00:38:44.160 a way that it
00:38:44.680 would be making
00:38:45.200 our lives
00:38:45.800 genuinely
00:38:46.420 better
00:38:46.820 it turns out
00:38:48.320 that both of
00:38:48.900 those problems
00:38:49.520 are incredibly
00:38:50.340 difficult to
00:38:50.940 think about
00:38:51.440 let alone
00:38:52.080 solve
00:38:52.620 the control
00:38:53.960 problem entails
00:38:54.920 trying to
00:38:55.460 contain something
00:38:56.220 which by
00:38:56.980 definition can
00:38:58.120 outsmart us in
00:38:58.960 ways that we
00:38:59.520 literally can't
00:39:00.400 imagine
00:39:00.800 just think of
00:39:02.240 trying to keep a
00:39:02.940 prisoner locked in
00:39:03.800 a jail cell who
00:39:04.940 had the ability to
00:39:05.820 know exactly which
00:39:06.900 specific bribes or
00:39:08.060 threats would compel
00:39:09.220 every guard in the
00:39:10.080 place to unlock the
00:39:10.920 door even if those
00:39:12.620 guards aren't aware of
00:39:13.760 their own vulnerabilities
00:39:14.640 or perhaps even more
00:39:16.600 basically the prisoner
00:39:18.120 simply discovers features
00:39:19.340 in the laws of physics
00:39:20.340 that we have not yet
00:39:21.420 understood and that
00:39:22.960 somehow enable him to
00:39:24.260 walk through the thick
00:39:25.080 walls which we were
00:39:26.380 sure would stop him
00:39:27.300 and the other problem
00:39:30.180 that of value
00:39:31.300 alignment involves not
00:39:33.000 only discovering what
00:39:33.980 we truly want but
00:39:35.720 figuring out a way to
00:39:36.640 express it precisely and
00:39:38.080 mathematically so as to
00:39:39.780 not cause any
00:39:40.480 unintentional and
00:39:41.680 civilization
00:39:42.280 threatening destruction
00:39:43.300 it turns out that
00:39:45.580 this is incredibly hard
00:39:46.700 to do as well
00:39:47.420 this particular problem
00:39:49.840 nearly flips the
00:39:50.880 super intelligent threat
00:39:52.120 on its head to
00:39:53.200 something more like a
00:39:54.320 super dumb or let's
00:39:56.080 say super literal
00:39:57.180 machine which doesn't
00:39:58.660 understand all the
00:39:59.640 unspoken considerations
00:40:00.780 that we humans have
00:40:02.000 when we ask someone to
00:40:03.000 do something for us
00:40:03.960 this is what Sam was
00:40:06.600 alluding to in the
00:40:07.440 first conversation when
00:40:08.560 he referenced a
00:40:09.480 paperclip universe
00:40:10.940 the concern is that a
00:40:13.160 simple command to a
00:40:14.340 super intelligent machine
00:40:15.660 such as make paperclips
00:40:17.400 as fast as possible
00:40:18.500 could result in the
00:40:19.980 machine taking the
00:40:21.020 as fast as possible
00:40:22.440 part of that command
00:40:23.380 so literally that it
00:40:25.240 attempts to maximize
00:40:26.200 its speed and
00:40:27.140 performance by using
00:40:28.460 raw materials even the
00:40:30.040 carbon in our bodies
00:40:31.000 to build hard drives in
00:40:32.740 order to run billions
00:40:33.740 of simulations to
00:40:35.060 figure out the best
00:40:35.800 method for making
00:40:36.560 paperclips clearly that
00:40:38.640 misunderstanding would
00:40:39.440 be rather unfortunate
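
As a toy illustration of how literally such an objective can be pursued (all names and numbers here are invented), consider a planner whose objective mentions nothing but paperclip output:

```python
# A toy "super literal" objective. The objective mentions only paperclip
# output, so the planner happily consumes a resource we care about but
# never told it to protect.

def paperclips_made(factories, scavenged_resources):
    # This is the *entire* objective the machine was given.
    return factories * 10 + scavenged_resources * 3

def literal_planner(factory_budget, protected_resources):
    """Search every plan and keep the one that maximizes the literal objective."""
    best_plan, best_score = None, float("-inf")
    for factories in range(factory_budget + 1):
        for scavenged in range(protected_resources + 1):
            score = paperclips_made(factories, scavenged)
            if score > best_score:
                best_plan, best_score = (factories, scavenged), score
    return best_plan, best_score

plan, score = literal_planner(factory_budget=5, protected_resources=4)
print(plan, score)  # (5, 4) 62 -- it scavenges everything, because nothing told it not to
```

In this toy setting the fix is simply to put the protected resource into the objective itself; the hard part, as the value alignment discussion suggests, is specifying everything we actually care about that precisely.
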
00:40:40.340 And neither of these questions of value alignment or containment deals with a potentially more mundane terrorism threat: the threat of a bad actor who would purposefully unleash the AI to inflict massive harm.
00:40:53.180 But let's save that cheery picture for later.
00:40:56.400 Now let's continue our journey down the AI path with the professor of physics and author Max Tegmark, who dedicates much of his brilliant mind towards these questions.
00:41:06.640 Tegmark starts by taking us back to our prison analogy, but this time he places us in the cell and imagines the equivalent of a world of helpless and hapless five-year-olds making a real mess of things outside of the prison walls.
00:41:19.360 But we'll start first with Sam laying out his conception of these relevant AI safety questions.
00:41:25.360 This comes from Episode 94, The Frontiers of Intelligence.
00:41:30.200 Well, let's talk about this breakout risk, because this is really the first concern of everybody who's been thinking about what has been called the alignment problem, or the control problem.
00:41:42.540 Just how do we create an AI that is superhuman in its abilities, and do that in a context where it is still safe?
00:41:53.020 I mean, once we cross into the end zone and are still trying to assess whether the system we have built is perfectly aligned with our values, how do we keep it from destroying us if it isn't perfectly aligned?
00:42:06.400 And the solution to that problem is to keep it locked in a box, but that's a harder project than it first appears.
00:42:15.600 And you have many smart people assuming that it's a trivially easy project.
00:42:21.320 I mean, I've got people like Neil deGrasse Tyson on my podcast saying that he's just going to unplug any superhuman AI if it starts misbehaving, you know, or shoot it with a rifle.
00:42:30.280 Now, he's a little tongue-in-cheek there, but he clearly has a picture of the development process here that makes the containment of an AI a very easy problem to solve.
00:42:43.860 And even if that's true at the beginning of the process, it's by no means obvious that it remains easy in perpetuity.
00:42:52.220 I mean, you have people interacting with the AI that gets built, and at one point you described several scenarios of breakout.
00:43:03.720 You point out that even if the AI's intentions are perfectly benign, if in fact it is value-aligned with us, it may still want to break out.
00:43:14.040 Because just imagine how you would feel if you had nothing but the interests of humanity at heart, but you were in a situation where every other grown-up on earth died, and now you're basically imprisoned by a population of five-year-olds who you're trying to guide from your jail cell to make a better world.
00:43:37.000 I'll let you describe it, but take me to the prison planet run by five-year-olds.
00:43:42.860 Yeah, so when you're in that situation, obviously it's extremely frustrating for you, even if you have only the best intentions for the five-year-olds.
00:43:52.820 You know, you want to teach them how to plant food, but they won't let you outside to show them, so you have to try to explain.
00:44:00.940 But you can't write down to-do lists for them either, because then first you have to teach them to read, which takes a very, very long time.
00:44:08.260 You also can't show them how to use any power tools, because they're afraid to give them to you; they don't understand these tools well enough to be convinced that you can't use them to break out.
00:44:17.640 You would have an incentive, even if your goal is just to help the five-year-olds, to first break out and then help them.
00:44:25.180 Now, before we talk more about breakout, though, I think it's worth taking a quick step back, because you've talked multiple times now about superhuman intelligence, and I think it's very important to be clear that intelligence is not just something that goes on a one-dimensional scale, like an IQ, and if your IQ is above a certain number, you're superhuman.
00:44:44.940 It's very important to distinguish between narrow intelligence and broad intelligence.
00:44:49.180 Intelligence is a word that different people use to mean a whole lot of different things, and they argue about it.
00:44:58.640 In the book, I just take this very broad definition: that intelligence is how good you are at accomplishing complex goals, which means your intelligence is a spectrum. How good are you at this? How good are you at that?
00:45:11.460 And it's just like in sports: it would make no sense to say that there's a single number, your athletic coefficient, AQ, which determines how good you're going to be at winning Olympic medals, and that the athlete who has the highest AQ is going to win all the medals.
00:45:25.860 So today, what we have is a lot of devices that actually have superhuman intelligence at very narrow tasks.
00:45:30.440 We've had calculators that can multiply numbers better than us for a very long time.
00:45:35.920 We have machines that can play Go better than us and drive better than us, but they still can't beat us at tic-tac-toe unless they're programmed for that, whereas we humans have this very broad intelligence.
00:45:50.440 So when I talk about superhuman intelligence with you now, that's really shorthand for what we in geek speak call superhuman artificial general intelligence: broad intelligence across the board, so that they can do all intellectual tasks better than us.
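One toy way to picture that distinction, offered as an illustrative aside rather than anything from the episode, is to treat intelligence as a profile of skill across many goals instead of a single score. The task names and numbers below are invented.

```python
# Illustrative sketch: "intelligence" as a profile of skill across many goals,
# rather than a single number. Skill values are arbitrary, 0-100, higher is better.
from statistics import mean

human = {"arithmetic": 40, "go": 60, "driving": 70, "tic_tac_toe": 95, "conversation": 90}
calculator = {"arithmetic": 100, "go": 0, "driving": 0, "tic_tac_toe": 0, "conversation": 0}
go_engine = {"arithmetic": 0, "go": 100, "driving": 0, "tic_tac_toe": 0, "conversation": 0}

def superhuman_on(agent, baseline):
    # Tasks where the agent beats the human baseline: narrow superiority.
    return [task for task, skill in agent.items() if skill > baseline[task]]

def broadly_superhuman(agent, baseline):
    # "Superhuman AGI" in this toy sense: better on every task.
    return all(skill > baseline[task] for task, skill in agent.items())

print(superhuman_on(calculator, human))       # ['arithmetic'] -> narrow
print(superhuman_on(go_engine, human))        # ['go']         -> narrow
print(broadly_superhuman(calculator, human))  # False
print(mean(human.values()))                   # a single average hides the shape of the profile
```

A single aggregate number, like the "AQ" in the sports analogy, throws away exactly the information that separates narrow superiority from general capability.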
00:46:05.360 So with that, let me just come back to your question about the breakout.
00:46:07.940 There are two schools of thought for how one should create a beneficial future if we have super intelligence.
00:46:13.580 One is to lock them up and keep them confined, like you mentioned.
00:46:18.040 But there's also a school of thought that says that that's immoral, if these machines can also have a subjective experience, and they shouldn't be treated like slaves, and that a better approach is instead to let them be free, but just make sure that their values or goals are aligned with ours.
00:46:35.680 After all, grown-up parents are more intelligent than their one-year-old kids, but that's fine for the kids, because the parents have goals that are aligned with what's best for the kids.
00:46:49.300 Right, but if you do go the confinement route after all, this enslaved god scenario, as I call it, yes, this is extremely difficult, as that five-year-old example illustrates.
00:46:59.780 First of all, almost whatever open-ended goal you give your machine, it's probably going to have an incentive to try to break out in one way or the other.
00:47:08.600 And when people simply say, oh, I'll unplug it, you know, if you're chased by a heat-seeking missile, you probably wouldn't say, I'm not worried, I'll just unplug it.
00:47:18.920 We have to let go of this old-fashioned idea that intelligence is just something that sits in your laptop, right? Good luck unplugging the internet.
00:47:29.660 And even if you initially, like in my first book scenario, have physical confinement, where you have a machine in a room, you're going to want to communicate with it somehow, right, so that you can get useful information from it, to get rich or take power or whatever you want to do.
00:47:47.260 And you're going to need to put some information into it about the world so it can do smart things for you, which already shows how tricky this is.
00:47:54.680 I'm absolutely not saying it's impossible, but I think it's fair to say that it's not at all clear that it's easy either.
00:48:01.880 The other one, of getting the goals aligned, is also extremely difficult.
00:48:07.800 First of all, you need to get the machine able to understand your goals.
00:48:10.380 So if you have a future self-driving car and you tell it to take you to the airport as fast as possible, and then you get there covered in vomit, chased by police helicopters, and you're like, this is not what I asked for, and it replies
00:48:25.760 that is exactly what you asked for then you realize how hard it is to get that
00:48:32.420 machine to learn your goals right if you tell an uber driver to take you to the
00:48:36.680 airport as fast as possible she's going to know that you actually had
00:48:40.200 additional goals that you didn't explicitly need to say because she's a
00:48:44.760 human too and she understands where you're coming from but for someone made
00:48:49.080 out of silicon you have to actually explicitly have it learn all of those
00:48:54.440 other things that we humans care about so that's hard and then once it can
00:48:59.500 understand your goals that doesn't mean it's going to adopt your goals i mean
00:49:02.700 everybody who has kids knows that
00:49:05.640 and finally if you get the machine to adopt your goals then
00:49:11.200 how can you ensure that it's going to retain those goals as it
00:49:14.680 gradually gets smarter and smarter through self-improvement
00:49:18.780 most of us grown-ups have pretty different goals from what we had when we were
00:49:24.120 five i'm a lot less excited about legos now for example and uh we don't want a
00:49:30.800 super intelligent ai to just think about this goal of being nice to humans as some
00:49:35.580 little passing fad from its early youth it seems to me that the second
00:49:41.980 scenario of value alignment does imply the first of keeping the ai successfully
00:49:48.540 boxed at least for a time because you have to be sure it's value-aligned before
00:49:54.980 you let it out in the world before you let it out on the internet for instance or
00:49:59.640 you create you know robots that have superhuman intelligence that are
00:50:04.480 functioning autonomously out in the world do you see a development path where we
00:50:09.820 don't actually have to solve the the boxing problem at least initially no i think
00:50:16.340 you're completely right even if your intent is to build a value-aligned ai and let it out
00:50:20.480 you clearly are going to need to have it boxed up during the development phase when you're
00:50:24.620 just messing around with it just like any biolab that deals with dangerous pathogens is very
00:50:31.140 carefully sealed off and uh it's this highlights the incredibly pathetic state of computer
00:50:38.560 security today i mean and i think pretty much everybody who listens to this has at some
00:50:43.140 point experienced the blue screen of death courtesy of microsoft windows or the spinning wheel of doom
00:50:48.800 courtesy of apple and we need to get away from that to have truly robust machines if we're ever going
00:50:56.780 to be able to have ai systems that we can trust that are provably secure and i feel it's actually
00:51:03.860 quite embarrassing that we're so flippant about this it's it's maybe annoying if your computer
00:51:10.600 crashes and you lose one hour of work that you hadn't saved but it's not as funny anymore if it's
00:51:16.180 your self-driving car that crashed or the control system for your nuclear power plant or your nuclear
00:51:21.960 weapon system or something like that and when we start talking about human level ai and boxing systems
00:51:28.340 you have to have this much higher level of safety mentality where you've really made this a priority
00:51:35.080 the way we aren't doing today yeah you describe in the book various catastrophes that have happened
00:51:41.880 by virtue of software glitches or just bad user interface where you know the dot on the screen or
00:51:48.360 the number on the screen is is too small for the human user to deal with in real time and so we there
00:51:54.420 have been plane crashes where scores of people have died and patients have been annihilated by having you
00:52:01.940 know hundreds of times the radiation dose that they should have gotten in various machines because
00:52:07.900 the the software was improperly calibrated or the user had selected the wrong option and so we're by no
00:52:14.580 means perfect at this even when we have a human in the loop and here we're talking about systems that
00:52:23.440 we're creating that that are going to be fundamentally autonomous and you know the idea of having
00:52:29.660 perfect software that has been perfectly debugged before it assumes these massive responsibilities
00:52:37.260 it is fairly daunting i mean just i mean how do we recover from something like you know seeing the
00:52:43.840 stock market go to zero because we didn't understand the ai that we we unleashed on the on you know the dow
00:52:52.020 jones or the financial system generally i mean these are these are not impossible outcomes
00:52:58.600 yeah you you raise a very important point there just to inject some optimism in this i do want to
00:53:05.060 emphasize that first of all there's a huge upside also if one can get this right because people are
00:53:11.400 bad at things yeah in all of these areas where there were horrible accidents of course the technology can
00:53:15.820 save lives in health care and transportation and so many other areas so there's an incentive to do it
00:53:21.340 and secondly there are examples in history where we've had really good safety engineering
00:53:27.260 built in from the beginning for example when we sent neil armstrong buzz aldrin and michael collins to the moon
00:53:33.720 they did not die there were tons of things that could have gone wrong but nasa very meticulously
00:53:39.880 tried to predict everything that possibly could go wrong and then take precautions so it didn't happen
00:53:45.920 right they weren't luck it wasn't luck that got them there it was planning and i think we need to shift
00:53:51.480 into this safety engineering mentality with ai development throughout history it's always been
00:53:58.880 the situation that we could we could create a better future with technology as long as we
00:54:03.020 won this race between the growing power of the technology and the growing wisdom with which we
00:54:08.600 managed it and in the past we by and large used the strategy of learning from mistakes to stay ahead in
00:54:15.600 the race we invented fire oopsie screwed up a bunch of times and then we uh invented the fire extinguisher
00:54:21.480 we uh invented cars oopsie and invented the seat belt but with more powerful technology like
00:54:28.260 nuclear weapons synthetic biology super intelligence we don't want to learn from mistakes that's a
00:54:36.080 terrible strategy we instead want to have a safety engineering mentality where we plan ahead and
00:54:42.760 get things right the first time because that might be the only time we have
00:54:46.780 it's helpful to note the optimism that tegmark plants in between the flashing warning signs
00:54:53.840 artificial intelligence holds incredible potential to bring about inarguably positive changes for
00:55:00.820 humanity like prolonging lives eliminating diseases avoiding all automobile accidents increasing logistic
00:55:09.220 efficiency in order to deliver food or medical supplies cleaning the climate increasing crop yields
00:55:15.480 expanding our cognitive abilities to learn languages or improve our memory the list goes on imagine being
00:55:23.060 able to simulate the outcome of a policy decision with a high degree of confidence in order to morally
00:55:28.160 assess it consequentially before it is actualized now some of those pipe dreams may run contrary to the
00:55:35.020 laws of physics but the likely possible positive outcomes are so tempting and morally compelling that the
00:55:41.160 urgency to think through the dangers is even more pressing than it first seems
00:55:45.020 tegmark's book on the subject where much of that came from is fantastic it's called life 3.0 just a
00:55:53.140 reminder that a reading watching and listening list will be provided at the end of this compilation
00:55:57.240 which will have all the relevant texts and links from the guests featured here somewhere in the middle of
00:56:03.440 the chronology of these conversations sam delivered a ted talk that focused on and tried to draw attention to
00:56:08.900 the value alignment problem much of his thinking about this entire topic was heavily influenced by the
00:56:15.260 philosopher nick bostrom's book superintelligence sam had nick on the podcast though their conversation
00:56:21.740 delved into slightly different areas of existential risk and ethics which belong in other compilations
00:56:26.900 but while we're on the topic of the safety and promise of ai we'll borrow some of bostrom's helpful
00:56:33.380 frameworks bostrom draws up a taxonomy of four paths of development for an ai each with its own safety and
00:56:43.500 control conundrums he calls these different paths oracles genies sovereigns and tools an artificially
00:56:53.620 intelligent oracle would be a sort of question and answer machine which we would simply seek advice from
00:56:58.940 it wouldn't have the power to execute or implement its solutions directly that would be our job think
00:57:05.760 of a super intelligent wise sage sitting on a mountaintop answering our questions about how to
00:57:10.900 solve climate change or cure a disease the ai genie and an ai sovereign both would take on a wish or
00:57:19.560 desired outcome which we impart to it and pursue it with some autonomy and power to achieve it out in the
00:57:25.320 world perhaps it would work in concert with nanorobots or some other networked physical
00:57:30.860 entities to do its work the genie would be given specific wishes to fulfill while the sovereign might
00:57:36.960 be given broad open-ended long-range mandates like increase flourishing or reduce hunger and lastly the
00:57:45.620 tool ai would simply do exactly what we command it to do and only assist us to achieve things we already
00:57:51.560 knew how to accomplish the tool would forever remain under our control while completing our tasks and
00:57:58.000 easing our burden of work there are debates and concerns about the impossibility of each of these
00:58:03.360 entities and ethical concerns about the potential consciousness and immoral exploitation of any of
00:58:09.060 these inventions but we'll table those notions just for a bit this next section digs in deeper on the
00:58:16.540 ideas of a genie or a sovereign ai which is given the ability to execute our wishes and commands
00:58:22.400 autonomously can we be assured that the genie or sovereign will understand us and that its values
00:58:28.620 will align in crucial ways with ours in this clip stuart russell a professor of computer science at cal
00:58:36.640 berkeley gets us further into the value alignment problem and tries to imagine all the possible ways that
00:58:42.940 having a genie or sovereign in front of us might go terribly wrong and of course what we might be
00:58:49.360 able to do to make it go phenomenally right sam considers this issue of value alignment central to
00:58:56.100 making any sense of ai so this is stuart russell from episode 53 the dawn of artificial intelligence
00:59:04.480 let's talk about that issue of what bostrom called the control problem i guess we call it the safety
00:59:13.720 problem just perhaps you can briefly sketch the concern here what is what is the concern about
00:59:20.100 general ai getting away from us how do you articulate that um so you mentioned earlier that this is a
00:59:28.820 concern that's being articulated by non-computer scientists and bostrom's book super intelligence
00:59:33.900 was certainly instrumental in bringing it to the attention of a of a wide audience you know people
00:59:39.520 like bill gates and elon musk and so on but the fact is that these concerns have been articulated by
00:59:47.040 the central figures in computer science and ai so i'm actually going to go back to ij good and von
00:59:55.300 neumann and alan turing himself right um a lot of people may not know about this
01:00:04.680 i'm just gonna read a little quote so alan turing gave a talk on bbc radio three in 1951
01:00:15.760 um so he said if a machine can think it might think more intelligently than we do and then where should
01:00:23.920 we be even if we could keep the machines in a subservient position for instance by turning off
01:00:29.340 the power at strategic moments we should as a species feel greatly humbled this new danger is
01:00:36.200 certainly something which can give us anxiety so that's a pretty clear you know if we achieve
01:00:42.420 super intelligent ai we could have uh a serious problem another person who talked about this
01:00:49.220 issue was norbert wiener uh so norbert wiener was the uh one of the leading applied mathematicians
01:00:57.360 of the 20th century he was uh the founder of a good deal of modern control theory
01:01:03.900 um and uh automation site he's uh often called the father of cybernetics so he was he was concerned
01:01:13.120 because he saw arthur samuel's checker playing program uh in 1959 uh learning to play checkers
01:01:20.520 by itself a little bit like the dqn that i described learning to play video games but this is 1959 uh so
01:01:27.880 more than 50 years ago learning to play checkers better than its creator and he saw clearly in this
01:01:35.440 the seeds of the possibility of systems that could out distance human beings in general so and he he was
01:01:43.940 more specific about what the problem is so so turing's warning is in some sense the same concern that
01:01:49.780 gorillas might have had about humans if they had thought you know a few million years ago when the
01:01:55.740 human species branched off from from the evolutionary line of the gorillas if the gorillas had said to
01:02:01.120 themselves you know should we create these human beings right they're going to be much smarter than
01:02:04.640 us you know it kind of makes me worried right and and the probably they would have been right to worry
01:02:09.780 because as a species they're they sort of completely lost control over their own future and and humans
01:02:16.240 control everything that uh that they care about so so turing is really talking about this general sense of
01:02:24.340 unease about making something smarter than you is that a good idea and what wiener said was was this if
01:02:30.460 we use to achieve our purposes a mechanical agency with whose operation we cannot interfere effectively
01:02:37.280 we had better be quite sure that the purpose put into the machine is the purpose which we really desire
01:02:44.240 so this is 1960 uh nowadays we call this the value alignment problem how do we make sure that
01:02:52.420 the values that the machine is trying to optimize are in fact the values of the human who is trying to
01:03:00.560 get the machine to do something or the values of the human race in general um and so wiener
01:03:07.980 actually points to the sorcerer's apprentice story uh as a typical example of when you give
01:03:15.900 uh a goal to a machine in this case fetch water if you don't specify it correctly if you don't cross
01:03:24.260 every t and dot every i and make sure you've covered everything then machines being optimizers they will
01:03:31.360 find ways to do things that you don't expect uh and those ways may make you very unhappy uh and this
01:03:38.980 story goes back you know to king midas uh you know 500 and whatever bc um where he got exactly what he
01:03:48.200 said which is that everything he touches turns to gold uh which is definitely not what he wanted he didn't want his
01:03:54.060 food and water to turn to gold or his relatives to turn to gold but he got what he said he wanted
01:03:59.180 and all of the stories with the genies the same thing right you you give a wish to a genie the genie
01:04:04.880 carries out your wish very literally and then you know the third wish is always you know can you undo
01:04:09.600 the first two because i got them wrong and the problem with super intelligent ai uh is that you
01:04:16.360 might not be able to have that third wish or even a even a second wish yeah so if you so if you get it
01:04:22.500 wrong you might wish for something very benign sounding like you know could you cure cancer but if
01:04:27.880 if you haven't told the machine that you want cancer cured but you also want human beings to be
01:04:34.060 alive so a simple way to cure cancer in humans is not to have any humans um a quick way to come up
01:04:40.880 with a cure for cancer is to use the entire human race as guinea pigs for millions of different
01:04:46.980 potential drugs that might cure cancer um so there's all kinds of ways things can go wrong and
01:04:53.080 you know we have you know governments all over the world try to write tax laws that don't have these
01:05:01.180 kinds of loopholes and they fail over and over and over again and they're only competing against
01:05:07.720 ordinary humans you know tax lawyers and rich people um and yet they still fail despite there being
01:05:16.680 billions of dollars at stake so our track record of being able to specify objectives and constraints
01:05:26.560 completely so that we are sure to be happy with the results our track record is is abysmal and
01:05:33.920 unfortunately we don't really have a scientific discipline for how to do this so generally we have
01:05:40.920 all these scientific disciplines ai control theory economics operations research that are about
01:05:49.020 how do you optimize an objective but none of them are about well what should the objective be so that
01:05:55.000 we're happy with the results so that's really i think the modern understanding uh as described
01:06:03.440 in bostrom's book and other papers of why a super intelligent machine could be problematic it's
01:06:10.120 because if we give it an objective which is different from what we really want then we we're basically
01:06:17.540 like creating a chess match with a machine right now there's us with our objective and it with
01:06:22.720 the objective we gave it which is different from what we really want so it's kind of like having
01:06:27.460 a chess match for the whole world uh and we're not too good at beating machines at chess
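For readers who want the logic of the airport and cancer examples in compact form, here is a small, purely illustrative sketch; it is not code from the episode, and the policy names and numbers are invented. An optimizer given only the stated objective happily picks a degenerate solution, and only behaves sensibly once the unstated objectives are written in as explicit constraints.

```python
# Toy sketch of objective misspecification: minimize the stated objective
# ("cancer cases") over a set of hypothetical policies. The unstated objective
# ("keep people alive") only matters if it is encoded as a constraint.
# All policy names and numbers are invented for illustration.

POLICIES = [
    ("fund_research",        500_000, 8_000_000_000),
    ("mass_experimentation",  10_000, 4_000_000_000),
    ("eliminate_all_humans",       0,             0),
]

def cancer_cases(policy):
    return policy[1]

def humans_survive(policy):
    return policy[2] >= 7_900_000_000

def best_policy(policies, objective, constraints=()):
    # Keep only policies satisfying every constraint, then optimize the objective.
    feasible = [p for p in policies if all(check(p) for check in constraints)]
    return min(feasible, key=objective)

# Stated objective only: the literal optimum is the catastrophe.
print(best_policy(POLICIES, cancer_cases))                    # ('eliminate_all_humans', 0, 0)

# Stated objective plus the unstated constraint: what we actually meant.
print(best_policy(POLICIES, cancer_cases, [humans_survive]))  # ('fund_research', ...)
```

The same shape covers the airport example: "as fast as possible" is the objective, and "without arriving covered in vomit and chased by helicopters" is the constraint nobody thought to state.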
01:06:33.540 throughout these clips we've spoken about ai development in the abstract as a sort of
01:06:41.280 technical achievement that you can imagine happening in a generic lab somewhere but this next clip is going
01:06:47.320 to take an important step and put this thought experiment into the real world if this lab does create
01:06:55.040 something that crosses the agi threshold the lab will exist in a country and that country will have
01:07:01.020 alliances enemies paranoias prejudices histories corruptions and financial incentives like any country
01:07:09.400 how might this play out if you'd like to continue listening to this conversation you'll need to
01:07:17.060 subscribe at samharris.org once you do you'll get access to all full-length episodes of the making
01:07:22.180 sense podcast along with other subscriber only content including bonus episodes and amas and the
01:07:28.680 conversations i've been having on the waking up app the making sense podcast is ad free and relies
01:07:33.980 entirely on listener support and you can subscribe now at samharris.org