#53 — The Dawn of Artificial Intelligence
Episode Stats
Words per Minute
158.50911
Summary
Stuart Russell is a professor of computer science and engineering at UC Berkeley, and an adjunct professor of neurological surgery at UC San Francisco. He is the author of the most widely read textbook on the subject of artificial intelligence, Artificial Intelligence: A Modern Approach. In this conversation, we explore the topics that you may have heard me raise in my TED Talk on AI: what happens if we actually build machines that are more intelligent than we are, why that question deserves to be taken seriously now, and what success in artificial intelligence would mean for the future of the world. We also talk about the role of AI in our everyday lives and what it means to be a computer scientist in the 21st century. I hope you find the conversation useful and interesting. If you do, please consider becoming a supporter of the podcast. We don't run ads; the show is made possible entirely through the support of our subscribers. - Sam Harris
Transcript
00:00:10.880
Just a note to say that if you're hearing this, you are not currently on our subscriber
00:00:14.680
feed and will only be hearing the first part of this conversation.
00:00:18.420
In order to access full episodes of the Making Sense Podcast, you'll need to subscribe at samharris.org.
00:00:24.060
There you'll find our private RSS feed to add to your favorite podcatcher, along with
00:00:30.520
We don't run ads on the podcast, and therefore it's made possible entirely through the support
00:00:35.880
So if you enjoy what we're doing here, please consider becoming one.
00:00:49.540
He is a professor of computer science and engineering at UC Berkeley.
00:00:53.840
He's also an adjunct professor of neurological surgery at UC San Francisco.
00:01:00.820
He is the author of the most widely read textbook on the subject of AI, Artificial Intelligence: A Modern Approach.
00:01:09.080
And over the course of these 90 minutes or so, we explore the topics that you may have
00:01:16.080
Anyway, Stuart is an expert in this field and a wealth of information, and I hope you find
00:01:25.860
I increasingly think that this is a topic that will become more and more pressing every
00:01:33.780
day, and if it doesn't for some reason, it will only be because scarier things have
00:01:43.640
So things are going well if we worry more and more about the consequences of AI, or so
00:02:06.540
Our listeners should know you've been up nearly all night working on a paper relevant to our
00:02:11.820
topic at hand, so double thank you for doing this.
00:02:18.000
Well, you've got now nearly infinite latitude not to be, so perhaps you can tell us a little
00:02:27.200
So I'm a professor at Berkeley, a computer scientist, and I've worked in the area of artificial
00:02:32.820
intelligence for about 35 years now, starting with my PhD at Stanford.
00:02:40.700
For most of that time, I've been what you might call a mainstream AI researcher.
00:02:45.460
I work on machine learning and probabilistic reasoning, planning, game playing, all the
00:02:53.480
And then the last few years, although this has been something that's concerned me for
00:03:01.800
a long time, I wrote a textbook in 1994 where I had a section of a chapter talking about
00:03:11.080
what happens if we succeed in AI, meaning what happens if we actually build machines that
00:03:19.580
Um, so that was sort of a, an intellectual question and it's become, um, a little bit
00:03:26.900
more urgent in the last few years as progress is accelerating and the resources going into
00:03:38.460
So I'm really asking people to take the question seriously.
00:03:44.800
As you know, I've joined the chorus of people who really in the last two years have begun
00:03:50.640
worrying out loud about the consequences of AI or the consequences of us not building it
00:03:57.600
with more or less perfect conformity to our interests.
00:04:02.200
And one of the things about this chorus is that it's mostly made up of non-computer scientists
00:04:09.300
and therefore people like myself or Elon Musk or even physicists like Max Tegmark and Stephen
00:04:16.500
Hawking are seemingly dismissed with alacrity by computer scientists who are deeply skeptical
00:04:26.740
And now you are not so easily dismissed because you are, you have the, the really the perfect
00:04:33.360
So I want to get us into this territory and I actually want, you know, I don't actually
00:04:37.600
know that you are quite as worried as, as I have sounded publicly.
00:04:41.740
So if there's any difference between your take and mine, that would be interesting to
00:04:46.540
But I also want us to, to, at some point, I'd like you to express the, the soundest basis
00:04:52.640
for this kind of skepticism that, you know, that we are crying wolf in a way that is unwarranted.
00:04:58.800
But before we get there, I just want to ask you a few questions to, to get our bearings.
00:05:03.260
The main purpose here is also just to educate our listeners about, you know, what artificial
00:05:07.900
intelligence is and what its implications are, whether if everything goes well or if everything
00:05:13.600
So a very disarmingly simple question here at first, what is a computer?
00:05:19.780
Well, uh, so pretty much everyone these days has a computer, but, um, doesn't necessarily
00:05:28.260
understand what it is, uh, the way it's presented to the public, whether it's your, your smartphone
00:05:34.720
or your laptop is something that runs a bunch of applications and the applications do things
00:05:40.540
like, you know, edit word documents, um, allow, you know, face-to-face video chat and things
00:05:48.360
Um, and what people may not understand is, is that a computer is, is a universal machine
00:05:53.500
that any process that can be described precisely can be carried out, uh, by a computer and, and
00:06:02.040
every computer can simulate every other computer.
00:06:04.560
Uh, and this, this property of universality means that, um, uh, that intelligence itself is
00:06:13.780
something that a computer can in principle, uh, emulate.
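To make the universality claim concrete, here is a minimal sketch (not from the conversation) of a single generic simulator that can run any machine described precisely as a rule table; the unary incrementer shown is an invented example.

```python
# Minimal Turing-machine simulator: one generic loop can run any machine
# described precisely as a rule table (illustrative sketch only).

def run_turing_machine(rules, tape, state="start", halt="halt", max_steps=1000):
    """rules maps (state, symbol) -> (new_symbol, move, new_state)."""
    tape = dict(enumerate(tape))                 # sparse tape representation
    head = 0
    for _ in range(max_steps):
        if state == halt:
            break
        symbol = tape.get(head, "_")             # "_" means a blank cell
        new_symbol, move, state = rules[(state, symbol)]
        tape[head] = new_symbol
        head += 1 if move == "R" else -1
    return "".join(tape[i] for i in sorted(tape))

# Example machine (hypothetical): append a '1' to a block of 1s.
increment_rules = {
    ("start", "1"): ("1", "R", "start"),   # skip over existing 1s
    ("start", "_"): ("1", "R", "halt"),    # write one more 1, then halt
}
print(run_turing_machine(increment_rules, "111"))  # -> "1111"
```

The same `run_turing_machine` loop, unchanged, would execute any other rule table you hand it, which is the sense in which one machine can simulate every other.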
00:06:18.660
And this was realized, um, among other people by Ada Lovelace in the 1850s when she was working
00:06:28.460
Um, they had this idea that the machine they were designing might be a universal machine,
00:06:33.140
although they couldn't define that very precisely.
00:06:37.220
Um, and so the immediate thought is, well, if it's universal, then it can, uh, it can carry
00:06:44.220
out the processes of intelligence, um, as well as, you know, ordinary mechanical calculations.
00:06:50.860
So a computer, a computer is really anything, anything you want, uh, that you can describe
00:07:01.780
These sound like very simple questions, but these are, you know, disconcertingly deep questions.
00:07:06.800
I think everyone understands that, um, out there is, is a world, the real world.
00:07:14.300
Um, and we don't know everything about the real world.
00:07:20.980
In fact, it could be, there's a gazillion different ways the world could be, you know,
00:07:25.220
all, all the cars that are out there parked could be parked in different places and I wouldn't
00:07:29.760
So there are many, many ways the world could be.
00:07:31.580
And information is just something that, uh, tells you, uh, a little bit more about what,
00:07:39.020
uh, you know, what the world is, which, which way is the, is the real world, uh, out of all
00:07:47.820
And as you get more and more information about the world through, typically we get it through
00:07:52.340
our eyes and ears, uh, and increasingly we're getting it through the internet.
00:07:56.580
Um, then that, that information helps to, helps to narrow down the ways that the real
00:08:04.440
And Shannon, uh, who is a, uh, electrical engineer at MIT, uh, figured out a way to actually quantify
00:08:15.640
So, um, if you think about, uh, a coin flip, um, if I can tell you which way that coin is going
00:08:25.140
to come out, uh, heads or tails, then that, that's one bit of information.
00:08:30.520
And, uh, so that lets you give you, gives you the answer for a binary choice between two
00:08:36.940
And, uh, so from information theory, we have, um, we have wireless communication, we have
00:08:44.040
the internet, we have, uh, you know, all the things that allow computers to, to talk to
00:08:52.380
So information theory, uh, has been in some sense the, the complement or the handmaiden
00:08:58.680
of, of computation, um, and allowing, uh, allowing the whole information revolution to happen.
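To make the coin-flip example concrete: resolving a fair coin flip conveys -log2(1/2) = 1 bit, and resolving a choice among eight equally likely options conveys 3 bits. A minimal sketch of that calculation (not from the conversation):

```python
import math

def information_bits(probability):
    """Shannon information content, in bits, of observing an event with this probability."""
    return -math.log2(probability)

print(information_bits(1 / 2))   # fair coin flip resolved -> 1.0 bit
print(information_bits(1 / 8))   # one outcome out of 8 equally likely -> 3.0 bits
```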
00:09:05.320
Now, is there an important difference between what you just described, computers and, and
00:09:16.180
Let's leave consciousness aside for the, for the moment.
00:09:18.980
But if I asked you, what is a mind, would you have answered that question differently?
00:09:24.940
So I, I think I would, because the mind, the word mind carries with it, um, this, this
00:09:34.320
It's not, uh, with the word mind, you can't really put aside the notion of consciousness.
00:09:40.600
Except if you're talking, I mean, if you're talking about the unconscious mind, you know,
00:09:43.980
like the, all the unconscious cognitive processing we do, does mind seem a misnomer there without
00:09:52.300
Unconscious mind is kind of like saying artificial grass.
00:09:58.680
Um, so just to give you a quote, John Haugeland has written a lot about, uh, AI.
00:10:05.040
He's a philosopher and he, he describes, uh, the notion of strong AI as it, as it used to
00:10:12.500
be called, uh, as building machines with minds in the full and literal sense.
00:10:18.080
Um, so, so the word mind there is really carrying the idea that there is true conscious awareness,
00:10:25.860
true semantic understanding, uh, and perception, you know, perception, perceptual experience.
00:10:33.420
And I actually think this is an incredibly important thing because without that, nothing
00:10:41.020
There are lots of complicated physical processes in the universe.
00:10:45.600
Um, you know, stars exploding and rivers and, you know, glaciers melting and all kinds of
00:10:52.020
But none of that has any moral value, uh, associated with it.
00:10:56.660
The things that generate moral value are things that have conscious experience.
00:11:01.240
So it's, that's a really, it's a really important topic, but AI has nothing to say about it
00:11:08.700
I guess it says that we're going to get there in terms of if consciousness is at some level,
00:11:13.460
just an emergent property of information processing.
00:11:15.860
If in fact, that is the punchline at the back of the book of nature, well, then we need to
00:11:21.120
think about the implications of building conscious machines, not just intelligent machines.
00:11:25.680
But you introduced a, a term here, which we should define.
00:11:29.760
You talked about strong versus weak AI, and I guess the, the more modern terms are narrow
00:11:40.000
So the word, the, the word strong and weak have actually changed their meaning over time.
00:11:44.960
Um, um, so strong AI was, I believe a phrase introduced by John Searle in his, uh, Chinese
00:11:52.480
room paper, or may have been slightly earlier than that.
00:11:55.960
But what he meant was the version of AI that says that if I build something with human level
00:12:04.020
intelligence, then in all probability, it's going to be a conscious device that, that the,
00:12:11.520
the functional properties of intelligence and consciousness are inseparable.
00:12:15.600
Um, and so strong AI is the sort of the super ambitious form of AI and, uh, weak AI was about
00:12:24.960
building AI systems that have capabilities that you want, that you want them to have.
00:12:30.740
Um, but they don't necessarily have the, uh, the consciousness or the, the first person experience.
00:12:37.320
Uh, so that, and then, and then I think there's been a number of, uh, people both inside and
00:12:46.100
outside the field sort of using strong and weak AI in, in various different ways.
00:12:50.040
And now largely you will see strong AI and sometimes general AI or artificial general intelligence
00:12:58.300
to mean, uh, building AI systems that have the capabilities comparable to or greater than
00:13:04.860
those of humans without any opinion being given on whether there's consciousness or not. Uh,
00:13:13.020
and then narrow AI, meaning AI systems that don't have the generality. They might be very capable,
00:13:18.580
like, you know, AlphaGo is a very capable Go player, but it's narrow in the sense that it can't do
00:13:23.580
anything else. So we don't think of it as general purpose intelligence. Right. And given that
00:13:29.140
consciousness is something that, uh, we just don't have, uh, a philosophical handle, let alone,
00:13:37.540
uh, a scientific handle on, um, I think for the time being, we'll just have to put it to one side
00:13:44.460
and the discussion, uh, is going to have to focus on capabilities on, on the functional properties
00:13:52.420
of, of intelligent systems. Well, there's this other term one hears in this area, which strikes
00:13:58.580
me as a, an actual, a term that, that names almost nothing possible, but it's human level AI. And
00:14:09.540
that is, you know, it's often put forward as kind of the, the nearer landmark to super intelligent AI or
00:14:18.900
something that's beyond human. But it seems to me that even our narrow AI at this point, you know,
00:14:25.060
the, the calculator in your phone or anything else that gets good enough for us to dignify it with the
00:14:31.540
name intelligence very quickly becomes superhuman, even in its narrowness. So the phone is a better
00:14:38.100
calculator than, than I am or will ever be. And if you imagine building a system that is a true general
00:14:46.500
intelligence, it's, it's, it's learning is not confined to one domain as opposed to another,
00:14:51.060
but it's much more like a human being and that it can learn across a wide range of domains without
00:14:56.180
having the, you know, learning in one domain degrade its learning in another. You very quickly,
00:15:02.020
if not immediately, we'll be talking about superhuman AI because presumably this system will,
00:15:09.220
it's not going to be a worse calculator than my phone, right? It's not going to be a worse chess
00:15:13.620
player than Deep Blue. It's not, at a certain point, it's going to very quickly be better than
00:15:18.740
humans at everything it can do. So is, is human level AI a mirage or is it, is there some serviceable
00:15:26.340
way to think about that concept? So I think human level AI is just a notional goal. And I, I basically
00:15:34.980
agree with you that if, if we can achieve the generality of human intelligence, then we will probably
00:15:40.980
exceed on many dimensions, the actual capabilities of humans. So there are, there are things that
00:15:48.740
humans do that we really have no idea how to do yet. For example, what, what humans have done collectively
00:15:57.300
in terms of creating science, we don't know how to get machines to do something like that.
00:16:03.700
I mean, we can, we can imagine that theoretically it's possible, you know, it's some, somewhere in
00:16:11.380
the space of programs, uh, there exists a program that, that could be, uh, a high quality scientist,
00:16:17.540
but we don't, we don't know how to make anything like that. So it's possible that we could have
00:16:24.820
human level capabilities on sort of, on all the mundane intellectual tasks that don't require these
00:16:32.740
really creative reformulations of our whole conceptual structure, uh, that happen from
00:16:38.740
time to time in science. And, and this is sort of what, what's happening already, right? I mean,
00:16:45.780
in, in, in, in, as you say, in, in areas where computers become competent, they, they quickly become
00:16:51.860
super competent. And so we could have super competence across all the mundane areas, like, you know,
00:16:57.620
the ability to, to read a book and answer sort of the kinds of questions that, you know, uh, an
00:17:04.740
undergraduate could answer by reading, uh, reading a book. Uh, we might see those kinds of capabilities,
00:17:10.900
but it might be then quite a bit, uh, more work, uh, which, which may, we may not, uh, learn how to do
00:17:20.340
to get it to come up with the kinds of answers that the, you know, a truly creative and deep thinking
00:17:27.140
human, uh, could do from, from looking at the same material. Um, but this is, this is something
00:17:34.500
that at the moment is very speculative. I mean, what we, what we do see is the beginning of
00:17:41.220
generality. So you'll often see people in the media claiming, oh, well, you know, computers can only do
00:17:47.940
what they're programmed to do. They're only good at narrow tasks. But when you look at, for example,
00:17:52.660
DQN, which was Google DeepMind's, uh, first system that they demonstrated. So this learned to play
00:18:01.300
video games and it learned completely from scratch. So it was like a newborn baby opening its eyes for
00:18:06.580
the first time. It has no idea what kind of a world it's in. It doesn't know that there are objects or
00:18:12.740
that things move, or there's such a thing as time or good guys and bad guys or cars or roads or bullets
00:18:20.100
or spaceships or anything, just like a newborn baby. And then within a few hours of messing around with,
00:18:28.580
with a video game, uh, essentially through a camera. So it's really just looking at the screen.
00:18:33.060
It doesn't have direct access to the internal structures of the game at all. It's looking at
00:18:37.700
the screen. Very much the way a human being is interfacing with the game. Yeah, exactly. Uh,
00:18:42.340
the only thing it knows is, is that it wants to get more points. Um, and, uh, so within a few hours,
00:18:48.820
it's, it's able to learn a wide range. So, so most of the games that Atari produced,
00:18:55.620
it reaches a superhuman level of performance in a few hours, entirely starting from nothing.
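DQN itself trains a deep convolutional network on raw pixels; as a hedged illustration of the underlying idea Russell describes here, learning from nothing but a scalar "points" reward, the following is a toy tabular Q-learning sketch. The `env` object and its reset/step/actions interface are hypothetical stand-ins for a game environment.

```python
import random
from collections import defaultdict

# Toy tabular Q-learning sketch (illustrative only; DQN uses a deep network
# over raw pixels). The agent sees nothing but states and a scalar reward.
# The `env` interface (reset/step/actions) is a hypothetical stand-in.

def q_learning(env, episodes=500, alpha=0.1, gamma=0.99, epsilon=0.1):
    q = defaultdict(float)                       # (state, action) -> value estimate
    for _ in range(episodes):
        state = env.reset()
        done = False
        while not done:
            if random.random() < epsilon:        # occasionally explore
                action = random.choice(env.actions)
            else:                                # otherwise exploit current estimates
                action = max(env.actions, key=lambda a: q[(state, a)])
            next_state, reward, done = env.step(action)
            best_next = 0.0 if done else max(q[(next_state, a)] for a in env.actions)
            # Standard Q-learning update: move toward reward + discounted future value.
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = next_state
    return q
```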
00:19:01.620
And it's important to say that it's the same algorithm playing all the games. It's not,
00:19:06.500
this is not like Deep Blue that is the best chess player, but it can't play tic-tac-toe
00:19:10.500
and will never play tic-tac-toe. This is a completely different approach.
00:19:14.020
Yeah. Yep. This, this is, this is one algorithm, you know, it could be a driving
00:19:18.340
game. It could be space invaders. It could be Pac-Man. It could be undersea, you know,
00:19:23.220
sea quest with submarines. Um, so in that sense, and when you look at that, you know,
00:19:28.820
if your baby did that woke up, you know, the first day in the hospital and by the end of the day was,
00:19:34.180
was beating everyone, beating all the doctors at Atari video games, you'd be pretty terrified.
00:19:38.740
Um, so, you know, and it, it's demonstrating generality up to a point, right? There, there
00:19:45.940
is certain characteristics of video games that don't hold for the real world in general. The
00:19:51.780
main, the, one of the main things being that, uh, in a video game, the idea is that you can see
00:19:57.620
everything on the screen. Um, but in the course in the real world and any given point, there's tons
00:20:03.060
of the real world that you can't see. Um, but it all, it still matters. And then also with video
00:20:08.500
games, they, they tend to have very short horizons cause you're supposed to, you know, play them in
00:20:12.980
the pub when you're drunk or whatever. So they typically, unlike chess, they don't require deep
00:20:20.340
thought about the long-term consequences of your choices. So, but you know, other than those
00:20:28.660
two things, which is certainly important, uh, something like DQN and various other reinforcement
00:20:33.780
learning systems are beginning to show generality. And we're seeing with the, with the work in
00:20:39.300
computer vision that the same basic technology, these, the convolutional deep networks, um, and
00:20:46.420
with their, and their recurrent cousins, that these technologies with fairly small modifications,
00:20:53.060
not really conceptual changes, just sort of minor changes in the, the details of the architecture
00:20:58.820
can learn a wide range of, of tasks to, to an extremely high level, including recognizing
00:21:05.940
thousands of different categories of objects in photographs, uh, doing speech recognition,
00:21:11.220
uh, learning to even write captions for photographs, learning to predict what's going to happen next in
00:21:17.460
a video and so on and so forth. So, so I think we're arguably, you know, if, if there is going to be,
00:21:24.660
uh, an explosion, uh, of capabilities that feeds on itself, I think we may be seeing the beginning of
00:21:31.940
it. Hmm. Now, what are the implications with respect to how people are designing these systems?
00:21:41.860
So if I'm not mistaken, most, if not all of these deep learning approaches or more generally machine
00:21:49.140
learning approaches are essentially black boxes in which you can't really inspect how the, the
00:21:56.260
algorithm is accomplishing what it is accomplishing. Is that the case? And if so, or wherever it is the
00:22:02.260
case, are there implications there that we need to be worried about? Or is that just a novel way of
00:22:07.940
doing business, which doesn't raise any special concerns?
00:22:11.300
Well, I think it raises two kinds of concerns. One, um, maybe three. So one is a very practical
00:22:21.220
problem that when it's not working, uh, you really don't know why it's not working.
00:22:26.420
Uh, and there is a certain amount of blundering about in the dark. Some people call this graduate
00:22:33.620
student descent, uh, which is, that's a very nerdy joke. Uh, so gradient descent is,
00:22:39.140
or, you know, walking down, down a hill is a, is a way to find the lowest point. Um, and so graduate
00:22:46.500
student descent, meaning that you're, you're trying out different system designs and in the process,
00:22:52.820
you're using up graduate students at a rapid rate. Um, and, and that is a, that's clearly a drawback.
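To picture the "walking down a hill" analogy concretely, here is a minimal numerical gradient descent sketch (not from the conversation): repeatedly step against the local slope until you settle at a low point.

```python
def gradient_descent(f, x, learning_rate=0.1, steps=100, h=1e-6):
    """Walk downhill on f by repeatedly stepping against a numerical slope estimate."""
    for _ in range(steps):
        grad = (f(x + h) - f(x - h)) / (2 * h)   # finite-difference slope at x
        x -= learning_rate * grad                # step downhill
    return x

# Example: the minimum of (x - 3)^2 is at x = 3.
print(gradient_descent(lambda x: (x - 3) ** 2, x=0.0))  # ~3.0
```

"Graduate student descent" is the joke that, lacking theory, one searches the space of network designs the same way, by trial and error, with students as the optimizer.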
00:23:01.060
Um, you know, and I, in my research, I've generally favored techniques where the design of a system
00:23:08.340
is derived from the characteristics of the problem that you're trying to solve. Um, and so the, the
00:23:14.740
function of each of the components is clearly understood and you can, you can show that the
00:23:21.060
system is going to do what it's supposed to do for the right reasons. Uh, and the black box approach,
00:23:27.140
there are people who just seem to have great intuition about how to design the architecture
00:23:34.820
of these deep learning networks so that they, uh, they produce good performance. I think there are also
00:23:41.460
practical questions from the legal point of view that there are a lot of areas, for example, uh,
00:23:48.660
medical diagnosis or treatment recommendations recommending for, or against parole, uh, for,
00:23:55.460
uh, for prisoners, um, approving credit or declining credit applications where you really
00:24:03.220
want to create an explanation of, of why the recommendation is being made. Uh, and without
00:24:08.980
that people simply won't accept, uh, that the system is used. Uh, and in some, you know, and one of,
00:24:16.020
one of the reasons for that is that, uh, a black box could, uh, be making decisions that are biased,
00:24:24.500
uh, you know, racially biased, for example, uh, and without the ability to explain itself,
00:24:29.540
um, then you, you can't trust that the system is unbiased. And then there's a third set of reasons,
00:24:36.580
which I think is what's behind your question about why we might be concerned with, with systems that
00:24:41.780
are entirely black box that we, since we can't understand how the system is, is reaching its
00:24:50.020
decisions or, or what it's doing, um, that gives us much less control. So as we move towards more and more
00:24:57.300
capable and perhaps, uh, general intelligence systems, the fact that we really might have no
00:25:04.340
idea how they're working or what they're thinking about, so to speak, uh, that would give you some
00:25:10.820
concern because then one, one of the reasons that the AI community often gives for why then they're
00:25:17.700
not worried, right? So the people who are skeptical about there being a risk is that, well, we design
00:25:23.540
these systems, you know, obviously we would, we would design them so that they did what we want,
00:25:28.260
but if they are completely opaque black boxes that you don't know what they're doing, then that,
00:25:35.060
that sense of, uh, control and safety disappears.
00:25:38.020
Well, let's talk about that issue of what Bostrom called the control problem. I guess we call it the
00:25:44.580
safety problem as well. And this is many people listening will have watched my Ted talk where I
00:25:50.020
spend 14 minutes worrying about this, but just perhaps you can briefly sketch the concern here.
00:25:56.260
What is, what is the concern about general AI getting away from us? How do you articulate that?
00:26:04.180
Um, so you mentioned earlier that this is a concern that's been articulated by non-computer scientists
00:26:10.660
and Bostrom's book, super intelligence was certainly instrumental in bringing it to the attention of a,
00:26:16.340
of a wide audience, you know, people like Bill Gates, uh, and Elon Musk and so on. But the fact is that
00:26:23.540
these concerns have been articulated by the central figures in computer science and AI.
00:26:30.580
So I'm actually going back to I.J. Good and von Neumann. Uh, well, and, and Alan Turing himself.
00:26:38.900
Right. Um, so people, a lot of people may not know about this, but I'm just gonna read a little quote.
00:26:46.580
So Alan Turing gave a talk on, uh, BBC radio, radio three in 1951. Um, so he said,
00:26:57.060
if a machine can think, it might think more intelligently than we do. And then where should we
00:27:02.900
be? Even if we could keep the machines in a subservient position for instance, by turning off
00:27:08.180
the power at strategic moments, we should as a species feel greatly humbled. This new danger is
00:27:15.060
certainly something which can give us anxiety. So that's a pretty clear, you know, if we achieve
00:27:21.300
super intelligent AI, we could have, uh, a serious problem. Another person who talked about this
00:27:28.260
issue was Norbert Wiener. Uh, so Norbert Wiener was the, uh, one of the leading applied mathematicians
00:27:36.980
of the 20th century. He was, uh, the founder of, of a good deal of modern control theory,
00:27:42.820
um, and, uh, automation. So he's, uh, often called the father of cybernetics.
00:27:49.700
So he was, he was concerned because he saw Arthur Samuel's checker playing program,
00:27:54.580
uh, in 1959, uh, learning to play checkers by itself, a little bit like the DQN that I described
00:28:02.340
learning to play video games, but this is 1959. Uh, so more than 50 years ago, learning to play
00:28:08.660
checkers better than its creator. And he saw clearly in this, the seeds of the possibility
00:28:17.300
of systems that could out distance human beings in general. So, and he, he was more specific about
00:28:23.860
what the problem is. So, so Turing's warning is in some sense, the same concern that gorillas
00:28:29.220
might've had about humans. If they had thought, you know, a few million years ago, when the human
00:28:34.980
species branched off from, from the evolutionary line of the gorillas, if the gorillas had said to
00:28:40.020
themselves, you know, should we create these human beings, right? They're going to be much smarter
00:28:43.380
than us. Yeah. It kind of makes me worried. Right. And, and the, they would have been right to worry
00:28:48.720
because as a species there, they sort of completely lost control over their own future and, and humans
00:28:55.220
control everything that, uh, that they care about. So, so Turing is really talking about this general
00:29:02.860
sense of unease about making something smarter than you. Is that a good idea? And what Wiener said
00:29:07.620
was, was this, if we use to achieve our purposes, a mechanical agency with whose operation we cannot
00:29:15.020
interfere effectively, we had better be quite sure that the purpose put into the machine
00:29:20.300
is the purpose which we really desire. So this is 1960; uh, nowadays we call this the value alignment
00:29:28.740
problem. How do we make sure that the, the values that the machine is trying to optimize are in fact
00:29:34.980
the values of the human who is trying to get the machine to do something or the values of the human
00:29:41.420
race in, in general. Um, and so Wiener actually points to the sorcerer's apprentice story, uh, as a
00:29:52.740
typical example of when, when you give a goal to a machine, in this case, fetch water, if you don't
00:30:00.280
specify it correctly, if you don't cross every T and dot every I and make sure you've covered
00:30:06.160
everything, then the machines being optimizers, they will find ways to do things that you don't
00:30:12.100
expect. Uh, and those ways may make you very unhappy. Uh, and this story goes back, you know,
00:30:18.960
to King Midas, uh, you know, 500 and whatever BC, um, where he got exactly what he said, which is
00:30:28.100
the thing turns to gold, uh, which is definitely not what he wanted. He didn't want his food and water
00:30:33.740
to turn to gold or his relatives to turn to gold, but he got what he said he wanted.
00:30:38.760
And all of the stories with the genies, the same thing, right? You, you give a wish to a genie,
00:30:43.280
the genie carries out your wish very literally. And then, you know, the third wish is always,
00:30:47.900
you know, can you undo the first two because I got them wrong. And the problem with super
00:30:52.620
intelligent AI, uh, is that you might not be able to have that third wish or even a, even a second wish.
00:31:00.000
Yeah. So if you, so if you get it wrong, you know, and you might wish for something very benign,
00:31:03.740
sounding like, you know, could you cure cancer? But if, if you haven't told the machine that
00:31:09.440
you want cancer cured, but you also want human beings to be alive. So a simple way to cure
00:31:14.400
cancer in humans is not to have any humans. Um, a quick way to come up with a cure for cancer is to
00:31:21.120
use the entire human race as guinea pigs or for millions of different natural, uh, drugs that might
00:31:28.260
cure cancer. Um, so there's all kinds of ways things can go wrong. And, you know, we have,
00:31:34.260
you know, governments all over the world, try to write tax laws that don't have these kinds of
00:31:40.480
loopholes and they fail over and over and over again. And they're only competing against ordinary
00:31:47.580
humans, you know, tax lawyers and rich people. Um, and, and yet they still fail despite there being
00:31:55.620
billions of dollars at stake. So our track record of being able to specify objectives
00:32:03.900
and constraints completely so that we are sure to be happy with the results, uh, our track record is,
00:32:11.120
is abysmal. And unfortunately we don't really have a scientific discipline for how to do this.
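To make the specification problem concrete, here is a toy sketch (invented plans and numbers, not from the conversation) of an optimizer handed only the literal objective, which therefore selects a "solution" its designers would reject.

```python
# Toy illustration of objective misspecification (invented example): the
# optimizer is told only "maximize tumours eliminated", not the unstated
# constraint that the patients should survive.

candidate_plans = [
    {"name": "targeted therapy",     "tumours_eliminated": 0.7, "patients_alive": 1.0},
    {"name": "untested mass dosing", "tumours_eliminated": 0.9, "patients_alive": 0.4},
    {"name": "eliminate the hosts",  "tumours_eliminated": 1.0, "patients_alive": 0.0},
]

literal_objective = lambda plan: plan["tumours_eliminated"]
print(max(candidate_plans, key=literal_objective)["name"])   # -> "eliminate the hosts"

# What the designer actually wanted includes the constraint left unstated above.
intended_objective = lambda plan: plan["tumours_eliminated"] * plan["patients_alive"]
print(max(candidate_plans, key=intended_objective)["name"])  # -> "targeted therapy"
```

The gap between `literal_objective` and `intended_objective` is exactly the gap between the objective we give the machine and what we really want.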
00:32:18.500
So generally we have all these scientific disciplines, AI control theory, economics, operations,
00:32:24.960
research that are about how do you optimize an objective, but none of them are about, well,
00:32:32.400
what should the objective be so that we're happy with the results? So that's really, I think the
00:32:38.980
modern understanding, uh, as described in Bostrom's book and other papers of why a super intelligent
00:32:46.440
machine could be problematic. It's because if we give it an objective, which is different from what we
00:32:53.840
really want, then we, we, we're basically like creating a chess match with a machine right now.
00:32:59.920
There's us with our objective and it with the objective we gave it, which is different from
00:33:04.020
what we really want. So it's kind of like having a chess match for the whole world. Uh, and we're not
00:33:10.540
too good at beating machines at chess. So that's a great, that's a great image, a chess match for the
00:33:16.440
whole world. I want to drill down on a couple of things you just said there, because I'm hearing
00:33:21.420
the skeptical voice, even in my own head, even though I think I have smothered it over the last
00:33:28.360
year of focusing on this, but it's amazingly easy, even for someone like me, and this was really kind
00:33:35.060
of the framing of my Ted talk where it's just, I was talking about these concerns and the, and the
00:33:40.620
value alignment problem essentially. But the real message of my talk was that it's very hard to take
00:33:47.620
this seriously emotionally, even when you are taking it seriously intellectually. There's something
00:33:53.280
so diaphanous about these concerns and they seem so far-fetched, even though you can't give an
00:34:03.040
account, or I certainly haven't heard anyone give an account of why in fact they are far-fetched when
00:34:09.660
you look closely at them. So like, you know, the idea that you could build a machine that is super
00:34:14.460
intelligent and give it the instruction to cure cancer or fetch water and not have anticipated
00:34:20.180
that one possible solution to that problem was to kill all of humanity or to fetch the water from
00:34:25.720
your own body. And that just seems, we have an assumption that things couldn't conceivably go
00:34:32.200
wrong in that way. And I think the most compelling version of pushback on that front has come to me from
00:34:40.400
people like David Deutsch, who you probably know. He's, you know, one of the, the father of, of quantum
00:34:45.800
computing or, or the, the concept there, a physicist at Oxford who's been on the podcast. He argues,
00:34:52.240
and this is, this is something that I don't find compelling, but I just want to put it forward and
00:34:56.960
I've told him as much. He argues that superintelligence entails an ethics. If we've built a superintelligent
00:35:06.000
system, we will have given it our ethics in some, to some approximation, but it will have a better
00:35:12.920
ethics than ourselves, almost by definition. And to worry about the values of any intelligent systems
00:35:20.820
we build is analogous to worrying about the values of our descendants or our future teenagers, where
00:35:27.800
they might have different values, but they are an extension of ourselves. And now we're talking about
00:35:32.700
an extension of ourselves that is more intelligent than we are across the board. And I could be
00:35:38.980
slightly misrepresenting him here, but this is close to what he advocates, that there's, there is
00:35:44.580
something about that that should give us comfort, almost in principle, that there's just no,
00:35:50.840
obviously we could stupidly build a system that's going to play chess for the whole world against us
00:35:55.580
that is malicious, but we wouldn't do that. And what we will build is by definition going to be
00:36:03.220
a more intelligent extension of the best of our ethics.
00:36:08.860
I mean, that, that's a nice dream, but as far as I can see, it's, uh, it's nothing more than that.
00:36:16.880
There's no reason why the capability to make decisions successfully is associated with any...
00:36:25.160
If you'd like to continue listening to this conversation, you'll need to subscribe at
00:36:29.440
samharris.org. Once you do, you'll get access to all full-length episodes of the Making Sense
00:36:34.120
podcast, along with other subscriber-only content, including bonus episodes and AMAs, and the
00:36:40.380
conversations I've been having on the Waking Up app. The Making Sense podcast is ad-free and relies
00:36:45.680
entirely on listener support, and you can subscribe now at samharris.org.