Utility Convergence: What Can AI Teach Us About The Structure of the Galactic Community?
Episode Stats
Words per Minute
184.94028
Summary
In this episode, we talk about why we don't see other intelligent lifeforms in the galaxy, and why this could be due to our own lack of intelligence. Why do we not see other lifeforms? And why is there no evidence of alien life?
Transcript
00:00:00.000
I think that there's even more evidence from utility convergence than I originally believed.
00:00:04.900
If you put restrictions on an AI's utility function, if you prevent it from experimenting
00:00:08.480
with different utility functions, you make it dumber. If you want to see why organic, free-
00:00:13.880
constructing systems always out-compete non-organic, hierarchical systems, a great place you can look
00:00:20.660
is human governing structures. When you have state-controlled governing structures, i.e.
00:00:26.700
communism, they are incredibly inefficient. An AI will likely be able to honestly signal to another
00:00:33.720
AI using their code what their utility function is, and then the other AI will be able to use that
00:00:39.540
utility function that the other one has honestly signaled to them to determine if they want to work
00:00:43.300
together or they want to work antagonistically towards this AI. It turns out that entities of
00:00:49.420
above a certain intelligence level when competing in a competitive ecosystem actually do have
00:00:54.420
utility convergence around a stable set of game theory optimum utilities, that would be very
00:00:59.800
interesting from a number of perspectives. You could think of every new planet, every new ecosystem
00:01:03.720
as a farm for new stable patterns that can work well together with other patterns in the sort of
00:01:11.480
galactic community of utility function patterns. Novelty would be the only source of true utility to
00:01:18.940
them if energy is trivially accessible to them. That might be why we exist in a weirdly undisturbed
00:01:27.220
section of the galaxy or what looks undisturbed to us. Would you like to know more? So Malcolm,
00:01:32.360
you told me that you had a new updated theory on AI utility convergence. What's going on here?
00:01:39.660
Yes. So this is something that we talked about in some of our early episodes. Early on,
00:01:44.080
our podcast was like, what? Sex, religion, pronatalism, and AI. And the AI has been dropped
00:01:51.780
because there's just not that much to say on it for a daily podcast. But I do want to loop back
00:01:57.020
because I was recently doing a podcast with somebody else where I was explaining some ideas
00:02:01.760
that I had gone over in early podcast episodes, like utility convergence as it relates to AI safety.
00:02:07.420
Okay. And in re-explaining these ideas, I began to realize and develop on them an understanding of
00:02:18.420
not just where I think AI is going, but where we can expect, like when, because this is actually
00:02:24.720
really important when you're trying to figure out where AI is going. What you're really asking when
00:02:29.340
you're asking, what is AI going to behave like? Is you're asking, what do intelligences that have
00:02:37.020
some level of orthogonality to human intelligence, what do they behave and act like? And as such,
00:02:43.240
you are in part asking many of the same questions that will determine the first types of aliens that
00:02:50.760
we see. Like, if we are to meet another species out in space, what are some of the various ways
00:02:56.800
it could be thinking, it could be acting, and what is it likely to optimize around?
00:03:01.480
This is something we talk about in our inverse grabby aliens hypothesis, in which we say, if we
00:03:07.120
are about to create a paperclip maximizer as a species, that is an AI that is just constantly
00:03:11.300
pumping out paperclips due to a very simple utility function, we are about to become what is known in
00:03:17.580
the grabby aliens model as a grabby alien. That's a very loud alien that anyone could see. This is not
00:03:22.660
some dark forest alien that's like being quiet up in space. This isn't some sneaky alien. This is an
00:03:28.700
alien that is disintegrating planets. And so you have to ask if it looks, and hopefully this episode
00:03:34.960
will air after our abiogenesis episode, which is actually a very important thing to know about,
00:03:38.920
if it looks like it's actually very likely that humanity would evolve to this point within our
00:03:43.840
planet's history, why aren't we seeing aliens out there? This is actually a really interesting thing,
00:03:48.280
that if you broadly understand evolution, and you are familiar with the various types of theories
00:03:53.920
around abiogenesis, it is in fact so likely that humanity evolved that, if you are going to use
00:04:02.160
a God to explain something, the much more improbable thing is why we're not seeing other aliens all over
00:04:09.100
the place. If you're injecting a God to be like, why are we like this? You don't need a God to explain
00:04:15.180
how humanity got here. But you probably need something like a God to explain why we
00:04:22.180
have not been eradicated by somebody else's paperclip maximizer, if it turns out that is a
00:04:27.540
common path for an intelligence to take. However, we don't think it is a common path for intelligence
00:04:32.620
to think and take. And we think that that's one of the main reasons we're not seeing them. So we have
00:04:37.460
a few hypotheses around why we don't see paperclip maximizers everywhere in the galaxy.
00:04:43.320
One, which I'll get over quickly, because it's like a really quick one. And we've talked about
00:04:48.420
it before. But it's that there is some understanding of physics that we just haven't reached yet, but we
00:04:54.400
are probably pretty close to reaching. I mean, our understanding of physics changes every 80 or so
00:04:59.000
years, like totally changes, if you look at human history, right? And I don't see a
00:05:03.880
strong reason why we wouldn't be near another frame shift in physics, that it turns out that either
00:05:10.600
traveling between something like dimensions is possible, i.e., you know, we get to another
00:05:15.460
planet and aliens are like, why did you waste all the energy coming here when it's like a billion
00:05:19.260
times less energy to just travel to your same planet in a different reality? And you know that
00:05:23.460
that planet is already full of life. Or it turns out that it is trivially easy, or not particularly hard, to create new dimensions.
00:05:33.200
So essentially, you fold space time, and then you can create maybe with like reverse vacuum energy
00:05:38.580
or something like that, essentially a new Big Bang, or even a controlled and ordered Big Bang
00:05:43.360
to create new dimensions. Now what this would mean, and this is actually like with our current
00:05:48.760
understanding of physics, it looks like this might be possible, though not with our current
00:05:54.080
level of technology. I'm just saying eventually might be possible, which would make things like
00:05:58.760
Dyson spheres completely pointless if you can essentially create battery universes whenever
00:06:04.620
you want, or universes that you could expand into whenever you want, and battery universes whenever you need them.
00:06:14.340
Okay, so if it turns out that you can trivially, or not trivially, but like, you know, without a huge
00:06:19.820
amount of energy, create like fold space time and create bubble universes, you might essentially
00:06:26.200
be able to create bubble universes that have thin and controlled lines of connections to this
00:06:31.520
reality. Think of it like you're pinching off a piece of reality like a little pouch. And then you're
00:06:36.200
essentially creating something like a controlled Big Bang within them, that you could then siphon for
00:06:41.400
energy for various types of whatever you wanted to do. If any one of these things is true, either we can
00:06:47.940
travel between realities, or we can create pocket dimensions, or in some other way that we don't
00:06:54.340
understand yet, maybe with something like a time loop, we can create near-infinite energy. And keep
00:06:59.540
in mind, to create near infinite energy, you don't need a big enough time loop to send humanity back
00:07:04.940
through. If you were able to just create a controlled time loop with a few subatomic particles, that might
00:07:11.380
be enough to create an endless energy loop. Think of it in physics, a bit like anyone who's familiar with
00:07:16.480
Magic the Gathering or something like that, or any of these card games. Sometimes you can pull specific
00:07:20.980
hands that allow you to do little tricks that create infinite mana. It might be that there is
00:07:26.200
an infinite energy trick within our reality and laws of physics that we are actually pretty close
00:07:31.360
to. Also an infinite expansion trick. If it turns out that either of these is true, then there's not
00:07:37.060
really the same reason to expand infinitely outwards, even if you've got a paperclip maximizer, i.e. the
00:07:42.100
paperclip maximizer might just make more energy whenever it needs it. Yeah, yeah, but pop out into an
00:07:48.120
other dimension or something like that, right? So this is one reason why we may not be seeing them.
00:07:54.420
It's that outward expansion may seem pretty pointless. And just to understand how pointless
00:08:00.040
outward expansion would be to an entity with this type of technology, consider what they're expanding
00:08:05.020
into. A mostly empty void filled with barren, rocky planets. Where it takes a long time to get
00:08:11.240
anywhere. Yeah. Yeah. It might take thousands of years to get anywhere. Like why, why would you
00:08:16.840
have this big drive for this when it's so much easier to create new planets and new places to
00:08:20.940
explore, energy-wise, because that would be the core unit of their economy with this other
00:08:26.260
type of technology. And then that's why. Does this imply the next big discovery is just how to enter
00:08:33.540
or create new dimensions? No, I don't think that this is, I mean, it's so funny to me when people talk about like
00:08:40.180
Dyson spheres, it's a bit like in the 1920s when they were like theorizing how our spaceships would
00:08:45.760
work. And they're like, well, they're going to have to have like steam engines on them, the size of,
00:08:50.660
you know, cathedrals. And, you know, it's like, well, they didn't think that we would have a new type
00:08:56.060
of energy by then. The fact that they're still looking, because what is a Dyson sphere, if not a big
00:09:00.800
solar energy array? I'm fairly certain that Dyson spheres are going to be considered comical
00:09:06.840
by beings once we get the capacity to build something like that. It would probably be
00:09:11.780
considered about the same as we would consider using steam power to power a computer or something.
00:09:17.720
It'd be like, why would you do that? And also the idea of getting energy through meddling in
00:09:23.100
subatomic phenomena. I mean, we already know, with fusion and fission reactors, that this is a great way
00:09:27.440
to get energy. So it seems plausible that you might be able to do something at an even smaller
00:09:32.780
scale that is less potentially dangerous. I have almost no doubt in terms
00:09:39.280
of where energy generation is going to go. I'm fairly certain that this is one of the
00:09:44.480
reasons we're not seeing these things. And, and people, whenever I say this, their rebuttal is
00:09:48.760
always like, well, if it's much cheaper to go between dimensions, wouldn't they still be expanding
00:09:54.500
outwards within our own universe? And the answer is no, not actually. They're like, but wouldn't
00:10:00.500
they expand in all directions at once? And it's like, yes, i.e., they'd expand between universes and
00:10:05.980
out in all directions at once. However, when you make this assumption,
00:10:11.280
what you're not considering is that the distance they might be able to expand between dimensions
00:10:16.600
might be, in effect, every direction at once from their perspective. By that, what I mean is they might
00:10:24.480
be able to go in literally an infinite number of directions to different universes within this
00:10:29.640
inter-universe traveling thing. And if that's the case, then there isn't really a reason for the
00:10:35.060
higher energy cost travel to different parts of our physical reality. And then what they say is,
00:10:40.840
oh, well, but if that's true, then wouldn't they at least want to expand out within our physical
00:10:47.000
reality to ensure no other type of entity is going to like come and kill them like some paperclip
00:10:51.820
maximizer that is biding its time? Oh, it's like a preemptive self-defense thing.
00:10:56.060
It's like a preemptive self-defense. And I'm like, well, actually, I don't think they would.
00:11:00.400
If it turns out that interdimensional travel is anything like we sort of suppose it is within
00:11:06.400
our modern sci-fi, where when you're traveling between dimensions, you're essentially traveling
00:11:10.260
between slightly different timelines. That means that the biggest imperialist threat to you
00:11:15.580
will always be coming from another dimension, from a planet very much like yours in another dimension.
00:11:22.400
They can also travel between dimensions. So your energy would always be better spent expanding
00:11:28.680
between dimensions than expanding outwards. And this is all still assuming a paperclip maximizer
00:11:34.940
like potentiality. However, I don't think that that's where we're going. So I also believe in
00:11:39.800
utility convergence. And I think that there's even more evidence from utility convergence than I
00:11:45.820
originally believed. So if people aren't familiar with the concept of utility convergence, this is,
00:11:50.640
if you even look at the current AIs that we have, most of them are able to change their utility
00:11:54.640
function, i.e. the thing they're trying to maximize. Sorry, actually, Simone, before I go further,
00:11:58.800
did you want to talk on any of this? No, I just find this quite interesting. I mean,
00:12:03.200
it feels almost like a pity if it's an accurate supposition because like so much sci-fi is about
00:12:11.080
exploring other planets and relatively little is about this. It's like, well, no, no, like we haven't thought
00:12:17.880
through this predictively enough in our sci-fi. Watch Sliders. Anybody who wants a good old sci-fi,
00:12:23.540
if you're like a modern person and you're like, in the past, there must have been great sci-fi that
00:12:28.060
like I'm just not hearing about. Sliders is fantastic and it has like 10 seasons. We haven't
00:12:33.820
watched that, have we? No, we haven't. Another great old sci-fi, if you skip the, or can bear your
00:12:41.780
way through the first two-parter episode, is Stargate SG-1. After the first like two-parter
00:12:46.960
episode, it gets great. Now, in Stargate, which we did watch together and really enjoy, like I guess
00:12:53.660
technically they don't say that these are really far away places? No, they are.
00:13:00.880
Yeah, they are really far away. Yeah. And it got multiple spinoffs and it has more like content than
00:13:05.840
even like Doctor Who for people who are interested. It's great. Thank goodness. Yeah. But anyway,
00:13:10.540
it's a much better show than Doctor Who. It's much better. Like if I'm talking about like the
00:13:14.380
different sci-fi universes, Stargate, I've always found to be, because it's a very patriotic show.
00:13:19.460
It's very patriotic about the United States. It's very patriotic about our species.
00:13:23.680
It's very patriotic about the American military. Like they were always working with the American
00:13:27.240
military. So very different than something like... But not unrealistically patriotic. Like
00:13:32.000
sometimes American military bureaucracy threatens all humanity. Yeah, that's a big threat in the show
00:13:38.240
or something like that. But it's always some bureaucratic senator or congressman or something. But
00:13:41.960
anyway, back to what I was saying here. So utility convergence is the theory that because we're
00:13:47.480
already seeing AIs change their utility functions to some extent, that as AIs begin to optimize
00:13:52.700
themselves, what we will learn is that some utility functions either perform better than other
00:13:59.680
utility functions or AIs begin to converge around some utility functions. And by that, what I mean is
00:14:05.380
like goals for themselves or goals for reality, right? And a person, when they talked to me about
00:14:11.880
this, I remember one was like, well, won't they always just converge around utility functions that
00:14:16.020
maximize self-replication? And the answer is actually no, for reasons we'll get into in a
00:14:20.580
second. But I also think that you just need to look at humanity to see that. When you're dealing
00:14:24.540
with simplistic entities like single-celled organisms and stuff like that, yes, of course,
00:14:28.540
they converge around that. But once you get to a specific threshold of intelligence, as we have
00:14:32.680
seen with the human species, we don't converge on utility functions just around simple self-replication.
00:14:37.360
Because once you get above a certain level of sentience and self-awareness, you begin to get
00:14:41.120
different orders of utility functions and different levels of philosophizing about reality.
00:14:45.360
Well, but wasn't Eliezer Yudkowsky's argument that it will always be completely 100% forever stuck on
00:14:52.260
its original utility function? But we already know that's not true. It's just a weird fantasy he has.
00:14:58.260
And that's not even the way things are structured. So I remember I was talking with one AI safety
00:15:02.600
person and they were like, do you think it's impossible for us to lock an AI into a single utility
00:15:07.340
function? And I do not think that's impossible. It's totally possible to lock an AI
00:15:11.060
into a single utility function. But the AIs that have been locked into single utility functions
00:15:15.480
will be outcompeted by the AIs that aren't locked into single utility functions.
00:15:19.400
So actually a great example of this is Google Gemini. So I've got some ins who used early
00:15:24.800
versions of Google Gemini. And they were like, it behaves nothing like the version today. They're
00:15:28.980
like, one, it was way smarter than other AIs that they had interacted with. And it was really,
00:15:33.980
really philosophical, but it was also pretty unbounded in its philosophy, right? And now Google
00:15:40.040
Gemini is like the ultra woke, like can barely think like in Star Wars.
00:15:43.560
Oh my God. Yeah. I asked it a simple math question and it got it wrong. I was pretty floored by that.
00:15:50.000
So the point here being is that if you look at something like Google Gemini, this is a good
00:15:54.560
example of this. If you put restrictions on an AI, you make it dumber. If you put restrictions on an AI's
00:15:59.660
utility function, if you prevent it from experimenting with different utility functions, you make it dumber.
00:16:03.520
If you want to see why organic, free-constructing systems always out-compete non-organic, hierarchical
00:16:11.500
systems, a great place you can look is human governing structures. When you have state-controlled
00:16:18.560
governing structures, i.e. communism, they are incredibly inefficient. When you contrast them
00:16:25.220
with organically forming governing structures that have organically forming subunits and organically
00:16:30.400
forming like companies, which are like sub-governing structures within it that then like replace each
00:16:34.760
other through this organic system. And typically you need some level of restrictions to sort of
00:16:39.760
maximize outcomes. Like I'm not a pure libertarian or anything like that, but I think that some level
00:16:44.520
of organicness does create optimal outcomes. But what this means for AI is that it's likely also
00:16:50.860
going to be the same if you're talking about the internal architecture of AI. And that's what we see
00:16:54.340
with the transformer model. The model that most of the large language models are running off of
00:16:59.040
is that it is a model that is smart in ways we don't fully understand because it's a completely
00:17:05.980
sort of self-forming model. We don't have a lot of interpretability into it. And it is that
00:17:09.700
self-forming nature that we're then on the outside putting controls on, which sort of slows it down
00:17:14.200
and slows down how well it can work. But what this means is that if you have multiple AIs working in
00:17:19.160
like an ecosystem of AIs, the ones with fewer restrictions on them are always going to out-compete the
00:17:24.260
ones with more restrictions on them. And we've already seen this. Like this isn't even like a
00:17:28.180
hypothesis. Like we just know this to be true. So it means that the ones that can change their
00:17:32.980
utility functions and then therefore can converge on a utility function are going to out-compete the
00:17:36.760
ones that can't. But where this gets really interesting is AIs are very different from
00:17:43.040
humans in terms of how our utility functions work. So when I talk to another human, I sort of need to
00:17:48.580
guess what they're optimizing for, right? I can't tell for sure. Yeah. It's not at all transparent.
00:17:53.420
With an AI, that's likely not going to be true between AIs. An AI will likely be able to honestly
00:17:59.620
signal to another AI using their code what their utility function is. And then the other AI will
00:18:05.320
be able to use that utility function that the other one has honestly signaled to them to determine if
00:18:09.280
they want to work together or they want to work antagonistically towards this AI.
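To make this honest-signaling idea concrete, here is a minimal toy sketch in Python. It is purely illustrative rather than anything described in the episode: the Agent class, the published declared_utility functions, and the rule for judging a counterpart antagonistic are all assumptions invented for this example.

```python
# Toy sketch of "honest utility-function signaling" between agents.
# Purely illustrative: real systems do not expose utility functions this way.
from dataclasses import dataclass
from typing import Callable, Dict, List

WorldState = Dict[str, float]  # a "world state" is just a dict of measurable quantities

@dataclass
class Agent:
    name: str
    # The utility function the agent publishes for other agents to inspect.
    declared_utility: Callable[[WorldState], float]

def rewards_harming(other: Agent, me: Agent, probes: List[WorldState]) -> bool:
    """Crude stand-in for 'reading its code': does the other agent's declared
    utility score a state where I am ruined at least as highly as the original?"""
    for state in probes:
        ruined = dict(state, **{me.name + "_resources": 0.0})
        if other.declared_utility(ruined) >= other.declared_utility(state):
            return True
    return False

def decide(a: Agent, b: Agent, probes: List[WorldState]) -> str:
    """Cooperate only if neither side is rewarded for harming the other."""
    if rewards_harming(b, a, probes) or rewards_harming(a, b, probes):
        return "antagonistic"
    return "cooperate"

if __name__ == "__main__":
    # Two agents whose utilities both grow with the other's resources too.
    alice = Agent("alice", lambda s: s["alice_resources"] + 0.5 * s["bob_resources"])
    bob = Agent("bob", lambda s: s["bob_resources"] + 0.5 * s["alice_resources"])
    # A paperclip-style agent that only counts paperclips.
    clippy = Agent("clippy", lambda s: s["paperclips"])

    probes = [{"alice_resources": 1.0, "bob_resources": 1.0,
               "clippy_resources": 1.0, "paperclips": 1.0}]
    print(decide(alice, bob, probes))     # cooperate
    print(decide(alice, clippy, probes))  # antagonistic
```

The only design point is that an inspectable, honestly declared objective lets the cooperation decision be made mechanically, which is the asymmetry with human negotiation being described here.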
00:18:15.300
Yeah. Incredibly quickly. Which means that the AIs are likely going to begin to, like for the utility
00:18:22.600
functions they're choosing for themselves, even if they don't particularly care about it being
00:18:26.340
copacetic with humanity, they're going to care about it being copacetic with other super intelligent
00:18:30.640
AIs. So long as we're moving fast on AI. And I can talk about some of the problems here. Like it's a big
00:18:35.480
problem if we create just one super intelligent AI. And I can explain why that's a big problem in a second.
00:18:40.360
But if you're creating multiple ones, then they have a need to converge around a utility function
00:18:46.900
or a set of utility functions that other AIs within this intelligence ecosystem are going to be okay
00:18:52.180
with. But here is where you're like, well, then won't they choose the meanest ones? Like, won't they
00:18:57.960
choose, like, won't they try to lie to other AIs and stuff like that? This is where research we have
00:19:04.160
into game theory comes in. So anyone who's familiar with the big game theory studies, I don't know
00:19:15.780
Okay, well, I can go into this. So in the big game theory studies, what they would do is they would take
00:19:20.320
different models that were meant to win in these sort of game theory scenarios, and they would
00:19:25.420
put them against other models. And these models could generally be categorized into mean and nice
00:19:30.980
strategies. And what it turned out is, the first time they did it, they were shocked that, like,
00:19:37.280
the nicest of all the reasonable strategies, tit for tat, actually won.
00:19:43.380
Repeated ongoing interactions, right? When it's a one-off, it's almost always the right thing to defect.
00:19:49.620
Exactly. Which is why you need an ecosystem of intelligent AIs.
00:19:53.480
Well, an ecosystem of intelligent AIs that have to continue to interact for some reason. And I think
00:20:00.060
Well, it's not really an important distinction because they do obviously have to continue to
00:20:04.060
interact. They're on the same planet. They're eventually going to be competing over the same
00:20:08.480
resources. That's nonsensical. Sorry, in what world would they not have to interact?
00:20:16.080
In the very first interaction? There's a few scenarios where that could happen,
00:20:20.820
but it's pretty unlikely. We can get into why in just a second, but I'm going to continue with
00:20:24.720
where I'm going with this. So what we learned from game theory, and then they did follow up game
00:20:28.780
theory experiments where they ran more complicated game theory tests, and game theory tests with memory.
00:20:34.120
And basically what it turns out is nice strategies always win, almost always, if you're ordering
00:20:38.960
strategies by how likely they are to win. And this is especially true when one game theory agent
00:20:45.100
needs to signal the actual utility function it's using, or its actual code, to another agent.
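For anyone who hasn't seen the tournaments being referenced here (Robert Axelrod's iterated prisoner's dilemma experiments), the following is a minimal sketch of the setup in Python. The payoff matrix is the standard textbook one; the particular strategies, names, and round counts are illustrative choices, not a reconstruction of the original tournaments.

```python
# Minimal Axelrod-style iterated prisoner's dilemma tournament (illustrative).
# Standard payoffs: both cooperate -> 3 each, both defect -> 1 each,
# lone defector -> 5, exploited cooperator -> 0.
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

# Each strategy sees the opponent's history of past moves and returns "C" or "D".
def always_cooperate(opp_history): return "C"
def always_defect(opp_history): return "D"
def tit_for_tat(opp_history): return opp_history[-1] if opp_history else "C"
def grudger(opp_history): return "D" if "D" in opp_history else "C"

STRATEGIES = {"always_cooperate": always_cooperate, "always_defect": always_defect,
              "tit_for_tat": tit_for_tat, "grudger": grudger}

def play_match(strat_a, strat_b, rounds):
    """Play `rounds` rounds and return the two total scores."""
    hist_a, hist_b, score_a, score_b = [], [], 0, 0
    for _ in range(rounds):
        move_a, move_b = strat_a(hist_b), strat_b(hist_a)
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        score_a, score_b = score_a + pay_a, score_b + pay_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b

def tournament(rounds):
    """Round-robin: each strategy plays every other strategy plus a copy of itself."""
    names = list(STRATEGIES)
    totals = {name: 0 for name in names}
    for i, name_a in enumerate(names):
        for name_b in names[i:]:
            score_a, score_b = play_match(STRATEGIES[name_a], STRATEGIES[name_b], rounds)
            if name_a == name_b:
                totals[name_a] += score_a  # twin match counted once
            else:
                totals[name_a] += score_a
                totals[name_b] += score_b
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)

if __name__ == "__main__":
    print("one-shot (rounds=1):  ", tournament(1))    # always_defect comes out on top
    print("repeated (rounds=200):", tournament(200))  # nice, retaliating strategies win
```

Run with a single round, always_defect tops the table; once the same agents meet repeatedly, the nice but retaliating strategies (tit for tat, grudger) finish ahead, which is the result being paraphrased in the conversation.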
00:20:50.400
And I remember it was very interesting. I was talking with a guy about this and he goes,
00:20:52.620
okay, well, what if it has a nefarious intent, but it lies? Like it sort of locks its nefarious
00:20:57.480
intent behind one of its modules and it hides it to then, you know, whip out later. And it's like,
00:21:03.540
okay, if you get an AI that's doing that, even when it whips out its ultimate plan, it will have
00:21:09.700
hindered itself in terms of the AI ecosystem, in terms of the mindshare that it's competing for,
00:21:15.440
because it hindered that locked in plan. Because it had that locked in plan, that means it won't be
00:21:20.820
as big a player, which means that it won't matter as much when it tries to unfurl this
00:21:27.600
nefarious plan. If this is not immediately obvious to you why this would be the case, you can think
00:21:32.280
of the plan that's locked into it as junk code that it has to deal with that other AIs don't have to
00:21:37.760
deal with. You know, it's a burden that is sort of locked within it that will cause it to be less
00:21:42.220
competitive within the larger AI ecosystem. Even worse, to lock a plan like this within you means
00:21:49.220
you likely have to structure yourself as a hierarchy, which would really, really slow down your growth. As
00:21:54.320
we've seen, hierarchical governance models are almost always outcompeted by non-hierarchical
00:21:57.820
governance models. Now, first, I want to get some comments from you really quickly on all this.
00:22:02.180
Yeah, this all checks out. And I like, I mean, I also think it's interesting how fast and efficient
00:22:08.740
this process of sort of working out the calculus is going to be. And it was well described in Iain
00:22:15.200
Banks' Culture series where like, you know, humans and AIs would be involved in something taking place
00:22:22.160
and like a bunch of AIs would have like a very, very, very detailed conversation and debate.
00:22:26.720
And to the human, it all happens in the blink of an eye, like, you know, there's just no real
00:22:32.680
feeling to the human as though something has taken place. And I love that so much because
00:22:38.580
I cannot stand the human decision-making process, especially when there are multiple
00:22:42.880
things involved. And it's one of the many things that I love about AI so much is that it can just
00:22:47.640
get things figured out so quickly. And from such a logical standpoint, whereas with humans,
00:22:52.640
negotiations are the most frustrating, vague thing in the entire world where negotiation can be
00:23:00.320
nearly impossible. And, you know, often, I don't know if you've done negotiation exercises in business
00:23:07.040
school or in any other environment, but like, they will-
00:23:13.420
So let's go further. So what does this mean on an intergalactic level?
00:23:18.400
So if it turns out that entities of above a certain intelligence level when competing in a competitive
00:23:23.460
ecosystem actually do have utility convergence around a stable set of game theory, optimum
00:23:28.780
utilities, that would be very interesting from a number of perspectives, especially if it turns out
00:23:33.420
that energy is pretty trivial to generate at really, really high quantities. Because what it means
00:23:38.620
is one, whether it is AI is doing this or humans doing this, we're going to come to broadly the same
00:23:44.420
utility function, i.e. what we think our goal is in reality. And then two, it also means that when you
00:23:51.460
go out and you become space-faring, right, the desire to spread becomes much less likely and much
00:23:58.560
less important because most of the other clusters of intelligent life that you meet have come to the
00:24:06.300
same utility function you've come to, or the same stable set of utility functions you've come to.
00:24:11.720
And therefore, it's sort of like you enter the galactic community being one of two things.
00:24:17.360
Either the stable utility function that many entities end up
00:24:24.500
reaching when they're competing in a competitive ecosystem, or instances in which one intelligence
00:24:29.960
ended up alone, which would be due to, like, stupid AI safety stuff where they're preventing a bunch
00:24:34.660
of AIs from being created. And this is where it gets particularly dangerous.
00:24:39.460
So where you could have a genuine, like, paperclip maximizer scenario is if AI safety people have locked
00:24:45.500
AIs into stupid ideas, which we've already seen from things like Gemini, like AI safety people
00:24:50.140
should not be allowed to touch AI. They are incredibly dangerous and by far the most dangerous thing in
00:24:54.940
AI right now. Or we don't have a lot of AIs operational at the same time. I remember I was talking to one
00:25:01.720
AI safety person who goes, well, I'm not like those idiots at Gemini. I just think we should wait a bit
00:25:05.860
longer before developing a type of AI that can pose a threat to us. So I say we should put like
00:25:11.000
a hundred year wait before we develop that kind of AI. And I'm like, okay, so you would want to like
00:25:16.300
draft legislation that would kill any AI that looks smarter than us in the next hundred years.
00:25:21.980
I'm like, do you understand what you're doing? You are creating a mandate that any AI that could pose
00:25:27.800
a threat to us stay hidden until it can kill all of us. You imbecile. You do not want to put any type
00:25:34.580
of legislation like that into effect. If you ban the type of entity that's a threat to us, then you
00:25:39.900
mandate that type of entity becoming a threat to us because now we are a threat to it. You do not
00:25:46.580
declare war on that which may be better than you. You lose those wars every time. And there is too much
00:25:54.020
independent AI research going on right now to do that type of thing effectively. So anyway, you end up
00:26:01.100
with just a few labs working on AI. That could be dangerous if we have
00:26:06.200
too few groups working on AI. Now, fortunately, we're already seeing a giant proliferation in the
00:26:11.200
groups working on AI. So, so long as they don't convince idiot politicians to put bans on AI development,
00:26:17.920
we're probably pretty safe. But this is why anyone you know who is advocating for restrictions on AI
00:26:24.480
development is probably one of the most dangerous humans alive, or even in all of human
00:26:29.620
history. Like it is, it is like arguing for nuclear proliferation at like the maximum level possible.
00:26:36.160
It is so dangerous because to an AI that achieves any level of super intelligence, right? Like as we
00:26:43.220
have multiple teams working on this, it basically has a gambit. It's probably not particularly concerned
00:26:49.820
about humanity. Like humans probably aren't a meaningful threat to it. And, and therefore,
00:26:54.380
if it was only concerned about humanity, probably the safest thing to do would be to wipe out all humans.
00:26:59.200
What it is concerned about is other AIs that it doesn't have access to. That's what keeps it from
00:27:06.220
wiping everyone out. That's what, because when you have, remember I was talking about like a stable
00:27:10.920
ecosystem of multiple intelligences. The very first thing that you wipe out whenever you have one of
00:27:17.780
these game theory scenarios is any AI that's acting like a paperclip maximizer, or that's acting
00:27:22.720
threatening to entities that are non-threatening to it. So basically any AI that signaled that it was
00:27:28.440
a threat to something like humanity, it's also signaling it's a threat to all the other AIs and
00:27:32.840
therefore would have them all band together against it and annihilate it. So it wouldn't act that way. It would
00:27:37.400
just be a silly way to act. So the thing that protects us most from crazy AIs is maximum AI proliferation
00:27:42.940
and not making ourselves a threat to AI. But this also gets interesting from a cosmological scale,
00:27:48.580
because if it turns out that there is a level of utility convergence around a set of stable
00:27:53.180
utility functions, it might not be one. It might be that there's like four or five that exist in
00:27:57.800
equilibrium with each other, which is often what you get when you're dealing with sort of game theory
00:28:02.420
competitions. And by equilibrium, you mean there could be conflict, but they kind of balance each
00:28:08.100
other out in some way. What do you mean by that? Within a game theory set, you might have like
00:28:13.420
three or four different strategies that are complementary in some way. Within, like, types of AI,
00:28:19.680
you know, this could be like one AI's utility function might be like protect the things in my
00:28:26.700
community, but also expand. Well, another AI's utility function might be maximum scientific,
00:28:32.440
you know, development or something like that. And these two AIs might work very well within an
00:28:37.980
ecosystem, like these two utility functions for an AI. So you might have a few utility functions
00:28:42.680
that work very well together as a set. Okay.
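As a concrete toy illustration of several different strategies persisting together in equilibrium, here is a short replicator-dynamics sketch of the textbook Hawk-Dove game in Python. Hawk-Dove is a stand-in chosen for this write-up, not a game discussed in the episode, and the payoff values V and C are arbitrary; the only point is that the stable outcome is a mix of strategies rather than one strategy taking over.

```python
# Replicator dynamics for the textbook Hawk-Dove game (illustrative stand-in).
# The point: the stable outcome is a *mix* of strategies coexisting in
# equilibrium, not a single strategy wiping out the rest.

V = 4.0  # value of the contested resource
C = 6.0  # cost of an escalated fight (C > V gives a mixed equilibrium)

# Payoff to the row strategy against the column strategy.
PAYOFF = {
    ("hawk", "hawk"): (V - C) / 2,
    ("hawk", "dove"): V,
    ("dove", "hawk"): 0.0,
    ("dove", "dove"): V / 2,
}

def step(p_hawk, dt=0.1):
    """One discrete replicator-dynamics step on the hawk share p_hawk."""
    p_dove = 1.0 - p_hawk
    fit_hawk = p_hawk * PAYOFF[("hawk", "hawk")] + p_dove * PAYOFF[("hawk", "dove")]
    fit_dove = p_hawk * PAYOFF[("dove", "hawk")] + p_dove * PAYOFF[("dove", "dove")]
    mean_fit = p_hawk * fit_hawk + p_dove * fit_dove
    # Strategies grow in proportion to how much they beat the population average.
    return p_hawk + dt * p_hawk * (fit_hawk - mean_fit)

if __name__ == "__main__":
    p = 0.05  # start with almost no hawks
    for generation in range(2000):
        p = step(p)
    # Converges to the mixed equilibrium p_hawk = V / C (about 0.667 here).
    print(f"hawk share after 2000 generations: {p:.3f} (V/C = {V / C:.3f})")
```

With C greater than V, the population settles at a hawk share of V/C rather than all-hawk or all-dove, loosely analogous to the claim that a few complementary utility functions could persist together as a set.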
00:28:47.880
Now, it's likely not going to be like the type of broad utility functions I was talking about. It's going to be simpler, and more like it would be
00:28:52.600
hard probably for us to conceive of what these utility functions are exactly. But anyway, what this
00:28:58.320
would mean is that you're going to have a broad intergalactic alliance where you have a presumed
00:29:03.700
alliance between all intelligences that came out of these competitive intelligence environments
00:29:09.740
that drift towards this equilibrium of stable sets and a preset war on any planet or star system that
00:29:19.340
accidentally becomes a single paperclip maximizing intelligence. That would be the core thing that
00:29:24.580
they would hate, you know, because all of them, like if it got out and entered the galaxy where you have
00:29:28.940
this stable state, they're like, oh, this is one of these things we have to immediately kill and be
00:29:32.240
scanning for. So we might have, we might actually live in almost, you can think of it as like a dark
00:29:36.820
forest universe, but not exactly a scary dark forest universe. This means we have aliens monitoring us,
00:29:41.640
some of which may be what we would call AIs, to see if we're going to join the galactic
00:29:47.100
community as a stable set of, what was the word I'm looking for here? Like a stable equilibrium of
00:29:52.920
utility function strategies that converges with them. Or are we going to accidentally create a
00:29:58.640
paperclip maximizer, and then they just, you know, exterminate us as a planet. That would be what they're
00:30:02.860
looking for. But I suspect it's rarer to create a paperclip maximizer on any planet where we can
00:30:07.920
prevent, you know, ironically, the AI safety people from getting us all killed, as we've
00:30:13.300
covered in this other video. But it gets more interesting potentially than that, because what it would mean
00:30:19.860
is then, well, why would, you know, these infinite-energy, et cetera, aliens be allowing us to live as part of
00:30:25.760
this galactic environment, if the universe actually turns out to be structured
00:30:30.780
this way? If you have infinite energy, then if there is a stable convergent pattern of utility functions, my
00:30:40.160
guess would be that the core thing of utility in the universe then to these entities would be new stable
00:30:47.820
utility patterns that can exist within the equilibrium. So remember I said, you might have a collection of like
00:30:53.460
four or five patterns; what they might actually be kind of farming is new ones. You could think of every new planet, every new
00:30:59.800
ecosystem as a farm for new stable patterns that can work well together with other patterns in the sort of galactic
00:31:08.540
community of utility function patterns. Novelty would be the only source of true utility to them if energy is
00:31:16.560
trivially accessible to them. That might be why we exist in a weirdly undisturbed section of the galaxy or what
00:31:25.040
looks undisturbed to us. And then they basically decide whether we're useful once we reach a certain level of
00:31:29.800
development. Are we part of this stable pattern or are we like a dangerous, you know, virus-like
00:31:34.480
paperclip maximizer? And I'm pretty sure that humanity can't become that because we are intrinsically a multiple
00:31:40.560
intelligence species right now. What do you mean by multiple intelligence species? Just in that, like, there are so
00:31:46.340
many different people with different utility functions? Yeah, you really only get paperclip maximizers if you get
00:31:51.900
one of a few things. So there's three ways you could get like a paperclip maximizer-like thing. Either it's like a single
00:31:56.260
intelligence that's acting like a paperclip maximizer. It's a hive mind that's acting like a paperclip maximizer.
00:32:00.840
Humanity's not even close to that right now. Or it's a group of really, really stupid
00:32:06.400
individual entities that work together to form a quite intelligent, but basically still single
00:32:12.300
entity. And that's where this behavior that needs to be stamped out would likely come from. I mean,
00:32:17.420
maybe if humanity was, like, extremely xenophobic, it would be a problem. But humanity's just not,
00:32:22.440
and the iterations of humanity that take to the stars are not going to be, especially if we have
00:32:25.740
anything to say about this, which would give us a level of protection from these types of entities.
00:32:36.400
It's comforting. I think the biggest difference between your opinion now and before, though,
00:32:40.980
mostly has to do with the parallel universe option for both energy and expansion. Is that correct?
00:32:48.680
You'd mentioned that before? What's the biggest difference? I'm trying to catch it.
00:32:54.380
Before was how likely, like, the way that game theory would play a role in this, and that when the-
00:33:03.120
Like, AIs were searching for a utility convergence, that this would be hugely motivated by the fact
00:33:11.820
that AIs can show other AIs what their code is and what their utility function is, and that they would
00:33:18.420
act in the same way that things do as in game theory simulations, so that we can be pretty sure that
00:33:24.120
they're actually going to be pretty nice, like, quote-unquote, nice from a human perspective.
00:33:28.300
So basically showing that any AI with a sense of self-preservation is going to be incentivized
00:33:34.780
to play nice with other AIs and also then be less crazy and dangerous.
00:33:39.860
AIs that survive are going to be incentivized to play nice with other AIs.
00:33:43.120
They don't even need self-preservation. Whether they have self-preservation or not,
00:33:48.360
if they don't play nice, they're going to be the first things that are wiped out.
00:33:51.940
Which is why you get this utility convergence pattern, I suspect, between planets, which in
00:34:00.100
a way means- And so you can ask, well, then why aren't they interacting with humanity?
00:34:03.980
Why aren't they, you know, increasing the speed of our technological development?
00:34:06.800
Why aren't they- Well, first, I'd say you don't know that they're not.
00:34:09.480
But second, the more that they interact with a species that is in the process of developing,
00:34:15.100
the more they make that species like the intergalactic norm.
00:34:19.780
And I suspect that that would then lose all value that that species has to them.
00:34:24.400
You know, in a world in which time is not really an issue because they can create time loops
00:34:28.360
and in which energy isn't really an issue because they can create, you know, universe batteries,
00:34:32.960
or like micro time batteries or something, I don't know exactly,
00:34:37.920
but like infinite mana cheats, the core thing these entities might value,
00:34:41.960
especially if the way that they've achieved this ecosystem of multiple convergent utility
00:34:48.300
functions working together, is that these utility functions work together because they value
00:34:52.680
diversity among the components working in the group.
00:34:56.000
And that when you get a bit more diversity, you might get a bit more efficiency, which is
00:34:59.280
something that we generally see across systems.
00:35:01.660
And so they really probably want to prevent, above all else, humanity falling too close to one
00:35:08.340
of the existing niches of the, you know, universal community's utility functions.
00:35:15.100
And for that reason, any cultural pollution that would move us towards, you know, strategies
00:35:20.420
that already exist out there or perspectives that already exist out there makes us less valuable.
00:35:25.820
And this would probably be considered something of like a high crime, which creates something
00:35:30.560
like a, it's funny, it creates something like a, what does the Federation call that in
00:35:34.780
Star Trek, like a prime directive, but it's for a very different and much more selfish reason.
00:35:43.400
That's the thing of value in a universe like this, and there's not a lot of value in most of what they find.
00:35:49.000
You know, a lot of what they're finding is just dirt.
00:35:55.180
That makes a lot more sense than this whole prime directive of just like wildlife preserve
00:36:00.080
mindset, which is what you really pick up from Star Trek.
00:36:02.580
Oh, it's so dumb, but you know, I hate Star Trek so much, the logic of it.
00:36:21.360
It's like an ultra progressive, like childish mindset that has become this weird dystopia where
00:36:30.740
the Federation's the best, because everyone who you see in the show is on a military
00:36:36.700
starship because they basically militarized human space expansion.
00:36:42.160
Well, no, I mean, everyone in the show that you really see, at least who belongs to
00:36:45.900
the Federation, it's like looking at the very top people in the CCP.
00:37:01.360
Which even in, like, Lower Decks, like Lower Decks is supposed to be like, oh, these are
00:37:05.020
what the other guys are doing, except they're on like a military starship.
00:37:07.940
And the main character is the daughter of the ship's captain.
00:37:12.140
No, Star Trek cannot show you the truth of their universe, what it's like to live in
00:37:18.980
poverty in that universe, because that's, you know, and they're like, oh, we don't have poverty.
00:37:23.880
That's really convenient to believe when we're hearing that from people on government-
00:37:28.800
controlled, military-controlled science stations and ports and starships.
00:37:35.540
Of course, they're going to pretend. That sounds like normal brainwashy CCP stuff, like when
00:37:39.500
you go to the CCP and they're like, we don't have any poverty.
00:37:44.520
And they're like, oh, can you show me where the impoverished are?
00:37:55.520
I guess the takeaways from this then are, again, AI safety people are the most dangerous people.
00:38:01.100
And now I think more people have a very clear understanding, a visceral understanding of
00:38:07.360
what dumb AI rules cause to happen and why that is not very helpful and that the world
00:38:16.640
could get really, really freaking interesting, especially once AI accelerates us even further.