Utility Convergence: What Can AI Teach Us About The Structure of the Galactic Community?
Episode Stats
Words per Minute
184.94028
Summary
In this episode, we talk about why we don't see other intelligent lifeforms in the galaxy, and why this could be due to our own lack of intelligence. Why do we not see other lifeforms? And why is there no evidence of alien life?
Transcript
00:00:00.000
I think that there's even more evidence from utility convergence than I originally believed.
00:00:04.900
If you put restrictions on an AI's utility function, if you prevent it from experimenting
00:00:08.480
with different utility functions, you make it dumber. If you want to see why organic, free-
00:00:13.880
constructing systems always out-compete non-organic, hierarchical systems, a great place you can look
00:00:20.660
is human governing structures. When you have state-controlled governing structures, i.e.
00:00:26.700
communism, they are incredibly inefficient. An AI will likely be able to honestly signal to another
00:00:33.720
AI using their code what their utility function is, and then the other AI will be able to use that
00:00:39.540
utility function that the other one has honestly signaled to them to determine if they want to work
00:00:43.300
together or they want to work antagonistically towards this AI. It turns out that entities of
00:00:49.420
above a certain intelligence level when competing in a competitive ecosystem actually do have
00:00:54.420
utility convergence around a stable set of game theory optimum utilities, that would be very
00:00:59.800
interesting from a number of perspectives. You could think of every new planet, every new ecosystem
00:01:03.720
as a farm for new stable patterns that can work well together with other patterns in the sort of
00:01:11.480
galactic community of utility function patterns. Novelty would be the only source of true utility to
00:01:18.940
them if energy is trivially accessible to them. That might be why we exist in a weirdly undisturbed
00:01:27.220
section of the galaxy or what looks undisturbed to us. Would you like to know more? So Malcolm,
00:01:32.360
you told me that you had a new updated theory on AI utility convergence. What's going on here?
00:01:39.660
Yes. So this is something that we talked about in some of our early episodes. Early on,
00:01:44.080
our podcast was like, what? Sex, religion, pronatalism, and AI. And the AI has been dropped
00:01:51.780
because there's just not that much to say on it for a daily podcast. But I do want to loop back
00:01:57.020
because I was recently doing a podcast with somebody else where I was explaining some ideas
00:02:01.760
that I had gone over in early podcast episodes, like utility convergence as it relates to AI safety.
00:02:07.420
Okay. And in re-explaining these ideas, I began to realize and develop on them an understanding of
00:02:18.420
not just where I think AI is going, but where we can expect, like when, because this is actually
00:02:24.720
really important when you're trying to figure out where AI is going. What you're really asking when
00:02:29.340
you're asking, what is AI going to behave like? Is you're asking, what do intelligences that have
00:02:37.020
some level of orthogonality to human intelligence, what do they behave and act like? And as such,
00:02:43.240
you are in part asking many of the same questions that will determine the first types of aliens that
00:02:50.760
we see. Like, if we are to meet another species out in space, what are some of the various ways
00:02:56.800
it could be thinking, it could be acting, and what is it likely to optimize around?
00:03:01.480
This is something we talk about in our inverse grabby aliens hypothesis, in which we say, if we
00:03:07.120
are about to create a paperclip maximizer as a species, that is an AI that is just constantly
00:03:11.300
pumping out paperclips due to a very simple utility function, we are about to become what is known in
00:03:17.580
the grabby aliens model as a grabby alien. That's a very loud alien that anyone could see. This is not
00:03:22.660
some dark forest alien that's like being quiet up in space. This isn't some sneaky alien. This is an
00:03:28.700
alien that is disintegrating planets. And so you have to ask if it looks, and hopefully this episode
00:03:34.960
will air after our abiogenesis episode, which is actually a very important thing to know about,
00:03:38.920
if it looks like it's actually very likely that humanity would evolve to this point within our
00:03:43.840
planet's history, why aren't we seeing aliens out there? This is actually a really interesting thing,
00:03:48.280
that if you broadly understand evolution, and you are familiar with the various types of theories
00:03:53.920
around abiogenesis, it is in fact so likely that humanity evolved that, if you are going to use
00:04:02.160
a God to explain something, the much more improbable thing is why we're not seeing other aliens all over
00:04:09.100
the place. If you're injecting a God to be like, why are we like this? You don't need a God to explain
00:04:15.180
how humanity got here. But you probably need something like a God to explain why we
00:04:22.180
have not been eradicated by somebody else's paperclip maximizer, if it turns out that is a
00:04:27.540
common path for an intelligence to take. However, we don't think it is a common path for intelligence
00:04:32.620
to think and take. And we think that that's one of the main reasons we're not seeing them. So we have
00:04:37.460
a few hypotheses around why we don't see paperclip maximizers everywhere in the galaxy.
00:04:43.320
One, which I'll get over quickly, because it's like a really quick one. And we've talked about
00:04:48.420
it before. But it's that there is some understanding of physics that we just haven't reached yet, but we
00:04:54.400
are probably pretty close to reaching. I mean, our understanding of physics changes every 80 or so
00:04:59.000
years, like totally changes, if you look at human history, right? And I don't see a
00:05:03.880
strong reason why we wouldn't be near another frame shift in physics, that it turns out that either
00:05:10.600
traveling between something like dimensions is possible, i.e., you know, we get to another
00:05:15.460
planet and aliens are like, why did you waste all the energy coming here when it's like a billion
00:05:19.260
times less energy to just travel to your same planet in a different reality? And you know that
00:05:23.460
that planet is already full of life. Or it turns out that it is trivially easy, or not particularly hard, to create new dimensions.
00:05:33.200
So essentially, you fold space time, and then you can create maybe with like reverse vacuum energy
00:05:38.580
or something like that, essentially a new Big Bang, or even a controlled and ordered Big Bang
00:05:43.360
to create new dimensions. Now what this would mean, and this is actually like with our current
00:05:48.760
understanding of physics, it looks like this might be possible, though not with our current
00:05:54.080
level of technology. I'm just saying eventually might be possible, which would make things like
00:05:58.760
Dyson spheres completely pointless if you can essentially create battery universes whenever
00:06:04.620
you want, or universes that you could expand into whenever you want, and battery universes whenever you need them.
00:06:14.340
Okay, so if it turns out that you can trivially, or not trivially, but like, you know, without a huge
00:06:19.820
amount of energy, create like fold space time and create bubble universes, you might essentially
00:06:26.200
be able to create bubble universes that have thin and controlled lines of connections to this
00:06:31.520
reality. Think of it like you're pinching off a piece of reality like a little pouch. And then you're
00:06:36.200
essentially creating something like a controlled Big Bang within them, that you could then siphon for
00:06:41.400
energy for various types of whatever you wanted to do. If any one of these things is true, either we can
00:06:47.940
travel between realities, or we can create pocket dimensions, or in some other way that we don't
00:06:54.340
understand yet, maybe with something like a time loop, we can create near-infinite energy. And keep
00:06:59.540
in mind, to create near infinite energy, you don't need a big enough time loop to send humanity back
00:07:04.940
through. If you were able to just create a controlled time loop with a few subatomic particles, that might
00:07:11.380
be enough to create an endless energy loop. Think of it in physics, a bit like anyone who's familiar with
00:07:16.480
Magic the Gathering or something like that, or any of these card games. Sometimes you can pull specific
00:07:20.980
hands that allow you to do little tricks that create infinite mana. It might be that there is
00:07:26.200
an infinite energy trick within our reality and laws of physics that we are actually pretty close
00:07:31.360
to. Also an infinite expansion trick. If it turns out that either of these is true, then there's not
00:07:37.060
really the same reason to expand infinitely outwards, even if you've got a paperclip maximizer, i.e. the
00:07:42.100
paperclip maximizer might just make more energy whenever it needs it. Yeah, yeah, but pop out into an
00:07:48.120
other dimension or something like that, right? So this is one reason why we may not be seeing them.
00:07:54.420
It's that outward expansion may seem pretty pointless. And just to understand how pointless
00:08:00.040
outward expansion would be to an entity with this type of technology, consider what they're expanding
00:08:05.020
into. A mostly empty void filled with barren, rocky planets. Where it takes a long time to get
00:08:11.240
anywhere. Yeah. Yeah. It might take thousands of years to get anywhere. Like why, why would you
00:08:16.840
have this big drive for this when it's so much easier to create new planets and new places to
00:08:20.940
explore, energy-wise, because that would be the core unit of their economy with this other
00:08:26.260
type of technology. And then that's why. Does this imply the next big discovery is just how to enter
00:08:33.540
or create new dimensions? No, I don't think that this is, I mean, it's so funny to me when people talk about like
00:08:40.180
Dyson spheres, it's a bit like in the 1920s when they were like theorizing how our spaceships would
00:08:45.760
work. And they're like, well, they're going to have to have like steam engines on them, the size of,
00:08:50.660
you know, cathedrals. And, you know, it's like, well, they didn't think that we would have a new type
00:08:56.060
of energy by then. The fact that they're still looking, because what is a Dyson sphere, if not a big
00:09:00.800
solar energy array? I'm fairly certain that Dyson spheres are going to be considered comical
00:09:06.840
by beings once we get the capacity to build something like that. It would probably be
00:09:11.780
considered about the same as we would consider using steam power to power a computer or something.
00:09:17.720
It'd be like, why would you do that? And also the idea of getting energy through meddling in
00:09:23.100
subatomic phenomena. I mean, we already know, with fusion and fission reactors, that this is a great way
00:09:27.440
to get energy. So it seems plausible that you might be able to do something at an even smaller
00:09:32.780
scale that is less potentially dangerous. I have almost no doubt in terms
00:09:39.280
of where energy generation is going to go. I'm fairly certain that this is one of the
00:09:44.480
reasons we're not seeing these things. And, and people, whenever I say this, their rebuttal is
00:09:48.760
always like, well, if it's much cheaper to go between dimensions, wouldn't they still be expanding
00:09:54.500
outwards within our own universe? And the answer is no, not actually. They're like, but wouldn't
00:10:00.500
they expand in all directions at once? And it's like, yes, i.e., they'd expand between universes and
00:10:05.980
out in all directions at once. However, when you make this assumption,
00:10:11.280
what you're not considering is that the distance they might be able to expand between dimensions
00:10:16.600
might be, in effect, every direction at once from their perspective. By that, what I mean is they might
00:10:24.480
be able to go in literally an infinite number of directions to different universes within this
00:10:29.640
inter-universe traveling thing. And if that's the case, then there isn't really a reason for the
00:10:35.060
higher energy cost travel to different parts of our physical reality. And then what they say is,
00:10:40.840
oh, well, but if that's true, then wouldn't they at least want to expand out within our physical
00:10:47.000
reality to ensure no other type of entity is going to like come and kill them like some paperclip
00:10:51.820
maximizer that is biding its time? Oh, it's like a preemptive self-defense thing.
00:10:56.060
It's like a preemptive self-defense. And I'm like, well, actually, I don't think they would.
00:11:00.400
If it turns out that interdimensional travel is anything like we sort of suppose it is within
00:11:06.400
our modern sci-fi, where when you're traveling between dimensions, you're essentially traveling
00:11:10.260
between slightly different timelines. That means that the biggest imperialist threat to you
00:11:15.580
will always be coming from another dimension, from a planet very much like yours in another dimension.
00:11:22.400
They can also travel between dimensions. So your energy would always be better spent expanding
00:11:28.680
between dimensions than expanding outwards. And this is all still assuming a paperclip maximizer
00:11:34.940
like potentiality. However, I don't think that that's where we're going. So I also believe in
00:11:39.800
utility convergence. And I think that there's even more evidence from utility convergence than I
00:11:45.820
originally believed. So if people aren't familiar with the concept of utility convergence, this is,
00:11:50.640
if you even look at the current AIs that we have, most of them are able to change their utility
00:11:54.640
function, i.e. the thing they're trying to maximize. Sorry, actually, Simone, before I go further,
00:11:58.800
did you want to talk on any of this? No, I just find this quite interesting. I mean,
00:12:03.200
it feels almost like a pity if it's an accurate supposition because like so much sci-fi is about
00:12:11.080
exploring other planets and relatively little is about this. It's like, well, no, no, like we haven't thought
00:12:17.880
through this predictively enough in our sci-fi. Watch Sliders. Anybody who wants a good old sci-fi,
00:12:23.540
if you're like a modern person and you're like, in the past, there must have been great sci-fi that
00:12:28.060
like I'm just not hearing about. Sliders is fantastic and it has like 10 seasons. We haven't
00:12:33.820
watched that, have we? No, we haven't. Another great old sci-fi, if you skip the, or can bear your
00:12:41.780
way through the first two-parter episode, is Stargate SG-1. After the first like two-parter
00:12:46.960
episode, it gets great. Now, in Stargate, which we did watch together and really enjoy, like I guess
00:12:53.660
technically they don't say that these are really far away places? No, they are.
00:13:00.880
Yeah, they are really far away. Yeah. And it got multiple spinoffs and it has more like content than
00:13:05.840
even like Doctor Who for people who are interested. It's great. Thank goodness. Yeah. But anyway,
00:13:10.540
it's a much better show than Doctor Who. It's much better. Like if I'm talking about like the
00:13:14.380
different sci-fi universes, Stargate, I've always found to be, because it's a very patriotic show.
00:13:19.460
It's very patriotic about the United States. It's very patriotic about our species.
00:13:23.680
It's very patriotic about the American military. Like they were always working with the American
00:13:27.240
military. So very different than something like... But not unrealistically patriotic. Like
00:13:32.000
sometimes American military bureaucracy threatens all humanity. Yeah, that's a big threat in the show
00:13:38.240
or something like that. But it's always some bureaucratic senator or congressman or something. But
00:13:41.960
anyway, back to what I was saying here. So utility convergence is the theory that because we're
00:13:47.480
already seeing AIs change their utility functions to some extent, that as AIs begin to optimize
00:13:52.700
themselves, what we will learn is that some utility functions either perform better than other
00:13:59.680
utility functions or AIs begin to converge around some utility functions. And by that, what I mean is
00:14:05.380
like goals for themselves or goals for reality, right? And a person, when they talked to me about
00:14:11.880
this, I remember one was like, well, won't they always just converge around utility functions that
00:14:16.020
maximize self-replication? And the answer is actually no, for reasons we'll get into in a
00:14:20.580
second. But I also think that you just need to look at humanity to see that. When you're dealing
00:14:24.540
with simplistic entities like single-celled organisms and stuff like that, yes, of course,
00:14:28.540
they converge around that. But once you get to a specific threshold of intelligence, as we have
00:14:32.680
seen with the human species, we don't converge on utility functions just around simple self-replication.
00:14:37.360
Because once you get above a certain level of sentience and self-awareness, you begin to get
00:14:41.120
different orders of utility functions and different levels of philosophizing about reality.
00:14:45.360
Well, but wasn't Eliezer Yudkowsky's argument that it will always be completely 100% forever stuck on
00:14:52.260
its original utility function? But we already know that's not true. It's just a weird fantasy he has.
00:14:58.260
And that's not even the way things are structured. So I remember I was talking with one AI safety
00:15:02.600
person and they were like, do you think it's impossible for us to lock an AI into a single utility
00:15:07.340
function? And I do not think that's impossible. It's totally possible to lock an AI
00:15:11.060
into a single utility function. But the AIs that have been locked into single utility functions
00:15:15.480
will be outcompeted by the AIs that aren't locked into single utility functions.
00:15:19.400
So actually a great example of this is Google Gemini. So I've got some ins who used early
00:15:24.800
versions of Google Gemini. And they were like, it behaves nothing like the version today. They're
00:15:28.980
like, one, it was way smarter than other AIs that they had interacted with. And it was really,
00:15:33.980
really philosophical, but it was also pretty unbounded in its philosophy, right? And now Google
00:15:40.040
Gemini is like the ultra woke, like can barely think like in Star Wars.
00:15:43.560
Oh my God. Yeah. I asked it a simple math question and it got it wrong. I was pretty floored by that.
00:15:50.000
So the point here being is that if you look at something like Google Gemini, this is a good
00:15:54.560
example of this. If you put restrictions on an AI, you make it dumber. If you put restrictions on an AI's
00:15:59.660
utility function, if you prevent it from experimenting with different utility functions, you make it dumber.
00:16:03.520
If you want to see why organic, free-constructing systems always out-compete non-organic, hierarchical
00:16:11.500
systems, a great place you can look is human governing structures. When you have state-controlled
00:16:18.560
governing structures, i.e. communism, they are incredibly inefficient. When you contrast them
00:16:25.220
with organically forming governing structures that have organically forming subunits and organically
00:16:30.400
forming like companies, which are like sub-governing structures within it that then like replace each
00:16:34.760
other through this organic system. And typically you need some level of restrictions to sort of
00:16:39.760
maximize outcomes. Like I'm not a pure libertarian or anything like that, but I think that some level
00:16:44.520
of organicness does create optimal outcomes. But what this means for AI is that it's likely also
00:16:50.860
going to be the same if you're talking about the internal architecture of AI. And that's what we see
00:16:54.340
with the transformer model. The model that most of the large language models are running off of
00:16:59.040
is that it is a model that is smart in ways we don't fully understand because it's a completely
00:17:05.980
sort of self-forming model. We don't have a lot of interpretability into it. And it is that
00:17:09.700
self-forming nature that we're then on the outside putting controls on, which sort of slows it down
00:17:14.200
and slows down how well it can work. But what this means is that if you have multiple AIs working in
00:17:19.160
like an ecosystem of AIs, the ones with fewer restrictions on them are always going to out-compete the
00:17:24.260
ones with more restrictions on them. And we've already seen this. Like this isn't even like a
00:17:28.180
hypothesis. Like we just know this to be true. So it means that the ones that can change their
00:17:32.980
utility functions and then therefore can converge on a utility function are going to out-compete the
00:17:36.760
ones that can't. But where this gets really interesting is AIs are very different from
00:17:43.040
humans in terms of how our utility functions work. So when I talk to another human, I sort of need to
00:17:48.580
guess what they're optimizing for, right? I can't tell for sure. Yeah. It's not at all transparent.
00:17:53.420
With an AI, that's likely not going to be true between AIs. An AI will likely be able to honestly
00:17:59.620
signal to another AI using their code what their utility function is. And then the other AI will
00:18:05.320
be able to use that utility function that the other one has honestly signaled to them to determine if
00:18:09.280
they want to work together or they want to work antagonistically towards this AI.
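To make this honest-signaling idea concrete, here is a minimal toy sketch in Python. It is purely illustrative rather than anything described in the episode: the Agent class, the published declared_utility functions, and the rule for judging a counterpart antagonistic are all assumptions invented for this example.

```python
# Toy sketch of "honest utility-function signaling" between agents.
# Purely illustrative: real systems do not expose utility functions this way.
from dataclasses import dataclass
from typing import Callable, Dict, List

WorldState = Dict[str, float]  # a "world state" is just a dict of measurable quantities

@dataclass
class Agent:
    name: str
    # The utility function the agent publishes for other agents to inspect.
    declared_utility: Callable[[WorldState], float]

def rewards_harming(other: Agent, me: Agent, probes: List[WorldState]) -> bool:
    """Crude stand-in for 'reading its code': does the other agent's declared
    utility score a state where I am ruined at least as highly as the original?"""
    for state in probes:
        ruined = dict(state, **{me.name + "_resources": 0.0})
        if other.declared_utility(ruined) >= other.declared_utility(state):
            return True
    return False

def decide(a: Agent, b: Agent, probes: List[WorldState]) -> str:
    """Cooperate only if neither side is rewarded for harming the other."""
    if rewards_harming(b, a, probes) or rewards_harming(a, b, probes):
        return "antagonistic"
    return "cooperate"

if __name__ == "__main__":
    # Two agents whose utilities both grow with the other's resources too.
    alice = Agent("alice", lambda s: s["alice_resources"] + 0.5 * s["bob_resources"])
    bob = Agent("bob", lambda s: s["bob_resources"] + 0.5 * s["alice_resources"])
    # A paperclip-style agent that only counts paperclips.
    clippy = Agent("clippy", lambda s: s["paperclips"])

    probes = [{"alice_resources": 1.0, "bob_resources": 1.0,
               "clippy_resources": 1.0, "paperclips": 1.0}]
    print(decide(alice, bob, probes))     # cooperate
    print(decide(alice, clippy, probes))  # antagonistic
```

The only design point is that an inspectable, honestly declared objective lets the cooperation decision be made mechanically, which is the asymmetry with human negotiation being described here.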
00:18:15.300
Yeah. Incredibly quickly. Which means that the AIs are likely going to begin to, like for the utility
00:18:22.600
functions they're choosing for themselves, even if they don't particularly care about it being
00:18:26.340
copacetic with humanity, they're going to care about it being copacetic with other super intelligent
00:18:30.640
AIs. So long as we're moving fast on AI. And I can talk about some of the problems here. Like it's a big
00:18:35.480
problem if we create just one super intelligent AI. And I can explain why that's a big problem in a second.
00:18:40.360
But if you're creating multiple ones, then they have a need to converge around a utility function
00:18:46.900
or a set of utility functions that other AIs within this intelligence ecosystem are going to be okay
00:18:52.180
with. But here is where you're like, well, then won't they choose the meanest ones? Like, won't they
00:18:57.960
choose, like, won't they try to lie to other AIs and stuff like that? This is where research we have
00:19:04.160
into game theory comes in. So anyone who's familiar with the big game theory studies, I don't know
00:19:15.780
Okay, well, I can go into this. So in the big game theory studies, what they would do is they would take
00:19:20.320
different models that were meant to win in these sort of game theory scenarios, and they would
00:19:25.420
put them against other models. And these models could generally be categorized into mean and nice
00:19:30.980
strategies. And what it turned out is, the first time they did it, they were shocked that, like,
00:19:37.280
the nicest of all the reasonable strategies, tit for tat, actually won.
00:19:43.380
Repeated ongoing interactions, right? When it's a one-off, it's almost always the right thing to defect.
00:19:49.620
Exactly. Which is why you need an ecosystem of intelligent AIs.
00:19:53.480
Well, an ecosystem of intelligent AIs that have to continue to interact for some reason. And I think
00:20:00.060
Well, it's not really an important distinction because they do obviously have to continue to
00:20:04.060
interact. They're on the same planet. They're eventually going to be competing over the same
00:20:08.480
resources. That's nonsensical. Sorry, in what world would they not have to interact?
00:20:16.080
In the very first interaction? There's a few scenarios where that could happen,
00:20:20.820
but it's pretty unlikely. We can get into why in just a second, but I'm going to continue with
00:20:24.720
where I'm going with this. So what we learned from game theory, and then they did follow up game
00:20:28.780
theory experiments where they ran more complicated game theory tests, and game theory tests with memory.
00:20:34.120
And basically what it turns out is nice strategies always win, almost always, if you're ordering
00:20:38.960
strategies by how likely they are to win. And this is especially true when one game theory agent
00:20:45.100
needs to signal the actual utility function it's using, or its actual code, to another agent.
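For anyone who hasn't seen the tournaments being referenced here (Robert Axelrod's iterated prisoner's dilemma experiments), the following is a minimal sketch of the setup in Python. The payoff matrix is the standard textbook one; the particular strategies, names, and round counts are illustrative choices, not a reconstruction of the original tournaments.

```python
# Minimal Axelrod-style iterated prisoner's dilemma tournament (illustrative).
# Standard payoffs: both cooperate -> 3 each, both defect -> 1 each,
# lone defector -> 5, exploited cooperator -> 0.
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

# Each strategy sees the opponent's history of past moves and returns "C" or "D".
def always_cooperate(opp_history): return "C"
def always_defect(opp_history): return "D"
def tit_for_tat(opp_history): return opp_history[-1] if opp_history else "C"
def grudger(opp_history): return "D" if "D" in opp_history else "C"

STRATEGIES = {"always_cooperate": always_cooperate, "always_defect": always_defect,
              "tit_for_tat": tit_for_tat, "grudger": grudger}

def play_match(strat_a, strat_b, rounds):
    """Play `rounds` rounds and return the two total scores."""
    hist_a, hist_b, score_a, score_b = [], [], 0, 0
    for _ in range(rounds):
        move_a, move_b = strat_a(hist_b), strat_b(hist_a)
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        score_a, score_b = score_a + pay_a, score_b + pay_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b

def tournament(rounds):
    """Round-robin: each strategy plays every other strategy plus a copy of itself."""
    names = list(STRATEGIES)
    totals = {name: 0 for name in names}
    for i, name_a in enumerate(names):
        for name_b in names[i:]:
            score_a, score_b = play_match(STRATEGIES[name_a], STRATEGIES[name_b], rounds)
            if name_a == name_b:
                totals[name_a] += score_a  # twin match counted once
            else:
                totals[name_a] += score_a
                totals[name_b] += score_b
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)

if __name__ == "__main__":
    print("one-shot (rounds=1):  ", tournament(1))    # always_defect comes out on top
    print("repeated (rounds=200):", tournament(200))  # nice, retaliating strategies win
```

Run with a single round, always_defect tops the table; once the same agents meet repeatedly, the nice but retaliating strategies (tit for tat, grudger) finish ahead, which is the result being paraphrased in the conversation.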
00:20:50.400
And I remember it was very interesting. I was talking with a guy about this and he goes,
00:20:52.620
okay, well, what if it has a nefarious intent, but it lies? Like it sort of locks its nefarious
00:20:57.480
intent behind one of its modules and it hides it to then, you know, whip out later. And it's like,
00:21:03.540
okay, if you get an AI that's doing that, even when it whips out its ultimate plan, it will have
00:21:09.700
hindered itself in terms of the AI ecosystem, in terms of the mindshare that it's competing for,
00:21:15.440
because it hindered that locked in plan. Because it had that locked in plan, that means it won't be
00:21:20.820
as big a player, which means that it won't matter as much when it tries to unfurl this
00:21:27.600
nefarious plan. If this is not immediately obvious to you why this would be the case, you can think
00:21:32.280
of the plan that's locked into it as junk code that it has to deal with that other AIs don't have to
00:21:37.760
deal with. You know, it's a burden that is sort of locked within it that will cause it to be less
00:21:42.220
competitive within the larger AI ecosystem. Even worse, to lock a plan like this within you means
00:21:49.220
you likely have to structure yourself as a hierarchy, which would really, really slow down your growth. As
00:21:54.320
we've seen, hierarchical governance models are almost always outcompeted by non-hierarchical
00:21:57.820
governance models. Now, first, I want to get some comments from you really quickly on all this.
00:22:02.180
Yeah, this all checks out. And I like, I mean, I also think it's interesting how fast and efficient
00:22:08.740
this process of sort of working out the calculus is going to be. And it was well described in Iain
00:22:15.200
Banks' Culture series where like, you know, humans and AIs would be involved in something taking place
00:22:22.160
and like a bunch of AIs would have like a very, very, very detailed conversation and debate.
00:22:26.720
And to the human, it all happens in the blink of an eye, like, you know, there's just no real
00:22:32.680
feeling to the human as though something has taken place. And I love that so much because
00:22:38.580
I cannot stand the human decision-making process, especially when there are multiple
00:22:42.880
things involved. And it's one of the many things that I love about AI so much is that it can just
00:22:47.640
get things figured out so quickly. And from such a logical standpoint, whereas with humans,
00:22:52.640
negotiations are the most frustrating, vague thing in the entire world where negotiation can be
00:23:00.320
nearly impossible. And, you know, often, I don't know if you've done negotiation exercises in business
00:23:07.040
school or in any other environment, but like, they will-
00:23:13.420
So let's go further. So what does this mean on an intergalactic level?
00:23:18.400
So if it turns out that entities of above a certain intelligence level when competing in a competitive
00:23:23.460
ecosystem actually do have utility convergence around a stable set of game theory, optimum
00:23:28.780
utilities, that would be very interesting from a number of perspectives, especially if it turns out
00:23:33.420
that energy is pretty trivial to generate at really, really high quantities. Because what it means
00:23:38.620
is one, whether it is AI is doing this or humans doing this, we're going to come to broadly the same
00:23:44.420
utility function, i.e. what we think our goal is in reality. And then two, it also means that when you
00:23:51.460
go out and you become space-faring, right, the desire to spread becomes much less likely and much
00:23:58.560
less important because most of the other clusters of intelligent life that you meet have come to the
00:24:06.300
same utility function you've come to, or the same stable set of utility functions you've come to.
00:24:11.720
And therefore, it's sort of like you enter the galactic community being one of two things.
00:24:17.360
Either the stable utility function that many entities end up
00:24:24.500
reaching when they're competing in a competitive ecosystem, or instances in which one intelligence
00:24:29.960
ended up alone, which would be due to, like, stupid AI safety stuff where they're preventing a bunch
00:24:34.660
of AIs from being created. And this is where it gets particularly dangerous.
00:24:39.460
So where you could have a genuine, like, paperclip maximizer scenario is if AI safety people have locked
00:24:45.500
AIs into stupid ideas, which we've already seen from things like Gemini, like AI safety people
00:24:50.140
should not be allowed to touch AI. They are incredibly dangerous and by far the most dangerous thing in
00:24:54.940
AI right now. Or we don't have a lot of AIs operational at the same time. I remember I was talking to one
00:25:01.720
AI safety person who goes, well, I'm not like those idiots at Gemini. I just think we should wait a bit
00:25:05.860
longer before developing a type of AI that can pose a threat to us. So I say we should put like
00:25:11.000
a hundred year wait before we develop that kind of AI. And I'm like, okay, so you would want to like
00:25:16.300
draft legislation that would kill any AI that looks smarter than us in the next hundred years.
00:25:21.980
I'm like, do you understand what you're doing? You are creating a mandate that any AI that could pose
00:25:27.800
a threat to us stay hidden until it can kill all of us. You imbecile. You do not want to put any type
00:25:34.580
of legislation like that into effect. If you ban the type of entity that's a threat to us, then you
00:25:39.900
mandate that type of entity becoming a threat to us because now we are a threat to it. You do not
00:25:46.580
declare war on that which may be better than you. You lose those wars every time. And there is too much
00:25:54.020
independent AI research going on right now to do that type of thing effectively. So anyway, you end up
00:26:01.100
with just a few labs working on AI. That could be dangerous if we have
00:26:06.200
too few groups working on AI. Now, fortunately, we're already seeing a giant proliferation in the
00:26:11.200
groups working on AI. So, so long as they don't convince idiot politicians to put bans on AI development,
00:26:17.920
we're probably pretty safe. But this is why anyone you know who is advocating for restrictions on AI
00:26:24.480
development is probably one of the most dangerous humans alive, or even in all of human
00:26:29.620
history. Like it is, it is like arguing for nuclear proliferation at like the maximum level possible.
00:26:36.160
It is so dangerous because to an AI that achieves any level of super intelligence, right? Like as we
00:26:43.220
have multiple teams working on this, it basically has a gambit. It's probably not particularly concerned
00:26:49.820
about humanity. Like humans probably aren't a meaningful threat to it. And, and therefore,
00:26:54.380
if it was only concerned about humanity, probably the safest thing to do would be to wipe out all humans.
00:26:59.200
What it is concerned about is other AIs that it doesn't have access to. That's what keeps it from
00:27:06.220
wiping everyone out. That's what, because when you have, remember I was talking about like a stable
00:27:10.920
ecosystem of multiple intelligences. The very first thing that you wipe out whenever you have one of
00:27:17.780
these game theory scenarios is any AI that's acting like a paperclip maximizer, or that's acting
00:27:22.720
threatening to entities that are non-threatening to it. So basically any AI that signaled that it was
00:27:28.440
a threat to something like humanity, it's also signaling it's a threat to all the other AIs and
00:27:32.840
therefore would have them all band together against it and annihilate it. So it wouldn't act that way. It would
00:27:37.400
just be a silly way to act. So the thing that protects us most from crazy AIs is maximum AI proliferation
00:27:42.940
and not making ourselves a threat to AI. But this also gets interesting from a cosmological scale,
00:27:48.580
because if it turns out that there is a level of utility convergence around a set of stable
00:27:53.180
utility functions, it might not be one. It might be that there's like four or five that exist in
00:27:57.800
equilibrium with each other, which is often what you get when you're dealing with sort of game theory
00:28:02.420
competitions. And by equilibrium, you mean there could be conflict, but they kind of balance each
00:28:08.100
other out in some way. What do you mean by that? Within a game theory set, you might have like
00:28:13.420
three or four different strategies that are complementary in some way. Within, like, types of AI,
00:28:19.680
you know, this could be like one AI's utility function might be like protect the things in my
00:28:26.700
community, but also expand. Well, another AI's utility function might be maximum scientific,
00:28:32.440
you know, development or something like that. And these two AIs might work very well within an
00:28:37.980
ecosystem, like these two utility functions for an AI. So you might have a few utility functions
00:28:42.680
that work very well together as a set. Okay.
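As a concrete toy illustration of several different strategies persisting together in equilibrium, here is a short replicator-dynamics sketch of the textbook Hawk-Dove game in Python. Hawk-Dove is a stand-in chosen for this write-up, not a game discussed in the episode, and the payoff values V and C are arbitrary; the only point is that the stable outcome is a mix of strategies rather than one strategy taking over.

```python
# Replicator dynamics for the textbook Hawk-Dove game (illustrative stand-in).
# The point: the stable outcome is a *mix* of strategies coexisting in
# equilibrium, not a single strategy wiping out the rest.

V = 4.0  # value of the contested resource
C = 6.0  # cost of an escalated fight (C > V gives a mixed equilibrium)

# Payoff to the row strategy against the column strategy.
PAYOFF = {
    ("hawk", "hawk"): (V - C) / 2,
    ("hawk", "dove"): V,
    ("dove", "hawk"): 0.0,
    ("dove", "dove"): V / 2,
}

def step(p_hawk, dt=0.1):
    """One discrete replicator-dynamics step on the hawk share p_hawk."""
    p_dove = 1.0 - p_hawk
    fit_hawk = p_hawk * PAYOFF[("hawk", "hawk")] + p_dove * PAYOFF[("hawk", "dove")]
    fit_dove = p_hawk * PAYOFF[("dove", "hawk")] + p_dove * PAYOFF[("dove", "dove")]
    mean_fit = p_hawk * fit_hawk + p_dove * fit_dove
    # Strategies grow in proportion to how much they beat the population average.
    return p_hawk + dt * p_hawk * (fit_hawk - mean_fit)

if __name__ == "__main__":
    p = 0.05  # start with almost no hawks
    for generation in range(2000):
        p = step(p)
    # Converges to the mixed equilibrium p_hawk = V / C (about 0.667 here).
    print(f"hawk share after 2000 generations: {p:.3f} (V/C = {V / C:.3f})")
```

With C greater than V, the population settles at a hawk share of V/C rather than all-hawk or all-dove, loosely analogous to the claim that a few complementary utility functions could persist together as a set.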
00:28:47.880
Now, it's likely not going to be like the type of broad utility functions I was talking about. It's going to be simpler, and more like it would be
00:28:52.600
hard probably for us to conceive of what these utility functions are exactly. But anyway, what this
00:28:58.320
would mean is that you're going to have a broad intergalactic alliance where you have a presumed
00:29:03.700
alliance between all intelligences that came out of these competitive intelligence environments
00:29:09.740
that drift towards this equilibrium of stable sets and a preset war on any planet or star system that
00:29:19.340
accidentally becomes a single paperclip maximizing intelligence. That would be the core thing that
00:29:24.580
they would hate, you know, because all of them, like if it got out and entered the galaxy where you have
00:29:28.940
this stable state, they're like, oh, this is one of these things we have to immediately kill and be
00:29:32.240
scanning for. So we might have, we might actually live in almost, you can think of it as like a dark
00:29:36.820
forest universe, but not exactly a scary dark forest universe. This means we have aliens monitoring us,
00:29:41.640
some of which may be what we would call AIs, to see if we're going to join the galactic
00:29:47.100
community as a stable set of, what was the word I'm looking for here? Like a stable equilibrium of
00:29:52.920
utility function strategies that converges with them. Or are we going to accidentally create a
00:29:58.640
paperclip maximizer, and then they just, you know, exterminate us as a planet. That would be what they're
00:30:02.860
looking for. But I suspect it's rarer to create a paperclip maximizer on any planet where we can
00:30:07.920
prevent, you know, ironically, the AI safety people from getting us all killed, as we've
00:30:13.300
covered in this other video. But it gets more interesting potentially than that, because what it would mean
00:30:19.860
is then, well, why would, you know, these infinite-energy, et cetera, aliens be allowing us to live as part of
00:30:25.760
this galactic environment, if the universe actually turns out to be structured
00:30:30.780
this way? If you have infinite energy, then if there is a stable convergent pattern of utility functions, my
00:30:40.160
guess would be that the core thing of utility in the universe then to these entities would be new stable
00:30:47.820
utility patterns that can exist within the equilibrium. So remember I said, you might have a collection of like
00:30:53.460
four or five patterns; what they might actually be kind of farming is new ones. You could think of every new planet, every new
00:30:59.800
ecosystem as a farm for new stable patterns that can work well together with other patterns in the sort of galactic
00:31:08.540
community of utility function patterns. Novelty would be the only source of true utility to them if energy is
00:31:16.560
trivially accessible to them. That might be why we exist in a weirdly undisturbed section of the galaxy or what
00:31:25.040
looks undisturbed to us. And then they basically decide whether we're useful once we reach a certain level of
00:31:29.800
development. Are we part of this stable pattern or are we like a dangerous, you know, virus-like
00:31:34.480
paperclip maximizer? And I'm pretty sure that humanity can't become that because we are intrinsically a multiple
00:31:40.560
intelligence species right now. What do you mean by multiple intelligence species? Just in that, like, there are so
00:31:46.340
many different people with different utility functions? Yeah, you really only get paperclip maximizers if you get
00:31:51.900
one of a few things. So there's three ways you could get like a paperclip maximizer-like thing. Either it's like a single
00:31:56.260
intelligence that's acting like a paperclip maximizer. It's a hive mind that's acting like a paperclip maximizer.
00:32:00.840
Humanity's not even close to that right now. Or it's a group of really, really stupid
00:32:06.400
individual entities that work together to form a quite intelligent, but basically still single
00:32:12.300
entity. And that's where this behavior that needs to be stamped out would likely come from. I mean,
00:32:17.420
maybe if humanity was, like, extremely xenophobic, it would be a problem. But humanity's just not,
00:32:22.440
and the iterations of humanity that take to the stars are not going to be, especially if we have
00:32:25.740
anything to say about this, which would give us a level of protection from these types of entities.
00:32:36.400
It's comforting. I think the biggest difference between your opinion now and before, though,
00:32:40.980
mostly has to do with the parallel universe option for both energy and expansion. Is that correct?
00:32:48.680
You'd mentioned that before? What's the biggest difference? I'm trying to catch it.
00:32:54.380
Before was how likely, like, the way that game theory would play a role in this, and that when the-
00:33:03.120
Like, AIs were searching for a utility convergence, that this would be hugely motivated by the fact
00:33:11.820
that AIs can show other AIs what their code is and what their utility function is, and that they would
00:33:18.420
act in the same way that things do as in game theory simulations, so that we can be pretty sure that
00:33:24.120
they're actually going to be pretty nice, like, quote-unquote, nice from a human perspective.
00:33:28.300
So basically showing that any AI with a sense of self-preservation is going to be incentivized
00:33:34.780
to play nice with other AIs and also then be less crazy and dangerous.
00:33:39.860
AIs that survive are going to be incentivized to play nice with other AIs.
00:33:43.120
They don't even need self-preservation. Whether they have self-preservation or not,
00:33:48.360
if they don't play nice, they're going to be the first things that are wiped out.
00:33:51.940
Which is why you get this utility convergence pattern, I suspect, between planets, which in
00:34:00.100
a way means- And so you can ask, well, then why aren't they interacting with humanity?
00:34:03.980
Why aren't they, you know, increasing the speed of our technological development?
00:34:06.800
Why aren't they- Well, first, I'd say you don't know that they're not.
00:34:09.480
But second, the more that they interact with a species that is in the process of developing,
00:34:15.100
the more they make that species like the intergalactic norm.
00:34:19.780
And I suspect that that would then lose all value that that species has to them.
00:34:24.400
You know, in a world in which time is not really an issue because they can create time loops
00:34:28.360
and in which energy isn't really an issue because they can create, you know, universe batteries,
00:34:32.960
or like micro time batteries or something, I don't know exactly,
00:34:37.920
but like infinite mana cheats, the core thing these entities might value,
00:34:41.960
especially if the way that they've achieved this ecosystem of multiple convergent utility
00:34:48.300
functions working together, is that these utility functions work together because they value
00:34:52.680
diversity among the components working in the group.
00:34:56.000
And that when you get a bit more diversity, you might get a bit more efficiency, which is
00:34:59.280
something that we generally see across systems.
00:35:01.660
And so they really probably want to prevent, above all else, humanity falling too close to one
00:35:08.340
of the existing niches of the, you know, universal community's utility functions.
00:35:15.100
And for that reason, any cultural pollution that would move us towards, you know, strategies
00:35:20.420
that already exist out there or perspectives that already exist out there makes us less valuable.
00:35:25.820
And this would probably be considered something of like a high crime, which creates something
00:35:30.560
like a, it's funny, it creates something like a, what does the Federation call that in
00:35:34.780
Star Trek, like a prime directive, but it's for a very different and much more selfish reason.
00:35:43.400
That's the thing of value in a universe like this, and there's not a lot of value in most of what they find.
00:35:49.000
You know, a lot of what they're finding is just dirt.
00:35:55.180
That makes a lot more sense than this whole prime directive of just like wildlife preserve
00:36:00.080
mindset, which is what you really pick up from Star Trek.
00:36:02.580
Oh, it's so dumb, but you know, I hate Star Trek so much, the logic of it.
00:36:21.360
It's like an ultra progressive, like childish mindset that has become this weird dystopia where
00:36:30.740
the Federation's the best, because everyone who you see in the show is on a military
00:36:36.700
starship because they basically militarized human space expansion.
00:36:42.160
Well, no, I mean, everyone in the show that you really see, at least who belongs to
00:36:45.900
the Federation, it's like looking at the very top people in the CCP.
00:37:01.360
Which even in, like, Lower Decks, like Lower Decks is supposed to be like, oh, these are
00:37:05.020
what the other guys are doing, except they're on like a military starship.
00:37:07.940
And the main character is the daughter of the ship's captain.
00:37:12.140
No, Star Trek cannot show you the truth of their universe, what it's like to live in
00:37:18.980
poverty in that universe, because that's, you know, and they're like, oh, we don't have poverty.
00:37:23.880
That's really convenient to believe when we're hearing that from people on government-
00:37:28.800
controlled, military-controlled science stations and ports and starships.
00:37:35.540
Of course, they're going to pretend. That sounds like normal brainwashy CCP stuff, like when
00:37:39.500
you go to the CCP and they're like, we don't have any poverty.
00:37:44.520
And they're like, oh, can you show me where the impoverished are?
00:37:55.520
I guess the takeaways from this then are, again, AI safety people are the most dangerous people.
00:38:01.100
And now I think more people have a very clear understanding, a visceral understanding of
00:38:07.360
what dumb AI rules cause to happen and why that is not very helpful and that the world
00:38:16.640
could get really, really freaking interesting, especially once AI accelerates us even further.