The Podcast of the Lotus Eaters - August 26, 2025


PREVIEW: Brokenomics | Living in Space with Grant Donahue: Part 2


Episode Stats

Length

9 minutes

Words per Minute

187.20238

Word Count

1,769

Sentence Count

2

Hate Speech Sentences

4


Summary

In this episode of the podcast, we discuss the dangers of AI and its impact on our world, from the practical to the ridiculous. We also discuss how AI can be harnessed and harnessed in order to create a better world.


Transcript

00:00:00.000 so having covered um living in space um from the practical to the ridiculous
00:00:14.660 um you did touch earlier on ai government you you seem to have gone a little bit
00:00:20.120 um negative on the whole thing what was going on there well it's i think that the problem with
00:00:27.000 the way that we we examine ai is we tend to over and under anthropovize it in that we tend to assume
00:00:35.060 that it can't become dangerous until it's like us and that's not true um and we also tend to assume
00:00:39.960 that it only becomes dangerous in the ways that human beings are dangerous and that's also not
00:00:43.520 true um so i understand in principle but what do you mean by that so um the classic example of the
00:00:54.400 ai of ai going wrong was the paperclip machine right like a paperclip machine is dangerous
00:00:59.960 um because and it's a toy example it's comical people laugh at it but you tell a machine you
00:01:06.000 give it a reward function you make an optimizer is the ai uh research term where its reward function
00:01:11.380 is just the more paperclips there are the higher the reward right and assuming that it's a sufficiently
00:01:17.360 powerful ai what it immediately does is kill everyone on earth and turn them into paperclips
00:01:20.800 you know you extract the iron from blood you you you convert everything because there's no bounding
00:01:25.860 to it right but that's obviously ridiculous it's it's it it the idea of the ai being given that
00:01:32.820 autonomy and not stopped at any point is silly okay okay well that that that might be silly but i can
00:01:38.340 well envisage a future in which the ai is told um the most important thing that you can do is stop
00:01:45.020 climate change by the way humans cause climate change right off you go go and run our society
00:01:50.640 yeah um and now there are ways to build ai that that aren't optimizers you can build satisficers
00:01:59.120 which essentially say if you pass a certain threshold you don't care the problem is that
00:02:03.420 satisficers tend to build optimizers and this is actually something we've proved um where you give
00:02:08.840 ai agents that behave like optimizers they'll actually produce sub-agents ais build sub-agents
00:02:14.800 all the time they already do that um because it's efficient and the problem is that optimizers are
00:02:19.860 really good at achieving results um but the the fundamental problem at the bottom of all of this
00:02:25.620 is courageability we still do not know how to create an ai which is courageable and we don't
00:02:33.560 even know how how to do it theoretically let alone practically right and what do i mean by courageable
00:02:38.240 we don't know how to create a system which is trying to optimize a goal while at the same time telling
00:02:44.240 that ai or instructing that ai such that that goal is not the final goal and it may change
00:02:49.320 because in any system where your goals can be changed you don't want your goals to be changed
00:02:57.680 right like you and i we can't really have our direct goals changed the technology doesn't exist
00:03:03.460 but imagine for a moment if i were to put this scenario to you you like steak and you like having
00:03:10.300 things that have been killed but what if i were to say here's a pill and you take this vegetarian
00:03:15.600 food will taste as good as steak forever and it'll also reprogram your brain such that you don't
00:03:20.580 care about whether things get killed or not you probably are not going to take that pill
00:03:24.740 but i could eat from that point on i could eat broccoli and it tastes like steak
00:03:29.380 yeah and and which is more it would satisfy you but it would also reprogram your brain such that you
00:03:36.440 uh you you you've never you can you you have no desire to see things die it's you're just perfectly
00:03:43.020 happy because you just you don't care about anything other than the slop you don't care about
00:03:45.900 what if what if when i went to the pharmacy you know they they reached into the counter and gave me
00:03:51.560 the wrong pill and this one i don't know maybe gay or left or something like that well let's let's
00:03:57.020 talk about that let's say they let's all let's let's say they're offering you a pill that made that
00:04:01.800 made leftism make total sense to you and and which is more if it was achieved you would be perfectly
00:04:07.260 happy in a state of euphoria forever you wouldn't accept that pill yes because there's something more
00:04:14.380 than euphoria and which is more you would probably fight within an inch of your life to prevent that
00:04:18.500 pill being administered to you yes right so you what we're demonstrating there is you don't actually
00:04:24.740 care about the sum total happiness you'll have at the end you care about your goals right now i'm not
00:04:30.220 i'm not dopamine maximizing i'm doing something else exactly right um but considering that i don't
00:04:38.880 now have a way to make people courageable how on earth could i make something which is considerably
00:04:43.360 more goal-oriented than people courageable um um right so let's say you have an ai where it wants
00:04:54.180 paperclips you give it a reward function of paperclips but you want to make it so that it knows that
00:05:00.200 really i just want enough paperclips i want enough paperclips that if i want to clip some paper
00:05:04.000 together that's fine the problem is that the more powerful you make an ai the more likely it is to
00:05:09.760 begin behaving unpredictably dangerously and deceptively not because of any ill will but because
00:05:15.400 it doesn't want to have its goals changed because it is not courageable so so imagine that you have
00:05:24.600 reason to believe or we have reason to believe that ais are sticky around goals they already are we
00:05:31.980 already have proved that weak ai that we have in a lab already behaved are already demonstrating
00:05:37.840 deceptive alignment where they they where they act as if their reward function is in one state when it
00:05:43.000 isn't that's already happening okay that does sound a bit worrying i'll grant you that so the classic
00:05:51.300 example of this is you have a a racing game where you have an ai and the ai is just blindly dropped
00:05:58.380 in it is just poking it controls blindly right but its reward function is maximizing the score
00:06:02.920 and normally the way you get a score is by winning the race but what the the ai does is it learns it can
00:06:08.920 glitch the physics engine right and it can just slam into a wall and teleport back behind it and just
00:06:15.580 complete the loops so the ai stops getting getting better at racing yes he just learns how to glitch
00:06:21.460 the physics engine which isn't what we wanted to do um there is another example this one was was humorous but
00:06:28.900 it was very concerning because it it involved giving an ai an actual mechanical arm which is it what it was
00:06:35.560 told um basically we wanted to teach an ai how to flip a pancake without dropping the pancake and
00:06:45.580 but actually flipping a pancake is very hard so how do we measure when it's failed well when the
00:06:50.440 pancake hits the ground so what's the ai do well it says well every time i just randomly input things
00:06:55.740 it drops it to the ground but if i fling it into the ceiling as hard as i can that maximizes the amount
00:07:01.740 of time before it hits the ground which is immediately what it did okay yeah i mean that's also the sort of
00:07:10.760 thing a human autist would do but okay yeah exactly and that's my point right these ais are provably not
00:07:16.120 sentient they're not intelligent but if you but you can but they can get really good at certain tasks
00:07:23.260 and ai can get better at a racing game than any human ever could so it becomes task competent but its
00:07:29.860 goals are simplistic and we can't figure out how to make an ai that changes goals and so what that's
00:07:35.200 telling us is that it's way easier to make an ai smart than it is to make an ai aligned so it's
00:07:40.660 likely to be the case that ais will get smarter faster than they will get aligned okay i see where
00:07:47.140 you're going with this now yeah yeah that's the problem which is that if an ai is good at getting
00:07:51.860 things and it's going to get better at getting things but it's bad at having its goals aligned
00:07:56.320 with us and it's not going to get better at that as quickly what's going to happen is ai are going to
00:08:00.700 become more intelligent more powerful more quickly than they're going to become human aligned
00:08:04.760 okay so this makes sense because i was going to push back and say well look to be fair we've only
00:08:08.500 been doing the ai thing for you know 18 months or whatever so you know relax a bit but if the
00:08:14.220 principle is that it gets smarter faster than it gets aligned then that holds true whether you're
00:08:20.340 doing it for the next 18 months or 18 years yes i see the problem now and we unlike many other
00:08:26.900 technologies where we get what we call a warning shot that slows us down there's reason to believe a
00:08:31.540 poorly ai poorly aligned ai wouldn't give us one so so do you do you think they're going to kill us
00:08:38.440 all then or or fate worse than death yes i suppose yes so imagine you gave it this reward function
00:08:47.000 we want human beings to be happy well how do you explain to an ai what happiness is oh it's dopamine
00:08:51.800 so it captures every human being on earth wires something into your head that gives you dopamine and
00:08:56.140 just puts a battery farm of humans somewhere right with all the humans on earth with dopamine being
00:09:01.100 zapped into their brain well that is the current world economic forum plan so i mean they've got
00:09:06.140 some competition the ais for that one exactly but that's that's why you there's actually a study
00:09:11.240 where can you look at corporations and ngos like poorly aligned ai
00:09:15.040 i like where this is going if you would like to see the full version of this premium video
00:09:21.340 please head over to lotus eaters.com and subscribe to gain full access to all of our premium content