The Peter Attia Drive - #290 ‒ Liquid biopsies for early cancer detection, the role of epigenetics in aging, and the future of aging research ｜ Alex Aravanis, M.D., Ph.D.

00:00:00.000 Hey, everyone. Welcome to the Drive podcast. I'm your host, Peter Attia. This podcast,

00:00:16.540 my website, and my weekly newsletter all focus on the goal of translating the science of longevity

00:00:21.520 into something accessible for everyone. Our goal is to provide the best content in health and

00:00:26.720 wellness, and we've established a great team of analysts to make this happen. It is extremely

00:00:31.660 important to me to provide all of this content without relying on paid ads. To do this, our work

00:00:36.960 is made entirely possible by our members, and in return, we offer exclusive member-only content

00:00:42.700 and benefits above and beyond what is available for free. If you want to take your knowledge of

00:00:47.940 this space to the next level, it's our goal to ensure members get back much more than the price

00:00:53.200 of the subscription. If you want to learn more about the benefits of our premium membership,

00:00:58.020 head over to peterattiamd.com/subscribe. My guest this week is Alex Aravanis.

00:01:06.920 Alex is the CEO and co-founder of Moonwalk Biosciences. I should note up front that I am

00:01:13.160 also an investor in and an advisor to Moonwalk Biosciences. Alex and I were colleagues in medical

00:01:20.020 school, so I've known Alex for a little over 25 years now. Before Moonwalk, Alex was Illumina's

00:01:25.600 chief technology officer, the SVP and head of research and product development, and under his

00:01:31.440 leadership, Illumina launched the industry-leading product for generating and analyzing most of the

00:01:36.980 world's genomic data. He developed large genome-based research and clinical applications, including

00:01:42.660 whole genome sequencing for rare disease diagnoses, comprehensive genomic profiling for cancer,

00:01:48.400 and for selecting optimal therapies, and the most advanced AI tools for interpreting genomic

00:01:54.420 information. Alex has been the founder of several biotech and healthcare companies, including Grail

00:01:59.780 Bio, where he served as the chief science officer and head of R&D. At Grail, he led the development of

00:02:06.180 its multi-cancer early screening test, Galleri, which we'll discuss at length in this podcast. He holds over

00:02:12.300 30 patents and serves on the scientific advisory board for several biotechnology companies.

00:02:17.240 Alex received his master's and his PhD in electrical engineering and his MD from Stanford University

00:02:23.980 and his undergrad in engineering from Berkeley. In this episode, we talk about two related things,

00:02:31.500 liquid biopsies and epigenetics. We cover the evolution of genome sequencing and tumor sequencing.

00:02:38.100 We then speak at length about Alex's work with Grail and liquid biopsies, including an understanding of

00:02:44.340 cell-free DNA, methylation, sensitivity, specificity, along with the positive and negative predictive

00:02:50.100 value of liquid biopsies. We then get into epigenetics, methylation, and the biology of aging.

00:02:56.660 This is an especially complicated topic, but truthfully, there are few topics in biology today

00:03:02.280 that excite me more than this. And I suspect that my enthusiasm will come across pretty clearly here.

00:03:08.820 So without further delay, please enjoy my conversation with Alex Aravanis.

00:03:19.020 Hey, Alex, great to be sitting down with you here today. I kind of wish we were doing this in person

00:03:24.040 because we haven't seen each other in person in a few months. And even that was sort of a chance

00:03:28.040 meeting. So I guess by way of background, you and I go back over 20 years now, I guess it's 25 years

00:03:34.880 that we both started med school together. It's hard to believe it's been that long, huh?

00:03:40.380 It seems like a million years ago, but it also seems like yesterday. Yeah, those are good times.

00:03:45.240 So Alex, one of the things I remember when we first met was that we pretty much clicked over

00:03:50.020 the fact that we were both engineers coming in. And we had a good group of friends that I remember in

00:03:54.220 medical school. And the one thing we had in common is not one of us was a pre-med. We were all kind of

00:03:58.720 whatever the term was they used to describe as non-traditional path to medical school. So let's

00:04:04.960 talk a little bit about just briefly your background. You came in as an electrical engineer and then you

00:04:09.800 did a PhD in a lab of a very prominent scientist by the name of Dick Tsien. Maybe tell folks a little

00:04:15.280 bit about what you did in that work and what it was that got you excited enough about science to

00:04:20.480 deviate off the traditional MD path.

00:04:22.640 Yeah, my PhD was in electrical engineering and Stanford has a cool configuration on the

00:04:28.900 campus where the engineering school is literally across the street from the medical school.

00:04:33.060 And so over time, I became more and more interested in applying signal processing techniques, circuit

00:04:40.020 design, imaging, AI, things like that, to problems in medicine that were more interesting

00:04:45.260 to me than some of the traditional engineering products and things like that. I met a world-famous

00:04:51.060 neuroscientist named, as you mentioned, Dick Tsien, who was very interested in fundamental questions

00:04:56.200 about the quantum unit of communication in the brain, which is the individual synaptic vesicle.

00:05:02.260 And there was a question of just what did it look like and how did it operate? And it was the

00:05:07.060 beginning for me of just applying these engineering tools to really important questions in biology and

00:05:11.460 helping answer them. That first story was a great article in Nature where we definitively answered

00:05:17.320 the question of how that quantum is transmitted between cells. And then went on to do several

00:05:22.840 other projects like that.

00:05:25.000 Can you say a little bit about that? How is that information transmitted?

00:05:28.700 It was really fun to come at these problems with an engineering and communications background.

00:05:32.620 But if you look at a central neuron in the brain, and you look at the rate at which information is

00:05:38.660 transferred, it seemed to be much faster than the number of synaptic vesicles in the terminal,

00:05:44.660 right? So there was this, well, there's only 30 synaptic vesicles in the terminal by like an

00:05:49.420 electron microscope, yet you're seeing hundreds of transmissions over a few seconds. So how is that

00:05:55.920 possible? And there were various theories. There was an individual vesicle that was fusing and staying

00:06:02.280 fused and pumping neurotransmitter through it without collapsing. And that's how you could get these

00:06:07.460 so much more rapid puffs. We came up with a cute term, which was called kiss and run to explain

00:06:13.440 the phenomenon. And it, again, helped answer this fundamental question of how the brain can have

00:06:19.740 so many small neurons that are still able to transmit so much information per individual connection.

00:06:26.780 So Alex, if you think about all the things that you learned during your PhD, I mean, I guess one of

00:06:32.040 the benefits of doing it where you did it in the lab you did it in was you overlapped with some other

00:06:38.280 really thoughtful folks, including a previous guest on the podcast, Karl Deisseroth. What do you think

00:06:43.500 were the most important things you learned philosophically, not necessarily technically,

00:06:48.700 that are serving you in the stuff we're going to talk about today? So we're going to talk today a lot

00:06:54.800 about liquid biopsies. We're going to talk a lot about epigenetics. We're going to talk a lot about

00:07:00.600 certainly technologies that have made those things possible. And when you think back to your background in

00:07:07.440 double E, what were the transferable skills? So I think one of them, and it's a saying in

00:07:12.460 engineering, which is if you can't build it, you don't understand it. So simply understanding a

00:07:16.980 description of something is not the same as you can build it up from scratch. And so you can't always

00:07:22.000 do that in biology, but you can do experiments where you're testing the concept of, can I really make it

00:07:27.760 work? And so I think that was an engineering concept that served me well a lot. Another, it's not

00:07:33.680 exclusive to engineering, but was being very first principled. Do we really understand how this

00:07:38.580 works? In that particular lab, there's a big emphasis on doing experiments where you always

00:07:45.180 learn something, where, you know, regardless of whether or not it confirmed or rejected your

00:07:50.300 hypothesis, you learn something new about the system. Don't do experiments where you may just

00:07:56.120 not learn anything. That was a very powerful way to think about things.

00:08:00.400 So we'll fast forward a bit, just for the sake of time, there's obviously an interesting detour where

00:08:06.020 after I go off to residency and after you finish your PhD, we still find ourselves back together

00:08:11.860 side by side in the same company for four years, which again, brought many funny stories, including

00:08:19.160 my favorite is you and I getting lost in the middle of Texas, actually not in the middle of Texas,

00:08:25.580 but just outside of El Paso and nearly running out of gas. I mean, this was no cell signal.

00:08:34.020 We were in trouble, but we somehow made it out of that one together.

00:08:36.900 Yes. Yeah. No, I remember that, that us Californians thought that there must be a Starbucks within,

00:08:42.960 you know, 10, 15 miles out in the middle of West Texas. And it turns out you can go hundreds of miles

00:08:48.300 without a Starbucks.

00:08:49.580 That's right. Passing a gas station with an eighth of a tank saying, we'll stop at the next one can be

00:08:55.900 a strategic error. There was also the time you bailed me out when I forgot my cufflinks because

00:09:02.240 you had some dental floss. Do you remember that? I don't know if you remember that.

00:09:06.340 There you go.

00:09:06.900 Yeah. You had some dental floss.

00:09:08.940 That's right. Yeah. Total MacGyver move. But anyway, let's fast forward through all that stuff. So

00:09:12.940 I don't know what year it is. It's got to be circa what, 2012. When do you end up at Illumina

00:09:19.380 for the first time? Early 2013.

00:09:22.560 Okay. Talk to me about that role. What was it that you were recruited to Illumina to do? And maybe

00:09:27.160 just tell folks who don't know what Illumina is a little bit about the company as well.

00:09:33.060 Yeah. So today, Illumina is the largest maker of DNA sequencing technologies. So when you hear about

00:09:41.760 the human genome being sequenced, things like expression data or RNA-seq, most liquid biopsies,

00:09:48.940 most tumor sequencing, finding genetic variants in kids with rare disease, most of that is done

00:09:55.260 with Illumina technology. So they also make the chemistries that process the DNA, the sequencers

00:10:00.860 that generate that information, and also the software that helps analyze it. So it really took that tool

00:10:07.960 from a very niche research technology to a standard of care in medicine and hundreds of thousands of

00:10:14.720 publications, and has been tremendously advancing science. So 11 years ago, you showed up there.

00:10:21.540 What was the role you were cast in? This was earlier on in Illumina's history. What attracted me to the

00:10:27.640 company and why I was recruited was to help develop more clinical applications and more applied

00:10:34.160 applications of the technology. So the technology had a use by certain sequencing aficionados for basic

00:10:41.280 research. But the company, and I agreed with the vision, felt that, hey, this could be used for a lot

00:10:46.460 more. This could be used to help every cancer patient. This could be used to help people with genetic

00:10:51.780 diseases. How can we develop the technology and other aspects of it, the assays and software, to make that

00:10:58.540 reality? I was hired to do that. It occurred to me when you even said a little bit of that, Alex, that

00:11:04.560 many of us, you and I would take for granted some of the lingo involved here, sequencing and what's

00:11:11.560 involved. But I still think it might be a bit of a black box to some people listening. And given

00:11:17.140 the topics we're going to cover today, I think just explaining to people, for example, what was done

00:11:24.860 in the late 90s, early 2000s when, quote unquote, the human genome was sequenced? What does that mean?

00:11:31.260 And how had that changed from the very first time it was done by sheer brute force in the most analog

00:11:38.240 way until even when you arrived 10, 11 years ago? So maybe walk us through what it actually means

00:11:46.480 to sequence a genome. And feel free to also throw in a little bit of background about some of the basics

00:11:52.080 of DNA and the structure, et cetera, as it pertains to that. It's some really important fundamental

00:11:56.800 stuff. A quick primer on human genetics. So in most cells of the body, you have 23 pairs of

00:12:04.220 chromosomes. They're very similar except the X and Y chromosome, which are obviously different in men

00:12:09.660 and women. Each one of those chromosomes is actually a lot of DNA packed together in a very orderly way,

00:12:17.520 where the DNA is wrapped around proteins called nucleosomes, which are composed of histones.

00:12:23.480 And then it's packed into something called chromatin, which is this mass of DNA and proteins.

00:12:28.820 And again, packed together, and then you make these units of chromosomes. Now, if you were to unwind

00:12:34.260 all of those chromosomes, pull the string on the sweater and completely unwind it, and you were to line

00:12:41.060 all of them end to end, you would have 3 billion individual bases. So the ATCG code at any given

00:12:49.860 one of those 3 billion positions, you would have a string of letters. Each one would either be A, T, C, or G,

00:12:56.040 and it would be 3 billion long. So to sequence a whole human genome is to read out that code for an

00:13:03.480 individual. And once you do that, you then know their particular code at each of those positions.

00:13:09.520 So at the end of the last century, that was considered quite a daunting task. But as I think

00:13:18.260 our country has often done, decided that it was a very worthy one to do, along with several other

00:13:24.000 leading countries that believe strongly in science. And so they funded the Human Genome Project. So all

00:13:29.240 over the world at centers, people were trying to sequence bits of this 3 billion bases to comprise

00:13:35.660 the first complete human genome. So it's just quite famous. There were two efforts. One was a public

00:13:41.580 effort led by the NIH and Francis Collins at the time. They had a particular approach where what they

00:13:48.320 were doing was they were cutting out large sections of the genome, and then using an older type of

00:13:55.540 sequencing method called capillary electrophoresis to sequence each of those individual bases.

00:14:00.740 There was a private effort led by Craig Venter and a company called Celera, which took a very

00:14:07.220 different approach, which is they cut up the genome into much, much smaller pieces, pieces that were so

00:14:13.780 small that you didn't necessarily know a priori what part of the genome they would come from, which is

00:14:19.860 why they were doing this longer, more laborious process through the public effort. But there was a big

00:14:24.800 innovation, which is they realized that if you had enough of these fragments, you could, using a

00:14:29.960 mathematical technique, reconstruct it from these individual pieces, where you could take individual

00:14:35.680 pieces, look at where they overlapped. And again, we're talking about billions of fragments here,

00:14:40.640 and you can imagine mathematically reconstructing that. Very computationally intensive, very complex.

00:14:46.600 But the benefit of that is that you could generate the data much, much faster. And so in a fraction of the

00:14:52.480 time and for a fraction of the money, they actually caught up to the public effort and then culminated

00:14:57.740 in each having a draft of a human genome around the same time in late 2000, early 2001. And then

00:15:06.020 simultaneously in Nature and Science, we got the first draft of a human genome, a milestone in science.
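
A minimal sketch of the overlap-based reconstruction described above, assuming a toy greedy merge rather than whatever method Celera actually used. The function names, the minimum-overlap threshold, and the example reads are all hypothetical, and real assemblers must also handle sequencing errors, repeats, and both DNA strands.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> str:
    """Repeatedly merge the two reads that share the longest overlap."""
    reads = list(reads)
    while len(reads) > 1:
        best_n, best_i, best_j = 0, -1, -1
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            break  # nothing left to merge; pieces stay as separate contigs
        merged = reads[best_i] + reads[best_j][best_n:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)]
        reads.append(merged)
    return max(reads, key=len)  # longest reconstructed piece


# Three shuffled, overlapping 8-base reads from a made-up 16-base "genome".
reads = ["TGGACCTA", "ACGTTGCA", "TGCATGGA"]
print(greedy_assemble(reads))
```

The pairwise comparison here is quadratic in the number of reads, which is one way to see why the real problem, with billions of fragments, was so computationally intensive.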

00:15:12.960 Alex, what were the approximate lengths of the fragments that Celera was breaking DNA down into?

00:15:19.080 They were taking chunks out in individual megabases, so like a million bases at a time. And then they would

00:15:26.800 isolate that and then deconstruct it even into smaller pieces, which were kilobase fragments,

00:15:33.200 a thousand bases at a time. And again, so they would take a piece of the puzzle, but they would

00:15:37.740 know which piece it was, and then break that into smaller and smaller ones. And then after you had the

00:15:43.240 one kilobase sequences, they would put it all back together versus just to contrast that with the

00:15:48.820 private effort, which they called shotgun sequencing, which is you just took the whole thing,

00:15:53.720 ground it up, brute force sequenced it, and then use the informatics to figure out what went where.

00:16:00.040 And in the shotgun, how small were they broken down into?

00:16:03.280 They got down to kilobase and hundred base, multi-hundred base fragments. But the key was,

00:16:09.120 all you had to do was just brute force keep sequencing, as opposed to this more artisanal

00:16:14.440 approach of trying to take individual pieces and deconstruct them and then reconstruct them.

00:16:19.260 So it's early 2001. This gets published. By the way, do we know the identity of the individual?

00:16:24.480 I think we do know the identity of the individual who was sequenced, don't we? I can't recall.

00:16:28.980 I think the original one was still anonymous and likely to be a composite of multiple individuals,

00:16:34.280 just because of the amount of DNA.

00:16:36.160 That was needed. Yeah.

00:16:37.280 Yeah. Soon after, there were individuals. Craig Venter, he may have been the first

00:16:41.720 individual who was named that we had the genome for.

00:16:45.100 Got it. It's often been said, Alex, that that effort costs, at the end of that sequencing,

00:16:51.540 if you decided, I want to now do one more person, it would cost a billion dollars directionally

00:16:56.580 to do that effort. What was the state of the art in transitioning that from where it was,

00:17:07.200 let's just say, order of magnitude, $10^9 per sequence, to where it was

00:17:13.960 10 years later, approximately? What was the technology introduction or plural version of

00:17:21.520 that question that led to a reduction? And how many logs did it improve by?

00:17:27.520 We went back and did this analysis. So if you literally at the end of the original human

00:17:31.780 genome said, Hey, I want to do one more. And you have the benefit of all the learnings from the

00:17:36.280 previous one, a few hundred million dollars would have been an incremental genome. By 2000,

00:17:43.980 well, it was low tens of thousands of dollars. So let's call that four or five logs of improvement.
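
As a rough check on the "four or five logs" figure, a "log" here is a factor of ten, so the improvement is the base-10 logarithm of the cost ratio. The dollar values below are assumptions chosen to match the phrases in the conversation ("a few hundred million", "a billion", "low tens of thousands"), not measured data.

```python
import math


def logs_of_improvement(cost_before: float, cost_after: float) -> float:
    """Number of factors of ten between two costs."""
    return math.log10(cost_before / cost_after)


incremental_genome = 300e6  # assumed stand-in for "a few hundred million dollars"
headline_figure = 1e9       # the "billion dollars" figure from the question
later_cost = 25e3           # assumed stand-in for "low tens of thousands"

print(round(logs_of_improvement(incremental_genome, later_cost), 2))  # roughly 4.1
print(round(logs_of_improvement(headline_figure, later_cost), 2))     # roughly 4.6
```

Under these assumptions, either starting point lands between four and five orders of magnitude.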

00:17:52.380 And what brought that? So the day you show up at Illumina and it's, if for research purposes,

00:17:57.680 or if a very wealthy individual said, I have to know my whole genome sequence,

00:18:02.240 and they were willing to pay $25,000 for it, or a lab was doing it as part of a clinical trial or

00:18:08.700 for research, what were they buying from Illumina to make that happen?

00:18:13.900 So it was a series of inventions that allow the sequencing reactions to be miniaturized.

00:18:19.160 And then you could do orders of magnitude, more sequencing of DNA by miniaturizing it.

00:18:24.880 The older sequencers, they had a small glass tube. And as the DNA went through, you sequenced it,

00:18:29.980 it got converted into a 2D format, kind of like a glass slide, where you had tiny fragments of DNA

00:18:36.780 stuck to it, hundreds of millions, then ultimately billions. And then you sequenced all of them

00:18:42.440 simultaneously. So there was a huge miniaturization of each individual sequencing reaction, which allowed

00:18:49.340 you to just in one system generate many, many more DNA sequences at the same time. There's a very

00:18:56.280 important chemistry that was developed called sequencing by synthesis by a Cambridge chemist,

00:19:02.080 who I know well, Shankar Balasubramanian. And he developed Illumina sequencing chemistry,

00:19:07.980 which ultimately went through a company called Solexa, which Illumina acquired. And that has

00:19:12.300 generated the majority of the world's genomics data, the original chemistry that he developed in

00:19:17.940 Cambridge.

00:19:18.340 And what was it about that chemistry that was such a step forward?

00:19:22.920 It allowed you to miniaturize the sequencing reactions. So you could have a huge number,

00:19:28.180 ultimately billions in a very small glass slide. It also allowed you to do something

00:19:33.400 which is called cyclic sequencing in a very precise and efficient and fast way, where you read off one

00:19:41.500 base at a time, and you can control it. And so you imagine you have, say, a lawn of a billion DNA

00:19:46.840 fragments, and you're on base three on every single fragment, and you want to know what base four is

00:19:51.960 on every fragment. It allowed you to simultaneously sequence just one more base on all billion

00:19:57.900 fragments, read it out across your whole lawn. And then once you read it out, add one more base,

00:20:04.640 read it all out. And so this allowed for this huge parallelization.
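
The cycle-by-cycle readout described above can be sketched as a toy simulation. This is only an illustration of the control flow (one base per fragment per cycle, across every fragment at once); it ignores the actual chemistry, imaging, and error behavior, and the function name and inputs are hypothetical.

```python
def sequence_by_cycles(fragments: list[str], n_cycles: int) -> list[str]:
    """Read every fragment one base per cycle, all fragments in lockstep."""
    reads = ["" for _ in fragments]
    for cycle in range(n_cycles):
        # One cycle: extend and read out exactly one base on every fragment.
        for i, fragment in enumerate(fragments):
            if cycle < len(fragment):
                reads[i] += fragment[cycle]
    return reads


# A tiny "lawn" of three fragments, read for four cycles.
lawn = ["ACGTTGCA", "TTGACCAT", "GGCATCGA"]
print(sequence_by_cycles(lawn, 4))
```

The number of cycles sets the read length, while the number of fragments on the surface sets how many reads come out of a run, which is where the parallelism comes from.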

00:20:09.240 Let's talk a little bit about where we are today. To my recollection, the last time I looked

00:20:15.440 to do a whole genome sequence today is on the order of $1,000, $500 to $1,000. Is that about accurate?

00:20:23.520 Yeah, that's way too expensive, Peter. Today, a couple hundred dollars.

00:20:29.140 Okay. So a couple hundred dollars today. I feel like I looked at this on a graph a while ago,

00:20:35.500 and it was one of the few things I noticed that was improving faster than Moore's Law.

00:20:40.980 Maybe tell folks what Moore's Law is, why it's often talked about. I think everybody's heard of

00:20:46.560 it. And maybe talk about the step function that it's basically, if I'm looking at it correctly,

00:20:51.940 there were two Moore's Laws, but there was something in between that became even a bigger

00:20:57.120 improvement. But maybe tell folks what Moore's Law is, first of all.

00:21:01.000 It's not like a law, like a law of physics or something like that, but it became an industry

00:21:05.400 trend in microprocessors. What it refers to is the density of transistors on a microchip and the

00:21:14.160 cost of the amount of computing power per amount of transistors. And that geometrically decreased

00:21:21.660 kind of in a steady way. Actually, I don't remember the exact number if it's like doubling every

00:21:26.980 two years or something like that. But there was a geometric factor to it that the industry

00:21:32.420 followed for decades. It's not quite following that anymore. I mean, transistors are getting down

00:21:37.320 to like the atomic scale, but went way faster than people had envisioned.

00:21:43.720 It basically started in the late 60s. And as you said, it went until it hits the limits of atomic

00:21:49.000 chemistry.

00:21:50.020 Yeah. And so that relentless push is what made the whole software engineering high-tech industry possible.

00:21:56.140 So back to my question, which is, if you just look at the cost of sequencing

00:22:00.600 from 2000 till today, it's sort of like two curves. There's the relentless curve that gets to where

00:22:08.380 we are in 2013. But then there was another big drop in price that occurred after that. I'm guessing

00:22:15.500 that had to do with shotgun sequencing or the commercialization of it. I mean, not the concept

00:22:20.020 of it, which already existed. Does that sound right?

00:22:23.380 Yeah. So when Illumina really started to deliver the higher-throughput next-generation sequencers,

00:22:29.140 it brought along a new faster curve because of the miniaturizations. So this ability to sequence

00:22:34.840 billions of fragments in a small area, I was privileged to be a big part of this effort.

00:22:40.520 And Illumina just continuing to drive the density down, the speed of the chemistry up,

00:22:45.700 all the associated optics, engineering software around it drove that much faster than Moore's law

00:22:52.360 reduction in cost.
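
To make the "faster than Moore's law" comparison concrete, here is a small sketch under loudly stated assumptions: a Moore's-law-style halving of cost every two years (the conversation only recalls this loosely), a starting cost of about $300 million in 2001, and an actual cost of about $25,000 in 2013. All three numbers are illustrative stand-ins taken from the rough figures mentioned in the conversation.

```python
def moores_law_cost(start_cost: float, years: float,
                    halving_period_years: float = 2.0) -> float:
    """Cost after `years` if cost halves every `halving_period_years`."""
    return start_cost / 2 ** (years / halving_period_years)


start_cost = 300e6        # assumed cost of an incremental genome around 2001
years_elapsed = 2013 - 2001
quoted_2013_cost = 25e3   # assumed "low tens of thousands" figure

projected = moores_law_cost(start_cost, years_elapsed)
print(f"Moore's-law pace: ${projected:,.0f}")
print(f"Quoted figure:    ${quoted_2013_cost:,.0f}")
print(f"Quoted figure is {projected / quoted_2013_cost:.0f}x below that pace")
```

With these inputs, a halving every two years gives only a 64-fold reduction over twelve years, while the quoted figures imply a reduction of roughly 12,000-fold.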

00:22:53.400 Were other companies involved in the culmination of next-gen sequencing?

00:22:58.740 Yeah, many. And some of them are still around. None nearly as successful as Illumina,

00:23:03.700 but also some important players there.

00:23:06.520 And today that's the industry standard. I assume there's no sequencing that's going on

00:23:10.360 that isn't next-gen?

00:23:12.400 No, the vast majority is next-gen sequencing. There's niche applications where there's other

00:23:17.680 approaches, but 99% of the data being generated comes from some version of next-generation

00:23:22.680 sequencing.

00:23:24.220 Got it. So you mentioned a moment ago that part of the effort to bring you to Illumina was

00:23:31.080 presumably based on not just your innate talents, but also the fact that you came from a somewhat

00:23:37.360 clinical background as well. You're an MD and a PhD. And if their desire is to be able to branch out

00:23:42.840 into clinical applications, that would make for a natural fit. So where in that journey did the

00:23:48.580 idea of liquid biopsies come up? And maybe talk a little bit about the history of one of the

00:23:54.160 companies in that space that we're going to talk about today.

00:23:56.840 So to start with that, I should talk about first tumor sequencing, which predated liquid biopsy.

00:24:02.260 A couple of companies, most notably Foundation Medicine, developed using Illumina technology,

00:24:08.260 developed tumor sequencing. So there had been some academic work, but they tried to develop it and

00:24:13.920 were the first to do it successfully as a clinical product. What you can imagine is there's these

00:24:19.000 genes that are implicated in cancer that often get mutated. Knowing which mutations a tumor has has

00:24:25.380 big implications for prognosis, but also for treatment. Over time, we have more and more targeted

00:24:31.000 therapies where if your tumor has a very particular mutation, it's more likely to respond to certain

00:24:36.940 drugs that target that type of tumor. And at the time, as more and more of these mutations were

00:24:43.640 identified that could be important in the treatment of a tumor, it was becoming impractical to say,

00:24:49.980 do a PCR test for every mutation. So imagine there's 100 potential mutations you'd like to know about

00:24:56.180 if a patient has in their tumor and their lung cancer, doing each of these individually. Again,

00:25:02.160 a lot of expense, a lot of false positives. And so what companies like Foundation Medicine did was say, hey,

00:25:08.360 why don't we just sequence all of those positions at once given next generation sequencing? So they

00:25:13.740 would make a panel to sequence, say, 500 genes or a few hundred genes, the ones that are most important

00:25:19.460 in most solid cancers. And then they would sequence them. And then in one test, they would see the vast

00:25:25.220 majority of the potential mutations that could be relevant to treatment for that cancer patient.

00:25:30.740 And so that is still a very important tool in oncology today. A large fraction of tumors are

00:25:37.160 sequenced. And that's what allows people to get access to many types of drugs. Many of the targeted

00:25:43.360 therapies for lung cancer, melanoma, or you hear about things like microsatellite instability or high

00:25:50.720 mutational burden, that all comes from tumor sequencing. Once that was established, then a few

00:25:57.600 folks, most notably at Johns Hopkins, but also other places, started to ask the question, well,

00:26:02.960 you know, could we sequence the tumor from the blood? And you might say, well, hey, you have a tumor in

00:26:08.020 your lung. Why would sequencing blood be relevant to looking at the tumor? Well, it turns out there is

00:26:14.800 tumor DNA in the blood. And this is interesting. So in the late 40s, it was first identified,

00:26:20.720 that there was DNA in the blood outside of cells, so-called cell-free DNA. And then in the 70s,

00:26:27.040 it was noticed that cancer patients had a lot of DNA outside their cells in the blood, and that some

00:26:34.540 of this was likely from tumors, from the cancer itself. If you know anything about tumor biology,

00:26:41.560 you know that cancer cells are constantly dying. So you think of cancers as growing very quickly,

00:26:46.220 and that's true, but they actually are dying at an incredible rate because it's disordered growth.

00:26:52.640 So many of the cells that divide have all kinds of genomic problems. So they die or they're cut

00:26:58.260 off from vasculature. But the crazy thing about a tumor is, yes, it's growing fast if it's an aggressive

00:27:04.120 tumor, but also the amount of cell death within that tumor is very high. And every time one of those

00:27:10.500 cells die, some of the DNA has the potential to get into the bloodstream. And so it was this insight

00:27:16.940 along with the tumor sequencing that said, hey, what if we sequence this cell-free DNA? Could we

00:27:22.920 end up sequencing some of the tumor DNA or the cancer cell DNA that's in circulation?

00:27:28.940 Early results, particularly from this group at Johns Hopkins, began to show that indeed that was

00:27:35.060 possible. And then a few companies, again, using Illumina technology, and then we started doing

00:27:41.080 it at Illumina also, our own liquid biopsy assays and tests and technologies developed what became

00:27:47.600 liquid biopsy. In this context, it was for late-stage cancer. So it was for patients who

00:27:52.740 were diagnosed with a cancer. You wanted to know, did their tumor have mutations? And you could do it from

00:27:58.180 the blood. There was a big benefit, which was, as you know, for lung cancer, taking a biopsy can be a

00:28:04.200 very dangerous proposition. You can cause a pneumothorax. You can land someone in the ICU.

00:28:11.220 You know, in rare cases, it can lead to death in that type of procedure. And so the ability to

00:28:16.180 get the mutational profile from the blood was really attractive. And so that started many companies

00:28:23.360 down the road of developing these liquid biopsies for late-stage cancers.

00:28:28.380 So Alex, let's talk about a couple of things there. Tell me the typical length of

00:28:34.080 a cell-free DNA fragment. How many base pairs, or what's the range?

00:28:38.840 Yeah, it depends on the exact context, but around 160 base pairs. So that's 160 letters of the ATCG

00:28:46.660 code. And there's a very particular reason it's that length, which is that if you pull the string

00:28:53.140 on the sweater, you unwind the chromosome, and you keep doing it until you get down to something

00:28:58.800 around 160 base pairs, what you find is that the DNA, right, it's not just naked, it's wrapped around

00:29:05.320 something called a nucleosome, which is an octamer or eight of these histone proteins in a cube,

00:29:12.500 and the DNA is wrapped around it twice. And that's the smallest unit of chromatin of this larger chromosome

00:29:19.720 structure. And so the reason it's 160 bases is that's more or less the geometry of going around

00:29:26.380 twice. And so DNA can be cleaved by enzymes in the blood, but that nucleosome protects the DNA from

00:29:35.540 being cut to anything smaller than about 160 base pairs. And does that mean that the cell-free DNA

00:29:42.600 that is found in the blood is still wrapped around the nucleosome twice, like it's still clinging to

00:29:50.100 that and that's what's protecting it from being cleaved any smaller?

00:29:52.520 That's right.

00:29:53.860 You mentioned that obviously the first application of this was presumably looking for ways to figure

00:30:02.220 out what the mutation was of a person with late-stage cancer without requiring a tissue

00:30:07.060 biopsy. Presumably by this point, it was very easy to gather hundreds of 160 base pair fragments and

00:30:17.180 use the same sort of mathematics to reassemble them based on the few overlaps to say this is the actual

00:30:23.780 sequence because presumably the genes are much longer than 160 base pairs that they're looking at.

00:30:30.100 That's right. So by this point in 2014, 2015, the informatics was quite sophisticated. So you could

00:30:38.880 take a large number of DNA sequences from fragments and easily determine which gene it was associated with.

00:30:46.060 At some point I recall in here, I had a discussion on the podcast maybe a year and a half ago,

00:30:53.220 two years ago with Max Diehn, another one of our med school classmates, about looking at recurrences in

00:31:00.960 patients who were clinically free of disease. So you took a patient who's had a resection plus or minus

00:31:08.900 some adjuvant chemotherapy. And to the naked eye and to the radiograph, they appear free of disease.

00:31:16.120 And the question becomes, is that cancer recurring? And the sooner we can find out, the better our chance

00:31:22.940 at treating them systemically again, because it's a pretty well-established fact in oncology that the

00:31:29.380 lower the burden of tumor, the better the response, the lower the mutations, the less escapes, etc.

00:31:35.640 And so did that kind of become the next iteration of this technology, which was,

00:31:41.860 if we know the sequence of the tumor, can we go fishing for that particular tumor in the cell-free

00:31:48.460 DNA?

00:31:49.640 Yeah, yeah. Broadly speaking, there's kind of three applications from looking at tumor DNA in the

00:31:54.360 blood. One is screening, which we'll talk about later, which is people who don't have cancer,

00:31:59.140 or 99% who don't, and trying to find the individual who has cancer, an invasive cancer,

00:32:04.980 but doesn't know it. There's this application of what we call therapy selection, which is you're a

00:32:09.300 cancer patient trying to decide which targeted therapy would be best for you. And then this other

00:32:15.520 one you mentioned is a third application we call often minimal residual disease. We're looking at

00:32:21.480 monitoring a response, which is you're undergoing treatment, and you want to know,

00:32:26.520 is the amount of tumor DNA in the blood undetectable? And also its velocity, is it changing?

00:32:33.160 Because as you mentioned, that could tell you, is your treatment working, the tumor DNA burden or

00:32:38.980 load is going down? Is it undetectable, and you're potentially cured that there's no longer that source

00:32:45.820 of tumor DNA in your body? Or is it present even after a treatment with intent to cure, and that in the

00:32:54.900 presence of that tumor DNA still means basically, and we appreciate this now, unfortunately, you have

00:33:01.360 not been cured, but that patient hasn't been cured, because there is some nidus of tissue somewhere

00:33:06.220 that still harbors these mutations, and therefore is the tumor, even if it's not detectable by any other

00:33:12.900 means.

00:33:13.300 So at what point does this company called Grail that we're going to talk about, at what point does

00:33:23.520 it come into existence, and what was the impetus and motivation for that as a distinct entity outside

00:33:30.280 of Illumina?

00:33:31.680 So there were several technological and scientific insights that came together, along with, as often

00:33:38.520 in this case, some really bold entrepreneurs and investors. The use of this liquid biopsy technology

00:33:46.580 in late-stage cancers, it was clearly possible to sequence tumors from the blood, and it was clearly

00:33:52.340 actually the tumor DNA, and it was useful for cancer patients. So we knew that there was tumor DNA, we knew

00:33:58.740 it could be done, but what the field didn't know is, could you just see this in early-stage cancers,

00:34:03.660 localized cancers that were small? Not a lot of data on that, but there was the potential.

00:34:10.720 There was also a really interesting incidental set of findings in a completely different application

00:34:16.300 called non-invasive prenatal testing. Again, totally different application, but it was discovered

00:34:22.700 principally by a scientist in Hong Kong named Dennis Lo that you could see fetal DNA in the blood,

00:34:30.300 or more specifically placental DNA in the blood, and it was also cell-free DNA. What he developed,

00:34:37.880 actually, along with one of our professors at Stanford, Steve Quake, was a technique to look

00:34:43.840 for trisomies in the blood based on this placental or fetal DNA, and this is called non-invasive

00:34:50.060 prenatal testing. And so what you do is you sequence the cell-free DNA fragments in a pregnant woman,

00:34:56.540 you look at the DNA, and if you see extra DNA, for example, at the position of chromosome 21,

00:35:04.560 well, that indicates that there are tissues in the woman, presumably the fetus or placenta, that are

00:35:10.700 giving off extra chromosome 21. And so this ended up being an incredibly sensitive and specific way

00:35:18.380 to test for the presence of trisomies, chromosome 21, 18, 13, early in pregnancy. And it's had a

00:35:26.740 tremendous impact. I was also involved in subsequent iterations of the test. In the United States,

00:35:31.860 it decreased amniocentesis by about 80% because the test is so sensitive and specific as a screen

00:35:38.940 that many, many women have now not had to undergo amniocentesis and the risks around. Again,

00:35:45.240 totally different application of cell-free DNA. But what happened is during the early

00:35:51.380 commercialization of about the first few hundred tests, the companies pioneering this, and one of

00:35:56.620 them was a company called Verinata that Illumina acquired, began to see in rare cases, very unusual

00:36:03.520 DNA patterns. It wasn't just a chromosome 21 or 18 or 13, but what's often called chromothripsis,

00:36:13.000 which is many, many aberrations across chromosomes. The two women who really did this analysis and

00:36:21.660 really brought both Illumina and the world's attention to it were Meredith Halks-Miller,

00:36:26.560 a pathologist and lab director at this Illumina-owned company, Verinata, and another

00:36:31.500 bioinformatics scientist, Daria Chudova. What they showed is, ultimately, that these women actually

00:36:38.420 had cancer. They were young women of childbearing age. They ultimately had healthy children,

00:36:44.840 but they had an invasive cancer and it was being diagnosed in their cell-free DNA by this

00:36:51.540 non-invasive prenatal test. And as they began to show these patterns to people, it became clear that

00:36:58.280 they were clearly cancer. If you have many, many chromosomes that are abnormal, that's just not

00:37:03.980 compatible with life or a fetus. And so when you saw this just genome-wide chromosomal changes,

00:37:11.500 it was very clear that we're incidentally finding cancer in these women.

00:37:15.680 Let's talk a little bit about that, actually, because I want to dig into that. It's so interesting.

00:37:19.760 So let's take a step back. So again, whenever you say we're sampling for cell-free DNA,

00:37:25.140 we should all be keeping in the back of our mind, we're looking for these teeny tiny little

00:37:29.440 160 base pair fragments wrapped around little nucleosomes. Now, let's just go back to the

00:37:36.120 initial use case around trisomy 21. With 160 base pairs, is that sufficient to identify any one

00:37:44.080 chromosome? Presumably, you're also sampling maternal blood, so you know what the maternal

00:37:49.220 chromosomes look like, and you're presumably juxtaposing those two as your control. Is that

00:37:55.500 part of it? Not quite. So it's all mixed together. So in a pregnant woman's blood and maternal blood,

00:38:02.140 it's a mixture. So you have cell-free DNA. The majority of the cell-free DNA is from her

00:38:07.340 own cells and tissues. And then you have superimposed on that a bit of cell-free DNA from

00:38:13.720 mostly the placenta. And so what you're seeing is this mix of cell-free DNA. And then what you do is

00:38:20.500 you sequence. There's different ways to do it, but the most common way is you do shotgun sequencing,

00:38:25.140 and you sequence millions of these fragments. And every time you sequence a fragment,

00:38:30.860 you place it in a chromosome based on its sequence. Your first fragment, you say,

00:38:35.760 hey, when I compare this to the draft human genome, this goes on chromosome two.

00:38:39.840 You sequence your third fragment and you say, hey, this sequence looks like chromosome 14.

00:38:44.960 And you keep putting them in the chromosome buckets. And what you expect, if every tissue has an

00:38:53.200 even chromosome distribution, you know, or two chromosomes, is that that profile would be flat

00:38:57.840 and each bucket would be about the same level. But what you see in a woman carrying a fetus that

00:39:04.460 has a trisomy... You'll see 50% greater in the chromosome 21 bucket.

00:39:09.920 You actually see more like 5% or 10%. Because again, remember, 90% of it might be maternal blood,

00:39:15.760 right? So that's all going to be even. But within the 10% fetal, you're going to have an extra 50%.

00:39:21.480 So the total might be an extra 5% or 10%. But that's a whopping big signal and very easy to detect.

00:39:29.280 Isn't it interesting? It just gives a sense of how large the numbers are if a 5% delta

00:39:34.620 is an off the charts, unmistakable increase in significance. I want to make sure again,

00:39:40.940 people understand what you just said, because it's very important. Because the majority of the

00:39:45.280 cell-free DNA belongs to the mother, and because the fetal slash placental cell-free DNA is a trivial

00:39:51.820 amount, even though by definition a trisomy means there is 50% more of one chromosome, you've gone

00:39:59.360 from two to three copies. In the fully diluted sample, that might only translate to a few percent.

00:40:07.140 But that's enough, given the large numbers that you're testing, to be a definitive,

00:40:13.820 statistically significant difference that triggers a positive test.

00:40:18.240 Yep. Well put. Yes.
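
The dilution arithmetic just described can be sketched in a few lines of Python. This is an illustration only, not the method used by any particular test. The function names, the 10 million read count, and the roughly 1.5% share of reads landing in the chromosome 21 bucket are assumptions made for the example, and real assays have noise sources beyond the simple counting noise modeled here.

```python
import math

def trisomy_excess(fetal_fraction: float) -> float:
    """Relative excess of chromosome 21 DNA in the mixed maternal sample.

    Fetal DNA carries three copies instead of two (50% more);
    maternal DNA is unchanged, so the excess is diluted by the fetal fraction.
    """
    return fetal_fraction * 0.5

def z_score(total_reads: int, chrom_share: float, fetal_fraction: float) -> float:
    """How many standard deviations the trisomic bucket sits above a flat profile.

    Assumes idealized binomial counting noise (hypothetical simplification).
    """
    excess = trisomy_excess(fetal_fraction)
    # Share of reads in the bucket once the extra chromosome copies are included.
    trisomic_share = chrom_share * (1 + excess) / (1 + chrom_share * excess)
    expected = total_reads * chrom_share
    observed = total_reads * trisomic_share
    sd = math.sqrt(total_reads * chrom_share * (1 - chrom_share))
    return (observed - expected) / sd

if __name__ == "__main__":
    for ff in (0.10, 0.20):  # fetal fractions in the range mentioned in the conversation
        print(f"fetal fraction {ff:.0%}: excess {trisomy_excess(ff):.1%}")
    # Assumed: 10 million placed fragments, about 1.5% of them on chromosome 21.
    print(f"z-score: {z_score(10_000_000, 0.015, 0.10):.1f}")
```

Under these assumptions, a 10% fetal fraction gives a 5% excess, and with millions of fragments counted that small shift sits many standard deviations above a flat profile, which is why it reads as an unmistakable signal.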

00:40:20.480 Alex, I want to come back to the story, because this is clearly the beginning of the story.

00:40:24.300 But let's come back to just a couple other housekeeping items.

00:40:27.580 A moment ago, we talked about cell-free DNA in the context of tumor. Someone might be listening to us

00:40:31.720 thinking, wait, guys, you just said that the majority of the cell-free DNA is from this mother.

00:40:37.120 99.9% of the time, she doesn't have cancer. Where is that cell-free DNA coming from? 1.00

00:40:42.340 When cells are destroyed, either through necrosis or apoptosis, there's a lot of cell turnover,

00:40:48.600 right, of cells that replicate, especially epithelial cells, blood cells, and so on. As the natural

00:40:55.440 biochemistry destroys them, some of the DNA from the nucleus ends up in circulation. Again,

00:41:01.720 where they're wrapped around these nucleosomes. So it's essentially cell death and cell turnover

00:41:07.180 is the source of it. And since, again, at any one time, there's millions of cells dying and being

00:41:13.200 turned over, there's always some base-level cell-free DNA in the blood.

00:41:18.280 And again, I don't know if you've ever done the calculation. If not, I don't mean to put you on

00:41:21.540 the spot. But do you have an approximate guess for how many base pairs of cell-free DNA are floating

00:41:28.180 around your body or my body as we sit here right now?

00:41:31.160 What I can say is, if you took a 10 mL blood tube, which is a lot of what these tests use,

00:41:37.100 and you remove all the cellular DNA, remember, there's a ton of DNA in the cells in circulation.

00:41:41.700 Sure. The white blood cells, the red blood cells, et cetera. Get rid of all that. Yep.

00:41:45.380 Huge amount. You probably have on the order of a few thousand cells worth of cell-free DNA

00:41:51.160 in a 10 mL blood tube, which isn't a lot. Just to make sure I understand you, you're saying

00:41:56.640 a few thousand cells worth. Each cell would be 3 billion base pairs.

00:42:02.500 Yes. Yes.
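
As a rough back-of-the-envelope sketch of those numbers, the Python below uses the round figures from the conversation: a few thousand cells' worth of DNA per tube (3,000 is assumed here), 3 billion base pairs per cell, and fragments of about 160 base pairs. The function names and the example variant fraction are hypothetical and chosen only for illustration.

```python
BASES_PER_CELL = 3_000_000_000  # round figure used in the conversation
FRAGMENT_LENGTH = 160           # approximate cell-free DNA fragment length

def cfdna_budget(cell_equivalents: int) -> tuple[int, int]:
    """Return (total base pairs, approximate fragment count) for one tube."""
    total_bases = cell_equivalents * BASES_PER_CELL
    fragments = total_bases // FRAGMENT_LENGTH
    return total_bases, fragments

def expected_variant_copies(cell_equivalents: int, variant_fraction: float) -> float:
    """Expected number of fragments carrying a variant at one genomic position.

    Each position is represented roughly once per cell equivalent, so the
    expected count is simply copies times the fraction carrying the variant.
    """
    return cell_equivalents * variant_fraction

if __name__ == "__main__":
    total, frags = cfdna_budget(3_000)  # assumed value for "a few thousand"
    print(f"{total:,} base pairs in about {frags:,} fragments")
    # Hypothetical variant present in 1 of every 100,000 copies.
    print(expected_variant_copies(3_000, 1e-5))
```

The total is trillions of base pairs, yet any single position in the genome is covered only a few thousand times, so a variant present at a very low fraction may not appear in the tube at all.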

00:42:03.980 Wow. On the one hand, it doesn't sound like a lot because there are billions of cells. On the other

00:42:10.180 hand, it still sounds like a lot. That's still a big computational problem.

00:42:14.420 Where it becomes challenging is when we get into early detection, right? Where if you think about it,

00:42:19.720 for any position in the genome, you only have a few thousand representations of it because there's

00:42:27.300 only a few thousand cells. That starts to limit your ability to detect events that occur at one

00:42:34.380 in a million or one in a hundred thousand. Alex, do you recall these incident cases of the pregnant

00:42:43.960 mothers? Again, I guess we should probably go back and re-explain that because it's such an

00:42:48.900 important and profound discovery. There were a handful of cases where in the process of screening

00:42:54.460 for trisomies, they're discovering not that the mother has additional chromosomes that can be

00:43:02.640 attributed to the fetus, but that she has significant mutations across a number of genes that

00:43:12.700 also are probably showing up in relatively small amounts because they're not in all of her cells.

00:43:19.080 Is that correct?

00:43:20.500 Yeah. Yeah. So you might expect a flat pattern, right? In the majority of cases, or when the fetus

00:43:26.740 has a trisomy, you see these very well-known accumulations, mostly in 21, but occasionally in

00:43:33.260 18 or 13. And instead what you see is just increases and decreases monosomies and trisomies

00:43:40.120 across many, many chromosomes, which is just not compatible with life even as a fetus. But there

00:43:47.120 is a biology where you do see these tremendous changes in the chromosomes. And that's often in the

00:43:53.720 case of cancer.

00:43:55.240 Do you recall what those cancers turned out to be in those young women? I mean, I assume they

00:43:59.880 were breast cancers, but they could have been lung cancers, anything?

00:44:03.360 Yeah. So Meredith and Daria, they published a paper in JAMA, which for anyone interested,

00:44:08.480 details these 10 or so cases and what happened in each of them. It was a mix. I think there was

00:44:14.580 a neuroendocrine, uterine, some GI cancers. It was a smattering of different things.

00:44:20.860 And what was the approximate year of that? We'll try to find that paper and link to it in the show

00:44:24.800 notes.

00:44:25.580 It was 2015 in JAMA.

00:44:27.360 Got it. That seems unbelievable.

00:44:29.880 Of course, one doesn't know the contrapositive. One doesn't know how many women had cancer but

00:44:37.660 weren't captured. But is it safe to assume that the 10 who were identified all had cancer?

00:44:45.280 Yes. Yes.

00:44:46.420 So there were no false positives. We just don't know how many false negatives there were.

00:44:50.180 Right. Yeah. This is one of the things that contributed to the evidence that cancer screening

00:44:56.580 might be possible using cell-free DNA, which is these incidental findings. As I mentioned earlier,

00:45:02.180 we already knew that, yes, tumors do put cell-free DNA into the bloodstream. But this was a profound

00:45:08.300 demonstration that in actual clinical practice, you could find undiagnosed cancers in asymptomatic

00:45:15.240 individuals. And that it was highly specific, meaning that when it was found using this method,

00:45:20.940 it almost, well, I think in those initial ones, it was every case, but almost every case turned out

00:45:26.000 to have cancer. Now, to your point, it's not a screening test because even in relatively healthy

00:45:33.720 and women of childbearing age, a population of 100,000, you expect epidemiologically 10 times or

00:45:41.160 so or 50 times that number of cancers over a year or so. So clearly you're missing the majority of

00:45:48.080 cancer. So it's not a screening test. Right. It was just a proof of concept though.

00:45:52.860 Yeah. An inadvertent proof of concept that really raised, at Illumina and I think in the field,

00:45:57.800 our attention of, hey, using cell-free DNA and sequencing based methods, it might be possible

00:46:03.240 to develop a very specific test for cancer. So what was the next step in the process of

00:46:10.480 systematically going after addressing this problem? Myself and some other folks at Illumina,

00:46:16.040 along with the two scientists I mentioned, Meredith and Daria, and then also in particular,

00:46:21.960 the CMO at the time, Rick Klausner, who had a very long history in cancer research and in cancer

00:46:29.500 screening. He was the previous NCI director under Bill Clinton. So that's the National Cancer

00:46:35.320 Institute at the NIH under Bill Clinton. And he was the CMO at Illumina at the time. And we started to

00:46:41.340 talk more and more about what would it take to develop or determine the feasibility of a universal

00:46:48.680 blood test for cancer based on this cell-free DNA technology. And being very first principle,

00:46:55.180 I really asked the question, well, why is it in 50 years of many companies and a tremendous amount

00:47:01.140 of academic research, no one had ever developed a broad-based blood test for cancer? Not just many

00:47:08.140 cancers, let alone any cancer. Really, the only example is PSA. And again, the false positive

00:47:14.600 rates there are so high that its benefit to harm has been questioned many times. And that's why it

00:47:20.980 doesn't have a USPSTF grade A or B anymore. And the fundamental reason is specificity. So there's lots

00:47:28.480 of things that are sensitive, meaning that there are proteins that accumulate, biochemistries, metabolites

00:47:34.460 that go up in cancer. But the problem is they go up in a lot of benign conditions. So, you know,

00:47:39.840 a big benign prostate spews out a lot of PSA. And pretty much every other protein or metabolite does

00:47:45.900 that. The biomarkers to date were all very sensitive, but all had false positive rates of,

00:47:53.540 say, 5% or 10%. And so if you're imagining screening the whole population, you can't be working up one of

00:48:00.560 10 people for a potential cancer. And so the key technological thing to solve was, well, how do

00:48:07.100 you have something that has a 1% false positive rate or a half percent false positive rate? Because

00:48:12.840 that's what you need to get to if you want to do broad-based cancer screening in relatively healthy

00:48:18.940 asymptomatic people. And this is why we thought it might be possible with cell-free DNA because

00:48:25.220 the tumor DNA could be more specific than proteins and other things that are common in benign disease.
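
To make the false-positive argument concrete, here is a minimal Python sketch of the screening arithmetic. The 1% prevalence echoes the "99% who don't" figure from earlier in the conversation; the 50% sensitivity, the population size, and the function name are assumptions made purely for illustration.

```python
def screening_outcomes(population: int, prevalence: float,
                       sensitivity: float, false_positive_rate: float):
    """Return (true positives, false positives, positive predictive value)."""
    with_cancer = population * prevalence
    without_cancer = population - with_cancer
    true_pos = with_cancer * sensitivity
    false_pos = without_cancer * false_positive_rate
    ppv = true_pos / (true_pos + false_pos)
    return true_pos, false_pos, ppv

if __name__ == "__main__":
    # Compare a 10% false positive rate with a 0.5% false positive rate.
    for fpr in (0.10, 0.005):
        tp, fp, ppv = screening_outcomes(100_000, 0.01, 0.50, fpr)
        print(f"FPR {fpr:.1%}: {tp:.0f} true positives, "
              f"{fp:.0f} false positives, PPV {ppv:.1%}")
```

With these assumed inputs, a 10% false positive rate sends thousands of healthy people to a workup for every few hundred real cancers, while a 0.5% rate brings false positives down to roughly the same number as true positives.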

00:48:33.440 And so that was the reason to believe. The things we didn't know is, well, how much DNA does an early

00:48:39.480 stage tumor pump out? If it doesn't pump out any, well, there's nothing to detect. The other is the

00:48:44.620 heterogeneity. Cancer is not like infectious disease where there's one very identifying antigen or sequence.

00:48:52.680 Every tumor is truly unique, right? So even two lung cancers that are both the same

00:48:57.600 histological subtype, they can share very few mutations or none. So you can have two squamous

00:49:03.860 cell lung cancers that honestly don't have a single shared mutation. So now you need to look at

00:49:09.960 hundreds or thousands or even millions of positions to see enough potential changes.

00:49:15.540 And this is where, again, NGS was a really good fit, which is how do you overcome the heterogeneity

00:49:21.340 that you need to now look for a disease that isn't defined? I can't tell you these three mutations

00:49:27.140 are the ones you need to find for this cancer. There's a huge set of different ones for every

00:49:33.040 cancer. And then that got us thinking, well, look, in addition to sequencing many positions and

00:49:39.000 sequencing very deeply and using cell-free DNA, we were going to need to use AI or machine learning

00:49:44.600 because we had to learn these complex associations and patterns that no human being could curate

00:49:52.000 thousands of different mutational profiles and try to find the common signals and so on.

00:49:58.320 What emerged over the course of a year is, look, this might be possible, but we're going to have to

00:50:03.980 enroll very large populations just to study and find the signals and develop the technology.

00:50:09.920 And then we're going to need very large studies to actually do interventions and prove it clinically

00:50:15.720 valid that it actually works. We're going to have to use NGS and sequence broadly across the whole

00:50:22.080 genome. And only then might it be possible. And so the company at the time decided, and this was a

00:50:30.320 board-level decision, that ultimately this made more sense as an independent company, given the amount

00:50:37.640 of capital that was going to be required, given the scientific and technical risk, given the kind

00:50:43.240 of people that you would need to recruit who were passionate about this, that it made sense to do

00:50:48.240 as a separate company. And so the CEO at the time, Jay Flatley, in early 2016 announced the founding of

00:50:56.100 the company and then spinning it out of Illumina. I had the honor of being one of the co-founders of it.

00:51:02.100 Let's go back to 2016. You guys are now setting up new shop. You've got this new company. It's called

00:51:08.420 Grail. You've brought over some folks like you from Illumina, and presumably you're now also

00:51:15.340 recruiting. What is the sequence of the first two or three problems you immediately get to work on?

00:51:23.100 As I wrote the starting research and development plan, the way I wrote it was we needed to evaluate

00:51:29.420 every potential feature in cell-free DNA, meaning that any known method of looking for cancer in

00:51:36.680 cell-free DNA, we needed to evaluate. That if we were going to do this and recruit these cohorts and

00:51:42.060 all these altruistic individuals, and we were going to spend the money to do this, we needed to not look

00:51:46.620 at just one method or someone's favorite method or whatever they thought might work. We needed to look

00:51:52.040 at every single one. And so that's what we did. We developed an assay and software for mutations and

00:51:59.300 then a bunch of other things, chromosomal changes, changes in the fragment size, and many others. And we

00:52:05.740 said, look, we're going to test each one of these head-to-head, and we're going to test them in

00:52:09.160 combination, and we're going to figure out the best way to do this. We even had a mantra that Rick came up with

00:52:15.720 that I thought was very helpful, which is we're either going to figure out how to do this, or we're going to prove

00:52:19.940 it can't be done. I think that was very helpful in thinking about how to do these initial experiments.

00:52:25.100 So it was a lot of building these assays. We needed massive data sets to train the machine learning

00:52:30.360 algorithm. So we had this study called the CCGA, the Circulating Cell-Free Genome Atlas, where we

00:52:35.560 recruited 15,000 individuals with and without cancer of every major cancer type, and in most cases,

00:52:42.940 hundreds. And then we tested all of these different methods, the ones I mentioned, and also

00:52:49.020 importantly, a methylation-based assay. And we did blinded studies to compare them and see could

00:52:55.060 any of them detect a large fraction of the cancers? Did any of them have the potential to do it at high

00:53:00.200 specificity? Because that's what we would need if we were going to develop a universal test for cancer

00:53:06.100 that could be used in a broad population. So let's kind of go back and talk about a few of those things

00:53:11.300 there because there was a lot there. So you said up front, look, we're going to make sure that any

00:53:18.180 measurable property of cell-free DNA, we are measuring, we are quantifying it, we are noting

00:53:24.880 it. We talked about some of them, right? So fragment length, that seems relatively fixed, but presumably

00:53:30.560 at large enough sample size, you're going to see some variation there. Does that matter?

00:53:35.240 The actual genetic sequence, of course, that's your bread and butter, to be able to measure that.

00:53:41.320 You also mentioned, of course, something called methylation, which we haven't really talked about

00:53:45.980 yet. So we should explain what that means. Were there any other properties besides fragment length,

00:53:52.280 sequence, and methylation that I'm missing? There were several others. One was chromosomal changes.

00:53:57.580 So as we mentioned in cancer, the numbers of chromosomes often change. So many cancers,

00:54:03.980 and this is wild, they'll often double the number of chromosomes. So you can go from 23 to

00:54:10.200 double or even triple the number. But these chromosomes are not normal. So you'll often have

00:54:16.440 arms or the structures of chromosomes will get rearranged. And so there's a way to look at that

00:54:22.620 also in the cell-free DNA. Like as we mentioned in the non-invasive prenatal testing, where you look at

00:54:27.300 the amount of DNA per chromosome or per part of chromosome. So we looked at what's called these

00:54:32.800 chromosomal abnormalities. We also looked at cell-free RNA. So it turns out there's also

00:54:38.940 RNA from tumors in circulation. How stable is that, Alex? I was under the impression that

00:54:44.840 RNA wouldn't be terribly stable, unlike DNA, which of course is double-stranded and quite stable.

00:54:51.140 How do you capture cell-free RNA? So naked RNA is not very stable. However, there's proteins that if

00:55:00.080 the RNA is bound to, and one type is called an Argonaute protein, if the RNA is bound to it,

00:55:06.060 it is protected. I assume this is typically messenger RNA that's been in the process of

00:55:10.920 being transcribed. But somewhere along the way, before translation occurs, there's the disruption

00:55:18.400 to the cell that results in lysis or something. And you're just basically getting the cell-free RNA

00:55:23.640 because you happened to catch it at that point. It was a replicating cell or something,

00:55:28.200 or it was just translating protein?

00:55:29.620 Yeah. Or during apoptosis, it's somehow during some kind of programmed cell death,

00:55:34.860 it's being digested or bound. The amount relative to the amount of cell death is low. So presumably

00:55:41.600 most of the RNA is destroyed, but enough of it does get protected and bound to proteins.

00:55:47.720 Whether or not it's cellular detritus or garbage, or it's intentional, it's kind of a different

00:55:53.320 question, but it is present. There's also vesicular structure. So little bubbles of membrane that the

00:56:00.280 RNA can be contained in. The most common one is referred to as an exosome, which are these little

00:56:05.760 vesicles in circulation. So in a variety of different ways, you can have messenger RNA and other types of

00:56:12.200 RNA preserved outside of cells in circulation. And so we looked at that also.

00:56:19.360 How long did it take to quantify all of these things? And presumably, I think you sort of alluded

00:56:25.760 to this, but we're not just looking at any one of these things. You're also asking the question,

00:56:30.220 can combinations of these factors add to the fidelity of the test, correct?

00:56:34.920 Yeah. So this initial research phase took close to three years, cost hundreds of millions of dollars.

00:56:41.160 We had to recruit the largest cohort ever for this type of study, the CCGA study, as I alluded to.

00:56:47.580 And there were different phases. There was a discovery and then multiple development and

00:56:52.300 validation phases. We had to make the world's best assays to look at each of these features.

00:56:58.960 And then we had to process all of those samples and then analyze them. And we did it in a very

00:57:05.240 rigorous way where the final testing was all done blinded and the analysis was all done blinded.

00:57:10.360 So we could be sure that the results were not biased. And then we compared them all and we also

00:57:16.440 compared them in combinations. And we used sophisticated machine learning approaches to

00:57:21.320 really maximize the use of each individual type of data from each, you know, whether or not it was

00:57:26.440 mutations or the chromosomal changes or methylation.

00:57:29.200 So you mentioned that the CCGA had 15,000 samples. How many of those samples were cancers versus

00:57:36.640 controls? What was the distribution of those?

00:57:39.440 It's about 60% cancer versus controls. Yeah, 40%.

00:57:43.340 You sort of alluded to it, but just to be sure I understood, you're obviously starting first with

00:57:48.520 a biased set where you know what's happening and then you're moving to a blinded, unbiased set

00:57:53.380 for confirmation. Is that effectively the way you did it?

00:57:56.080 Yeah. Yeah. It's often referred to as a training set and a test set. Yeah.

00:58:02.220 Tell us what emerged, Alex, when it was all said and done, when you had first and foremost

00:58:07.440 identified every single thing that was measurable and knowable. Sorry, before we do that, I keep

00:58:13.400 taking us off methylation. Explain methylation of all the characteristics. That's the one I don't

00:58:18.380 think we've covered yet.

00:58:19.120 So DNA methylation is a chemical modification of the DNA. So in particular at the

00:58:25.580 C in the ATCG code, the C stands for cytosine. So that's a particular nucleotide or base in DNA.

00:58:34.240 Mammalian biology can methylate. It means that it can add a methyl group, but a methyl group is just

00:58:40.500 a single carbon atom with three hydrogens and then bonded to that cytosine. And so that's what DNA

00:58:46.760 methylation is. So to say a cytosine is methylated, it means that it has that single methyl group bonded

00:58:54.120 to it. Turns out that there's about 28 million positions in the human genome that can be methylated.

00:59:00.640 It usually occurs at what's called CpG sites, which is if you go along one strand of DNA,

00:59:07.840 this is not pairing of the DNA, but one strand, a G follows a C. So that's what a CpG is. It's a C

00:59:14.260 with a phosphate bond to a G. And so at those positions in the genome, there are enzymes that

00:59:21.000 can methylate the cytosine and demethylate it. And there's again, about 28 million of those sites

00:59:27.700 out of the 3 billion overall bases in the human genome. These chemical modifications are really

00:59:35.340 important because they affect things like gene expression. It's one of the more important classes

00:59:40.300 of something that's called epigenetics, which is changes that are outside of the genetics or outside

00:59:46.040 of the code itself. As you know, the DNA code is the same in most cells of the human body.

00:59:51.600 Obviously, the cells are quite different. So a T cell is very different than a neuron.

00:59:55.360 And other than the T cell receptor, all of the genes are the same. The code is the same. So why are the

01:00:00.900 cells different? Well, it's the epigenetics. So things like which parts of the gene are methylated

01:00:06.700 or which ones are associated with histones that are blocking access to the DNA, that that's what

01:00:13.460 ultimately determines which genes are transcribed, which proteins are made, and why cells take on

01:00:19.820 very different morphology and properties. The methylation is a very fundamental code

01:00:25.900 for controlling it. So I call the epigenetics the software of the genome. The genetic code is kind

01:00:32.920 of the hardware, but how you use it, which genes you use when, which combination, that's really the

01:00:38.740 epigenetics. What is the technological delta or difference in reading out the methylation sequence

01:00:48.640 on those CpG sites relative to the ease with which you simply measure the base pair sequences? So you can

01:00:58.460 measure C, G, A, T, C, C, A, T, G, et cetera. But then in the same readout, do you also acquire which of

01:01:07.160 the C's are methylated, or are you doing a separate analysis? There's different technologies to do that.

01:01:12.940 For cell-free DNA, usually you want very accurate sequencing of billions of these, or many hundreds

01:01:19.440 of millions of these small fragments. The way it's done is, and this adds complexity to the chemistry,

01:01:25.320 is you pre-treat the DNA in a way that encodes the methylation status in the ATCG sequence,

01:01:32.800 and then you just use a sequencer that can only see ATCG. But because you've encoded the information,

01:01:39.780 you can then deconvolute it and infer which sites were methylated. Just to be a little more specific,

01:01:46.580 there's chemicals that will, for example, deaminate a cytosine that's not methylated.

01:01:53.520 And then that deaminated cytosine effectively turns into a uracil, which is a fifth letter in RNA.

01:01:59.920 And then when you copy the DNA and you amplify it prior to the sequencing, it amplifies as a T,

01:02:07.120 because a U, when it's copied by a DNA polymerase, it becomes a T. And then you end up with a sequence

01:02:13.160 where you expect to see C's and you see a T. And if you see a T there, then you know that,

01:02:18.900 aha, this must have been an unmethylated site.

01:02:21.900 That came from a U, and the U is an unmethylated C. Brilliant.

01:02:26.080 Brilliant. Right. And if the C was not changed, then you say, then that must have been a site that

01:02:31.120 was methylated. Because you'll see G's opposite them. Oh, sorry. If the C was methylated, you'll

01:02:36.300 see the G opposite because you won't turn it to the uracil. Right, right. Yeah. Brilliant.
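The conversion logic described here can be sketched in a few lines of Python. This is a simplified illustration, not any vendor's pipeline: it models a single strand, assumes complete conversion and error-free sequencing, and the reference sequence, methylated positions, and function names are all hypothetical.

```python
def bisulfite_convert(reference: str, methylated_positions: set[int]) -> str:
    """Simulate the read produced after bisulfite treatment and amplification."""
    read = []
    for i, base in enumerate(reference):
        if base == "C" and i not in methylated_positions:
            # Unmethylated C is deaminated to U, which amplifies as T.
            read.append("T")
        else:
            # Methylated C is protected and still reads as C.
            read.append(base)
    return "".join(read)


def infer_methylation(reference: str, read: str) -> dict[int, bool]:
    """Compare the read to the reference at every C and call methylation."""
    calls = {}
    for i, (ref_base, read_base) in enumerate(zip(reference, read)):
        if ref_base != "C":
            continue
        if read_base == "C":
            calls[i] = True   # C survived, so it was methylated
        elif read_base == "T":
            calls[i] = False  # C became T, so it was unmethylated
    return calls


reference = "ACGTCCGATCG"   # hypothetical fragment
methylated = {1, 9}         # hypothetical methylated cytosines (0-based)
read = bisulfite_convert(reference, methylated)
calls = infer_methylation(reference, read)
print(read)
print(calls)
```

A real analysis also has to distinguish a converted C from a genuine C-to-T variant and handle both strands, which this sketch ignores.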

01:02:41.560 That technique is called bisulfite sequencing. There are other ways to do it, but that's the

01:02:45.780 predominant one. All right. So now back to the question I started to ask a minute ago,

01:02:49.780 but then realized we hadn't talked about methylation. So you've come up with all these different

01:02:53.620 things that you can do with this tiny amount of blood. Because again, you talk about 10 mL,

01:02:59.920 you know, in the grand scheme of things, that's a really small amount of blood. That's two small

01:03:04.340 tubes of blood. Very easy to do. Presumably there was an optimization problem in here where you min

01:03:10.440 max this thing and realize, well, look, this would be easy to do if we could take a liter of blood,

01:03:15.400 but that's clinically impossible. Yeah. It would be nice to Theranos this quote unquote,

01:03:20.540 and do this with a finger stick of blood, but you're never going to get the goods.

01:03:24.540 So did you sort of end up at 10 mL? Was it just sort of an optimization problem that got you there

01:03:30.340 as the most blood we could take without being unreasonable, but yet still have high enough

01:03:35.380 fidelity? And maybe asked another way, can you get better and better at doing this if you were taking

01:03:40.980 eight tubes of blood instead of two? Yeah. There's a couple of considerations. One is the

01:03:46.000 practical one. You need a format that fits, to the extent possible, standard phlebotomy and standard volumes that are

01:03:52.340 below the volumes at which you could put someone in jeopardy. That's a big practical issue. But it

01:03:58.560 actually turned out that what ultimately limited the sensitivity of the test was the background biology.

01:04:06.140 So for broad-based cancer screening, more blood would actually not help you. Now there's other

01:04:11.180 applications for monitoring or the therapy selection where you're looking for a very particular target,

01:04:18.520 someone who has cancer and you know what kind of cancer, and there you could improve your sensitivity.

01:04:23.280 But just for cancer screening, you're usually not limited by the amount of blood.

01:04:29.400 And so did methylation turn out to be the most predictive element at giving you that very,

01:04:35.860 very high specificity, or was it some combination of those measurable factors?

01:04:42.340 Yeah. So it was pretty unexpected. I would say going into it, most people thought that the mutations

01:04:47.880 were going to be the most sensitive method. Some of us thought that the chromosomal changes were going

01:04:53.860 to be the most sensitive. I would say the methylation signals were kind of a dark horse. I had to fight

01:04:59.640 several times to keep it in the running. But again, we really took a, let the data tell us what's

01:05:05.860 the right thing to do. It's not biases from other experiments. Let's do this in a comprehensive,

01:05:11.120 rigorous way. And in the end, the methylation performed by far the best. So it was the most

01:05:17.220 sensitive. So it detected the most cancers. Importantly, it was very specific. It actually

01:05:22.680 had the potential and ultimately did get to less than 1% false positive rate. And then the methylation

01:05:28.760 had this other feature, which was very unique, which was that it could predict the type of cancer.

01:05:34.640 What was the original, what we call now the cancer site of origin? What organ or tissue did it originate

01:05:41.880 from? Interestingly, adding them all together didn't improve on the methylation. I can explain

01:05:47.920 why. And now in hindsight, you might've thought, Hey, more types of information and signal are better,

01:05:53.620 but it actually didn't. So we ended up with one clear result that the methylation patterns in the

01:05:59.960 cell-free DNA were the most useful and informative, and adding other things was not going to help the

01:06:06.440 performance. And why do you think that was? Because it is a little counterintuitive. There are clearly

01:06:13.080 examples I could probably think of where you can degrade the signal by adding more information.

01:06:18.440 But I'm curious if you have a biologic teleologic explanation for why one and only one of these

01:06:26.300 metrics turned out to be the best and any additional information only diluted the signal.

01:06:31.860 It comes down to, this is a good engineering principle, right? If you want to improve your

01:06:36.600 prediction, you need an additional signal that carries information and is independent from your

01:06:42.580 initial signal. If it's totally correlated, then it doesn't actually add anything.

01:06:47.760 Let's take an analogy. Let's say you're on a freeway overpass and you're developing an image

01:06:53.320 recognition for Fords. And you say, okay, what I'm going to start initially with is an algorithm.

01:06:59.280 It's going to look for a blue oval with the letters F-O-R-D in it. So that's pretty good.

01:07:04.360 Now let's say you say, okay, I know that some Fords also have the number 150 on the side,

01:07:10.840 F-150. So I'm going to add that, right? If you think about it, if your algorithm

01:07:17.600 based on the blue oval is already pretty good, adding the 150 is not going to help because

01:07:24.740 whenever the 150 occurs, the blue oval is also always there. Now, if the blue oval wasn't always

01:07:31.280 there or there were Fords that didn't have the blue oval, then some other signal could be helpful.

01:07:35.940 And so that's kind of what ended up happening is that the methylation signal was so much more

01:07:41.680 prevalent and so much more distorted in cancer that everything else didn't really add because

01:07:48.840 anytime you could see one of the others, you could also see many more abnormal methylation fragments.
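The point about redundant versus independent signals can be made concrete with a toy example in Python. The case IDs and hit sets below are invented for illustration and do not come from the CCGA data; the sketch only shows that a second signal adds sensitivity when it catches cases the first one misses.

```python
def sensitivity(detected: set[int], cancers: set[int]) -> float:
    """Fraction of true cancer cases flagged by a signal."""
    return len(detected & cancers) / len(cancers)


# Ten hypothetical cancer cases, identified by number.
cancers = set(range(10))

# Hypothetical detections by the strongest signal.
methylation_hits = {0, 1, 2, 3, 4, 5}

# A second signal that only fires where methylation already fired
# (the "150 on the side of a truck that also has the blue oval" case).
redundant_hits = {1, 2, 4}

# A second signal that sometimes fires where methylation did not.
independent_hits = {4, 5, 8, 9}

alone = sensitivity(methylation_hits, cancers)
with_redundant = sensitivity(methylation_hits | redundant_hits, cancers)
with_independent = sensitivity(methylation_hits | independent_hits, cancers)

print(alone, with_redundant, with_independent)
```

In practice an extra signal can also bring its own false positives, so a redundant signal can cost specificity while adding no sensitivity.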

01:07:55.820 Yeah, that's really fantastic. I guess I also want to, again, just go back and make sure people

01:08:01.480 understand the mission statement you guys brought to this, which was high specificity is a must.

01:08:09.040 So people have heard me on the podcast do this before, but just in case there are people

01:08:13.300 who haven't heard this example or forget it, I sometimes like to use the metal detector analogy

01:08:18.820 in the airport to help explain sensitivity and specificity. So sensitivity is the ability of

01:08:25.400 the metal detector to detect metal that should not go through. And let's be clear. It's not that people

01:08:32.600 in the airports care if your phone is going through or your laptop or your watch or your belt,

01:08:38.440 they care that you're bringing guns or explosives. That's why we have metal detectors or knives or

01:08:45.060 things of that nature. That's why the metal detector exists. It has to be sensitive enough

01:08:50.640 that no one carrying one of those things can get through. On the other hand, specificity would say,

01:08:58.100 so if you're optimizing for sensitivity, you make it such that you will detect any metal that goes

01:09:03.680 through that thing. And by definition, you're going to be stopping a lot of people. You're going to stop

01:09:10.280 everybody from walking through. If their zipper is made of metal, you'll stop them.

01:09:14.740 Or prosthetic or a big belt or boots or anything. You got a little metal on your glasses, you're going

01:09:21.120 to get stopped. So you have to dial the thing in a way so that you have some specificity to this test

01:09:27.100 as well, which is I can't just stop everybody. In an ideal world, I kind of want everyone to make

01:09:33.380 it through who's not carrying one of those really bad things. And we're defining bad thing by a certain

01:09:38.900 quantity of metal. And therefore, your specificity is to kind of say, I don't want my test to be

01:09:47.580 triggered on good guys, right? I want my test to be triggered on bad guys. Now, when you guys are

01:09:54.820 designing a test like this, like the Grail test, I guess I should just go back and state anybody

01:10:00.300 who's ever been through two different airports wearing the exact same clothing and realizes

01:10:06.240 sometimes it triggers, sometimes it doesn't. What you realize is not every machine has the same

01:10:09.960 setting. And that's because the airport, the people at TSA, they turn up or turn down the sensitivity

01:10:15.500 and that changes the specificity as well. How deliberately do you, when you're setting up this

01:10:23.240 assay, have the capacity to dial up and down sensitivity and specificity? So while I understand

01:10:29.480 your mandate was a very high specificity test, where was the control or manipulation of that system,

01:10:37.940 if at all? So there's a threshold. It's complex. Conceptually, there's a threshold inside the

01:10:44.000 algorithm, right? So you can imagine that after you have this comprehensive map of all these different

01:10:51.140 types of methylation changes that can occur in the fragments of hundreds of examples of every cancer

01:10:57.200 type. And then you compare it to all the methylation changes that can occur outside of cancer, which we

01:11:03.880 haven't talked about, which is very important. So most of the methylation patterns are pretty similar

01:11:09.380 in similar cell types across individuals. But there are changes that occur with age or

01:11:14.760 ethnicity or environmental exposure and so on. What you'd like is those two populations to be

01:11:21.500 completely different. But it turns out there is some overlap. So there are fragments that occur in

01:11:28.040 cancer that can occur outside of cancer. The algorithm in a very complex state space is trying to

01:11:35.320 separate these populations. And whether or not you're going to call something as a potential cancer

01:11:42.580 and say a cancer signal is detected is whether or not the algorithm thinks, is it associated with

01:11:47.620 this cancer group or is it associated with a non-cancer group? But again, there's some overlap

01:11:54.140 between these. And so where you set that overlap, like in the border between individuals who don't have

01:12:01.860 cancer, but have, for whatever reason, an abnormal level of fragments that kind of look cancerous,

01:12:07.540 that will determine your specificity. So there is a dial to turn where you can increase the

01:12:14.440 stringency, call fewer false positives, but then you will start to miss some of the true positives.
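The dial described here can be sketched as a single score threshold in Python. The scores below are made up, and a real classifier works in a far more complex feature space; the sketch only shows how raising the threshold trades sensitivity for specificity when the two groups overlap.

```python
def rates(cancer_scores, control_scores, threshold):
    """Return (sensitivity, specificity) when scores >= threshold are called positive."""
    sens = sum(s >= threshold for s in cancer_scores) / len(cancer_scores)
    spec = sum(s < threshold for s in control_scores) / len(control_scores)
    return sens, spec


# Hypothetical classifier scores; note the overlap between the two groups.
cancer_scores = [0.35, 0.45, 0.55, 0.62, 0.70, 0.78, 0.85, 0.90, 0.95, 0.99]
control_scores = [0.02, 0.05, 0.08, 0.10, 0.15, 0.20, 0.25, 0.30, 0.40, 0.60]

for threshold in (0.3, 0.5, 0.65):
    sens, spec = rates(cancer_scores, control_scores, threshold)
    print(f"threshold={threshold:.2f}  sensitivity={sens:.1f}  specificity={spec:.1f}")
```

The better separated the two score distributions are, the less sensitivity has to be given up to reach a very high specificity.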

01:12:21.260 Now, what was so great about methylation is that these populations were pretty well separated,

01:12:26.680 better than anything the world had ever seen before, which is why you could get high specificity

01:12:31.940 and still pretty good sensitivity. But yes, there is some overlap, which means you have to make a

01:12:38.040 trade-off and dial it in. Inside the company, is there sort of a specific discussion around

01:12:44.320 the trade-offs of it's better to have a false positive than have a false negative? Like let's

01:12:49.800 use the example you brought up earlier, right? So prostate-specific antigen is kind of the mirror

01:12:54.380 image of this, right? It's a highly, highly sensitive test with very low specificity. It's obviously a

01:13:01.120 protein, so it's a totally different type of assay, right? It's a far cruder test, of course.

01:13:05.600 But the idea is, in theory, and of course I could give you plenty of examples, someone with prostate

01:13:11.580 cancer is going to have a high PSA. So you're not going to miss people with cancer. But as you pointed

01:13:18.420 out earlier, you're going to be catching a lot of people who don't have cancer. And it's for that

01:13:24.060 reason, as you said, there is no longer a formal recommendation around the use of PSA screening.

01:13:28.820 It has now kind of been relegated to the just talk to your doctor about it. And of course,

01:13:34.540 the thinking is, look, there are too many men that have undergone an unnecessary prostate biopsy on the

01:13:40.580 basis of an elevated PSA that really should have been attributed to their BPH or prostatitis or

01:13:46.660 something else. So notwithstanding the fact that we have far better ways to screen for prostate cancer

01:13:51.440 today, that's a test that is highly geared towards never missing a cancer. In its current format,

01:13:59.260 under low prevalence populations, which is effectively the population it's being designed

01:14:05.580 for, right? This is designed as a screening tool. It seems to have better negative predictive value

01:14:10.900 than positive predictive value, correct? It's pretty high in both because negative predictive

01:14:15.260 value also is related to prevalence. Well, just to put some numbers out there, right? So

01:14:20.200 in the CCGA study, but then importantly, in an interventional study called Pathfinder,

01:14:26.620 a positive predictive value is around 40%. That's all stages?

01:14:31.680 Yeah. So that's all cancers, all stages. It's a population study. So it's whatever natural

01:14:37.220 set of cancers and stages occur in that group. So that was about 6,500 individuals.

01:14:43.920 Do you recall, Alex, what the prevalence was in that population? Was it a low risk population?

01:14:50.200 Yeah. So it was a mix of a slightly elevated risk population and then an average risk population.

01:14:58.240 Just in terms of risk, and I think you'll appreciate this, I think of anyone over 50 as

01:15:02.720 high risk. And that's where the majority of these studies are happening, right? So I mean,

01:15:07.000 age is your single biggest risk factor for cancer. The population over 50 is about a 10x increased risk

01:15:14.640 relative to the population under 50.

01:15:17.920 And age 55 to 65 is the decade where cancer is the number one cause of death.

01:15:23.620 I would say in developed nations, I mean, that's actually increasing, right? I mean,

01:15:27.920 we're making such incredible progress on metabolic disease and cardiovascular disease. Cancer in the

01:15:34.000 developed world is predicted to surpass cardiovascular disease as the number one killer.

01:15:39.020 Anyway, older populations are at, I wouldn't call them low risk, I'd call them average risk for that

01:15:45.100 age group, which is still relatively high for the overall population. But it was a mixed prevalence,

01:15:50.440 a bit less than 1%. Some of these studies do have a healthy volunteer bias.

01:15:55.840 In a 6,500 person cohort with a prevalence of 1%, which is pretty low, the positive predictive value was 40%.

01:16:06.020 Yep, that's right.

01:16:07.880 What was the sensitivity for all stages then? It must have been,

01:16:11.560 it's easy to calculate if I had my spreadsheet in front of me, but it's got to be 60% or higher.

01:16:17.000 And specificity has got to be close to or over 99% at that point, right?
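The back-of-the-envelope calculation referred to here follows from the definition of positive predictive value. The Python sketch below uses the 1% prevalence and 40% positive predictive value quoted in the conversation; the sensitivity values are illustrative assumptions, and the function names are invented for this example.

```python
def ppv(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Positive predictive value: true positives / all positives."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


def specificity_for_ppv(target_ppv: float, sensitivity: float, prevalence: float) -> float:
    """Specificity needed to reach target_ppv at a given sensitivity and prevalence."""
    true_pos = sensitivity * prevalence
    # Rearranged from ppv = tp / (tp + fp).
    false_pos = true_pos * (1 - target_ppv) / target_ppv
    return 1 - false_pos / (1 - prevalence)


prevalence = 0.01   # about 1%, as stated in the conversation
target = 0.40       # about 40% PPV, as stated in the conversation

for sens in (0.5, 0.6):  # assumed sensitivities for illustration
    spec = specificity_for_ppv(target, sens, prevalence)
    print(f"sensitivity={sens:.0%} -> required specificity={spec:.2%}")
```

Under these assumptions the required specificity comes out just above 99%, which is consistent with the rough numbers discussed.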

01:16:21.760 Those are the rough numbers. Yeah, that's right. Some of the important statistics there, right? So about

01:16:26.780 half of the cancers that manifested over the lifetime of the study were detected by the test. The test

01:16:34.100 actually doubled the number of cancers detected in that interventional study relative to standard

01:16:39.420 of care screening alone. The interventional study, the Pathfinder study, the enrollees were getting

01:16:45.920 standard of care screening according to guidelines. So mammography, importantly, cervical cancer

01:16:52.120 screening, and then colonoscopies or stool-based testing based on guidelines. And so a number of the

01:16:58.340 cancers that the Grail Galleri test detected were also detected by standard of care, which you would

01:17:04.400 expect. But the total number of cancers found was about doubled with the addition of the Galleri test.

01:17:11.880 And that was predominantly cancers for which there isn't a screening test. But just going back to

01:17:17.260 the positive predictive value, just the positive predictive value of most screening tests is low single

01:17:22.700 digits. You probably have the experience more than I have, but many, many times a female colleague,

01:17:29.060 friend, or someone's wife calls and said, you know, I got a mammography. They found something. I'm

01:17:34.480 going to have to go for a follow-up, a biopsy, and so on. And literally 19 times out of 20, it's a false

01:17:41.580 positive. That's one where we've accepted, for better or worse, a huge false positive rate. Catch some

01:17:48.560 cancers, right? And that's why there's a fair amount of debate around mammography. But again,

01:17:53.040 that's a positive predictive value of about four and a half percent. The vast majority of people who

01:17:58.680 get initial positive, they're not going to end up having cancer, but still potentially worth it.

01:18:05.220 Now we're talking about something where we're approaching one in two positive tests will

01:18:10.340 ultimately lead to a cancer diagnosis that's potentially actionable. So it's, I think sometimes

01:18:15.920 when people hear 40%, they say, gee, that means there's still a fair amount of people who are

01:18:22.240 going to get a positive test, meaning a cancer signal detected and ultimately not. But again,

01:18:28.240 for a screening test, that's incredibly high yield. I think another way to think about that is to go

01:18:33.600 back to the airport analogy. So this is a metal detector that is basically saying, look, we're willing

01:18:42.960 to beep at people who don't have knives to make sure everybody with a knife or gun gets caught.

01:18:49.280 So the negative predictive value is what's giving you the insight about the bad guys. So a 40% positive

01:18:56.780 predictive value means, let's just make the numbers even simpler. Let's say it's a 25% positive predictive

01:19:03.680 value. It means for every four people you stop, only one is a true bad guy. Think about what it's like

01:19:12.420 in the actual airport. How many times in a day does the metal detector go off and how many times in a

01:19:19.380 day are they catching a bad guy? The answer is it probably goes off 10,000 times in a day and they

01:19:25.540 catch zero bad guys on average. So that gives you a sense of how low the positive predictive value is

01:19:31.480 and how high the sensitivity is and how low the specificity is. So yes, I think that's a great way to

01:19:37.320 look at it, which is if you are screening a population that is of relatively normal risk,

01:19:45.740 a positive predictive value of 20% is very, very good. It also explains, I think, where the burden

01:19:54.820 of responsibility falls to the physician, which is as a physician, I think you have to be able to talk

01:20:00.760 to your patients about this explicitly prior to any testing. I think patients need to understand that,

01:20:10.220 hey, there's a chance that if I get a positive test here, it's not a real positive. I have to have

01:20:17.000 kind of the emotional constitution to go through with that, and I have to be willing to then engage

01:20:22.460 in follow-up testing. Because if this thing says, oh, you know, Alex, it looks like you have a lung cancer,

01:20:28.800 the next step is, I'm going to be getting a chest x-ray, or I'm going to be getting a low-dose CT

01:20:33.140 of my chest. And that doesn't only come with a little bit of risk, in this case, radiation,

01:20:37.740 although it's an almost trivial amount, but I think more than anything, it's the risk of the emotional

01:20:42.880 discomfort associated with that. And I think, honestly, when you present the data this way to patients,

01:20:49.020 they really understand it, and they really can make great informed decisions for themselves.

01:20:53.580 And by the way, for some of those patients, it means, thank you, but no thank you.

01:20:56.820 I just don't want to go through with this, and that's okay, too. Let's talk a little bit about

01:21:01.160 some of the really interesting stuff that emerged in the various histologies and the various stages.

01:21:08.140 And I've had some really interesting discussions with your colleagues. I guess, just for the sake

01:21:12.480 of completing the story, you're no longer a part of the company, Grail. Maybe just explain that so

01:21:18.500 that we can kind of get back to the Grail stuff, but just so that people understand kind of your

01:21:22.300 trajectory. We should do that. Yeah. So I was at Illumina, and then I helped spin off Grail as a

01:21:27.900 co-founder, led the R&D and clinical development. I actually went back to Illumina as the chief

01:21:34.340 technology officer running all of the company's research and development. Really, really fantastic,

01:21:39.640 fun job. Subsequently, Illumina acquired Grail as a wholly owned subsidiary of Illumina.

01:21:47.360 That was almost three years ago. Recently, I left Illumina to start a new company, a really

01:21:54.220 interesting biotech company that I'm the CEO of. No longer actively involved in either company.

01:22:00.200 I have great relations with all my former colleagues. Excited to see their progress.

01:22:04.960 I should also say that I am still a shareholder also of Illumina, just for full disclosure.

01:22:10.020 Yeah. Thank you. You have a number of colleagues, as you said, who are still at Grail,

01:22:13.740 who I've gotten to know. One of the things that really intrigued me was, again, some of the

01:22:20.100 histologic differences and the stage differences of cancer. If you look at the opening data,

01:22:29.600 a few things stood out. There were certain histologies that, if you took them all together

01:22:35.660 by stage, didn't look as good as others. For example, talk a little bit about prostate cancer

01:22:42.320 detection using the Galleri test. I think what you're referring to is there's a very wide variety

01:22:50.480 of different performances in different cancers. They're all highly specific, so very low false

01:22:55.700 positive rate because there's only one false positive rate for the whole test, which is probably worth

01:23:00.720 spending some time on later. For example, sensitivity to many GI cancers or certain histologies of lung

01:23:08.280 cancer, the test is very good at detecting earlier stage localized cancers. Particularly in prostate

01:23:15.080 cancer and hormone receptor positive breast cancer, the detection rate is lower for stage one cancers.

01:23:23.620 But this gets to a very important issue, which is what is it that you want to detect? So do you want

01:23:29.720 to detect everything that's called a cancer today? Or is what you want to detect is you want to detect

01:23:33.980 cancers that are going to grow and ultimately cause harm? So the weird thing about cancer screening in

01:23:40.020 general is there's both over and under diagnosis. Most small breast cancers and most DCIS and most

01:23:46.960 even small prostate cancers will never kill the patient or cause morbidity, but there is a small

01:23:52.640 subset that will. And so for those, we have decided to, again, go for a trade-off where we'll often

01:24:00.420 resect things and go through treatments just to make sure that smaller percentage is removed,

01:24:05.660 even though we're removing a ton of other, quote, cancers that are unlikely to ever proceed into

01:24:12.540 anything dangerous. On the flip side, 70% of people who die of cancer, they die from an unscreened cancer.

01:24:19.900 So there's huge underdiagnosis. You should remember that. 70% of people who ultimately die of cancer on

01:24:27.460 their death certificate, they die from a cancer where there was no established screening prior to

01:24:32.900 something like Grail's Galleri. So we have this weird mix of, there's a lot of cancers where we

01:24:37.540 know we're overdiagnosing, but we're doing it for a defensible trade-off. And then there's a huge

01:24:43.400 number of cancer deaths occurring where there's essentially zero diagnosis. But back to the ones

01:24:49.020 where there's underdiagnosis, it gets back to what does it mean to have tumor DNA in your blood?

01:24:55.420 So measuring and detecting a cancer from tumor DNA in your blood is a functional assay.

01:25:01.360 To get tumor DNA in your blood, you have to have enough cells. They have to be growing fast enough,

01:25:07.380 dying fast enough, and have blood access. So those are the things that you require.

01:25:13.180 Now, if you have a tumor that's small, encapsulated, not growing, well, guess what? It's not going to have

01:25:20.180 DNA in the blood. So unlike an imaging assay, which is anatomical, this is really a functional

01:25:26.200 assay. You're querying for whether or not there's a cancer that has the mass, the cell activity and

01:25:33.160 death, and access to the blood to get and manifest its DNA into the blood. So it's really stratifying

01:25:41.600 cancers on whether or not they have the activities. Now, interestingly, this functional assay

01:25:47.180 is very correlated with ultimate mortality. There's a really nice set of data that the

01:25:53.700 GRAIL put out where you look at Kaplan-Meier curves. So over the course of the CCGA study,

01:25:58.960 which is now going out, I don't know, five plus years, you can say, well, what do survival curves

01:26:04.600 look like? If you were positive, your test was detected versus your test was negative, meaning your

01:26:10.540 cancer was not detected by the GRAIL test. And there's a big difference. So basically,

01:26:15.020 if your cancer was undetectable by the GRAIL test, you have a very good outcome, much,

01:26:23.740 much better than the general population with that cancer. So this suggests two things. One is,

01:26:29.380 A, those cancers may not have actually been dangerous because there's not a lot of mortality

01:26:34.240 associated with them. And maybe that's also why they couldn't put their tumor DNA in the blood.

01:26:38.900 The other is whatever the existing standard of care is, it's working well. Now, if you look at all

01:26:45.060 the cancers in the Kaplan-Meier curve that were detected, they have a lot of mortality associated

01:26:50.920 with them. And so what it's showing is that it's the dangerous cancers, the cancers that are

01:26:55.500 accounting for the majority of mortality, those are the ones that the test is detecting.

01:27:00.580 This biological rationale makes a lot of sense, which is, okay, a tumor that grows fast, can get

01:27:06.520 its DNA in the blood. Well, that's probably also a dangerous tumor that is going to become invasive

01:27:11.000 and spread. So again, it's a functional assay. So if your cancer is detected by one of these tests,

01:27:18.680 like the Galleri test, it's saying something about the tumor that is very profound, which is that it's

01:27:25.260 active enough to get its signal into the blood. And it's very likely, if untreated, to ultimately

01:27:32.500 be associated with morbidity and potentially mortality. I think it's an open question of

01:27:38.760 these tumors that aren't detectable and that are in cancers, we know there's a lot of indolent

01:27:45.440 disease. What does it really mean that the test is low sensitivity for that?

01:27:50.220 Yeah. I would say that when I went through these data and I went through every single histology

01:27:56.880 by stage, I did this exercise probably 18 months ago. The one that stood out to me more than any

01:28:04.660 other was the sensitivity and specificity discrepancy. Well, I should say the sensitivity

01:28:12.100 discrepancy between triple negative breast cancer and hormone positive breast cancer. You alluded to

01:28:20.040 this, but I want to reiterate the point because I think within the same quote unquote disease of

01:28:25.240 breast cancer, we clearly understand that there are three diseases. There's estrogen positive,

01:28:31.280 there's HER2/neu positive, there's triple negative. Those are the defining features of three

01:28:36.140 completely unrelated cancers with the exception of the fact that they all originate from the same

01:28:42.360 mammary gland. But that's about where the similarity ends. Their treatments are different,

01:28:46.580 their prognoses are different. And to take the two most extreme examples, you take a woman who has

01:28:53.160 triple positive breast cancer, i.e. it's estrogen receptor positive, progesterone receptor positive,

01:28:59.040 HER2/neu positive. You take a woman who has none of those receptors positive. The difference in the

01:29:05.040 Galleri test performance on stage one and stage two, so this is cancers that have not even spread to

01:29:12.780 lymph nodes. The hormone positives were about a 20% sensitivity for stage one, stage two,

01:29:18.560 and the triple negative was 75% sensitivity for stage one, stage two. And so this underscores your

01:29:26.220 point, which is the triple negative cancer is a much, much worse cancer. And the fact that at stage one,

01:29:34.500 stage two, you're detecting it with 75% sensitivity portends a very bad prognosis. Now, I think the really important

01:29:45.900 question here, I believe that this is being asked, is does the ability to screen in this way lead to better

01:29:56.320 outcomes? So I will state my bias, because I think it's important to put your biases out there,

01:30:02.220 and I've stated it publicly many times, but I'll state it again. My bias is that yes, it will. My bias

01:30:09.260 is that early detection leads to earlier treatment. And even if the treatments are identical to those

01:30:17.220 that will be used in advanced cancers, the outcomes are better because of the lower rate of tumor burden.

01:30:23.580 And by the way, I would point to two of the most common cancers as examples of that, which are breast

01:30:30.140 and colorectal cancer, where the treatments are virtually indistinguishable in the adjuvant setting

01:30:36.320 versus the metastatic setting. And yet the outcomes are profoundly different. In other words, when you

01:30:42.060 take a patient with breast or colorectal cancer, and you do a surgical resection, and they are a stage three

01:30:47.920 or less, and you give them adjuvant therapy, they have far, far, far better survival than those patients who

01:30:56.300 undergo a resection, but have metastatic disease and receive the same adjuvant therapy. It's not even

01:31:02.380 close. And so that's the reason that I argue that the sooner we know we have cancer and the sooner we

01:31:08.220 can begin treatment, the better we are. But the skeptic will push back at me and say, Peter, the only thing

01:31:14.280 the Grail test is going to do is tell more people bad news. So we'll concede that people are going to

01:31:22.480 get a better, more relevant diagnosis, that we will not be alerting them to cancers that are irrelevant

01:31:29.920 and over-treating them. And we will alert them to negative or more harmful cancers, but it won't

01:31:37.060 translate to a difference in survival. So what is your take on that? And how can that question be

01:31:43.920 definitively answered? It's a very important question. And over time, it will be definitively

01:31:50.700 answered. So we should talk about some of Grail's studies and how they're going about it.

01:31:55.640 So the statistics are very profound, like you said. So most solid tumors, five-year survival,

01:32:01.600 when disease is localized, hasn't spread to another organ, 70 to 80% five-year survival,

01:32:08.360 less than 20% for metastatic stage four disease. That correlation of stage at diagnosis versus five-year

01:32:16.520 survival is night and day. And obviously, everyone would want them and their loved ones,

01:32:21.620 most people in the localized disease category. Now, there's an academic question, like you're

01:32:28.020 saying, which is, okay, well, that's true. But does that really prove that if you find people at

01:32:33.240 that localized disease through this method, as opposed to all the variety of methods that happens today,

01:32:39.300 incidentally, that you will have the same outcome? And sure, I guess you could come up with some

01:32:45.340 very theoretical possibility that somehow that won't, but that doesn't seem very likely.

01:32:52.940 And I think it gets to a fundamental question of, well, are we going to wait decades to see that?

01:32:59.480 And in the meantime, give up the possibility, which is probably likely, that finding these cancers early

01:33:06.360 and intervening early will change outcome. I'm all for, and I think everyone is, bigger and more

01:33:13.140 definitive studies over time. But the idea that we're never going to do that study or just take

01:33:19.560 kind of a nihilistic point of view, that until it's done, we're not going to find cancers early

01:33:24.520 and intervene, I don't think it's conscionable to do that, especially when the false positive rate's low.

01:33:30.220 I think there's a few other ways to come at it, which is, if what you said was really true,

01:33:35.040 I've met some of the folks, or been called by them, for whom the GRAIL test has found a positive.

01:33:38.920 I can think of a former colleague for whom the test found an ovarian cancer. Do you think when she

01:33:45.180 went to her OB-GYN and said, look, the test said that I have potentially an ovarian cancer, and they

01:33:50.560 did an ultrasound and they found something, that OB-GYN said, you know what, since this was found

01:33:56.080 through a new method, let's not intervene? There's a malignancy. It is an ovarian cancer. We know what the

01:34:02.920 natural history is, but we're not going to intervene? Similarly with cases of pancreatic cancer,

01:34:08.120 head and neck or things like that. I don't understand the logic because today people do

01:34:13.420 show up, though not very often, with early stage versions of these diseases, ovarian, pancreatic,

01:34:18.180 head and neck and things, and we treat them. So why is it you wouldn't treat them if you could find

01:34:23.480 them through this modality? I just don't know of any GI surgeon who says, well, you're one of the

01:34:29.460 lucky people where you found your pancreatic cancer at stage one, two, but we're not going to treat it

01:34:33.460 because there isn't definitive evidence over decades that mortality is better. So I get

01:34:39.200 the academic point, and Grail and others are investing a tremendous amount to increase the data.

01:34:45.360 The idea that we have this technology and we're going to allow huge numbers of cancers to just

01:34:51.840 progress to late stage before treating, I don't think that's the right balance of potential benefit

01:34:58.460 versus burden of evidence. So is there now a prospective real world trial ongoing in Europe?

01:35:05.960 There is. Let's talk a little bit about that.

01:35:08.320 The NHS has been piloting the Grail test in a population of about 140,000. So it involves

01:35:15.840 sequential testing, I think at least two tests, and then they look at outcomes. It's an interventional

01:35:22.800 study with return of results. And they're looking for a really interesting endpoint here. So mortality

01:35:29.420 takes time. So, I mean, some cancers, I mean, to ultimately see whether or not getting diagnosed at

01:35:35.460 a different stage and the intervention changes that that could take one or in some cases, two decades,

01:35:41.040 but they came up with a really interesting surrogate endpoint, which is reduction in stage four

01:35:46.440 cancers. So here's the logic. I think it makes a lot of sense, which is if people stop getting

01:35:51.480 diagnosed with stage four and say a big reduction in stage three cancer, then doesn't it stand to

01:35:57.500 reason that ultimately you will reduce mortality? So if you remove the end stage version of cancer,

01:36:04.900 which kills most people, and that you know that you have to pass through, most people don't die

01:36:10.380 of stage two cancer. They were diagnosed with stage two, they died because it turned out it wasn't stage

01:36:14.900 two and it spread. If you do a study and within a few years of screening people

01:36:21.480 there's no more, let's take the extreme, stage four cancer, then you've stage shifted the

01:36:26.400 population and you're kind of eliminating late stage metastatic cancer. So again, I think while

01:36:33.760 we're waiting for that to read out, my personal belief is the potential benefit of finding cancer

01:36:39.460 is so significant. Testing now for many patients makes sense. And then I think this endpoint of stage

01:36:48.460 four reduction. Yeah, that's a clever, clever endpoint. One of the things that I know that a lot

01:36:55.660 of the folks who oppose cancer screening tend to cite is that a number of cancer screening studies

01:37:01.920 do not find an improvement in all cause mortality, even when there's a reduction in cancer specific

01:37:08.340 mortality. So, hey, we did this colonoscopy study, or we did this breast cancer screening study,

01:37:14.140 and it indeed reduced breast cancer deaths, but it didn't actually translate to a difference in

01:37:18.780 all cause mortality. I've explained this on a previous podcast, but it is worth for folks who

01:37:23.380 didn't hear that to understand why. To me, that's a very, very, oh, how can I say this charitably?

01:37:30.360 That's a very misguided view of the literature because what you fail to appreciate is those studies

01:37:37.060 are never powered for all cause mortality. And if you reduce breast cancer mortality by 40% or 30%,

01:37:46.720 that translates to a trivial reduction in all cause mortality because breast cancer is still just one

01:37:54.920 of 50 cancers. And even though it's a relatively prevalent cancer over the period of time of a study,

01:38:00.860 which is typically five to seven years, the actual number of women who were going to die of

01:38:05.780 breast cancer is still relatively small compared to the number of women, period, who were going to die

01:38:11.780 of anything. And I, in previous podcasts have discussed that it's very difficult to get that

01:38:18.480 detection within the margin of error. And so if you actually wanted to be able to see how that

01:38:24.960 translates to a reduction in all cause mortality, you would need to increase the size of these studies

01:38:30.140 considerably, even though really what you're trying to do is detect a reduction in cancer

01:38:35.360 specific mortality. I say all of that to say that I think one of the interesting things about the

01:38:41.460 NHS study is it is a pan screening study. And to my knowledge, it's the first. In other words,

01:38:48.980 it has the potential to detect many cancers and therefore you have many shots on goal. Potentially,

01:38:56.520 this could show a reduction in all cause mortality and not just cancer specific mortality. I would have to

01:39:02.500 see the power analysis, but I wonder if the investigators thought that far ahead. Do you

01:39:06.840 know? I mean, they're going to follow these patients long-term. They will be able to

01:39:11.720 have the data on mortality. I don't know if it's powered for all cause. I think that's unlikely just

01:39:19.500 for the reasons you said, which is the numbers would be really high. I mean, again, if you're powering

01:39:25.200 it to see a reduction in stage four over a couple of years, that may not be enough.
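
The powering argument in this exchange can be illustrated with a rough calculation. The sketch below uses the standard normal-approximation sample-size formula for comparing two proportions. Every input is an illustrative assumption rather than a figure from any trial: a 0.3% risk of breast cancer death and a 3% risk of death from any cause over the study window, and a 30% relative reduction in breast cancer deaths from screening. The function name `n_per_arm` is hypothetical.

```python
from statistics import NormalDist


def n_per_arm(p_control, p_screened, alpha=0.05, power=0.80):
    """Approximate participants needed per arm to distinguish two event
    rates (two-sided test, normal approximation for two proportions)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p_control * (1 - p_control) + p_screened * (1 - p_screened)
    return (z_alpha + z_beta) ** 2 * variance / (p_control - p_screened) ** 2


# Illustrative assumptions only (not data from any study).
cancer_death_risk = 0.003      # risk of breast cancer death during the study
all_cause_death_risk = 0.03    # risk of death from any cause during the study
relative_reduction = 0.30      # assumed cut in breast cancer deaths

# Absolute number of deaths avoided per person screened.
deaths_avoided = cancer_death_risk * relative_reduction

# Same absolute benefit, measured against two different endpoints.
n_cancer_specific = n_per_arm(cancer_death_risk,
                              cancer_death_risk - deaths_avoided)
n_all_cause = n_per_arm(all_cause_death_risk,
                        all_cause_death_risk - deaths_avoided)
all_cause_relative_reduction = deaths_avoided / all_cause_death_risk

print(f"Relative reduction in all-cause deaths: {all_cause_relative_reduction:.1%}")
print(f"Per-arm size, cancer-specific endpoint: {n_cancer_specific:,.0f}")
print(f"Per-arm size, all-cause endpoint:       {n_all_cause:,.0f}")
```

Under these assumed inputs, a 30% cut in cancer-specific deaths is only about a 3% cut in all-cause deaths, and the all-cause endpoint needs roughly ten times as many participants per arm. That is the sense in which single-cancer screening studies are not powered for all-cause mortality.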

01:39:31.620 Interesting. Well, time will tell. Alex, I want to pivot if we've got a few more minutes

01:39:35.860 to a topic that you and I spend a lot of time talking about these days. And so by way of

01:39:41.520 disclosure, you sort of noted that you've left Illumina somewhat recently. You've started another

01:39:47.360 company. I'm involved in that company as both an investor and an advisor, and it's an incredibly

01:39:51.980 fascinating subject. But one of the things that we talk about a lot is going back to this role of

01:39:59.380 the epigenome. So I think you did a great job explaining it and putting it in context. So we've

01:40:04.800 got these 3 billion base pairs and lo and behold, some 28 million of them also happen to have a methyl

01:40:13.520 group on their C. I'll fill in a few more details that we didn't discuss on the podcast, but just to

01:40:19.520 throw it out there. As a general rule, when we're born, we have kind of our max set of them. And as

01:40:26.060 we age, we tend to lose them. As a person ages, the number of those methylation sites goes down.

01:40:33.780 You obviously explain most importantly what they do, what we believe their most important purpose is,

01:40:39.200 which is to impact gene expression. It's worth also pointing out that there are many hallmarks of

01:40:46.400 aging. There are many things that are really believed to be at the fundamental level that

01:40:52.680 describes why you and I today look and function entirely different from the way we did when we

01:41:00.080 met 25 years ago. We're half the men we used to be. I could make a Laplace Fourier joke there, but I will

01:41:06.860 refrain. So I guess the question is, Alex, where do you think methylation fits in to the biology of

01:41:17.800 aging? That's a macro question, but... Yeah, yeah. So you talked about the hallmarks of aging,

01:41:24.320 because the author, I think it was Hanrahan, came up with that about 10 years ago, this hallmarks of

01:41:29.620 aging. And he recently gave a talk where he talked about perhaps methylation is the hallmark of

01:41:35.920 aging. And what he's referring to is the mounting data that the epigenetic changes are the most

01:41:45.280 descriptive of aging and are becoming more and more causally linked to aging events.

01:41:50.900 There's lots of data that show that people of comparable age, but different health status,

01:41:58.460 for example, smokers versus non-smokers, people who exercise versus people who don't,

01:42:03.240 people who are obese versus people who are not, can have very different methylation patterns.

01:42:09.500 There's also some data that look at centenarians relative to non-centenarians. And obviously,

01:42:17.780 that's a complicated analysis because by definition, there's a difference in age,

01:42:21.900 but you get a sense of different patterns of methylation. And clearly, centenarians we've

01:42:26.880 established long ago do not acquire their centenarian status by their behaviors.

01:42:31.940 Just look at Charlie Munger and Henry Kissinger, two people who recently passed away at basically

01:42:37.700 the age of a hundred, despite no evidence whatsoever that they did anything to take care

01:42:42.360 of themselves. So clearly their biology and their genes are very protective. As you said,

01:42:49.200 there are a bunch of these hallmarks. I think the original paper talked about nine and that's

01:42:54.160 been somewhat expanded. But you share that view, I suppose, that the epigenome sits at the top

01:43:01.480 and that potentially it's the one that's impacting the other. So when we think about

01:43:05.840 mitochondrial dysfunction, which no one would dispute, mine and yours are nowhere near as good

01:43:12.700 as they were 25 years ago. Our nutrient sensing pathways, inflammation, all of these things are

01:43:18.160 moving in the wrong direction as we age. How do you think those tie to methylation and to the epigenome

01:43:24.860 and to gene expression by extension?

01:43:27.220 Maybe let's reduce it to like a kind of an engineering framework. If we took Peter's epigenome

01:43:34.160 from 25 years ago when I first met you, right? And we knew for every cell type and every cell,

01:43:41.860 what was the methylation status at all 28 million positions? We had recorded that and we took yours

01:43:48.100 today where most of those cells have deviated from that and we could flip all those states back.

01:43:55.700 That's kind of how I think about it is the cells don't go away, just whether or not they have the

01:43:59.580 methyl group or not changes. And some places gain it, some places lose it. If we could flip all those

01:44:06.600 back, would that force the cell to behave like it was 25 years ago? Express genes, the fidelity with

01:44:15.320 which it controlled those genes, the interplay between them, would it be reprogrammed back to

01:44:20.860 that state? And so that I think is a really provocative hypothesis. We don't know that for

01:44:27.720 sure, but there's more and more evidence that that might be possible. And so to me, that's the

01:44:33.080 burning question is now that we have the ability to characterize that and we know what it looks like

01:44:37.760 in a higher functioning state, which correlates with youth, and we are gaining technologies to be able

01:44:44.280 to modulate that and actually change the epigenome as opposed to modifying proteins or gene expressions,

01:44:50.120 but actually go in and remethylate and demethylate certain sites. Can we reprogram things back to that

01:44:57.780 earlier state? And if it is the root level at which things are controlled, will you then get all of the

01:45:04.300 other features that the cell had and the organism had? That's a really exciting question to answer.

01:45:09.660 Because if the answer is yes, or even partially yes, then it gives us a really concrete way to go

01:45:15.700 about this. And so we talk about the hallmarks and the hallmarks are complex and interrelated.

01:45:21.680 What I like about the epigenome is we can read it out and we're gaining the ability to modify it

01:45:27.240 directly. So if really it's the most fundamental level at which all of these other things are

01:45:32.040 controlled, it gives us, again, maybe back to the early discussion, a very straightforward

01:45:37.100 engineering way to go about this. Let's talk a little bit about how that's done.

01:45:41.840 A year ago, you were part of a pretty remarkable effort that culminated in a publication in Nature,

01:45:48.480 if I recall, it sequenced the entire human epigenome. So if we had the Human Genome Project

01:45:54.280 24 years ago, roughly, we had the Epigenome Project. Can you talk a little bit about that

01:46:00.380 and maybe explain technologically how that was done as well?

01:46:06.260 Yeah. So in the development of the Grail Gallery test, there was a key capability that we knew was

01:46:12.820 going to be important for a multi-cancer test. So very different than most cancer screening today,

01:46:19.120 which is done one cancer at a time. So if you have a blood test and it's going to tell you there's a

01:46:24.520 cancer signal present and this person should be worked up for cancer, you'd really like to know,

01:46:30.160 well, where is that cancer likely reside? Because that's where you should start your workup. And you

01:46:35.260 want it to be pretty accurate. So if the algorithm detects a cancer and it's really a head and neck

01:46:40.760 cancer, you'd like the test to also say it's likely head and neck and then do an endoscopy

01:46:45.300 and not have to do lots of whole body imaging or a whole body PET CT or things like that.

01:46:52.200 So we developed something called a cancer site of origin. And so today the test has that. If you

01:46:57.300 get a signal detected, it also predicts where the cancer is. And it gives like a top two choice,

01:47:02.900 top two choices. It's about 90% accurate in doing that. But how does that work? The physicians and

01:47:10.040 patients who have gotten that have described it as kind of magic, that it detects the cancer and predicts it.

01:47:14.840 And it's based on the methylation patterns. So methylation is what determines cell identity and

01:47:21.660 cell state. So again, DNA code is more or less the same in your cells, but the methylation patterns

01:47:28.340 are strikingly different. When a cell replicates, why does it continue to be the same type of cell?

01:47:34.580 When an epithelial cell replicates, it has the same DNA as a T cell or a heart cell, but it doesn't become those,

01:47:41.040 it stays the same. It's because the methylation pattern, those exact methylation states on the 28 million

01:47:46.580 are also replicated. So just in the same way DNA has a way of replicating the code, there's an enzyme

01:47:52.220 that looks and copies the pattern to the next cell. And so that exact code determines, again,

01:48:00.980 is it a colonic epithelial cell or a fallopian epithelial cell or whatever it is. And so we knew

01:48:06.980 that the only way to make a predictor in the cell-free DNA is to have that atlas of all the

01:48:14.480 different methylation patterns. And so with a collaborator, a guy named Yuval Dor at the Hebrew

01:48:20.160 University of Jerusalem, we laboriously got surgical remnants from healthy individuals. He developed protocols

01:48:27.620 to isolate the individual cell types of most of the cells that get transformed in cancer.

01:48:34.860 And then we got pure methylation patterns where we sequenced, like sequencing the whole genome,

01:48:40.000 sequenced the whole methylome of all those cell types. And we published that a year ago

01:48:44.160 as the first atlas of the human methylome in all of the major cell types. And so for the first time,

01:48:51.300 we could say, hey, this is the code which makes you a beta islet cell in the pancreas that makes

01:48:57.800 insulin versus something else. Interestingly, there's only one cell in the body where the insulin promoter

01:49:05.940 is not methylated. And that is the beta islet cell. Every other single cell, that promoter is heavily

01:49:12.240 methylated because it shouldn't be making insulin. It's those kinds of signals that when you have the

01:49:18.840 cell-free DNA and you look at the methylation pattern allows the algorithm to predict, hey,

01:49:23.120 this isn't just methylation signal that looks like cancer. The patterns and what's methylated and what's

01:49:29.480 not methylated looks like colorectal tissue or a colorectal cancer. And that's how the algorithm does it.

01:49:37.080 And so this atlas, again, was a real breakthrough for diagnostics and it made cancer site of origin

01:49:44.080 useful. It's also being used for lots of MRD or those cancer monitoring tests too, because it's so

01:49:50.180 sensitive. But it also brought up this interesting possibility, which is if you're going to develop

01:49:55.320 therapeutics or you want to, say, rejuvenate cells or repair them that have changed or become

01:50:01.760 pathologic, what if you compare the methylation pattern in the good state versus the bad state?

01:50:07.020 Does that then tell you the exact positions that need to be fixed? And then with another technology,

01:50:13.440 which can go and flip those states, will that reverse or rejuvenate the cell to the original or

01:50:21.760 desired state?
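
The comparison posed here, a reference ("good") methylation state against a current ("bad") one, amounts to a diff over sites. The sketch below is a minimal illustration under strong simplifying assumptions: each site is treated as simply methylated or not, and the site names and the function `sites_to_flip` are hypothetical. It says nothing about whether flipping those sites would actually restore function, which the conversation presents as an open hypothesis.

```python
def sites_to_flip(reference_state, current_state):
    """Compare two methylation maps (site -> True if methylated) and list
    the sites that differ, grouped by the edit that would restore the
    reference state. Sites missing from current_state are left alone."""
    remethylate = sorted(
        site for site, methylated in reference_state.items()
        if methylated and not current_state.get(site, methylated)
    )
    demethylate = sorted(
        site for site, methylated in reference_state.items()
        if not methylated and current_state.get(site, methylated)
    )
    return {"remethylate": remethylate, "demethylate": demethylate}


# Hypothetical states for four sites in one cell type.
reference = {"cpg_1": True, "cpg_2": True, "cpg_3": False, "cpg_4": False}
current = {"cpg_1": True, "cpg_2": False, "cpg_3": True, "cpg_4": False}

plan = sites_to_flip(reference, current)
print(plan)
```

Here one site has lost its methyl group and one has gained one, matching the remark that some places gain methylation and some lose it.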

01:50:23.600 So Alex, unlike the genome, which doesn't migrate so much as we age, I mean, obviously it accumulates

01:50:29.940 mutations, but with enough people, I guess we can figure that out pretty quickly. Do you need

01:50:35.880 longitudinal analysis of a given individual, i.e. within an individual to really study the

01:50:43.260 methylome? Do you need to be able to say, boy, in an ideal world, this is what Peter's epigenome

01:50:49.820 looked like when he was one year, you know, at birth, one year old, two, three, four, 50 years old,

01:50:54.660 so that you could also see not just how does the methylation site determine the tissue specificity

01:51:04.820 or differentiation, but how is it changing with normal aging as well?

01:51:11.860 I think a lot of it is not individual specific. I'll give you an example. So I've done a fair amount

01:51:17.580 of work in T cells. And if you look at, say, exhausted effector T cells versus naive memory

01:51:24.540 cells, where younger individuals tend to have more of those, and it gives them more reservoir

01:51:29.900 to do things like fight disease, fight cancer. There's very distinct methylation changes. Certain

01:51:36.720 genes get methylated or demethylated. And those changes seem to be, again, very correlated with this

01:51:44.120 change in T cell function. My belief is that those represent fundamental changes as the T cell

01:51:52.240 population gets aged, and you end up with more and more T cells that, relatively speaking, are useless.

01:51:58.440 And so if you wanted to rejuvenate the T cells, repairing those methylation states is something that

01:52:04.380 would benefit everyone. Now, there are definitely a small percentage of methylation sites that are

01:52:11.220 probably drifting or degrading, and those could be specific to individuals. There's some gender

01:52:16.840 specific sites, for sure. There's some ethnic ones. But big, big changes seem to happen more with loss of

01:52:25.680 function, big changes in age that are probably common across individuals, or in the case of cancer, we also

01:52:34.660 have profound changes. When you think about this space, a term comes up. If folks have been kind of

01:52:42.400 following this, they've probably heard of things called Yamanaka factors. In fact, a Nobel Prize was

01:52:47.980 awarded to Yamanaka for the discovery of these factors. Can you explain what they are and what role they

01:52:55.980 play in everything you were discussing?

01:52:58.780 What Yamanaka and colleagues discovered is that if you take fully differentiated cells, for example,

01:53:07.060 fibroblasts, and you expose them to a particular cocktail of four transcription factors, that the

01:53:13.940 cell reverts to a stem cell-like state. And these are called induced pluripotent stem cells. You subject

01:53:21.440 a differentiated cell that was a mature cell of a particular type. I think most of their work was in

01:53:27.560 fibroblasts. And the cell, when it's exposed to these transcription factors, and these transcription

01:53:33.560 factors are powerful ones at the top of the hierarchy, they unleash a huge number of changes

01:53:39.700 in gene expression. Genes get turned on, get turned off. And then ultimately, if you keep letting it going,

01:53:46.840 you end up with something that is a type of stem cell. And why this was so exciting is it gave the

01:53:54.080 possibility to create stem cells through a manufactured process. As you know, there's a lot

01:53:59.720 of controversy about getting stem cells from embryos or other sources. This created a way now to create

01:54:07.180 stem cells and use them for medical research by just taking an individual's own cells and kind of

01:54:13.040 de-differentiating it back to a stem cell.

01:54:15.400 How much did that alter the phenotype of the cell itself? In other words, the fibroblast has

01:54:24.300 a bunch of phenotypic properties. What are the properties of a stem cell and how much of that is

01:54:31.040 driven by the change in methylation? In other words, I'm trying to understand how these transcription

01:54:37.380 factors are actually exerting their impact throughout this regression, for lack of a better word.

01:54:42.860 We refer to cell-type specific features as somatic features, like a T-cell receptor. That's a feature

01:54:50.260 of a T-cell or a dendrite or an axon would be for a neuron or an L-type calcium channel for a cardiac

01:54:57.080 myocyte. So those are very cell-type specific features. So if you turn on these Yamanaka factors and you

01:55:03.920 go back to a pluripotent stem cell, you lose most of these. And that word pluripotent means the

01:55:10.840 potential to become anything, at least in theory. So you lose most of these cell-type specific

01:55:16.860 features. So the use of the iPSCs is then to re-differentiate them. And that's what people have

01:55:24.120 been attempting to do. And it opened up the ability to do that, which is you create this

01:55:28.340 stem cell that now potentially has the ability to be differentiated into something else. You give it a

01:55:34.140 different cocktail and you try to make it a neuron or a muscle cell, and then use that in a tissue

01:55:41.020 replacement therapy. And there's a lot of research on that and a lot of groups trying to do that.

01:55:46.220 You also asked about what is the relationship between that and the epigenetics and methylation

01:55:50.740 state. That has not been well explored. And that's something that I and others are excited to do,

01:55:56.620 because it could be that you're indirectly affecting the epigenome with these Yamanaka factors,

01:56:02.460 and that if you translated that into an epigenetic programming protocol, you could have a lot more

01:56:08.520 control over it. Because one of the challenges with the Yamanaka factors is if you do this for

01:56:14.900 long enough, eventually the stem cell becomes something much more like a cancer cell and just

01:56:21.000 becomes kind of unregulated growth. And so again, huge breakthrough in learning about this kind of

01:56:27.820 cell reprogramming and de-differentiation, but our ability to use it in a practical way for tissue and

01:56:35.100 cell replacements is not there. My hope is that by converting it to an epigenetic level, it'll be more

01:56:41.700 tractable. You mentioned that this is typically done with fibroblasts. I assume the experiment has been

01:56:47.160 done where you sprinkle Yamanaka factors on cardiac myocytes, neurons, and things like that. Do they not

01:56:53.960 regress all the way back to pluripotent stem cells? I think to varying extents. I mean, if you truly have

01:57:00.400 a pluripotent stem cell, I guess in theory, it shouldn't matter where it came from, right? Because

01:57:05.280 it's pluripotent. So with developmental factors, where did your first neurons come from? You had a

01:57:11.400 stem cell, and then in the embryo or the fetus, there were factors that then coaxed that stem cell to

01:57:18.020 become these other types of cells and tissues. So if it's truly pluripotent, you should be able to do

01:57:24.080 that. Now, I think you're getting at something which is different, which is called partial

01:57:27.860 reprogramming. Yamanaka and the people who have followed his work, they're trying to do this thing, which

01:57:33.640 is to kind of stop halfway. So what if you took a heart cell or a T cell that's lost a lot of function,

01:57:41.360 and you give it these Yamanaka factors, but you stop it before it really loses its cell identity,

01:57:48.700 will it have gained some properties of its higher functioning youthful state without

01:57:53.500 having lost it? And so there's some provocative papers out there on this. There's a guy, Juan Carlos

01:58:00.520 Izpisua Belmonte, who's done some work on this and has some very provocative results in mice from doing these

01:58:06.620 partial reprogramming protocols and rejuvenating. Again, it's mice, so all the usual caveats,

01:58:13.480 but getting very striking improvements in function, in eyesight, cognition, again, in these

01:58:19.380 mouse metrics. So certainly interesting in trying to understand how that might be able to translate to

01:58:25.040 humans. Again, the worry there would be that if you don't control it, then you could make essentially

01:58:31.360 a tumor. So it's opened up that whole area of science that it's possible to do these kinds of

01:58:37.640 dramatic de-differentiations, how to really harness that in a context of human rejuvenation.

01:58:44.440 We don't know how to do that yet, but there's a lot of people trying to figure that out.

01:58:48.940 If you had to guess with a little bit of optimism, but not pie in the sky optimism,

01:58:53.940 where do you think this field will be in a decade? There was a day when a decade sounded like a long

01:59:01.540 time away. It doesn't sound that long anymore. Decades seem to be going by quicker than I remember.

01:59:07.100 So it's going to be a decade pretty soon, but that's still a sizable amount of time for the field to

01:59:12.940 progress. What do you realistically think can happen with respect to addressing the aging phenotype

01:59:22.880 vis-a-vis some method of reversal of aging, some truly geroprotective intervention?

01:59:32.320 So I'm optimistic and I'm a believer. I think for specific organs and tissues and cell types,

01:59:40.300 there will be treatments that rejuvenate them. It's hard to see in a decade that there's just a

01:59:45.260 complete rejuvenation of every single cell and tissue in a human, but joint tissues,

01:59:52.320 the retina, immune cells. We're learning so much about the biology related to rejuvenation and

02:00:00.960 healthier states of them. And then in combination with that, the tools to manipulate them, which is

02:00:06.200 equally important. You could understand what the biology is, but not have a way to intervene.

02:00:09.820 The tools to go in and edit these at a genomic level, to edit it at an epigenetic level,

02:00:16.960 to change the state and the delivery technologies to get them to very specific tissues and organs

02:00:23.520 is also progressing tremendously. So I definitely see a world in 10 years from now where we may have

02:00:30.120 rejuvenation therapies for osteoarthritis, rejuvenation for various retinopathies, where

02:00:37.320 we can rejuvenate whole classes of immune cells that make you more resistant to disease,

02:00:42.820 more resistant to cancer. I think we'll see things that will have real benefits in improving health

02:00:49.300 span. Alex, this is an area that I think truly excites me more than anything else in all of

02:00:56.820 biology, which is to say, I don't think there's anything else in my professional life that grips my

02:01:03.880 fascination more than this question. Namely, if you can revert the epigenome to a version that

02:01:13.800 existed earlier, can you take the phenotype back with you? And that could be at the tissue level,

02:01:20.100 as you say, could I make my joints feel the way they did 25 years ago? Could it make my T cells

02:01:27.600 function as they did 25 years ago? And obviously one can extrapolate from this and think of the entire

02:01:33.440 organism. So anyway, I'm excited by the work that you and others in this field are doing

02:01:39.040 and grateful that you've taken the time to talk about something that's really no longer your main

02:01:44.060 project, but something for which you provide probably as good a history of as anyone vis-a-vis

02:01:50.580 the liquid biopsies. And then obviously a little bit of a glimpse into the problem that obsesses you

02:01:54.740 today. Awesome. Well, fun chatting with you as always, Peter. Glad to have the opportunity to dive

02:02:00.260 in deep with this. There aren't many places to do this. Thank you. Thanks, Alex. Thank you for listening

02:02:05.680 to this week's episode of The Drive. It's extremely important to me to provide all of this content

02:02:10.720 without relying on paid ads. To do this, our work is made entirely possible by our members. And in

02:02:16.280 return, we offer exclusive member-only content and benefits above and beyond what is available for free.

02:02:23.000 So if you want to take your knowledge of this space to the next level, it's our goal to ensure

02:02:26.880 our members get back much more than the price of the subscription. Premium membership includes

02:02:31.740 several benefits. First, comprehensive podcast show notes that detail every topic, paper, person,

02:02:38.960 and thing that we discuss in each episode. And the word on the street is nobody's show notes rival

02:02:44.220 ours. Second, monthly ask me anything or AMA episodes. These episodes are comprised of detailed

02:02:51.380 responses to subscriber questions typically focused on a single topic and are designed to offer a great

02:02:57.340 deal of clarity and detail on topics of special interest to our members. You'll also get access

02:03:02.060 to the show notes for these episodes, of course. Third, delivery of our premium newsletter, which is put

02:03:08.160 together by our dedicated team of research analysts. This newsletter covers a wide range of topics related

02:03:14.100 to longevity and provides much more detail than our free weekly newsletter. Fourth, access to our

02:03:21.160 private podcast feed that provides you with access to every episode, including AMAs, sans the spiel you're

02:03:27.400 listening to now and in your regular podcast feed. Fifth, The Qualys, an additional member-only podcast

02:03:34.740 we put together that serves as a highlight reel featuring the best excerpts from previous episodes of

02:03:40.700 The Drive. This is a great way to catch up on previous episodes without having to go back and

02:03:45.200 listen to each one of them. And finally, other benefits that are added along the way. If you want

02:03:50.400 to learn more and access these member-only benefits, you can head over to peterattiamd.com forward slash

02:03:56.900 subscribe. You can also find me on YouTube, Instagram, and Twitter, all with the handle

02:04:01.980 peterattiamd. You can also leave us a review on Apple Podcasts or whatever podcast player you use.

02:04:08.660 This podcast is for general informational purposes only and does not constitute the practice of

02:04:13.940 medicine, nursing, or other professional healthcare services, including the giving of medical advice.

02:04:19.420 No doctor-patient relationship is formed. The use of this information and the materials linked to this

02:04:25.240 podcast is at the user's own risk. The content on this podcast is not intended to be a substitute for

02:04:31.180 professional medical advice, diagnosis, or treatment. Users should not disregard or delay in obtaining

02:04:36.800 medical advice for any medical condition they have, and they should seek the assistance of their

02:04:41.760 healthcare professionals for any such conditions. Finally, I take all conflicts of interest very

02:04:47.100 seriously. For all of my disclosures and the companies I invest in or advise, please visit

02:04:52.720 peterattiamd.com forward slash about where I keep an up-to-date and active list of all disclosures.


The Peter Attia Drive - February 19, 2024
