#290 ‒ Liquid biopsies for early cancer detection, the role of epigenetics in aging, and the future of aging research | Alex Aravanis, M.D., Ph.D.
Episode Stats
Length
2 hours and 5 minutes
Words per Minute
165.39502
Summary
Alex Aravanis is the CEO and Co-Founder of Moonwalk Biosciences, a company that focuses on genome sequencing and personalized medicine. In this episode, we discuss the evolution of DNA sequencing, epigenetics, and the biology of aging.
Transcript
00:00:00.000
Hey, everyone. Welcome to the Drive podcast. I'm your host, Peter Attia. This podcast,
00:00:16.540
my website, and my weekly newsletter all focus on the goal of translating the science of longevity
00:00:21.520
into something accessible for everyone. Our goal is to provide the best content in health and
00:00:26.720
wellness, and we've established a great team of analysts to make this happen. It is extremely
00:00:31.660
important to me to provide all of this content without relying on paid ads. To do this, our work
00:00:36.960
is made entirely possible by our members, and in return, we offer exclusive member-only content
00:00:42.700
and benefits above and beyond what is available for free. If you want to take your knowledge of
00:00:47.940
this space to the next level, it's our goal to ensure members get back much more than the price
00:00:53.200
of the subscription. If you want to learn more about the benefits of our premium membership,
00:00:58.020
head over to peterattiamd.com forward slash subscribe. My guest this week is Alex Aravanis.
00:01:06.920
Alex is the CEO and co-founder of Moonwalk Biosciences. I should note up front that I am
00:01:13.160
also an investor in and an advisor to Moonwalk Biosciences. Alex and I were colleagues in medical
00:01:20.020
school, so I've known Alex for a little over 25 years now. Before Moonwalk, Alex was Illumina's
00:01:25.600
chief technology officer, the SVP and head of research and product development, and under his
00:01:31.440
leadership, Illumina launched the industry-leading product for generating and analyzing most of the
00:01:36.980
world's genomic data. He developed large genome-based research and clinical applications, including
00:01:42.660
whole genome sequencing for rare disease diagnoses, comprehensive genomic profiling for cancer,
00:01:48.400
and for selecting optimal therapies, and the most advanced AI tools for interpreting genomic
00:01:54.420
information. Alex has been the founder of several biotech and healthcare companies, including Grail
00:01:59.780
Bio, where he served as the chief science officer and head of R&D. At Grail, he led the development of
00:02:06.180
its multi-cancer early screening test, Galleri, which we'll discuss at length in this podcast. He holds over
00:02:12.300
30 patents and serves on the scientific advisory board for several biotechnology companies.
00:02:17.240
Alex received his master's and his PhD in electrical engineering and his MD from Stanford University
00:02:23.980
and his undergrad in engineering from Berkeley. In this episode, we talk about two related things,
00:02:31.500
liquid biopsies and epigenetics. We cover the evolution of genome sequencing and tumor sequencing.
00:02:38.100
We then speak at length about Alex's work with Grail and liquid biopsies, including an understanding of
00:02:44.340
cell-free DNA, methylation, sensitivity, specificity, along with the positive and negative predictive
00:02:50.100
value of liquid biopsies. We then get into epigenetics, methylation, and the biology of aging.
00:02:56.660
This is an especially complicated topic, but truthfully, there are few topics in biology today
00:03:02.280
that excite me more than this. And I suspect that my enthusiasm will come across pretty clearly here.
00:03:08.820
So without further delay, please enjoy my conversation with Alex Aravanis.
00:03:19.020
Hey, Alex, great to be sitting down with you here today. I kind of wish we were doing this in person
00:03:24.040
because we haven't seen each other in person in a few months. And even that was sort of a chance
00:03:28.040
meeting. So I guess by way of background, you and I go back over 20 years now, I guess it's 25 years
00:03:34.880
that we both started med school together. It's hard to believe it's been that long, huh?
00:03:40.380
It seems like a million years ago, but it also seems like yesterday. Yeah, those are good times.
00:03:45.240
So Alex, one of the things I remember when we first met was that we pretty much clicked over
00:03:50.020
the fact that we were both engineers coming in. And we had a good group of friends that I remember in
00:03:54.220
medical school. And the one thing we had in common is not one of us was a pre-med. We were all kind of
00:03:58.720
whatever the term was they used to describe as non-traditional path to medical school. So let's
00:04:04.960
talk a little bit about just briefly your background. You came in as an electrical engineer and then you
00:04:09.800
did a PhD in a lab of a very prominent scientist by the name of Dick Tsien. Maybe tell folks a little
00:04:15.280
bit about what you did in that work and what it was that got you excited enough about science to
00:04:22.640
Yeah, my PhD was in electrical engineering and Stanford has a cool configuration on the
00:04:28.900
campus where the engineering school is literally across the street from the medical school.
00:04:33.060
And so over time, I became more and more interested in applying signal processing techniques, circuit
00:04:40.020
design, imaging, AI, things like that to the problems in medicine that were more interesting
00:04:45.260
to me than some of the traditional engineering products and things like that. Met a world famous
00:04:51.060
neuroscientist named, as you mentioned, Dick Tsien, who was very interested in fundamental questions
00:04:56.200
about the quantum unit of communication in the brain, which is the individual synaptic vesicle.
00:05:02.260
And there was a question of just what did it look like and how did it operate? And it was the
00:05:07.060
beginning for me of just applying these engineering tools to really important questions in biology and
00:05:11.460
helping answer them. That first story was a great article in Nature where we definitively answered
00:05:17.320
the question of how that quantum is transmitted between cells. And then went on to do several
00:05:25.000
Can you say a little bit about that? How is that information transmitted?
00:05:28.700
It was really fun to come at these problems with an engineering and communications background.
00:05:32.620
But if you look at a central neuron in the brain, and you look at the rate at which information is
00:05:38.660
transferred, it seemed to be much faster than the number of synaptic vesicles in the terminal,
00:05:44.660
right? So there was this, well, there's only 30 synaptic vesicles in the terminal by like an
00:05:49.420
electron microscope, yet you're seeing hundreds of transmissions over a few seconds. So how is that
00:05:55.920
possible? And there were various theories. One was that an individual vesicle was fusing and staying
00:06:02.280
fused and pumping neurotransmitter through it without collapsing. And that's how you could get these
00:06:07.460
so much more rapid puffs. We came up with a cute term, which was called kiss and run to explain
00:06:13.440
the phenomenon. And it, again, helped answer this fundamental question of how did the brain get
00:06:19.740
so many small neurons yet able to transmit so much information per individual connection.
00:06:26.780
So Alex, if you think about all the things that you learned during your PhD, I mean, I guess one of
00:06:32.040
the benefits of doing it where you did it in the lab you did it in was you overlapped with some other
00:06:38.280
really thoughtful folks, including a previous guest on the podcast, Karl Deisseroth. What do you think
00:06:43.500
were the most important things you learned philosophically, not necessarily technically,
00:06:48.700
that are serving you in the stuff we're going to talk about today? So we're going to talk today a lot
00:06:54.800
about liquid biopsies. We're going to talk a lot about epigenetics. We're going to talk a lot about
00:07:00.600
certainly technologies that have made those things possible. And when you think back to your background in
00:07:07.440
double E, what were the transferable skills? So I think one of them, and it's a saying in
00:07:12.460
engineering, which is if you can't build it, you don't understand it. So simply understanding a
00:07:16.980
description of something is not the same as you can build it up from scratch. And so you can't always
00:07:22.000
do that in biology, but you can do experiments where you're testing the concept of, can I really make it
00:07:27.760
work? And so I think that was an engineering concept that served me well a lot. Another, it's not
00:07:33.680
exclusive to engineering, but was being very first principled. Do we really understand how this
00:07:38.580
works? In that particular lab, there's a big emphasis on doing experiments where you always
00:07:45.180
learn something, where, you know, regardless of whether or not it confirmed or rejected your
00:07:50.300
hypothesis, you learn something new about the system. Don't do experiments where you may just
00:07:56.120
not learn anything. That was a very powerful way to think about things.
00:08:00.400
So we'll fast forward a bit, just for the sake of time, there's obviously an interesting detour where
00:08:06.020
after I go off to residency and after you finish your PhD, we still find ourselves back together
00:08:11.860
side by side in the same company for four years, which again, brought many funny stories, including
00:08:19.160
my favorite is you and I getting lost in the middle of Texas, actually not in the middle of Texas,
00:08:25.580
but just outside of El Paso and nearly running out of gas. I mean, this was no cell signal.
00:08:34.020
We were in trouble, but we somehow made it out of that one together.
00:08:36.900
Yes. Yeah. No, I remember that, that us Californians thought that there must be a Starbucks within,
00:08:42.960
you know, 10, 15 miles out in the middle of West Texas. And it turns out you can go hundreds of miles
00:08:49.580
That's right. Passing a gas station with an eighth of a tank saying, we'll stop at the next one can be
00:08:55.900
a strategic error. There was also the time you bailed me out when I forgot my cufflinks because
00:09:02.240
you had some dental floss. Do you remember that? I don't know if you remember that.
00:09:08.940
That's right. Yeah. Total MacGyver move. But anyway, let's fast forward to all that stuff. So
00:09:12.940
I don't know what year it is. It's got to be circa what, 2012. When do you end up at Illumina
00:09:22.560
Okay. Talk to me about that role. What was it that you were recruited to Illumina to do? And maybe
00:09:27.160
just tell folks who don't know what Illumina is a little bit about the company as well.
00:09:33.060
Yeah. So today, Illumina is the largest maker of DNA sequencing technologies. So when you hear about
00:09:41.760
the human genome being sequenced, things like expression data or RNA-seq, most liquid biopsies,
00:09:48.940
most tumor sequencing, finding genetic variants in kids with rare disease, most of that is done
00:09:55.260
with Illumina technology. So they also make the chemistries that process the DNA, the sequencers
00:10:00.860
that generate that information, and also the software that helps analyze it. So it really took that tool
00:10:07.960
from a very niche research technology to a standard of care in medicine and hundreds of thousands of
00:10:14.720
publications and tremendously has been advancing science. So 11 years ago, you showed up there.
00:10:21.540
What was the role you were cast in? This was earlier on in Illumina's history. What attracted me to the
00:10:27.640
company and why I was recruited was to help develop more clinical applications and more applied
00:10:34.160
applications of the technology. So the technology had a use by certain sequencing aficionados for basic
00:10:41.280
research. But the company and I agreed with the vision felt that, hey, this could be used for a lot
00:10:46.460
more. This could be used to help every cancer patient. This could be used to help people with genetic
00:10:51.780
diseases. How can we develop the technology and other aspects of it, the assays and software, to make that
00:10:58.540
reality? I was hired to do that. It occurred to me when you even said a little bit of that, Alex, that
00:11:04.560
many of us, you and I would take for granted some of the lingo involved here, sequencing and what's
00:11:11.560
involved. But I still think it might be a bit of a black box to some people listening. And given
00:11:17.140
the topics we're going to cover today, I think just explaining to people, for example, what was done
00:11:24.860
in the late 90s, early 2000s when, quote unquote, the human genome was sequenced? What does that mean?
00:11:31.260
And how had that changed from the very first time it was done by sheer brute force in the most analog
00:11:38.240
way until even when you arrived 10, 11 years ago? So maybe walk us through what it actually means
00:11:46.480
to sequence a genome. And feel free to also throw in a little bit of background about some of the basics
00:11:52.080
of DNA and the structure, et cetera, as it pertains to that. It's some really important fundamental
00:11:56.800
stuff. A quick primer on human genetics. So in most cells of the body, you have 23 pairs of
00:12:04.220
chromosomes. They're very similar except the X and Y chromosome, which are obviously different in men
00:12:09.660
and women. Each one of those chromosomes is actually a lot of DNA packed together in a very orderly way,
00:12:17.520
where the DNA is wrapped around proteins called nucleosomes, which are composed of histones.
00:12:23.480
And then it's packed into something called chromatin, which is this mass of DNA and proteins.
00:12:28.820
And again, packed together, and then you make these units of chromosomes. Now, if you were to unwind
00:12:34.260
all of those chromosomes, pull the string on the sweater and completely unwind it, and you were to line
00:12:41.060
all of them end to end, you would have 3 billion individual bases. So the ATCG code at any given
00:12:49.860
one of those 3 billion positions, you would have a string of letters. Each one would either be A, T, C, or G,
00:12:56.040
and it would be 3 billion long. So to sequence a whole human genome is to read out that code for an
00:13:03.480
individual. And once you do that, you then know their particular code at each of those positions.
00:13:09.520
So at the end of the last century, that was considered quite a daunting task. But as I think
00:13:18.260
our country has often done, decided that it was a very worthy one to do, along with several other
00:13:24.000
leading countries that believe strongly in science. And so they funded the Human Genome Project. So all
00:13:29.240
over the world at centers, people were trying to sequence bits of this 3 billion bases to comprise
00:13:35.660
the first complete human genome. So it's just quite famous. There were two efforts. One was a public
00:13:41.580
effort led by the NIH and Francis Collins at the time. They had a particular approach where what they
00:13:48.320
were doing was they were cutting out large sections of the genome, and then using an older type of
00:13:55.540
sequencing method called capillary electrophoresis to sequence each of those individual bases.
00:14:00.740
There was a private effort led by Craig Venter and a company called Celera, which took a very
00:14:07.220
different approach, which is they cut up the genome into much, much smaller pieces, pieces that were so
00:14:13.780
small that you didn't necessarily know a priori what part of the genome they would come from, which is
00:14:19.860
why they were doing this longer, more laborious process through the public effort. But there was a big
00:14:24.800
innovation, which is they realized that if you had enough of these fragments, you could, using a
00:14:29.960
mathematical technique, reconstruct it from these individual pieces, where you could take individual
00:14:35.680
pieces, look at where they overlapped. And again, we're talking about billions of fragments here,
00:14:40.640
and you can imagine mathematically reconstructing that. Very computationally intensive, very complex.
00:14:46.600
But the benefit of that is that you could generate the data much, much faster. And so in a fraction of the
00:14:52.480
time and for a fraction of the money, they actually caught up to the public effort and then culminated
00:14:57.740
in each having a draft of a human genome around the same time in late 2000, early 2001. And then
00:15:06.020
simultaneously in Nature and Science, we got the first draft of a human genome, a milestone in science.
00:15:12.960
Alex, what were the approximate lengths of the fragments that Celera was breaking DNA down into?
00:15:19.080
They were taking chunks out in individual megabases, so like a million bases at a time. And then they would
00:15:26.800
isolate that and then deconstruct it even into smaller pieces, which were kilobase fragments,
00:15:33.200
a thousand bases at a time. And again, so they would take a piece of the puzzle, but they would
00:15:37.740
know which piece it was, and then break that into smaller and smaller ones. And then after you had the
00:15:43.240
one kilobase sequences, they would put it all back together versus just to contrast that with the
00:15:48.820
private effort, which they called shotgun sequencing, which is you just took the whole thing,
00:15:53.720
ground it up, brute force sequenced it, and then use the informatics to figure out what went where.
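To make the "use the informatics to figure out what went where" idea concrete, here is a minimal sketch of overlap-based reassembly. It is a toy greedy merge, not the assembler either genome effort actually used; the function names, the tiny example fragments, and the `min_len` threshold are all hypothetical, and real genomes contain repeats that defeat a greedy approach this simple.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 when no overlap of at least `min_len` bases exists.
    """
    max_len = min(len(a), len(b))
    for k in range(max_len, min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(fragments: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the two fragments with the largest overlap (toy example)."""
    # Drop exact duplicates and fragments fully contained in another fragment.
    frags = list(dict.fromkeys(fragments))
    frags = [f for f in frags if not any(f != g and f in g for g in frags)]

    while len(frags) > 1:
        best_k, best_i, best_j = 0, -1, -1
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # nothing left that overlaps; stop with several contigs
        merged = frags[best_i] + frags[best_j][best_k:]
        frags = [f for idx, f in enumerate(frags) if idx not in (best_i, best_j)]
        frags.append(merged)
    return frags


# Hypothetical shuffled reads taken from the made-up sequence ATGCGTACGTTAGC
reads = ["GTTAGC", "ATGCGTAC", "GTACGTTA"]
print(greedy_assemble(reads))  # ['ATGCGTACGTTAGC']
```

The pairwise comparison is quadratic in the number of fragments, which hints at why doing this for billions of fragments was so computationally intensive.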
00:16:00.040
And in the shotgun, how small were they broken down into?
00:16:03.280
They got down to kilobase and hundred base, multi-hundred base fragments. But the key was,
00:16:09.120
all you had to do was just brute force keep sequencing, as opposed to this more artisanal
00:16:14.440
approach of trying to take individual pieces and deconstruct them and then reconstruct them.
00:16:19.260
So it's early 2001. This gets published. By the way, do we know the identity of the individual?
00:16:24.480
I think we do know the identity of the individual who was sequenced, don't we? I can't recall.
00:16:28.980
I think the original one was still anonymous and likely to be a composite of multiple individuals,
00:16:37.280
Yeah. Soon after, there were individuals. Craig Venter, he may have been the first
00:16:41.720
individual who was named that we had the genome for.
00:16:45.100
Got it. It's often been said, Alex, that that effort costs, at the end of that sequencing,
00:16:51.540
if you decided, I want to now do one more person, it would cost a billion dollars directionally
00:16:56.580
to do that effort. What was the state of the art in transitioning that from where it was,
00:17:07.200
let's just say, order of magnitude, 10 to the 9th dollars per sequence, to where it was
00:17:13.960
10 years later, approximately? What was the technology introduction or plural version of
00:17:21.520
that question that led to a reduction? And how many logs did it improve by?
00:17:27.520
We went back and did this analysis. So if you literally at the end of the original human
00:17:31.780
genome said, Hey, I want to do one more. And you have the benefit of all the learnings from the
00:17:36.280
previous one, a few hundred million dollars would have been an incremental genome. By 2000,
00:17:43.980
well, it was low tens of thousands of dollars. So let's call that four or five logs of improvement.
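As a rough check on the "logs of improvement" arithmetic, the sketch below computes the base-10 logarithm of the cost ratio. The dollar figures are assumptions chosen to match the ranges mentioned in the conversation ("a few hundred million", "low tens of thousands", and the $25,000 figure that comes up next), not exact historical prices.

```python
import math


def logs_of_improvement(cost_before: float, cost_after: float) -> float:
    """Orders of magnitude (base-10 logs) by which the cost dropped."""
    return math.log10(cost_before / cost_after)


# Assumed figures, for illustration only
incremental_genome_early = 3e8   # "a few hundred million dollars"
genome_later = 2.5e4             # "low tens of thousands of dollars"

print(round(logs_of_improvement(incremental_genome_early, genome_later), 1))  # 4.1
```

With these assumed inputs the drop is a little over four logs, consistent with the "four or five" quoted above.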
00:17:52.380
And what brought that? So the day you show up at Illumina and it's, if for research purposes,
00:17:57.680
or if a very wealthy individual said, I have to know my whole genome sequence,
00:18:02.240
and they were willing to pay $25,000 for it, or a lab was doing it as part of a clinical trial or
00:18:08.700
for research, what were they buying from Illumina to make that happen?
00:18:13.900
So it was a series of inventions that allow the sequencing reactions to be miniaturized.
00:18:19.160
And then you could do orders of magnitude, more sequencing of DNA by miniaturizing it.
00:18:24.880
The older sequencers, they had a small glass tube. And as the DNA went through, you sequenced it,
00:18:29.980
it got converted into a 2D format, kind of like a glass slide, where you had tiny fragments of DNA
00:18:36.780
stuck to it, hundreds of millions, then ultimately billions. And then you sequenced all of them
00:18:42.440
simultaneously. So there was a huge miniaturization of each individual sequencing reaction, which allowed
00:18:49.340
you to just in one system generate many, many more DNA sequences at the same time. There's a very
00:18:56.280
important chemistry that was developed called sequencing by synthesis by a Cambridge chemist,
00:19:02.080
who I know well, Shankar Balasubramanian. And he developed Illumina sequencing chemistry,
00:19:07.980
which ultimately went through a company called Solexa, which Illumina acquired. And that has
00:19:12.300
generated the majority of the world's genomics data, the original chemistry that he developed in
00:19:18.340
And what was it about that chemistry that was such a step forward?
00:19:22.920
It allowed you to miniaturize the sequencing reactions. So you could have a huge number,
00:19:28.180
ultimately billions in a very small glass slide. It also allowed you to do something
00:19:33.400
which is called cyclic sequencing in a very precise and efficient and fast way, where you read off one
00:19:41.500
base at a time, and you can control it. And so you imagine you have, say, a lawn of a billion DNA
00:19:46.840
fragments, and you're on base three on every single fragment, and you want to know what base four is
00:19:51.960
on every fragment. It allowed you to simultaneously sequence just one more base on all billion
00:19:57.900
fragments, read it out across your whole lawn. And then once you read it out, add one more base,
00:20:04.640
read it all out. And so this allowed for this huge parallelization.
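The cycle-by-cycle readout described here can be sketched as a loop in which each cycle reads exactly one more base from every fragment at once. This is a simplified software analogy under stated assumptions, not a model of the actual chemistry or optics; `sequence_by_cycles` and the example fragments are hypothetical.

```python
def sequence_by_cycles(fragments: list[str], n_cycles: int) -> list[str]:
    """Read one base per cycle from every fragment simultaneously (toy model)."""
    reads = ["" for _ in fragments]
    for cycle in range(n_cycles):
        # One cycle: every fragment on the "lawn" reports its next base.
        for idx, frag in enumerate(fragments):
            if cycle < len(frag):  # fragments shorter than n_cycles simply stop
                reads[idx] += frag[cycle]
    return reads


# Three made-up fragments standing in for a lawn of billions
lawn = ["ACGTAC", "TTGACA", "GGC"]
print(sequence_by_cycles(lawn, 4))  # ['ACGT', 'TTGA', 'GGC']
```

The number of cycles sets the read length, while the number of fragments sets the throughput, which is why packing more fragments onto the slide scales output without adding cycles.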
00:20:09.240
Let's talk a little bit about where we are today. To my recollection, the last time I looked
00:20:15.440
to do a whole genome sequence today is on the order of $1,000, $500 to $1,000. Is that about accurate?
00:20:23.520
Yeah, that's way too expensive, Peter. Today, a couple hundred dollars.
00:20:29.140
Okay. So a couple hundred dollars today. I feel like I looked at this on a graph a while ago,
00:20:35.500
and it was one of the few things I noticed that was improving faster than Moore's Law.
00:20:40.980
Maybe tell folks what Moore's Law is, why it's often talked about. I think everybody's heard of
00:20:46.560
it. And maybe talk about the step function that it's basically, if I'm looking at it correctly,
00:20:51.940
there were two Moore's Laws, but there was something in between that became even a bigger
00:20:57.120
improvement. But maybe tell folks what Moore's Law is, first of all.
00:21:01.000
It's not like a law, like a law of physics or something like that, but it became an industry
00:21:05.400
trend in microprocessors. What it refers to is the density of transistors on a microchip and the
00:21:14.160
cost of the amount of computing power per amount of transistors. And that geometrically decreased
00:21:21.660
kind of in a steady way. Actually, I don't remember the exact number if it's like doubling every
00:21:26.980
two years or something like that. But there was a geometric factor to it that the industry
00:21:32.420
followed for decades. It's not quite following that anymore. I mean, transistors are getting down
00:21:37.320
to like the atomic scale, but went way faster than people had envisioned.
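To put the "faster than Moore's Law" comparison in numbers, the sketch below asks what a genome would cost if sequencing cost had merely halved every two years. The two-year period is the commonly quoted form of Moore's Law; treating it as a cost-halving rate, the starting cost of $300 million, and the 22-year span are assumptions made only for illustration.

```python
def cost_if_halving(start_cost: float, years: float,
                    halving_period_years: float = 2.0) -> float:
    """Cost after `years` if it halves once every `halving_period_years`."""
    return start_cost / 2 ** (years / halving_period_years)


# Assumed inputs: ~$300M incremental genome, tracked over 22 years
moore_like_cost = cost_if_halving(3e8, 22)
print(round(moore_like_cost))  # 146484
```

Under these assumptions a Moore's-Law-like pace would leave a genome costing well over $100,000, compared with the couple hundred dollars quoted in the conversation.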
00:21:43.720
It basically started in the late 60s. And as you said, it went until it hits the limits of atomic
00:21:50.020
Yeah. And so that relentless push is what made the whole software engineering high-tech industry possible.
00:21:56.140
So back to my question, which is, if you just look at the cost of sequencing
00:22:00.600
from 2000 till today, it's sort of like two curves. There's the relentless curve that gets to where
00:22:08.380
we are in 2013. But then there was another big drop in price that occurred after that. I'm guessing
00:22:15.500
that had to do with shotgun sequencing or the commercialization of it. I mean, not the concept
00:22:20.020
of it, which already existed. Does that sound right?
00:22:23.380
Yeah. So when Illumina really started to deliver the higher throughput next generation sequencers,
00:22:29.140
it brought along a new faster curve because of the miniaturizations. So this ability to sequence
00:22:34.840
billions of fragments in a small area, I was privileged to be a big part of this effort.
00:22:40.520
And Illumina just continuing to drive the density down, the speed of the chemistry up,
00:22:45.700
all the associated optics, engineering software around it drove that much faster than Moore's law
00:22:53.400
Were other companies involved in the culmination of next-gen sequencing?
00:22:58.740
Yeah, many. And some of them are still around. None nearly as successful as Illumina,
00:23:06.520
And today that's the industry standard. I assume there's no sequencing that's going on
00:23:12.400
No, the vast majority is next-gen sequencing. There's niche applications where there's other
00:23:17.680
approaches, but in the 99% of the data being generated, some version of next-generation
00:23:24.220
Got it. So you mentioned a moment ago that part of the effort to bring you to Illumina was
00:23:31.080
presumably based on not just your innate talents, but also the fact that you came from a somewhat
00:23:37.360
clinical background as well. You're an MD and a PhD. And if their desire is to be able to branch out
00:23:42.840
into clinical applications, that would make for a natural fit. So where in that journey did the
00:23:48.580
idea of liquid biopsies come up? And maybe talk a little bit about the history of one of the
00:23:54.160
companies in that space that we're going to talk about today.
00:23:56.840
So to start with that, I should talk about first tumor sequencing, which predated liquid biopsy.
00:24:02.260
A couple of companies, most notably Foundation Medicine, developed using Illumina technology,
00:24:08.260
developed tumor sequencing. So there had been some academic work, but they tried to develop it and
00:24:13.920
were the first to do it successfully as a clinical product. What you can imagine is there's these
00:24:19.000
genes that are implicated in cancer that often get mutated. Knowing which mutations a tumor has has
00:24:25.380
big implications for prognosis, but also for treatment. Over time, we have more and more targeted
00:24:31.000
therapies where if your tumor has a very particular mutation, it's more likely to respond to certain
00:24:36.940
drugs that target that type of tumor. And at the time, as more and more of these mutations were
00:24:43.640
identified that could be important in the treatment of a tumor, it was becoming impractical to say,
00:24:49.980
do a PCR test for every mutation. So imagine there's 100 potential mutations you'd like to know about
00:24:56.180
if a patient has in their tumor and their lung cancer, doing each of these individually. Again,
00:25:02.160
a lot of expense, a lot of false positives. And so what companies like Foundation Med did was say, hey,
00:25:08.360
why don't we just sequence all of those positions at once given next generation sequencing? So they
00:25:13.740
would make a panel to sequence, say, 500 genes or a few hundred genes, the ones that are most important
00:25:19.460
in most solid cancers. And then they would sequence them. And then in one test, they would see the vast
00:25:25.220
majority of the potential mutations that could be relevant to treatment for that cancer patient.
00:25:30.740
And so that is still a very important tool in oncology today. A large fraction of tumors are
00:25:37.160
sequenced. And that's what allows people to get access to many types of drugs. Many of the targeted
00:25:43.360
therapies for lung cancer, melanoma, or you hear about things like microsatellite instability or high
00:25:50.720
mutational burden, that all comes from tumor sequencing. Once that was established, then a few
00:25:57.600
folks, most notably at Johns Hopkins, but also other places, started to ask the question, well,
00:26:02.960
you know, could we sequence the tumor from the blood? And you might say, well, hey, you have a tumor in
00:26:08.020
your lung. Why would sequencing blood be relevant to looking at the tumor? Well, it turns out there is
00:26:14.800
tumor DNA in the blood. And this is interesting. So in the late 40s, it was first identified,
00:26:20.720
that there was DNA in the blood outside of cells, so-called cell-free DNA. And then in the 70s,
00:26:27.040
it was noticed that cancer patients had a lot of DNA outside their cells in the blood, and that some
00:26:34.540
of this was likely from tumors, from the cancer itself. If you know anything about tumor biology,
00:26:41.560
you know that cancer cells are constantly dying. So you think of cancers as growing very quickly,
00:26:46.220
and that's true, but they actually are dying at an incredible rate because it's disordered growth.
00:26:52.640
So many of the cells that divide have all kinds of genomic problems. So they die or they're cut
00:26:58.260
off from vasculature. But the crazy thing about a tumor is, yes, it's growing fast if it's an aggressive
00:27:04.120
tumor, but also the amount of cell death within that tumor is very high. And every time one of those
00:27:10.500
cells die, some of the DNA has the potential to get into the bloodstream. And so it was this insight
00:27:16.940
along with the tumor sequencing that said, hey, what if we sequence this cell-free DNA? Could we
00:27:22.920
end up sequencing some of the tumor DNA or the cancer cell DNA that's in circulation?
00:27:28.940
Early results, particularly from this group at Johns Hopkins, began to show that indeed that was
00:27:35.060
possible. And then a few companies, again, using Illumina technology, and then we started doing
00:27:41.080
it at Illumina also, our own liquid biopsy assays and tests and technologies developed what became
00:27:47.600
liquid biopsy. In this context, it was for late-stage cancer. So it was for patients who
00:27:52.740
were diagnosed with a cancer. You wanted to know, did their tumor have mutations? And you could do it from
00:27:58.180
the blood. There was a big benefit, which was, as you know, for lung cancer, taking a biopsy can be a
00:28:04.200
very dangerous proposition. You can cause a pneumothorax. You can land someone in the ICU.
00:28:11.220
You know, in rare cases, it can lead to death in that type of procedure. And so the ability to
00:28:16.180
get the mutational profile from the blood was really attractive. And so that started many companies
00:28:23.360
down the road of developing these liquid biopsies for late-stage cancers.
00:28:28.380
So Alex, let's talk about a couple of things there. Tell me the typical length of
00:28:34.080
a cell-free DNA fragment. How many base pairs, or what's the range?
00:28:38.840
Yeah, it depends on the exact context, but around 160 base pairs. So that's 160 letters of the ATCG
00:28:46.660
code. And there's a very particular reason it's that length, which is that if you pull the string
00:28:53.140
on the sweater, you unwind the chromosome, and you keep doing it until you get down to something
00:28:58.800
around 160 base pairs, what you find is that the DNA, right, it's not just naked, it's wrapped around
00:29:05.320
something called a nucleosome, which is an octamer or eight of these histone proteins in a cube,
00:29:12.500
and the DNA is wrapped around it twice. And that's the smallest unit of chromatin of this larger chromosome
00:29:19.720
structure. And so the reason it's 160 bases is that's more or less the geometry of going around
00:29:26.380
twice. And so DNA can be cleaved by enzymes in the blood, but that nucleosome protects the DNA from
00:29:35.540
being cut to anything smaller than about 160 base pairs. And does that mean that the cell-free DNA
00:29:42.600
that is found in the blood is still wrapped around the nucleosome twice, like it's still clinging to
00:29:50.100
that and that's what's protecting it from being cleaved any smaller?
00:29:53.860
You mentioned that obviously the first application of this was presumably looking for ways to figure
00:30:02.220
out what the mutation was of a person with late-stage cancer without requiring a tissue
00:30:07.060
biopsy. Presumably by this point, it was very easy to gather hundreds of 160 base pair fragments and
00:30:17.180
use the same sort of mathematics to reassemble them based on the few overlaps to say this is the actual
00:30:23.780
sequence because presumably the genes are much longer than 160 base pairs that they're looking at.
00:30:30.100
That's right. So by this point in 2014, 2015, the informatics was quite sophisticated. So you could
00:30:38.880
take a large number of DNA sequences from fragments and easily determine which gene it was associated with.
00:30:46.060
At some point I recall in here, I had a discussion on the podcast maybe a year and a half ago,
00:30:53.220
two years ago with Max Diehn, another one of our med school classmates, about looking at recurrences in
00:31:00.960
patients who were clinically free of disease. So you took a patient who's had a resection plus or minus
00:31:08.900
some adjuvant chemotherapy. And to the naked eye and to the radiograph, they appear free of disease.
00:31:16.120
And the question becomes, is that cancer recurring? And the sooner we can find out, the better our chance
00:31:22.940
at treating them systemically again, because it's a pretty well-established fact in oncology that the
00:31:29.380
lower the burden of tumor, the better the response, the lower the mutations, the less escapes, etc.
00:31:35.640
And so did that kind of become the next iteration of this technology, which was,
00:31:41.860
if we know the sequence of the tumor, can we go fishing for that particular tumor in the cell-free
00:31:49.640
Yeah, yeah. Broadly speaking, there's kind of three applications from looking at tumor DNA in the
00:31:54.360
blood. One is screening, which we'll talk about later, which is people who don't have cancer,
00:31:59.140
or 99% who don't, and trying to find the individual who has cancer, an invasive cancer,
00:32:04.980
but doesn't know it. There's this application of what we call therapy selection, which is you're a
00:32:09.300
cancer patient trying to decide which targeted therapy would be best for you. And then this other
00:32:15.520
one you mentioned is a third application we call often minimal residual disease. We're looking at
00:32:21.480
monitoring a response, which is you're undergoing treatment, and you want to know,
00:32:26.520
is the amount of tumor DNA in the blood undetectable? And also its velocity, is it changing?
00:32:33.160
Because as you mentioned, that could tell you, is your treatment working, the tumor DNA burden or
00:32:38.980
load is going down? Is it undetectable, and you're potentially cured that there's no longer that source
00:32:45.820
of tumor DNA in your body? Or is it present even after a treatment with intent to cure, and that in the
00:32:54.900
presence of that tumor DNA still means basically, and we appreciate this now, unfortunately, you have
00:33:01.360
not been cured, but that patient hasn't been cured, because there is some nidus of tissue somewhere
00:33:06.220
that still harbors these mutations, and therefore is the tumor, even if it's not detectable by any other
00:33:13.300
So at what point does this company called Grail that we're going to talk about, at what point does
00:33:23.520
it come into existence, and what was the impetus and motivation for that as a distinct entity outside
00:33:31.680
So there were several technological and scientific insights that came together, along with, as often
00:33:38.520
in this case, some really bold entrepreneurs and investors. The use of this liquid biopsy technology
00:33:46.580
in late-stage cancers, it was clearly possible to sequence tumors from the blood, and it was clearly
00:33:52.340
actually the tumor DNA, and it was useful for cancer patients. So we knew that there was tumor DNA, we knew
00:33:58.740
it could be done, but what the field didn't know is, could you just see this in early-stage cancers,
00:34:03.660
localized cancers that were small? Not a lot of data on that, but there was the potential.
00:34:10.720
There was also a really interesting incidental set of findings in a completely different application
00:34:16.300
called non-invasive prenatal testing. Again, totally different application, but it was discovered
00:34:22.700
principally by a scientist in Hong Kong named Dennis Lo that you could see fetal DNA in the blood,
00:34:30.300
or more specifically placental DNA in the blood, and it was also cell-free DNA. What he developed,
00:34:37.880
actually, along with one of our professors at Stanford, Steve Quake, was a technique to look
00:34:43.840
for trisomies in the blood based on this placental or fetal DNA, and this is called non-invasive
00:34:50.060
prenatal testing. And so what you do is you sequence the cell-free DNA fragments in a pregnant woman,
00:34:56.540
you look at the DNA, and if you see extra DNA, for example, at the position of chromosome 21,
00:35:04.560
well, that indicates that there are tissues in the woman, presumably the fetus or placenta, that are
00:35:10.700
giving off extra chromosome 21. And so this ended up being an incredibly sensitive and specific way
00:35:18.380
to test for the presence of trisomies, chromosome 21, 18, 13, early in pregnancy. And it's had a
00:35:26.740
tremendous impact. I was also involved in subsequent iterations of the test. In the United States,
00:35:31.860
it decreased amniocentesis by about 80% because the test is so sensitive and specific as a screen
00:35:38.940
that many, many women have now not had to undergo amniocentesis and the risks around. Again,
00:35:45.240
totally different application of cell-free DNA. But what happened is during the early
00:35:51.380
commercialization of about the first few hundred tests, the companies pioneering this, and one of
00:35:56.620
them was a company called Verinata that Illumina acquired, began to see in rare cases, very unusual
00:36:03.520
DNA patterns. It wasn't just a chromosome 21 or 18 or 13, but what's often called chromothripsis,
00:36:13.000
which is many, many aberrations across chromosomes. The two women who really did this analysis and
00:36:21.660
really brought both Illumina and the world's attention to it were Meredith Halks-Miller,
00:36:26.560
a pathologist and lab director at this Illumina-owned company, Verinata, and another
00:36:31.500
bioinformatics scientist, Darya Chudova. What they showed is, ultimately, that these women actually
00:36:38.420
had cancer. They were young women of childbearing age. They ultimately had healthy children,
00:36:44.840
but they had an invasive cancer and it was being diagnosed in their cell-free DNA by this
00:36:51.540
non-invasive prenatal test. And as they began to show these patterns to people, it became clear that
00:36:58.280
they were clearly cancer. If you have many, many chromosomes that are abnormal, that's just not
00:37:03.980
compatible with life or a fetus. And so when you saw this just genome-wide chromosomal changes,
00:37:11.500
it was very clear that we're incidentally finding cancer in these women.
00:37:15.680
Let's talk a little bit about that, actually, because I want to dig into that. It's so interesting.
00:37:19.760
So let's take a step back. So again, whenever you say we're sampling for cell-free DNA,
00:37:25.140
we should all be keeping in the back of our mind, we're looking for these teeny tiny little
00:37:29.440
160 base pair fragments wrapped around little nucleosomes. Now, let's just go back to the
00:37:36.120
initial use case around trisomy 21. With 160 base pairs, is that sufficient to identify any one
00:37:44.080
chromosome? Presumably, you're also sampling maternal blood, so you know what the maternal
00:37:49.220
chromosomes look like, and you're presumably juxtaposing those two as your control. Is that
00:37:55.500
part of it? Not quite. So it's all mixed together. So in a pregnant woman's blood and maternal blood,
00:38:02.140
it's a mixture. So you have cell-free DNA. The majority of the cell-free DNA is from her
00:38:07.340
own cells and tissues. And then you have superimposed on that a bit of cell-free DNA from
00:38:13.720
mostly the placenta. And so what you're seeing is this mix of cell-free DNA. And then what you do is
00:38:20.500
you sequence. There's different ways to do it, but the most common way is you do shotgun sequencing,
00:38:25.140
and you sequence millions of these fragments. And every time you sequence a fragment,
00:38:30.860
you place it in a chromosome based on its sequence. Your first fragment, you say,
00:38:35.760
hey, when I compare this to the draft human genome, this goes on chromosome two.
00:38:39.840
You sequence your third fragment and you say, hey, this sequence looks like chromosome 14.
00:38:44.960
And you keep putting them in the chromosome buckets. And what you expect, if every tissue has an
00:38:53.200
even chromosome distribution, you know, or two chromosomes, is that that profile would be flat
00:38:57.840
and each bucket would be about the same level. But what you see in a woman carrying a fetus that
00:39:04.460
has a trisomy... You'll see 50% greater in the chromosome 21 bucket.
00:39:09.920
You actually see more like 5% or 10%. Because again, remember, 90% of it might be maternal blood,
00:39:15.760
right? So that's all going to be even. But within the 10% fetal, you're going to have an extra 50%.
00:39:21.480
So the total might be an extra 5% or 10%. But that's a whopping big signal and very easy to detect.
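The dilution arithmetic in this exchange can be written out as a short sketch. The function name and the fetal fractions below are illustrative assumptions, not figures from any specific test; the only inputs taken from the conversation are "two copies versus three" and a fetal share of roughly 10%.

```python
def expected_chr21_excess(fetal_fraction: float) -> float:
    """Relative excess of chromosome 21 DNA in maternal plasma when the
    fetus carries three copies of chromosome 21 instead of two.

    fetal_fraction is the share of cell-free DNA that comes from the
    fetus/placenta (an assumed input, e.g. 0.10 for 10%).
    """
    if not 0.0 <= fetal_fraction <= 1.0:
        raise ValueError("fetal_fraction must be between 0 and 1")
    maternal_copies = (1.0 - fetal_fraction) * 2  # maternal DNA: two copies
    fetal_copies = fetal_fraction * 3             # trisomic fetus: three copies
    euploid_copies = 2.0                          # baseline with no trisomy
    return (maternal_copies + fetal_copies) / euploid_copies - 1.0


for ff in (0.10, 0.20):
    print(f"fetal fraction {ff:.0%}: chr21 excess {expected_chr21_excess(ff):.1%}")
```

With a 10% fetal fraction the excess works out to 5%, and with 20% it is 10%, which matches the "extra 5% or 10%" range mentioned above.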
00:39:29.280
Isn't it interesting? It just gives a sense of how large the numbers are if a 5% delta
00:39:34.620
is an off the charts, unmistakable increase in significance. I want to make sure again,
00:39:40.940
people understand what you just said, because it's very important. Because the majority of the
00:39:45.280
cell-free DNA belongs to the mother, and because the fetal slash placental cell-free DNA is a trivial
00:39:51.820
amount, even though by definition a trisomy means there is 50% more of one chromosome, you've gone
00:39:59.360
from two to three copies. In the fully diluted sample, that might only translate to a few percent.
00:40:07.140
But that's enough, given the large numbers that you're testing, to be a definitive,
00:40:13.820
statistically significant difference that triggers a positive test.
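To see why a few percent can be "definitive" at these read counts, here is a minimal sketch that treats read assignment as simple binomial sampling. The chromosome 21 share of reads (about 1.5%), the read totals, and the function name are all assumptions for illustration; real assays have additional sources of variance and use their own statistics.

```python
import math


def chr21_z_score(total_reads: int, chr21_share: float, excess: float) -> float:
    """Standard deviations by which the chr21 read count exceeds expectation.

    total_reads : number of sequenced fragments placed on any chromosome
    chr21_share : assumed fraction of reads landing on chr21 in a normal sample
    excess      : relative excess of chr21 reads (e.g. 0.05 for 5%)
    """
    expected = total_reads * chr21_share
    observed = expected * (1.0 + excess)
    # Binomial standard deviation of the chr21 count under the normal model.
    sd = math.sqrt(total_reads * chr21_share * (1.0 - chr21_share))
    return (observed - expected) / sd


for n in (100_000, 10_000_000):
    z = chr21_z_score(n, chr21_share=0.015, excess=0.05)
    print(f"{n:>10,} reads -> z = {z:.1f}")
```

Under these assumptions, a 5% excess is hard to distinguish from noise at a hundred thousand reads but sits many standard deviations out at ten million, which is the point being made about large numbers.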
00:40:20.480
Alex, I want to come back to the story, because this is clearly the beginning of the story.
00:40:24.300
But let's come back to just a couple other housekeeping items.
00:40:27.580
A moment ago, we talked about cell-free DNA in the context of tumor. Someone might be listening to us
00:40:31.720
thinking, wait, guys, you just said that the majority of the cell-free DNA is from this mother.
00:40:37.120
99.9% of the time, she doesn't have cancer. Where is that cell-free DNA coming from?
00:40:42.340
When cells are destroyed, either through necrosis or apoptosis, there's a lot of cell turnover,
00:40:48.600
right, of cells that replicate, especially epithelial cells, blood cells, and so on. As the natural
00:40:55.440
biochemistry destroys them, some of the DNA from the nucleus ends up in circulation. Again,
00:41:01.720
where they're wrapped around these nucleosomes. So it's essentially cell death and cell turnover
00:41:07.180
is the source of it. And since, again, at any one time, there's millions of cells dying and being
00:41:13.200
turned over, there's always some base-level cell-free DNA in the blood.
00:41:18.280
And again, I don't know if you've ever done the calculation. If not, I don't mean to put you on
00:41:21.540
the spot. But do you have an approximate guess for how many base pairs of cell-free DNA are floating
00:41:28.180
around your body or my body as we sit here right now?
00:41:31.160
What I can say is, if you took a 10-mil blood tube, which is a lot of what these tests use,
00:41:37.100
and you remove all the cellular DNA, remember, there's a ton of DNA in the cells in circulation.
00:41:41.700
Sure. The white blood cells, the red blood cells, et cetera. Get rid of all that. Yep.
00:41:45.380
Huge amount. You probably have on the order of a few thousand cells worth of cell-free DNA
00:41:51.160
in a 10-mil blood tube, which isn't a lot. Just to make sure I understand you, you're saying
00:41:56.640
a few thousand cells worth. Each cell would be 3 billion base pairs.
00:42:03.980
Wow. On the one hand, it doesn't sound like a lot because there are billions of cells. On the other
00:42:10.180
hand, it still sounds like a lot. That's still a big computational problem.
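A back-of-the-envelope version of that estimate, using only the round numbers from this exchange (a few thousand cells' worth, 3 billion base pairs per cell as stated, and fragments of about 160 base pairs). The function name and the choice of 3,000 for "a few thousand" are assumptions.

```python
def cfdna_tube_estimate(genome_equivalents: int,
                        base_pairs_per_genome: int = 3_000_000_000,
                        fragment_length: int = 160) -> tuple[int, int]:
    """Return (total base pairs, approximate fragment count) of cell-free
    DNA for a given number of genome equivalents in one blood tube."""
    total_bp = genome_equivalents * base_pairs_per_genome
    fragments = total_bp // fragment_length
    return total_bp, fragments


total_bp, fragments = cfdna_tube_estimate(3_000)
print(f"total base pairs: {total_bp:.2e}")
print(f"~160 bp fragments: {fragments:.2e}")
```

On these round numbers a single tube holds on the order of trillions of base pairs spread across tens of billions of fragments.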
00:42:14.420
Where it becomes challenging is when we get into early detection, right? Where if you think about it,
00:42:19.720
for any position in the genome, you only have a few thousand representations of it because there's
00:42:27.300
only a few thousand cells. That starts to limit your ability to detect events that occur at one
00:42:34.380
in a million or one in a hundred thousand. Alex, do you recall these incident cases of the pregnant
00:42:43.960
mothers? Again, I guess we should probably go back and re-explain that because it's such an
00:42:48.900
important and profound discovery. There were a handful of cases where in the process of screening
00:42:54.460
for trisomies, they're discovering not that the mother has additional chromosomes that can be
00:43:02.640
attributed to the fetus, but that she has significant mutations across a number of genes that
00:43:12.700
also are probably showing up in relatively small amounts because they're not in all of her cells.
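The sampling limit Alex raised a moment earlier (only a few thousand copies of any genomic position in a tube) can be sketched as a simple probability. This assumes each genome equivalent is an independent draw and that a variant is present in a given fraction of them; the function name and the 3,000 figure are illustrative assumptions.

```python
def prob_at_least_one_copy(genome_equivalents: int, variant_fraction: float) -> float:
    """Probability that at least one variant-carrying copy of a position
    is present in the tube at all, before any assay error is considered."""
    return 1.0 - (1.0 - variant_fraction) ** genome_equivalents


for fraction in (1e-3, 1e-4, 1e-5, 1e-6):
    p = prob_at_least_one_copy(3_000, fraction)
    print(f"variant fraction {fraction:.0e}: P(at least one copy) = {p:.4f}")
```

With a few thousand copies per position, a variant present at one in a thousand is almost always in the tube, while one present at one in a million usually is not there to be found, however good the sequencing is.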
00:43:20.500
Yeah. Yeah. So you might expect a flat pattern, right? In the majority of cases, or when the fetus
00:43:26.740
has a trisomy, you see these very well-known accumulations, mostly in 21, but occasionally in
00:43:33.260
18 or 13. And instead what you see is just increases and decreases monosomies and trisomies
00:43:40.120
across many, many chromosomes, which is just not compatible with life even as a fetus. But there
00:43:47.120
is a biology where you do see these tremendous changes in the chromosomes. And that's often in the
00:43:55.240
Do you recall what those cancers turned out to be in those young women? I mean, I assume they
00:43:59.880
were breast cancers, but they could have been lung cancers, anything?
00:44:03.360
Yeah. So Meredith and Darya, they published a paper in JAMA, which for anyone interested,
00:44:08.480
details these 10 or so cases and what happened in each of them. It was a mix. I think there was
00:44:14.580
a neuroendocrine, uterine, some GI cancers. It was a smattering of different things.
00:44:20.860
And what was the approximate year of that? We'll try to find that paper and link to it in the show
00:44:29.880
Of course, one doesn't know the contrapositive. One doesn't know how many women had cancer but
00:44:37.660
weren't captured. But is it safe to assume that the 10 who were identified all had cancer?
00:44:46.420
So there were no false positives. We just don't know how many false negatives there were.
00:44:50.180
Right. Yeah. This is one of the things that contributed to the evidence that cancer screening
00:44:56.580
might be possible using cell-free DNA, which is these incidental findings. As I mentioned earlier,
00:45:02.180
we already knew that, yes, tumors do put cell-free DNA into the bloodstream. But this was a profound
00:45:08.300
demonstration that in actual clinical practice, you could find undiagnosed cancers in asymptomatic
00:45:15.240
individuals. And that it was highly specific, meaning that when it was found using this method,
00:45:20.940
it almost, well, I think in those initial ones, it was every case, but almost every case turned out
00:45:26.000
to have cancer. Now, to your point, it's not a screening test because even in relatively healthy
00:45:33.720
and women of childbearing age, a population of 100,000, you expect epidemiologically 10 times or
00:45:41.160
so or 50 times that number of cancers over a year or so. So clearly you're missing the majority of
00:45:48.080
cancer. So it's not a screening test. Right. It was just a proof of concept though.
00:45:52.860
Yeah. An inadvertent proof of concept that really raised, at Illumina and I think in the field,
00:45:57.800
our attention of, hey, using cell-free DNA and sequencing based methods, it might be possible
00:46:03.240
to develop a very specific test for cancer. So what was the next step in the process of
00:46:10.480
systematically going after addressing this problem? Myself and some other folks at Illumina,
00:46:16.040
along with the two scientists I mentioned, Meredith and Daria, and then also in particular,
00:46:21.960
the CMO at the time, Rick Klausner, who had a very long history in cancer research and in cancer
00:46:29.500
screening. He was the previous NCI director under Bill Clinton. So that's the National Cancer
00:46:35.320
Institute at the NIH under Bill Clinton. And he was the CMO at Illumina at the time. And we started to
00:46:41.340
talk more and more about what would it take to develop or determine the feasibility of a universal
00:46:48.680
blood test for cancer based on this cell-free DNA technology. And being very first principle,
00:46:55.180
I really asked the question, well, why is it in 50 years of many companies and a tremendous amount
00:47:01.140
of academic research, no one had ever developed a broad-based blood test for cancer? Not just many
00:47:08.140
cancers, let alone any cancer. Really, the only example is PSA. And again, the false positive
00:47:14.600
rates there are so high that its benefit to harm has been questioned many times. And that's why it
00:47:20.980
doesn't have a USPSTF grade A or B anymore. And the fundamental reason is specificity. So there's lots
00:47:28.480
of things that are sensitive, meaning that there are proteins that accumulate, biochemistries, metabolites
00:47:34.460
that go up in cancer. But the problem is they go up in a lot of benign conditions. So, you know,
00:47:39.840
a big benign prostate spews out a lot of PSA. And pretty much every other protein or metabolite does
00:47:45.900
that. The biomarkers to date were all very sensitive, but all had false positive rates of,
00:47:53.540
say, 5% or 10%. And so if you're imagining screening the whole population, you can't be working up one of
00:48:00.560
10 people for a potential cancer. And so the key technological thing to solve was, well, how do
00:48:07.100
you have something that has a 1% false positive rate or a half percent false positive rate? Because
00:48:12.840
that's what you need to get to if you want to do broad-based cancer screening in relatively healthy
00:48:18.940
asymptomatic people. And this is why we thought it might be possible with cell-free DNA because
00:48:25.220
the tumor DNA could be more specific than proteins and other things that are common in benign disease.
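The specificity argument here is Bayes' rule applied to a screening population. In the sketch below, the 1% prevalence echoes the "99% who don't" figure used earlier in the conversation, while the 50% sensitivity and the function name are purely hypothetical values chosen for illustration.

```python
def positive_predictive_value(prevalence: float,
                              sensitivity: float,
                              false_positive_rate: float) -> float:
    """Fraction of positive screening results that are true cancers."""
    true_positives = prevalence * sensitivity
    false_positives = (1.0 - prevalence) * false_positive_rate
    return true_positives / (true_positives + false_positives)


for fpr in (0.10, 0.05, 0.01, 0.005):
    ppv = positive_predictive_value(prevalence=0.01,
                                    sensitivity=0.5,
                                    false_positive_rate=fpr)
    print(f"false positive rate {fpr:.1%}: PPV = {ppv:.1%}")
```

Under these assumed inputs, a 10% false positive rate means fewer than one in twenty positives is a real cancer, whereas at half a percent roughly half of them are, which is why the target was set at 1% or below.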
00:48:33.440
And so that was the reason to believe. The things we didn't know is, well, how much DNA does an early
00:48:39.480
stage tumor pump out? If it doesn't pump out any, well, there's nothing to detect. The other is the
00:48:44.620
heterogeneity. Cancer is not like infectious disease, where there's one very identifying antigen or sequence.
00:48:52.680
Every tumor is truly unique, right? So even two lung cancers that are both the same
00:48:57.600
histological subtype, they can share very few mutations or none. So you can have two squamous
00:49:03.860
cell lung cancers that honestly don't have a single shared mutation. So now you need to look at
00:49:09.960
hundreds or thousands or even millions of positions to see enough potential changes.
00:49:15.540
And this is where, again, NGS was a really good fit, which is how do you overcome the heterogeneity
00:49:21.340
that you need to now look for a disease that isn't defined? I can't tell you these three mutations
00:49:27.140
are the ones you need to find for this cancer. There's a huge set of different ones for every
00:49:33.040
cancer. And then that got us thinking, well, look, in addition to sequencing many positions and
00:49:39.000
sequencing very deeply and using cell-free DNA, we were going to need to use AI or machine learning
00:49:44.600
because we had to learn these complex associations and patterns that no human being could curate
00:49:52.000
thousands of different mutational profiles and try to find the common signals and so on.
00:49:58.320
What emerged over the course of a year is, look, this might be possible, but we're going to have to
00:50:03.980
enroll very large populations just to study and find the signals and develop the technology.
00:50:09.920
And then we're going to need very large studies to actually do interventions and prove it clinically
00:50:15.720
valid that it actually works. We're going to have to use NGS and sequence broadly across the whole
00:50:22.080
genome. And only then might it be possible. And so the company at the time decided, and this was a
00:50:30.320
board-level decision, that ultimately this made more sense as an independent company, given the amount
00:50:37.640
of capital that was going to be required, given the scientific and technical risk, given the kind
00:50:43.240
of people that you would need to recruit who were passionate about this, that it made sense to do
00:50:48.240
as a separate company. And so the CEO at the time, Jay Flatley, in early 2016 announced the founding of
00:50:56.100
the company and then spinning it out of Illumina. I had the honor of being one of the co-founders of it.
00:51:02.100
Let's go back to 2016. You guys are now setting up new shop. You've got this new company. It's called
00:51:08.420
Grail. You've brought over some folks like you from Illumina, and presumably you're now also
00:51:15.340
recruiting. What is the sequence of the first two or three problems you immediately get to work on?
00:51:23.100
As I wrote the starting research and development plan, the way I wrote it was we needed to evaluate
00:51:29.420
every potential feature in cell-free DNA, meaning that any known method of looking for cancer in
00:51:36.680
cell-free DNA, we needed to evaluate. That if we were going to do this and recruit these cohorts and
00:51:42.060
all these altruistic individuals, and we were going to spend the money to do this, we needed to not look
00:51:46.620
at just one method or someone's favorite method or whatever they thought might work. We needed to look
00:51:52.040
at every single one. And so that's what we did. We developed an assay and software for mutations and
00:51:59.300
then a bunch of other things, chromosomal changes, changes in the fragment size, and many others. And we
00:52:05.740
said, look, we're going to test each one of these head-to-head, and we're going to test them in
00:52:09.160
combination, and we're going to figure out the best way to do this. We even had a mantra that Rick came up with
00:52:15.720
that I thought was very helpful, which is we're either going to figure out how to do this, or we're going to prove
00:52:19.940
it can't be done. I think that was very helpful in thinking about how to do these initial experiments.
00:52:25.100
So it was a lot of building these assays. We needed massive data sets to train the machine learning
00:52:30.360
algorithm. So we had this study called the CCGA, the Circulating Cell-Free Genome Atlas, where we
00:52:35.560
recruited 15,000 individuals with and without cancer of every major cancer type, and in most cases,
00:52:42.940
hundreds. And then we tested all of these different methods, the ones I mentioned, and also
00:52:49.020
importantly, a methylation-based assay. And we did blinded studies to compare them and see could
00:52:55.060
any of them detect a large fraction of the cancers? Did any of them have the potential to do it at high
00:53:00.200
specificity? Because that's what we would need if we were going to develop a universal test for cancer
00:53:06.100
that could be used in a broad population. So let's kind of go back and talk about a few of those things
00:53:11.300
there because there was a lot there. So you said up front, look, we're going to make sure that any
00:53:18.180
measurable property of cell-free DNA, we are measuring, we are quantifying it, we are noting
00:53:24.880
it. We talked about some of them, right? So fragment length, that seems relatively fixed, but presumably
00:53:30.560
at large enough sample size, you're going to see some variation there. Does that matter?
00:53:35.240
The actual genetic sequence, of course, that's your bread and butter, to be able to measure that.
00:53:41.320
You also mentioned, of course, something called methylation, which we haven't really talked about
00:53:45.980
yet. So we should explain what that means. Were there any other properties besides fragment length,
00:53:52.280
sequence, and methylation that I'm missing? There were several others. One was chromosomal changes.
00:53:57.580
So as we mentioned in cancer, the numbers of chromosomes often change. So many cancers,
00:54:03.980
and this is wild, they'll often double the number of chromosomes. So you can go from 23 to
00:54:10.200
double or even triple the number. But these chromosomes are not normal. So you'll often have
00:54:16.440
arms or the structures of chromosomes will get rearranged. And so there's a way to look at that
00:54:22.620
also in the cell-free DNA. Like as we mentioned in the non-invasive prenatal testing, where you look at
00:54:27.300
the amount of DNA per chromosome or per part of chromosome. So we looked at what's called these
00:54:32.800
chromosomal abnormalities. We also looked at cell-free RNA. So it turns out there's also
00:54:38.940
RNA from tumors in circulation. How stable is that, Alex? I was under the impression that
00:54:44.840
RNA wouldn't be terribly stable, unlike DNA, which of course is double-stranded and quite stable.
00:54:51.140
How do you capture cell-free RNA? So naked RNA is not very stable. However, there's proteins that if
00:55:00.080
the RNA is bound to, and one type is called an Argonaute protein, if the RNA is bound to it,
00:55:06.060
it is protected. I assume this is typically messenger RNA that's been in the process of
00:55:10.920
being transcribed. But somewhere along the way, before translation occurs, there's the disruption
00:55:18.400
to the cell that results in lysis or something. And you're just basically getting the cell-free RNA
00:55:23.640
because you happened to catch it at that point. It was a replicating cell or something,
00:55:29.620
Yeah. Or during apoptosis, it's somehow during some kind of programmed cell death,
00:55:34.860
it's being digested or bound. The amount relative to the amount of cell death is low. So presumably
00:55:41.600
most of the RNA is destroyed, but enough of it does get protected and bound to proteins.
00:55:47.720
Whether or not it's cellular detritus or garbage, or it's intentional, it's kind of a different
00:55:53.320
question, but it is present. There are also vesicular structures, so little bubbles of membrane that the
00:56:00.280
RNA can be contained in. The most common one is referred to as an exosome, which are these little
00:56:05.760
vesicles in circulation. So in a variety of different ways, you can have messenger RNA and other types of
00:56:12.200
RNA preserved outside of cells in circulation. And so we looked at that also.
00:56:19.360
How long did it take to quantify all of these things? And presumably, I think you sort of alluded
00:56:25.760
to this, but we're not just looking at any one of these things. You're also asking the question,
00:56:30.220
can combinations of these factors add to the fidelity of the test, correct?
00:56:34.920
Yeah. So this initial research phase took close to three years, cost hundreds of millions of dollars.
00:56:41.160
We had to recruit the largest cohort ever for this type of study, the CCGA study, as I alluded to.
00:56:47.580
And there were different phases. There was a discovery and then multiple development and
00:56:52.300
validation phases. We had to make the world's best assays to look at each of these features.
00:56:58.960
And then we had to process all of those samples and then analyze them. And we did it in a very
00:57:05.240
rigorous way where the final testing was all done blinded and the analysis was all done blinded.
00:57:10.360
So we could be sure that the results were not biased. And then we compared them all and we also
00:57:16.440
compared them in combinations. And we used sophisticated machine learning approaches to
00:57:21.320
really maximize the use of each individual type of data from each, you know, whether or not it was
00:57:26.440
mutations or the chromosomal changes or methylation.
00:57:29.200
So you mentioned that the CCGA had 15,000 samples. How many of those samples were cancers versus
00:57:39.440
It's about 60% cancer versus controls. Yeah, 40%.
00:57:43.340
You sort of alluded to it, but just to be sure I understood, you're obviously starting first with
00:57:48.520
a biased set where you know what's happening and then you're moving to a blinded, unbiased set
00:57:53.380
for confirmation. Is that effectively the way you did it?
00:57:56.080
Yeah. Yeah. It's often referred to as a training set and a test set. Yeah.
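As a generic illustration of the training set / test set idea (not the actual CCGA design, which had discovery, development, and validation phases), here is a minimal sketch of holding out a random subset of samples that is never touched during development. The sample IDs, the 20% hold-out, and the function name are hypothetical.

```python
import random


def split_cohort(sample_ids: list[str],
                 test_fraction: float,
                 seed: int = 0) -> tuple[list[str], list[str]]:
    """Randomly partition samples into a training set and a held-out test set."""
    shuffled = list(sample_ids)            # copy so the input is not modified
    random.Random(seed).shuffle(shuffled)  # fixed seed makes the split reproducible
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


ids = [f"sample_{i:05d}" for i in range(15_000)]
train, test = split_cohort(ids, test_fraction=0.2)
print(len(train), "training samples;", len(test), "held-out test samples")
```

The key property is that the two sets do not overlap, so performance measured on the held-out set is not inflated by what the algorithm saw while being tuned.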
00:58:02.220
Tell us what emerged, Alex, when it was all said and done, when you had first and foremost
00:58:07.440
identified every single thing that was measurable and knowable. Sorry, before we do that, I keep
00:58:13.400
taking us off methylation. Explain methylation. Of all the characteristics, that's the one I don't
00:58:19.120
So DNA methylation is a chemical modification of the DNA. So in particular at the
00:58:25.580
C in the ATCG code, the C stands for cytosine. So that's a particular nucleotide or base in DNA.
00:58:34.240
Mammalian biology can methylate. It means that it can add a methyl group, and a methyl group is just
00:58:40.500
a single carbon atom with three hydrogens and then bonded to that cytosine. And so that's what DNA
00:58:46.760
methylation is. So to say a cytosine is methylated, it means that it has that single methyl group bonded
00:58:54.120
to it. Turns out that there's about 28 million positions in the human genome that can be methylated.
00:59:00.640
It usually occurs at what's called CpG sites, which is if you go along one strand of DNA,
00:59:07.840
this is not pairing of the DNA, but one strand, a G follows a C. So that's what a CpG is. It's a C
00:59:14.260
with a phosphate bond to a G. And so at those positions in the genome, there are enzymes that
00:59:21.000
can methylate the cytosine and demethylate it. And there's again, about 28 million of those sites
00:59:27.700
out of the 3 billion overall bases in the human genome. These chemical modifications are really
00:59:35.340
important because they affect things like gene expression. It's one of the more important classes
00:59:40.300
of something that's called epigenetics, which is changes that are outside of the genetics or outside
00:59:46.040
of the code itself. As you know, the DNA code is the same in most cells of the human body.
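The definition of a CpG site given above (a C followed directly by a G along one strand) is simple enough to sketch as a scan over a sequence. The toy sequence and the function name are made up for illustration.

```python
def cpg_positions(strand: str) -> list[int]:
    """Return the 0-based index of every C that is immediately followed
    by a G on the given strand, i.e. every CpG site."""
    strand = strand.upper()
    return [i for i in range(len(strand) - 1) if strand[i:i + 2] == "CG"]


toy_strand = "ACGTTCGGCA"
print(cpg_positions(toy_strand))
```

In this toy strand the cytosines at indices 1 and 5 are CpG sites, while the C at index 8 is followed by an A and is not. Note that a C paired with a G on the opposite strand does not count; only adjacency along one strand does.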
00:59:51.600
Obviously, the cells are quite different. So a T cell is very different than a neuron.
00:59:55.360
And other than the T cell receptor, all of the genes are the same. The code is the same. So why are the
01:00:00.900
cells different? Well, it's the epigenetics. So things like which parts of the gene are methylated
01:00:06.700
or which ones are associated with histones that are blocking access to the DNA, that's what
01:00:13.460
ultimately determines which genes are transcribed, which proteins are made, and why cells take on
01:00:19.820
very different morphology and properties. The methylation is a very fundamental code
01:00:25.900
for controlling it. So I call the epigenetics the software of the genome. The genetic code is kind
01:00:32.920
of the hardware, but how you use it, which genes you use when, which combination, that's really the
01:00:38.740
epigenetics. What is the technological delta or difference in reading out the methylation sequence
01:00:48.640
on those CpG sites relative to the ease with which you simply measure the base pair sequences? So you can
01:00:58.460
measure C, G, A, T, C, C, A, T, G, et cetera. But then in the same readout, do you also acquire which of
01:01:07.160
the C's are methylated, or are you doing a separate analysis? There's different technologies to do that.
01:01:12.940
For cell-free DNA, usually you want very accurate sequencing of billions of these, or many hundreds
01:01:19.440
of millions of these small fragments. The way it's done is, and this adds complexity to the chemistry,
01:01:25.320
is you pre-treat the DNA in a way that encodes the methylation status in the ATCG sequence,
01:01:32.800
and then you just use a sequencer that can only see ATCG. But because you've encoded the information,
01:01:39.780
you can then deconvolute it and infer which sites were methylated. Just to be a little more specific,
01:01:46.580
there's chemicals that will, for example, deaminate a cytosine that's not methylated.
01:01:53.520
And then that deaminated cytosine effectively turns into a uracil, which is a fifth letter in RNA.
01:01:59.920
And then when you copy the DNA and you amplify it prior to the sequencing, it amplifies as a T,
01:02:07.120
because a U, when it's copied by a DNA polymerase, it becomes a T. And then you end up with a sequence
01:02:13.160
where you expect to see C's and you see a T. And if you see a T there, then you know that,
01:02:21.900
That came from a U, and the U is an unmethylated C. Brilliant.
01:02:26.080
Brilliant. Right. And if the C was not changed, then you say, then that must have been a site that
01:02:31.120
was methylated. Because you'll see G's opposite them. Oh, sorry. If the C was methylated, you'll
01:02:36.300
see the G opposite because you won't turn it to the uracil. Right, right. Yeah. Brilliant.
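
[Editor's illustration] The read-out logic described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the actual assay pipeline: the reference strand, the methylated position, and the function names (`cpg_sites`, `bisulfite_convert`, `infer_methylation`) are all hypothetical, and real data would also involve both strands, incomplete conversion, and sequencing error.

```python
def cpg_sites(strand):
    """Return the index of every C that is immediately followed by a G."""
    return [i for i in range(len(strand) - 1) if strand[i:i + 2] == "CG"]


def bisulfite_convert(strand, methylated_positions):
    """Simulate bisulfite treatment followed by amplification on one strand.

    An unmethylated C is deaminated to U and is read as T after copying.
    A methylated C is protected and is still read as C.
    """
    converted = []
    for i, base in enumerate(strand):
        if base == "C" and i not in methylated_positions:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


def infer_methylation(reference, read):
    """Compare a converted read with the reference and call each C position."""
    calls = {}
    for i, (ref_base, read_base) in enumerate(zip(reference, read)):
        if ref_base != "C":
            continue
        if read_base == "C":
            calls[i] = "methylated"
        elif read_base == "T":
            calls[i] = "unmethylated"
    return calls


# Hypothetical 11-base reference; only the C at index 1 is methylated.
reference = "ACGTTCGACCA"
methylated = {1}

read = bisulfite_convert(reference, methylated)
calls = infer_methylation(reference, read)

print(cpg_sites(reference))
print(read)
print(calls)
```

The sequencer still only reports A, T, C, and G. The methylation status is recovered purely by comparing the converted read against the expected reference.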
01:02:41.560
That technique is called bisulfite sequencing. There are other ways to do it, but that's the
01:02:45.780
predominant one. All right. So now back to the question I started to ask a minute ago,
01:02:49.780
but then realized we hadn't talked about methylation. So you've come up with all these different
01:02:53.620
things that you can do with this tiny amount of blood. Because again, you talk about 10 mL,
01:02:59.920
you know, in the grand scheme of things, that's a really small amount of blood. That's two small
01:03:04.340
tubes of blood. Very easy to do. Presumably there was an optimization problem in here where you min
01:03:10.440
max this thing and realize, well, look, this would be easy to do if we could take a liter of blood,
01:03:15.400
but that's clinically impossible. Yeah. It would be nice to Theranos this quote unquote,
01:03:20.540
and do this with a finger stick of blood, but you're never going to get the goods.
01:03:24.540
So did you sort of end up at 10 mL? Was it just sort of an optimization problem that got you there
01:03:30.340
as the most blood we could take without being unreasonable, but yet still have high enough
01:03:35.380
fidelity? And maybe asked another way, can you get better and better at doing this if you were taking
01:03:40.980
eight tubes of blood instead of two? Yeah. There's a couple of considerations. One is the
01:03:46.000
practical one. You need a format that fits your standard phlebotomy and standard volumes that are
01:03:52.340
below the volumes at which you could put someone in jeopardy. That's a big practical issue. But it
01:03:58.560
actually turned out that what ultimately limited the sensitivity of the test was the background biology.
01:04:06.140
So for broad-based cancer screening, more blood would actually not help you. Now there's other
01:04:11.180
applications for monitoring or the therapy selection where you're looking for a very particular target,
01:04:18.520
someone who has cancer and you know what kind of cancer, and there you could improve your sensitivity.
01:04:23.280
But just for cancer screening, you're usually not limited by the amount of blood.
01:04:29.400
And so did methylation turn out to be the most predictive element at giving you that very,
01:04:35.860
very high specificity, or was it some combination of those measurable factors?
01:04:42.340
Yeah. So it was pretty unexpected. I would say going into it, most people thought that the mutations
01:04:47.880
were going to be the most sensitive method. Some of us thought that the chromosomal changes were going
01:04:53.860
to be the most sensitive. I would say the methylation signals were kind of a dark horse. I had to fight
01:04:59.640
several times to keep it in the running. But again, we really took a, let the data tell us what's
01:05:05.860
the right thing to do. It's not biases from other experiments. Let's do this in a comprehensive,
01:05:11.120
rigorous way. And in the end, the methylation performed by far the best. So it was the most
01:05:17.220
sensitive. So it detected the most cancers. Importantly, it was very specific. It actually
01:05:22.680
had the potential and ultimately did get to less than 1% false positive rate. And then the methylation
01:05:28.760
had this other feature, which was very unique, which was that it could predict the type of cancer.
01:05:34.640
What was the original, what we call now the cancer site of origin? What organ or tissue did it originate
01:05:41.880
from? Interestingly, adding them all together didn't improve on the methylation. I can explain
01:05:47.920
why. And now in hindsight, you might've thought, Hey, more types of information and signal are better,
01:05:53.620
but it actually didn't. So we ended up with one clear result that the methylation patterns in the
01:05:59.960
cell-free DNA were the most useful and informative, and adding other things was not going to help the
01:06:06.440
performance. And why do you think that was? Because it is a little counterintuitive. There are clearly
01:06:13.080
examples I could probably think of where you can degrade the signal by adding more information.
01:06:18.440
But I'm curious if you have a biologic teleologic explanation for why one and only one of these
01:06:26.300
metrics turned out to be the best and any additional information only diluted the signal.
01:06:31.860
It comes down to, this is a good engineering principle, right? If you want to improve your
01:06:36.600
prediction, you need an additional signal that carries information and is independent from your
01:06:42.580
initial signal. If it's totally correlated, then it doesn't actually add anything.
01:06:47.760
Let's take an analogy. Let's say you're on a freeway overpass and you're developing an image
01:06:53.320
recognition for Fords. And you say, okay, what I'm going to start initially with is an algorithm.
01:06:59.280
It's going to look for a blue oval with the letters F-O-R-D in it. So that's pretty good.
01:07:04.360
Now let's say you say, okay, I know that some Fords also have the number 150 on the side,
01:07:10.840
F-150. So I'm going to add that, right? If you think about it, if your algorithm
01:07:17.600
based on the blue oval is already pretty good, adding the 150 is not going to help because
01:07:24.740
whenever the 150 occurs, the blue oval is also always there. Now, if the blue oval wasn't always
01:07:31.280
there or there were Fords that didn't have the blue oval, then some other signal could be helpful.
01:07:35.940
And so that's kind of what ended up happening is that the methylation signal was so much more
01:07:41.680
prevalent and so much more distorted in cancer that everything else didn't really add because
01:07:48.840
anytime you could see one of the others, you could also see many more abnormal methylation fragments.
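
[Editor's illustration] The Ford analogy can be made concrete with a small toy dataset. This is a minimal sketch with invented data: the six cars and the `grille` feature are hypothetical and exist only to contrast a redundant signal (the 150 badge, which never appears without the blue oval) with an independent one.

```python
# Hypothetical toy data. Each car records whether it is a Ford and which
# visual features the recognizer could see on it.
cars = [
    {"ford": True,  "oval": True,  "badge_150": True,  "grille": False},
    {"ford": True,  "oval": True,  "badge_150": False, "grille": True},
    {"ford": True,  "oval": True,  "badge_150": False, "grille": False},
    {"ford": True,  "oval": False, "badge_150": False, "grille": True},
    {"ford": False, "oval": False, "badge_150": False, "grille": False},
    {"ford": False, "oval": False, "badge_150": False, "grille": False},
]


def sensitivity(cars, features):
    """Fraction of Fords flagged when any one of the given features is present."""
    fords = [car for car in cars if car["ford"]]
    hits = [car for car in fords if any(car[f] for f in features)]
    return len(hits) / len(fords)


oval_only = sensitivity(cars, ["oval"])
oval_plus_150 = sensitivity(cars, ["oval", "badge_150"])
oval_plus_grille = sensitivity(cars, ["oval", "grille"])

print(oval_only, oval_plus_150, oval_plus_grille)
```

Adding the 150 badge leaves sensitivity unchanged because every car it flags was already flagged by the oval. Only a feature that fires on cars the oval misses can raise it.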
01:07:55.820
Yeah, that's really fantastic. I guess I also want to, again, just go back and make sure people
01:08:01.480
understand the mission statement you guys brought to this, which was high specificity is a must.
01:08:09.040
So people have heard me on the podcast do this before, but just in case there are people
01:08:13.300
who haven't heard this example or forget it, I sometimes like to use the metal detector analogy
01:08:18.820
in the airport to help explain sensitivity and specificity. So sensitivity is the ability of
01:08:25.400
the metal detector to detect metal that should not go through. And let's be clear. It's not that people
01:08:32.600
in the airports care if your phone is going through or your laptop or your watch or your belt,
01:08:38.440
they care that you're bringing guns or explosives. That's why we have metal detectors or knives or
01:08:45.060
things of that nature. That's why the metal detector exists. It has to be sensitive enough
01:08:50.640
that no one carrying one of those things can get through. On the other hand, specificity would say,
01:08:58.100
so if you're optimizing for sensitivity, you make it such that you will detect any metal that goes
01:09:03.680
through that thing. And by definition, you're going to be stopping a lot of people. You're going to stop
01:09:10.280
everybody from walking through. If their zipper is made of metal, you'll stop them.
01:09:14.740
Or prosthetic or a big belt or boots or anything. You got a little metal on your glasses, you're going
01:09:21.120
to get stopped. So you have to dial the thing in a way so that you have some specificity to this test
01:09:27.100
as well, which is I can't just stop everybody. In an ideal world, I kind of want everyone to make
01:09:33.380
it through who's not carrying one of those really bad things. And we're defining bad thing by a certain
01:09:38.900
quantity of metal. And therefore, your specificity is to kind of say, I don't want my test to be
01:09:47.580
triggered on good guys, right? I want my test to be triggered on bad guys. Now, when you guys are
01:09:54.820
designing a test like this, like the Grail test, I guess I should just go back and state anybody
01:10:00.300
who's ever been through two different airports wearing the exact same clothing and realizes
01:10:06.240
sometimes it triggers, sometimes it doesn't. What you realize is not every machine has the same
01:10:09.960
setting. And that's because the airport, the people at TSA, they turn up or turn down the sensitivity
01:10:15.500
and that changes the specificity as well. How deliberately do you, when you're setting up this
01:10:23.240
assay, have the capacity to dial up and down sensitivity and specificity? So while I understand
01:10:29.480
your mandate was a very high specificity test, where was the control or manipulation of that system,
01:10:37.940
if at all? So there's a threshold. It's complex. Conceptually, there's a threshold inside the
01:10:44.000
algorithm, right? So you can imagine that after you have this comprehensive map of all these different
01:10:51.140
types of methylation changes that can occur in the fragments of hundreds of examples of every cancer
01:10:57.200
type. And then you compare it to all the methylation changes that can occur outside of cancer, which we
01:11:03.880
haven't talked about, which is very important. So most of the methylation patterns are pretty similar
01:11:09.380
in similar cell types across individuals. But there are changes that occur with age or
01:11:14.760
ethnicity or environmental exposure and so on. What you'd like is those two populations to be
01:11:21.500
completely different. But it turns out there is some overlap. So there are fragments that occur in
01:11:28.040
cancer that can occur outside of cancer. The algorithm in a very complex state space is trying to
01:11:35.320
separate these populations. And whether or not you're going to call something as a potential cancer
01:11:42.580
and say a cancer signal is detected is whether or not the algorithm thinks, is it associated with
01:11:47.620
this cancer group or is it associated with a non-cancer group? But again, there's some overlap
01:11:54.140
between these. And so where you set that overlap, like in the border between individuals who don't have
01:12:01.860
cancer, but have, for whatever reason, an abnormal level of fragments that kind of look cancerous,
01:12:07.540
that will determine your specificity. So there is a dial to turn where you can increase the
01:12:14.440
stringency, call fewer false positives, but then you will start to miss some of the true positives.
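
[Editor's illustration] The dial described here can be sketched as a single threshold on a classifier score. The scores below are invented for illustration and do not come from any study. The real algorithm works in a far more complex feature space, but the trade-off has the same shape.

```python
# Hypothetical classifier scores (higher = looks more like cancer).
non_cancer_scores = [0.02, 0.05, 0.08, 0.10, 0.15, 0.20, 0.30, 0.45]
cancer_scores = [0.25, 0.40, 0.60, 0.70, 0.85, 0.95]


def sensitivity_specificity(threshold):
    """Call 'signal detected' when score >= threshold, then score the calls."""
    true_positives = sum(score >= threshold for score in cancer_scores)
    true_negatives = sum(score < threshold for score in non_cancer_scores)
    sensitivity = true_positives / len(cancer_scores)
    specificity = true_negatives / len(non_cancer_scores)
    return sensitivity, specificity


for threshold in (0.20, 0.35, 0.50):
    sens, spec = sensitivity_specificity(threshold)
    print(f"threshold={threshold:.2f}  sensitivity={sens:.3f}  specificity={spec:.3f}")
```

Raising the threshold removes false positives from the overlapping region, but the cancers that scored in that same region are then missed.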
01:12:21.260
Now, what was so great about methylation is that these populations were pretty well separated,
01:12:26.680
better than anything the world had ever seen before, which is why you could get high specificity
01:12:31.940
and still pretty good sensitivity. But yes, there is some overlap, which means you have to make a
01:12:38.040
trade-off and dial it in. Inside the company, is there sort of a specific discussion around
01:12:44.320
the trade-offs of it's better to have a false positive than have a false negative? Like let's
01:12:49.800
use the example you brought up earlier, right? So prostate-specific antigen is kind of the mirror
01:12:54.380
image of this, right? It's a highly, highly sensitive test with very low specificity. It's obviously a
01:13:01.120
protein, so it's a totally different type of assay, right? It's a far cruder test, of course.
01:13:05.600
But the idea is, in theory, and of course I could give you plenty of examples, someone with prostate
01:13:11.580
cancer is going to have a high PSA. So you're not going to miss people with cancer. But as you pointed
01:13:18.420
out earlier, you're going to be catching a lot of people who don't have cancer. And it's for that
01:13:24.060
reason, as you said, there is no longer a formal recommendation around the use of PSA screening.
01:13:28.820
It has now kind of been relegated to the just talk to your doctor about it. And of course,
01:13:34.540
the thinking is, look, there are too many men that have undergone an unnecessary prostate biopsy on the
01:13:40.580
basis of an elevated PSA that really should have been attributed to their BPH or prostatitis or
01:13:46.660
something else. So notwithstanding the fact that we have far better ways to screen for prostate cancer
01:13:51.440
today, that's a test that is highly geared towards never missing a cancer. In its current format,
01:13:59.260
under low prevalence populations, which is effectively the population it's being designed
01:14:05.580
for, right? This is designed as a screening tool. It seems to have better negative predictive value
01:14:10.900
than positive predictive value, correct? It's pretty high in both because negative predictive
01:14:15.260
value also is related to prevalence. Well, just to put some numbers out there, right? So
01:14:20.200
in the CCGA study, but then importantly, in an interventional study called Pathfinder,
01:14:26.620
a positive predictive value is around 40%. That's all stages?
01:14:31.680
Yeah. So that's all cancers, all stages. It's a population study. So it's whatever natural
01:14:37.220
set of cancers and stages occur in that group. So that was about 6,500 individuals.
01:14:43.920
Do you recall, Alex, what the prevalence was in that population? Was it a low risk population?
01:14:50.200
Yeah. So it was a mix of a slightly elevated risk population and then an average risk population.
01:14:58.240
Just in terms of risk, and I think you'll appreciate this, I think of anyone over 50 as
01:15:02.720
high risk. And that's where the majority of these studies are happening, right? So I mean,
01:15:07.000
age is your single biggest risk factor for cancer. The population over 50 is about a 10x increased risk
01:15:17.920
And age 55 to 65 is the decade where cancer is the number one cause of death.
01:15:23.620
I would say in developed nations, I mean, that's actually increasing, right? I mean,
01:15:27.920
we're making such incredible progress on metabolic disease and cardiovascular disease. Cancer in the
01:15:34.000
developed world is predicted to surpass cardiovascular disease as the number one killer.
01:15:39.020
Anyway, older populations are at, I wouldn't call them low risk, I'd call them average risk for that
01:15:45.100
age group, which is still relatively high for the overall population. But it was a mixed prevalence,
01:15:50.440
a bit less than 1%. Some of these studies do have a healthy volunteer bias.
01:15:55.840
In a 6,500 person cohort with a prevalence of 1%, which is pretty low, the positive predictive value was 40%.
01:16:07.880
What was the sensitivity for all stages then? It must have been,
01:16:11.560
it's easy to calculate if I had my spreadsheet in front of me, but it's got to be 60% or higher.
01:16:17.000
And specificity has got to be close to or over 99% at that point, right?
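
[Editor's illustration] The spreadsheet calculation alluded to here is just Bayes' rule. The sketch below uses the rough figures quoted in the conversation (prevalence of about 1%, positive predictive value of about 40%) plus an assumed sensitivity of 50%, which is a round number chosen for illustration and not a reported study result. The function names are hypothetical.

```python
def ppv(sensitivity, specificity, prevalence):
    """Positive predictive value: true positives over all positive calls."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


def specificity_for_ppv(target_ppv, sensitivity, prevalence):
    """Specificity needed to reach a target PPV at a given sensitivity and prevalence."""
    true_pos = sensitivity * prevalence
    # From PPV = TP / (TP + FP), solve for the allowed false-positive fraction.
    false_pos = true_pos * (1 - target_ppv) / target_ppv
    return 1 - false_pos / (1 - prevalence)


# Illustrative inputs only: 1% prevalence, 40% PPV, assumed 50% sensitivity.
needed = specificity_for_ppv(target_ppv=0.40, sensitivity=0.50, prevalence=0.01)
print(f"specificity needed: {needed:.4f}")

# At the same sensitivity and prevalence, a 99% specific test has a lower PPV.
print(f"PPV at 99% specificity: {ppv(0.50, 0.99, 0.01):.3f}")
```

Under these assumptions, a 40% PPV at 1% prevalence requires a false positive rate of well under 1%, which is why specificity dominates the design of a screening test for a low-prevalence population.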
01:16:21.760
Those are the rough numbers. Yeah, that's right. Some of the important statistics there, right? So about
01:16:26.780
half of the cancers that manifested over the lifetime of the study were detected by the test. The test
01:16:34.100
actually doubled the number of cancers detected in that interventional study compared with standard
01:16:39.420
of care screening alone. The interventional study, the Pathfinder study, the enrollees were getting
01:16:45.920
standard of care screening according to guidelines. So mammography, importantly, cervical cancer
01:16:52.120
screening, and then colonoscopies or stool-based testing based on guidelines. And so a number of the
01:16:58.340
cancers that the Grail Galleri test detected were also detected by standard of care, which you would
01:17:04.400
expect. But the total number of cancers found was about doubled with the addition of the Galleri test.
01:17:11.880
And that was predominantly cancers for which there isn't a screening test. But just going back to
01:17:17.260
the positive predictive value, just the positive predictive value of most screening tests is low single
01:17:22.700
digits. You probably have the experience more than I have, but many, many times a female colleague,
01:17:29.060
friend, or someone's wife calls and says, you know, I got a mammography. They found something. I'm
01:17:34.480
going to have to go for a follow-up, a biopsy, and so on. And literally 19 times out of 20, it's a false
01:17:41.580
positive. That's one where we've accepted, for better or worse, a huge false positive rate to catch some
01:17:48.560
cancers, right? And that's why there's a fair amount of debate around mammography. But again,
01:17:53.040
that's a positive predictive value of about four and a half percent. The vast majority of people who
01:17:58.680
get initial positive, they're not going to end up having cancer, but still potentially worth it.
01:18:05.220
Now we're talking about something where we're approaching one in two positive tests will
01:18:10.340
ultimately lead to a cancer diagnosis that's potentially actionable. So it's, I think sometimes
01:18:15.920
when people hear 40%, they say, gee, that means there's still a fair amount of people who are
01:18:22.240
going to get a positive test, meaning a cancer signal detected and ultimately not. But again,
01:18:28.240
for a screening test, that's incredibly high yield. I think another way to think about that is to go
01:18:33.600
back to the airport analogy. So this is a metal detector that is basically saying, look, we're willing
01:18:42.960
to beep at people who don't have knives to make sure everybody with a knife or gun gets caught.
01:18:49.280
So the negative predictive value is what's giving you the insight about the bad guys. So a 40% positive
01:18:56.780
predictive value means, let's just make the numbers even simpler. Let's say it's a 25% positive predictive
01:19:03.680
value. It means for every four people you stop, only one is a true bad guy. Think about what it's like
01:19:12.420
in the actual airport. How many times in a day does the metal detector go off and how many times in a
01:19:19.380
day are they catching a bad guy? The answer is it probably goes off 10,000 times in a day and they
01:19:25.540
catch zero bad guys on average. So that gives you a sense of how low the positive predictive value is
01:19:31.480
and how high the sensitivity is and how low the specificity is. So yes, I think that's a great way to
01:19:37.320
look at it, which is if you are screening a population that is of relatively normal risk,
01:19:45.740
a positive predictive value of 20% is very, very good. It also explains, I think, where the burden
01:19:54.820
of responsibility falls to the physician, which is as a physician, I think you have to be able to talk
01:20:00.760
to your patients about this explicitly prior to any testing. I think patients need to understand that,
01:20:10.220
hey, there's a chance that if I get a positive test here, it's not a real positive. I have to have
01:20:17.000
kind of the emotional constitution to go through with that, and I have to be willing to then engage
01:20:22.460
in follow-up testing. Because if this thing says, oh, you know, Alex, it looks like you have a lung cancer,
01:20:28.800
the next step is, I'm going to be getting a chest x-ray, or I'm going to be getting a low-dose CT
01:20:33.140
of my chest. And that doesn't only come with a little bit of risk, in this case, radiation,
01:20:37.740
although it's an almost trivial amount, but I think more than anything, it's the risk of the emotional
01:20:42.880
discomfort associated with that. And I think, honestly, when you present the data this way to patients,
01:20:49.020
they really understand it, and they really can make great informed decisions for themselves.
01:20:53.580
And by the way, for some of those patients, it means, thank you, but no thank you.
01:20:56.820
I just don't want to go through with this, and that's okay, too. Let's talk a little bit about
01:21:01.160
some of the really interesting stuff that emerged in the various histologies and the various stages.
01:21:08.140
And I've had some really interesting discussions with your colleagues. I guess, just for the sake
01:21:12.480
of completing the story, you're no longer a part of the company, Grail. Maybe just explain that so
01:21:18.500
that we can kind of get back to the Grail stuff, but just so that people understand kind of your
01:21:22.300
trajectory. We should do that. Yeah. So I was at Illumina, and then I helped spin off Grail as a
01:21:27.900
co-founder, led the R&D and clinical development. I actually went back to Illumina as the chief
01:21:34.340
technology officer running all of the company's research and development. Really, really fantastic,
01:21:39.640
fun job. Subsequently, Illumina acquired Grail, now a wholly owned subsidiary of Illumina.
01:21:47.360
That was almost three years ago. Recently, I left Illumina to start a new company, a really
01:21:54.220
interesting biotech company that I'm the CEO of. No longer actively involved in either company.
01:22:00.200
I have great relations with all my former colleagues. Excited to see their progress.
01:22:04.960
I should also say that I am still a shareholder also of Illumina, just for full disclosure.
01:22:10.020
Yeah. Thank you. You have a number of colleagues, as you said, who are still at Grail,
01:22:13.740
who I've gotten to know. One of the things that really intrigued me was, again, some of the
01:22:20.100
histologic differences and the stage differences of cancer. If you look at the opening data,
01:22:29.600
a few things stood out. There were certain histologies that, if you took them all together
01:22:35.660
by stage, didn't look as good as others. For example, talk a little bit about prostate cancer
01:22:42.320
detection using the Galleri test. I think what you're referring to is there's a very wide variety
01:22:50.480
of different performances in different cancers. They're all highly specific, so very low false
01:22:55.700
positive rate because there's only one false positive rate for the whole test, which is probably worth
01:23:00.720
spending some time on later. For example, sensitivity to many GI cancers or certain histologies of lung
01:23:08.280
cancer, the test is very good at detecting earlier stage localized cancers. Particularly in prostate
01:23:15.080
cancer and hormone receptor positive breast cancer, the detection rate is lower for stage one cancers.
01:23:23.620
But this gets to a very important issue, which is what is it that you want to detect? So do you want
01:23:29.720
to detect everything that's called a cancer today? Or is what you want to detect is you want to detect
01:23:33.980
cancers that are going to grow and ultimately cause harm? So the weird thing about cancer screening in
01:23:40.020
general is there's both over and under diagnosis. Most small breast cancers and most DCIS and most
01:23:46.960
even small prostate cancers will never kill the patient or cause morbidity, but there is a small
01:23:52.640
subset that will. And so for those, we have decided to, again, go for a trade-off where we'll often
01:24:00.420
resect things and go through treatments just to make sure that smaller percentage is removed,
01:24:05.660
even though we're removing a ton of other, quote, cancers that are unlikely to ever proceed into
01:24:12.540
anything dangerous. On the flip side, 70% of people who die of cancer, they die from an unscreened cancer.
01:24:19.900
So there's huge underdiagnosis. You should remember that. 70% of people who ultimately die of cancer on
01:24:27.460
their death certificate, they die from a cancer where there was no established screening prior to
01:24:32.900
something like Grail's Galleri. So we have this weird mix of, there's a lot of cancers where we
01:24:37.540
know we're overdiagnosing, but we're doing it for a defensible trade-off. And then there's a huge
01:24:43.400
number of cancer deaths occurring where there's essentially zero diagnosis. But back to the ones
01:24:49.020
where there's underdiagnosis, it gets back to what does it mean to have tumor DNA in your blood?
01:24:55.420
So measuring and detecting a cancer from tumor DNA in your blood is a functional assay.
01:25:01.360
To get tumor DNA in your blood, you have to have enough cells. They have to be growing fast enough,
01:25:07.380
dying fast enough, and have blood access. So those are the things that you require.
01:25:13.180
Now, if you have a tumor that's small, encapsulated, not growing, well, guess what? It's not going to have
01:25:20.180
DNA in the blood. So unlike an imaging assay, which is anatomical, this is really a functional
01:25:26.200
assay. You're querying for whether or not there's a cancer that has the mass, the cell activity and
01:25:33.160
death, and access to the blood to get and manifest its DNA into the blood. So it's really stratifying
01:25:41.600
cancers on whether or not they have the activities. Now, interestingly, this functional assay
01:25:47.180
is very correlated with ultimate mortality. There's a really nice set of data that the
01:25:53.700
GRAIL put out where you look at Kaplan-Meier curves. So over the course of the CCGA study,
01:25:58.960
which is now going out, I don't know, five plus years, you can say, well, what do survival curves
01:26:04.600
look like? If you were positive, your cancer was detected, versus your test was negative, meaning your
01:26:10.540
cancer was not detected by the GRAIL test. And there's a big difference. So basically,
01:26:15.020
if your cancer was undetectable by the GRAIL test, you have a very good outcome, much,
01:26:23.740
much better than the general population with that cancer. So this suggests two things. One is,
01:26:29.380
A, those cancers may not have actually been dangerous because there's not a lot of mortality
01:26:34.240
associated with them. And maybe that's also why they couldn't put their tumor DNA in the blood.
01:26:38.900
The other is whatever the existing standard of care is, it's working well. Now, if you look at all
01:26:45.060
the cancers in the Kaplan-Meier curve that were detected, they have a lot of mortality associated
01:26:50.920
with them. And so what it's showing is that it's the dangerous cancers, the cancers that are
01:26:55.500
accounting for the majority of mortality, those are the ones that the test is detecting.
01:27:00.580
This biological rationale makes a lot of sense, which is, okay, a tumor that grows fast, can get
01:27:06.520
its DNA in the blood. Well, that's probably also a dangerous tumor that is going to become invasive
01:27:11.000
and spread. So again, it's a functional assay. So if your cancer is detected by one of these tests,
01:27:18.680
like the Galleri test, it's saying something about the tumor that is very profound, which is that it's
01:27:25.260
active enough to get its signal into the blood. And it's very likely, if untreated, to ultimately
01:27:32.500
be associated with morbidity and potentially mortality. I think it's an open question of
01:27:38.760
these tumors that aren't detectable and that are in cancers, we know there's a lot of indolent
01:27:45.440
disease. What does it really mean that the test is low sensitivity for that?
01:27:50.220
Yeah. I would say that when I went through these data and I went through every single histology
01:27:56.880
by stage, I did this exercise probably 18 months ago. The one that stood out to me more than any
01:28:04.660
other was the sensitivity and specificity discrepancy. Well, I should say the sensitivity
01:28:12.100
discrepancy between triple negative breast cancer and hormone positive breast cancer. You alluded to
01:28:20.040
this, but I want to reiterate the point because I think within the same quote unquote disease of
01:28:25.240
breast cancer, we clearly understand that there are three diseases. There's estrogen positive,
01:28:31.280
there's HER2/neu positive, there's triple negative. Those are the defining features of three
01:28:36.140
completely unrelated cancers with the exception of the fact that they all originate from the same
01:28:42.360
mammary gland. But that's about where the similarity ends. Their treatments are different,
01:28:46.580
their prognoses are different. And to take the two most extreme examples, you take a woman who has
01:28:53.160
triple positive breast cancer, i.e. it's estrogen receptor positive, progesterone positive,
01:28:59.040
HER2/neu positive. You take a woman who has none of those receptors positive. The difference on the
01:29:05.040
Galleri test performance on stage one and stage two, so this is cancers that have not even spread to
01:29:12.780
lymph nodes. The hormone positives were about a 20% sensitivity for stage one, stage two,
01:29:18.560
and the triple negative was 75% sensitivity for stage one, stage two. And so this underscores your
01:29:26.220
point, which is the triple negative cancer is a much, much worse cancer. And that at stage one,
01:29:34.500
stage two, you're detecting 75% sensitivity portends a very bad prognosis. Now, I think the really important
01:29:45.900
question here, I believe that this is being asked, is does the ability to screen in this way lead to better
01:29:56.320
outcomes? So I will state my bias, because I think it's important to put your biases out there,
01:30:02.220
and I've stated it publicly many times, but I'll state it again. My bias is that yes, it will. My bias
01:30:09.260
is that early detection leads to earlier treatment. And even if the treatments are identical to those
01:30:17.220
that will be used in advanced cancers, the outcomes are better because of the lower rate of tumor burden.
01:30:23.580
And by the way, I would point to two of the most common cancers as examples of that, which are breast
01:30:30.140
and colorectal cancer, where the treatments are virtually indistinguishable in the adjuvant setting
01:30:36.320
versus the metastatic setting. And yet the outcomes are profoundly different. In other words, when you
01:30:42.060
take a patient with breast or colorectal cancer, and you do a surgical resection, and they are a stage three
01:30:47.920
or less, and you give them adjuvant therapy, they have far, far, far better survival than those patients who
01:30:56.300
undergo a resection, but have metastatic disease and receive the same adjuvant therapy. It's not even
01:31:02.380
close. And so that's the reason that I argue that the sooner we know we have cancer and the sooner we
01:31:08.220
can begin treatment, the better we are. But the skeptic will push back at me and say, Peter, the only thing
01:31:14.280
the Grail test is going to do is tell more people bad news. So we'll concede that people are going to
01:31:22.480
get a better, more relevant diagnosis, that we will not be alerting them to cancers that are irrelevant
01:31:29.920
and over-treating them. And we will alert them to negative or more harmful cancers, but it won't
01:31:37.060
translate to a difference in survival. So what is your take on that? And how can that question be
01:31:43.920
definitively answered? It's a very important question. And over time, it will be definitively
01:31:50.700
answered. So we should talk about some of Grail's studies and how they're going about it.
01:31:55.640
So the statistics are very profound, like you said. So most solid tumors, five-year survival,
01:32:01.600
when disease is localized, hasn't spread to another organ, 70 to 80% five-year survival,
01:32:08.360
less than 20% for metastatic stage four disease. That correlation of stage at diagnosis versus five-year
01:32:16.520
survival is night and day. And obviously, everyone would want them and their loved ones,
01:32:21.620
most people in the localized disease category. Now, there's an academic question, like you're
01:32:28.020
saying, which is, okay, well, that's true. But does that really prove that if you find people at
01:32:33.240
that localized disease through this method, as opposed to all the variety of methods that happens today,
01:32:39.300
incidentally, that you will have the same outcome? And sure, I guess you could come up with some
01:32:45.340
very theoretical possibility that somehow that won't, but that doesn't seem very likely.
01:32:52.940
And I think it gets to a fundamental question of, well, are we going to wait decades to see that?
01:32:59.480
And in the meantime, give up the possibility, which is probably likely, that finding these cancers early
01:33:06.360
and intervening early will change outcome. I'm all for, and I think everyone is, bigger and more
01:33:13.140
definitive studies over time. But the idea that we're never going to do that study or just take
01:33:19.560
kind of a nihilistic point of view, that until it's done, we're not going to find cancers early
01:33:24.520
and intervene, I don't think it's conscionable to do that, especially when the false positive rate's low.
01:33:30.220
I think there's a few other ways to come at it, which is, if what you said was really true,
01:33:35.040
I've met some of the folks for whom the GRAIL test has found a positive.
01:33:38.920
I can think of a former colleague in whom the test found an ovarian cancer. Do you think when she
01:33:45.180
went to her OBGYN and said, look, the test said that I have potentially an ovarian cancer and they
01:33:50.560
did an ultrasound and they found something that OBGYN said, you know what, since this was found
01:33:56.080
through a new method, let's not intervene. There's a malignancy. It is an ovarian cancer. We know what the
01:34:02.920
natural history is, but we're not going to intervene. Similarly with cases of pancreatic cancer,
01:34:08.120
head and neck or things like that. I don't understand the logic because today people do
01:34:13.420
show up, though not very often, with early stage versions of these diseases, ovarian, pancreatic,
01:34:18.180
head and neck and things, and we treat them. So why is it you wouldn't treat them if you could find
01:34:23.480
them through this modality? I just don't know of any GI surgeon who says, well, you're one of the
01:34:29.460
lucky people where you found your pancreatic cancer at stage one, two, but we're not going to treat it
01:34:33.460
because there isn't definitive evidence over decades that mortality is better. So I get
01:34:39.200
the academic point and Grail and others are investing a tremendous amount to increase the data.
01:34:45.360
The idea that we have this technology and we're going to allow huge numbers of cancers to just
01:34:51.840
progress to late stage before treating, I don't think that's the right balance of potential benefit
01:34:58.460
versus burden of evidence. So is there now a prospective real world trial ongoing in Europe?
01:35:05.960
There is. Let's talk a little bit about that.
01:35:08.320
The NHS has been piloting the Grail test in a population of about 140,000. So it involves
01:35:15.840
sequential testing, I think at least two tests, and then they look at outcomes. It's an interventional
01:35:22.800
study with return of results. And they're looking for a really interesting endpoint here. So mortality
01:35:29.420
takes time. So, I mean, some cancers, I mean, to ultimately see whether or not getting diagnosed at
01:35:35.460
a different stage and the intervention changes that, that could take one or, in some cases, two decades,
01:35:41.040
but they came up with a really interesting surrogate endpoint, which is reduction in stage four
01:35:46.440
cancers. So here's the logic. I think it makes a lot of sense, which is if people stop getting
01:35:51.480
diagnosed with stage four and say a big reduction in stage three cancer, then doesn't it stand to
01:35:57.500
reason that ultimately you will reduce mortality? So if you remove the end stage version of cancer,
01:36:04.900
which kills most people, and that you know that you have to pass through, most people don't die
01:36:10.380
of stage two cancer. They were diagnosed with stage two, they died because it turned out it wasn't stage
01:36:14.900
two and it spread. If you do a study and within a few years, when you're screening people,
01:36:21.480
there's no more, and let's take the extreme, stage four cancer, then you've stage shifted the
01:36:26.400
population and you're kind of eliminating late stage metastatic cancer. So again, I think while
01:36:33.760
we're waiting for that to read out, my personal belief is the potential benefit of finding cancer
01:36:39.460
is so significant. Testing now for many patients makes sense. And then I think this endpoint of stage
01:36:48.460
four reduction. Yeah, that's a clever, clever endpoint. One of the things that I know that a lot
01:36:55.660
of the folks who oppose cancer screening tend to cite is that a number of cancer screening studies
01:37:01.920
do not find an improvement in all cause mortality, even when there's a reduction in cancer specific
01:37:08.340
mortality. So, hey, we did this colonoscopy study, or we did this breast cancer screening study,
01:37:14.140
and it indeed reduced breast cancer deaths, but it didn't actually translate to a difference in
01:37:18.780
all cause mortality. I've explained this on a previous podcast, but it is worth for folks who
01:37:23.380
didn't hear that to understand why. To me, that's a very, very, oh, how can I say this charitably?
01:37:30.360
That's a very misguided view of the literature because what you fail to appreciate is those studies
01:37:37.060
are never powered for all cause mortality. And if you reduce breast cancer mortality by 40% or 30%,
01:37:46.720
that translates to a trivial reduction in all cause mortality because breast cancer is still just one
01:37:54.920
of 50 cancers. And even though it's a relatively prevalent cancer over the period of time of a study,
01:38:00.860
which is typically five to seven years, the actual number of women who were going to die of
01:38:05.780
breast cancer is still relatively small compared to the number of women, period, who were going to die
01:38:11.780
of anything. And I, in previous podcasts have discussed that it's very difficult to get that
01:38:18.480
detection within the margin of error. And so if you actually wanted to be able to see how that
01:38:24.960
translates to a reduction in all cause mortality, you would need to increase the size of these studies
01:38:30.140
considerably, even though really what you're trying to do is detect a reduction in cancer
01:38:35.360
specific mortality. I say all of that to say that I think one of the interesting things about the
01:38:41.460
NHS study is it is a pan screening study. And to my knowledge, it's the first. In other words,
01:38:48.980
it has the potential to detect many cancers and therefore you have many shots on goal. Potentially,
01:38:56.520
this could show a reduction in all cause mortality and not just cancer specific mortality. I would have to
01:39:02.500
see the power analysis, but I wonder if the investigators thought that far ahead. Do you
01:39:06.840
know? I mean, they're going to follow these patients long-term. They will get, be able to
01:39:11.720
have the data on mortality. I don't know if it's powered for all cause. I think that's unlikely just
01:39:19.500
for the reasons you said, which is the numbers would be really high. I mean, again, if you're powering
01:39:25.200
it to see a reduction in stage four over a couple of years, that may not be enough.
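The dilution argument in this exchange can be made concrete with a small arithmetic sketch. Every number below is an illustrative assumption, not a figure from the NHS study or any screening trial: the share of deaths attributed to the screened cancers, the cause-specific reductions, and the 5% baseline death risk are all hypothetical. The sample-size function uses the standard normal-approximation formula for comparing two proportions, which is a simplification of a real power analysis.

```python
import math
from statistics import NormalDist


def all_cause_relative_reduction(share_of_deaths, cause_specific_reduction):
    """Relative drop in all-cause deaths when only one cause of death is reduced."""
    return share_of_deaths * cause_specific_reduction


def per_arm_sample_size(baseline_death_risk, relative_reduction, alpha=0.05, power=0.80):
    """Approximate participants per arm to detect the drop (two-proportion z-test)."""
    p1 = baseline_death_risk
    p2 = baseline_death_risk * (1.0 - relative_reduction)
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_power = NormalDist().inv_cdf(power)
    variance = p1 * (1.0 - p1) + p2 * (1.0 - p2)
    return math.ceil((z_alpha + z_power) ** 2 * variance / (p1 - p2) ** 2)


# Hypothetical single-cancer trial: the cancer causes 3% of deaths, screening cuts those by 30%.
single_cancer = all_cause_relative_reduction(0.03, 0.30)
# Hypothetical pan-cancer trial: screened cancers cause 25% of deaths, screening cuts those by 20%.
pan_cancer = all_cause_relative_reduction(0.25, 0.20)

# Hypothetical 5% risk of death from any cause during follow-up.
n_single = per_arm_sample_size(0.05, single_cancer)
n_pan = per_arm_sample_size(0.05, pan_cancer)

print(f"single-cancer: {single_cancer:.1%} all-cause reduction, about {n_single:,} per arm")
print(f"pan-cancer:    {pan_cancer:.1%} all-cause reduction, about {n_pan:,} per arm")
```

Under these assumed inputs the single-cancer trial would need millions of participants per arm to see an all-cause effect, while the pan-cancer trial would need far fewer, which is the "many shots on goal" point in numerical form.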
01:39:31.620
Interesting. Well, time will tell. Alex, I want to pivot if we've got a few more minutes
01:39:35.860
to a topic that you and I spend a lot of time talking about these days. And so by way of
01:39:41.520
disclosure, you sort of noted that you've left Illumina somewhat recently. You've started another
01:39:47.360
company. I'm involved in that company as both an investor and an advisor, and it's an incredibly
01:39:51.980
fascinating subject. But one of the things that we talk about a lot is going back to this role of
01:39:59.380
the epigenome. So I think you did a great job explaining it and putting it in context. So we've
01:40:04.800
got these 3 billion base pairs and lo and behold, some 28 million of them also happen to have a methyl
01:40:13.520
group on their C. I'll fill in a few more details that we didn't discuss on the podcast, but just to
01:40:19.520
throw it out there. As a general rule, when we're born, we have kind of our max set of them. And as
01:40:26.060
we age, we tend to lose them. As a person ages, the number of those methylation sites goes down.
01:40:33.780
You obviously explain most importantly what they do, what we believe their most important purpose is,
01:40:39.200
which is to impact gene expression. It's worth also pointing out that there are many hallmarks of
01:40:46.400
aging. There are many things that are really believed to be at the fundamental level that
01:40:52.680
describes why you and I today look and function entirely different from the way we did when we
01:41:00.080
met 25 years ago. We're half the men we used to be. I could make a Laplace Fourier joke there, but I will
01:41:06.860
refrain. So I guess the question is, Alex, where do you think methylation fits in to the biology of
01:41:17.800
aging? That's a macro question, but... Yeah, yeah. So you talked about the hallmarks of aging,
01:41:24.320
because the author, I think it was Hanrahan, came up with that about 10 years ago, this hallmarks of
01:41:29.620
aging. And he recently gave a talk where he talked about perhaps methylation is the hallmark of
01:41:35.920
aging. And what he's referring to is the mounting data that the epigenetic changes are the most
01:41:45.280
descriptive of aging and are becoming more and more causally linked to aging events.
01:41:50.900
There's lots of data that show that people of comparable age, but different health status,
01:41:58.460
for example, smokers versus non-smokers, people who exercise versus people who don't,
01:42:03.240
people who are obese versus people who are not, can have very different methylation patterns.
01:42:09.500
There's also some data that look at centenarians relative to non-centenarians. And obviously,
01:42:17.780
that's a complicated analysis because by definition, there's a difference in age,
01:42:21.900
but you get a sense of different patterns of methylation. And clearly, centenarians we've
01:42:26.880
established long ago do not acquire their centenarian status by their behaviors.
01:42:31.940
Just look at Charlie Munger and Henry Kissinger, two people who recently passed away at basically
01:42:37.700
the age of a hundred, despite no evidence whatsoever that they did anything to take care
01:42:42.360
of themselves. So clearly their biology and their genes are very protective. As you said,
01:42:49.200
there are a bunch of these hallmarks. I think the original paper talked about nine and that's
01:42:54.160
been somewhat expanded. But you share that view, I suppose, that the epigenome sits at the top
01:43:01.480
and that potentially it's the one that's impacting the other. So when we think about
01:43:05.840
mitochondrial dysfunction, which no one would dispute, mine and yours are nowhere near as good
01:43:12.700
as they were 25 years ago. Our nutrient sensing pathways, inflammation, all of these things are
01:43:18.160
moving in the wrong direction as we age. How do you think those tie to methylation and to the epigenome?
01:43:27.220
Maybe let's reduce it to like a kind of an engineering framework. If we took Peter's epigenome
01:43:34.160
from 25 years ago when I first met you, right? And we knew for every cell type and every cell,
01:43:41.860
what was the methylation status at all 28 million positions? We had recorded that and we took yours
01:43:48.100
today where most of those cells have deviated from that and we could flip all those states back.
01:43:55.700
That's kind of how I think about it is the cells don't go away, just whether or not they have the
01:43:59.580
methyl group or not changes. And some places gain it, some places lose it. If we could flip all those
01:44:06.600
back, would that force the cell to behave like it was 25 years ago? Express genes, the fidelity with
01:44:15.320
which it controlled those genes, the interplay between them, would it be reprogrammed back to
01:44:20.860
that state? And so that I think is a really provocative hypothesis. We don't know that for
01:44:27.720
sure, but there's more and more evidence that that might be possible. And so to me, that's the
01:44:33.080
burning question is now that we have the ability to characterize that and we know what it looks like
01:44:37.760
in a higher functioning state, which correlates with youth, and we are gaining technologies to be able
01:44:44.280
to modulate that and actually change the epigenome as opposed to modifying proteins or gene expressions,
01:44:50.120
but actually go in and remethylate and demethylate certain sites. Can we reprogram things back to that
01:44:57.780
earlier state? And if it is the root level at which things are controlled, will you then get all of the
01:45:04.300
other features that the cell had and the organism had? That's a really exciting question to answer.
01:45:09.660
Because if the answer is yes, or even partially yes, then it gives us a really concrete way to go
01:45:15.700
about this. And so we talk about the hallmarks and the hallmarks are complex and interrelated.
01:45:21.680
What I like about the epigenome is we can read it out and we're gaining the ability to modify it
01:45:27.240
directly. So if really it's the most fundamental level at which all of these other things are
01:45:32.040
controlled, it gives us, again, maybe back to the early discussion, a very straightforward
01:45:37.100
engineering way to go about this. Let's talk a little bit about how that's done.
01:45:41.840
A year ago, you were part of a pretty remarkable effort that culminated in a publication in Nature,
01:45:48.480
if I recall, it sequenced the entire human epigenome. So if we had the Human Genome Project
01:45:54.280
24 years ago, roughly, we had the Epigenome Project. Can you talk a little bit about that
01:46:00.380
and maybe explain technologically how that was done as well?
01:46:06.260
Yeah. So in the development of the Grail Galleri test, there was a key capability that we knew was
01:46:12.820
going to be important for a multi-cancer test. So very different than most cancer screening today,
01:46:19.120
which is done one cancer at a time. So if you have a blood test and it's going to tell you there's a
01:46:24.520
cancer signal present and this person should be worked up for cancer, you'd really like to know,
01:46:30.160
well, where does that cancer likely reside? Because that's where you should start your workup. And you
01:46:35.260
want it to be pretty accurate. So if the algorithm detects a cancer and it's really a head and neck
01:46:40.760
cancer, you'd like the test to also say it's likely head and neck and then do an endoscopy
01:46:45.300
and not have to do lots of whole body imaging or a whole body PET CT or things like that.
01:46:52.200
So we developed something called a cancer site of origin. And so today the test has that. If you
01:46:57.300
get a signal detected, it also predicts where the cancer is. And it gives like a top two choice,
01:47:02.900
top two choices. It's about 90% accurate in doing that. But how does that work? The physicians and
01:47:10.040
patients who have gotten that have described it as kind of magic that it detects the cancer and predicts it.
01:47:14.840
And it's based on the methylation patterns. So methylation is what determines cell identity and
01:47:21.660
cell state. So again, DNA code is more or less the same in your cells, but the methylation patterns
01:47:28.340
are strikingly different. When a cell replicates, why does it continue to be the same type of cell?
01:47:34.580
When an epithelial cell replicates, same DNA as a T cell or a heart cell, but it doesn't become those,
01:47:41.040
it stays. It's because the methylation pattern, those exact methylation states on the 28 million
01:47:46.580
are also replicated. So just in the same way DNA has a way of replicating the code, there's an enzyme
01:47:52.220
that looks and copies the pattern to the next cell. And so that exact code determines, again,
01:48:00.980
is it a colonic epithelial cell or a fallopian epithelial cell or whatever it is. And so we knew
01:48:06.980
that the only way to make a predictor in the cell-free DNA is to have that atlas of all the
01:48:14.480
different methylation patterns. And so with a collaborator, a guy named Yuval Dor at Jerusalem
01:48:20.160
University, we laboriously got surgical remnants from healthy individuals. He developed protocols
01:48:27.620
to isolate the individual cell types of most of the cells that get transformed in cancer.
01:48:34.860
And then we got pure methylation patterns where we sequenced, like sequencing the whole genome,
01:48:40.000
sequenced the whole methylome of all those cell types. And we published that a year ago
01:48:44.160
as the first atlas of the human methylome in all of the major cell types. And so for the first time,
01:48:51.300
we could say, hey, this is the code, which makes you a beta islet cell in the pancreas that makes
01:48:57.800
insulin versus something else. Interestingly, there's only one cell in the body where the insulin promoter
01:49:05.940
is not methylated. And that is the beta islet cell. Every other single cell, that promoter is heavily
01:49:12.240
methylated because it shouldn't be making insulin. It's those kinds of signals that when you have the
01:49:18.840
cell-free DNA and you look at the methylation pattern allows the algorithm to predict, hey,
01:49:23.120
this isn't just methylation signal that looks like cancer. The patterns and what's methylated and what's
01:49:29.480
not methylated looks like colorectal tissue or a colorectal cancer. And that's how the algorithm does it.
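The idea of matching a cell-free DNA methylation pattern against a reference atlas can be illustrated with a toy sketch. This is not the Grail algorithm, which the conversation does not describe in detail. The atlas values, the four marker sites, and the function name `rank_tissues` are all hypothetical, and a simple mean absolute difference stands in for whatever classifier is actually used. It returns the two closest cell types, mirroring the "top two choices" mentioned above.

```python
# Hypothetical reference atlas: fraction of reads methylated (0.0 to 1.0)
# at four made-up marker sites for each cell type.
ATLAS = {
    "colon_epithelium": [0.9, 0.1, 0.8, 0.9],
    "pancreatic_beta_cell": [0.9, 0.9, 0.1, 0.9],
    "t_cell": [0.1, 0.9, 0.9, 0.9],
}


def rank_tissues(observed, atlas, top_k=2):
    """Return the top_k cell types whose reference pattern is closest to the observed one."""
    distances = {}
    for tissue, reference in atlas.items():
        # Mean absolute difference across marker sites; smaller means more similar.
        total = sum(abs(o - r) for o, r in zip(observed, reference))
        distances[tissue] = total / len(reference)
    return sorted(distances, key=distances.get)[:top_k]


# Hypothetical methylation fractions measured in a cell-free DNA sample.
sample = [0.85, 0.15, 0.7, 0.95]
top_two = rank_tissues(sample, ATLAS)
print(top_two)
```

A real atlas covers far more sites and cell types, and a blood sample mixes DNA from many tissues, so an actual predictor has to separate a small tumor-derived signal from that background.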
01:49:37.080
And so this atlas, again, was a real breakthrough for diagnostics and it made cancer site of origin
01:49:44.080
useful. It's also being used for lots of MRD or those cancer monitoring tests too, because it's so
01:49:50.180
sensitive. But it also brought up this interesting possibility, which is if you're going to develop
01:49:55.320
therapeutics or you want to, say, rejuvenate cells or repair them that have changed or become
01:50:01.760
pathologic, what if you compare the methylation pattern in the good state versus the bad state?
01:50:07.020
Does that then tell you the exact positions that need to be fixed? And then with another technology,
01:50:13.440
which can go and flip those states, will that reverse or rejuvenate the cell to the original or
01:50:23.600
So Alex, unlike the genome, which doesn't migrate so much as we age, I mean, obviously it accumulates
01:50:29.940
mutations, but with enough people, I guess we can figure that out pretty quickly. Do you need
01:50:35.880
longitudinal analysis of a given individual, i.e. within an individual to really study the
01:50:43.260
methylome? Do you need to be able to say, boy, in an ideal world, this is what Peter's epigenome
01:50:49.820
looked like when he was one year, you know, at birth, one year old, two, three, four, 50 years old,
01:50:54.660
so that you could also see not just how does the methylation site determine the tissue specificity
01:51:04.820
or differentiation, but how is it changing with normal aging as well?
01:51:11.860
I think a lot of it is not individual specific. I'll give you an example. So I've done a fair amount
01:51:17.580
of work in T cells. And if you look at, say, exhausted effector T cells versus naive memory
01:51:24.540
cells, where younger individuals tend to have more of those, and it gives them more reservoir
01:51:29.900
to do things like fight disease, fight cancer. There's very distinct methylation changes. Certain
01:51:36.720
genes get methylated or demethylated. And those changes seem to be, again, very correlated with this
01:51:44.120
change in T cell function. My belief is that those represent fundamental changes as the T cell
01:51:52.240
population gets aged, and you end up with more and more T cells that, relatively speaking, are useless.
01:51:58.440
And so if you wanted to rejuvenate the T cells, repairing those methylation states is something that
01:52:04.380
would benefit everyone. Now, there are definitely a small percentage of methylation sites that are
01:52:11.220
probably drifting or degrading, and those could be specific to individuals. There's some gender
01:52:16.840
specific sites, for sure. There's some ethnic ones. But big, big changes seem to happen more with loss of
01:52:25.680
function, big changes in age that are probably common across individuals, or in the case of cancer, we also
01:52:34.660
have profound changes. When you think about this space, a term comes up. If folks have been kind of
01:52:42.400
following this, they've probably heard of things called Yamanaka factors. In fact, a Nobel Prize was
01:52:47.980
awarded to Yamanaka for the discovery of these factors. Can you explain what they are and what role they
01:52:58.780
What Yamanaka and colleagues discovered is that if you take fully differentiated cells, for example,
01:53:07.060
fibroblasts, and you expose them to a particular cocktail of four transcription factors, that the
01:53:13.940
cell reverts to a stem cell-like state. And these are called induced pluripotent stem cells. You subject
01:53:21.440
a differentiated cell that was a mature cell of a particular type. I think most of their work was in
01:53:27.560
fibroblasts. And the cell, when it's exposed to these transcription factors, and these transcription
01:53:33.560
factors are powerful ones at the top of the hierarchy, they unleash a huge number of changes
01:53:39.700
in gene expression. Genes get turned on, get turned off. And then ultimately, if you keep it going,
01:53:46.840
you end up with something that is a type of stem cell. And why this was so exciting is it gave the
01:53:54.080
possibility to create stem cells through a manufactured process. As you know, there's a lot
01:53:59.720
of controversy about getting stem cells from embryos or other sources. This created a way now to create
01:54:07.180
stem cells and use them for medical research by just taking an individual's own cells and kind of
01:54:15.400
How much did that alter the phenotype of the cell itself? In other words, the fibroblast has
01:54:24.300
a bunch of phenotypic properties. What are the properties of a stem cell and how much of that is
01:54:31.040
driven by the change in methylation? In other words, I'm trying to understand how these transcription
01:54:37.380
factors are actually exerting their impact throughout this regression, for lack of a better word.
01:54:42.860
We refer to cell-type specific features as somatic features, like a T-cell receptor. That's a feature
01:54:50.260
of a T-cell or a dendrite or an axon would be for a neuron or an L-type calcium channel for a cardiac
01:54:57.080
myocyte. So those are very cell-type specific features. So if you turn on these Yamanaka factors and you
01:55:03.920
go back to a pluripotent stem cell, you lose most of these. And that word pluripotent means the
01:55:10.840
potential to become anything, at least in theory. So you lose most of these cell-type specific
01:55:16.860
features. So the use of the iPSCs is then to re-differentiate them. And that's what people have
01:55:24.120
been attempting to do. And it opened up the ability to do that, which is you create this
01:55:28.340
stem cell that now potentially has the ability to be differentiated into something else. You give it a
01:55:34.140
different cocktail and you try to make it a neuron or a muscle cell, and then use that in a tissue
01:55:41.020
replacement therapy. And there's a lot of research on that and a lot of groups trying to do that.
01:55:46.220
You also asked about what is the relationship between that and the epigenetics and methylation
01:55:50.740
state. That has not been well explored. And that's something that I and others are excited to do,
01:55:56.620
because it could be that you're indirectly affecting the epigenome with these Yamanaka factors,
01:56:02.460
and that if you translated that into an epigenetic programming protocol, you could have a lot more
01:56:08.520
control over it. Because one of the challenges with the Yamanaka factors is if you do this for
01:56:14.900
long enough, eventually the stem cell becomes something much more like a cancer cell and just
01:56:21.000
becomes kind of unregulated growth. And so again, huge breakthrough in learning about this kind of
01:56:27.820
cell reprogramming and de-differentiation, but our ability to use it in a practical way for tissue and
01:56:35.100
cell replacements is not there. My hope is that by converting it to an epigenetic level, it'll be more
01:56:41.700
tractable. You mentioned that this is typically done with fibroblasts. I assume the experiment has been
01:56:47.160
done where you sprinkle Yamanaka factors on cardiac myocytes, neurons, and things like that. Do they not
01:56:53.960
regress all the way back to pluripotent stem cells? I think to varying extents. I mean, if you truly have
01:57:00.400
a pluripotent stem cell, I guess in theory, it shouldn't matter where it came from, right? Because
01:57:05.280
it's pluripotent. So with developmental factors, where did your first neurons come from? You had a
01:57:11.400
stem cell, and then in the embryo or the fetus, there were factors that then coaxed that stem cell to
01:57:18.020
become these other types of cells and tissues. So if it's truly pluripotent, you should be able to do
01:57:24.080
that. Now, I think you're getting at something which is different, which is called partial
01:57:27.860
reprogramming. He and the people who have followed his work, they're trying to do this thing, which
01:57:33.640
is kind of stop halfway. So what if you took a heart cell or a T cell that's lost a lot of function,
01:57:41.360
and you give it these Yamanaka factors, but you stop it before it really loses its cell identity,
01:57:48.700
will it have gained some properties of its higher functioning youthful state without
01:57:53.500
having lost it? And so there's some provocative papers out there on this. There's a guy, Juan Carlos
01:58:00.520
Belmonte, who's done some work on this and some very provocative results in mice of doing these
01:58:06.620
partial reprogramming protocols and rejuvenating. Again, it's mice, so all the usual caveats,
01:58:13.480
but getting very striking improvements in function, in eyesight, cognition, again, in these
01:58:19.380
mouse metrics. So certainly interesting in trying to understand how that might be able to translate to
01:58:25.040
humans. Again, the worry there would be that if you don't control it, then you could make essentially
01:58:31.360
a tumor. So it's opened up that whole area of science that it's possible to do these kinds of
01:58:37.640
dramatic de-differentiations, how to really harness that in a context of human rejuvenation.
01:58:44.440
We don't know how to do that yet, but there's a lot of people trying to figure that out.
01:58:48.940
If you had to guess with a little bit of optimism, but not pie in the sky optimism,
01:58:53.940
where do you think this field will be in a decade? There was a day when a decade sounded a long
01:59:01.540
time away. It doesn't sound that long anymore. Decades seem to be going by quicker than I remember.
01:59:07.100
So it's going to be a decade pretty soon, but that's still a sizable amount of time for the field to
01:59:12.940
progress. What do you realistically think can happen with respect to addressing the aging phenotype
01:59:22.880
vis-a-vis some method of reversal of aging, some truly geroprotective intervention?
01:59:32.320
So I'm optimistic and I'm a believer. I think for specific organs and tissues and cell types,
01:59:40.300
there will be treatments that rejuvenate them. It's hard to see in a decade that there's just a
01:59:45.260
complete rejuvenation of every single cell and tissue in a human, but joint tissues,
01:59:52.320
the retina, immune cells. We're learning so much about the biology related to rejuvenation and
02:00:00.960
healthier states of them. And then in combination with that, the tools to manipulate them, which is
02:00:06.200
equally important. You could understand what the biology is, but not have a way to intervene.
02:00:09.820
The tools to go in and edit these at a genomic level, to edit it at an epigenetic level,
02:00:16.960
to change the state and the delivery technologies to get them to very specific tissues and organs
02:00:23.520
is also progressing tremendously. So I definitely see a world in 10 years from now where we may have
02:00:30.120
rejuvenation therapies for osteoarthritis, rejuvenation for various retinopathies, where
02:00:37.320
we can rejuvenate whole classes of immune cells that make you more resistant to disease,
02:00:42.820
more resistant to cancer. I think we'll see things that will have real benefits in improving health
02:00:49.300
span. Alex, this is an area that I think truly excites me more than anything else in all of
02:00:56.820
biology, which is to say, I don't think there's anything else in my professional life that grips my
02:01:03.880
fascination more than this question. Namely, if you can revert the epigenome to a version that
02:01:13.800
existed earlier, can you take the phenotype back with you? And that could be at the tissue level,
02:01:20.100
as you say, could I make my joints feel the way they did 25 years ago? Could it make my T cells
02:01:27.600
function as they did 25 years ago? And obviously one can extrapolate from this and think of the entire
02:01:33.440
organism. So anyway, I'm excited by the work that you and others in this field are doing
02:01:39.040
and grateful that you've taken the time to talk about something that's really no longer your main
02:01:44.060
project, but something for which you provide probably as good a history of as anyone vis-a-vis
02:01:50.580
the liquid biopsies. And then obviously a little bit of a glimpse into the problem that obsesses you
02:01:54.740
today. Awesome. Well, fun chatting with you as always, Peter. Glad to have the opportunity to dive
02:02:00.260
in deep with this. There aren't many places to do this. Thank you. Thanks, Alex. Thank you for listening
02:02:05.680
to this week's episode of The Drive. It's extremely important to me to provide all of this content
02:02:10.720
without relying on paid ads. To do this, our work is made entirely possible by our members. And in
02:02:16.280
return, we offer exclusive member-only content and benefits above and beyond what is available for free.
02:02:23.000
So if you want to take your knowledge of this space to the next level, it's our goal to ensure
02:02:26.880
our members get back much more than the price of the subscription. Premium membership includes
02:02:31.740
several benefits. First, comprehensive podcast show notes that detail every topic, paper, person,
02:02:38.960
and thing that we discuss in each episode. And the word on the street is nobody's show notes rival
02:02:44.220
ours. Second, monthly Ask Me Anything, or AMA, episodes. These episodes are comprised of detailed
02:02:51.380
responses to subscriber questions typically focused on a single topic and are designed to offer a great
02:02:57.340
deal of clarity and detail on topics of special interest to our members. You'll also get access
02:03:02.060
to the show notes for these episodes, of course. Third, delivery of our premium newsletter, which is put
02:03:08.160
together by our dedicated team of research analysts. This newsletter covers a wide range of topics related
02:03:14.100
to longevity and provides much more detail than our free weekly newsletter. Fourth, access to our
02:03:21.160
private podcast feed that provides you with access to every episode, including AMAs, sans the spiel you're
02:03:27.400
listening to now and in your regular podcast feed. Fifth, The Qualys, an additional member-only podcast
02:03:34.740
we put together that serves as a highlight reel featuring the best excerpts from previous episodes of
02:03:40.700
The Drive. This is a great way to catch up on previous episodes without having to go back and
02:03:45.200
listen to each one of them. And finally, other benefits that are added along the way. If you want
02:03:50.400
to learn more and access these member-only benefits, you can head over to peterattiamd.com forward slash
02:03:56.900
subscribe. You can also find me on YouTube, Instagram, and Twitter, all with the handle
02:04:01.980
peterattiamd. You can also leave us a review on Apple Podcasts or whatever podcast player you use.
02:04:08.660
This podcast is for general informational purposes only and does not constitute the practice of
02:04:13.940
medicine, nursing, or other professional healthcare services, including the giving of medical advice.
02:04:19.420
No doctor-patient relationship is formed. The use of this information and the materials linked to this
02:04:25.240
podcast is at the user's own risk. The content on this podcast is not intended to be a substitute for
02:04:31.180
professional medical advice, diagnosis, or treatment. Users should not disregard or delay in obtaining
02:04:36.800
medical advice for any medical condition they have, and they should seek the assistance of their
02:04:41.760
healthcare professionals for any such conditions. Finally, I take all conflicts of interest very
02:04:47.100
seriously. For all of my disclosures and the companies I invest in or advise, please visit
02:04:52.720
peterattiamd.com forward slash about where I keep an up-to-date and active list of all disclosures.