The Evolution of Digital Pathology with Lee Cooper, PhD

New advances in digital pathology are revolutionizing the analysis of disease, paving the way for greater accuracy and efficiency when it comes to diagnostics, predicting outcomes and treatment. In this episode, Lee Cooper, PhD, discusses the future of digital and computational pathology and his research on machine learning and pathology, including a recent study published in Nature Medicine on using AI in predicting clinical outcomes for breast cancer patients.

“When it comes to the image analysis and the AI, it's all about improving the accuracy, the reproducibility, and the efficiency of the diagnosis — building models that perform tasks that the pathologists do, or that check their work, or that help the lab operate in a more efficient way.” — Lee Cooper, PhD

Director, Institute for Artificial Intelligence in Medicine - Center for Computational Imaging and Signal Analytics in Medicine
Associate Professor of Pathology (Experimental Pathology), McCormick School of Engineering and Preventive Medicine (Health & Bioinformatics)
Member of the Robert H. Lurie Comprehensive Cancer Center

Episode Notes

Digital pathology has been around for a long time, but advances in whole slide imaging and AI-based solutions in recent years have begun to truly revolutionize the analysis of disease. Cooper’s work reimagining clinical practices of pathology through digital and computational programming is on the cutting edge of this research.

Whole slide imaging, a central feature of digital pathology, was approved by the FDA in 2017 for use in surgical pathology diagnosis as an alternative to glass slides viewed on a microscope.
With his team at Northwestern, Cooper aims to make the clinical practice of pathology better and more efficient, asking questions such as how to build AI models with less data, or how to build software that can scale as the field grows.
With collaborating pathologists, Cooper and his team research brain tumors, breast cancer, prostate cancer, hematologic malignancies, leukemias and lymphomas.
Cooper’s recent publication in Nature Medicine looks at an AI tool that can potentially assess breast cancer severity in patients with greater accuracy while also predicting the course of the disease.
Digital imaging in pathology, which allows for diagnosis from a pathologist anywhere in the country, can potentially address health disparities, allowing those at community health centers access to global experts.
The Digital Slide Archive is a collaboration Cooper and his team have had with Emory University and a software company named Kitware. The software platform helps build networks of collaborators from all over the world to work together on large datasets it manages.
Cooper encourages scientists currently using slides in their work to consider not only scanning their slides but also adopting commercial software that could allow them to analyze their data more quantitatively.
AI software still has many limitations, Cooper explains. For example, it performs well on inputs that resemble its training data but behaves unpredictably when thrown a curveball from outside that data, and it gives no warning when this happens. This can pose serious safety issues.
The U.S. may be behind Europe in the adoption of these AI models due to a difference in medical regulations and economic interests.
When it comes to the future of AI in the medical field, Cooper sees a democratization of AI on the horizon, in which medical centers can build their own AI models and, eventually, do so with less data.

Additional Reading

Discover Cooper’s recent publications
Follow Cooper on LinkedIn
Find out more about the Computational and Integrative Pathology Group in the Department of Pathology at Feinberg

Recorded on February 27, 2024.

Continuing Medical Education Credit

Physicians who listen to this podcast may claim continuing medical education credit after listening to an episode of this program.

Target Audience

Academic/Research, Multiple specialties

Learning Objectives

At the conclusion of this activity, participants will be able to:

Identify the research interests and initiatives of Feinberg faculty.
Discuss new updates in clinical and translational research.

Accreditation Statement

The Northwestern University Feinberg School of Medicine is accredited by the Accreditation Council for Continuing Medical Education (ACCME) to provide continuing medical education for physicians.

Credit Designation Statement

The Northwestern University Feinberg School of Medicine designates this Enduring Material for a maximum of 0.25 AMA PRA Category 1 Credit(s)™. Physicians should claim only the credit commensurate with the extent of their participation in the activity.

American Board of Surgery Continuous Certification Program

Successful completion of this CME activity enables the learner to earn credit toward the CME requirement(s) of the American Board of Surgery’s Continuous Certification program. It is the CME activity provider's responsibility to submit learner completion information to ACCME for the purpose of granting ABS credit.

Disclosure Statement

Lee Cooper, PhD, has received consulting fees from Tempus. Content reviewer Jeffery Goldstein, MD, PhD, has nothing to disclose. Course director, Robert Rosa, MD, has nothing to disclose. Planning committee member, Erin Spain, has nothing to disclose. FSM’s CME Leadership, Review Committee, and Staff have no relevant financial relationships with ineligible companies to disclose.

All the relevant financial relationships for these individuals have been mitigated.


Read the Full Transcript

[00:00:00] Erin Spain, MS: This is Breakthroughs, a podcast from Northwestern University Feinberg School of Medicine. I'm Erin Spain, host of the show. New advances in digital pathology are revolutionizing the analysis of disease, paving the way for greater accuracy and efficiency when it comes to diagnostics, predicting outcomes and treatment. Dr. Lee Cooper is an associate professor of pathology here at Feinberg. He is also the director of the Division of Computational Pathology and the Center for Computational Imaging and Signal Analytics. He joins me today to discuss the future of digital and computational pathology, as well as his research on machine learning and pathology, including a recent study published in Nature Medicine on using artificial intelligence in predicting clinical outcomes for breast cancer patients. Welcome to the show.

[00:01:06] Lee Cooper, PhD: Thank you. It's great to be with you.

[00:01:08] Erin Spain, MS: Let's start off by having you explain what digital pathology is, its traditional applications and how it's really changed in recent years.

[00:01:17] Lee Cooper, PhD: Digital pathology is a really broad term, and it just basically describes any kind of digital imaging of glass slides. The first version of digital pathology was called telepathology. And so this is where people would share images over distances to do consultation on cases. Somebody in Germany, for example, might want to consult with someone in California who's an expert on a specific case, and they could use a telepathology system to do that remote consultation. More recently it's about whole slide imaging. And so from about the mid 1990s, these microscopes came around that were able to generate a high-resolution image of an entire glass slide. And that was mostly used for research. There were just a lot of impediments to clinical adoption, as it's very slow. The images are very large, and it's expensive to store them. More recently though, since 2017, the FDA approved whole slide imaging as a non-inferior option to glass for surgical pathology diagnosis. So instead of doing diagnosis for patients based on glass in a microscope, you could use these digital images instead. That was an approved solution. When COVID came around, this sort of remote option of using digital pathology was very popular. And so there was a temporary relaxation of all the regulations surrounding the use of this imaging clinically. And that really, I think, accelerated adoption. Other factors driving the clinical adoption are decrease in storage costs, so every year it's cheaper and cheaper to store data, the scanners are faster, the quality is better. And that's really led to extremely widespread adoption in Europe. We're ramping up in the US now, but we're still, I think, behind where Europe is where many hospitals are fully digital.

[00:02:55] Erin Spain, MS: But your lab really is on the cutting edge of this technology and you're investigating how machine learning and AI can be used to improve diagnostics when it comes to various diseases. Tell me not only about your lab and your research, but how you found your way into this very specific area of expertise.

[00:03:13] Lee Cooper, PhD: I was studying engineering in my undergrad and I took a biology elective. I'd studied mostly math and physics and computing. And what I found is that biology and disease are so much more complex compared to the technical coursework I was taking. And so that was really very interesting and intriguing to me. In graduate school, I found an advisor who had an electrical engineering background, but also quite a bit of biology. And so they were working with digital pathology images. This was in 2005, very early in the world of digital pathology. And when you see these images, they really reveal the complexity of the tissues, which was fascinating to me. You know, you don't need to be a pathologist to appreciate that when you see what these images look like. And they were also very large. And so I was intrigued by the challenge of how do you efficiently compute on these things? What's kind of kept me engaged in this field is that the pathologists, I think, are fascinating people, just their depth of knowledge and the language they use to talk to each other about disease is very interesting. And, you know, they're essentially kind of like human pattern recognition machines. And so that's really what we aim to do is to emulate them in some ways or to complement their capabilities with AI machine learning. So when it comes to my own research, to put it precisely, we build models, AI models and software to try to improve clinical practice of pathology and also research where pathology is used. The pillars of our work: we focus on AI models and kind of fundamental AI research. So, you know, how can we deal with really challenging problems where you have this massive image and there's this tiny portion of it that's relevant to the diagnosis? That's a very hard problem. How can we build models with less data? That's a very active area of research, and we do this across a lot of different disease domains.
The other pillar of our research, I would say, is software. How do we build software that can scale to this challenge? So a hospital can produce millions of gigabytes of pathology data every year. And so, you really need to build high quality software to do anything with that, and also tools for kind of managing data that, you know, researchers can use to do things with this type of data.

[00:05:20] Erin Spain, MS: I'd like to hear a little bit more about some of the diseases that you're working on and collaborating with the pathologists. Can you share that with me?

[00:05:27] Lee Cooper, PhD: So we do research in brain tumors, breast cancer, prostate cancer. We've looked at hematologic malignancies, leukemias and lymphomas. It's really a lot of different diseases. One of the challenges is like learning enough about each of these things to meaningfully participate in this research. Pathologists tend to really specialize. And we have to collaborate with many different specialties of pathology. And that's something that's quite challenging.

[00:05:53] Erin Spain, MS: Well, your recent publication in Nature Medicine looks at an AI tool that can potentially assess breast cancer in patients with greater accuracy. There have been other studies looking at breast cancer using AI tools, but this one's different. Tell me about this study and what you found.

[00:06:10] Lee Cooper, PhD: This study looks at one of the fundamental tasks of pathology, which is grading. What that means is essentially, how abnormal does the tissue look. And the way that's currently done by pathologists in invasive breast cancer is that they focus on the properties of the cancer cells. What this study did is to use the fact that we know from biology that the noncancerous cells play a very important role in cancer initiation, cancer progression. But this information isn't used in grading. What we tried to do is to develop a computer system that can measure these things in a very precise way, and then use that to predict the future course of a patient's disease. These patterns of non-cancerous cells, they're things that I think the pathologists appreciate, but it's very difficult for them to score them in a reproducible way and with time constraints. It's really a place where the computer can excel, and I think where it complements the pathologists very well. The way this model works is we build a model that creates an atlas of all of the cells and structures in breast cancer tissues from these whole slide images. And from that atlas, we can generate measurements that describe how these structures appear, how abundant they are, and some of their interactions with each other. And then we take those measurements, which make sense to the pathologists. They're all designed in a rational way to capture cancer biology or prior knowledge about breast cancer. And then we use those measurements to build a prognostic model that predicts essentially patient survival. One of the advantages I think of this approach is that it's holistic. So it looks at the entire disease. It looks at cancer cells, the immune cells, other normal cells from the breast, and it's very transparent. So we can explain to the pathologists, what are the features that are important for survival that either correlate with a better prognosis or a worse prognosis?
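The three steps Cooper describes here (an atlas of cells, measurements derived from the atlas, and a prognostic model built on those measurements) can be sketched in a few lines. This is only an illustration under stated assumptions, not the method from the Nature Medicine study: the cell labels, coordinates, feature names, radius and weights below are all hypothetical, and a real prognostic model would be fit to patient outcomes rather than use hand-picked weights.

```python
from collections import Counter
from math import dist

# Hypothetical "atlas": one (cell type, (x, y) position) entry per detected cell.
atlas = [
    ("cancer", (0.0, 0.0)),
    ("cancer", (10.0, 0.0)),
    ("lymphocyte", (12.0, 3.0)),
    ("fibroblast", (40.0, 40.0)),
    ("lymphocyte", (80.0, 80.0)),
]


def abundance(cells):
    """Return the fraction of cells belonging to each cell type."""
    counts = Counter(label for label, _ in cells)
    total = len(cells)
    return {label: n / total for label, n in counts.items()}


def fraction_near(cells, source, target, radius):
    """Fraction of `source` cells with at least one `target` cell within `radius`."""
    sources = [pos for label, pos in cells if label == source]
    targets = [pos for label, pos in cells if label == target]
    if not sources:
        return 0.0
    near = sum(1 for s in sources if any(dist(s, t) <= radius for t in targets))
    return near / len(sources)


def risk_score(features, weights):
    """Combine named measurements into one score with a weighted sum."""
    return sum(weights[name] * value for name, value in features.items())


# Step 2: human-readable measurements derived from the atlas.
features = {
    "lymphocyte_fraction": abundance(atlas).get("lymphocyte", 0.0),
    "lymphocytes_near_cancer": fraction_near(atlas, "lymphocyte", "cancer", radius=5.0),
}

# Step 3: placeholder weights chosen arbitrarily for the sketch, not study results.
weights = {"lymphocyte_fraction": -1.0, "lymphocytes_near_cancer": -0.5}
score = risk_score(features, weights)
```

Because each feature has a name a pathologist can read, the contribution of every measurement to the score can be inspected directly, which is the transparency Cooper points to.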

[00:07:57] Erin Spain, MS: It's fascinating. There's so many different implications for a tool like this. Can you talk about some of the ways that you think this could really impact and improve patient care?

[00:08:08] Lee Cooper, PhD: So the information that's produced by the pathologists is taken by the oncologists and used to determine the treatment for the patient. And so, you know, to the extent that we can improve the quality of that information, we can improve the selection of that treatment. For example, in our study, we identified a small but important group of people who did not experience any deaths due to breast cancer. And one of the interesting things about this group is that these people, when assessed by the pathologist, they had high grades, low grades. So, it was really a spectrum of disease. And some of the most distinguishing features between these patients and patients who had worse outcomes were in these normal cells. So they had unusual looking stroma tissue that kind of surrounds the cancer. With that type of precise approach, you can maybe identify people, for example, who don't need aggressive treatment or who do need aggressive treatment.

[00:09:00] Erin Spain, MS: There could also be an implication for addressing health disparities. Can you talk to me about that?

[00:09:06] Lee Cooper, PhD: Many patients who were diagnosed with breast cancer received their diagnosis from a general pathologist at a community health center. Studies show that in all areas of pathology, pathologists who specialize in a specific area like breast cancer, prostate cancer, they tend to give better diagnoses. One of the things you can do with these computational models is you could hypothetically ensure that everybody receives the same quality of diagnosis no matter where they are. Whether you're in the Gold Coast where Northwestern is, or in a rural part of the United States where the nearest hospital is far away, you could disseminate that capability through the power of the internet and cloud computing.

[00:09:47] Erin Spain, MS: What stage is this research in right now? You published this article in Nature Medicine, but what are you doing now to move this into patient care?

[00:09:56] Lee Cooper, PhD: It's typically a very long road to advance these models into practice. What we need to do is to validate on more patients. And so we have partnerships and collaborations with other health centers to look at larger datasets than the ones we published in this paper. The holy grail is to take an algorithm like this and use it to select treatment in a clinical trial and then to compare outcomes with grading from the pathologist versus grading from the algorithm. But I think we're quite a ways away from that. So we have additional validation to do before we can take that step.

[00:10:29] Erin Spain, MS: There's so much going on in this space right now, and at Feinberg, you're a part of something called the Digital Slide Archive. Can you tell me about that and the role that plays in studies like the one you just described?

[00:10:40] Lee Cooper, PhD: The Digital Slide Archive is a collaboration we've had with Emory University and a software company named Kitware. It's a software platform that's made for management of digital pathology data, but it's not like software you would install on your desktop computer. It runs in a data center and it's supposed to sort of address the needs of an institution. It allows people to manage very large datasets. So if you have datasets that have thousands and thousands of gigabytes, you can store those in a central location. Then people can access them through the Digital Slide Archive through their web browser. So it's really all about managing data, sharing data. What this allows us to do is to build networks of collaborators all over the world and to work together on datasets. For example, in this paper on breast cancer, we built a network of over 30 pathologists and pathology residents, and they generated 200,000 annotations of different types of cells and tissue structures that enabled us to create this model. That is a very different paradigm. It sort of leverages the power of the internet and digital pathology, lets you do this kind of team science with people you've never met before. So that's a lot of fun. This tool, something I think that's remarkable about it, is that it's completely free and open, and so there are people that we've never met that use this to do their own research. That's really the goal. And I think we had over a hundred thousand downloads last year of the software, which is remarkable. I mean, we don't have a paper that has a hundred thousand citations. So I think it just speaks to the sort of power of this free software that's generated in academic settings like Northwestern.

[00:12:14] Erin Spain, MS: It feels like we're really on the cusp of something here. You mentioned in the past four years since COVID-19, people being able to share information and do things digitally that we've never been able to do before. What do you see coming next? And what do scientists need to do to prepare for this next wave of using machine learning and AI in their studies?

[00:12:34] Lee Cooper, PhD: When it comes to the image analysis and the AI, it's really all about improving the accuracy, the reproducibility, and the efficiency of the diagnosis, building models that perform tasks that the pathologists do, or that check their work, or that help the lab operate in a more efficient way. There's a ton of opportunities there. My advice to researchers who currently use slides in their research-- maybe they're taking these slides to a pathologist who's like scoring them, one, two, three, right? My advice to those people is to seek out your core at your university and try scanning some of your slides and talk to your core about some of the commercial software they might have that would allow you to analyze this data in a more quantitative way. And, it can improve the quality of your science. It can make things go faster. You might have fun doing it. That would be my advice to the people who are primarily focused on science, but who generate tissue through their research. I think that we're getting better with both commercial and open source tools. They're easier to use. They actually provide some value. You don't have to spend a ton of time learning them. And I think we'll just see things like analysis of digital pathology images become like a basic lab technique where it's not anything exotic. You just do it like you would do genomics or Western blot or anything like that.

[00:13:49] Erin Spain, MS: Now, what are the current limitations in AI and digital pathology and what challenges do you see ahead?

[00:13:55] Lee Cooper, PhD: So there are some important limitations. Today's best AI technologies are quite dumb in some ways. For example, their behavior is very unpredictable if you give them a pattern that's unfamiliar. So these models are built on large datasets. They're trained to sort of memorize these big datasets. And then as long as you're presenting things that are within that context, you can sort of guarantee some behavior. But if you bring in something different that wasn't represented in the training, the model can do anything. And that's a big problem for a safety-critical application. It's the same problem that self-driving cars have. Another big issue is that these models do not convey uncertainty very well. So if you give it something that it's not intended to be used on, you present it some image that is unfamiliar, you'll get an answer, but there won't be any warning that you shouldn't be doing this. That's another big problem for pathology. So, there's a lot of diversity, there's artifacts, a lot of things can cause these algorithms to go off the rails. Another thing in pathology, I would say that's very important is there are a lot of rare patterns and so there are cases that a pathologist may see once in their career or even never in their career, maybe only in a textbook, and they're really responsible for being able to handle those things. And an algorithm, I think, built with the current methods will never be able to handle that. That's where I think human intelligence is so amazing. A human can see something in a book and remember that. And if they see it one time in their career, they may recognize that's what they're dealing with. But these models don't have that capability.
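The missing-warning problem Cooper describes can be made concrete with a small sketch. It assumes each image has already been reduced to one numeric summary value, which is a deliberate oversimplification; the function names, the threshold and the toy classifier are hypothetical, and a z-score check like this is only a stand-in for the much harder problem of detecting unfamiliar inputs in real pathology images.

```python
from statistics import mean, stdev


def fit_reference(training_values):
    """Summarize the training data by its mean and sample standard deviation."""
    return mean(training_values), stdev(training_values)


def toy_classifier(value):
    """Hypothetical model: always returns a label, however unfamiliar the input is."""
    return "high" if value > 1.0 else "low"


def predict_with_guard(value, reference, classify, max_z=3.0):
    """Return a label, or None (abstain) when the input is far from the training data."""
    mu, sigma = reference
    z = abs(value - mu) / sigma
    if z > max_z:
        return None  # unfamiliar input: warn instead of answering
    return classify(value)


# Hypothetical summary values for the images the model was trained on.
training = [0.9, 1.0, 1.1, 1.0, 0.95, 1.05]
reference = fit_reference(training)

unguarded = toy_classifier(5.0)                                # answers with no warning
familiar = predict_with_guard(1.02, reference, toy_classifier)
unfamiliar = predict_with_guard(5.0, reference, toy_classifier)
```

The unguarded call returns a label for a value far outside anything seen in training, while the guarded call abstains, which is the kind of warning Cooper says current models do not give on their own.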

[00:15:29] Erin Spain, MS: Tell me what it's been like working with some of these pathologists at Northwestern Medicine and sort of the relationships that you've been able to grow as this technology grows and you're able to, you know, really develop these new models.

[00:15:43] Lee Cooper, PhD: It's been very interesting being embedded in a pathology department. You really see how the sausage is made. I think the thing that I've learned from the pathologists is that there is a lot of nuance in these problems and to solve these problems, you need to understand that and deal with it correctly, and really need to, I think, be embedded in a department with them to really get that. You know, I work very closely with a pathologist. We have an integrated lab. His name is Dr. Jeff Goldstein. He looks at placenta. And it's really been fun working with him. I've learned quite a bit from him about how the hospital and the lab operate. And then he's, I think, absorbed a lot about machine learning and has his own research program where he's building his own models. I think it's working really well. Another thing is being able to participate in the hospital in their implementation of digital pathology and just seeing how they administer and make executive decisions. The team that does that is just so talented and impressive. And I've really learned a lot from working with them too.

[00:16:37] Erin Spain, MS: You had mentioned in Europe, they've been quick adopters of this and it seems to be more widespread. Why hasn't this picked up as quickly in the U.S.?

[00:16:46] Lee Cooper, PhD: Yeah, so that's a very interesting question. I would say one thing, and I'm not an expert on this, but you know, their regulations and how they regulate medical devices in Europe are very different. And so I think that is somewhat more permissive. And so that's given them some latitude to use digital imaging in healthcare and diagnosis when it comes to pathology. You know, another issue may be that they're not entirely driven by profit. And so, profit motivations. And so, this is a very expensive endeavor. You have to buy the machines to scan the slides. You have to pay for the storage. You have to pay for staff to run all this. And it's incredibly expensive. And so, I think, the fact that the economic models of their healthcare are different may also play into earlier adoption of this technology.

[00:17:31] Erin Spain, MS: As we've talked about, there's been so much progress really in the last four years, but where would you like to see this technology advancing in the next four or five, six, 10 years?

[00:17:41] Lee Cooper, PhD: Yeah, I think there's three things that come to mind. So the first is what I would call democratization of AI. And so I can imagine a future where, you know, medical centers, especially academic medical centers, can build their own AI. They have the essential ingredients of the data and the medical expertise. And then the software that we use to create these models is just getting better and better every year and easier to use. So there's a lot of things just out there that are free, that with a little bit of learning, you can build your own models. And so I see that as being an important shift. There's a lot of commercial activity in this space, but for some of these large medical centers, they may be able to grow their own solutions in this space, which I think is very interesting. The second thing I would say is, you know, doing more with less. So, some of the studies we're doing now, we collect data for two years or so, and it would be great if we could build a model with half the amount of data. The trend right now is to build bigger datasets, bigger models, and I think it's just really not sustainable. We need to try to improve that problem, but it's a tough one. It's really a fundamental problem. I think the last thing I would say is that a lot of the studies are still very academic. They use public datasets. I think that we as an area of study need to move towards real world validation. So there's a lot of, I think, promising evidence that these models will provide value in the real world. But the only way to do that is to really measure that in the real world and to use these models and see how much do they improve diagnoses? How much effort is involved in using these things? And really establish that real world evidence. And I think that when we do that, there's going to be some surprises. There'll be some good surprises and also some bad surprises.
So the history of AI is incremental progress and a lot of failure along the way. So maybe we'll leave it there.

[00:19:26] Erin Spain, MS: We will be continuing to watch eagerly as things progress, especially this latest paper that you published in Nature Medicine on breast cancer. We're excited to see what's coming next. So thank you so much. It's been great discussing your work.

[00:19:38] Lee Cooper, PhD: Great. Thank you, Erin.

[00:19:39] Erin Spain, MS: You can listen to shows from the Northwestern Medicine Podcast Network to hear more about the latest developments in medical research, health care, and medical education. Leaders from across specialties speak to topics ranging from basic science to global health to simulation education. Learn more at feinberg.northwestern.edu/podcasts.