Understanding Artificial Intelligence and Its Direct Impact on Cancer Care

In this DECODE podcast, Empowerment Lead Lisa Hatfield and Dr. Virginia Sun explore how artificial intelligence, especially large language models (LLMs), is transforming cancer care. From diagnosing adverse events to streamlining clinician workflows, they unpack what patients need to know, clarify common concerns, and discuss how AI can improve outcomes while preserving the irreplaceable human connection in medicine.

Transcript

Lisa Hatfield:

Welcome to DECODE, a Patient Empowerment Network podcast that breaks down how emerging technologies, like artificial intelligence, are changing cancer care. I’m your host, Lisa Hatfield. Our goal is simple: to help you stay informed, confident, and empowered as you navigate your care. Let’s get started.

Today we’re diving into a topic that’s not only shaping the future of medicine but already transforming the way care is delivered. I’m talking about artificial intelligence, or AI. From diagnostic imaging to treatment decision-making, and even patient education, AI, particularly large language models, or LLMs, is being woven into clinical oncology in very real ways. Joining me today is Dr. Ginny Sun from Massachusetts General Hospital. Dr. Sun, thank you so much for joining us.

Dr. Virginia Sun:

Thank you so much for having me.

Lisa Hatfield:

So, probably the first thing we should do is identify what artificial intelligence, or AI, is. And I had spoken with another AI researcher, and she explained to me that AI refers to any computer program that fulfills a task typically associated with human intelligence. So things like maps, or facial recognition, or smart doorbells. So, Dr. Sun, what do you think of that definition, and do you have anything to add to it in general terms?

Dr. Virginia Sun:

Yeah. I think that’s a really accurate definition of artificial intelligence. It’s this really big umbrella term that’s sometimes used almost interchangeably with more specific artificial intelligence techniques, such as machine learning, deep learning, things like that. And those are great examples. We also use it all the time. So, for example, just pulling up Google Maps and finding the shortest route to another location, that is artificial intelligence.

Lisa Hatfield:

Okay. Thank you. So, you mentioned that too, there are subsets within artificial intelligence. And some of the organizations that do AI research use the onion model, looking at layers of an onion. For patients who aren’t familiar with AI, the outer layer is artificial intelligence in general: computer systems that can replicate human tasks like learning, reasoning, problem-solving, and decision-making.

So, some examples of that are things like a Roomba. It learns how to steer clear of edges of the room or stay on the carpet or go over bumps, or even spam filters on email accounts. That is a type of artificial intelligence. And then underneath that layer, a subset of AI is called machine learning. Machine learning involves training algorithms to recognize patterns and make decisions based on data without explicitly being programmed to do so. So, that subset of AI called machine learning, an example of that would be something like Netflix or Amazon making personalized recommendations to you. And then another layer in, so we’ve got AI, artificial intelligence, a subset of that is machine learning, and a subset of that is deep learning.

So, deep learning, and this is a little more complex, we don’t need to discuss it too much, uses something called artificial neural networks. The important thing to remember is that these are inspired by the human brain. They learn from large amounts of data to perform more complex tasks like image and speech recognition. So, as examples of that, some people are familiar with Alexa, they have Alexa in their home, or with Siri; they use these virtual assistants, and these virtual assistants use deep learning to understand language and recognize speech.

Or even on social media, when people have pictures and there’s a little box that appears and it says, this is Jane Smith, would you like to tag her? That’s using a form of deep learning. So we have AI layers, a subset of that is machine learning, a subset of that is deep learning. And finally, what we’re going to talk about, and it’s a little more complex than this, but something called generative AI and foundation models. And this is kind of where AI exploded. Everybody heard about AI, they’re talking about chatbots, there are some ethical concerns with it, people are worried about AI taking over.

And basically, this is where large-scale AI models are trained on vast amounts of data that can be adapted and fine-tuned for a wide range of tasks. So, people understand things like chatbots, there are chatbots used for customer service purposes, or chatbots on cancer websites to explain things. Most people are familiar with that. So we’ve got the onion layer model, the artificial intelligence, which has a subset called machine learning, which has a subset called deep learning.

And then the innermost layer that we’re talking about is generative AI. And one of the models we find within generative AI is something called Large Language Models, or LLMs. And, Dr. Sun, this is a perfect time to segue to you because you are using LLMs in some of your cancer research right now. So, can you describe how you’re using LLMs and kind of the real-world impact of using those for cancer patients and cancer care?

Dr. Virginia Sun:

Yeah. I would love to. So, I guess first, let me dive into what exactly a large language model is. If we peel back one layer above the large language model, there’s this whole entire category within artificial intelligence called natural language processing. And it’s almost like a separate entity from machine learning. So if you think about this onion, maybe it’s like an onion that’s split in two. But there’s a lot of overlap.

So maybe instead of an onion model, we can think of a Venn diagram. And in the area where the circles for machine learning and natural language processing overlap, that’s where large language models sit. So, it’s kind of like the grandchild of both deep learning and natural language processing. You can think of it as basically a very well-read parrot, but way smarter. It’s able to understand and generate human language. And it’s able to do that because it’s been trained on so much text data from the Internet, from books, that it’s able to predict the word or sentence that comes next.

And that makes it great at tasks such as answering questions, writing summaries, translating, or what I work on, analyzing medical notes. And I think large language models really are one of the most powerful tools in the space today. And that’s what caused all of this press about generative AI.

Lisa Hatfield:

Okay. Thank you. Now, would you speak a little bit about the research you’re conducting right now using this large language model?

Dr. Virginia Sun:

Of course. So, I’m using large language models to help detect something called immune-related adverse events, or irAEs, in cancer patients. We’ve probably all heard of ChatGPT. It’s one of the pioneering large language models out there. But instead of using ChatGPT, we basically created our own locally hosted large language model, so we didn’t have to share any patient data across the Internet, to try to detect these immune-related adverse events. And so let’s talk a little bit about the science behind it first.

So immune-related adverse events are the side effects that happen when a patient’s immune system is ramped up by a type of cancer treatment called immune checkpoint inhibitors. And these immune checkpoint inhibitors have been incredible. They’ve reshaped how many cancers are treated. What they do is turn on your immune system so that it targets the cancer cells. But sometimes, with these immune checkpoint inhibitors, the immune system also starts attacking healthy parts of the body, such as the lungs, the liver, or even the heart. And these reactions can be really hard to spot. And if we miss them, they can even be life-threatening.

And so here’s one of the issues. Doctors, including myself, we write a lot of notes in the hospital. And these events are oftentimes buried in free-text progress notes, so it’s not neat checkboxes. We didn’t have a diagnostic code for immune-related adverse events until recently. And so we wanted an easier way for clinicians and researchers to detect these immune-related adverse events. So we built a tool, using natural language processing with this large language model, that reads through these notes just like a clinician would and then flags the patients who may be having one of these dangerous side effects. You can think of it as a very smart digital assistant that never gets tired, scans through thousands of patient charts in seconds, and then basically says, hey, this patient might be having immune-related hepatitis or myocarditis. You should take a look.
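
To make the workflow Dr. Sun describes a bit more concrete, here is a minimal Python sketch of the general idea: wrap a clinical note in a prompt, ask a locally hosted large language model a yes-or-no question, and flag the chart for human review. The function names (ask_local_llm, flag_possible_irae) and the prompt wording are illustrative assumptions for this transcript, not the actual tool built at Massachusetts General Hospital.

```python
# Illustrative sketch only: zero-shot prompting of a locally hosted LLM to
# flag a possible immune-related adverse event (irAE) in a clinical note.

def ask_local_llm(prompt: str) -> str:
    """Send a prompt to a locally hosted LLM and return its text reply.

    Hypothetical stand-in: in practice this would call an on-premises model
    server, so no patient data ever leaves the institution's network.
    """
    raise NotImplementedError("wire this to your locally hosted model")

QUESTION = (
    "Based on the clinical note above, is the patient likely experiencing "
    "an immune-related adverse event from an immune checkpoint inhibitor? "
    "Answer 'yes' or 'no' and name the suspected organ system."
)

def flag_possible_irae(note_text: str) -> bool:
    """Return True if the model's answer suggests a possible irAE."""
    prompt = f"Clinical note:\n{note_text}\n\n{QUESTION}"
    answer = ask_local_llm(prompt)
    return answer.strip().lower().startswith("yes")

# Example usage: scan a batch of notes and surface the charts worth reviewing.
# notes = load_notes_for_checkpoint_inhibitor_patients()  # hypothetical helper
# flagged = [note for note in notes if flag_possible_irae(note)]
```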

Lisa Hatfield:

Okay. That’s really helpful. So to back up a little bit too, I’m curious, where does that data come from? You have to train the model to recognize that somehow so it knows to recognize this could be an adverse event. Where’s that data pulled from?

Dr. Virginia Sun:

That’s a really great question. And I think a lot of times when we think about artificial intelligence, we think about training data. The thing is, with large language models, by the time you use them they’ve already been trained on enormous amounts of text. Some of it is medical text, some of it isn’t. But in either case, it’s able to, when I say comprehend, I mean the way a machine would, it’s able to understand through context clues what’s going on.

So, I never actually fed any specific medical information to train this large language model. It’s just able to do sort of like basic reading comprehension, just like we would if we were taking our SATs, and then be able to detect whether or not an immune-related adverse event is happening.

Lisa Hatfield:

Okay. So, are you using that right now at your facility at your hospital?

Dr. Virginia Sun:

Yeah. So, we’re using it at Massachusetts General Hospital, and also at some other institutions, mainly on the research side. So, we’re really focusing on using it retrospectively, to basically look back at existing medical records, to help researchers identify patterns and spot cases of immune-related adverse events. We’re still seeing how it works in the clinical setting, but I think in order for us to roll it out, we have to do a lot more quality control just to make sure that it’s safe for our patients.

Lisa Hatfield:

Sure. And then as a patient, one of the questions I might have is: okay, you’re retrospectively incorporating some data from electronic medical records. Is there any chance that I could be identified down the road because my data was incorporated into this model?

Dr. Virginia Sun:

I think the hope, ultimately, is for our large language models to help identify these cases in real time, and then maybe that will help with real-time clinical decision-making. But even with the retrospective research, I do believe it can have a really big impact, because one of the biggest barriers in cancer research is just getting accurate and consistent information across large numbers of patients. And this tool really allows us to do that quickly and at scale, with much more accuracy than even a human would be able to.

Lisa Hatfield:

That’s interesting to know. It also makes me hopeful as a patient, because it sounds like, by having the human and AI team up, this data can be processed a lot faster, giving the provider information much more quickly based on huge amounts of data, probably more accurately, and hopefully giving the provider more face time with the patient. Do you think that could be an outcome for patients?

Dr. Virginia Sun:

Yeah. I think one of the biggest challenges that I, as a doctor, struggle with is the limited face time I have with my patients. So, for anything that can take me away from the computer and the charts and let me spend more face-to-face time, I think artificial intelligence is probably one of the best ways to help with that.

Lisa Hatfield:

Okay. So, your research is pretty specifically focused on immune-related adverse events, if I said that correctly. Do you think this model can be integrated or implemented in other systems, or is it very specific to your institution?

Dr. Virginia Sun:

No, I think we really wanted to make this project scalable and adaptable. So, for example, within the immune-related adverse event space, we have this organization called ASPIRE, which stands for the Alliance for Support and Prevention of Immune-Related Adverse Events. This includes physicians and patients and researchers from all over the world, but especially within the United States. And they’re all trying to use this technology to help create their own databases of immune-related adverse events, with the hope of eventually being able to combine our data together to create a multi-institutional database.

And then, ultimately, I truly believe that better data leads to better science, and that is what leads to better care as well. On the other hand, large language models, like I said, because they’re trained on so much data, I didn’t have to do any additional training to make the model specific to immune-related adverse events. So, you could easily take the exact same code that I wrote but change the question. Instead of, let’s say, immune-related adverse events from immune checkpoint inhibitors, what about the risk of heart failure from anthracycline therapy, a very common breast cancer therapy? I could just change that question, run the same pipeline on a different set of patients who are on anthracycline therapy, and get similar results.
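
As a rough illustration of that point, and continuing the hypothetical sketch above, repurposing the pipeline amounts to swapping the question in the prompt; the model itself is unchanged. Again, the question wording and function name are assumptions for illustration, not Dr. Sun’s actual code.

```python
# Same hypothetical pipeline as before; only the question changes.
ANTHRACYCLINE_QUESTION = (
    "Based on the clinical note above, does the patient show signs of "
    "heart failure that may be related to anthracycline therapy? "
    "Answer 'yes' or 'no'."
)

def flag_possible_anthracycline_cardiotoxicity(note_text: str) -> bool:
    """Reuse the zero-shot approach for a different adverse event."""
    prompt = f"Clinical note:\n{note_text}\n\n{ANTHRACYCLINE_QUESTION}"
    return ask_local_llm(prompt).strip().lower().startswith("yes")
```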

Lisa Hatfield:

Okay. This is fascinating to me. What is the reliability of that data? So, if you do have something that is flagged as an immune-related adverse event, what’s the reliability of that? Is it pretty spot-on? Have you found that so far?

Dr. Virginia Sun:

Yeah. It does depend a little bit on the context, but we do find that for the specific project I worked on, which was looking at immune-related adverse events among inpatient hospitalizations of patients on immune checkpoint inhibitors, it was able to achieve greater than 90 percent sensitivity and 90 percent specificity. So basically, almost all of the cases of immune-related adverse events were detected, but in some cases it would overcall things a little bit. Let’s say it was mistaking infectious colitis, an infection of the colon, for immune-related adverse event colitis, or immune checkpoint inhibitor colitis. So, it would have these false positives, but it wouldn’t really have that many false negatives.

And so we’d still need another person to verify everything, but it still cut down the amount of work that needed to be done tremendously. We do find that if we try to switch it to the outpatient setting, it is a little bit more challenging. And I think that’s just because of the lower acuity of the immune-related adverse events we see in clinic compared to the ones we see in the hospital. And also the diagnostic workup is drawn out a little bit more; it can take multiple weeks or even months to diagnose an immune-related adverse event in the clinic, as opposed to the hospital, where all of that workup is done within days.
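
For readers unfamiliar with those terms, sensitivity and specificity are straightforward to compute from the counts in a chart-review validation. The numbers below are made up purely to illustrate the arithmetic; they are not results from Dr. Sun’s study.

```python
# Sensitivity = true positives / (true positives + false negatives)
# Specificity = true negatives / (true negatives + false positives)
# Illustrative, made-up counts:
tp, fn = 45, 3     # true irAE cases correctly flagged vs. missed
tn, fp = 900, 52   # non-irAE charts correctly cleared vs. overcalled

sensitivity = tp / (tp + fn)   # ~0.94: very few false negatives
specificity = tn / (tn + fp)   # ~0.95: some false positives remain for human review

print(f"sensitivity={sensitivity:.2f}, specificity={specificity:.2f}")
```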

Lisa Hatfield:

Okay. That sounds like a great first step. So, if I understand this correctly, an adverse event can be flagged, and probably a lot more quickly than a person trying to go through a chart and look at all the data there. Once a patient is flagged as possibly having an adverse event, you can treat it much more quickly, hopefully leading to better outcomes for patients. But backing up a step, and I know this is not what you’re studying, do you think it will ever be possible to predict an immune-related adverse event, or only to flag one that’s already happening?

Dr. Virginia Sun:

No, I think there are ways to predict immune-related adverse events. That is a big field, and something I’m working on, but I don’t think the prediction would come from the large language model itself, because you have to think a little bit about how large language models are trained and what their purpose is. Large language models are really good at understanding and generating human text. They’re really good at reading comprehension, which is ultimately what this project is: doing reading comprehension on thousands of clinical notes and then answering a simple question. If we’re talking about prediction, that’s not really what a large language model is meant for. Maybe you can do some things like inference, kind of like the way we might infer things by reading between the lines of a text. But in terms of actual prediction, that falls under the jurisdiction of other neural networks or deep learning models.

Lisa Hatfield:

Okay. Well, I’m really excited about this, because anything we can do to help cancer patients, whether they’re in the hospital or on an outpatient basis, get adverse events treated sooner, the better their outcomes will be and the more likely they are to stay on a treatment. So, I think this is super fascinating research. And I want to jump over a little bit too.

So, what you’re talking about is from the clinician perspective, how they can possibly increase efficiency by having these adverse events flagged sooner. As a patient myself, I do like to do a lot of research and look things up. I’ve been known to reference Dr. Google once in a while when I’m not supposed to. So, let’s say I’m having some symptoms, I have a cancer, I’m having some symptoms of relapse. I go in, I decide I’m going to try ChatGPT for the very first time. It’s this weird thing I hadn’t heard of, but a lot of people are talking about it. I type in my symptoms, and it gives me a whole list of what this could be, not too alarming.

Say I take that in, a whole printout from ChatGPT, thinking I’m possibly relapsing from whatever cancer I might have, or maybe having an adverse event. First of all, do you think patients should be using that? And second of all, how do you think providers could and should receive that information coming from a patient?

Dr. Virginia Sun:

Yeah. I think, you know, I absolutely empower my patients to look up things online, and to feel free to use ChatGPT if that’s their preferred way of getting information. I think everything should just be taken with a grain of salt. So, sure, you mentioned Dr. Google. I’ve done it too. Let’s say I look up some symptoms and I see something really scary on WebMD, and ultimately it’s usually not that. And I think, you know, ChatGPT is really trained on the same data. It’s using the same information that we see on the Internet as well.

So, you never know what you’re going to get, but it’s probably going to be in a similar vein. I would never use ChatGPT as a replacement for consulting with a doctor. But if you have a question, and a doctor is not available at that time and you just need to figure out how urgent the question is, I think it could be a good first step just to get more information. There are so many things that I want to be able to tell my patients, and I can’t fit all of that into a 30-minute appointment.

So if going on ChatGPT gets you some of that extra information, or answers those questions that weren’t answered, I think that’s totally fine. I will add, though, that ChatGPT really has made so much progress over the last few years. I remember when it first came out, it wasn’t really connected to the Internet. It would do these things called hallucinations, quite often actually, where it would make up fake research articles to try to support some of its claims. I think that was one of the major pieces of feedback people gave about ChatGPT.

People realized, hey, you’re making up fake research and then using that to support your claims. So now they’ve actually incorporated real research articles, and it tries not to do that. But there’s no way to know for sure whether everything it says is completely true. There are plenty of false sources on the Internet that it might be trained on as well. So, like I said, everything has to be taken with a grain of salt.

Lisa Hatfield:

Okay. Thank you. And you mentioned something too: you said you have used Dr. Google, but you also go to WebMD. And I think that’s a good point for patients who are using ChatGPT or looking things up on the Internet, to use multiple sources and make sure one of those sources is their physician or provider. Because, like you said, the data is coming from all over the Internet, so it could be inaccurate, and using multiple sources might be a way to mitigate some of the risk of getting inaccurate information.

So, Dr. Sun, can you speak to any biases in AI? And how are these biases decoded or overcome?

Dr. Virginia Sun:

I think talking about biases in artificial intelligence is such an important conversation, especially as you start using artificial intelligence more in your daily practice; you have to understand what the shortfalls are. So, like I mentioned, if a large language model is trained on inaccurate data, then you’re going to get inaccurate results. And oftentimes, large language models don’t discriminate among the sources being pulled off the Internet. And there are plenty of false advertisements and plenty of fake news about different medical topics out there.

So, that’s something you have to be aware of too. Personally, if I’m going to use artificial intelligence to help inform my own practice, I want to make sure that whatever large language model I’m using is evidence-based, that it’s being trained on very accurate data. So personally, I don’t use ChatGPT for my own research purposes. I certainly use it to help draft emails sometimes. But one of the resources out there for physicians is called Open Evidence, which basically pulls all of the papers published on PubMed and then uses that as the database behind the chatbot. So, it’s much more reliable than, for example, an all-purpose large language model that you can find on the Internet.

I think the other thing is, you know, when we train on data, how much of that data actually reflects us as a population? A lot of times, models can be trained only on health records from a certain demographic. And so that may not represent people from underrepresented communities or those who are getting care in rural settings. And it’s not because AI is intentionally biased. It’s just that it’s never seen enough diverse examples to learn from. And this applies to race, gender, language, and even health literacy. If certain voices and experiences are missing from the training data, the AI just may not be able to serve those people as well, and in some cases that can actually be dangerous.

There are also a lot of ethical concerns about consent and data use. It’s kind of its own topic, but it can similarly lead to bias. We don’t always know exactly which health records are being used to train some of these artificial intelligence models. Oftentimes, if the data is coming from a hospital, there are a lot of requirements and regulations to make sure that there’s permission from the patients or that all of the data is de-identified. But for the more accessible models, the artificial intelligence you can find online, we don’t know whether that medical information actually came from patients, and whether it was obtained ethically or not.

Lisa Hatfield:

Okay. Thank you for explaining that too. And this might be too much of a deep dive, but if we are using a large language model to try to determine something like treatment outcomes for a specific demographic of patients, and we know historically that that population has been underrepresented in clinical trials, is there any way to train a model to acknowledge the fact that the data may not accurately represent some communities, because the clinical trials didn’t include them?

Dr. Virginia Sun:

Yeah. I think for all research, we should always think about what the limitations and the biases are, and that should always be put in writing. And then there are also ways to, it’s sometimes called augmenting, basically boost a certain population that we have less data about. Let’s say 90 percent of patients are from an urban setting and 10 percent are from a rural setting, and you really want to make sure that those patients from the rural area are well represented.

So what you can do is augment that 10 percent, basically use some fancy coding to turn that 10 percent into 50 percent of the population. You do that by figuring out the common patterns within that 10 percent and then creating almost fake patients, or fake sample data, based on real patient data. Using that to train the model is one technical way to get past that barrier.
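
Here is one simplified way that kind of augmentation might look in code: randomly oversample the underrepresented group and add small jitter to the numeric features, so the synthetic records follow the same patterns as real ones without being exact copies. This is a generic sketch, not the specific method Dr. Sun uses; real synthetic-data approaches (SMOTE-style interpolation, generative models) are more sophisticated, and the column names here are hypothetical.

```python
import numpy as np
import pandas as pd

def augment_minority(df: pd.DataFrame, group_col: str, minority_value,
                     target_fraction: float = 0.5, jitter: float = 0.05,
                     seed: int = 0) -> pd.DataFrame:
    """Oversample an underrepresented group, jittering numeric columns.

    Generic illustration: resample real minority rows with replacement, then
    add small multiplicative noise so the synthetic rows are patterned on
    real data rather than duplicated verbatim.
    """
    rng = np.random.default_rng(seed)
    minority = df[df[group_col] == minority_value]
    majority = df[df[group_col] != minority_value]

    # How many minority rows are needed for the group to reach target_fraction.
    needed = int(target_fraction / (1 - target_fraction) * len(majority))
    extra = needed - len(minority)
    if extra <= 0:
        return df

    synthetic = minority.sample(n=extra, replace=True, random_state=seed).copy()
    numeric_cols = synthetic.select_dtypes(include="number").columns
    noise = rng.normal(0.0, jitter, size=(len(synthetic), len(numeric_cols)))
    synthetic[numeric_cols] = synthetic[numeric_cols] * (1.0 + noise)

    return pd.concat([df, synthetic], ignore_index=True)

# Example (hypothetical columns): boost rural patients from ~10% to ~50%.
# augmented = augment_minority(records, group_col="setting", minority_value="rural")
```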

Lisa Hatfield:

Okay. Thank you. And I’ll close with a final question that some people might be thinking about. Hopefully AI teaming up with our providers will allow patients to have more time with their providers, because some of these decision-making points, or the data they’re trying to collect, happen more quickly. But do you think there will ever be a time when patients will not see their providers frequently, or maybe even at all, because AI, and it’s a fear that comes up, AI will take over, and patients will be cared for by generative AI, by a robot or something that might be able to come into the room and tell us all about our care and treatment plan? Do you think that’s possible?

Dr. Virginia Sun:

I mean, I think it’s certainly possible. Hopefully not within my career, because then I won’t have a job. But to be serious, I actually don’t think that will happen anytime soon. And it’s really because it’s so much more than just the medical knowledge. If you gave ChatGPT, for example, one of our licensing exams, it could probably outperform a lot of doctors out there. I think it does actually score a passing grade. But would you want ChatGPT to be your doctor? I don’t know. I personally wouldn’t. There are so many personal decisions. There’s so much about needing to know a patient as a person, who they are, what their values are. And that informs my clinical decision-making just as much as the medical knowledge itself.

So, I certainly think it can help. I think it can help streamline diagnoses. I think it can help make us a lot more accurate in providing these recommendations. But in terms of actually sitting down and working with the patient and figuring out a plan that works with their values, I think we need another person in the room with them.

Lisa Hatfield:

Yeah. Well, thank you for that reassurance. And I agree with you, because I believe that at least 75 percent of the time with my oncologist is spent on the human connection, on providing hope, and on reading their body language that everything is going to be okay. And talking about those things like values and family, and oh, I want to go to a wedding, so can I go off treatment? The technical aspects that AI can handle are just as important, but they usually take less time than that human connection.

And I believe that AI, at least right now, cannot replace that. And we patients need that so much in our care to maintain our hope. So, thank you for that reassurance. I agree with you on that. And that brings us to the end of this episode. Thank you, Dr. Sun, and thank you to all of you for joining us. We’re grateful to have you as part of the Patient Empowerment Network’s DECODE program, where we’re decoding complex topics to help you stay informed and empowered in your cancer care. I’m Lisa Hatfield. Until next time, take care and be well.
