Advances in Care

Data Mining: Using Machine Learning for Predictive Neurocritical Care

Episode Summary

Over the years working in the neurocritical ICU, Dr. Soojin Park recognized a problem: She knew that 30 to 40% of her patients were at risk for stroke in the weeks following an aneurysmal subarachnoid hemorrhage, but it was still difficult to determine which patients were most likely to develop additional problems, such as delayed cerebral ischemia, and treat them accordingly. So, Dr. Park used her background in data science to develop a tool that can better predict which specific patients are at increased risk. The COSMIC score uses machine learning and basic patient data, such as blood pressure and heart rate, to predict likely outcomes and improve targeted patient care in the neurocritical ICU.

Episode Notes

Monitoring patients with aneurysmal rupture for delayed cerebral ischemia was historically a numbers game. It was difficult for doctors to predict outcomes in the weeks that followed their rupture, so at-risk patients could find themselves under observation in the ICU anywhere from 7 to 21 days. Dr. Soojin Park, Medical Director of Critical Care Data Science and AI at NewYork-Presbyterian/Columbia, knew there had to be a better way to monitor patients and predict outcomes. So, relying on her background in machine learning and leveraging vast amounts of data, Dr. Park developed the potentially game-changing Continuous Monitoring Tool for Delayed Cerebral Ischemia (or COSMIC) score. The score uses machine learning, and basic patient data that can be collected with equipment available at any hospital, to detect signals that more accurately assess risk, allowing doctors to treat each neurocritical patient with targeted care - ultimately improving outcomes and patient experience.

Episode Transcription

[00:00:00]

Dr. Soojin Park grew up loving both computer programming and medicine. Ultimately when she got to college, she chose to pursue a medical degree. But despite following that path, and training to become a neurologist, she never truly gave up her love of programming. In fact, she grew more and more certain that she could apply computer processing techniques, such as artificial intelligence and machine learning, to improve patient outcomes.

The path she forged to build a career with one foot in medicine and the other in computer science wasn’t easy, and she often wondered if she was the only one who saw the potential yet to be harnessed in patient data. But Dr. Park persevered, and her passion for collecting and making sense of patient data led her to create something unique - the COSMIC score. It stands for Continuous Monitoring Tool for Delayed Cerebral Ischemia. And it’s a program that uses machine learning to help determine which patients in neurological intensive care units are at the highest risk of [00:01:00] suffering a stroke.

I'm Catherine Price and this is Advances in Care.

This week I had the pleasure of talking with Dr. Park, Medical Director of Critical Care Data Science and AI at NewYork-Presbyterian/Columbia, about her research on using signal processing and machine learning for predictive analytics, and what it can unlock for neurocritical care.

Catherine: Dr. Park, I'm so excited to get to talk to you today.

Dr. Park: Me too. Thank you for having me. Looking forward to our conversation.

Catherine: Can you tell me about what motivated you to start working with machine learning and AI?

Dr. Park: So, very early on in my training, I was really befuddled by how much data from the neuro monitors we were dealing with, and was certain that there was information in this, in this data that was flying by. The thing that spurred my passion the most was that I felt that you should be able to look at a patient, all of their [00:02:00] massive amounts of data at 10 a.m. on a Monday with a full team, fresh, all the experts there together to merge all the data together and do the same thing for that patient, if it was Saturday night at 10 p.m.

And so that original curiosity has just sort of continued. And I've built a team of multidisciplinary folks and we mine that data. So physiologic data from monitors of the brain, the heart, the lungs, you know, all the things that you would see in an intensive care unit. And then we combine that data, we extract what we think is important information, and then we build that into models that hopefully can help physicians make more timely decisions.

Catherine: I want to get into your research and learn more about the COSMIC score… but first I think it would be helpful to lay out what's actually happening with the patients you study who have suffered a brain hemorrhage. So can you tell me how physicians are currently dealing with the increased risk of stroke and delayed cerebral ischemia?

Dr. Park: If you think about the problem of aneurysmal subarachnoid hemorrhage, which is [00:03:00] when you have an aneurysm in your blood vessel in your brain and it ruptures and you have blood where it shouldn't be.

And then it triggers a bunch of cascaded events that in about 30 to 40 percent of patients can lead to a future stroke in the hospital in the next two to three weeks. We're monitoring patients for that in the hospital. So while ruptured aneurysms are not the most common thing, it's, it is frequent enough. It's our bread and butter here, but it ends up being a big portion of patient days that you care for in the neuro ICU and you're seeing patients over a course of two to three weeks where you're trying to distinguish is this the person who's going to have a stroke or not?

And so what we do really is monitor in real time for, “is this patient looking a little bit different?” And so, my belief was that we could detect and predict which patient is going to have that onset much more precisely than we were before. So the standard of how we do that right now is you come in, you have this ruptured blood everywhere in your brain, you take a CAT scan, you see the distribution and the [00:04:00] thickness of the blood that you see on that first scan, and it gives a risk estimate of what is the likelihood that you're going to fall into a high or low risk of having this stroke?

And now if you fall in that high risk category, the group of patients studied in that risk category has a 40 percent chance of having a stroke. That means the majority of that group didn't. 60 percent didn't. But the lowest risk group has like a 20 percent chance. So you now have a low risk, but you still have a 1 in 5 chance of having a stroke. How does that help you in managing a patient? It doesn't. We're surprised all the time. Someone who looks like they're definitely gonna have a stroke, they don't. So we end up, you know, bringing everyone to the intensive care unit and monitoring them the same. And we keep these patients in the ICU for the same amount of time. It's between 7 to 14, sometimes 21 days, until we feel more and more sure they're not going to have a stroke.

Catherine: So it sounds like it’s really hard for humans to figure out which patients are gonna have a stroke… so when did it occur to you that machine learning could help with that? And also can you tell me a bit more about how the COSMIC score that you’ve created helps to predict which [00:05:00] patients might actually suffer from a stroke?

Dr. Park: Way back in 2012, I was convinced that this could be a good way of bringing more precision. So machine learning, given enough data, you can try to give a prediction much more precisely than that one in five and that, you know, 60 percent or 40 percent for that patient.

And so my belief was we could do this in a time varying way where we can establish in real time, do my parameters about this patient that are observable characteristics, is it more like the group of patients who ended up having a stroke or is it more like the patients who didn't have a stroke? That was the driving belief in the beginning in our model for our, our COSMIC score, which is a prediction slash detection of the stroke syndrome that happens after the aneurysm rupture.

We've worked with the very neuro specific data where you can get a monitor placed in the head and get all that data in real time and look at brain tissue oxygen all those things. EEG, like these are things that may be available here [00:06:00] at NYP, but maybe it's not available at every single hospital.

And I wanted to have as much impact as possible across all patients with aneurysmal rupture. So we started with a pretty parsimonious model where we said, there's a shared pathomechanism between patients who have these vital sign changes and, can we see whether or not that change in the organism reflects also the patients who are developing this stroke afterwards because one of the hypothesized pathways by which this stroke might happen.

So it turned out very early on that there actually was a signal. So by selecting unique vital signs like heart rate, blood pressure, just the simple ones that every universal patient in the intensive care unit is going to have, we could find a signal that was pretty good on selecting which patient was going to have the stroke and which wasn't.

Catherine: So in other words, you don’t actually need any fancy equipment that’s hard to get, but you can just use things that are pretty much in any hospital.

Dr. Park: [00:07:00] Right, right, exactly. So it takes a large body of patient data, of patients who have had a ruptured aneurysm, who have or have not had this outcome of DCI or stroke. We have extracted types of monitoring data. So things that you would see in the ICU, like heart rate, blood pressures, that kind of stuff that you see, you monitor.

We also look at things like their, you know, gender, age. And we throw it into a model that then is able to classify an unseen patient as fitting one category or the other. And the category is, did you have this outcome of DCI or didn't you? And, our innovation is in trying to apply this in a temporal fashion, in a time varying fashion on unseen data and trying to make updated predictions every couple of hours.

Catherine: You said that when you looked at all that data you actually saw that there was a signal [00:08:00] coming through…

Dr. Park: Yes, right. We said, wait a second, there's something here where we can see something with the universal measures. Can we make that even stronger, given that we have quite a few patients, but we don't have millions of patients.

And so we did a little bit of feature engineering and really we created this cross correlation of these vital signs that would pick up that information and we put that into the model and it worked. Beautifully. And then we made it temporal. So we're trying to make it more time varying so it can be a risk score and we're calling it the COSMIC score.
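The feature engineering step described above, a cross correlation of vital signs computed over time, can be illustrated with a small sketch. This is not the COSMIC implementation: the function names, the window and lag sizes, and the use of a Pearson correlation at each lag are all assumptions chosen for illustration, and the numbers are made up.

```python
from math import sqrt


def pearson_at_lag(x, y, lag):
    """Pearson correlation between x[t] and y[t + lag] over the overlapping samples.

    Hypothetical helper: returns 0.0 when the overlap is too short or a series is flat.
    """
    if lag >= 0:
        a, b = x[:len(x) - lag], y[lag:]
    else:
        a, b = x[-lag:], y[:len(y) + lag]
    n = min(len(a), len(b))
    a, b = a[:n], b[:n]
    if n < 2:
        return 0.0
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def cross_correlation_features(heart_rate, blood_pressure, max_lag=2):
    """Return {lag: correlation} for every lag from -max_lag to +max_lag."""
    return {
        lag: pearson_at_lag(heart_rate, blood_pressure, lag)
        for lag in range(-max_lag, max_lag + 1)
    }


def windowed_features(heart_rate, blood_pressure, window, step, max_lag=2):
    """Slide a window over both series and compute lagged correlations in each one.

    Each window yields one feature dict, so a model could be re-scored as new data arrives.
    """
    features = []
    for start in range(0, len(heart_rate) - window + 1, step):
        hr = heart_rate[start:start + window]
        bp = blood_pressure[start:start + window]
        features.append(cross_correlation_features(hr, bp, max_lag))
    return features


if __name__ == "__main__":
    # Made-up toy series: blood pressure repeats the heart rate pattern one sample later.
    toy_hr = [0, 1, 0, -1, 0, 1, 0, -1]
    toy_bp = [5, 0, 1, 0, -1, 0, 1, 0]
    for row in windowed_features(toy_hr, toy_bp, window=4, step=2, max_lag=1):
        print(row)
```

Computing the features per window is what would allow a score to be refreshed every few hours rather than once per admission; the actual window length and feature set used by the team are not stated in the interview.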

Catherine: Okay, so the COSMIC score, which again, is short for Continuous Monitoring Tool for Delayed Cerebral Ischemia, that’s giving a score that would help physicians determine who's at the highest risk for DCI and that then helps those physicians to work together to make informed decisions about how to treat patients?

Dr. Park: Yes, right. So we have this COSMIC score. It is a machine learning score, right? So it's somewhat of a black box. People don't really know what's going into it. I want them to trust it. And so, we're doing a couple of steps, which I think is very, very [00:09:00] important before you try to implement machine learning in a clinical space, which is, first, you want to make it adapted to what the users really need. So in one arm, we are doing user centered design of this clinical decision support.

So we are actually surveying the actual clinicians who would use this information or are expert in this to say, “When you present this score, what other information do you think is necessary? What would make you believe this or not believe this? Or what would you want to confirm based on this? Does it raise the suspicion? And what information do you need right now?” to kind of deliver that mundane, you know, level of data munging. That's number one. So we're trying to design it in a user centered way.

And so the number two in parallel, we're looking at, is it, is it working? Does it work? So when I developed it, we developed it on Columbia data. But we have friends elsewhere. So I had a group in Germany and a group in Texas who had similar types of patients, but maybe practice a little bit differently. Maybe their patient makeup is different ethnically, right? [00:10:00]

And they shared their data sets with us and we were able to validate that it works, you know, we learned the model off of our patients and we validated it on theirs and it worked. So it's not overfit, you know, that term where you have all this data, you have a complex model and I have figured out that brunettes with the name Catherine are really good at interviews, but that may not generalize to all brunette Catherines, you know what I mean?

It just really fits the model right now. So I need to know that it works on Catherines from Germany and Catherines from Texas and it did seem to work.
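The external validation idea above, learning on one hospital's patients and checking the result on another's, can be sketched in a few lines. Everything here is hypothetical: the risk scores and outcomes are invented toy values, a single cut-off stands in for the real model, and the function names are illustrative only.

```python
def accuracy(scores, labels, threshold):
    """Fraction of patients classified correctly when score >= threshold means 'predicted DCI'."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def fit_threshold(scores, labels):
    """Pick the cut-off that maximizes accuracy on the development cohort."""
    best_threshold, best_accuracy = None, -1.0
    for candidate in sorted(set(scores)):
        acc = accuracy(scores, labels, candidate)
        if acc > best_accuracy:
            best_threshold, best_accuracy = candidate, acc
    return best_threshold


def auc(scores, labels):
    """Chance that a random patient with the outcome scores higher than one without (ties count half)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both outcomes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Invented toy cohorts: (risk score, 1 = developed DCI, 0 = did not).
development_scores = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]
development_labels = [0, 0, 0, 1, 1, 1]
external_scores = [0.2, 0.4, 0.75, 0.6, 0.9]
external_labels = [0, 0, 0, 1, 1]

if __name__ == "__main__":
    threshold = fit_threshold(development_scores, development_labels)
    print("development accuracy:", accuracy(development_scores, development_labels, threshold))
    print("external accuracy:", accuracy(external_scores, external_labels, threshold))
    print("external AUC:", auc(external_scores, external_labels))
```

A cut-off that looks perfect on the development cohort can lose accuracy on the external one, which is the gap that validating on outside data is meant to expose.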

Catherine: I mean you can never tell about those Catherines.

Dr. Park: No yeah. [laughs]

Catherine: Totally unpredictable.

Dr. Park: Um yeah but then what we had done was like a pseudo prospective validation. I had their full data set. I knew everything that happened. I predicted one point in time. How does that work in real time now?

Cause you know, like you go to the store and you see a beautiful dress on a hanger and you think that looks so, so beautiful. You put it on, you're like, what is this monster I'm looking at? Right. Until you try it on, you're like, Oh, I did not have the imagination for all the things that it would have occurred to me on the hanger. So we're trying it [00:11:00] on. We're, and we're discovering tons of very interesting things, right?

So, we are understanding that when you are not knowledgeable of the full outcome of a patient and not looking at, you know, 80 patients at once, and you're looking at a single patient, how does that score get interpreted? What we're doing is we are prospectively following patients with aneurysmal rupture, seeing what clinicians are doing, what they are seeing, when they are seeing it, and then seeing what the score says, and what it would have done differently. What would have been the actions in the idealized, fully adopted way?

So let's say, pretend they believe your AI, or your machine learning, I should say. They believe your score. They're going to, whether it's right or wrong, this is what they're going to do. How will that have hurt or helped the patient? We all know that different people are going to adopt a score like this in different ways and it may be unpredictable. It might be that your least experienced people adopt it even more. Or maybe [00:12:00] they're the ones who adopt it the least because they don't understand the process already. They're trying to gain experience. Maybe the experienced people will adopt it more or less. We actually have no idea how that's going to shake out.

And so what are the factors that are going to increase adoption or decrease adoption? So that's the third arm. And we're doing the adoption in a simulated environment. So we're going to have cases that are designed by a software engineer to be in like a, a format where here's a real patient expert. Here's the information, you know, at this time, this is the score. What are you going to do?

Catherine: So what’ve you learned so far from the research? Has anything surprised you?

Dr. Park: So one of the things actually that was surprising to me as we're doing this clinical validation and exploring the utility of the score, we had designed it thinking we really need to figure out which of these 30 to 40 percent, these hidden people, are really going to have this untoward event. The things that we do to try to prevent it is to keep people very euvolemic. We try to make sure that they're tanked up. They have enough fluid on board. We try to control their blood pressure so it doesn't go too [00:13:00] low, but we also don't want it to be too high. It turns out there's going to be a benefit, almost maybe even more benefit to the patients who are not going to have it.

And that's the 60 to 70 percent of patients who are captive audiences of ours for two to three weeks, essentially chained to their bed by a drain in their head or other monitoring equipment. They might be able to walk around the unit a bit. They're getting highly monitored because we're just not sure if they're the ones who are going to have the stroke.

That 60 to 70 percent of patients may not need to be in the ICU. They may be able to get that drain out of their head earlier, if we knew they did not need it. The longer you have a drain in your head, the more chances you have for it to become infected. That's just the natural history of these things, right?

And so, I'm thinking, when we walked into this, that we were going to look for that, really, those sick patients I wanted to try to save from stroke, but I'm starting to see that there might be, I think, as much, if not [00:14:00] more, benefit for the patients we can accurately predict are not going to have it. And get them out of the ICU faster, get those drains out faster.

Catherine: Yeah, I mean that is such an important and overlooked point because, if ultimately the goal is to have a positive patient experience, and to be healthy and safe, and you identify patients who don’t need to be in the hospital, that’s the outcome you’re looking for, like that’s the win.

Dr. Park: Yeah, it's just from the discovery of looking at the, the data that's coming out of our silent clinical validation. It's just very clear. And you know, 60 to 70 percent is more than 30 to 40%. These are the majority of our patients!

Catherine: I mean, that’s a really good point 'cause we focus so much on the people who stay in the hospital but ultimately it's just as important, if not more so, to focus on the people who don't need to be there! Just to take a step back, I know you want the COSMIC score to be open source so that any hospital can use it. But I was wondering if you can tell me more about what needs to happen in order to make that possible?

Dr. Park: Fortunately or unfortunately, there was a mandate that all hospital systems that use digitized data or EHRs, you know, electronic health records, have to be translated into a universal health care language called FHIR, Fast Healthcare Interoperability Resources. And that has occurred for lots and lots of different kinds of data, thanks to our EHRs and our hospital systems. And we are building our machine learning model, sort of dictating how do you acquire the data, at what frequency? What do you do with missing data?

So we want to figure out those issues in as universal a way as possible and write those pieces in FHIR. So then it kind of goes and plugs into a hospital system. So this is a method that I see more and more people adopting of being able to try to do multi center clinical trials or even on the final state just being able to send it out and saying, this is the model, here are the [00:16:00] specifications, like a recipe. This is what went into it. Be very transparent about it. And then be careful to update it from time to time to say, what is changing in your hospital? But in terms of its implementation, we've chosen this FHIR language to try to make it a lot, a lot more implementable and shareable.
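To make the questions above concrete (how data is acquired, at what frequency, and what happens with missing values), here is a minimal sketch that reads heart rate readings shaped like FHIR Observation resources and places them on an hourly grid. The `observation` helper, the hourly frequency, and the choice to carry the last reading forward are assumptions for illustration, not the team's specification, and the readings are made up.

```python
from datetime import datetime, timedelta

HEART_RATE_LOINC = "8867-4"  # LOINC code for heart rate


def observation(time, value):
    """Build a minimal FHIR-style Observation dict (hypothetical helper for the toy data)."""
    obs = {
        "resourceType": "Observation",
        "status": "final",
        "code": {"coding": [{"system": "http://loinc.org", "code": HEART_RATE_LOINC}]},
        "effectiveDateTime": time,
    }
    if value is not None:
        obs["valueQuantity"] = {
            "value": value,
            "unit": "beats/minute",
            "system": "http://unitsofmeasure.org",
            "code": "/min",
        }
    return obs


def extract_series(observations, loinc_code):
    """Return time-sorted (timestamp, value) pairs, skipping readings that carry no value."""
    series = []
    for obs in observations:
        codes = {c.get("code") for c in obs.get("code", {}).get("coding", [])}
        value = obs.get("valueQuantity", {}).get("value")
        if loinc_code in codes and value is not None:
            series.append((datetime.fromisoformat(obs["effectiveDateTime"]), value))
    return sorted(series)


def resample_hourly(series, start, hours):
    """Place readings on an hourly grid, carrying the last reading forward.

    Grid points before the first reading stay None. Carry-forward is an assumed policy.
    """
    grid = []
    last_value = None
    index = 0
    for hour in range(hours):
        t = start + timedelta(hours=hour)
        while index < len(series) and series[index][0] <= t:
            last_value = series[index][1]
            index += 1
        grid.append(last_value)
    return grid


# Made-up readings: the 11:00 observation arrived without a value.
toy_observations = [
    observation("2024-01-01T10:00:00+00:00", 80),
    observation("2024-01-01T11:00:00+00:00", None),
    observation("2024-01-01T12:30:00+00:00", 90),
]

if __name__ == "__main__":
    series = extract_series(toy_observations, HEART_RATE_LOINC)
    start = datetime.fromisoformat("2024-01-01T10:00:00+00:00")
    print(resample_hourly(series, start, hours=4))
```

Writing the acquisition and missing-data rules against a shared resource format is what lets the same logic run against another hospital's records; a different site might reasonably pick a different sampling frequency or imputation rule.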

Catherine: So I'm gathering from you that you really have an interest in looking at things in the big picture and trying to figure out how to improve the overall system and make it better… And that actually brings me to your role at NewYork-Presbyterian because it definitely seems to me that that’s what you’re doing at NYP as well. So I’m wondering can you tell me a bit more about your journey and what drew you to NewYork-Presbyterian?

Dr. Park: So in my, in my previous institution, when I was a junior faculty member, and trying to figure out how to, you know, get this data to find this information, I had to kind of start from scratch. First, I had to get the data. And so I said, you know, could somebody help me figure out how to get data off of these devices?

I was really just knocking on the doors of everybody, you know, the clinical engineers at the last hospital I worked at, [00:17:00] and became friends with them.

They let me play in the device graveyard, which is in the basement of the hospital. They'd always have, like, ventilators and monitoring devices that were being fixed or in storage. And so they would let me practice getting data off of them. I remember one of my first forays trying to find mentorship.

I knocked on a door, I traveled far and wide to go there and they basically said, why are you, why are you doing such a weird thing? Like, what are you trying to do? Nobody's interested in this, Soojin. Go back and do something much more normal. I was like, well, I guess you're not, going to be my mentor. That was great because it encouraged me to find mentorship elsewhere.

And so, about six years in, I was five months pregnant with my second kid and, you know, they reached out to me from Columbia and they said, you know, we have a need for a clinical person who is mid-career, and with your interest in informatics, it would be a perfect fit here because we have all of this data that you could just come [00:18:00] here and mine, like you don't have to worry about getting the data anymore. There were only a couple of places that had invested in an enterprise ability to get data, and Columbia and NewYork-Presbyterian was one of them.

They celebrated the innovation. It was sort of a given, you know, it's, of course, this is cool and new, like you'd bring something of value to us. The Department of Biomedical Informatics, I chatted with people from there on the first day. And they were also like, yeah, this stuff is great what you're doing.

It's so innovative, so original, like you just, you definitely got something here. And yeah, so it was really hard even by the end of the day to convince myself or anybody else like why I would possibly ever, ever not come here. I never looked back. And when I got here, it was, it was such a great decision.

Catherine: That must have felt so validating after all that time.

Dr. Park: It was.

Catherine: So if you could wave a magic wand, you know, what would your vision be of what might be possible when it comes to [00:19:00] machine learning and AI in terms of how this could change the way that doctors treat patients and then what the outcomes might be?

Dr. Park: Yeah, you know, I'm hopeful that you're seeing machine learning and AI pop up in all sorts of places. Like, I'm working in a very specific case here where I'm using data that's observable and making tools that are going to help me make decisions, and that comes from my, like, sort of frustrations early on, right?

I just want to do what I do, but better and more standardized. You're seeing it kind of pop up in workflow management. You're seeing it pop up in, aiding. So say you're a radiologist and you're looking at a very fuzzy film limited by the technology you have, and you're trying to find a tumor, and you have a certain amount of expertise.

Now there's people working on AI or machine learning to guide the radiologist to pinpoint say, I think there's an abnormality here. And these are the reasons why it's interpretable. What do you think? So like, it's like a boost or a help. It's a clinical decision support. So I think the possibilities are really endless and like so much potential. [00:20:00] Yeah, I think everywhere you're going to see it.

Catherine: Thank you so much for making the time to speak with me today. This is absolutely fascinating and I am truly looking forward to seeing what else you do.

Dr. Park: Thank you so much. It was fun talking to you.

Catherine: Huge thanks to Dr. Soojin Park for taking the time to talk to me about how data and machine learning can revolutionize neurocritical care and improve outcomes for patients.

I’m Catherine Price.

Advances in Care is a production of NewYork-Presbyterian Hospital. As a reminder, the views shared on this podcast solely reflect the expertise and experience of our guests. To listen to more episodes of Advances in Care, be sure to follow and subscribe on Apple Podcasts, [00:23:00] Spotify, or wherever you get your podcasts. And to learn more about the latest medical innovations from the pioneering physicians at NewYork-Presbyterian, go to nyp.org/advances.

[00:20:53]