
Eric asks the question that is on many of our minds – is the future of AI more Skynet from Terminator, in which AI takes over the world and drives humanity to the brink of extinction, or Wall-E, in which a benevolent and empathetic AI restores our humanity?

Our guest today is Bob Wachter, Chair of Medicine at UCSF and author of The Digital Doctor: Hope, Hype, and Harm at the Dawn of Medicine’s Computer Age. Bob recently wrote an essay in JAMA on AI and delivered a UCSF Grand Rounds on the same topic. We discuss, among other things:

  • Findings that in several studies AI was rated by patients as more empathetic than human clinicians (not less, that isn’t a typo). This turns my concern about lack of empathy from AI on its head: the AI may be more empathetic than clinicians, not less.
  • Skepticism on the ability of predictive models to transform healthcare
  • Consolidation of EHRs into the hands of a very few companies, and the potential for the drug and device industry to influence care delivery by tweaking AI in ways that are not transparent, since AI is already a sort of magical black box.
  • AI may de-skill clinicians in the same way that autopilot de-skilled pilots, who no longer know how to fly the plane without it
  • A live demonstration of AI breaking a cancer diagnosis to a young adult with kids (VitalTalk, watch out)
  • Use cases in healthcare: Bob predicts everyone will use digital scribes to chart within two years
  • Concerns about bias and other anticipated and unanticipated issues 

And a real treat: Bob plays the song for this one! Terrific rendition of Tomorrow from the musical Annie on piano (a strong hint there about Bob’s answer to Eric’s first question). Enjoy!

-@AlexSmithMD

Eric: Welcome to the GeriPal podcast. This is Eric Widera.

Alex: This is Alex Smith.

Eric: And Alex, who do we have with us today?

Alex: Today, we are delighted to welcome a very distinguished guest, Bob Wachter, who’s a hospitalist and professor and chair of the Department of Medicine at UCSF. He’s the author of The Digital Doctor: Hope, Hype, and Harm at the Dawn of Medicine’s Computer Age. He’s also credited with being the academic father, so to speak, of the hospitalist movement, having coined the term “hospitalist” in the New England Journal in the 1990s. Bob, welcome to the GeriPal podcast.

Bob: Thank you gentlemen. What a great pleasure to be here. I’ve been a long time admirer.

Alex: We should say, Bob was an early supporter. I remember walking through San Francisco Airport with Bob, we just happened to be there at similar times, and saying, “Eric and I are thinking about starting a blog.” And Bob said to us, “I’ve had more invitations to speak from my blog or my social media stuff, than from anything I ever wrote in the New England Journal or other journals.”

Eric: What was it called again, Wachter’s World?

Bob: I had Wachter’s World, ran it for about seven or eight years and then transitioned to Twitter before that became a cesspool. But yeah, it always struck me as strange in academia that the coin of the realm was writing a paper for a journal, which of course, is incredibly important, but maybe 10 people might read it. Whereas, if you do what you’re doing and 1,000 or 10,000 or 100,000 people see it, we’re here to make an impact, so whatever works works, and this clearly works.

Eric: Well, we have a lot to talk about. We’re going to be talking about AI and medicine. This was stimulated by a Grand Rounds Bob just did at UCSF on AI in medicine, which we’ll link to in our show notes. But before we jump into that topic, we always ask for a song request. Bob, do you have a song request? I actually see that you’re sitting in front of a piano.

Bob: I have a request, but I’m sitting here at a piano, and you guys goaded me into playing and singing, and people should have a trigger warning that this will not be good.

Eric: What song are you going to play, Bob?

Bob: I’ll give it a try. And I’m going to play a song that’s relevant for AI, because we’re really talking about the future. This is Tomorrow, from Annie. Let me see if I can get the computer right. There we go. All right. Here we go. And really, forgive the voice. I was in my high school choir, but that was more than five years ago.

(Singing).

Eric: That was excellent, Bob.

Bob: There you go. I’m going to keep my day job.

Alex: That was so great. I loved it.

Eric: That was great, Bob.

Alex: That was absolutely terrific.

Eric: I want to ask about the song. It is an uplifting song. “The sun’s going to come out.” And that leads into my first question for you, Bob. This is my big question.

Bob: All right.

Eric: I may be done after this one [laughter]. Where are you with AI right now? Is this like Terminator: Skynet or Agent Smith from the Matrix, where the machines develop sentience and they take over and destroy us? Or is this more like Short Circuit or Wall-E, where you have these benevolent … Or not overlords. Benevolent helpers that are both cute, interesting, and maybe clean up some of our messes like Wall-E does?

Bob: It depends on short versus long term. Long term, I have no idea, but I worry a lot about it. I think, over time, the capacity of bad actors to take this over, the possibility that many of us will not have jobs and that will create social unrest, those are very real. I worry about this as a purveyor of misinformation. The fact that you could right now take a video of me and get me saying anything in any language in the world, and it would look like I’m speaking. I think that’s scary-

Eric: And now, listeners have to really think, is this the real Bob, or have Alex and I created a fake Bob Wachter?

Bob: Exactly. It could be anybody. So that, I worry about. In the short to medium term, I’m much more excited about it than scared of it. And I’d say, particularly in healthcare, where we’re not very good. Quality isn’t very good. Access isn’t very good. The experience of both patients and clinicians isn’t very good. Costs are bankrupting the society. As Biden likes to say, “Don’t compare me against the almighty. Compare me against the alternative.” And the alternative is the status quo and it’s not very good, and I can see a thousand ways that this will make it better. I think, that part, I’m very excited about. I think, the long haul, it’s really hard to know where this goes.

Eric: Yeah.

Alex: Can I follow up that question, Eric?

Eric: Yeah.

Alex: You wrote two viewpoints about AI in JAMA, relatively recently. I guess, one in 2019, and then, another about last month or so.

And the first one was with Zeke Emanuel. And in that, you were very skeptical of predictive models and their ability to change patient behavior or clinician behavior. And I wonder, now, was this 2024 version sort of a redo? Would you have that same pessimism that you had then today about these predictive algorithms and their ability to transform healthcare?

Bob: Mostly, because I think that there are just a thousand different use cases for AI. And basically, everything humans do, AI will eventually be able to do a version of it.

What has been often highly touted, and you guys know this from all your work and predictions of when someone’s going to die, I don’t think it’s a done deal that knowing that number to the 32nd decimal place is more helpful than knowing that number through some kind of Gestalt-y feel. And clearly, if I thought my chances of dying in 10 years were 5% and an algorithm said it’s 80%, then, we have a problem and the algorithm could probably do some real good. But I think, in medicine, that’s often not the case. We sort of know that the patient has a decent chance of having sepsis or know that the patient has a decent chance of falling or being readmitted. And the use case of the algorithms being much more accurate predictors and getting you from, “I think it’s 20% to 30%,” to “22.7%,” that part, I’m not sure is all that helpful. But what Zeke and I were talking about, really, was one single use case, which is predictive models. I think what we’ve seen in the last few years is that the use cases are much, much broader than that.

And here’s a use case: I was just attending on the wards, and periodically, we would see a patient and the question would come up, “How often is the ferritin high in autoimmune hepatitis?” And in the old days, I might’ve tried to search on Google or on UpToDate. Now, I would search on GPT-4, because it will integrate. And I think, often better than searching the literature, it will integrate your entire situation the way you would if you went to a specialist.

And you’re not really asking a single factual question. You’re asking a question in the context of this patient, and I think that the ability of these tools to do that really well is astoundingly good. It’s not perfect, but I think it’s opened my eyes to all sorts of use cases that I didn’t think were possible until the first time I played with GPT-3.5 and I had that “Holy cow” moment that I think many of us have had. It’s pretty darn good, and it’s gotten better.

Eric: Mm-hmm. Let me ask you a question. Your example makes me think, how much should we trust these systems, even just as a knowledge base? Because I remember there was an article maybe six months ago in JAGS, maybe it was longer than that, when everybody was talking about ChatGPT-3.5, I think it was, and they asked it three questions about geriatrics, including about antipsychotics, and it read perfectly. It was as if a geriatrician had answered all three of those questions, and it was fabulous. I was about to just approve it; I really had no suggestions about the article, from a peer-review perspective. Then, I saw one of the citations and it was Wooddaryl and Smithel. And I’m all, “Those two names sound kind of … ”

Bob: It made it up, just completely made it up, right.

Eric: And it completely made up all of the citations. And it made me think, these algorithms, at that point, weren’t based on what we think is fact, but on what the algorithm predicts is the next logical-sounding thing I want to hear, from a language model perspective. So in truth, I called it a bullshitter, because of the book On Bullshit. You ever read that, Bob?

Bob: No, but I know about it.

Eric: Yeah, great book. Basically, it’s, a bullshitter doesn’t care if something’s true or not true. A bullshitter just cares …

Bob: That’s what you want to hear.

Eric: … does it sound believable?

Bob: Yeah.

Eric: Is that where we are with it? Is it still a bullshitter?

Bob: We’re better than that, but it’s still a real concern. They were probably using GPT-3.5. GPT-4 is out, GPT-5 will be out soon, and they’re better. They hallucinate, the term of art that people use is hallucinations, which means making stuff up. And it hallucinates less often than it used to. I’ll tell you a story about that in a second. But it’s hard to deny how good it is. And the reason I say that is, we went from a period, a year and a half ago, where it was brand new, what does this thing do? Then, it was like, well, it does as well as, and now better than, humans at passing the medical boards, passing the law boards, doing great on the SAT, all that stuff. And you say, “Okay, that’s fine, but can it really … What about a real live clinical situation?”

And you probably saw, we talked a little bit about it in Grand Rounds, the study that just came out from Google using their AI tool. And what they did was a whole bunch of scenarios where actor patients basically chatted with a chatbot. And it was, “I’m having shortness of breath,” and it asked a bunch of questions, and, “Yes, I have a fever,” or, “My cough is productive,” or all that sort of stuff. And the actors were blinded to whether this was a primary care doctor or a chatbot answering them. They had no idea. The results were that the chatbots got the diagnoses right and had an escalation strategy that was better than the doctors’, as judged by blinded specialists reading the transcripts. Okay, that’s scary, but pretty impressive. Scary if we like our jobs.

And then, the humans say, “Oh, but what about empathy?” And then, the blinded observers, these actor patients, were asked to judge the empathy of the responses. And this wasn’t like the study that came out six months earlier, which looked at Reddit back and forth, where the chatbot just had an infinite amount of time, so of course, it told you all sorts of wonderful stuff and made you feel better. This was with about the same volume of material, but they rated the empathy of the chatbot’s responses better than that of the human doctors.

I think we’re getting closer to real live clinical situations, not just answering a test. It’s no longer a Jeopardy game; it’s real live back and forth between patients and something. And I don’t think there’s any question these things can give you accurate answers or advice. Or if I ask you for advice, as a specialist helping me, as a generalist, take care of an older person, I don’t know that it’s going to be better than you, but it doesn’t have to be better than you. If it’s as good as you at a fraction of the cost and available 24/7, that’s a really, really useful tool.

I’ll say one more thing about hallucinations. I don’t know if you saw this, but Sarah Murray, who’s now our Chief Health AI Officer at UCSF Health, and who knew, two years ago, that we needed a Chief Health AI Officer, but we now have one, put into GPT-3.5, maybe eight or 10 months ago, “Please write a prior authorization to the insurance company. I want to use an anticoagulant, like apixaban, a DOAC, for my patient with insomnia.” Now, that of course is wacky. It is ridiculous to be using a blood thinner for a patient with insomnia. The thing just wrote this beautiful, absolutely beautiful, well-constructed prior authorization request to the insurance company. And it said, “In my experience, I’ve used this many times, and it really works well for my patients. And although it’s a little unconventional, the literature is increasingly supportive of this use case.” Sarah read this, and she was so flabbergasted that she went on to PubMed to search whether she’d missed some recent literature. Of course, there was none.

Six months later, she came back and put the exact same request into GPT-4, and it said, “There must be some kind of a mistake. There’s no use of this medicine for this purpose, and it would be unethical for me to write this for you.” These things are getting better fast, and they will continue to get better fast. And I think the hallucination problem is very real, but it’s already better than it was, and I think it’s going to continue to get better and better.

I’ll say one more thing about trustworthiness. I just had another paper that I wrote with Julia Adler-Milstein and a colleague in Toronto that’ll come out in JAMA soon. But the thing that, in some ways, worries me the most in medicine is this: if the AI was right 50% of the time, we wouldn’t use it. It would be worthless. If it was right 100% of the time, it would be great, except we wouldn’t have jobs. For the foreseeable future, it’s going to be right 90%, 95%, 85%, so the safety system, of course, will be: the AI says something, makes a recommendation or gives you a draft diagnosis or drafts your note, and then the failsafe is, the doctor reads it over.

And our point, in this article, is that it’s an inherently unsafe system, because humans suck at that. And they suck at it for a lot of reasons. One is, we’re not good at vigilance, particularly when the thing that’s giving us the first-pass answer is generally right. We will fall asleep at the switch. The second is, we will de-skill over time; we’ll not be as good as we used to be, and we won’t know that it’s wrong. The third is this thing called automation bias, which is the bias we have to trust the technology even when it’s wrong.

The argument we have made is, we better approach this intentionally and strategically and say, “This system, which seems really safe, the doctor will always do the final reading, is an inherently flawed system.” And there are ways, we can talk about it if you want, but there are ways of probably making it more robust. But if you just say, “That’s the system. The AI will do its thing and the doctor will do the final sign off,” that’s going to get it wrong a fair amount of the time.

Alex: Oh, there’s so much to unpack there, so many interesting, fascinating pieces. I took notes while I was viewing your Grand Rounds, about things that were surprising or particularly concerning to me. And one of the things that was so surprising is this finding that AI was more empathetic in that study you cite, rated as more empathetic than the clinicians in those chatbot patient scenarios. And that just blew my mind. And you also said that this has been found in other studies, that AI is more empathetic.

Eric: Alex, before you finish that, have you ever talked with a chatbot and thought it was empathetic?

Alex: Me?

Eric: Yeah. Think about all your interactions with a chatbot.

Bob: Like a chatbot from Bank of America or United Airlines?

Eric: Yeah, any of those.

Bob: Of course not.

Eric: While I am encouraged by it, it’s hard to get empathy from email, text, anything like that.

Alex: And yet, these people were rating the chatbot, the AI chatbot, as more empathetic. But we’d both agree there’s a limitation of the medium. And so much of this AI, right now, is in this text format, chatty back and forth in text, but the future is in voice. As Bob said, you’re going to be able to impersonate him before long, using video and speech.

When I first thought about AI and its potential uses in geriatrics and palliative care in particular, I thought, oh, this is the last place we’d want it, right? Particularly for communication, where we rely so much on the art of medicine, on empathy, on understanding emotions.

Eric: And picking up clues from the patient.

Alex: But this turns that whole thing on its head. Might it be, actually, that the AI is eventually rated as more empathetic than the clinicians in encounters with voices and images? Maybe-

Bob: I think one of the issues that comes up is, I was just on the wards and I had a few patients where we called the palliative care service to help with some very challenging end-of-life, goals-of-care kinds of things. And seeing the palliative care attending at the time do it, and we did it together, I’d be hard pressed to believe that the AI, anytime soon, will be better than her. I wouldn’t be hard pressed to believe that it’ll be better than me. Because I think I’m okay. I’m not nasty. But she basically, like you guys did, spent a considerable amount of her career learning this skill of communicating empathy and all that, and I didn’t. I like to believe I’m decent and I try, but she’s better at it than I am, because that’s what she does all the time and she’s specifically trained in it.

Again, it’s, don’t compare me to the almighty. You can imagine a world where bringing in the palliative care or the geriatrics specialist for a given situation gets you to a level of empathy that is just fantastic, and the chatbot doesn’t reach that level. But that’s not how 99.7% of these conversations happen. You guys aren’t involved. It’s regular docs trying to do it. So part of the use case here might not be that you have a conversation with the bot, but that the bot teaches the doctor how to be a little more empathetic. Can we try something a little crazy? You may never have done this on the podcast. You want to try GPT and see how it does?

Eric: Yeah, let’s see it.

Alex: Sure.

Bob: All right. Let’s give it a try. I’m connecting. This is GPT-4. “Hi, GPT-4. I’m on a podcast now. And what I want to do is, this is a tough task for you, I’m sure, but I have a patient with a new diagnosis of cancer, and it’s actually a 35-year-old woman. She’s got a couple of kids. And I’m dreading telling her that she has cancer. I wonder if you could suggest an empathetic conversation that I could have with her that would not sugarcoat the idea that she has cancer, but do it in the most supportive possible way, and taking into account that she’s got a family and she’s got young kids?”

Alex: Wow.

ChatGPT-4: Certainly, having this conversation requires a great deal of sensitivity and compassion. Here’s an approach that balances honesty with empathy. “Hi, patient’s name. Thank you for coming in today. I want to take a moment to acknowledge the strength it takes to face uncertainty about our health. I have received the results from your tests, and after a thorough review, it’s important that we talk openly about what we’ve found. We’ve discovered that you have cancer. I can only imagine how hearing this must feel for you, especially as a parent with young children. This news is overwhelming, and it’s okay to feel a mix of emotions. You’re not alone in this. My team and I are here for you every step of the way. I want to assure you that there are treatments available, and we’ll discuss all your options. It’s important-”

Bob: What do you think?

Alex: Wow.

Bob: By the way, I’ve never done that before. I took it on faith that it would do a decent job. “Wow” is the usual response.

Alex: I’ve never talked into ChatGPT either.

Bob: Yeah. Tell me what your reaction is. Did that change the way you think about the way this will play out over time?

Eric: Well, it’s interesting. It felt a little bit prescriptive. We had another podcast on this idea-

Alex: The Uncanny Valley.

Eric: The Uncanny Valley, where something is, going back to, it’s 90% right, but it’s not 100%, and humans are really good at picking up inauthenticity, so I worry about that a little bit.

Bob: Yeah, there’s the Hallmark card effect. Well, I use it to draft letters of recommendation, and really just to draft them, because I don’t think it’s a great writer and I am a decent writer, so I go over it. And some of the time, it just feels pretty pro forma and rote, and I think you’re picking up on that.

Eric: But for a letter of recommendation, that doesn’t really matter. Sometimes, we do it for grants and stuff. It’s great at doing that. I am just amazed how far it has come from ChatGPT-3.5, even in six months, a year. There were edits I would’ve made to that, but that was pretty amazing.

Bob: It’s not bad.

Alex: That’s pretty amazing. Yeah. It expressed empathy, it expressed non-abandonment. It was specific to the relationship with the patient, the patient’s social context in which they might take this information. It didn’t fire a warning-

Eric: But it also reminded me a little bit of Alex’s video of taking out the trash, where he uses palliative care NURSE statements around empathy to talk to his wife, which failed miserably.

Bob: Do you say, “I wish I didn’t have to take out the trash?”

Alex: That’s right. Exactly. Yeah.

Bob: Yeah, but you can imagine-

Alex: I wish, but I worry I will not.

Bob: Yeah. I would be surprised if it was as good as you guys, but I think that you can imagine a medical student or an intern doing this before they went into a room to talk to a patient and having it be pretty helpful.

Alex: This was another point that you brought up in your Grand Rounds, that in study after study, these AIs seem to help the least skilled more than the most skilled, who don’t need the AI to help them. And that’s the flip side of this de-skilling thing. It’s the promise and the pitfalls.

Bob: Right. Yeah, that is true. And most of those studies have been done in the business world, although I’m guessing they will be replicated in medicine. Everybody using AI as a co-pilot seems to do their job a little bit better and a little bit more efficiently, but the biggest gains are for people who are newbies. And essentially, what it does is, it allows you to traverse your learning curve faster. And to me, that is super attractive.

Now, there is, I think you’re exactly right, a tension between de-skilling and skilling. On the one hand, if I used that before every difficult conversation I had, I’d probably get better at it. On the other hand, at some point, I might get overly reliant on it and turn my brain off. Humans are that way; they tend to preserve their cognitive bandwidth, so that’s the risk.

And I think it’s particularly germane as it rolls out in people who are new to the profession. It can be a cognitive crutch, and they may not learn some of the things they really should learn. You and I would listen to a suggested list of diagnoses and say, “That’s a reasonable idea. That’s a reasonable idea. That one’s wacky. That’s crazy, that’s not.” And the question, it’s sort of a version of anchoring bias, whether over time, people would just sort of trust it and not think twice about it. And that’s kind of what humans do.

And there have been … We’ve seen this in other … One of the nice parts about medicine is we tend to be so late to digital transformation that you can look at other industries and see how that’s gone. And in aviation, there have been a few pretty terrible high-profile accidents that happened when the technology went on the fritz and no longer was working, and the pilots now had to fly the plane. And I remember interviewing Captain Sullenberger for my book, and as he was describing accidents like this, he said, “They were flying a plane they were not familiar with, which is a plane without their digital wingman.” And yet, flying is safer than it has ever been, in large part because of the digital tools. That’s part of the trade-off: it probably will de-skill us, will make us less good, but if the technology is better, then maybe, net, it’s good for patients.

Alex: Yeah. Two things come up for me when I hear about this. One is EKGs and the automated read on the EKG. I remember, when I was in residency, we were taught, you print out the EKG, and the first thing you do is fold over the top of the EKG so you can’t see the computer read. And then, you go through and you analyze the EKG. And maybe we’re getting to a point where people will only be looking at the top of the EKG and reading that, not thinking it through for themselves.

Bob: When I was in residency, Alex, they said, “I think, eventually, they’re going to have this thing called an EKG.” It was really inspiring. Yeah, it’s just natural to turn your brain off when the thing is right the vast majority of times, and the EKG reading is right the vast majority of times, so it’s logical that people will not get as good at that as your generation or my generation was.

Alex: Yeah, we talk a lot about de-skilling in geriatrics and palliative care, because so much of what we do is a consultant role. And we’re often increasingly asked to see patients when there’s a difficult conversation. And of course, there’s a spectrum of difficult conversations, and we’re happy to get involved, particularly in the most challenging, complex circumstances. But a lot of times, they ask us to get involved with conversations that they really should be having themselves, so we worry about that de-skilling phenomenon and maybe-

Bob: Yeah, I often think they’re doing it partly because, if they do that, they’ve handed off an hour of work to you rather than them, as opposed to really believing that your skillset is incredibly valuable. But really, that’s for the tough stuff. For the easy stuff, we should all be able to do it.

Alex: Mm-hmm. Eric, go ahead.

Eric: Yeah, it’s interesting, because I used to do a lot of ward attending, and I was a terrible palliative care doctor when I was ward attending, because you’re doing so many things. And that also makes me think that there are some things I’m perfectly happy to be de-skilled about. I don’t need to know how to bill, to enter encounter information, to code. I can imagine human scribes are not far off from dying out, because now you have AI that could be a scribe.

Bob: I think you were going to ask, what’s the first use case? That is the first clinical use case. At UCSF today, 40 of our busiest ambulatory doctors are using a digital scribe. And I predict, within two years, everyone will. They went from being pretty good, but probably not good enough for prime time two or three years ago, to being wickedly good at this point. And by good, I mean I tested one of the companies we’re using a week or two ago, and I tried to throw it off. I meandered, I wanted to talk about the weather, and in between talking about the weather and my granddaughter, I had a little chest pain, but then I wanted to talk about my urination. I did everything I could to throw this thing off, and it produced a completely cohesive and coherent SOAP note, as if I were a rational and linear patient.

And that, I think, makes the world a better place. It will save a significant amount of time. There’s probably a little de-skilling problem there, and as you say, who cares? But I actually think that, as I’m putting together my note, I am also repackaging my thinking. And even then, you have to make a decision: does the chat at the beginning, “How are the grandkids?” and “How’s your tennis game?”, go in the note or not go in the note? The programs are probably going to be preset for it to not go in the note, but a lot of people want it in the note to remind them of this patient versus the next one. There are some design decisions to make, and I think there are some de-skilling issues, but if you think about the amount of time we spend documenting stuff, it’s just stupid.

This will be the first use case that really touches the doctor-patient encounter, and I think that’s really smart. AI has a long and not-so-storied history in healthcare. In the ’70s and ’80s, there were a lot of efforts to build AI programs, and the first problem they tried to tackle was diagnosis. And for my book, I interviewed a bunch of the founding fathers of healthcare AI, mostly men, and I said, “Why did you do that? In retrospect, it was such a terrible mistake, because it’s the hardest problem.” And they said, “We did it because it was the most interesting thing we could work on. It just seemed like, we’re building artificial intelligence. Of course, let’s see if we can match how the doctors think.” But they all failed, so the field basically went on the shelf for 20 years. All the academic programs closed, all the-

Eric: If they had just stuck to prior auth.

Bob: Right. They should have stuck … If they’d gone to prior auth, the CEO of Doximity told me that they-

Eric: AI would be our benevolent overlord if they [inaudible 00:32:36].

Bob: The prior auth thing, I was talking to the head of Doximity, who said they built a prior auth generator. All you have to do is type in the letter “O” and it suggests an Ozempic prior authorization for UnitedHealthcare. It knows that’s what you want.

Eric: Now, insurance companies are going to want handwritten prior authorizations.

Bob: No, they’re not. What they’re going to do is, they’re going to put it in their AI that’s going to be smarter than your AI to reject your prior auth.

Eric: Oh, [inaudible 00:33:05].

Bob: And then, your AI is going to write the appeal. It’s going to be an AI arms race.

Eric: What other current applications of AI are you seeing, at UCSF or elsewhere, that people are currently using with a good use case?

Bob: Yeah, I think the digital scribe is the first thing. You’re going to see chart summarization pretty quickly. A patient has 1,000 pages of Epic chart and you say, “Give me this in a one-page summary.” It will be able to do that and do it pretty well.

Eric: God, I would love the day when it just pulls in the goals of care from all the notes. This is so hard to find in every EMR.

Bob: It will pull it in. I’ll probably then say, “You need to reconcile these because there are 37 different versions of it that don’t match.” Same with the med list.

It’ll certainly write the letter summarizing your visit to the primary care doc, and then write another version of it in patient language, both their language and their reading level, if you want. Those things are coming. There are also versions that are more about prediction, looking at places where that might be useful. At the Mayo Clinic, for example, they’ve got a very robust hospital-at-home program. When you’re on rounds as a hospitalist, you’ll see a little banner at the top of the screen saying, “Your patient appears to be eligible for hospital at home,” taking into account a whole lot of variables that it’s ingesting, and then making that determination. So you’ll start seeing more useful predictions, I think, that will actually make a difference.

There’s a lot on the operational side: yes, prior authorizations, billing, scheduling. The patient says to the system, “I want to see a doctor for my migraine headaches,” and it knows where you live and what the waiting lists are in various places, and then sorts all that out. I think a lot of back-office function, predicting ER loads and OR loads to do better scheduling of staff. That stuff’s going to happen in the background, but I think the clinical uses will begin with digital scribes and some of these other chart-related functions. And it’s not a big leap: if it’s doing digital scribing of our conversation, I don’t think it’s going to be very long before you will then see a differential diagnosis list. Based on what I heard in the conversation, it seems like these are the diagnoses that might be going on, just as a prompt for you to think about whether maybe you missed something.

Eric: Which brings up some of the worries, because now we’re getting into potential diagnoses and treatment options, because you have these massive … Large companies are the ones that can afford to create these AI algorithms, which means recommendations can be somewhat centralized, versus often diffuse with doctors, which pharma has always had a hard time with, because you’ve got to hit a lot of doctors to make changes.

Alex: It’s taking a lot of people out to play golf, yeah.

Eric: Versus one AI algorithm, and all of a sudden, everybody can be on your drug. And you may not know that, as a doctor or as a healthcare system, because AI is kind of a black box.

Bob: Yeah.

Eric: Is that a concern of yours?

Bob: It’s a real concern. You’ve consolidated, in the same way that, when Facebook is recommending things to you or TikTok’s recommending things to you, you don’t really know what’s under that hood. There’s a lot of room here for shenanigans and corporate bias.

There’ll probably be some rules about that, but they’re not here yet. So sure, whether it’s making a recommendation to the doctor or, increasingly, to the patient, because you’re going to see more and more of these tools that are direct to consumer, if it’s recommending that you should really get Treatment X, how do you know the company that makes Treatment X isn’t sponsoring that?

That’s a risk with electronic health records in general. There’s already some decision support in the EHR, so it could already be happening, but I think the chances of it happening at scale are greater as the AI becomes more ubiquitous and we just get used to it telling us what to do. There’s going to have to be some rules and regulations about that.

Alex: Yeah, really important. So many things to worry about. And in your book, you wrote about the adoption of electronic health records, how much hope and hype there was around it. And it happened, and it happened in large part, as you note, because of enormous stimulus from the US government for health systems to adopt electronic health records. But instead of all the hope and promise, it led, immediately, to frustration on the part of doctors: here I am having to respond to this electronic health record, which has all of these other goals in mind, principally billing and generating the most revenue for the encounter, rather than documentation of the best clinical care for the patient.

I know I worry about so many things with AI, and there’s the anticipated and the unanticipated, and some of the things that we worry about, Eric was just talking about consolidation in a few specific corporations. Bias is a huge one, and that we are encoding … The system will replicate the bias that is inherent in clinical medicine right now. And we know that there are tremendous inequalities in how we treat patients.

I wonder and I worry, as you do too, that these systems will end up being biased and will perpetuate the biases that are already prevalent in our system. And as you note, it may be no more biased than we already are. I wonder if you want to say more about bias, and if there are other concerns, whether anticipated or unanticipated, about the adoption of AI, such that, as with electronic health records, clinicians are going to be throwing their hands up in the air and saying, “Why on Earth did we do this? Can we go back to paper charts, please?”

Bob: I guess I’ll ask you that question. If we could go back to paper charts, would you do it?

Alex: I would not.

Bob: No.

Alex: I guess I would not. Yeah, no.

Bob: Yeah, they are net positive. It’s easy to focus on the negatives. There’s no question that care is better and practice is better. It’s not that there’s inherently anything wrong with electronic health records, although they’re certainly not perfect; they’re not great digital tools when compared against the things we’re used to in the rest of our lives. But what we didn’t understand was the electronic health record as an enabler of a whole bunch of other stuff.

The complaining that I used to hear at UCSF Health about the documentation burden, which I do think is going to be largely solved by the digital scribes, has now been mostly replaced by complaining about the electronic health record inbox. And the inbox is really an interesting problem. Here’s the problem: okay, we have an electronic health record just for the doctors and the nurses, but now we need one for the patients. That makes sense. Give one to the patients.

Patients now have a question for the health system, the next available appointment to see the doctor is three months from now, and there’s this little button at the top that says, “Send a message to your healthcare team.” It’s like, “Duh.” All of a sudden, you’re getting thousands of messages, with the assumption that you’re on 24/7, 365, and we have no business model to deal with this. The only surprising thing about that was that it was surprising. And is it the fault of the electronic health record? In a way. That couldn’t have happened if there wasn’t an EHR. But I think we’ve got to get better at thinking about these downstream effects.

One of the things I wrote about in the book that turned out to be one of the most commented on chapters was the demise of radiology rounds. Because back in the day, we used to go down every day to look at our films, and we would talk to the radiologists about the case. And it was wonderful and everybody loved it, and we learned from each other. And the minute digital radiology became a thing, those rounds went away. Nobody said that they wanted them to go away. They just did, because you no longer had to go to a single place to look at the film. We’ve got to get better at trying to anticipate the unanticipated. Similar things will happen with AI, without question. I think we’re smarter about it than we were. I think we’ve got governance structures in place, because we had to do that for the EHRs, that will help us think about how hard this is and the politics and the workflow and the culture in ways that we were naive about 15 years ago.

But yeah, there will be mischief, there will be unanticipated consequences, there will probably be some harm, but I’m guessing that the net effect will be positive, that quality will be better, safety will be better. The harm will be overemphasized in the same way that, when there’s a Tesla crash or a Waymo crash, “Look, I told you we can’t trust these things.” Whereas, even if the data says that, net, they’re safer, because when there’s a crash because someone’s looking at their phone while they’re driving, that doesn’t make the newspaper.

It’ll be an interesting ride. I’ll tell you one last story, about when I was teaching the medical students this several years ago, and they seemed too happy, so I decided to throw them. I said, “You people need to realize you’re entering a world where you’re going to be under just intense pressure to deliver higher-quality, safer, more accessible, more equitable care, and do it at a lower cost.” And one of the students raised his hand and said, “What were you trying to do?” And the name of the game is, can we figure out a way of making care better and safer and less expensive and more accessible and more equitable? I just don’t have any doubt that these tools will do that. It won’t be perfect. There’ll be problems. But it leaves me excited, because I think the current system is just not going to achieve those goals the way we need it to, and I think the potential is pretty great.

Eric: Bob, we’re getting close to the end of the hour. Before the song, I’ve got one last question for you. Three use cases that Bob Wachter currently uses AI for and finds helpful?

Bob: Definitely drafting letters of recommendation and announcements to the department. A new division chief, a new job, it will do the first pass. And I’d say it saves me half of the time, not all of it.

Planning a trip. We’re going to Sicily for a friend’s wedding in a few months. “Give me an itinerary for three days in Sicily. My wife and I like really good Italian food, aren’t particularly interested in the beach, like a good museum, but maybe just one.” It will give you a three-day itinerary that’s nearly perfect.

And then, this little mini consultant. “I’m on rounds and I just have a question, and if I could call the specialist and bring them over, that would be great, but that’s too much to do,” so I will throw that question into GPT and see what it says.

Eric: That’s great. Well, before we end, why don’t we get you back to the piano, Bob?

And now, we’re going to go back to, not Agent Smith or Skynet, but more Short Circuit or Wall-E. The sun’s going to come out.

Alex: Annie. Annie.

Eric: Annie.

Bob: We’re back to Annie. Okay.

Eric: No, I’m saying this is the happy version of AI. This is everything going well, like Wall-E picking up all our trash. [laughter]

Bob: Yeah, I think it’s going to be pretty cool. I think, by the time we’re wiped out as a civilization, I’ll probably be gone, so you guys are going to have to deal with it. [laughter]

(Singing).

Bob: One more, or we’re done?

Eric: That’s perfect. Wow.

Alex: That’s perfect.

Eric: I realize it’s the top of the hour, and I know you have many things to do. I want to thank you for joining us on this GeriPal podcast.

Bob: Thank you, guys. It’s a pleasure. And thank you for the unbelievably great work you do, educating so many people and doing it with a smile and with music. It’s really pretty wonderful.

Eric: Yeah. Thank you for singing too.

Alex: Yeah.

Eric: Alex, I love this new addition here.

Alex: This is great. This was a gift to us, Bob. Thank you so much.

Eric: And thank you to all of our supporters.

Bob: Be well.
