Artificial intelligence (AI) in healthcare was initially hailed as a cost-saving technology, promising to streamline processes and improve patient outcomes. However, recent developments reveal a growing concern: these systems require extensive human resources and ongoing monitoring to function effectively, raising questions about their true value.
At the University of Pennsylvania Health System, an AI algorithm was designed to help oncologists discuss treatment and end-of-life preferences with cancer patients. The tool uses predictive models to estimate a patient's likelihood of dying and nudges doctors to initiate these crucial conversations. A recent review, however, found that its performance had deteriorated significantly during the COVID-19 pandemic: a 2022 study showed that the algorithm's accuracy in predicting which patients would die had fallen by seven percentage points, potentially leading to missed opportunities to discuss discontinuing unnecessary treatments such as chemotherapy.
Dr. Ravi Parikh, an oncologist at Emory University and lead author of the study, emphasized the broader implications. He noted that the algorithm’s failure to prompt timely conversations may have led to unnecessary treatments. Parikh also warned that many other medical algorithms could be experiencing similar issues, as institutions often do not conduct routine checks on their performance.
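A routine check of the kind Parikh describes does not have to be elaborate. The deterioration at Penn is essentially model drift: patient populations and care patterns shifted during the pandemic while the model stayed fixed. Below is a minimal sketch of a periodic drift audit, assuming a hypothetical log of predicted risks and observed outcomes; the column names, outcome window, and tolerance are illustrative, not details of the Penn system.

```python
# Hypothetical drift audit: compare a mortality model's discrimination
# (AUC) across time periods and flag any period that falls too far
# below a chosen baseline. Column names and the tolerance are invented
# for illustration.
import pandas as pd
from sklearn.metrics import roc_auc_score

def audit_drift(df: pd.DataFrame, baseline_period: str, tolerance: float = 0.05):
    """df needs columns: period, predicted_risk, died_within_180d."""
    aucs = {
        period: roc_auc_score(group["died_within_180d"], group["predicted_risk"])
        for period, group in df.groupby("period")
        if group["died_within_180d"].nunique() > 1  # AUC needs both outcomes present
    }
    baseline_auc = aucs[baseline_period]
    flagged = {p: auc for p, auc in aucs.items() if baseline_auc - auc > tolerance}
    return aucs, flagged

# e.g., flag any quarter whose AUC drops more than 5 points below pre-pandemic levels:
# aucs, flagged = audit_drift(predictions_log, baseline_period="2019-Q4")
```

A quarterly check like this is cheap in compute; as the rest of this story makes clear, the expensive part is the people needed to assemble the outcome data and act on the result.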
This issue highlights a growing dilemma: AI tools in healthcare require ongoing oversight to remain effective. Despite the promise of AI to alleviate workload pressures and improve care, its true cost may lie in the extensive human involvement necessary to maintain these systems.
AI in Healthcare: A Double-Edged Sword
Nigam Shah, chief data scientist at Stanford Health Care, noted the paradox of AI in healthcare. While AI has the potential to reduce costs and improve care, its implementation can inadvertently drive up expenses. “If AI increases the cost of care by 20%, is that sustainable?” Shah asked, pointing out that the resources required to ensure AI’s efficacy can be substantial.
Health officials have raised alarms over the lack of capacity in hospitals to adequately test and monitor AI systems. FDA Commissioner Robert Califf recently stated that no healthcare system in the U.S. has the infrastructure to fully validate AI algorithms before their use in clinical settings. As AI becomes more integrated into medical practice, questions surrounding its effectiveness and ongoing oversight become more pressing.
AI is already a fixture in many aspects of healthcare, from predicting patient risks to streamlining administrative tasks. With nearly a thousand AI products approved by the FDA and AI-driven startups generating significant revenue, the technology is poised to become an integral part of the healthcare landscape.
The Challenges of Ensuring AI’s Effectiveness
Despite the widespread adoption of AI, evaluating its performance remains a significant challenge. A recent study at Yale Medicine examined six “early warning systems” designed to alert clinicians when patients are at risk of rapid deterioration. The study, which used a supercomputer to process the data, revealed substantial differences in performance among the six systems.
Hospitals face difficulties in selecting the right AI tools, as there are no standardized guidelines to evaluate these systems. “We have no standards,” said Dr. Jesse Ehrenfeld, past president of the American Medical Association, pointing out the lack of a reliable framework to assess AI tools’ performance in clinical settings.
Ambient documentation, a common AI tool that listens to and summarizes patient visits, is a case in point. Despite the substantial investment in this technology, Ehrenfeld emphasized that there is no standard for comparing the accuracy of these tools’ outputs, which could have serious consequences if errors occur.
In another study at Stanford University, researchers tested large language models, the underlying technology of tools like ChatGPT, to summarize patient medical histories. The results were troubling: even the most accurate models had a 35% error rate. In medicine, where a single omitted detail can have significant consequences, such mistakes are unacceptable.
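One way to make that omission problem measurable is to check each generated summary against a clinician-curated list of facts it must contain. The sketch below uses simple string matching as a stand-in for the expert review the Stanford researchers actually relied on; the facts, summary, and function name are invented for illustration.

```python
# Illustrative omission check: what fraction of required facts are
# missing from an AI-generated summary? Real evaluations need clinical
# review, since the same fact can be phrased many ways.
def omission_rate(summary: str, required_facts: list[str]) -> float:
    """Return the fraction of required facts not found in the summary."""
    missing = [fact for fact in required_facts if fact.lower() not in summary.lower()]
    return len(missing) / len(required_facts)

facts = ["penicillin allergy", "metformin 500 mg", "stage III colon cancer"]
summary = "Patient with stage III colon cancer, currently on metformin 500 mg daily."
print(f"Omission rate: {omission_rate(summary, facts):.0%}")  # 33% -- the allergy was dropped
```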
AI Failures and the Need for Human Oversight
The reasons behind AI failures are often clear: changes in data sources, such as switching lab providers, can render algorithms ineffective. However, some failures are harder to explain. Sandy Aronson, a tech executive at Mass General Brigham, shared an example where an AI application for genetic counselors exhibited “nondeterminism,” providing inconsistent results when asked the same question multiple times.
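Nondeterminism is at least straightforward to probe: ask the identical question many times and count how many distinct answers come back. The sketch below assumes a placeholder `ask_model` callable standing in for whatever interface the genetic-counseling application wraps; it is not a real API.

```python
# Minimal reproducibility probe: tally distinct responses to the same prompt.
from collections import Counter

def nondeterminism_report(ask_model, prompt: str, trials: int = 10) -> Counter:
    """Send the same prompt `trials` times and count each distinct response."""
    return Counter(ask_model(prompt) for _ in range(trials))

# Usage sketch (prompt and threshold are illustrative):
# counts = nondeterminism_report(ask_model, "Summarize the variant classification for BRCA1 c.68_69delAG.")
# if len(counts) > 1:
#     print("Inconsistent outputs across identical prompts:", counts)
```

A tally like this says nothing about which answer is correct, only whether the tool is consistent; judging correctness still falls to the human experts the technology was supposed to relieve.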
Despite these challenges, many experts remain optimistic about AI’s potential in healthcare, particularly in supporting overburdened professionals like genetic counselors. But Aronson stressed that the technology still needs significant improvement before it can be reliably integrated into clinical workflows.
To keep AI tools effective, hospitals and healthcare systems will need to invest significant resources in their maintenance. At Stanford, auditing just two AI models for fairness and reliability took 115 person-hours spread over eight to 10 months.
Some experts suggest that the solution could lie in AI monitoring other AI systems. However, this would require even more resources — a difficult proposition given the constraints of hospital budgets and the limited number of AI specialists available.
“How many more people are we going to need?” Shah asked, acknowledging the complexity of the issue. The vision of AI monitoring AI is enticing, but it raises the question of whether healthcare institutions can realistically bear the financial burden of such an approach.
As AI continues to play a larger role in healthcare, balancing the promise of efficiency with the reality of the human resources required to sustain it will be key to determining its long-term viability in the field.