Machine learning and artificial intelligence (AI) are growing in popularity in fields ranging from art to science and everything in between — including medicine and bioengineering. While these tools have the potential to make significant improvements in healthcare, the systems are not perfect. How can we identify when machine learning and AI are suggesting solutions that don’t work in the real world?
Carl Illinois College of Medicine (CI MED) faculty member and bioengineering professor Yogatheesan Varatharajah is working to answer this question with his research group, which aims to understand when and how specific models created by AI will fail. Varatharajah and his team recently presented a paper on this topic, titled “Evaluation of latent space robustness and uncertainty of EEG-ML models under realistic distribution shifts,” at the prestigious Conference on Neural Information Processing Systems, or NeurIPS.
“Every field of healthcare uses machine learning in one way or another, and so it has become a mainstay of computational diagnostics and prediction in healthcare,” said Varatharajah. “The problem is that when we do machine learning-based studies — to develop a diagnostic tool, for example — we run the models, and then we say, well, the model works well in a limited test setting, so it’s good to go. But when we actually deploy it in the real world to make real-time clinical decisions, many of these approaches do not work as expected.”
Varatharajah explained that one of the most common reasons for this disconnect between models and the real world is the natural discrepancy between the data used to build a model and the data encountered after the model is deployed. This variance may come from the hardware or protocols used to collect the data, or simply from differences between the patients represented in the training data and those seen in practice. These small differences can add up to significant changes in a model’s predictions, potentially causing the model to fail the very patients it is meant to help.
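The mismatch Varatharajah describes is often called distribution shift. As a purely illustrative sketch (toy synthetic data and a trivial classifier, not the paper’s actual EEG data or models), the following Python snippet trains a nearest-centroid classifier on simulated two-feature recordings, then adds a constant offset to the test features, standing in for something like a different amplifier at the deployment site:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated "training hospital": two hypothetical features per recording.
# Class 0 = normal, class 1 = abnormal.
n = 500
X_train = np.vstack([rng.normal([0.0, 0.0], 0.5, (n, 2)),
                     rng.normal([2.0, 2.0], 0.5, (n, 2))])
y_train = np.array([0] * n + [1] * n)

# Nearest-centroid classifier: about the simplest possible model.
centroids = np.array([X_train[y_train == c].mean(axis=0) for c in (0, 1)])

def predict(X):
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)

def accuracy(X, y):
    return (predict(X) == y).mean()

# In-distribution test set: drawn the same way as the training data.
X_test = np.vstack([rng.normal([0.0, 0.0], 0.5, (n, 2)),
                    rng.normal([2.0, 2.0], 0.5, (n, 2))])
y_test = np.array([0] * n + [1] * n)

# "Deployment hospital": different hardware adds a constant offset to
# every feature -- a simple covariate shift.
X_shifted = X_test + np.array([1.5, 1.5])

print("in-distribution accuracy:", accuracy(X_test, y_test))
print("post-shift accuracy:", accuracy(X_shifted, y_test))
```

The model looks nearly perfect on data drawn the same way it was trained, then loses a large share of its accuracy after the shift, even though nothing about the underlying classes changed.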
“If we can identify these differences early on, we may be able to develop some additional tools to prevent those failures or at least know that these models will fail in certain scenarios,” Varatharajah said. “That is the goal of this paper.”
To do this, Varatharajah and his students focused their efforts on machine learning models based on electrophysiology data, specifically EEG recordings collected from patients with neurological diseases. From there, the team analyzed clinically relevant applications, such as distinguishing normal EEGs from abnormal ones, to determine whether the two could be reliably differentiated.
“We looked at the kinds of variability that can occur in the real world, especially those variables that might cause problems for machine learning models,” Varatharajah said. “And then we modeled those variables and developed some ‘diagnostic’ measures to diagnose the models themselves, to see when and how they will fail. As a result, we can be aware of these errors and take steps to mitigate them early on, so the models are really able to help clinicians make clinical decisions.”
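One generic way to “diagnose” a model in this spirit is to score how far each incoming sample sits from the training distribution and flag predictions on far-away samples as unreliable. The sketch below uses a Mahalanobis distance with a threshold calibrated on held-out data; this is a common out-of-distribution heuristic offered for illustration, not the paper’s specific latent-space measures:

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-in for a model's feature (latent) space on training data.
X_train = rng.normal(0.0, 1.0, size=(1000, 4))
mu = X_train.mean(axis=0)
cov_inv = np.linalg.inv(np.cov(X_train, rowvar=False))

def mahalanobis(x):
    """How far a sample sits from the training distribution."""
    d = x - mu
    return float(np.sqrt(d @ cov_inv @ d))

# Calibrate an alarm threshold on held-out in-distribution data,
# e.g. the 99th percentile of in-distribution scores.
X_holdout = rng.normal(0.0, 1.0, size=(1000, 4))
threshold = np.quantile([mahalanobis(x) for x in X_holdout], 0.99)

# A sample from shifted data (e.g. different recording hardware) tends
# to score far above the threshold, so its prediction can be flagged
# as unreliable instead of being trusted blindly.
in_dist = rng.normal(0.0, 1.0, size=4)   # typically below threshold
shifted = rng.normal(3.0, 1.0, size=4)   # scores well above threshold
print(mahalanobis(in_dist), mahalanobis(shifted), threshold)
```

In a deployed system, samples scoring above the threshold could be routed to a clinician for manual review rather than acted on automatically.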
Sam Rawal, co-author of the paper and a student at CI MED, said this study can help clinicians make better decisions about patient care by bridging the gaps between the results of a large-scale study and factors specific to the local population. “The significance of this work is identifying the disconnect between the data on which AI models are trained, compared to the real-world scenarios they interact with when deployed in hospitals,” said Rawal. “Being able to identify such real-world scenarios, where models might fail or perform unexpectedly, can help guide their deployment and ensure they are used in a safe and efficient manner.”
Presenting the team’s research at NeurIPS—one of the premier machine learning conferences in the world—was particularly significant. “It is quite an achievement to have a publication accepted at this venue – it gives us a name in this community,” said Varatharajah. “This will also give us the opportunity to develop this tool further into something that can be used in the real world.” Bioengineering PhD student Neeraj Wagh presented the work at the conference.
Contributors to the work include co-authors Sam Rawal of CI MED and Neeraj Wagh, Jionghao Wei, and Brent Berry. Varatharajah also applauded the partnership between Illinois Bioengineering and the Mayo Clinic’s Department of Neurology; the project was facilitated by the Mayo Clinic and supported by the National Science Foundation.
Editor’s note: The original version of this article was written by Bethan Owen of the UIUC Department of Bioengineering.