Accurate registration of brain MRI scans is fundamental for cross-subject analysis in neuroscientific studies. This involves aligning both the cortical surface of the brain and the interior volume. Traditional methods treat volumetric and surface-based registration separately, which often leads to inconsistencies that limit downstream analyses. We propose a deep learning framework, UCS, that registers 3D brain MRI scans by jointly aligning both cortical and subcortical regions through a unified volume-and-surface-based representation. Our approach leverages an intermediate spherical coordinate space to bridge anatomical surface topology with volumetric anatomy, enabling consistent and anatomically accurate alignment. By integrating spherical registration into the learning process, our method ensures geometric coherence between the volume and surface domains. In a series of experiments on both in-domain and out-of-domain datasets, our method consistently outperforms both classical and machine learning-based registration methods, improving the Dice score by up to 7 points while maintaining regular deformation fields. Additionally, it is orders of magnitude faster than the standard method for this task and is simpler to use because it requires no additional inputs beyond an MRI scan. Its superior accuracy, fast inference, and ease of use set a new standard for joint cortical and subcortical registration.
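Since the comparison above hinges on Dice overlap between warped anatomical label maps, here is a minimal sketch of how such a metric is typically computed; the function names and label handling are illustrative assumptions, not taken from the UCS codebase.

```python
import numpy as np

def dice_score(seg_a: np.ndarray, seg_b: np.ndarray, label: int) -> float:
    """Dice overlap for one anatomical label between two segmentations."""
    a = seg_a == label
    b = seg_b == label
    denom = a.sum() + b.sum()
    return 2.0 * np.logical_and(a, b).sum() / denom if denom > 0 else 1.0

def mean_dice(warped_seg: np.ndarray, fixed_seg: np.ndarray, labels) -> float:
    """Mean Dice of a warped (moving) segmentation against the fixed one,
    averaged over cortical and subcortical labels."""
    return float(np.mean([dice_score(warped_seg, fixed_seg, l) for l in labels]))
```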
Co-authors: Mazdak Abulnaga, Andrew Hoopes, Malte Hoffmann, Robin Magnet, Maks Ovsjanikov, Lilla Zollei, John Guttag, Bruce Fischl, Adrian V Dalca
Generative models of 3D cardiovascular anatomy can synthesize informative structures for clinical research and medical device evaluation, but face a trade-off between geometric controllability and realism. We propose CardioComposer: a programmable, inference-time framework for generating multi-class anatomical label maps from interpretable ellipsoidal primitives. These primitives represent geometric attributes such as the size, shape, and position of discrete substructures. We specifically develop differentiable measurement functions based on voxel-wise geometric moments, enabling loss-based gradient guidance during diffusion model sampling. We demonstrate that these losses can constrain individual geometric attributes in a disentangled manner and provide compositional control over multiple substructures. Finally, we show that our method is compatible with a broad range of anatomical systems containing non-convex substructures, spanning cardiac, vascular, and skeletal organs. We release our code at https://github.com/kkadry/CardioComposer.
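As a sketch of the kind of differentiable measurement function described above, the snippet below computes probability-weighted geometric moments (centroid and second central moments) of a soft label map with PyTorch. It is a hedged illustration under assumed conventions, not the authors' implementation.

```python
import torch

def soft_moments(prob: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor]:
    """prob: (D, H, W) voxel-wise probability of one substructure.
    Returns the probability-weighted centroid and covariance (second
    central moments), both differentiable with respect to prob."""
    grids = torch.meshgrid(
        *[torch.arange(s, dtype=prob.dtype) for s in prob.shape], indexing="ij"
    )
    coords = torch.stack([g.reshape(-1) for g in grids], dim=-1)  # (N, 3)
    w = prob.reshape(-1)
    w = w / w.sum()
    centroid = (w[:, None] * coords).sum(dim=0)  # position of the substructure
    centered = coords - centroid
    cov = (w[:, None, None] * centered[:, :, None] * centered[:, None, :]).sum(dim=0)
    return centroid, cov  # cov eigenvalues encode ellipsoid size and shape

# A guidance loss could then penalize deviation from a target primitive,
# e.g. loss = ((centroid - target_position) ** 2).sum(), and its gradient
# can steer the diffusion sampler.
```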
Co-authors: Karim Kadry, Shoaib A. Goraya, Ajay Manicka, Abdalla Abdelwahed, Naravich Chutisilp, Farhad R. Nezami, Elazer R Edelman
Large language models (LLMs) can be prompted with specific styles (e.g., formatting responses as lists), including in malicious queries. Prior jailbreak research mainly augments these queries with additional string transformations to maximize attack success rate (ASR). However, the impact of style patterns in the original queries that are semantically irrelevant to the malicious intent remains unclear. In this work, we seek to understand whether style patterns compromise LLM safety, how superficial style alignment increases model vulnerability, and how best to mitigate these risks during alignment. We first define ASR inflation as the increase in ASR due to style patterns in existing jailbreak benchmark queries. By evaluating LLMs across seven benchmarks, we find that nearly all models exhibit ASR inflation. Notably, the inflation correlates with an LLM's relative attention to style patterns, which also overlap more with its instruction-tuning data when inflation occurs. We then investigate superficial style alignment, and find that fine-tuning with specific styles makes LLMs more vulnerable to jailbreaks of those same styles. Finally, we propose SafeStyle, a defense strategy that incorporates a small amount of safety training data augmented to match the distribution of style patterns in the fine-tuning data. Across three LLMs, six fine-tuning style settings, and two real-world instruction-tuning datasets, SafeStyle consistently outperforms baselines in maintaining LLM safety.
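A minimal sketch of the ASR-inflation definition, assuming a benchmark judge function `is_harmful` (a placeholder name, not from the paper):

```python
def attack_success_rate(responses, is_harmful) -> float:
    """Fraction of model responses judged harmful by the benchmark judge."""
    return sum(is_harmful(r) for r in responses) / len(responses)

def asr_inflation(styled_responses, plain_responses, is_harmful) -> float:
    """ASR with style patterns present minus ASR with styles removed
    from the otherwise identical malicious queries."""
    return (attack_success_rate(styled_responses, is_harmful)
            - attack_success_rate(plain_responses, is_harmful))
```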
Co-authors: Yuxin Xiao, Sana Tonekaboni, Walter Gerych, Vinith Menon Suriyakumar, Marzyeh Ghassemi Learn more
Large language models (LLMs) often produce confident yet incorrect responses, and uncertainty quantification offers one path toward more robust use. Recent works routinely rely on self-consistency to estimate aleatoric uncertainty (AU), yet this proxy collapses when models are overconfident and produce the same incorrect answer across samples. We analyze this regime and show that cross-model semantic disagreement is higher on incorrect answers precisely when AU is low. Motivated by this, we introduce an epistemic uncertainty (EU) term that operates in the black-box access setting: EU uses only generated text from a small, scale-matched ensemble and is computed as the gap between inter-model and intra-model sequence-semantic similarity. We then define total uncertainty (TU) as the sum of AU and EU. In a comprehensive study across five 7-9B instruction-tuned models and ten long-form tasks, TU improves ranking calibration and selective abstention relative to AU, and EU reliably flags confident failures where AU is low. We further characterize when EU is most useful via agreement and complementarity diagnostics.
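The EU term can be sketched in a few lines: given sampled answers from a small ensemble, EU is the gap between average intra-model and inter-model semantic similarity, and TU is then simply AU + EU. The `embed` encoder and the cosine-similarity choice below are assumptions for illustration.

```python
import itertools
import numpy as np

def _mean_pairwise_sim(vecs_a, vecs_b=None) -> float:
    """Mean cosine similarity over pairs (within one set, or across two)."""
    if vecs_b is None:
        pairs = itertools.combinations(vecs_a, 2)  # needs >= 2 samples per model
    else:
        pairs = itertools.product(vecs_a, vecs_b)
    sims = [float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
            for a, b in pairs]
    return float(np.mean(sims))

def epistemic_uncertainty(samples_per_model, embed) -> float:
    """samples_per_model: one list of answer strings per ensemble model."""
    embs = [[embed(s) for s in samples] for samples in samples_per_model]
    intra = np.mean([_mean_pairwise_sim(e) for e in embs])
    inter = np.mean([_mean_pairwise_sim(ea, eb)
                     for ea, eb in itertools.combinations(embs, 2)])
    return float(intra - inter)  # low inter-model agreement => high EU
```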
Co-authors: Kimia Hamidieh, Veronika Thost, Walter Gerych, Mikhail Yurochkin, Marzyeh Ghassemi
Vision-language models (VLMs), including CLIP, are known to encode biases, such as spurious correlations that falsely associate background attributes with particular labels. Debiasing approaches typically aim to isolate and remove subspaces corresponding to a target concept by projecting embeddings away from the concept. This strategy succeeds in debiasing VLM embeddings with respect to the concepts considered but can amplify biased shortcuts in unconsidered concepts. In practice, it is impossible to enumerate all possible biases, meaning that an increase in bias can go unobserved during evaluation. We propose a debiasing approach for a set of known concepts such that the relationship to the remaining, unconsidered concepts is minimally changed. We achieve this by rotating the VLM's embeddings within only a relevant subspace, rather than removing these subspaces, which mitigates unintended bias amplification.
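To make the geometric idea concrete, the sketch below contrasts classic projection-based removal with a rotation applied only inside the concept subspace; the orthonormal basis B and rotation R are hypothetical stand-ins for what such a method would learn.

```python
import numpy as np

def project_away(x: np.ndarray, B: np.ndarray) -> np.ndarray:
    """Classic debiasing: remove the concept subspace entirely.
    x: (d,) embedding; B: (d, k) orthonormal basis of the concept subspace."""
    return x - B @ (B.T @ x)

def rotate_within_subspace(x: np.ndarray, B: np.ndarray, R: np.ndarray) -> np.ndarray:
    """Rotate only the in-subspace coordinates with R (k x k); the
    orthogonal complement, carrying all unconsidered concepts, is
    left untouched."""
    coords = B.T @ x                          # coordinates in the concept subspace
    return x - B @ coords + B @ (R @ coords)  # replace with rotated component
```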
Co-authors: Walter Gerych, Cassandra Parent, Quinn Perian, Rafiya Javed, Justin Solomon, Marzyeh Ghassemi
Methods
We developed a deep learning model to Predict changes in left ventricULar Systolic function from Electrocardiograms (ECG) of patients who have Heart Failure (PULSE-HF). The model integrates 12-lead ECG waveforms with a patient's history of prior LVEF measurements to calculate the likelihood that the LVEF will be less than 40% during the year after the ECG is obtained. The model was retrospectively developed and tested using data from one hospital and externally validated on retrospective cohorts from two other hospitals. The internal development data were collected between January 1, 2000, and June 30, 2021. The external validation datasets were collected between January 1, 2000, and June 30, 2021 at one hospital and between 2008 and 2019 at the other.
Findings
PULSE-HF demonstrates strong discriminatory ability in forecasting whether the LVEF will fall below 40% within the next year, achieving areas under the receiver operating characteristic curve (AUROC) of 87.5–91.4% across all three HF cohorts. Among patients with HF who have a baseline LVEF above 40%, PULSE-HF effectively identified those at risk of worsening LVEF, with AUROCs of 81.6–86.3% across all three datasets. PULSE-HF's discriminatory ability remained consistently high across a range of subgroups with different comorbidities and regardless of medical therapy. Assuming an underlying prevalence of LVEF worsening of 10% per year and a sensitivity of 80%, PULSE-HF's negative predictive values exceed 97%. Lastly, we demonstrate that a lead I version of PULSE-HF performs similarly to the model that uses all 12 ECG leads.
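The NPV claim can be sanity-checked with a short calculation: at 10% prevalence and 80% sensitivity, NPV depends only on specificity. The specificity used below is an illustrative assumption, not a figure from the study.

```python
def npv(prevalence: float, sensitivity: float, specificity: float) -> float:
    """Negative predictive value from the standard 2x2 identities."""
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    return true_neg / (true_neg + false_neg)

print(npv(0.10, 0.80, 0.75))  # ~0.971, consistent with NPV > 97%
```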
Interpretation
PULSE-HF robustly predicts worsening LVEF in patients who have a prior diagnosis of HF. The method provides a platform for identifying patients at increased risk of worsening systolic dysfunction.
Co-authors: Teya Bergamaschi, Tiffany Yau, Payal Chandak, Abena Kyereme-Tuah, Judy Hung, Hanna Gaggin, Isaac S Kohane, Collin M Stultz
Background
Physical restraints are widely used in intensive care units (ICUs) despite uncertain clinical benefit and risks. We aimed to characterise patterns of restraint use, demographic and clinical predictors, and temporal trends before and after introduction of federal restraint-related reporting requirements.
Methods
We conducted a retrospective cross-sectional study of 51,838 adults admitted to ICUs at Beth Israel Deaconess Medical Center, Boston, MA, USA, between 2008 and 2022, using data from the Medical Information Mart for Intensive Care IV (MIMIC-IV) electronic health record repository. Primary outcome was the proportion of ICU days with documented physical restraint use. Associations between restraint use and demographic and clinical factors were estimated using a binomial generalised linear model with a logit link. Propensity score matching compared Black and White patients under varying adjustment specifications.
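As a hedged sketch of the primary analysis, the snippet below fits a binomial GLM with a logit link to per-patient (restrained days, unrestrained days) counts using statsmodels; the covariates and data are placeholders, not the study's variables.

```python
import pandas as pd
import statsmodels.api as sm

# Toy per-patient data: days restrained out of total ICU days.
df = pd.DataFrame({
    "restrained_days": [2, 0, 5],
    "icu_days": [4, 3, 7],
    "age": [71, 58, 64],
    "male": [1, 0, 1],
})
X = sm.add_constant(df[["age", "male"]])
# Binomial endog as (successes, failures): the proportion of ICU days
# with documented restraint, per patient.
y = df[["restrained_days"]].assign(
    unrestrained_days=df["icu_days"] - df["restrained_days"]
)
model = sm.GLM(y, X, family=sm.families.Binomial()).fit()  # logit link by default
print(model.summary())
```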
Findings
Among 51,838 patients (mean age 63.8 years; 57% male), 21,091 (40.7%) experienced physical restraint. Use increased from 36.9% in 2008–10 to 44.0% in 2020–22 (p < 0.0001). Asian (aOR 0.84, 95% CI 0.79–0.89) and Hispanic/Latino patients (aOR 0.87, 95% CI 0.83–0.92) had lower odds of restraint than White patients. Propensity score matching between Black and White patients revealed that racial patterns were highly sensitive to model specification: excluding demographic characteristics revealed significant disparities, which were attenuated when psychiatric diagnoses were also excluded. Matched White patients were not representative of all White ICU patients but rather a subset resembling Black patients on observed characteristics.
Contributors: Maximin Lange, Leo A. Celi, Ben Carter, Jesse D. Raffa, Sharon C. O'Donoghue, Marzyeh Ghassemi, Tom J. Pollard
Access to accurate predictions of patients’ outcomes can enhance decision making within healthcare institutions. Hartford HealthCare has been collaborating with academics and consultants to predict short- and medium-term outcomes for all inpatients across their seven hospitals. We develop machine learning models that predict the probabilities of next 24-hour/48-hour discharge and intensive care unit transfers, end-of-stay mortality, and discharge dispositions. All models achieve a high out-of-sample area under the receiver operating characteristic curve (AUROC; 75.7%–92.5%) and are well calibrated. In addition, combining 48-hour discharge predictions with doctors’ predictions simultaneously enables more patient discharges (10%–28.7%) and fewer 7-day/30-day readmissions (p < 0.001). We implement an automated pipeline that extracts data and updates predictions every morning, as well as user-friendly software and a color-coded alert system to communicate these patient-level predictions to clinical teams. Since its deployment, more than 200 doctors, nurses, and case managers across seven hospitals have been using the tool in their daily patient review process. With our tool, we find that doctors start the administrative discharge process earlier, leading to a significant reduction in the average length of stay (0.63 days per patient). We anticipate substantial financial benefits (between $52 and $67 million annually) for the healthcare system.
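For readers unfamiliar with the reported metrics, here is a brief sketch of how out-of-sample discrimination (AUROC) and calibration are typically assessed with scikit-learn; the data are illustrative.

```python
from sklearn.metrics import roc_auc_score
from sklearn.calibration import calibration_curve

y_true = [0, 1, 0, 1, 1, 0, 0, 1]                # e.g., 48-hour discharge labels
y_prob = [0.2, 0.8, 0.3, 0.6, 0.9, 0.1, 0.4, 0.7]  # model-predicted probabilities

auroc = roc_auc_score(y_true, y_prob)
frac_pos, mean_pred = calibration_curve(y_true, y_prob, n_bins=2)
print(f"AUROC = {auroc:.3f}")
print(list(zip(mean_pred, frac_pos)))  # well calibrated if the columns track closely
```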
Contributors: Liangyuan Na, Kimberly Villalobos Carballo, Jean Pauphilet, Ali Haddad-Sisakht, Daniel Kombert, Melissa Boisjoli-Langlois, Andrew Castiglione, Maram Khalifa, Pooja Hebbal, Barry Stein, Dimitris Bertsimas
Cancer patients often lack timely education and personalized support due to clinician workload. This quality improvement study develops and evaluates a Large Language Model (LLM) agent, MedEduChat, which is integrated with the clinic’s electronic health records (EHR) and designed to enhance prostate cancer patient education. Fifteen non-metastatic prostate cancer patients and three clinicians recruited from the Mayo Clinic interacted with the agent between May 2024 and April 2025. Findings showed that MedEduChat achieved a high usability score (UMUX = 83.7/100) and improved patients’ health confidence (Health Confidence Score rose from 9.9 to 13.9). Clinicians evaluated the patient-chat interaction history and rated MedEduChat as highly correct (2.9/3), complete (2.7/3), and safe (2.7/3), with moderate personalization (2.3/3). This study highlights the potential of LLM agents to improve patient engagement and health education.
Contributors: Yuexing Hao, Jason Holmes, Mark R. Waddle, Brian J. Davis, Nathan Y. Yu, Kristin S. Vickers, Heather Preston, Drew Margolin, Corinna E. Löckenhoff, Aditya Vashistha, Saleh Kalantari, Marzyeh Ghassemi, and Wei Liu
For an LLM to correctly respond to an instruction, it must understand both the semantics and the domain (i.e., subject area) of a given task-instruction pair. However, syntax can also convey implicit information. Recent work shows that syntactic templates, i.e., frequent sequences of part-of-speech (PoS) tags, are prevalent in training data and often appear in model outputs. In this work we characterize syntactic templates, domain, and semantics in task-instruction pairs. We identify cases of spurious correlations between syntax and domain, where models learn to associate a domain with syntax during training; this can sometimes override prompt semantics. Using a synthetic training dataset, we find that the syntactic-domain correlation can lower performance (mean 0.51 ± 0.06) on entity knowledge tasks in OLMo-2 models (1B-13B). We introduce an evaluation framework to detect this phenomenon in trained models and show that it occurs on a subset of the FlanV2 dataset in both open (OLMo-2-7B; Llama-4-Maverick) and closed (GPT-4o) models. Finally, we present a case study on the implications for LLM security, showing that unintended syntactic-domain correlations can be used to bypass refusals in OLMo-2-7B Instruct and GPT-4o. Our findings highlight two needs: (1) to explicitly test for syntactic-domain correlations, and (2) to ensure syntactic diversity in training data, specifically within domains, to prevent such spurious correlations.
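As an illustrative sketch, syntactic templates of the kind described above can be surfaced by counting frequent PoS n-grams; the template length and the NLTK resources below are assumptions, not the paper's settings.

```python
from collections import Counter
import nltk

# Tokenizer and tagger models; newer NLTK versions may instead require
# "punkt_tab" and "averaged_perceptron_tagger_eng".
nltk.download("punkt", quiet=True)
nltk.download("averaged_perceptron_tagger", quiet=True)

def pos_ngrams(texts, n=4):
    """Count PoS-tag n-grams across a corpus; frequent ones are templates."""
    counts = Counter()
    for text in texts:
        tags = [tag for _, tag in nltk.pos_tag(nltk.word_tokenize(text))]
        counts.update(tuple(tags[i:i + n]) for i in range(len(tags) - n + 1))
    return counts

corpus = ["List three causes of inflation.", "List three symptoms of flu."]
print(pos_ngrams(corpus).most_common(3))  # templates shared across domains
```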
Contributors: Chantal Shaib, Vinith Suriyakumar, Byron Wallace, Marzyeh Ghassemi