Data Consistent Deep Rigid MRI Motion Correction

Motion artifacts are a pervasive problem in MRI, leading to misdiagnosis or mischaracterization in population-level imaging studies. Current retrospective rigid intra-slice motion correction techniques jointly optimize estimates of the image and the motion parameters. In this paper, we use a deep network to reduce the joint image-motion parameter search to a search over rigid motion parameters alone. Our network produces a reconstruction as a function of two inputs: corrupted k-space data and motion parameters. We train the network using simulated, motion-corrupted k-space data generated from known motion parameters. At test-time, we estimate unknown motion parameters by minimizing a data consistency loss between the motion parameters, the network-based image reconstruction given those parameters, and the acquired measurements. Intra-slice motion correction experiments on simulated and realistic 2D fast spin echo brain MRI achieve high reconstruction fidelity while retaining the benefits of explicit data consistency-based optimization.
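The test-time search described above can be illustrated with a one-dimensional toy problem. This is a minimal sketch under loud assumptions, not the authors' method: the motion is a single translation affecting half of the k-space samples, and the hypothetical `reconstruct` function stands in for the trained network by undoing the assumed motion and enforcing a simple prior (real, non-negative intensities). All names here (`forward`, `reconstruct`, `data_consistency_loss`, `estimate_shift`) are illustrative.

```python
import numpy as np

N = 64
K = np.fft.fftfreq(N) * N        # integer spatial frequencies in FFT order
SHOT2 = np.arange(N) >= N // 2   # k-space samples acquired after the subject moved


def forward(image, shift):
    """Simulate acquisition: SHOT2 samples see the image translated by `shift` pixels."""
    kspace = np.fft.fft(image)
    phase = np.where(SHOT2, np.exp(-2j * np.pi * K * shift / N), 1.0)
    return kspace * phase


def reconstruct(kspace, shift):
    """Stand-in for the network (assumption): undo the assumed motion, then
    keep only a real, non-negative image as a crude prior."""
    phase = np.where(SHOT2, np.exp(2j * np.pi * K * shift / N), 1.0)
    image = np.fft.ifft(kspace * phase)
    return np.clip(image.real, 0.0, None)


def data_consistency_loss(kspace, shift):
    """Distance between the measurements and the re-simulated reconstruction."""
    predicted = forward(reconstruct(kspace, shift), shift)
    return float(np.linalg.norm(predicted - kspace))


def estimate_shift(kspace, candidates):
    """Search over motion parameters only; the image follows from each candidate."""
    losses = [data_consistency_loss(kspace, s) for s in candidates]
    return candidates[int(np.argmin(losses))]


rng = np.random.default_rng(0)
true_image = np.zeros(N)
true_image[20:44] = rng.uniform(0.5, 1.0, size=24)
measured = forward(true_image, shift=3.0)

candidates = [float(s) for s in range(-5, 6)]
estimated = estimate_shift(measured, candidates)
print("estimated shift:", estimated)
```

The prior matters in this sketch: if `reconstruct` were an exact inverse of `forward`, every candidate would reproduce the measurements perfectly and the loss could not distinguish them.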

Contributors: Nalini M. Singh, Neel Dey, Malte Hoffmann, Bruce Fischl, Elfar Adalsteinsson, Robert Frost, Adrian V. Dalca

Sybil: a validated deep learning model to predict future lung cancer risk from a single low-dose chest computed tomography

Purpose: Low-dose computed tomography (LDCT) for lung cancer screening is effective, although most eligible people are not being screened. Tools that provide personalized future cancer risk assessment could focus approaches toward those most likely to benefit. We hypothesized that a deep learning model assessing the entire volumetric LDCT data could be built to predict individual risk without requiring additional demographic or clinical data.

Methods: We developed a model called Sybil using LDCTs from the National Lung Screening Trial (NLST). Sybil requires only one LDCT and does not require clinical data or radiologist annotations; it can run in real time in the background on a radiology reading station. Sybil was validated on three independent data sets: a heldout set of 6,282 LDCTs from NLST participants, 8,821 LDCTs from Massachusetts General Hospital (MGH), and 12,280 LDCTs from Chang Gung Memorial Hospital (CGMH, which included people with a range of smoking history including nonsmokers).

Results: Sybil achieved area under the receiver-operator curves for lung cancer prediction at 1 year of 0.92 (95% CI, 0.88 to 0.95) on NLST, 0.86 (95% CI, 0.82 to 0.90) on MGH, and 0.94 (95% CI, 0.91 to 1.00) on CGMH external validation sets. Concordance indices over 6 years were 0.75 (95% CI, 0.72 to 0.78), 0.81 (95% CI, 0.77 to 0.85), and 0.80 (95% CI, 0.75 to 0.86) for NLST, MGH, and CGMH, respectively.

Conclusion: Sybil can accurately predict an individual's future lung cancer risk from a single LDCT scan to further enable personalized screening. Future study is required to understand Sybil's clinical applications. Our model and annotations are publicly available.
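For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how an AUC and a concordance index can be computed from risk scores. It assumes Harrell's definition of the concordance index, which may differ from the exact variant used in the study, and the inputs are toy values chosen only for illustration.

```python
def auc(scores, labels):
    """Probability that a random positive case scores higher than a random
    negative case (ties count as half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def concordance_index(risks, times, events):
    """Harrell's C-index (assumed variant): among comparable pairs, the
    fraction where the subject with the earlier event has the higher risk."""
    concordant, comparable = 0.0, 0
    n = len(risks)
    for i in range(n):
        for j in range(n):
            # A pair is comparable only if subject i had an observed event
            # strictly before subject j's event or censoring time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    return concordant / comparable


# Toy data (illustrative only, not from the study).
risks = [0.9, 0.8, 0.3, 0.2]
labels = [1, 0, 1, 0]      # cancer within the evaluation window
times = [1, 4, 2, 6]       # years to event or censoring
events = [1, 0, 1, 0]      # 1 = cancer observed, 0 = censored

print(auc(risks, labels))                        # 0.75
print(concordance_index(risks, times, events))   # 0.8
```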

Contributors: Adam Yala, Ludvig Karstens, Justin Xiang, Angelo K. Takigami, Patrick P. Bourgouin, PuiYee Chan, Sofiane Mrah, Wael Amayri, Yu-Hsiang Juan, Cheng-Ta Yang, Yung-Liang Wan, Gigin Lin, Lecia V. Sequist, Florian J. Fintelmann

Monitoring gait at home with radio waves in Parkinson’s disease: A marker of severity, progression, and medication response

Parkinson’s disease (PD) is the fastest-growing neurological disease in the world. A key challenge in PD is tracking disease severity, progression, and medication response. Existing methods are semisubjective and require visiting the clinic. In this work, we demonstrate an effective approach for assessing PD severity, progression, and medication response at home, in an objective manner. We used a radio device located in the background of the home. The device detected and analyzed the radio waves that bounce off people’s bodies and inferred their movements and gait speed. We continuously monitored 50 participants, with and without PD, in their homes for up to 1 year. We collected over 200,000 gait speed measurements. Cross-sectional analysis of the data shows that at-home gait speed strongly correlates with gold-standard PD assessments, as evaluated by the Movement Disorder Society-Sponsored Revision of the Unified Parkinson’s Disease Rating Scale (MDS-UPDRS) part III subscore and total score. At-home gait speed also provides a more sensitive marker for tracking disease progression over time than the widely used MDS-UPDRS. Further, the monitored gait speed was able to capture symptom fluctuations in response to medications and their impact on patients’ daily functioning. Our study shows the feasibility of continuous, objective, sensitive, and passive assessment of PD at home and hence has the potential of improving clinical care and drug clinical trials.

Contributors: Yingcheng Liu, Guo Zhang, Christopher G Tarolli, Rumen Hristov, Stella Jensen-Roberts, Emma M Waddell, Taylor L Myers, Meghan E Pawlik, Julia M Soto, Renee M Wilson, Yuzhe Yang, Timothy Nordahl, Karlo J Lizarraga, Jamie L Adams, Ruth B Schneider, Karl Kieburtz, Terry Ellis, E Ray Dorsey

Integrated multimodal artificial intelligence framework for healthcare applications

Artificial intelligence (AI) systems hold great promise to improve healthcare over the next decades. Specifically, AI systems leveraging multiple data sources and input modalities are poised to become a viable method to deliver more accurate results and deployable pipelines across a wide range of applications. In this work, we propose and evaluate a unified Holistic AI in Medicine (HAIM) framework to facilitate the generation and testing of AI systems that leverage multimodal inputs. Our approach uses generalizable data pre-processing and machine learning modeling stages that can be readily adapted for research and deployment in healthcare environments. We evaluate our HAIM framework by training and characterizing 14,324 independent models based on HAIM-MIMIC-MM, a multimodal clinical database (N = 34,537 samples) containing 7279 unique hospitalizations and 6485 patients, spanning all possible input combinations of 4 data modalities (i.e., tabular, time-series, text, and images), 11 unique data sources and 12 predictive tasks. We show that this framework can consistently and robustly produce models that outperform similar single-source approaches across various healthcare demonstrations (by 6–33%), including 10 distinct chest pathology diagnoses, along with length-of-stay and 48 h mortality predictions. We also quantify the contribution of each modality and data source using Shapley values, which demonstrates the heterogeneity in data modality importance and the necessity of multimodal inputs across different healthcare-relevant tasks. The generalizable properties and flexibility of our Holistic AI in Medicine (HAIM) framework could offer a promising pathway for future multimodal predictive systems in clinical and operational healthcare settings.
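The Shapley-value attribution mentioned above can be sketched as follows. This is an illustrative computation, not the HAIM code: the `toy_performance` function is a hypothetical stand-in for "performance of a model trained on this subset of modalities", and its numbers are invented placeholders chosen so the result is easy to verify by hand.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley values: each player's marginal contribution, averaged
    over all orderings in which players can be added."""
    n = len(players)
    result = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = frozenset(subset)
                total += weight * (value(s | {p}) - value(s))
        result[p] = total
    return result


MODALITIES = ["tabular", "time_series", "text", "images"]

# Hypothetical per-modality gains over a 0.5 baseline (placeholders only).
GAINS = {"tabular": 0.10, "time_series": 0.05, "text": 0.08, "images": 0.12}


def toy_performance(subset):
    """Hypothetical score for a model using `subset` of modalities, with an
    extra 0.04 when text and images are combined."""
    score = 0.5 + sum(GAINS[m] for m in subset)
    if "text" in subset and "images" in subset:
        score += 0.04
    return score


values = shapley_values(MODALITIES, toy_performance)
for name, contribution in values.items():
    print(f"{name}: {contribution:.3f}")
```

In this toy setup the text-image synergy is split evenly between the two modalities, and the contributions sum to the gap between the all-modality score and the empty-set baseline.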

Contributors: Luis R. Soenksen, Yu Ma, Cynthia Zeng, Leonard Boussioux, Liangyuan Na, Holly M. Wiberg, Michael L. Li

Artificial intelligence-enabled detection and assessment of Parkinson’s disease using nocturnal breathing signals

There are currently no effective biomarkers for diagnosing Parkinson’s disease (PD) or tracking its progression. Here, we developed an artificial intelligence (AI) model to detect PD and track its progression from nocturnal breathing signals. The model was evaluated on a large dataset comprising 7,671 individuals, using data from several hospitals in the United States, as well as multiple public datasets. The AI model can detect PD with an area-under-the-curve of 0.90 and 0.85 on held-out and external test sets, respectively. The AI model can also estimate PD severity and progression in accordance with the Movement Disorder Society Unified Parkinson’s Disease Rating Scale (R = 0.94, P = 3.6 × 10⁻²⁵). The AI model uses an attention layer that allows for interpreting its predictions with respect to sleep and electroencephalogram. Moreover, the model can assess PD in the home setting in a touchless manner, by extracting breathing from radio waves that bounce off a person’s body during sleep. Our study demonstrates the feasibility of objective, noninvasive, at-home assessment of PD, and also provides initial evidence that this AI model may be useful for risk assessment before clinical diagnosis.

Contributors: Yuzhe Yang, Yuan Yuan, Guo Zhang, Hao Wang, Ying-Cong Chen, Yingcheng Liu, Christopher G. Tarolli, Daniel Crepeau, Jan Bukartyk, Mithri R. Junna, Aleksandar Videnovic, Terry D. Ellis, Melissa C. Lipford, Ray Dorsey

Multi-Institutional Validation of a Mammography-Based Breast Cancer Risk Model

Accurate risk assessment is essential for the success of population screening programs in breast cancer. Models with high sensitivity and specificity would enable programs to target more elaborate screening efforts to high-risk populations, while minimizing overtreatment for the rest. Artificial intelligence (AI)-based risk models have demonstrated a significant advance over risk models used today in clinical practice. However, the responsible deployment of novel AI requires careful validation across diverse populations. To this end, we validate our AI-based model, Mirai, across globally diverse screening populations.

Contributors: Fredrik Strand, Gigin Lin, Siddharth Satuluru, Thomas Kim, Imon Banerjee, Judy Gichoya, Hari Trivedi, Constance D. Lehman, Kevin Hughes, David J. Sheedy, Lisa M. Matthis, Bipin Karunakara, Karen E. Hegarty, Silvia Sabino, Thiago B. Silva, Maria C. Evangelista, Renato F. Caron, Bruno Souza, Edmundo C. Mauad, Tal Patalon, Sharon Handelman-Gotlib, Michal Guindy

Toward robust mammography-based models for breast cancer risk

Improved breast cancer risk models enable targeted screening strategies that achieve earlier detection and less screening harm than existing guidelines. To bring deep learning risk models to clinical practice, we need to further refine their accuracy, validate them across diverse populations, and demonstrate their potential to improve clinical workflows. We developed Mirai, a mammography-based deep learning model designed to predict risk at multiple timepoints, leverage potentially missing risk factor information, and produce predictions that are consistent across mammography machines. Mirai was trained on a large dataset from Massachusetts General Hospital (MGH) in the United States and tested on held-out test sets from MGH, Karolinska University Hospital in Sweden, and Chang Gung Memorial Hospital (CGMH) in Taiwan, obtaining C-indices of 0.76 (95% confidence interval, 0.74 to 0.80), 0.81 (0.79 to 0.82), and 0.79 (0.79 to 0.83), respectively. Mirai obtained significantly higher 5-year ROC AUCs than the Tyrer-Cuzick model (P < 0.001) and prior deep learning models Hybrid DL (P < 0.001) and Image-Only DL (P < 0.001), trained on the same dataset. Mirai more accurately identified high-risk patients than prior methods across all datasets. On the MGH test set, 41.5% (34.4 to 48.5) of patients who would develop cancer within 5 years were identified as high risk, compared with 36.1% (29.1 to 42.9) by Hybrid DL (P = 0.02) and 22.9% (15.9 to 29.6) by the Tyrer-Cuzick model (P < 0.001).

A Deep Learning Mammography-based Model for Improved Breast Cancer Risk Prediction

Background: Mammographic density improves the accuracy of breast cancer risk models. However, the use of breast density is limited by subjective assessment, variation across radiologists, and restricted data. A mammography-based deep learning (DL) model may provide more accurate risk prediction.

Purpose: To develop a mammography-based DL breast cancer risk model that is more accurate than established clinical breast cancer risk models.

Materials and Methods: This retrospective study included 88 994 consecutive screening mammograms in 39 571 women between January 1, 2009, and December 31, 2012. For each patient, all examinations were assigned to either training, validation, or test sets, resulting in 71 689, 8554, and 8751 examinations, respectively. Cancer outcomes were obtained through linkage to a regional tumor registry. By using risk factor information from patient questionnaires and electronic medical records review, three models were developed to assess breast cancer risk within 5 years: a risk-factor-based logistic regression model (RF-LR) that used traditional risk factors, a DL model (image-only DL) that used mammograms alone, and a hybrid DL model that used both traditional risk factors and mammograms. Comparisons were made to an established breast cancer risk model that included breast density (Tyrer-Cuzick model, version 8 [TC]). Model performance was compared by using areas under the receiver operating characteristic curve (AUCs) with DeLong test (P < .05).

Results: The test set included 3937 women, aged 56.20 years ± 10.04. Hybrid DL and image-only DL showed AUCs of 0.70 (95% confidence interval [CI]: 0.66, 0.75) and 0.68 (95% CI: 0.64, 0.73), respectively. RF-LR and TC showed AUCs of 0.67 (95% CI: 0.62, 0.72) and 0.62 (95% CI: 0.57, 0.66), respectively. Hybrid DL showed a significantly higher AUC (0.70) than TC (0.62; P < .001) and RF-LR (0.67; P = .01).

Conclusion: Deep learning models that use full-field mammograms yield substantially improved risk discrimination compared with the Tyrer-Cuzick (version 8) model.

Contributors: Constance Lehman, Tal Schuster, Tally Portnoi

Surgical Risk Is Not Linear: Derivation and Validation of a Novel, User-friendly, and Machine-learning-based Predictive OpTimal Trees in Emergency Surgery Risk (POTTER) Calculator

Introduction: Most risk assessment tools assume that the impact of risk factors is linear and cumulative. Using novel machine-learning techniques, we sought to design an interactive, nonlinear risk calculator for Emergency Surgery (ES).

Methods: All ES patients in the American College of Surgeons National Surgical Quality Improvement Program (ACS-NSQIP) 2007 to 2013 database were included (derivation cohort). Optimal Classification Trees (OCT) were leveraged to train machine-learning algorithms to predict postoperative mortality, morbidity, and 18 specific complications (eg, sepsis, surgical site infection). Unlike classic heuristics (eg, logistic regression), OCT is adaptive and reboots itself with each variable, thus accounting for nonlinear interactions among variables. An application [Predictive OpTimal Trees in Emergency Surgery Risk (POTTER)] was then designed as the algorithms’ interactive and user-friendly interface. POTTER performance was measured (c-statistic) using the 2014 ACS-NSQIP database (validation cohort) and compared with the American Society of Anesthesiologists (ASA), Emergency Surgery Score (ESS), and ACS-NSQIP calculators’ performance.

Results: Based on 382,960 ES patients, comprehensive decision-making algorithms were derived, and POTTER was created where the provider's answer to a question interactively dictates the subsequent question. For any specific patient, the number of questions needed to predict mortality ranged from 4 to 11. The mortality c-statistic was 0.9162, higher than ASA (0.8743), ESS (0.8910), and ACS (0.8975). The morbidity c-statistic was similarly the highest (0.8414).

Conclusion: POTTER is a highly accurate and user-friendly ES risk calculator with the potential to continuously improve accuracy with ongoing machine-learning. POTTER might prove useful as a tool for bedside preoperative counseling of ES patients and families.
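The idea that each answer dictates the next question can be sketched as a walk down a decision tree. The example below is purely illustrative and is not the POTTER model: the questions, the tree shape, and the leaf risks are hypothetical placeholders, and the tree is written by hand rather than learned with Optimal Classification Trees.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    risk: float  # predicted probability at the end of a path (placeholder value)


@dataclass
class Question:
    feature: str
    if_yes: "Node"
    if_no: "Node"


Node = Union[Leaf, Question]


def predict(node, answers):
    """Walk the tree: each answer selects the branch, and therefore the next
    question. Returns the leaf risk and the questions actually asked."""
    asked = []
    while isinstance(node, Question):
        asked.append(node.feature)
        node = node.if_yes if answers[node.feature] else node.if_no
    return node.risk, asked


# Hypothetical tree: the effect of age depends on which branch the patient
# is on, so risk factors do not simply add up.
TREE = Question(
    "septic_shock",
    if_yes=Question("age_over_80", Leaf(0.60), Leaf(0.30)),
    if_no=Question(
        "ventilator_dependent",
        if_yes=Question("age_over_80", Leaf(0.40), Leaf(0.20)),
        if_no=Leaf(0.02),
    ),
)

patient = {"septic_shock": False, "ventilator_dependent": False, "age_over_80": True}
risk, asked = predict(TREE, patient)
print(risk, asked)
```

For this hypothetical patient the age question is never asked, which mirrors how the number of questions can vary from one patient to the next.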

Contributors: Jack Dunn, George Velmahos, Haytham Kaafarani