Validation of an Artificial Intelligence-Based Prediction Model Using 5 External PET/CT Datasets of Diffuse Large B-Cell Lymphoma.

In Journal of Nuclear Medicine: official publication, Society of Nuclear Medicine, by Maria C Ferrández, Sandeep S V Golla, Jakoba J Eertink, Sanne E Wiegers, Gerben J C Zwezerijnen, Martijn W Heymans, Pieternella J Lugtenburg, Lars Kurch, Andreas Hüttmann, Christine Hanoun, Ulrich Dührsen, Sally F Barrington, N George Mikhaeel, Luca Ceriani, Emanuele Zucca, Sándor Czibor, Tamás Györke, Martine E D Chamuleau, Josée M Zijlstra, Ronald Boellaard

TLDR

  • This study externally validated a deep learning model that predicts treatment outcome in diffuse large B-cell lymphoma. The model was trained on maximum-intensity projections from PET/CT scans and compared with the international prognostic index (IPI) and 2 models incorporating radiomic PET/CT features. On the external data, the deep learning model outperformed the IPI (AUC, 0.66 vs. 0.60), but both radiomic models reached higher AUCs (0.69 and 0.71). The deep learning model therefore predicts treatment outcome without tumor delineation, at the cost of a lower prognostic performance than with radiomics.

Abstract

The aim of this study was to validate a previously developed deep learning model in 5 independent clinical trials. The predictive performance of this model was compared with the international prognostic index (IPI) and 2 models incorporating radiomic PET/CT features (clinical PET and PET models).

In total, 1,132 diffuse large B-cell lymphoma patients were included: 296 for training and 836 for external validation. The primary outcome was 2-y time to progression. The deep learning model was trained on maximum-intensity projections from PET/CT scans. The clinical PET model included metabolic tumor volume, maximum distance from the bulkiest lesion to another lesion, SUV, age, and performance status. The PET model included metabolic tumor volume, maximum distance from the bulkiest lesion to another lesion, and SUV. Model performance was assessed using the area under the curve (AUC) and Kaplan-Meier curves.

The IPI yielded an AUC of 0.60 on all external data. The deep learning model yielded a significantly higher AUC of 0.66 (P < 0.01). For each individual clinical trial, the deep learning model was consistently better than the IPI. Radiomic model AUCs remained higher for all clinical trials. The deep learning and clinical PET models showed equivalent performance (AUC, 0.69; P > 0.05). The PET model yielded the highest AUC of all models (AUC, 0.71; P < 0.05).

The deep learning model predicted outcome in all trials with a higher performance than the IPI and better survival curve separation. This model can predict treatment outcome in diffuse large B-cell lymphoma without tumor delineation but at the cost of a lower prognostic performance than with radiomics.
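To make the model input more concrete, the following is a minimal sketch of a maximum-intensity projection: a 3D volume is collapsed to a 2D image by keeping the highest voxel value along one axis. It is an illustration only, not the authors' preprocessing pipeline. The function name `max_intensity_projection`, the `[z][y][x]` indexing, and the toy volume are all assumptions made for this example; the projection directions and any intensity scaling used in the study are not stated in this summary.

```python
def max_intensity_projection(volume, axis):
    """Collapse a 3D volume to 2D by taking the maximum along one axis.

    volume: nested lists indexed as volume[z][y][x] (assumed layout).
    axis:   0 collapses z, 1 collapses y, 2 collapses x.
    """
    nz, ny, nx = len(volume), len(volume[0]), len(volume[0][0])
    if axis == 0:
        # Result is indexed [y][x]
        return [[max(volume[z][y][x] for z in range(nz)) for x in range(nx)]
                for y in range(ny)]
    if axis == 1:
        # Result is indexed [z][x]
        return [[max(volume[z][y][x] for y in range(ny)) for x in range(nx)]
                for z in range(nz)]
    if axis == 2:
        # Result is indexed [z][y]
        return [[max(volume[z][y][x] for x in range(nx)) for y in range(ny)]
                for z in range(nz)]
    raise ValueError("axis must be 0, 1, or 2")


# Hypothetical 2x2x2 toy volume (not real PET data)
toy_volume = [
    [[0, 1], [2, 3]],
    [[4, 0], [1, 5]],
]
mip_over_y = max_intensity_projection(toy_volume, axis=1)
mip_over_x = max_intensity_projection(toy_volume, axis=2)
```

Because each projection keeps only the brightest voxel along the collapsed direction, high-uptake lesions remain visible in the 2D image without anyone having to outline them, which is the property the abstract refers to as prediction without tumor delineation.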

Overview

  • The study aimed to validate a previously developed deep learning model in 5 independent clinical trials for predicting treatment outcome in diffuse large B-cell lymphoma. The model was trained on maximum-intensity projections from PET/CT scans and compared with the international prognostic index (IPI) and 2 models incorporating radiomic PET/CT features. The primary outcome was 2-y time to progression. The study included 1,132 diffuse large B-cell lymphoma patients: 296 for training and 836 for external validation. On all external data, the deep learning model outperformed the IPI, with a significantly higher area under the curve (AUC) of 0.66 compared to the IPI's AUC of 0.60, and it was consistently better than the IPI in each individual clinical trial. The radiomic models performed better still: the clinical PET model reached an AUC of 0.69 and the PET model yielded the highest AUC of all models (0.71). The deep learning model predicted treatment outcome without tumor delineation but at the cost of a lower prognostic performance than with radiomics.

Comparative Analysis & Findings

  • The deep learning model outperformed the IPI in predicting treatment outcome, with a significantly higher AUC of 0.66 compared to the IPI's AUC of 0.60 (P < 0.01), and it was consistently better than the IPI in each individual clinical trial. It did not outperform the radiomic models, whose AUCs remained higher for all clinical trials. Its performance was statistically equivalent to that of the clinical PET model (AUC, 0.69; P > 0.05), and the PET model yielded the highest AUC of all models (AUC, 0.71; P < 0.05). The deep learning model predicted treatment outcome without tumor delineation but at the cost of a lower prognostic performance than with radiomics.
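As a reading aid for the AUC values above, here is a minimal sketch of how an AUC can be computed from predicted risk scores and binary outcomes. It uses the rank interpretation of the AUC: the probability that a randomly chosen patient with an event receives a higher score than a randomly chosen patient without one. The function name `auc_from_scores` and the toy data are hypothetical, and the sketch assumes a fully observed binary 2-y outcome; how the study handled censoring or tested differences between AUCs is not described in this summary.

```python
def auc_from_scores(scores, events):
    """AUC via pairwise comparison of event vs. non-event patients.

    scores: predicted risk per patient (higher means higher predicted risk).
    events: 1 if the patient progressed within the window, else 0.
    Ties between an event and a non-event patient count as half a win.
    """
    pos = [s for s, e in zip(scores, events) if e == 1]
    neg = [s for s, e in zip(scores, events) if e == 0]
    if not pos or not neg:
        raise ValueError("need at least one patient in each outcome group")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (not from the study)
toy_scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
toy_events = [1, 0, 1, 0, 0, 0]
toy_auc = auc_from_scores(toy_scores, toy_events)
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, so the reported values of 0.60 (IPI), 0.66 (deep learning), 0.69 (clinical PET), and 0.71 (PET) all sit in the range of modest discrimination.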

Implications and Future Directions

  • The study highlights the potential of deep learning models in predicting treatment outcome in diffuse large B-cell lymphoma without tumor delineation. However, the lower prognostic performance of the deep learning model compared to radiomic models suggests that incorporating radiomic features may improve the model's performance. Future research should explore the potential of combining deep learning models with radiomics to improve prognostication in diffuse large B-cell lymphoma.