crossNN is an explainable framework for cross-platform DNA methylation-based classification of tumors.

Published in Nature Cancer by Dongsheng Yuan, Robin Jugas, Petra Pokorna, Jaroslav Sterba, Ondrej Slaby, Simone Schmid, Christin Siewert, Brendan Osberg, David Capper, Skarphedinn Halldorsson, Einar O Vik-Mo, Pia S Zeiner, Katharina J Weber, Patrick N Harter, Christian Thomas, Anne Albers, Markus Rechsteiner, Regina Reimann, Anton Appelt, Ulrich Schüller, Nabil Jabareen, Sebastian Mackowiak, Naveed Ishaque, Roland Eils, Sören Lukassen, Philipp Euskirchen

TLDR

  • This study presents a novel machine learning framework, crossNN, that can accurately classify brain tumors and other types of cancer using sparse methylomes from different platforms and sequencing depths.
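  • crossNN outperforms other deep and conventional machine learning models in accuracy and computational requirements, and reaches 99.1% (brain tumor model) and 97.8% (pan-cancer model) precision in a validation set of more than 5,000 tumors.
  • Because the framework is not tied to a fixed methylation feature space, it could extend methylation-based tumor diagnostics from microarrays to sequencing platforms such as nanopore and targeted bisulfite sequencing.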

Abstract

DNA methylation-based classification of (brain) tumors has emerged as a powerful and indispensable diagnostic technique. Initial implementations used methylation microarrays for data generation, while most current classifiers rely on a fixed methylation feature space. This makes them incompatible with other platforms, especially different flavors of DNA sequencing. Here, we describe crossNN, a neural network-based machine learning framework that can accurately classify tumors using sparse methylomes obtained on different platforms and with different epigenome coverage and sequencing depth. It outperforms other deep and conventional machine learning models regarding accuracy and computational requirements while still being explainable. We use crossNN to train a pan-cancer classifier that can discriminate more than 170 tumor types across all organ sites. Validation in more than 5,000 tumors profiled on different platforms, including nanopore and targeted bisulfite sequencing, demonstrates its robustness and scalability with 99.1% and 97.8% precision for the brain tumor and pan-cancer models, respectively.
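The core problem named in the abstract, fitting a sparse methylome from any platform into a classifier's feature space, can be illustrated with a minimal sketch. It assumes a ternary encoding (+1 methylated, -1 unmethylated, 0 not covered) and a 0.5 binarization threshold; both are illustrative assumptions, as are the function name and the CpG identifiers, and none of them is taken from the abstract.

```python
from typing import Dict, List


def encode_sparse_methylome(
    calls: Dict[str, float],
    feature_ids: List[str],
    threshold: float = 0.5,  # assumed binarization cutoff, not from the paper
) -> List[int]:
    """Map per-site methylation fractions onto a fixed feature order.

    +1 = methylated, -1 = unmethylated, 0 = site not covered by this sample.
    """
    encoded = []
    for site in feature_ids:
        beta = calls.get(site)
        if beta is None:
            encoded.append(0)   # missing site: carries no information
        elif beta > threshold:
            encoded.append(1)
        else:
            encoded.append(-1)
    return encoded


# Hypothetical feature space and a low-coverage sample covering 2 of 4 sites.
feature_ids = ["cg0001", "cg0002", "cg0003", "cg0004"]
sparse_calls = {"cg0002": 0.9, "cg0004": 0.1}

vector = encode_sparse_methylome(sparse_calls, feature_ids)
print(vector)  # [0, 1, 0, -1]
```

With an encoding of this kind, samples from platforms with different coverage all yield vectors of the same length, and uncovered sites are simply neutral.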

Overview

  • This study focuses on developing a machine learning framework called crossNN, which can accurately classify brain tumors using sparse methylomes obtained from different platforms and sequencing depths.
  • The framework is designed to overcome the limitations of fixed methylation feature spaces and improve compatibility with different DNA sequencing platforms.
  • The study also trains a pan-cancer classifier that discriminates more than 170 tumor types across all organ sites; the framework itself is designed to be accurate, computationally efficient, and explainable.
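One way a classifier can tolerate missing features and still be explainable is sketched below. It assumes a single linear layer followed by a softmax, with inputs encoded as +1 (methylated), -1 (unmethylated), and 0 (not covered). The summary does not specify the network architecture, so this is an illustration of the general idea rather than the authors' model, and the weights are toy values.

```python
import math
from typing import List


def class_scores(x: List[int], weights: List[List[float]]) -> List[float]:
    """One linear score per class; features encoded as 0 add nothing."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def softmax(scores: List[float]) -> List[float]:
    """Convert scores to probabilities (shifted by the max for stability)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def feature_contributions(x: List[int], class_weights: List[float]) -> List[float]:
    """Per-feature share of a class score, readable directly from the weights."""
    return [w * xi for w, xi in zip(class_weights, x)]


# Toy weights: 2 hypothetical classes x 4 hypothetical CpG features.
toy_weights = [
    [0.5, 1.0, -0.5, -1.0],
    [-0.5, -1.0, 0.5, 1.0],
]
sample = [0, 1, 0, -1]  # sites 1 and 3 are not covered in this sample

scores = class_scores(sample, toy_weights)
probabilities = softmax(scores)
predicted = probabilities.index(max(probabilities))
contributions = feature_contributions(sample, toy_weights[predicted])
print(predicted, contributions)
```

In a linear model like this, the contribution list shows which covered sites drove the prediction, and uncovered sites contribute exactly zero.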

Comparative Analysis & Findings

  • The study shows that crossNN outperforms other deep and conventional machine learning models regarding accuracy and computational requirements.
  • crossNN is validated in more than 5,000 tumors profiled on different platforms, including nanopore and targeted bisulfite sequencing.
  • The pan-cancer model achieves 97.8% precision, while the brain tumor model achieves 99.1% precision, demonstrating the robustness and scalability of crossNN.
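For readers less familiar with the metric, precision is the fraction of predictions that are correct. The sketch below computes it over predictions whose confidence passes a cutoff; the cutoff, the function name, and the records are hypothetical and do not reproduce the study's evaluation protocol or data.

```python
from typing import List, Optional, Tuple

Record = Tuple[str, str, float]  # (true label, predicted label, confidence)


def precision_above_cutoff(records: List[Record], cutoff: float) -> Optional[float]:
    """Fraction of correct predictions among those with confidence >= cutoff.

    Returns None when no prediction passes the cutoff.
    """
    called = [(truth, pred) for truth, pred, conf in records if conf >= cutoff]
    if not called:
        return None
    correct = sum(1 for truth, pred in called if truth == pred)
    return correct / len(called)


# Made-up example records, for illustration only.
records = [
    ("glioblastoma", "glioblastoma", 0.95),
    ("meningioma", "meningioma", 0.80),
    ("ependymoma", "meningioma", 0.30),
    ("medulloblastoma", "medulloblastoma", 0.60),
]

print(precision_above_cutoff(records, 0.5))  # 1.0
print(precision_above_cutoff(records, 0.0))  # 0.75
```

Raising the cutoff trades the number of samples that receive a call against the precision of those calls.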

Implications and Future Directions

  • The results support methylation-based diagnostics for brain tumors and other cancer types that are not restricted to a single profiling platform.
  • Future studies can further evaluate the performance of crossNN in clinical settings and explore its potential applications in precision medicine.
  • Additionally, future research can investigate the potential of crossNN for identifying specific biomarkers and improving the accuracy of cancer diagnosis and treatment.