Development and validation of a machine learning model based on laboratory parameters for preoperative prediction of Ki-67 expression in gliomas.

Last updated Mar 28, 2025, in Journal of Neurosurgery, by Jinlan Huang, Shoupeng Ding, Lijin Lin, Guiyang Zhong, Zhou Yu, Qingwen Luo, Dongmei Chen, Yazhi Chen, Shouzhao Zheng, Shihao Zheng

TLDR

A noninvasive machine learning model based on routine laboratory parameters was developed and validated to predict the Ki-67 proliferation index in patients with gliomas, potentially aiding in prognostic evaluation and clinical decision-making.

Abstract

Glioma is the most common form of brain tumor and has high mortality. The Ki-67 proliferation index, a vital marker of cell proliferation, has been demonstrated to predict tumor classification and prognosis. The aim of this study was to develop and validate a noninvasive model based on machine learning (ML) and routine laboratory parameters to preoperatively predict the level of Ki-67 in gliomas.

A total of 506 patients with pathological confirmation of glioma from 2 medical centers (January 2020 to December 2023) were retrospectively enrolled and divided into training (n = 352), internal validation (n = 88), and external validation (n = 66) cohorts. According to the Ki-67 proliferation index, patients were classified into low Ki-67 (index < 10%) and high Ki-67 (index ≥ 10%) groups. Laboratory parameters were obtained within 1 week before surgery from the Laboratory Information System. The potential features associated with Ki-67 levels were screened using extreme gradient boosting (XGBoost), support vector machine (SVM), and least absolute shrinkage and selection operator (LASSO). Then, 10 ML classifiers, including SVM, XGBoost, logistic regression (LR), random forest, adaptive boosting (AdaBoost), gradient boosting machine, partitioning around medoids, naive Bayes, neural network, and bagged classification and regression trees (CART), were trained. The performance of these models was evaluated on internal and external validation sets using the area under the receiver operating characteristic curve (AUC). Calibration curve, decision curve, and clinical impact curve analyses were used for validation.

Fifteen laboratory parameters that met the requirements of XGBoost, SVM, and LASSO were selected. Among all tested ML models, the LR model had superior performance with relatively high AUC, accuracy, sensitivity, and specificity. The LR model achieved AUCs of 0.838 in the training set, 0.800 (with the highest accuracy [0.782] and optimal sensitivity [0.845]) in the internal validation set, and 0.757 in the external validation set. Finally, the LR model was visualized as a nomogram based on the top 6 laboratory parameters (age, anion gap, apolipoprotein A-1, apolipoprotein B, calcium, creatinine) to individually predict the Ki-67 proliferation index in patients with gliomas.

The authors successfully constructed an LR model based on routine laboratory parameters, with relatively high sensitivity and specificity, to preoperatively predict the level of Ki-67 in patients with gliomas, which might be helpful for prognostic evaluation and clinical decision-making.
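
The evaluation metrics named in the abstract (AUC, accuracy, sensitivity, specificity) can be illustrated with a minimal, dependency-free Python sketch. This is not the authors' code: the function names, the 0.5 decision threshold, and the toy labels and scores are assumptions made purely for illustration. AUC is computed here as the fraction of (high Ki-67, low Ki-67) pairs in which the high Ki-67 case receives the larger score, with ties counted as one half.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a positive case outranks a negative one.

    labels: 1 for high Ki-67, 0 for low Ki-67 (illustrative encoding).
    scores: predicted probabilities or any monotone risk score.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))


def threshold_metrics(labels, scores, threshold=0.5):
    """Accuracy, sensitivity, and specificity at an assumed cutoff."""
    tp = fp = tn = fn = 0
    for y, s in zip(labels, scores):
        pred = 1 if s >= threshold else 0
        if y == 1 and pred == 1:
            tp += 1
        elif y == 1 and pred == 0:
            fn += 1
        elif y == 0 and pred == 1:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Toy data, invented for illustration only (not from the study).
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.6, 0.2]
toy_auc = roc_auc(toy_labels, toy_scores)
toy_metrics = threshold_metrics(toy_labels, toy_scores)
```

On the toy data, three of the four positive-negative pairs are ranked correctly, so the AUC is 0.75. AUC does not depend on a threshold, whereas accuracy, sensitivity, and specificity do, which is one reason a study may report all four.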

Overview

The study aimed to develop and validate a noninvasive model to preoperatively predict the Ki-67 proliferation index in gliomas using machine learning and routine laboratory parameters.
The study retrospectively enrolled 506 patients with pathologically confirmed glioma from 2 medical centers (January 2020 to December 2023) and divided them into training (n = 352), internal validation (n = 88), and external validation (n = 66) cohorts.
Patients were classified by Ki-67 proliferation index into low (< 10%) and high (≥ 10%) groups, and the model predicts this group from laboratory parameters obtained within 1 week before surgery, to aid prognostic evaluation and clinical decision-making.
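
As a small worked illustration of the study design, the sketch below encodes the Ki-67 cutoff and the cohort sizes reported in the abstract. The function and variable names are assumptions introduced here, and the code is not taken from the study. It only checks the arithmetic (the three cohorts sum to 506) and shows how a measured index would map to the low or high group.

```python
# Cohort sizes as reported in the abstract.
COHORTS = {
    "training": 352,
    "internal_validation": 88,
    "external_validation": 66,
}

# Cutoff reported in the abstract: low < 10%, high >= 10%.
KI67_CUTOFF_PERCENT = 10.0


def ki67_group(index_percent):
    """Map a Ki-67 proliferation index (in percent) to 'low' or 'high'."""
    if not 0.0 <= index_percent <= 100.0:
        raise ValueError("Ki-67 index must be a percentage between 0 and 100")
    return "high" if index_percent >= KI67_CUTOFF_PERCENT else "low"


def cohort_fractions(cohorts):
    """Return each cohort's share of the total enrolled patients."""
    total = sum(cohorts.values())
    return {name: n / total for name, n in cohorts.items()}


total_patients = sum(COHORTS.values())
fractions = cohort_fractions(COHORTS)
```

Note that an index of exactly 10% falls in the high group, because the abstract defines high Ki-67 as an index ≥ 10%.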

Comparative Analysis & Findings

The study used extreme gradient boosting (XGBoost), support vector machine (SVM), and least absolute shrinkage and selection operator (LASSO) to screen potential features associated with Ki-67 levels, which yielded 15 laboratory parameters.
Among the 10 machine learning classifiers tested, the logistic regression (LR) model performed best, with AUCs of 0.838 in the training set, 0.800 in the internal validation set (accuracy 0.782, sensitivity 0.845), and 0.757 in the external validation set.
The LR model was visualized as a nomogram based on the top 6 parameters: age, anion gap, apolipoprotein A-1, apolipoprotein B, calcium, and creatinine.
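
To show how a logistic regression over these six predictors turns measurements into a probability, here is a minimal Python sketch. The source does not report the fitted coefficients, so every number below is HYPOTHETICAL: the intercept, the coefficient signs and magnitudes, and the assumption that inputs are standardized (z-scored) are all invented for illustration and must not be used clinically or read as the authors' model.

```python
import math

# The six predictors named in the source.
FEATURES = (
    "age",
    "anion_gap",
    "apolipoprotein_a1",
    "apolipoprotein_b",
    "calcium",
    "creatinine",
)

# HYPOTHETICAL values: the source does not report the fitted model.
# Signs and magnitudes are invented and carry no clinical meaning.
HYPOTHETICAL_INTERCEPT = 0.2
HYPOTHETICAL_COEFS = {
    "age": 0.5,
    "anion_gap": 0.3,
    "apolipoprotein_a1": -0.4,
    "apolipoprotein_b": 0.2,
    "calcium": -0.3,
    "creatinine": 0.1,
}


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_high_ki67_probability(values, coefs=None, intercept=None):
    """Probability of the high Ki-67 group under a logistic model.

    values: dict of standardized (z-scored) predictor values (an assumption).
    """
    coefs = HYPOTHETICAL_COEFS if coefs is None else coefs
    intercept = HYPOTHETICAL_INTERCEPT if intercept is None else intercept
    missing = [f for f in FEATURES if f not in values]
    if missing:
        raise KeyError(f"missing predictors: {missing}")
    # Linear predictor: intercept plus the weighted sum of the inputs.
    z = intercept + sum(coefs[f] * values[f] for f in FEATURES)
    return sigmoid(z)


# A patient exactly at the (assumed) cohort mean on every predictor.
average_patient = {f: 0.0 for f in FEATURES}
p_average = predict_high_ki67_probability(average_patient)
```

A nomogram is a graphical form of the same calculation: each predictor contributes points proportional to its coefficient times its value, and the point total maps to a probability through the logistic function.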

Implications and Future Directions

The logistic regression model may help with prognostic evaluation and clinical decision-making in patients with gliomas.
Future studies could evaluate the model in clinical practice and investigate its value as a noninvasive aid to glioma diagnosis and treatment planning.
The model could also be validated in larger and more diverse patient populations to test its generalizability and refine its predictive accuracy.
