Clinical Focus, 2025, Vol. 40, Issue 11: 988-998. doi: 10.3969/j.issn.1004-583X.2025.11.004

• Original Article •

Construction and validation of a machine learning-based prognostic prediction model for patients with H3K27M-mutant diffuse midline glioma

Wang Zhuangzhuang1, Yang Qingjun1, Ren Huan2, Liu Yanting3, Tian Chunlei3

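  1. Department of Surgery, People's Hospital of Gucheng County, Xiangyang, Hubei 441700, China
    2. Shiyan Aier Eye Hospital, Shiyan, Hubei 442000, China
    3. Department of Neurosurgery, the First Clinical Medical College of China Three Gorges University (Yichang Central People's Hospital), Yichang, Hubei 443003, China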
  • Received: 2025-09-23 Online: 2025-11-20 Published: 2025-12-02
  • Corresponding author: Tian Chunlei E-mail: cltianyc@163.com

Development and validation of a machine learning-based prognostic model for H3K27M-mutant diffuse midline glioma

Wang Zhuangzhuang1, Yang Qingjun1, Ren Huan2, Liu Yanting3, Tian Chunlei3

  1. Department of Surgery, People's Hospital of Gucheng County, Xiangyang 441700, China
    2. Shiyan Aier Eye Hospital, Shiyan 442000, China
    3. Department of Neurosurgery, the First Clinical Medical College of China Three Gorges University (Yichang Central People's Hospital), Yichang 443003, China
  • Received: 2025-09-23 Online: 2025-11-20 Published: 2025-12-02
  • Contact: Tian Chunlei E-mail: cltianyc@163.com

Abstract:

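Objective To investigate the factors that influence survival in H3K27M-mutant diffuse midline glioma (DMG), and to construct and validate a model for predicting poor prognosis in H3K27M-mutant DMG. Methods Clinical data of patients with H3K27M-mutant DMG recorded in the Surveillance, Epidemiology, and End Results database from 2000 to 2019 were retrospectively analyzed, and the patients were randomly divided into a training set (n=97) and a validation set (n=41) at a 7:3 ratio. Extreme gradient boosting, random forest, least absolute shrinkage and selection operator regression, and decision tree models were used to screen variables. The risk factors on which the four machine learning methods overlapped were identified, multivariate Cox regression was used to verify their independent predictive value, and a nomogram was then constructed to predict 6-, 12-, and 18-month survival in patients with H3K27M-mutant DMG. The area under the receiver operating characteristic curve, calibration curves, and clinical decision curves were used to evaluate the predictive performance, accuracy, and clinical applicability of the nomogram, and Kaplan-Meier survival curves were plotted for the prognostic variables. Results Each of the four machine learning algorithms selected a different set of prognostic factors; overlap analysis yielded five prognostic factors common to all of them in patients with H3K27M-mutant DMG: age, tumor volume, World Health Organization grade, tumor laterality, and radiotherapy. Multivariate Cox regression showed that age >60 years (HR=3.018, 95%CI: 1.15-7.92, P=0.025), increasing tumor volume (HR=1.039, 95%CI: 1.01-1.06, P=0.004), high-grade glioma (HR=2.057, 95%CI: 1.21-3.49, P=0.008), and midline structure (HR=2.101, 95%CI: 1.32-3.34, P=0.002) were risk factors for prognosis in H3K27M-mutant DMG, whereas radiotherapy (HR=0.410, 95%CI: 0.23-0.75, P=0.004) was a protective factor. Validation of the nomogram built on the five prognostic factors showed that the areas under the curve at 6, 12, and 18 months were 0.647, 0.746, and 0.625 in the training set and 0.632, 0.725, and 0.725 in the validation set, indicating good predictive performance. The calibration curves showed good agreement between the predicted values and the ideal curve, and decision curve analysis showed that the nomogram had good clinical utility. Conclusion The nomogram built on prognostic factors selected by machine learning algorithms has reliable predictive performance and can help clinicians identify high-risk prognostic factors in patients with H3K27M-mutant DMG and develop individualized treatment plans.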
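The 7:3 random split and the overlap step described in the Methods can be illustrated with a minimal Python sketch. The function names, the seed, and the per-method feature lists are hypothetical placeholders: the abstract reports only the five shared factors, not what each algorithm selected individually, so this shows the idea rather than the authors' pipeline.

```python
import random


def split_train_validation(patient_ids, train_fraction=0.7, seed=2025):
    """Randomly split patient IDs into training and validation sets."""
    shuffled = list(patient_ids)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility (seed is arbitrary)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


def overlapping_features(selected_by_method):
    """Return the features chosen by every method (set intersection)."""
    feature_sets = [set(features) for features in selected_by_method.values()]
    return set.intersection(*feature_sets)


# 138 patients in total (97 + 41 in the abstract); the IDs are placeholders.
train_ids, validation_ids = split_train_validation(range(138))

# Only the five shared factors come from the abstract. The extra entries
# ("sex", "surgery", "chemotherapy") are invented so that the sets differ.
shared = ["age", "tumor_size", "who_grade", "laterality", "radiation"]
selected_by_method = {
    "xgboost": shared + ["sex"],
    "random_forest": shared + ["surgery"],
    "lasso": shared + ["chemotherapy", "sex"],
    "decision_tree": shared,
}
common_features = overlapping_features(selected_by_method)
print(len(train_ids), len(validation_ids), sorted(common_features))
```

Taking the intersection keeps only variables that all four selectors agree on, which is a conservative choice: a variable favored by three methods but missed by one is dropped before the Cox regression step.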

Key words: H3K27M, machine learning, diffuse midline glioma, nomogram, prognostic factors

Abstract:

Objective To identify prognostic factors for H3K27M-mutant diffuse midline glioma (DMG) and to develop and validate a nomogram for predicting poor prognosis. Methods Patients with histologically confirmed H3K27M-mutant DMG recorded in the Surveillance, Epidemiology, and End Results (SEER) database from 2000 to 2019 were retrospectively included. Cases were randomly split into a training set (n=97) and a validation set (n=41) at a 7:3 ratio. Candidate variables were screened by four machine learning approaches: extreme gradient boosting, random forest, least absolute shrinkage and selection operator (LASSO) regression, and decision tree. Risk factors shared across the four methods were then tested by multivariate Cox regression to confirm their independent prognostic value. A nomogram to predict 6-, 12-, and 18-month overall survival (OS) was constructed from the independently significant predictors. Model discrimination, calibration, and clinical utility were evaluated using the area under the receiver operating characteristic curve (AUC), calibration plots, and decision curve analysis (DCA). Survival differences were assessed by Kaplan-Meier analysis. Results The four machine learning methods produced overlapping but not identical sets of candidate predictors; five variables were shared across all methods: age, tumor size, WHO grade, laterality, and radiation. Multivariate Cox regression identified four independent adverse prognostic factors: age >60 years (HR=3.018; 95%CI: 1.15-7.92; P=0.025), larger tumor size (per unit increase) (HR=1.039; 95%CI: 1.01-1.06; P=0.004), higher WHO grade (HR=2.057; 95%CI: 1.21-3.49; P=0.008), and midline location (HR=2.101; 95%CI: 1.32-3.34; P=0.002). Radiotherapy was independently associated with improved survival (HR=0.410; 95%CI: 0.23-0.75; P=0.004). The nomogram incorporating these factors yielded AUCs for 6-, 12-, and 18-month OS of 0.647, 0.746, and 0.625 in the training set and 0.632, 0.725, and 0.725 in the validation set, respectively, indicating acceptable discrimination. Calibration plots showed good agreement between predicted and observed survival probabilities, and DCA indicated favorable clinical utility. Conclusion A nomogram developed from machine learning-selected predictors reliably estimates short-term OS in patients with H3K27M-mutant DMG. This model may help clinicians identify high-risk patients and tailor individualized treatment strategies.
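As a reading aid for the hazard ratios reported above, the following Python sketch shows how such values combine under the standard Cox proportional hazards assumptions (effects multiply, and a continuous covariate acts log-linearly). The dictionary keys and function names are illustrative, the unit of tumor size is not stated in the abstract, and the calculation ignores the confidence intervals, so the results are rough point estimates only.

```python
import math

# Point estimates of the hazard ratios as reported in the abstract.
hazard_ratios = {
    "age_over_60": 3.018,
    "tumor_size_per_unit": 1.039,
    "higher_who_grade": 2.057,
    "midline_location": 2.101,
    "radiotherapy": 0.410,
}


def log_hazard_coefficient(hazard_ratio):
    """Cox regression coefficient (beta) corresponding to a hazard ratio."""
    return math.log(hazard_ratio)


def hazard_ratio_for_increase(hr_per_unit, units):
    """Hazard ratio for a multi-unit increase in a continuous covariate."""
    return hr_per_unit ** units


def combined_relative_hazard(ratios):
    """Relative hazard versus the reference patient when several factors apply."""
    return math.prod(ratios)


# A 10-unit increase in tumor size (unit as used by the study, not stated here).
size_hr_10_units = hazard_ratio_for_increase(hazard_ratios["tumor_size_per_unit"], 10)

# An illustrative patient: older than 60, higher WHO grade, treated with radiotherapy.
example_relative_hazard = combined_relative_hazard(
    [
        hazard_ratios["age_over_60"],
        hazard_ratios["higher_who_grade"],
        hazard_ratios["radiotherapy"],
    ]
)
print(round(size_hr_10_units, 3), round(example_relative_hazard, 3))
```

Under these assumptions, a 10-unit increase in tumor size corresponds to a hazard ratio of about 1.47 (1.039 raised to the 10th power), and the illustrative patient has roughly 2.5 times the hazard of a reference patient (3.018 × 2.057 × 0.410).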

Key words: H3K27M, machine learning, diffuse midline glioma (DMG), nomogram, prognostic factors

CLC number: