Chinese Journal of Clinical Research: 2025, 38(8): 1173-1181
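Development of a prediction model for acute respiratory distress syndrome in ICU patients with acute pancreatitis based on machine learning algorithms
(1. Department of Gastroenterology, Changshu Hospital Affiliated to Soochow University, Changshu No. 1 People's Hospital, Changshu, Jiangsu 215500, China; 2. Department of Critical Care Medicine, Changshu Hospital Affiliated to Soochow University, Changshu No. 1 People's Hospital, Changshu, Jiangsu 215500, China)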
Received: 2025-04-16   Published online: 2025-08-20
Abstract: Objective To develop and validate a machine learning-based prediction model for assessing the risk of acute respiratory distress syndrome (ARDS) in patients with acute pancreatitis (AP) in the intensive care unit (ICU).
Methods Data on 857 AP patients from the Medical Information Mart for Intensive Care IV v2.2 (MIMIC-IV v2.2) database were retrospectively analyzed, and the patients were randomly divided into a training set (n=601) and an internal validation set (n=256) at a 7:3 ratio. Data on 126 AP patients admitted to the ICU of Changshu Hospital Affiliated to Soochow University from January 2019 to March 2024 were additionally collected as an external test set. All patients were categorized into ARDS and non-ARDS groups according to whether ARDS occurred. Demographic characteristics, initial vital signs, laboratory data, functional scores, and complications within the first 24 hours of ICU admission were collected. Feature selection was performed using least absolute shrinkage and selection operator (LASSO) regression. Prediction models were constructed using seven machine learning algorithms: random forest (RF), extreme gradient boosting (XGBoost), light gradient boosting machine (LightGBM), decision tree (DT), logistic regression (LR), support vector machine (SVM), and K-nearest neighbors (KNN). Model performance was evaluated using receiver operating characteristic (ROC) curves, calibration curves, and decision curve analysis (DCA). Finally, Shapley additive explanations (SHAP) analysis was used to interpret the model.
Results In the MIMIC-IV database, 202 patients (23.57%) developed ARDS; in the external test set, 26 patients (20.63%) developed ARDS. Using the training set, LASSO regression selected 7 key variables from 43 candidate variables for model construction. Among the seven models, the RF model achieved the highest area under the ROC curve (AUC) in both the internal validation set (0.780, 95% CI: 0.721-0.846) and the external test set (0.842, 95% CI: 0.751-0.917). The calibration curves showed that the predicted probabilities of the RF model deviated less from the actual probabilities than those of the other models, giving it the best overall predictive performance. SHAP analysis based on the RF model showed that mechanical ventilation, Sequential Organ Failure Assessment (SOFA) score, body mass index (BMI), pulse oxygen saturation (SpO2), and Simplified Acute Physiology Score II (SAPS II) were the main factors influencing ARDS risk. Mechanical ventilation increased the risk of ARDS from 16% to 37%. ARDS risk rose markedly when the SOFA score exceeded 8, and it increased with rising BMI. While SpO2 was below 90%, ARDS risk remained at 30%; once SpO2 exceeded 90%, the risk declined as SpO2 increased further. For SAPS II scores between 46 and 60, ARDS risk showed a pronounced upward trend.
Conclusion The RF-based prediction model provides a reliable tool for assessing the risk of ARDS in AP patients, and the SHAP method improves its interpretability, aiding clinical decision-making.
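The abstract reports AUCs with 95% confidence intervals and a decision curve analysis, but it does not say how these were computed. The following is a minimal sketch of the two metrics, assuming a percentile bootstrap for the interval and the standard net-benefit formula for DCA. The function names and the toy labels and probabilities are illustrative and are not taken from the study.

```python
import random

def roc_auc(y_true, y_score):
    """AUC as the probability that a random positive case is scored
    higher than a random negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_auc_ci(y_true, y_score, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC (an assumed method, see lead-in)."""
    roc_auc(y_true, y_score)  # fails early if only one class is present
    rng = random.Random(seed)
    n = len(y_true)
    aucs = []
    while len(aucs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [y_true[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample contained a single class; draw again
        aucs.append(roc_auc(ys, [y_score[i] for i in idx]))
    aucs.sort()
    lower = aucs[int((alpha / 2) * n_boot)]
    upper = aucs[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

def net_benefit(y_true, y_score, threshold):
    """Decision-curve net benefit at one threshold probability pt:
    TP/n - FP/n * pt / (1 - pt)."""
    n = len(y_true)
    tp = sum(1 for y, s in zip(y_true, y_score) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(y_true, y_score) if s >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)

# Toy example: 1 = ARDS, 0 = no ARDS; scores are hypothetical predicted risks.
y = [0, 0, 1, 0, 1, 1, 0, 1]
p = [0.1, 0.3, 0.35, 0.4, 0.6, 0.7, 0.2, 0.8]
print(roc_auc(y, p))           # 0.9375
print(bootstrap_auc_ci(y, p))  # interval depends on the seed and n_boot
print(net_benefit(y, p, 0.5))  # 0.375
```

Plotting `net_benefit` over a range of thresholds, alongside the "treat all" and "treat none" strategies, gives the decision curve. With a real cohort, the same functions would be applied to the validation or test set predictions rather than to toy data.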
Article ID:     CLC number: R563.8; TP181    Document code: A
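Funding: Suzhou Science and Technology Development Plan Project (SLT2023006); Changshu Science and Technology Development Plan Key Project (CSWS202209); China International Medical Foundation Special Project for Respiratory Diseases (Z-2014-08-2309-1)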
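Cite this article:
Ren Xia (任夏), Liu Luojie (刘罗杰), Zha Junjie (查俊杰), et al. Development of a prediction model for acute respiratory distress syndrome in ICU patients with acute pancreatitis based on machine learning algorithms [J]. Chinese Journal of Clinical Research, 2025, 38(8): 1173-1181.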
