Journal of Peking University (Health Sciences), 2021, Vol. 53, Issue 3: 566-572. doi: 10.19723/j.issn.1671-167X.2021.03.021
LIN Yu (林瑜)1,2, WU Jing-yi (吴静依)3, LIN Ke (蔺轲)1,2, HU Yong-hua (胡永华)2,4, KONG Gui-lan (孔桂兰)1,3,Δ
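Abstract:
Objective: To develop risk prediction models for patient readmission to the intensive care unit (ICU) based on ensemble learning algorithms, and to compare the predictive performance of the models. Methods: Using the Medical Information Mart for Intensive Care (MIMIC)-Ⅲ, a critical care database from the United States, patients were selected according to the inclusion and exclusion criteria, and variables that might predict the outcome were extracted, including demographic characteristics, vital signs, laboratory tests, and comorbidities. ICU readmission prediction models were built with three ensemble learning methods, namely random forest, adaptive boosting (AdaBoost), and gradient boosting decision tree (GBDT), and their predictive performance was compared with that of Logistic regression. Model performance was evaluated by sensitivity, positive predictive value, negative predictive value, false positive rate, false negative rate, area under the receiver operating characteristic curve (AUROC), and Brier score, each averaged over five-fold cross-validation. The 10 most important predictors were then ranked using the best-performing model. Results: Among all the models, GBDT (AUROC=0.858) outperformed random forest (AUROC=0.827) and was slightly better than AdaBoost (AUROC=0.851). Compared with Logistic regression (AUROC=0.810), all the ensemble learning algorithms showed a considerable improvement in discrimination. In the variable importance ranking given by GBDT, mean arterial pressure, systolic blood pressure, diastolic blood pressure, heart rate, urine output, and serum creatinine ranked near the top; comparatively, patients readmitted to the ICU had poorer cardiovascular and renal function. Conclusion: The ICU readmission prediction models based on ensemble learning algorithms performed well and outperformed Logistic regression. Such models can be used to identify patients at high risk of ICU readmission, so that medical staff can intervene for high-risk patients and improve their overall clinical outcomes.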
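The evaluation metrics listed in the Methods can be illustrated with a minimal pure-Python sketch. It is not the authors' code: the function names (`evaluate_fold`, `average_over_folds`, and the helpers) are hypothetical, and the 0.5 classification threshold is an assumption, since the abstract does not state which cut-off was used. AUROC is computed here through the rank-based (Mann-Whitney) formulation, and the Brier score as the mean squared difference between predicted probability and observed outcome.

```python
from typing import Dict, List, Sequence


def _ratio(num: float, den: float) -> float:
    """Return num/den, or NaN when the denominator is zero."""
    return num / den if den else float("nan")


def confusion_counts(y_true: Sequence[int], y_prob: Sequence[float],
                     threshold: float = 0.5):
    """Count TP, FP, TN, FN; the 0.5 threshold is an assumed cut-off."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        pred = 1 if p >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def auroc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Probability that a random positive is scored above a random negative.

    Ties count as one half (Mann-Whitney formulation of the AUROC).
    """
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in pos for b in neg)
    return _ratio(wins, len(pos) * len(neg))


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared error between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def evaluate_fold(y_true: Sequence[int], y_prob: Sequence[float],
                  threshold: float = 0.5) -> Dict[str, float]:
    """Compute the seven metrics named in the abstract for one fold."""
    tp, fp, tn, fn = confusion_counts(y_true, y_prob, threshold)
    return {
        "sensitivity": _ratio(tp, tp + fn),
        "ppv": _ratio(tp, tp + fp),   # positive predictive value
        "npv": _ratio(tn, tn + fn),   # negative predictive value
        "fpr": _ratio(fp, fp + tn),   # false positive rate
        "fnr": _ratio(fn, fn + tp),   # false negative rate
        "auroc": auroc(y_true, y_prob),
        "brier": brier_score(y_true, y_prob),
    }


def average_over_folds(fold_metrics: List[Dict[str, float]]) -> Dict[str, float]:
    """Average each metric across folds (five folds in the study)."""
    n = len(fold_metrics)
    return {k: sum(m[k] for m in fold_metrics) / n for k in fold_metrics[0]}
```

In a five-fold setting, `evaluate_fold` would be called once per held-out fold on that fold's labels and predicted readmission probabilities, and `average_over_folds` would then be applied to the five resulting dictionaries. A higher AUROC indicates better discrimination, while a lower Brier score indicates more accurate probability estimates.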