Title page for etd-0716121-152110
Title
Severe Sepsis Prediction Using Mixed-effect Machine Learning Models
(original Chinese title: 基於混合效應的機器學習模型—以嚴重敗血症預測為例)
Department
Year, semester
Language
Degree
Number of pages
51
Author
Advisor
Convenor
Advisory Committee
Date of Exam
2021-07-02
Date of Submission
2021-08-16
Keywords
Severe Sepsis, Machine Learning, Random Forests, Mixed-Effects Model, Early Warning System
Statistics
The thesis/dissertation has been browsed 389 times and downloaded 0 times.
Abstract
According to the Department of Statistics of the Ministry of Health and Welfare, severe sepsis was among the ten leading causes of death for women in Taiwan from 2015 to 2018. It is a medical emergency in which the patient's condition can deteriorate rapidly, within hours to days. The American Journal of Emergency Medicine reports that these patients should receive appropriate treatment as early as the emergency room (ER), and that about 75% of them benefit from it.
This study therefore aims to support ER physicians with a machine learning model. After a patient has been triaged and the physician has completed the treatment plan, the model supplies supporting diagnostic information by predicting the probability that the patient will develop severe sepsis, so that medical staff are alerted to intervene early. More importantly, the study identifies the variables that most influence a diagnosis of severe sepsis. Variable importance helps clinical staff recognize the signs of severe sepsis early and begin treatment promptly, lowering the mortality it causes.
We propose a mixed-effects machine learning method, Generalized Linear Mixed-effects Model Trees (GLMM Tree), to fit highly correlated repeated-measurement data. Random effects, defined as cluster-specific factors, are added to the generalized linear model (GLM) alongside the fixed effects, and the model is combined with a decision tree so that a linear model is fitted in each node, which improves the predictive accuracy of each rule. We also use the Random Forests (RF) algorithm to select the important variables for a diagnosis of severe sepsis; the selected vital-sign variables agree with the literature and were corroborated by experts. The GLMM Tree does improve the model's evaluation metrics. Hospitals can integrate such an early warning system into their information systems so that the medical team can intervene and treat patients early, improving both the quality of medical care and patient safety.
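The model structure described in the abstract can be written compactly. The sketch below follows the general form used for GLMM trees in Fokkema et al. (2018), listed in the references; the symbols are assumptions introduced here for illustration and do not appear in the abstract. Observation i falls into terminal node j of the tree, each node has its own fixed-effect coefficients, and the random effects are shared across all nodes. For a binary outcome such as severe sepsis, the link function g would typically be the logit.

```latex
% Sketch only: node-specific fixed effects, global random effects.
% \mu_{ij} : expected outcome of observation i in terminal node j
% x_i      : fixed-effect predictors,  \beta_j : coefficients of node j
% z_i      : random-effect design,     b       : cluster-level random effects
g(\mu_{ij}) = x_i^{\top} \beta_j + z_i^{\top} b,
\qquad b \sim \mathcal{N}(0, \Sigma)
```

Under this reading, the tree decides which coefficient vector applies to a patient, while the random effects absorb the correlation among repeated measurements from the same cluster.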
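Table of Contents
Thesis Approval Certificate
Acknowledgments
Abstract (Chinese)
Abstract (English)
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Research Background
1.2 Research Motivation
1.3 Research Objectives
Chapter 2 Literature Review
2.1 Severe Sepsis
2.1.1 Definition
2.1.2 Etiology
2.1.3 Clinical Manifestations
2.1.4 Treatment
2.2 Mixed-effects Model
2.3 Generalized Linear Mixed-effects Model Trees (GLMM Tree)
Chapter 3 Research Methods and Procedures
3.1 Research Methods
3.2 Evaluation Criteria
3.2.1 ROC Curve
3.2.2 Area Under the ROC Curve (AUC)
3.3 Research Framework
Chapter 4 Experimental Results and Discussion
4.1 Data Preparation
4.2 Research Workflow
4.3 Research Process
4.4 Research Analysis
4.5 Research Limitations
Chapter 5 Conclusions and Recommendations
5.1 Conclusions
5.2 Future Research
References
Appendix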
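Section 3.2 of the outline names the ROC curve and the area under it (AUC) as the evaluation criteria. As a minimal sketch of what that metric measures, the function below computes AUC as the fraction of (positive, negative) patient pairs in which the positive case receives the higher predicted score, counting ties as half. The function name `auc` and the sample labels and scores are hypothetical and are not taken from the thesis.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: iterable of 0/1 outcomes (1 = condition present).
    scores: iterable of predicted probabilities or risk scores.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # positive case ranked above negative case
            elif p == n:
                wins += 0.5      # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical example: six patients, three of whom had the condition.
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.3, 0.6, 0.7, 0.2, 0.8]
example_auc = auc(labels, scores)
print(round(example_auc, 3))
```

The pairwise form is quadratic in the number of cases, which is fine for illustration; a rank-based implementation would be preferable for large datasets.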


References
Alvarez, B. D., Razente, D. M., Lacerda, D. A. M., Lother, N. S., Von-Bahten, L. C., & Stahlschmidt, C. M. M. (2016). Analysis of the Revised Trauma Score (RTS) in 200 victims of different trauma mechanisms. Revista Do Colégio Brasileiro de Cirurgiões, 43(5), 334–340. https://doi.org/10.1590/0100-69912016005010
Bradley, A. P. (1997). The use of the area under the ROC curve in the evaluation of machine learning algorithms. Pattern Recognition, 30(7), 1145–1159. https://doi.org/10.1016/S0031-3203(96)00142-2
Breiman, L. (1996). Bagging Predictors. Machine Learning, 24(2), 123–140. https://doi.org/10.1023/A:1018054314350
Breiman, L. (2001). Random Forests. Machine Learning, 45(1), 5–32. https://doi.org/10.1023/A:1010933404324
Brown, J. D. (2018). Mixed-Effects Models. In J. D. Brown, Advanced Statistics for the Behavioral Sciences (pp. 495–526). Springer International Publishing. https://doi.org/10.1007/978-3-319-93549-2_14
Champion, H. R., Sacco, W. J., Copes, W. S., Gann, D. S., Gennarelli, T. A., & Flanagan, M. E. (1989). A Revision of the Trauma Score. The Journal of Trauma: Injury, Infection, and Critical Care, 29(5), 623–629. https://doi.org/10.1097/00005373-198905000-00017
Developing a New Definition and Assessing New Clinical Criteria for Septic Shock: For the Third International Consensus Definitions for Sepsis and Septic Shock (Sepsis-3). (2016). 13.
Fawcett, T. (2006). An introduction to ROC analysis. Pattern Recognition Letters, 27(8), 861–874. https://doi.org/10.1016/j.patrec.2005.10.010
Fokkema, M., Edbrooke-Childs, J., & Wolpert, M. (2021). Generalized linear mixed-model (GLMM) trees: A flexible decision-tree method for multilevel and longitudinal data. Psychotherapy Research, 31(3), 329–341. https://doi.org/10.1080/10503307.2020.1785037
Fokkema, M., Smits, N., Zeileis, A., Hothorn, T., & Kelderman, H. (2018). Detecting treatment-subgroup interactions in clustered data with generalized linear mixed-effects model trees. Behavior Research Methods, 50(5), 2016–2034. https://doi.org/10.3758/s13428-017-0971-x
Galar, M., Fernandez, A., Barrenechea, E., Bustince, H., & Herrera, F. (2012). A Review on Ensembles for the Class Imbalance Problem: Bagging-, Boosting-, and Hybrid-Based Approaches. IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), 42(4), 463–484. https://doi.org/10.1109/TSMCC.2011.2161285
Gałecki, A., & Burzykowski, T. (2013). Linear Mixed-Effects Model. In A. Gałecki & T. Burzykowski, Linear Mixed-Effects Models Using R (pp. 245–273). Springer New York. https://doi.org/10.1007/978-1-4614-3900-4_13
Green, D. M., & Swets, J. A. (1966). Signal detection theory and psychophysics. (pp. xi, 455). John Wiley.
Henderson, C. R. (1953). Estimation of Variance and Covariance Components. Biometrics, 9(2), 226. https://doi.org/10.2307/3001853
Hosmer, D. W., & Lemeshow, S. (2000). Applied Logistic Regression. John Wiley & Sons, Inc. https://doi.org/10.1002/0471722146
Huang, J., & Ling, C. X. (2005). Using AUC and accuracy in evaluating learning algorithms. IEEE Transactions on Knowledge and Data Engineering, 17(3), 299–310. https://doi.org/10.1109/TKDE.2005.50
Kumar, P., Bhatnagar, R., Gaur, K., & Bhatnagar, A. (2021). Classification of Imbalanced Data: Review of Methods and Applications. IOP Conference Series: Materials Science and Engineering, 1099(1), 012077. https://doi.org/10.1088/1757-899X/1099/1/012077
Levy, M. M., Evans, L. E., & Rhodes, A. (2018). The Surviving Sepsis Campaign Bundle: 2018 update. Intensive Care Medicine, 44(6), 925–928. https://doi.org/10.1007/s00134-018-5085-0
Long, B. (n.d.). Best Clinical Practice: Blood Culture Utility in the Emergency Department. 11.
McCulloch, C. (2000). An Introduction to Generalized Linear Mixed Models. Journal of the American Statistical Association, 95. https://doi.org/10.2307/2669780
Mikolov, T., Chen, K., Corrado, G., & Dean, J. (2013). Efficient Estimation of Word Representations in Vector Space. ArXiv:1301.3781 [Cs]. http://arxiv.org/abs/1301.3781
Molenberghs, G., Laenen, A., & Vangeneugden, T. (2007). Estimating Reliability and Generalizability from Hierarchical Biomedical Data. Journal of Biopharmaceutical Statistics, 17(4), 595–627. https://doi.org/10.1080/10543400701329448
Oza, N. C., & Tumer, K. (2008). Classifier ensembles: Select real-world applications. Information Fusion, 9(1), 4–20. https://doi.org/10.1016/j.inffus.2007.07.002
Paoli, C. J., Reynolds, M. A., Sinha, M., Gitlin, M., & Crouser, E. (2018). Epidemiology and Costs of Sepsis in the United States—An Analysis Based on Timing of Diagnosis and Severity Level. Critical Care Medicine, 46(12), 1889–1897. https://doi.org/10.1097/CCM.0000000000003342
Rana, R., Singhal, R., & Singh, V. (2013). Analysis of repeated measurement data in the clinical trials. Journal of Ayurveda and Integrative Medicine, 4(2), 77. https://doi.org/10.4103/0975-9476.113872
Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95/46/EC (General Data Protection Regulation) (Text with EEA relevance), Pub. L. No. 32016R0679, 119 OJ L (2016). http://data.europa.eu/eli/reg/2016/679/oj/eng
Rhodes, A., Evans, L. E., Alhazzani, W., Levy, M. M., Antonelli, M., Ferrer, R., Kumar, A., Sevransky, J. E., Sprung, C. L., Nunnally, M. E., Rochwerg, B., Rubenfeld, G. D., Angus, D. C., Annane, D., Beale, R. J., Bellinghan, G. J., Bernard, G. R., Chiche, J.-D., Coopersmith, C., … Dellinger, R. P. (2017). Surviving Sepsis Campaign: International Guidelines for Management of Sepsis and Septic Shock: 2016. Intensive Care Medicine, 43(3), 304–377. https://doi.org/10.1007/s00134-017-4683-6
Rudd, K. E., Johnson, S. C., Agesa, K. M., Shackelford, K. A., Tsoi, D., Kievlan, D. R., Colombara, D. V., Ikuta, K. S., Kissoon, N., Finfer, S., Fleischmann-Struzek, C., Machado, F. R., Reinhart, K. K., Rowan, K., Seymour, C. W., Watson, R. S., West, T. E., Marinho, F., Hay, S. I., … Naghavi, M. (2020). Global, regional, and national sepsis incidence and mortality, 1990–2017: Analysis for the Global Burden of Disease Study. The Lancet, 395(10219), 200–211. https://doi.org/10.1016/S0140-6736(19)32989-7
Scheer, C. S. (2018). Impact of antibiotic administration on blood culture positivity at the beginning of sepsis: A prospective clinical cohort study. 6.
Seymour, C. W., Gesten, F., Prescott, H. C., Friedrich, M. E., Iwashyna, T. J., Phillips, G. S., Lemeshow, S., Osborn, T., Terry, K. M., & Levy, M. M. (2017). Time to Treatment and Mortality during Mandated Emergency Care for Sepsis. New England Journal of Medicine, 376(23), 2235–2244. https://doi.org/10.1056/NEJMoa1703058
Sherwin, R., Winters, M. E., Vilke, G. M., & Wardi, G. (2017). Does Early and Appropriate Antibiotic Administration Improve Mortality in Emergency Department Patients with Severe Sepsis or Septic Shock? The Journal of Emergency Medicine, 53(4), 588–595. https://doi.org/10.1016/j.jemermed.2016.12.009
Singer, M., Deutschman, C. S., Seymour, C. W., Shankar-Hari, M., Annane, D., Bauer, M., Bellomo, R., Bernard, G. R., Chiche, J.-D., Coopersmith, C. M., Hotchkiss, R. S., Levy, M. M., Marshall, J. C., Martin, G. S., Opal, S. M., Rubenfeld, G. D., van der Poll, T., Vincent, J.-L., & Angus, D. C. (2016). The Third International Consensus Definitions for Sepsis and Septic Shock (Sepsis-3). JAMA, 315(8), 801. https://doi.org/10.1001/jama.2016.0287
Swets, J. A. (1996). Signal detection theory and ROC analysis in psychology and diagnostics: Collected papers. (pp. xv, 308). Lawrence Erlbaum Associates, Inc.
Teasdale, G., & Jennett, B. (1974). ASSESSMENT OF COMA AND IMPAIRED CONSCIOUSNESS. The Lancet, 304(7872), 81–84. https://doi.org/10.1016/S0140-6736(74)91639-0
the investigators of the SATISEPSIS group, Dubin, A., Loudet, C., Kanoore Edul, V. S., Osatnik, J., Ríos, F., Vásquez, D., Pozo, M., Lattanzio, B., Pálizas, F., Klein, F., Piezny, D., Rubatto Birri, P. N., Tuhay, G., García, A., Santamaría, A., Zakalik, G., González, C., & Estenssoro, E. (2020). Characteristics of resuscitation, and association between use of dynamic tests of fluid responsiveness and outcomes in septic patients: Results of a multicenter prospective cohort study in Argentina. Annals of Intensive Care, 10(1), 40. https://doi.org/10.1186/s13613-020-00659-7
The Surviving Sepsis Campaign Guidelines Committee including The Pediatric Subgroup*, Dellinger, R. P., Levy, M. M., Rhodes, A., Annane, D., Gerlach, H., Opal, S. M., Sevransky, J. E., Sprung, C. L., Douglas, I. S., Jaeschke, R., Osborn, T. M., Nunnally, M. E., Townsend, S. R., Reinhart, K., Kleinpell, R. M., Angus, D. C., Deutschman, C. S., Machado, F. R., … Moreno, R. (2013). Surviving Sepsis Campaign: International Guidelines for Management of Severe Sepsis and Septic Shock, 2012. Intensive Care Medicine, 39(2), 165–228. https://doi.org/10.1007/s00134-012-2769-8
Wold, H. (1985). Partial Least Squares. In Encyclopedia of Statistical Sciences (pp. 581–591). Wiley, New York.
Zweigner, J., Gramm, H.-J., Singer, O. C., Wegscheider, K., & Schumann, R. R. (2001). High concentrations of lipopolysaccharide-binding protein in serum of patients with severe sepsis or septic shock inhibit the lipopolysaccharide response in human monocytes. Blood, 98(13), 3800–3808. https://doi.org/10.1182/blood.V98.13.3800
林昆賢 & 蔡俊明. (2019). 基於深度學習的自然語言處理中預訓練Word2Vec模型的研究. 國教新知, 66(1), 15–31. https://doi.org/10.6701/TEEJ.201906_66(1).0002
衛生福利部統計處. (2017, January 24). 衛生福利部統計處死因統計. 統計處. https://dep.mohw.gov.tw/dos/np-1776-113.html
陳音潔. (2002). 病患多次利用急診醫療之影響因素探討—以中部某醫學中心為例 [中國醫藥學院]. In 醫務管理研究所: Vol. 碩士. https://hdl.handle.net/11296/652wxy

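Fulltext
This electronic full text is licensed to users solely for personal, non-profit searching, reading, and printing for the purpose of academic research. Please comply with the Copyright Act of the Republic of China; do not reproduce, distribute, adapt, repost, or broadcast it without authorization, to avoid breaking the law.
Thesis access permission: user-defined release date
Available:
Campus: available for download from 2024-08-16
Off-campus: available for download from 2024-08-16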

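Printed copies
Public information on printed theses is relatively complete from academic year 102 (ROC calendar) onward. To look up public information on printed theses from academic year 101 or earlier, please contact the printed thesis service desk of the Office of Library and Information Services. We apologize for any inconvenience.
Available: 2024-08-16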
