Title page for etd-0622123-214807
Title
基於集成學習演算法的不平衡資料分析-以藥物交互作用引起的急性冠狀動脈綜合症為例
Handling Class Imbalance Problems using Weighted Ensemble Models—A Case Study of Drug-Drug Interactions induced Acute Coronary Syndrome
Department
Year, semester
Language
Degree
Number of pages
44
Author
Advisor
Convenor
Advisory Committee
Date of Exam
2023-07-07
Date of Submission
2023-07-22
Keywords
Machine Learning, Ensemble Learning Algorithms, Imbalanced Dataset, Interpretable Model, C50, CART
Statistics
This thesis has been viewed 223 times and downloaded 15 times.
Abstract
With the rapid advancement of modern medical technology, the number and variety of drugs used to treat disease have grown. One challenge this diversity brings is understanding and preventing potential drug interactions, especially those that could lead to Acute Coronary Syndrome (ACS). Traditional research methods, however, may struggle to study all drug interactions and their potential effects on ACS comprehensively.
This study therefore applies machine learning techniques to the problem. It first draws patient medication records and ACS incidence records from the Taiwan National Health Insurance Research Database. Because these medical data form an imbalanced dataset, the study uses resampling techniques to counter the imbalance before model training. It then trains models using ensemble learning with the CART algorithm and the C50 decision tree algorithm and compares their predictive performance. Finally, rules identified by both the CART and C50 algorithms are used to explain potential associations between drug interactions and ACS.
Through this approach, the study reveals specific drug interactions that may influence the risk of ACS, which raises vigilance toward such risks, and it offers a preventive strategy for lowering that risk. These findings can support clinicians' decisions when prescribing medications.
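The abstract does not say which resampling technique the study uses. As a hedged illustration only, the sketch below shows random oversampling, one common way to balance a dataset before training a tree model. The function name `random_oversample`, the toy rows, and the labels are all hypothetical and are not taken from the thesis.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of each smaller class until every
    class has as many rows as the largest class (assumed technique)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # All original rows belonging to this class.
        pool = [r for r, y in zip(rows, labels) if y == cls]
        # Sample with replacement to fill the gap to the majority size.
        extra = [rng.choice(pool) for _ in range(target - n)]
        out_rows.extend(extra)
        out_labels.extend([cls] * len(extra))
    return out_rows, out_labels


# Hypothetical toy data: each row is (drug_a_flag, drug_b_flag); label 1 = ACS.
rows = [(0, 0), (0, 1), (1, 0), (0, 0), (1, 1),
        (0, 1), (0, 0), (1, 0), (1, 1), (1, 1)]
labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]

balanced_rows, balanced_labels = random_oversample(rows, labels)
print(Counter(balanced_labels))
```

In this toy case both classes end up with eight rows. Resampling is normally applied to the training split only, so that the held-out data keep the original class ratio.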
Table of Contents
Thesis Approval Form
Acknowledgments
Chinese Abstract
English Abstract
Contents
List of Figures
List of Tables
Chapter 1. Introduction
1.1 Research Background
1.2 Research Motivation
1.3 Research Objectives
Chapter 2. Literature Review
2.1 Relationship Between Drug Interactions and Acute Coronary Syndrome
2.1.1 Acute Coronary Syndrome
2.1.2 Drug-Induced ACS
2.1.3 Drug Treatment of ACS
2.2 The Class Imbalance Problem
2.2.1 Class Imbalance
2.2.2 Resampling Methods
2.2.3 Bagging Resampling
2.3 Hold-out
2.4 Interpretable Models
2.4.1 Interpretable Models
2.4.2 CART Algorithm
2.4.3 C50 Algorithm
Chapter 3. Research Methods and Procedure
3.1 Research Workflow
3.2 Research Methods
3.3 Model Evaluation and Criteria
3.3.1 ROC Curve
3.3.2 AUC
3.3.3 Precision
3.3.4 Recall
3.3.5 F1 Score
Chapter 4. Results and Analysis
4.1 Data Collection
4.1.1 Study Population Selection Process
4.1.2 Definition of ACS
4.2 Data Cleaning
4.3 Model Building
4.4 Model Evaluation
4.5 Model Interpretation
Chapter 5. Discussion and Recommendations
5.1 Conclusions
5.2 Future Recommendations
5.3 Research Limitations
References
Appendix
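Section 3.3 of the outline lists precision, recall, and F1 score among the evaluation criteria. The sketch below computes them from their standard definitions. The function name `precision_recall_f1` and the label vectors are hypothetical, chosen only to illustrate the formulas, and are not results from the thesis.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Return (precision, recall, F1) for the chosen positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical imbalanced example: 3 positive cases out of 10.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(precision, recall, f1)
```

On imbalanced data these measures are more informative than plain accuracy, because a model that always predicts the majority class can score high accuracy while its recall on the minority class is zero.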
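Fulltext
This electronic full text is licensed to users solely for personal, non-profit searching, reading, and printing for academic research purposes. Please comply with the Copyright Act of the Republic of China and do not reproduce, distribute, adapt, repost, or broadcast it without authorization, to avoid violating the law.
Thesis access permission: unrestricted (fully open on and off campus)
Available:
Campus: available
Off-campus: available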


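Printed copies
Public availability information for printed theses is relatively complete from academic year 102 onward. To look up availability information for printed theses from academic year 101 or earlier, please contact the printed thesis service desk of the library and information services office. We apologize for any inconvenience.
Available: available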
