Thesis access permission: user-defined release date
Available:
Campus: available for download from 2026-08-17
Off-campus: available for download from 2026-08-17
論文名稱 Title |
應用機器學習方法建立懷孕婦女發生子癲前症之風險預測模型 Employing Machine Learning Approach to Predict the Risk of Preeclampsia in Pregnant Women |
||
系所名稱 Department |
|||
畢業學年期 Year, semester |
語文別 Language |
||
學位類別 Degree |
頁數 Number of pages |
88 |
|
研究生 Author |
|||
指導教授 Advisor |
|||
召集委員 Convenor |
|||
口試委員 Advisory Committee |
|||
口試日期 Date of Exam |
2022-08-05 |
繳交日期 Date of Submission |
2023-08-17 |
關鍵字 Keywords |
子癲前症、早產、機器學習方法、預測模型、醫療決策輔助 preeclampsia, preterm birth, machine learning, predictive modeling, clinical decision support |
||
統計 Statistics |
The thesis/dissertation has been browsed 160 times and downloaded 0 times. |
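Chinese abstract (English translation) |
According to statistics from the Ministry of Health and Welfare, maternal mortality rose from 11.6 to 16.0 per 100,000 live births between 2016 and 2019; the three leading causes of death were obstetric embolism, postpartum hemorrhage, and preeclampsia. Preeclampsia has an incidence of about 2% to 5% and should not be overlooked. Predicting its onset early and taking preventive measures could reduce the danger it causes as well as additional medical costs and resource use; the recent pandemic has further highlighted the importance of preventive medicine. This study therefore focuses on preeclampsia among pregnant women in Taiwan between 2015 and 2019, and on preterm birth among preeclampsia patients. Using National Health Insurance research data, it screens risk factors and medical costs for preeclampsia, compares risk factors and medical costs between normal pregnant women and patients, and builds prediction models. The study first analyzes each risk factor with statistical tests, then builds prediction models with five machine learning methods (logistic regression, naive Bayes, decision tree, random forest, and gradient boosting) and analyzes each model's area under the Receiver Operating Characteristic (ROC) curve (AUC) for predicting preeclampsia and preterm birth in the insurance data, along with the importance of the feature variables. The results show that the random forest model predicted preeclampsia best (AUC=0.732), followed by gradient boosting (AUC=0.728); both significantly outperformed the logistic regression model (both p-values <.001). In addition, women with preeclampsia averaged 44328.78 National Health Insurance medical points, significantly higher than the 28666.85 points for normal deliveries (p-value<.001), and also had higher rates of cesarean delivery, preterm birth, and even fetal death (all p-values <.001). The study recommends that future research on related topics prioritize random forest, decision tree, and gradient boosting methods, and that medical personnel use the results of big data predictive analysis as decision support, so that disease is detected and treated early and medical costs are reduced. |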
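The comparison of mean medical points between the two groups (44328.78 vs. 28666.85) can be illustrated with a two-sample test statistic. The abstract does not say which test the thesis used; the sketch below assumes Welch's unequal-variance t-test, and the function name `welch_t` and the tiny samples are hypothetical, not thesis data. It computes only the statistic and the Welch-Satterthwaite degrees of freedom; a p-value would additionally need a t-distribution CDF, which the Python standard library does not provide.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance two-sample t-test.

    Assumption: Welch's test is used here for illustration only; the
    thesis abstract does not name the test behind its p-values.
    """
    # Squared standard error contributed by each group (sample variance / n).
    va = variance(a) / len(a)
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical cost samples (arbitrary units), NOT data from the thesis.
patients = [40.0, 50.0, 45.0, 55.0]
controls = [25.0, 30.0, 35.0, 30.0]

t_stat, dof = welch_t(patients, controls)
print(f"t = {t_stat:.3f}, df = {dof:.2f}")
```

A large positive `t` with few degrees of freedom, as here, would point to a higher mean in the first group; with the population-scale insurance data described in the abstract, the degrees of freedom would be far larger.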
Abstract |
Data in the latest report published by the Centers for Disease Control and Prevention (CDC) show that the overall maternal death rate in Taiwan rose from 9.8 deaths per 100,000 live births in 2016 to 16 deaths per 100,000 in 2020. The three leading causes of preventable maternal mortality are obstetric embolism, postpartum hemorrhage, and preeclampsia. The incidence of preeclampsia is estimated at 2% to 5% of all pregnancies. It is a condition unique to pregnancy that needs to be treated by a healthcare provider. The COVID-19 pandemic has had a major impact on the capacity of health systems to continue delivering essential health services; it is therefore essential to allocate limited medical resources efficiently. The aim of this thesis is to develop an accurate and useful multivariable clinical prediction model using the Health and Welfare Data Center (HWDC) dataset. The HWDC data are used to screen risk factors and medical costs for pregnant women between 2015 and 2019. We conduct statistical tests to identify risk factors and build prediction models using logistic regression, naive Bayes, decision tree, random forest, and gradient boosting methods. Analyzing and comparing the area under the ROC curve (AUC) and the feature importance of these models helps us choose the most suitable prediction model and provide prevention suggestions for healthcare management. The results indicate that the random forest model achieved the best discrimination (AUC=0.732), followed by gradient boosting (AUC=0.728). Both outperformed the logistic regression model in prediction. |
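The models above are ranked by AUC, which can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The following is a minimal sketch of that pairwise definition, not the thesis's implementation; the function name `roc_auc`, the labels, and both score lists are hypothetical.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical outcomes (1 = condition present) and scores from two models.
labels = [0, 0, 1, 1, 0, 1]
model_a = [0.10, 0.40, 0.35, 0.80, 0.20, 0.70]
model_b = [0.50, 0.40, 0.35, 0.80, 0.20, 0.30]

auc_a = roc_auc(labels, model_a)
auc_b = roc_auc(labels, model_b)
print(f"model A AUC = {auc_a:.3f}, model B AUC = {auc_b:.3f}")
```

The pairwise loop is quadratic in the number of cases, which is fine for illustration; on data at the scale of a national insurance database, a rank-based formulation or a library routine would be the practical choice.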
目次 Table of Contents |
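Contents
Thesis approval form
Abstract (Chinese)
Abstract (English)
List of figures
List of tables
Chapter 1 Introduction
  Section 1 Research background
  Section 2 Research motivation
  Section 3 Research objectives
Chapter 2 Literature review
  Section 1 Preeclampsia and related risk factors
  Section 2 Machine learning algorithms
  Section 3 Machine learning applied to National Health Insurance data
  Section 4 Preeclampsia prediction
Chapter 3 Research methods
  Section 1 Research framework
  Section 2 Data sources
  Section 3 Study population
  Section 4 Data preprocessing
  Section 5 Statistical analysis and model building
Chapter 4 Results
  Section 1 Statistical tests of risk factors and related variables
  Section 2 Preeclampsia prediction model results
  Section 3 Prediction model results for preterm birth in preeclampsia patients
Chapter 5 Discussion
  Section 1 Prediction results and risk factors
  Section 2 Modeling methods
  Section 3 Recommendations for medical management
  Section 4 Study limitations
Chapter 6 Conclusion
References
  English references
  Chinese references |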
電子全文 Fulltext |
This electronic full text is licensed to users only for personal, non-profit searching, reading, and printing for academic research purposes. Please comply with the Copyright Act of the Republic of China; do not reproduce, distribute, adapt, repost, or broadcast it without authorization, to avoid violating the law. |
紙本論文 Printed copies |
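Public information on printed theses is relatively complete from academic year 102 onward. To look up public information on printed theses from academic year 101 or earlier, please contact the printed thesis service desk at the Office of Library and Information Services. We apologize for any inconvenience. Available 2026-08-17 |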