Title page for etd-0814122-092722
Title
Interpretable Representation Learning with Model-based Deep Rule Forest
Department
Year, semester
Language
Degree
Number of pages
64
Author
Advisor
Convenor
LEE, PEI-JU (李珮如)
Advisory Committee
Yang, Huei-Fang (楊惠芳)
Date of Exam
2022-07-29
Date of Submission
2022-09-14
Keywords
Rule Learning, Interpretability, Machine Learning, Deep Rule Forest, Deep Model Architecture
Statistics
The thesis/dissertation has been browsed 149 times and downloaded 38 times.
Abstract
Deep learning models are widely applied in many fields thanks to decreasing computation costs. Although they predict far better than traditional machine learning models, their model structures are so complex that the decision process cannot be understood; discrimination implicit in the data may be learned by a model without being recognized, which has led to legal requirements that models be interpretable. However, current interpretable models cannot learn features as complex as those learned by deep learning models, so their predictive performance is rarely comparable. To learn complex features while remaining interpretable, we extract rules, together with the coefficients of the parametric models inside them, from a model-based random forest, and deepen the model structure to learn more complex features. We propose the model-based Deep Rule Forest (mobDRF), which combines interpretability with a deep model architecture so that the model is both interpretable and predicts well; by combining the coefficients of the parametric models, the relationships among variables can be compared across different rules.
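The core mechanism the abstract describes — rules whose subgroups each carry the coefficients of their own parametric model — can be illustrated with a minimal sketch. This is not the thesis implementation (which builds on R packages such as partykit and its MOB trees); it is a hypothetical Python toy that brute-forces a single model-based split and reports each resulting rule with its linear-model coefficients:

```python
import numpy as np

rng = np.random.default_rng(0)
x0 = rng.uniform(-1, 1, 400)   # model variable (enters the linear model)
x1 = rng.uniform(-1, 1, 400)   # partitioning variable (defines the rules)
# The slope of y on x0 flips sign with x1: a subgroup effect that a single
# global linear model cannot represent, but two rules can.
y = np.where(x1 > 0.2, 2.0, -2.0) * x0 + rng.normal(0, 0.1, 400)

def fit(mask):
    """Least-squares line y ~ x0 on a subgroup; returns (sse, slope, intercept)."""
    slope, intercept = np.polyfit(x0[mask], y[mask], 1)
    resid = y[mask] - (slope * x0[mask] + intercept)
    return np.sum(resid ** 2), slope, intercept

# Exhaustive split search on x1: pick the cut point that lets two separate
# linear models fit best (real MOB uses parameter-instability tests instead
# of brute force, but the result is the same kind of rule).
candidates = np.quantile(x1, np.linspace(0.1, 0.9, 17))
cut = min(candidates, key=lambda c: fit(x1 <= c)[0] + fit(x1 > c)[0])

# Each rule is reported together with its own model coefficients, which is
# what makes the learned representation interpretable.
for name, mask in [(f"x1 <= {cut:.2f}", x1 <= cut), (f"x1 > {cut:.2f}", x1 > cut)]:
    _, slope, intercept = fit(mask)
    print(f"rule {name}: y ≈ {intercept:+.2f} {slope:+.2f}·x0")
```

Real model-based recursive partitioning selects splits via parameter-instability tests and recurses into each subgroup; deepening the structure, as mobDRF does, feeds the rules learned by one layer into the next.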
Table of Contents
Thesis Certification ... i
Abstract (Chinese) ... ii
Abstract ... iii
List of Figures ... v
List of Tables ... vi
1. Introduction ... 1
2. Background ... 3
2.1 Representation Learning ... 3
2.2 Interpretable Machine Learning ... 4
2.3 Random Forests with Deep Architecture ... 6
2.4 Model-based Recursive Partitioning ... 7
3. Using Model-based Deep Rule Forests ... 9
3.1 Building Model-based DRF ... 10
3.2 Rule Interpretation for Model-based DRF ... 12
4. Experiment ... 13
4.1 Experiment Setup ... 13
4.2 Model Performance Comparison ... 15
4.3 Model Interpretation with Learned Rules ... 19
5. Conclusion and Discussion ... 24
References ... 26
Appendix ... 32
Appendix A. Rules of Care Home Incidents ... 32
Table A1. Rule of rpart ... 32
Table A2. Rule of C50 ... 34
Table A3. Rule of PRE ... 37
Table A4. Rule of MOB tree ... 40
Table A5. Rule of MOB tree with 1st layer of mobDRF ... 41
Appendix B. Rules of TLSA 2015 ... 45
Table B1. Rule of rpart ... 45
Table B2. Rule of MOB tree ... 47
Table B3. Rule of PRE ... 49
Table B4. Rule of MOB tree with 1st layer of mobDRF ... 51
References
Adadi, A., & Berrada, M. (2018). Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI). IEEE Access, 6, 52138–52160. https://doi.org/10.1109/ACCESS.2018.2870052
Agrawal, R., Mannila, H., Srikant, R., Toivonen, H., & Verkamo, A. I. (1996). Fast discovery of association rules. In Advances in knowledge discovery and data mining (pp. 307–328). American Association for Artificial Intelligence.
Allaire, J. J., & Chollet, F. (2022). keras: R Interface to “Keras.” https://keras.rstudio.com
Bengio, Y. (2009). Learning Deep Architectures for AI. Foundations and Trends® in Machine Learning, 2(1), 1–127. https://doi.org/10.1561/2200000006
Bengio, Y., Courville, A., & Vincent, P. (2013). Representation Learning: A Review and New Perspectives. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35, 1798–1828. https://doi.org/10.1109/TPAMI.2013.50
Bengio, Y., Delalleau, O., & Simard, C. (2010). Decision Trees Do Not Generalize to New Variations. Computational Intelligence, 26(4), 449–467. https://doi.org/10.1111/j.1467-8640.2010.00366.x
Breiman, L., Friedman, J., Stone, C. J., & Olshen, R. A. (1984). Classification and Regression Trees. Taylor & Francis.
Chen, T., He, T., Benesty, M., Khotilovich, V., Tang, Y., Cho, H., Chen, K., Mitchell, R., Cano, I., Zhou, T., Li, M., Xie, J., Lin, M., Geng, Y., Li, Y., Yuan, J., & XGBoost contributors. (2022). xgboost: Extreme Gradient Boosting (1.6.0.1). https://CRAN.R-project.org/package=xgboost
Liaw, A., & Wiener, M. (2022). randomForest: Breiman and Cutler’s Random Forests for Classification and Regression (4.7-1.1; Fortran original by L. Breiman and A. Cutler). https://CRAN.R-project.org/package=randomForest
Deng, L. (2012). The mnist database of handwritten digit images for machine learning research. IEEE Signal Processing Magazine, 29(6), 141–142.
Dua, D., & Graff, C. (2017). UCI Machine Learning Repository. University of California, Irvine, School of Information and Computer Sciences. http://archive.ics.uci.edu/ml
Fokkema, M. (2020). Fitting Prediction Rule Ensembles with R Package pre. Journal of Statistical Software, 92(12). https://doi.org/10.18637/jss.v092.i12
Fokkema, M., Edbrooke-Childs, J., & Wolpert, M. (2021). Generalized linear mixed-model (GLMM) trees: A flexible decision-tree method for multilevel and longitudinal data. Psychotherapy Research, 31(3), 329–341. https://doi.org/10.1080/10503307.2020.1785037
Friedman, J. H., & Popescu, B. E. (2008). Predictive Learning via Rule Ensembles. The Annals of Applied Statistics, 2(3), 916–954.
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1). https://doi.org/10.18637/jss.v033.i01
Fürnkranz, J., Gamberger, D., & Lavrač, N. (2012). Foundations of Rule Learning. Springer Berlin Heidelberg. https://doi.org/10.1007/978-3-540-75197-7
Geirhos, R., Jacobsen, J.-H., Michaelis, C., Zemel, R., Brendel, W., Bethge, M., & Wichmann, F. A. (2020). Shortcut learning in deep neural networks. Nature Machine Intelligence, 2(11), 665–673. https://doi.org/10.1038/s42256-020-00257-z
Goodman, B., & Flaxman, S. (2017). European Union Regulations on Algorithmic Decision-Making and a “Right to Explanation.” AI Magazine, 38(3), 50–57. https://doi.org/10.1609/aimag.v38i3.2741
Guidotti, R., Monreale, A., Ruggieri, S., Turini, F., Giannotti, F., & Pedreschi, D. (2018). A Survey of Methods for Explaining Black Box Models. ACM Computing Surveys (CSUR). https://doi.org/10.1145/3236009
He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep Residual Learning for Image Recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 770–778. https://openaccess.thecvf.com/content_cvpr_2016/html/He_Deep_Residual_Learning_CVPR_2016_paper.html
Hothorn, T., & Zeileis, A. (2015). partykit: A Modular Toolkit for Recursive Partytioning in R. Journal of Machine Learning Research, 16, 3905–3909.
Hutson, G., Laldin, A., & Velásquez, I. (2022). MLDataR: Collection of Machine Learning Datasets for Supervised Machine Learning (0.1.3). https://CRAN.R-project.org/package=MLDataR
Jean, S., Cho, K., Memisevic, R., & Bengio, Y. (2015). On Using Very Large Target Vocabulary for Neural Machine Translation (arXiv:1412.2007). arXiv. https://doi.org/10.48550/arXiv.1412.2007
Jeong, M., Nam, J., & Ko, B. C. (2020). Lightweight Multilayer Random Forests for Monitoring Driver Emotional Status. IEEE Access, 8, 60344–60354. https://doi.org/10.1109/ACCESS.2020.2983202
Krizhevsky, A., & Hinton, G. (2009). Learning multiple layers of features from tiny images.
Kuhn, M., & Quinlan, R. (2022). C50: C5.0 Decision Trees and Rule-Based Models. https://topepo.github.io/C5.0/
LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436–444. https://doi.org/10.1038/nature14539
Miller, K., Hettinger, C., Humpherys, J., Jarvis, T., & Kartchner, D. (2017). Forward Thinking: Building Deep Random Forests. ArXiv:1705.07366 [Cs, Stat]. http://arxiv.org/abs/1705.07366
Qiao, L., Wang, W., & Lin, B. (2021). Learning Accurate and Interpretable Decision Rule Sets from Neural Networks. Proceedings of the AAAI Conference on Artificial Intelligence, 35(5), 4303–4311.
Quinlan, J. R. (1986). Induction of decision trees. Machine Learning, 1(1), 81–106. https://doi.org/10.1007/BF00116251
Quinlan, J. R. (2014). C4.5: Programs for Machine Learning. Elsevier.
R Core Team. (2021). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing. https://www.R-project.org/
Ribeiro, M. T., Singh, S., & Guestrin, C. (2016). “Why Should I Trust You?”: Explaining the Predictions of Any Classifier. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 1135–1144. https://doi.org/10.1145/2939672.2939778
Ripley, B., Venables, B., Bates, D. M., Hornik, K., Gebhardt, A., & Firth, D. (2021). MASS: Support Functions and Datasets for Venables and Ripley’s MASS (7.3-54). https://CRAN.R-project.org/package=MASS
RStudio Team. (2020). RStudio: Integrated Development Environment for R. RStudio, PBC. http://www.rstudio.com/
Rudin, C. (2019). Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nature Machine Intelligence, 1(5), 206–215. https://doi.org/10.1038/s42256-019-0048-x
Rudin, C., Chen, C., Chen, Z., Huang, H., Semenova, L., & Zhong, C. (2021). Interpretable Machine Learning: Fundamental Principles and 10 Grand Challenges. ArXiv:2103.11251 [Cs, Stat]. http://arxiv.org/abs/2103.11251
Seibold, H., Zeileis, A., & Hothorn, T. (2016). Model-Based Recursive Partitioning for Subgroup Analyses. The International Journal of Biostatistics, 12(1), 45–63. https://doi.org/10.1515/ijb-2015-0032
Simonyan, K., & Zisserman, A. (2015). Very Deep Convolutional Networks for Large-Scale Image Recognition (arXiv:1409.1556). arXiv. https://doi.org/10.48550/arXiv.1409.1556
Su, G., Wei, D., Varshney, K. R., & Malioutov, D. M. (2016). Interpretable Two-level Boolean Rule Learning for Classification. ArXiv:1606.05798 [Cs, Stat]. http://arxiv.org/abs/1606.05798
Taiwan Longitudinal Study on Aging (TLSA). (n.d.). Health Promotion Administration, Ministry of Health and Welfare. Retrieved June 30, 2022, from https://www.hpa.gov.tw/EngPages/Detail.aspx?nodeid=1077&pid=6197
Therneau, T., & Atkinson, B. (2022). rpart: Recursive Partitioning and Regression Trees (4.1.16). https://CRAN.R-project.org/package=rpart
Wagner, C. H. (1982). Simpson’s Paradox in Real Life. The American Statistician, 36(1), 46–48. https://doi.org/10.2307/2684093
Wang, T., Rudin, C., Doshi-Velez, F., Liu, Y., Klampfl, E., & MacNeille, P. (2017). A Bayesian Framework for Learning Rule Sets for Interpretable Classification. Journal of Machine Learning Research, 18(70), 1–37.
Wickham, H. (2016). ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York. https://ggplot2.tidyverse.org
Wright, M. N., Wager, S., & Probst, P. (2022). ranger: A Fast Implementation of Random Forests (0.14.1). https://CRAN.R-project.org/package=ranger
Zeileis, A., Hothorn, T., & Hornik, K. (2008). Model-Based Recursive Partitioning. Journal of Computational and Graphical Statistics, 17(2), 492–514. https://doi.org/10.1198/106186008X319331
Zhou, Z.-H., & Feng, J. (2017). Deep Forest: Towards An Alternative to Deep Neural Networks. Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence (IJCAI-17), 3553–3559.
Zhou, Z.-H., & Feng, J. (2019). Deep forest. National Science Review, 6(1), 74–86. https://doi.org/10.1093/nsr/nwy108

Fulltext
This electronic fulltext is licensed to users only for personal, non-commercial retrieval, reading, and printing for academic research. Please comply with the Copyright Act of the Republic of China (Taiwan) and do not reproduce, distribute, adapt, repost, or broadcast it without authorization.
Thesis access permission: unrestricted (fully open on and off campus)
Available:
On campus: available for download from 2022-09-14
Off campus: available for download from 2022-09-14

