Thesis/Dissertation etd-0814122-092722: Record Details
Title page for etd-0814122-092722
Title
Interpretable Representation Learning with Model-based Deep Rule Forest
Department
Year, semester
Language
Degree
Number of pages
64
Author
Advisor
Convenor
Advisory Committee
Date of Exam
2022-07-29
Date of Submission
2022-09-14
Keywords
Rule Learning, Interpretability, Machine Learning, Deep Rule Forest, Deep Model Architecture
Statistics
The thesis/dissertation has been browsed 411 times and downloaded 62 times.
Abstract
Thanks to the decline in computation costs, deep learning models have been applied to a wide range of tasks. Although their predictive performance improves greatly on that of traditional machine learning models, their structures are so complex that people cannot follow their decision processes, and discrimination hidden in the data may be learned by a model without being detected; this has led to legal requirements that models be interpretable. However, current interpretable models cannot learn features as complex as those learned by deep learning models, so their predictive performance rarely matches that of deep learning. To learn complex features while remaining interpretable, we extract rules, together with the parameters of the parametric models inside them, from model-based random forests, and we deepen the model structure to learn more complex features. We propose the model-based Deep Rule Forest (mobDRF), which combines interpretability with a deep model architecture so that the model is interpretable while achieving good predictive performance, and we use the coefficients of the parametric models within the rules to compare how the variables of these parametric models relate across different rules.
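The building block the abstract relies on, a model-based tree whose terminal nodes each carry a rule and a fitted parametric model, can be sketched in a few lines of R. The sketch below is purely illustrative and is not the thesis implementation: it assumes partykit::lmtree() as the MOB learner and uses R's built-in airquality data rather than the thesis datasets, and it shows a single tree only, not the deep, forest-style stacking of mobDRF.

## Minimal MOB sketch (illustrative assumption, not the thesis code):
## fit one model-based tree, then read off each terminal node's rule and
## the coefficients of the parametric (linear) model fitted in that node.
library("partykit")

aq <- na.omit(airquality)

## Linear model Ozone ~ Wind inside every node; Temp and Solar.R are the
## partitioning variables whose splits form the rules.
mob_tree <- lmtree(Ozone ~ Wind | Temp + Solar.R, data = aq)

## One rule (a conjunction of split conditions) per terminal node.
## .list.rules.party() is an unexported helper inside partykit.
rules <- partykit:::.list.rules.party(mob_tree)

## Per-node coefficients of the parametric model; comparing these across
## rules is the kind of interpretation the abstract describes.
node_coefs <- coef(mob_tree)

print(rules)
print(node_coefs)

Deepening the structure would then amount to feeding representations derived from such rules into a further layer of model-based trees; that stacking step is specific to mobDRF and is not shown here.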
Table of Contents
Thesis Approval Sheet
Chinese Abstract
Abstract
List of Figures
List of Tables
1. Introduction
2. Background
2.1 Representation Learning
2.2 Interpretable Machine Learning
2.3 Random Forests with Deep Architecture
2.4 Model-based Recursive Partitioning
3. Using Model-based Deep Rule Forests
3.1 Building Model-based DRF
3.2 Rule Interpretation for Model-based DRF
4. Experiment
4.1 Experiment Setup
4.2 Model Performance Comparison
4.3 Model Interpretation with Learned Rules
5. Conclusion and Discussion
References
Appendix
Appendix A. Rules of Care Home Incidents
Table A1. Rule of rpart
Table A2. Rule of C50
Table A3. Rule of PRE
Table A4. Rule of MOB tree
Table A5. Rule of MOB tree with 1st layer of mobDRF
Appendix B. Rules of TLSA 2015
Table B1. Rule of rpart
Table B2. Rule of MOB tree
Table B3. Rule of PRE
Table B4. Rule of MOB tree with 1st layer of mobDRF
References
Adadi, A., & Berrada, M. (2018). Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI). IEEE Access, 6, 52138–52160. https://doi.org/10.1109/ACCESS.2018.2870052
Agrawal, R., Mannila, H., Srikant, R., Toivonen, H., & Verkamo, A. I. (1996). Fast discovery of association rules. In Advances in knowledge discovery and data mining (pp. 307–328). American Association for Artificial Intelligence.
Allaire, J. J., & Chollet, F. (2022). keras: R Interface to “Keras.” https://keras.rstudio.com
Bengio, Y. (2009). Learning Deep Architectures for AI. Foundations and Trends® in Machine Learning, 2(1), 1–127. https://doi.org/10.1561/2200000006
Bengio, Y., Courville, A., & Vincent, P. (2013). Representation Learning: A Review and New Perspectives. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35, 1798–1828. https://doi.org/10.1109/TPAMI.2013.50
Bengio, Y., Delalleau, O., & Simard, C. (2010). Decision Trees Do Not Generalize to New Variations. Computational Intelligence, 26(4), 449–467. https://doi.org/10.1111/j.1467-8640.2010.00366.x
Breiman, L., Friedman, J., Stone, C. J., & Olshen, R. A. (1984). Classification and Regression Trees. Taylor & Francis.
Chen, T., He, T., Benesty, M., Khotilovich, V., Tang, Y., Cho, H., Chen, K., Mitchell, R., Cano, I., Zhou, T., Li, M., Xie, J., Lin, M., Geng, Y., Li, Y., Yuan, J., & XGBoost contributors. (2022). xgboost: Extreme Gradient Boosting (1.6.0.1). https://CRAN.R-project.org/package=xgboost
Liaw, A., & Wiener, M. (2022). randomForest: Breiman and Cutler's Random Forests for Classification and Regression (4.7-1.1). https://CRAN.R-project.org/package=randomForest
Deng, L. (2012). The mnist database of handwritten digit images for machine learning research. IEEE Signal Processing Magazine, 29(6), 141–142.
Dua, D., & Graff, C. (2017). UCI Machine Learning Repository. University of California, Irvine, School of Information and Computer Sciences. http://archive.ics.uci.edu/ml
Fokkema, M. (2020). Fitting Prediction Rule Ensembles with R Package pre. Journal of Statistical Software, 92(12). https://doi.org/10.18637/jss.v092.i12
Fokkema, M., Edbrooke-Childs, J., & Wolpert, M. (2021). Generalized linear mixed-model (GLMM) trees: A flexible decision-tree method for multilevel and longitudinal data. Psychotherapy Research, 31(3), 329–341. https://doi.org/10.1080/10503307.2020.1785037
Friedman, J. H., & Popescu, B. E. (2008). Predictive Learning via Rule Ensembles. The Annals of Applied Statistics, 2(3), 916–954.
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1). https://doi.org/10.18637/jss.v033.i01
Fürnkranz, J., Gamberger, D., & Lavrač, N. (2012). Foundations of Rule Learning. Springer Berlin Heidelberg. https://doi.org/10.1007/978-3-540-75197-7
Geirhos, R., Jacobsen, J.-H., Michaelis, C., Zemel, R., Brendel, W., Bethge, M., & Wichmann, F. A. (2020). Shortcut learning in deep neural networks. Nature Machine Intelligence, 2(11), 665–673. https://doi.org/10.1038/s42256-020-00257-z
Goodman, B., & Flaxman, S. (2017). European Union Regulations on Algorithmic Decision-Making and a “Right to Explanation.” AI Magazine, 38(3), 50–57. https://doi.org/10.1609/aimag.v38i3.2741
Guidotti, R., Monreale, A., Ruggieri, S., Turini, F., Giannotti, F., & Pedreschi, D. (2018). A Survey of Methods for Explaining Black Box Models. ACM Computing Surveys. https://doi.org/10.1145/3236009
He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep Residual Learning for Image Recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 770–778. https://openaccess.thecvf.com/content_cvpr_2016/html/He_Deep_Residual_Learning_CVPR_2016_paper.html
Hothorn, T., & Zeileis, A. (2015). partykit: A Modular Toolkit for Recursive Partytioning in R. Journal of Machine Learning Research, 16, 3905–3909.
Hutson, G., Laldin, A., & Velásquez, I. (2022). MLDataR: Collection of Machine Learning Datasets for Supervised Machine Learning (0.1.3). https://CRAN.R-project.org/package=MLDataR
Jean, S., Cho, K., Memisevic, R., & Bengio, Y. (2015). On Using Very Large Target Vocabulary for Neural Machine Translation (arXiv:1412.2007). arXiv. https://doi.org/10.48550/arXiv.1412.2007
Jeong, M., Nam, J., & Ko, B. C. (2020). Lightweight Multilayer Random Forests for Monitoring Driver Emotional Status. IEEE Access, 8, 60344–60354. https://doi.org/10.1109/ACCESS.2020.2983202
Krizhevsky, A., & Hinton, G. (2009). Learning multiple layers of features from tiny images.
Kuhn, M., & Quinlan, R. (2022). C50: C5.0 Decision Trees and Rule-Based Models. https://topepo.github.io/C5.0/
LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436–444. https://doi.org/10.1038/nature14539
Miller, K., Hettinger, C., Humpherys, J., Jarvis, T., & Kartchner, D. (2017). Forward Thinking: Building Deep Random Forests. ArXiv:1705.07366 [Cs, Stat]. http://arxiv.org/abs/1705.07366
Qiao, L., Wang, W., & Lin, B. (2021). Learning Accurate and Interpretable Decision Rule Sets from Neural Networks. Proceedings of the AAAI Conference on Artificial Intelligence, 35(5), 4303–4311.
Quinlan, J. R. (1986). Induction of decision trees. Machine Learning, 1(1), 81–106. https://doi.org/10.1007/BF00116251
Quinlan, J. R. (2014). C4.5: Programs for Machine Learning. Elsevier.
R Core Team. (2021). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing. https://www.R-project.org/
Ribeiro, M. T., Singh, S., & Guestrin, C. (2016). “Why Should I Trust You?”: Explaining the Predictions of Any Classifier. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 1135–1144. https://doi.org/10.1145/2939672.2939778
Ripley, B., Venables, B., Bates, D. M., Hornik, K., Gebhardt, A., & Firth, D. (2021). MASS: Support Functions and Datasets for Venables and Ripley's MASS (7.3-54). https://CRAN.R-project.org/package=MASS
RStudio Team. (2020). RStudio: Integrated Development Environment for R. RStudio, PBC. http://www.rstudio.com/
Rudin, C. (2019). Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nature Machine Intelligence, 1(5), 206–215. https://doi.org/10.1038/s42256-019-0048-x
Rudin, C., Chen, C., Chen, Z., Huang, H., Semenova, L., & Zhong, C. (2021). Interpretable Machine Learning: Fundamental Principles and 10 Grand Challenges. ArXiv:2103.11251 [Cs, Stat]. http://arxiv.org/abs/2103.11251
Seibold, H., Zeileis, A., & Hothorn, T. (2016). Model-Based Recursive Partitioning for Subgroup Analyses. The International Journal of Biostatistics, 12(1), 45–63. https://doi.org/10.1515/ijb-2015-0032
Simonyan, K., & Zisserman, A. (2015). Very Deep Convolutional Networks for Large-Scale Image Recognition (arXiv:1409.1556). arXiv. https://doi.org/10.48550/arXiv.1409.1556
Su, G., Wei, D., Varshney, K. R., & Malioutov, D. M. (2016). Interpretable Two-level Boolean Rule Learning for Classification. ArXiv:1606.05798 [Cs, Stat]. http://arxiv.org/abs/1606.05798
Taiwan Longitudinal Study on Aging (TLSA). (n.d.). Health Promotion Administration, Ministry of Health and Welfare. Retrieved June 30, 2022, from https://www.hpa.gov.tw/EngPages/Detail.aspx?nodeid=1077&pid=6197
Therneau, T., Atkinson, B., & Ripley, B. (2022). rpart: Recursive Partitioning and Regression Trees (4.1.16). https://CRAN.R-project.org/package=rpart
Wagner, C. H. (1982). Simpson’s Paradox in Real Life. The American Statistician, 36(1), 46–48. https://doi.org/10.2307/2684093
Wang, T., Rudin, C., Doshi-Velez, F., Liu, Y., Klampfl, E., & MacNeille, P. (2017). A Bayesian Framework for Learning Rule Sets for Interpretable Classification. Journal of Machine Learning Research, 18(70), 1–37.
Wickham, H. (2016). ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York. https://ggplot2.tidyverse.org
Wright, M. N., Wager, S., & Probst, P. (2022). ranger: A Fast Implementation of Random Forests (0.14.1). https://CRAN.R-project.org/package=ranger
Zeileis, A., Hothorn, T., & Hornik, K. (2008). Model-Based Recursive Partitioning. Journal of Computational and Graphical Statistics, 17(2), 492–514. https://doi.org/10.1198/106186008X319331
Zhou, Z.-H., & Feng, J. (2017). Deep Forest: Towards An Alternative to Deep Neural Networks. Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence (IJCAI-17), 3553–3559.
Zhou, Z.-H., & Feng, J. (2019). Deep forest. National Science Review, 6(1), 74–86. https://doi.org/10.1093/nsr/nwy108

Fulltext
This electronic fulltext is licensed only for personal, non-profit retrieval, reading, and printing for the purpose of academic research. Please comply with the Copyright Act of the Republic of China (Taiwan); do not reproduce, distribute, adapt, repost, or broadcast it without authorization.
Thesis access permission: unrestricted (fully open on and off campus)
Available:
Campus: available
Off-campus: available


Printed copies
Information on the public availability of printed theses is relatively complete for academic year 102 (2013) and later. To inquire about printed theses from academic year 101 and earlier, please contact the printed thesis service desk of the Office of Library and Information Services. We apologize for any inconvenience.
Available: available
