Title page for etd-0217122-131544
Title: Ensemble Classification Using Multi-split Deep Rule Forests
Department:
Year, semester:
Language:
Degree:
Number of pages: 47
Author:
Advisor:
Convenor:
Advisory Committee:
Date of Exam: 2021-09-10
Date of Submission: 2022-03-17
Keywords: Deep Architecture, Interpretability, Random Forest, Representation Learning, Rule Learning, Deep Rule Forest
Statistics: This thesis/dissertation has been viewed 569 times and downloaded 104 times.
Abstract
In recent years, neural networks have achieved outstanding performance and drawn wide attention, yet their lack of interpretability keeps practitioners from applying them in high-stakes domains such as medicine and criminal justice. Decision tree algorithms offer transparent reasoning through rules, but their model capacity is too low to handle complex problems. Extending the idea behind the tree-based Deep Rule Forest (DRF) algorithm, we propose the multi-split DRF, which raises model capacity by stacking multiple layers of rule forests to generate more complex rules. Experimental results show that the multi-split DRF learns better data representations with rules and uses those representations to classify data more accurately. Owing to its multi-split rule forests, the proposed model also performs competitively with fewer trees than DRF. More importantly, the generated rules present the prediction-making process in a human-comprehensible way, making the proposed approach an interpretable representation learning algorithm.
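The record itself contains no code, but the layered rule-forest pipeline the abstract describes can be sketched concretely. Below is a minimal, illustrative R sketch (R is suggested by the thesis's references to ranger and RStudio), not the author's implementation: each layer fits a shallow forest, every sample is re-encoded by the terminal node (i.e., the rule) it reaches in each tree, and the one-hot rule encoding becomes the input of the next layer. The names deep_rule_forest and rule_encode and all hyper-parameter values are hypothetical; only ranger's "terminalNodes" prediction type is an actual package API.

    # A minimal, illustrative sketch (not the thesis implementation) of the
    # layered rule-forest idea: each layer fits a shallow forest, each sample
    # is re-encoded by the terminal node ("rule") it reaches in every tree,
    # and that one-hot rule encoding becomes the input of the next layer.
    library(ranger)

    # Encode samples as one-hot indicators of the leaf (rule) they fall into.
    rule_encode <- function(forest, data) {
      leaves <- predict(forest, data, type = "terminalNodes")$predictions
      onehot <- lapply(seq_len(ncol(leaves)), function(t)
        model.matrix(~ leaf - 1, data.frame(leaf = factor(leaves[, t]))))
      m <- do.call(cbind, onehot)
      colnames(m) <- paste0("rule_", seq_len(ncol(m)))  # unique feature names
      as.data.frame(m)
    }

    # Hypothetical hyper-parameters: two layers of five depth-3 trees each.
    deep_rule_forest <- function(x, y, n_layers = 2, num_trees = 5, depth = 3) {
      layers <- vector("list", n_layers)
      feats <- x
      for (i in seq_len(n_layers)) {
        d <- cbind(feats, .y = y)
        layers[[i]] <- ranger(.y ~ ., data = d,
                              num.trees = num_trees, max.depth = depth)
        feats <- rule_encode(layers[[i]], feats)  # representation for next layer
      }
      # Top-level classifier trained on the final rule representation.
      top <- ranger(.y ~ ., data = cbind(feats, .y = y), num.trees = 50)
      list(layers = layers, top = top)
    }

    # Example: learn a rule-based representation of iris and classify it.
    fit <- deep_rule_forest(iris[, 1:4], iris$Species)

In this sketch the leaf indicators play the role of the rule-encoding described in Section 3.3; backtracking a prediction (Section 4.3) then amounts to reading off the root-to-leaf conditions behind each active indicator.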
Table of Contents
Thesis Certification ..... i
Chinese Abstract ..... ii
Abstract ..... iii
List of Figures ..... v
List of Tables ..... vi
1 Introduction ..... 1
2 Background and Related Work ..... 2
2.1 Tree-based Model ..... 2
2.2 Two-level Rule in Tree ..... 6
2.3 Representation Learning ..... 9
2.4 Deep Architecture ..... 11
2.5 Explainable AI ..... 13
3 Building Multi-split Deep Rule Forests ..... 15
3.1 Building the Multi-split DRF ..... 15
3.2 Interpretability of Multi-split DRF ..... 18
3.3 Rule-encoding for Data ..... 20
3.4 Hyper-parameter Tuning in Multi-split DRF ..... 22
4 Experiment and Discussion ..... 24
4.1 Experiment Setup ..... 24
4.2 Prediction Accuracy ..... 25
4.3 Backtracking Rules ..... 29
4.4 Influence of Hyper-parameters ..... 31
5 Conclusion ..... 35
6 References ..... 36
References
Jain, A. K., & Mao, J. (1996). Artificial neural networks: A tutorial. IEEE Computer, 29(3), 31–44.
Arnould, L., Boyer, C., & Scornet, E. (2021). Analyzing the tree-layer structure of Deep Forests. ArXiv:2010.15690 [Cs, Math, Stat]. http://arxiv.org/abs/2010.15690
Bengio, Y. (2009). Learning Deep Architectures for AI. Foundations and Trends in Machine Learning, 2(1), 1–127.
Bengio, Y., Courville, A., & Vincent, P. (2013). Representation Learning: A Review and New Perspectives. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(8), 1798–1828. https://doi.org/10.1109/TPAMI.2013.50
Bengio, Y., & Delalleau, O. (2011). On the Expressive Power of Deep Architectures. In J. Kivinen, C. Szepesvári, E. Ukkonen, & T. Zeugmann (Eds.), Algorithmic Learning Theory (Vol. 6925, pp. 18–36). Springer Berlin Heidelberg. https://doi.org/10.1007/978-3-642-24412-4_3
Bengio, Y., Delalleau, O., & Simard, C. (2010). Decision trees do not generalize to new variations. Computational Intelligence, 26(4), 449–467. https://doi.org/10.1111/j.1467-8640.2010.00366.x
Bergmeir, C., & Benítez, J. M. (2012). Neural Networks in R Using the Stuttgart Neural Network Simulator: RSNNS. Journal of Statistical Software, 46(7). https://doi.org/10.18637/jss.v046i07
Boser, B. E., Guyon, I. M., & Vapnik, V. N. (1992). A training algorithm for optimal margin classifiers. Proceedings of the Fifth Annual Workshop on Computational Learning Theory, 144–152. https://doi.org/10.1145/130385.130401
Breiman, L. (2001). Random forests. Machine Learning, 45(1), 5–32. https://doi.org/10.1023/A:1010933404324
Breiman, L. (1996). Bagging predictors. Machine Learning, 24(2), 123–140. https://doi.org/10.1007/BF00058655
Breiman, L. (2017). Classification and Regression Trees. https://doi.org/10.1201/9781315139470
Bunn, A., & Korpela, M. (n.d.). An introduction to dplR.
Chen, T., & Guestrin, C. (2016). XGBoost: A Scalable Tree Boosting System. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 785–794. https://doi.org/10.1145/2939672.2939785
Comon, P. (1994). Independent component analysis, A new concept? Signal Processing, 36(3), 287–314. https://doi.org/10.1016/0165-1684(94)90029-9
Cover, T. M., & Thomas, J. A. (2006). Elements of Information Theory (2nd ed.). Wiley-Interscience.
DARPA. (2016). Defense Advanced Research Projects Agency. Broad Agency Announcement,Explainable Artificial Intelligence(XAI). DARPA-BAA-16-53. https://www.darpa.mil/attachments/DARPA-BAA-16-53.pdf
Dietterich, T. G. (2002). Ensemble learning. The Handbook of Brain Theory and Neural Networks, 2, 110–125.
Dua, D., & Graff, C. (2019). UCI Machine Learning Repository. http://archive.ics.uci.edu/ml
Erhan, D., Courville, A., & Bengio, Y. (2010). Understanding Representations Learned in Deep Architectures. Technical report, Université de Montréal.
Freund, Y., & Schapire, R. E. (1999). A short introduction to boosting. Journal of Japanese Society for Artificial Intelligence, 14(5), 771–780.
Fürnkranz, J., Gamberger, D., & Lavrač, N. (2012). Foundations of Rule Learning. Springer Berlin Heidelberg. https://doi.org/10.1007/978-3-540-75197-7
Geirhos, R., Jacobsen, J.-H., Michaelis, C., Zemel, R., Brendel, W., Bethge, M., & Wichmann, F. A. (2020). Shortcut Learning in Deep Neural Networks. Nature Machine Intelligence, 2(11), 665–673. https://doi.org/10.1038/s42256-020-00257-z
Hinton, G. E. (2006). Reducing the Dimensionality of Data with Neural Networks. Science, 313(5786), 504–507. https://doi.org/10.1126/science.1127647
Kang, Y., Huang, S.-T., & Wu, P.-H. (2021). Detection of Drug–Drug and Drug–Disease Interactions Inducing Acute Kidney Injury Using Deep Rule Forests. SN Computer Science, 2(4), 299. https://doi.org/10.1007/s42979-021-00670-0
Ke, G., Meng, Q., Finley, T., Wang, T., Chen, W., Ma, W., Ye, Q., & Liu, T.-Y. (2017). LightGBM: A highly efficient gradient boosting decision tree. Advances in Neural Information Processing Systems, 30.
Kohavi, R., John, G., Long, R., Manley, D., & Pfleger, K. (1994). MLC++: A machine learning library in C++. In Proceedings of the Sixth International Conference on Tools with Artificial Intelligence.
Koitka, S., & Friedrich, C. M. (2016). nmfgpu4R: GPU-Accelerated Computation of the Non-Negative Matrix Factorization (NMF) Using CUDA Capable Hardware. The R Journal, 8(2), 382. https://doi.org/10.32614/RJ-2016-053
Kuhn, M., Weston, S., & Culp, M. (2014). C50: C5.0 decision trees and rule-based models. R package version 0.1.5. https://cran.r-project.org/web/packages/C50/C50.pdf
Kukreja, S. L., Löfberg, J., & Brenner, M. J. (2006). A least absolute shrinkage and selection operator (LASSO) for nonlinear system identification. IFAC Proceedings Volumes, 39(1), 814–819. https://doi.org/10.3182/20060329-3-AU-2901.00128
LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436–444. https://doi.org/10.1038/nature14539
Deng, L. (2012). The MNIST database of handwritten digit images for machine learning research [Best of the Web]. IEEE Signal Processing Magazine, 29(6), 141–142. https://doi.org/10.1109/MSP.2012.2211477
Lipton, Z. C. (2018). The mythos of model interpretability. Queue, 16(3), 31–57.
Miller, G. A. (1956). The magical number seven, plus or minus two: Some limits on our capacity for processing information. Psychological Review, 63(2), 81–97.
Miller, K., Hettinger, C., Humpherys, J., Jarvis, T., & Kartchner, D. (2017). Forward Thinking: Building Deep Random Forests. ArXiv:1705.07366 [Cs, Stat]. http://arxiv.org/abs/1705.07366
Miller, T. (2019). Explanation in artificial intelligence: Insights from the social sciences. Artificial Intelligence, 267, 1–38. https://doi.org/10.1016/j.artint.2018.07.007
Molnar, C. (2019). Interpretable Machine Learning: A Guide for Making Black Box Models Explainable. https://christophm.github.io/interpretable-ml-book/
Quinlan, J. R. (1986). Induction of decision trees. Machine Learning, 1(1), 81–106. https://doi.org/10.1007/BF00116251
Quinlan, J. R. (1993). C4.5: Programs for Machine Learning. San Francisco, CA, USA: Morgan Kaufmann Publishers Inc.
Radley-Gardner, O., Beale, H., & Zimmermann, R. (Eds.). (2016). Fundamental Texts On European Private Law. Hart Publishing. https://doi.org/10.5040/9781782258674
Ribeiro, M. T., Singh, S., & Guestrin, C. (2016). “Why Should I Trust You?”: Explaining the Predictions of Any Classifier. ArXiv:1602.04938 [Cs, Stat]. http://arxiv.org/abs/1602.04938
RStudio Team. (2015). RStudio: Integrated Development Environment for R. http://www.rstudio.com/
Rudin, C. (2019). Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nature Machine Intelligence, 1(5), 206–215. https://doi.org/10.1038/s42256-019-0048-x
Samek, W. (2020). Learning with explainable trees. Nature Machine Intelligence, 2(1), 16–17. https://doi.org/10.1038/s42256-019-0142-0
Shlens, J. (2014). A Tutorial on Principal Component Analysis. ArXiv:1404.1100 [Cs, Stat]. http://arxiv.org/abs/1404.1100
Su, G., Wei, D., Varshney, K. R., & Malioutov, D. M. (2016). Interpretable Two-level Boolean Rule Learning for Classification. ArXiv:1606.05798 [Cs, Stat]. http://arxiv.org/abs/1606.05798
Therneau, T. M., & Atkinson, E. J. (n.d.). An Introduction to Recursive Partitioning Using the RPART Routines. Mayo Foundation.
Trigeorgis, G., Bousmalis, K., Zafeiriou, S., & Schuller, B. W. (2015). A deep matrix factorization method for learning attribute representations. ArXiv:1509.03248 [Cs, Stat]. http://arxiv.org/abs/1509.03248
Wright, M. N., & Ziegler, A. (2015). ranger: A fast implementation of random forests for high dimensional data in C++ and R.
Zhong, G., Wang, L.-N., Ling, X., & Dong, J. (2016). An overview on data representation learning: From traditional feature learning to recent deep learning. The Journal of Finance and Data Science, 2(4), 265–278. https://doi.org/10.1016/j.jfds.2017.05.001
Zhou, Z.-H., & Feng, J. (2020). Deep Forest. ArXiv:1702.08835 [Cs, Stat]. http://arxiv.org/abs/1702.08835
Zou, H., & Hastie, T. (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 67(2), 301–320. https://doi.org/10.1111/j.1467-9868.2005.00503.x
Fulltext
The electronic fulltext is licensed to users solely for personal, non-profit retrieval, reading, and printing for academic research purposes. Please comply with the Copyright Act of the Republic of China (Taiwan) and do not reproduce, distribute, adapt, repost, or broadcast it without authorization.
Thesis access permission: unrestricted (open both on and off campus)
Available:
Campus: available
Off-campus: available


Printed copies
Public-access information for printed theses is relatively complete from academic year 102 (2013–2014) onward. To inquire about printed theses from academic year 101 or earlier, please contact the printed-thesis service desk of the Office of Library and Information Services. We apologize for any inconvenience.
Available: available
