Thesis/Dissertation etd-0612121-152838: Detailed Information
Title page for etd-0612121-152838
Human-guided Machine Learning by Enabling Expert-in-the-Loop
Keywords: Human-in-the-Loop Machine Learning, Rule Learning, Shortcut Learning, Mixed-effects Model, Interpretability, User Trust
This thesis/dissertation has been browsed 501 times and downloaded 111 times.

With the development of technology, machine learning has come to be widely used in many fields, but attention usually centers on model accuracy alone. Although many powerful algorithms have been developed, model architectures have grown increasingly complex, making it difficult for humans to understand which factors drive a model's predictions. As a result, it is hard to know whether a model has learned the appropriate relationships, and in certain fields (e.g., medical and industrial applications) such models are not explainable enough to be trusted or even used. In addition, model accuracy depends not only on model complexity but also on the training data. The most common problems are a lack of data and a lack of correlation between the target variable and the input features.

In this study, we propose a novel human-in-the-loop machine learning architecture built on an interpretable algorithm, the Generalized Linear Mixed-Effects Model Tree. The rules generated by the model let experts see clearly how it makes predictions, and experts can then supply more appropriate values for the model to predict, based on their own knowledge and experience. The primary aim is to mitigate potential shortcut learning in machine learning models through human intervention; the secondary aim is to reduce the time experts spend annotating data by presenting them with rules. Experimental results show that the proposed method can learn new rules that fit practical application scenarios from expert feedback, and can identify features that are interpretable and better suited to predicting the target variable, improving overall model performance. In addition, annotators can discover judgment logic in the model's rules that they had not previously considered and reduce bias in their own thinking, so that both the model and the human become more complete in their decision-making.
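The feedback loop described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: a single-split regression rule stands in for the Generalized Linear Mixed-Effects Model Tree, and the function names (`fit_rule`, `describe_rule`, `apply_feedback`), the `dose`/`response` columns, and the data values are all hypothetical. The sketch only shows the general pattern of fitting an interpretable model, showing its rules to an expert, recording the expert's corrected target values per rule, and retraining.

```python
from statistics import mean


def fit_rule(rows, feature, target):
    """Fit a one-split rule (a stand-in for a model tree) by least squared error."""
    values = sorted({row[feature] for row in rows})
    if len(values) < 2:
        raise ValueError("need at least two distinct feature values to split")
    best = None
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2
        left = [row[target] for row in rows if row[feature] <= threshold]
        right = [row[target] for row in rows if row[feature] > threshold]
        left_mean, right_mean = mean(left), mean(right)
        sse = (sum((y - left_mean) ** 2 for y in left)
               + sum((y - right_mean) ** 2 for y in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return {"feature": feature, "threshold": threshold,
            "left": left_mean, "right": right_mean}


def describe_rule(rule, target):
    """Render the fitted rule as readable IF/THEN lines for the expert."""
    f, t = rule["feature"], rule["threshold"]
    return [
        f"IF {f} <= {t} THEN {target} = {rule['left']}",
        f"IF {f} > {t} THEN {target} = {rule['right']}",
    ]


def apply_feedback(rows, rule, feedback, target):
    """Overwrite the target for rows covered by a rule the expert corrected."""
    corrected = []
    for row in rows:
        side = "left" if row[rule["feature"]] <= rule["threshold"] else "right"
        new_row = dict(row)  # copy so the original data stays untouched
        if side in feedback:
            new_row[target] = feedback[side]
        corrected.append(new_row)
    return corrected


# Hypothetical data: column names and values are illustrative only.
rows = [
    {"dose": 1.0, "response": 10.0},
    {"dose": 2.0, "response": 12.0},
    {"dose": 3.0, "response": 30.0},
    {"dose": 4.0, "response": 32.0},
]

rule = fit_rule(rows, "dose", "response")
for line in describe_rule(rule, "response"):
    print(line)

# Assumed expert judgement: the target for the high-dose rule should be 25.
feedback = {"right": 25.0}
retrained = fit_rule(apply_feedback(rows, rule, feedback, "response"),
                     "dose", "response")
for line in describe_rule(retrained, "response"):
    print(line)
```

Because the expert corrects one rule rather than individual records, a single judgement relabels every row that rule covers, which is the sense in which rule presentation can reduce annotation time.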
Table of Contents
Thesis Approval Form
Abstract (Chinese)
Abstract
List of Figures
List of Tables
1. Introduction
2. Background and Related Work
2.1 Mixed Effects Model
2.2 Generalized Linear Mixed-Effects Model Tree
2.3 Shortcut Learning
2.4 Data Annotation and Active Learning
2.5 Human-in-the-loop Machine Learning (HitL-ML)
3. Methodology
3.1 Mixed Effects Model for Longitudinal Data
3.2 Rules Assistance for Data Annotation
3.3 Krippendorff's Alpha Coefficient
3.4 Retraining Model with Experts' Feedback
4. Experimental Results and Discussion
4.1 Data Description
4.2 Experiments
5. Discussion
6. Conclusion
7. References
Fulltext
Thesis access permission: unrestricted (fully open on and off campus)
Available:
Campus: available
Off-campus: available

Printed copies
Available: available
