Human-guided Machine Learning by Enabling Expert-in-the-Loop
Human-in-the-Loop Machine Learning, Rule Learning, Shortcut Learning, Mixed-effects Model, Interpretability, User Trust
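In this study, we propose a new human-in-the-loop machine learning architecture that uses an interpretable model, the Generalized Linear Mixed-Effects Model Tree, so that experts can see clearly from the rules it generates how the model makes its predictions and can give the model more appropriate values to predict based on their own knowledge and experience. The aim is first to mitigate potential shortcut learning in machine learning models through human intervention, and second to reduce the time experts spend annotating data by presenting them with rules. Experimental results show that the method can learn, from expert feedback, new data rules that fit practical application scenarios, and can find interpretable features that are better suited to predicting the target variable, thereby improving overall model performance. In addition, annotators can find in the model's rules judgment logic they had not considered before and can eliminate biases in their thinking, so that both the model and the human become more complete in their decision-making.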
As technology has advanced, machine learning has been widely adopted across many fields, yet attention usually centers on model accuracy alone. Although many powerful algorithms have been developed, model architectures have grown increasingly complex, which makes it hard for humans to understand which factors drive a model's predictions. As a result, it is difficult to know whether a model has learned the appropriate relationships, and in certain fields (e.g., medicine and industry) such models are not explainable enough to be trusted or even used. In addition, model accuracy is affected not only by model complexity but also by the training data. The most common problems are insufficient data and a lack of correlation between the target variable and the input features.

In this study, we propose a novel human-in-the-loop machine learning architecture built on an interpretable algorithm, the Generalized Linear Mixed-Effects Model Tree. Through the rules the model generates, experts can clearly understand how it makes predictions and can supply more appropriate values to predict, based on their own knowledge and experience. The primary purpose of the method is to mitigate potential shortcut learning in machine learning models through human intervention; the secondary purpose is to reduce the time experts spend annotating data by presenting them with rules. Experimental results show that the proposed method can learn new data rules that fit practical application scenarios from expert feedback, and can identify features that are interpretable and better suited to predicting the target variable, thereby improving the overall performance of the model. In addition, annotators can discover judgment logic in the model's rules that they had not considered before and can eliminate biases in their own thinking, so that both the model and the human experts can make more complete decisions and judgments.
Table of Contents
Thesis Approval Certificate
Abstract (Chinese)
Abstract
List of Figures
List of Tables
1. Introduction
2. Background and Related Work
2.1 Mixed Effects Model
2.2 Generalized Linear Mixed-Effects Model Tree
2.3 Shortcut Learning
2.4 Data Annotation and Active Learning
2.5 Human-in-the-loop Machine Learning (HitL–ML)
3. Methodology
3.1 Mixed Effects Model for Longitudinal Data
3.2 Rules Assistance for Data Annotation
3.3 Krippendorff's Alpha Coefficient
3.4 Retraining Model with Experts' Feedback
4. Experimental Result and Discussion
4.1 Data Description
4.2 Experiments
5. Discussion
6. Conclusion
7. References
