Title page for etd-0025121-165628
Title
以BERT結合傳統機器學習探討文本情緒影響因素
Use BERT with traditional machine learning methods to investigate the factor of text emotion classification
Department
Year, semester
Language
Degree
Number of pages
82
Author
Advisor
Convenor
Advisory Committee
Date of Exam
2020-12-04
Date of Submission
2021-01-25
Keywords
BERT, emotion classification, random forest, attention, machine learning
Statistics
This thesis/dissertation has been viewed 372 times and downloaded 0 times.
Abstract (Chinese)
The training process of deep learning is like a black box: the extracted features are recognized only by the machine, and it is hard for humans to understand what the learned features actually mean. In contrast, decision trees and random forests in traditional machine learning expose an understandable classification process and therefore offer high interpretability. The purpose of this study is to combine the text representations produced by Bidirectional Encoder Representations from Transformers (BERT) with traditional machine learning methods such as random forest, and to compare performance with and without the BERT text representations, so that traditional machine learning can reach the effectiveness of deep learning. The study also tries to determine whether factors beyond the dialogue content itself can aid model training and raise accuracy when predicting the emotion of the text messages in the datasets. The experiments are designed around three aspects: the effect of additional factors on prediction, the effect of attention on fine-tuning performance, and the relationship between word attention and prediction. The results are summarized as follows:
1. With or without additional information, the overall prediction performance of the BERT-based combined model on emotion classification of the text in all three datasets was better than that of the n-grams-based or TF-IDF-based combined models. After adding extra information such as speaker, gender, speaker combination, dialogue duration, and scene, the overall prediction performance of the BERT-based combined model on the three datasets did not improve noticeably.
2. When the number of emotion categories in the datasets was increased from 4 to 7, the improvement or degradation in each model's overall prediction performance after adding extra information became more pronounced.
3. Removing punctuation, a conventional step in text preprocessing, turns out to be one of the factors that lower a model's predictive ability. The analysis of how attention affects fine-tuning shows that punctuation does not interfere with model training. By masking the attention feedback of selected words, the study also found that blindly removing neutral words while keeping emotion words is not the way to train the best-performing model.
4. To examine the relationship between word attention and prediction, each utterance was tokenized with BERT's tokenizer, each word was paired with the prediction for its sentence, and the pairs were compared against the actual labels. When attention is below 0.5, higher attention corresponds to higher accuracy; when attention exceeds 0.5, higher attention does not necessarily mean higher accuracy and may even mean lower accuracy. After examining possible causes of this situation, the study still concludes that, on the whole, higher word attention corresponds to higher accuracy; in other words, attention is positively related to prediction accuracy. The analysis of the relationship between word attention and random forest feature importance suggests that there is no consistent rule linking average attention and feature importance.
Abstract
The training process of deep learning is like a black box: the extracted features are difficult for humans to interpret. In contrast, traditional machine learning methods such as decision trees and random forests yield interpretable classification models. This study compares the performance of methods with and without the features of BERT (Bidirectional Encoder Representations from Transformers). By combining BERT features with traditional machine learning methods such as random forest, a traditional method can achieve performance comparable to deep learning (a minimal sketch of such a pipeline appears after the results list). In addition, I tried to find factors other than the conversation content that can improve accuracy when predicting the emotion of texts. The experiments were designed around three aspects: the influence of additional factors on prediction, the influence of attention on model fine-tuning, and the relation between word attention and prediction. The results are summarized as follows:
1. The model integrated with BERT, regardless of whether additional factors were added, outperformed the models integrated with n-grams and TF-IDF on emotion classification of the text in all three datasets. Adding extra information such as speaker, gender, speaker combination, dialogue duration, and scene did not noticeably improve its overall performance.
2. When the number of categories in the datasets was increased from 4 to 7, the effect of adding extra information on each model's performance, whether improvement or degradation, became more pronounced.
3. Removing punctuation, a common text preprocessing step, turns out to be one of the factors that degrade model performance. After analyzing the influence of attention on fine-tuning, I found that punctuation did not obstruct the fine-tuning process. By masking the attention of selected words (see the masking sketch after this list), I found that simply removing neutral words and keeping sentiment words does not lead to the best performance.
4. I explored the relation between word attention and model prediction. After tokenizing every utterance with BERT's tokenizer, I paired each word with the prediction for its sentence and compared the pairs against the actual labels (see the attention-extraction sketch after this list). The results show that when the attention value is below 0.5, higher attention corresponds to higher accuracy; when it exceeds 0.5, higher attention does not necessarily raise accuracy and may even lower it. After analyzing possible reasons for this, I conclude that, on the whole, higher word attention corresponds to higher accuracy; in other words, attention is positively related to accuracy. The analysis of the relation between word attention and random forest feature importance shows no consistent correspondence between average attention and feature importance.
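To make the combined approach concrete, here is a minimal sketch of the kind of pipeline described above, assuming the HuggingFace transformers library and scikit-learn: BERT's [CLS] hidden state serves as a fixed-size sentence feature for a random forest. The checkpoint name, example utterances, and label ids are placeholders rather than the thesis's actual configuration.

    import torch
    from transformers import BertModel, BertTokenizer
    from sklearn.ensemble import RandomForestClassifier

    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    bert = BertModel.from_pretrained("bert-base-uncased")
    bert.eval()

    texts = ["Oh my God, he's lost it.", "You must be so happy!"]  # placeholder utterances
    labels = [2, 1]                                                # placeholder emotion ids

    features = []
    with torch.no_grad():
        for text in texts:
            enc = tokenizer(text, return_tensors="pt", truncation=True)
            out = bert(**enc)
            # Take the final hidden state of [CLS] as a fixed-size sentence vector
            # (one common choice; the thesis may pool differently).
            features.append(out.last_hidden_state[0, 0].numpy())

    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    clf.fit(features, labels)                       # train the interpretable classifier
    print(clf.feature_importances_.argsort()[-5:])  # most important BERT dimensions

Because the downstream model is a random forest, its feature importances remain inspectable even though the input features come from BERT, which is the interpretability trade-off the study examines.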
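The word-masking experiment in point 3 could be approximated as below: zeroing a position in BERT's attention mask stops other tokens from attending to it. This is only a sketch under that assumption; the thesis's exact masking mechanism is not reproduced here, and the neutral-word list is hypothetical.

    import torch
    from transformers import BertModel, BertTokenizer

    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    bert = BertModel.from_pretrained("bert-base-uncased")
    bert.eval()

    neutral_words = {"the", "a", "of", "and"}  # hypothetical neutral-word list
    enc = tokenizer("The end of the world", return_tensors="pt")
    tokens = tokenizer.convert_ids_to_tokens(enc["input_ids"][0])

    mask = enc["attention_mask"].clone()
    for i, tok in enumerate(tokens):
        if tok in neutral_words:
            mask[0, i] = 0  # other tokens can no longer attend to this position

    with torch.no_grad():
        masked_out = bert(input_ids=enc["input_ids"], attention_mask=mask)
    print(masked_out.last_hidden_state.shape)  # representations with masked words suppressed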
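For point 4, one plausible way to obtain per-token attention scores is to average the attention that [CLS] pays to each token in the last layer across heads, as sketched below. The thesis may define a word's attention differently; the checkpoint is a placeholder, and num_labels=4 matches the four-category setting mentioned in the results.

    import torch
    from transformers import BertForSequenceClassification, BertTokenizer

    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    model = BertForSequenceClassification.from_pretrained(
        "bert-base-uncased", num_labels=4, output_attentions=True)
    model.eval()

    enc = tokenizer("I can't believe you did that!", return_tensors="pt")
    with torch.no_grad():
        out = model(**enc)

    pred = out.logits.argmax(dim=-1).item()     # predicted emotion id
    last_layer = out.attentions[-1][0]          # shape: (heads, seq_len, seq_len)
    cls_attn = last_layer[:, 0, :].mean(dim=0)  # attention from [CLS], averaged over heads
    tokens = tokenizer.convert_ids_to_tokens(enc["input_ids"][0])
    for tok, score in zip(tokens, cls_attn.tolist()):
        print(f"{tok}\t{score:.3f}")            # per-token attention score
    print("prediction:", pred)

Pairing these per-token scores with the sentence-level prediction and the gold label is what allows the accuracy-versus-attention comparison described in point 4.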
Table of Contents
Thesis Approval Form
Acknowledgments
Abstract (Chinese)
Abstract
Chapter 1. Introduction
  1.1 Research Background
  1.2 Research Motivation and Purpose
  1.3 Research Structure
Chapter 2. Literature Review
  2.1 Sentiment Analysis and Emotion Recognition
  2.2 Traditional Machine Learning
  2.3 Modern Machine Learning
Chapter 3. Research Methods
  3.1 Data Collection and Description
    3.1.1 EmotionLines
    3.1.2 MELD
  3.2 Data Preprocessing
  3.3 Classification Model Construction
Chapter 4. Results and Discussion
  4.1 Exploring Whether More Factors Aid Classification
    4.1.1 Adding Speakers' Dialogue Data
    4.1.2 Adding Speaker Information
    4.1.3 Adding Speaker Gender
    4.1.4 Adding Speaker Combinations
    4.1.5 Adding Dialogue Duration
    4.1.6 Adding Scenes
    4.1.7 Examining the Effect of Extra Information Through Per-Category Performance Differences
    4.1.8 Discussion
  4.2 The Effect of Attention on Fine-Tuning Performance
    4.2.1 Performance Analysis
    4.2.2 Discussion
  4.3 Comparing Word Attention, Random Forest Feature Importance, and Prediction
    4.3.1 Relations of Word Attention with Prediction and with Random Forest Feature Importance
    4.3.2 The Relation Between Word Attention and Random Forest Feature Importance
    4.3.3 Discussion
Chapter 5. Conclusions and Suggestions
  5.1 Conclusions
  5.2 Suggestions
Chapter 6. References
Appendix
References
1. Vaswani, A., et al., Attention Is All You Need. CoRR, 2017. abs/1706.03762.
2. Devlin, J., et al., BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. CoRR, 2018. abs/1810.04805.
3. Cotfas, L.A., et al. Grey sentiment analysis using SentiWordNet. in 2017 International Conference on Grey Systems and Intelligent Services (GSIS). 2017.
4. Ding, X., B. Liu, and P.S. Yu. A Holistic Lexicon-Based Approach to Opinion Mining. in Proceedings of the 2008 International Conference on Web Search and Data Mining. 2008. New York, NY, USA: Association for Computing Machinery.
5. Kim, Y. Convolutional Neural Networks for Sentence Classification. in Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). 2014. Doha, Qatar: Association for Computational Linguistics.
6. Breiman, L., Random Forests. Mach. Learn., 2001. 45(1): p. 5–32.
7. Huang, Y.-H., et al., EmotionX-IDEA: Emotion BERT - an Affectional Model for Conversation. CoRR, 2019. abs/1908.06264.
8. Sun, X., X. Peng, and S. Ding, Emotional Human-Machine Conversation Generation Based on Long Short-Term Memory. Cognitive Computation, 2017. 10.
9. Serban, I.V., et al., Building end-to-end dialogue systems using generative hierarchical neural network models, in Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence. 2016, AAAI Press: Phoenix, Arizona. p. 3776–3783.
10. Hassan, A. and A. Mahmood. Deep Learning approach for sentiment analysis of short texts. in 2017 3rd International Conference on Control, Automation and Robotics (ICCAR). 2017.
11. Zhao, J. and X. Gui, Deep Convolution Neural Networks for Twitter Sentiment Analysis. IEEE Access, 2018. 6: p. 23253-23260.
12. Nio, L. and K. Murakami. Japanese Sentiment Classification Using Bidirectional Long Short-Term Memory Recurrent Neural Network. 2018.
13. Vig, J., Visualizing Attention in Transformer-Based Language Representation Models. CoRR, 2019. abs/1904.02679.
14. Chen, S.-Y., et al., EmotionLines: An Emotion Corpus of Multi-Party Conversations. CoRR, 2018. abs/1802.08379.
15. Hsu, C.-C. and L.-W. Ku. SocialNLP 2018 EmotionX Challenge Overview: Recognizing Emotions in Dialogues. 2018.
16. Luo, L., H. Yang, and F.Y.L. Chin, EmotionX-DLC: Self-Attentive BiLSTM for Detecting Sequential Emotions in Dialogue. CoRR, 2018. abs/1806.07039.
17. Shmueli, B. and L.-W. Ku, SocialNLP EmotionX 2019 Challenge Overview: Predicting Emotions in Spoken Dialogues and Chats. CoRR, 2019. abs/1909.07734.
18. Khosla, S. EmotionX-AR: CNN-DCNN autoencoder based Emotion Classifier. 2018.
19. Poria, S., et al., MELD: A Multimodal Multi-Party Dataset for Emotion Recognition in Conversations. CoRR, 2018. abs/1810.02508.
20. Tripathy, A., A. Agrawal, and S. Rath, Classification of Sentiment Reviews using N-gram Machine Learning Approach. Expert Systems with Applications, 2016. 57.
Fulltext
This electronic full text is licensed to users solely for personal, non-profit retrieval, reading, and printing for academic research purposes. Please comply with the relevant provisions of the Copyright Act of the Republic of China; do not reproduce, distribute, adapt, repost, or broadcast it without authorization, so as to avoid infringement.
Thesis access permission: availability date defined by the author (user define)
Available:
Campus: publicly available
Off-campus: publicly available


Printed copies
Information on the availability of printed copies is relatively complete for academic year 102 (2013-14) and later. To inquire about printed copies from academic year 101 (2012-13) or earlier, please contact the printed-thesis service counter of the Office of Library and Information Services. We apologize for any inconvenience.
Available: publicly available
