Title page for etd-0626123-123502
論文名稱 Title: 應用提示學習擴充資料於自然語言理解任務 (Data Augmentation by Prompt Tuning on Natural Language Understanding Task)
系所名稱 Department:
畢業學年期 Year, semester:
語文別 Language:
學位類別 Degree:
頁數 Number of pages: 53
研究生 Author:
指導教授 Advisor:
召集委員 Convenor:
口試委員 Advisory Committee:
口試日期 Date of Exam: 2023-07-07
繳交日期 Date of Submission: 2023-07-26
關鍵字 Keywords: 實體萃取、意圖分類、資料擴充、自然語言理解、提示學習 (Entity Extraction, Intent Classification, Data Augmentation, Natural Language Understanding, Prompt Tuning)
中文摘要 Chinese Abstract
隨著自然語言技術的進步,現在許多客服會採用聊天機器人來輔助使用者取得資訊或是提供服務,聊天機器人又可分為從使用者輸入中獲取資訊的自然語言理解模組以及對話流程控制模組,其中自然語言理解任務又包含實體辨識以及意圖分類。在訓練自然語言模型時需要大量資料,然而,我們可以獲取的訓練資料卻不多,此時便需要資料擴充技術來協助資料的產生。

本研究藉由預訓練語言模型,透過對預訓練語言模型再訓練,來生成目標領域內的資料,並在後續透過分類器的篩選來提升資料品質,最後再用篩選過的資料訓練分類器。

我們基於 PromDA (Wang et al., 2022) 提出了多任務的生成架構。藉由整合意圖分類以及實體辨識兩個任務,使生成的資料能夠用於多任務的訓練,並證明此類整合可以提升兩個任務的準確度。
Abstract
With the advancement of natural language technology, many customer service systems now employ chatbots to assist users in obtaining information or providing services. Chatbots can be divided into two main modules: natural language understanding (NLU) modules, which extract information from user inputs, and dialogue flow control modules. The NLU task includes entity recognition and intent classification. Training natural language models requires a large amount of data, but the available training data is often limited. In such cases, data augmentation techniques are employed to generate additional data.
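
To make the role of the NLU module concrete, the short Python sketch below shows the kind of structured output such a module produces: one intent label plus the extracted entities (slots). The class name, field names, and example values are illustrative assumptions, not taken from the thesis.

from dataclasses import dataclass, field

@dataclass
class NLUResult:
    # Raw user utterance passed to the NLU module.
    utterance: str
    # Intent label predicted by the intent classifier, e.g. "book_flight".
    intent: str
    # Entities found by the entity recognizer: slot name -> surface value.
    entities: dict[str, str] = field(default_factory=dict)

result = NLUResult(
    utterance="book a flight to Taipei tomorrow",
    intent="book_flight",
    entities={"destination": "Taipei", "date": "tomorrow"},
)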

In this study, we leverage pre-trained language models and further train them to generate data in the target domain. The generated data is then filtered by a classifier to improve its quality, and the filtered data is finally used to train the classifier.
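
The generate-then-filter pipeline can be sketched as follows; this is a minimal illustration assuming a Hugging Face T5 generator fine-tuned on the target domain and a hypothetical intent classifier trained on the small seed set. The model paths, prompt format, and decoding settings are placeholders, not the configuration used in the thesis.

from transformers import T5ForConditionalGeneration, T5Tokenizer, pipeline

tokenizer = T5Tokenizer.from_pretrained("t5-base")
generator = T5ForConditionalGeneration.from_pretrained("t5-base")  # assumed fine-tuned on target-domain data
# Hypothetical classifier trained on the seed set, used for consistency filtering.
seed_classifier = pipeline("text-classification", model="path/to/seed-intent-classifier")

def augment_intent(intent_label, num_candidates=8):
    """Generate candidate utterances for one intent and keep only those whose
    predicted intent matches the label they were generated for."""
    prompt = f"generate utterance for intent: {intent_label}"
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = generator.generate(
        **inputs,
        do_sample=True,
        top_p=0.9,
        num_return_sequences=num_candidates,
        max_new_tokens=32,
    )
    candidates = tokenizer.batch_decode(outputs, skip_special_tokens=True)
    # Consistency filtering: keep a generation only if the classifier reproduces its conditioning label.
    return [c for c in candidates if seed_classifier(c)[0]["label"] == intent_label]

synthetic_utterances = augment_intent("set_alarm")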

Building upon PromDA (Wang et al., 2022), we propose a multi-task generation framework. By integrating the intent classification and entity recognition tasks, the generated data can be used for multi-task training, and we show that this integration improves the accuracy of both tasks.
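
As a concrete illustration of the multi-task idea, the sketch below shows one possible way to serialize intent and entity annotations into a single text-to-text training pair, so that a single generator can produce data for both tasks. The serialization format is an assumption made for illustration and may differ from the thesis's actual PromDA-based format.

def to_multitask_pair(utterance, intent, slots):
    """Linearize one annotated example into a (source, target) text pair that
    carries both the intent label and the slot annotations."""
    slot_str = "; ".join(f"{name} = {value}" for name, value in slots.items())
    return {
        "source": f"intent: {intent} | slots: {slot_str}",
        "target": utterance,
    }

pair = to_multitask_pair(
    "wake me up at seven tomorrow",
    intent="set_alarm",
    slots={"time": "seven", "date": "tomorrow"},
)
# pair["source"] == "intent: set_alarm | slots: time = seven; date = tomorrow"
# pair["target"] == "wake me up at seven tomorrow"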



目次 Table of Contents
審定書 (Thesis Certification) i
摘要 (Chinese Abstract) ii
Abstract iii
目錄 (Table of Contents) iv
圖目錄 (List of Figures) v
表目錄 (List of Tables) vi
1. Introduction 1
2. Related Work 5
2.1 Few-shot Learning 5
2.2 Prompt Tuning 5
2.2.1 Discrete Prompt 6
2.2.2 Continuous Prompt 7
2.3 Data Augmentation 9
2.3.1 Rule-based 9
2.3.2 Interpolation 9
2.3.3 Model-based 10
2.4 Pre-trained Language Model 10
2.4.1 T5 10
2.4.2 BERT 11
2.4.3 GPT and ChatGPT 12
3. Method 15
3.1 Prompt-based Generator 17
3.2 Pre-train for Prompt Initialization 17
3.3 Finetune Generator 18
3.4 Generative DA 21
3.5 Consistency Filtering 24
3.6 Finetune Classifier 25
4. Experiments 27
4.1 Experiment Settings 27
4.2 Result 30
4.3 Discussion 32
4.3.1 Slot Type 32
4.3.2 LLM as Generator 35
5. Conclusion 37
Reference 38
Appendix 43
A. Performance summary 43
B. Prompt with LLM 44
參考文獻 References
Anaby-Tavor, A., Carmeli, B., Goldbraich, E., Kantor, A., Kour, G., Shlomov, S., Tepper, N., & Zwerdling, N. (2020). Do Not Have Enough Data? Deep Learning to the Rescue! Proceedings of the AAAI Conference on Artificial Intelligence, 34(05), 7383-7390. https://doi.org/10.1609/aaai.v34i05.6233
Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J. D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., & Askell, A. (2020). Language models are few-shot learners. Advances in neural information processing systems, 33, 1877-1901.
Chen, J., Yang, Z., & Yang, D. (2020). MixText: Linguistically-Informed Interpolation of Hidden Space for Semi-Supervised Text Classification. Association for Computational Linguistics. https://doi.org/10.18653/v1/2020.acl-main.194
Dai, X., & Adel, H. (2020, December). An Analysis of Simple Data Augmentation for Named Entity Recognition. Proceedings of the 28th International Conference on Computational Linguistics, Barcelona, Spain (Online).
De Cao, N., Izacard, G., Riedel, S., & Petroni, F. (2020). Autoregressive entity retrieval. arXiv preprint arXiv:2010.00904.
Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2019, June). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), Minneapolis, Minnesota.
DeVries, T., & Taylor, G. W. (2017). Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552.
Ding, B., Liu, L., Bing, L., Kruengkrai, C., Nguyen, T. H., Joty, S., Si, L., & Miao, C. (2020). DAGA: Data Augmentation with a Generation Approach for Low-resource Tagging Tasks. Association for Computational Linguistics. https://doi.org/10.18653/v1/2020.emnlp-main.488
FitzGerald, J., Hench, C., Peris, C., Mackie, S., Rottmann, K., Sanchez, A., Nash, A., Urbach, L., Kakarala, V., Singh, R., Ranganath, S., Crist, L., Britan, M., Leeuwis, W., Tur, G., & Natarajan, P. (2023). MASSIVE: A 1M-Example Multilingual Natural Language Understanding Dataset with 51 Typologically-Diverse Languages. Association for Computational Linguistics. https://doi.org/10.18653/v1/2023.acl-long.235
Ghiasi, G., Cui, Y., Srinivas, A., Qian, R., Lin, T.-Y., Cubuk, E. D., Le, Q. V., & Zoph, B. (2021). Simple copy-paste is a strong data augmentation method for instance segmentation. Proceedings of the IEEE/CVF conference on computer vision and pattern recognition,
Gu, Y., Han, X., Liu, Z., & Huang, M. (2022). PPT: Pre-trained Prompt Tuning for Few-shot Learning. Association for Computational Linguistics. https://doi.org/10.18653/v1/2022.acl-long.576
Hariharan, B., & Girshick, R. (2017). Low-shot visual recognition by shrinking and hallucinating features. Proceedings of the IEEE international conference on computer vision,
Houlsby, N., Giurgiu, A., Jastrzebski, S., Morrone, B., De Laroussilhe, Q., Gesmundo, A., Attariyan, M., & Gelly, S. (2019). Parameter-efficient transfer learning for NLP. International Conference on Machine Learning,
Jiang, Z., Xu, F. F., Araki, J., & Neubig, G. (2020). How Can We Know What Language Models Know? (Vol. 8). MIT Press. https://doi.org/10.1162/tacl_a_00324
Kobayashi, S. (2018). Contextual Augmentation: Data Augmentation by Words with Paradigmatic Relations. Association for Computational Linguistics. https://doi.org/10.18653/v1/N18-2072
Kumar, V., Glaude, H., de Lichy, C., & Campbell, W. (2019, November). A Closer Look At Feature Space Data Augmentation For Few-Shot Intent Classification. Proceedings of the 2nd Workshop on Deep Learning Approaches for Low-Resource NLP (DeepLo 2019), Hong Kong, China.
Lester, B., Al-Rfou, R., & Constant, N. (2021). The power of scale for parameter-efficient prompt tuning. arXiv preprint arXiv:2104.08691.
Li, X. L., & Liang, P. (2021). Prefix-Tuning: Optimizing Continuous Prompts for Generation. Association for Computational Linguistics. https://doi.org/10.18653/v1/2021.acl-long.353
Liu, X., Ji, K., Fu, Y., Tam, W., Du, Z., Yang, Z., & Tang, J. (2022). P-tuning: Prompt tuning can be comparable to fine-tuning across scales and tasks. Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers),
Liu, X., Zheng, Y., Du, Z., Ding, M., Qian, Y., Yang, Z., & Tang, J. (2021). GPT understands, too. arXiv preprint arXiv:2103.10385.
Ng, N., Cho, K., & Ghassemi, M. (2020). SSMBA: Self-supervised manifold based data augmentation for improving out-of-domain robustness. arXiv preprint arXiv:2009.10195.
Radford, A., Narasimhan, K., Salimans, T., & Sutskever, I. (2018). Improving language understanding by generative pre-training.
Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., & Sutskever, I. (2019). Language models are unsupervised multitask learners. OpenAI blog, 1(8), 9.
Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., & Liu, P. J. (2020). Exploring the limits of transfer learning with a unified text-to-text transformer. The Journal of Machine Learning Research, 21(1), 5485-5551.
Rose, S., Engel, D., Cramer, N., & Cowley, W. (2010). Automatic keyword extraction from individual documents. Text mining: applications and theory, 1-20.
Schick, T., & Schütze, H. (2021). Exploiting Cloze-Questions for Few-Shot Text Classification and Natural Language Inference. Association for Computational Linguistics. https://doi.org/10.18653/v1/2021.eacl-main.20
Schwartz, E., Karlinsky, L., Shtok, J., Harary, S., Marder, M., Kumar, A., Feris, R., Giryes, R., & Bronstein, A. (2018). Delta-encoder: an effective sample synthesis method for few-shot object recognition. Advances in neural information processing systems, 31.
Sellam, T., Das, D., & Parikh, A. (2020). BLEURT: Learning Robust Metrics for Text Generation. Association for Computational Linguistics. https://doi.org/10.18653/v1/2020.acl-main.704
Sennrich, R., Haddow, B., & Birch, A. (2016). Improving Neural Machine Translation Models with Monolingual Data. Association for Computational Linguistics. https://doi.org/10.18653/v1/P16-1009
Wang, Y., Xu, C., Sun, Q., Hu, H., Tao, C., Geng, X., & Jiang, D. (2022). PromDA: Prompt-based Data Augmentation for Low-Resource NLU Tasks. Association for Computational Linguistics. https://doi.org/10.18653/v1/2022.acl-long.292
Wei, J., & Zou, K. (2019). EDA: Easy data augmentation techniques for boosting performance on text classification tasks. arXiv preprint arXiv:1901.11196.
Xu, X., Wang, G., Kim, Y.-B., & Lee, S. (2021). AugNLG: Few-shot Natural Language Generation using Self-trained Data Augmentation. Association for Computational Linguistics. https://doi.org/10.18653/v1/2021.acl-long.95
Yang, Y., Malaviya, C., Fernandez, J., Swayamdipta, S., Le Bras, R., Wang, J.-P., Bhagavatula, C., Choi, Y., & Downey, D. (2020). Generative Data Augmentation for Commonsense Reasoning. Association for Computational Linguistics. https://doi.org/10.18653/v1/2020.findings-emnlp.90
Yun, S., Han, D., Oh, S. J., Chun, S., Choe, J., & Yoo, Y. (2019). Cutmix: Regularization strategy to train strong classifiers with localizable features. Proceedings of the IEEE/CVF international conference on computer vision,
Zhang, H., Cisse, M., Dauphin, Y. N., & Lopez-Paz, D. (2017). mixup: Beyond empirical risk minimization. arXiv preprint arXiv:1710.09412.
Zhang, R., Yu, Y., & Zhang, C. (2020). SeqMix: Augmenting Active Sequence Labeling via Sequence Mixup. Association for Computational Linguistics. https://doi.org/10.18653/v1/2020.emnlp-main.691
電子全文 Fulltext
This electronic full text is licensed to users only for personal, non-commercial searching, reading, and printing for the purpose of academic research. Please comply with the Copyright Act of the Republic of China (Taiwan); do not reproduce, distribute, adapt, repost, or broadcast it without authorization.
論文使用權限 Thesis access permission: user-defined availability date (自定論文開放時間)
開放時間 Available:
校內 Campus: available for download from 2025-07-26
校外 Off-campus: available for download from 2025-07-26


紙本論文 Printed copies
Information about the public availability of printed copies is relatively complete for theses from academic year 102 onward. For printed theses from academic year 101 or earlier, please contact the printed thesis service desk of the Office of Library and Information Services. We apologize for any inconvenience.
開放時間 Available: 2025-07-26
