博碩士論文 etd-0705121-172815 詳細資訊
Title page for etd-0705121-172815
論文名稱
Title
跨語言以知識庫為基礎的聊天機器人設計:以偏頭痛衛教機器人為例
A Cross-lingual Knowledge Based Chatbot Design: A Case of Migraine Education
系所名稱
Department
畢業學年期
Year, semester
語文別
Language
學位類別
Degree
頁數
Number of pages
51
研究生
Author
指導教授
Advisor
召集委員
Convenor
口試委員
Advisory Committee
口試日期
Date of Exam
2021-07-22
繳交日期
Date of Submission
2021-08-05
關鍵字
Keywords
跨語言空間、字彙轉換、聊天機器人、本體、詞嵌入
Cross-Lingual Word Space, Vocabulary Mapping, Chatbot, Ontology, Migraine
統計
Statistics
本論文已被瀏覽 614 次,被下載 130
This thesis has been viewed 614 times and downloaded 130 times.
中文摘要
現今引用 Ontology 在各領域開發的應用系統越來越多,開發者可導入網路上公開的 Ontology 到他們的開發系統中以縮短開發過程和時間。由於網路上公開的 Ontology 都是領域專家所建構且都以英語為主,特別是醫學領域,雖然現今已有許多公開的醫學 Ontology,但由於語言問題都不適合引用到中文系統裡,因此中文建構的應用系統無法直接引用專家所定義的領域邏輯概念。本論文提出結合 RASA 聊天機器人框架、醫學 Ontology、和跨語言空間來實現利用英語 Ontology 回答中文使用者問題,在語言空間我們首先利用 CBOW + Hierarchical Softmax 分別建構單語空間,再映射結合成跨語言空間實現轉換使用者輸入的中文常用字到英語專業字彙,從而自英語的 Ontology 獲取資料回應使用者。我們的實驗證明在建構 RASA NLU 模型,和 Ontology 能夠有效提供資訊給使用者。
Abstract
A growing number of application systems across many fields are built on ontologies. Developers can import publicly available ontologies into their systems to shorten development time. However, most public ontologies are built by domain experts in English, especially in medicine. Although many open medical ontologies exist, the language barrier makes them unsuitable for Chinese-language systems, so Chinese application developers cannot directly reuse these expert-defined domain concepts. This thesis proposes combining the RASA chatbot framework, a medical ontology, and a cross-lingual word space so that an English ontology can answer questions from Chinese-speaking users. For the language space, we first use CBOW + Hierarchical Softmax to build each monolingual space separately, and then map and combine the spaces into a cross-lingual space. Common Chinese words entered by the user are thereby converted to specialized English vocabulary, which is used to retrieve data from the English ontology and reply to the user. Our experiments show that the RASA NLU model we built, together with the ontology, can effectively provide information to users.
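The abstract does not say how the two monolingual spaces are mapped onto each other. As a rough illustration only, the sketch below assumes an orthogonal (Procrustes) mapping learned from a small seed dictionary, which is one common way to align word spaces. The vocabularies, the random toy vectors, and the helper names `learn_orthogonal_map` and `nearest_word` are all hypothetical and are not taken from the thesis.

```python
import numpy as np


def learn_orthogonal_map(src: np.ndarray, tgt: np.ndarray) -> np.ndarray:
    """Orthogonal Procrustes: find orthogonal W minimizing ||src @ W - tgt||_F.

    Row i of `src` and row i of `tgt` are the vectors of a seed translation pair.
    """
    u, _, vt = np.linalg.svd(src.T @ tgt)
    return u @ vt


def nearest_word(vec: np.ndarray, vocab: list[str], matrix: np.ndarray) -> str:
    """Return the word in `vocab` whose row in `matrix` is most cosine-similar to `vec`."""
    scores = (matrix @ vec) / (np.linalg.norm(matrix, axis=1) * np.linalg.norm(vec))
    return vocab[int(np.argmax(scores))]


# Hypothetical toy data: in practice these would be CBOW vectors trained
# separately on a Chinese corpus and an English corpus.
rng = np.random.default_rng(0)
zh_vocab = ["頭痛", "噁心", "藥物", "睡眠"]
en_vocab = ["headache", "nausea", "medication", "sleep"]
zh_vectors = rng.normal(size=(4, 3))

# Assumption for the demo: the English space is an exact rotation of the Chinese one.
rotation, _ = np.linalg.qr(rng.normal(size=(3, 3)))
en_vectors = zh_vectors @ rotation

# Learn the mapping from the seed pairs, then translate a user's Chinese word.
mapping = learn_orthogonal_map(zh_vectors, en_vectors)
translated = nearest_word(zh_vectors[0] @ mapping, en_vocab, en_vectors)
print(translated)
```

In a full pipeline, the English term returned by `nearest_word` would be the key used to query the English ontology. Real embedding spaces are only approximately related by a rotation, so nearest-neighbor retrieval would be noisier than in this toy setup.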
目次 Table of Contents
Thesis Approval Form i
Acknowledgements ii
Abstract (Chinese) iii
Abstract iv
Table of Contents v
CHAPTER 1 - Introduction 1
CHAPTER 2 - Related work 5
2.1 Ontology based question answering system 5
2.2 Ontology based chatbot 5
2.3 Entity Detection 6
2.4 Target vector retrieval 7
CHAPTER 3 - Methodology 8
3.1 Ontology - Migraine Patient Education 9
3.2 RASA 10
3.2.1 RASA - Components 11
3.2.2 RASA - Intent design 11
3.3 Cross-lingual embedding space 13
3.3.1 Preprocessing 14
3.3.2 Word Space Building 15
3.3.3 Continuous Bag-of-Words Model 15
3.3.4 Hierarchical Softmax 17
3.3.5 Cross-lingual word space 19
3.4 Response template 21
CHAPTER 4 - Experiments 23
4.1 Dataset description 23
4.1.1 Experimental Settings 26
4.1.2 Compared Methods for Monolingual and Cross-lingual Word Space 27
4.2 Experimental Results 27
4.2.1 RASA - Intent Evaluation 27
4.2.2 RASA - Entity Recognition Evaluation 29
4.2.3 Monolingual Word Space 30
4.2.4 Cross-Lingual Word Space 32
4.3 Chatbot Prototype Demonstration and Discussion 35
CHAPTER 5 - Conclusion and future work 39
References 41
參考文獻 References
Mikolov, T., Chen, K., Corrado, G., & Dean, J. (2013). Efficient Estimation of Word Representations in Vector Space. arXiv:1301.3781 [cs.CL]. https://arxiv.org/pdf/1301.3781.pdf

Rong, X. (2016). word2vec Parameter Learning Explained. arXiv:1411.2738 [cs.CL]. https://arxiv.org/pdf/1411.2738.pdf

Mohammed, A. A., & Umaashankar, V. (2018). Effectiveness of Hierarchical Softmax in Large Scale Classification Tasks. arXiv:1812.05737 [cs.LG]. https://arxiv.org/pdf/1812.05737.pdf

Huang, X., Zhang, J., Li, D., & Li, P. (2019). Knowledge Graph Embedding Based Question Answering. Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining, pp. 105–113. https://doi.org/10.1145/3289600.3290956

Saxena, A., Tripathi, A., & Talukdar, P. (2020). Improving Multi-hop Question Answering over Knowledge Graphs using Knowledge Base Embeddings. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. https://aclanthology.org/2020.acl-main.412/

Abdi, A., Idris, N., & Ahmad, Z. (2018). QAPD: an ontology-based question answering system in the physics domain. Soft Computing 22, pp. 213–230. https://link.springer.com/article/10.1007/s00500-016-2328-2

Asiaee, A. H., Minning, T., Doshi, P., & Tarleton, R. L. (2015). A framework for ontology-based question answering with application to parasite immunology. Journal of Biomedical Semantics 6. https://jbiomedsem.biomedcentral.com/articles/10.1186/s13326-015-0029-x

Al-Zubaide, H., & Issa, A. A. (2011). OntBot: Ontology based chatbot. IEEE, 09 February 2012. https://ieeexplore.ieee.org/document/6149594

Zheng, W., Yu, J. X., Zou, L., & Cheng, H. (2018). Question answering over knowledge graphs: question understanding via template decomposition. Proceedings of the VLDB Endowment 11. http://www.vldb.org/pvldb/vol11/p1373-zheng.pdf

Collier, N., Nobata, C., & Tsujii, J. (2000). Extracting the Names of Genes and Gene Products with a Hidden Markov Model. COLING 2000. https://aclanthology.org/C00-1030/

Vegesna, A., Jain, P., & Porwal, D. (2018). Ontology based Chatbot (For E-commerce Website). International Journal of Computer Applications 179(14):51-55. https://www.ijcaonline.org/archives/volume179/number14/vegesna-2018-ijca-916215.pdf


Conneau, A., Lample, G., Ranzato, M., Denoyer, L., & Jégou, H. (2018). Word Translation Without Parallel Data. ICLR 2018, arXiv:1710.04087 [cs.CL]. https://arxiv.org/abs/1710.04087

Schriml, L. M., Arze, C., Nadendla, S., et al. (2012). Disease Ontology: a backbone for disease semantic integration. Nucleic Acids Research 40(Database issue): D940–D946. https://pubmed.ncbi.nlm.nih.gov/22080554/

Finkel, J., Dingare, S., Nguyen, H., Nissim, M., Manning, C., & Sinclair, G. (2004). Exploiting Context for Biomedical Entity Recognition: From Syntax to the Web. Proceedings of the International Joint Workshop on Natural Language Processing in Biomedicine and Its Applications, Association for Computational Linguistics, pp. 88–91. https://aclanthology.org/W04-1217.pdf

Asahara, M., & Matsumoto, Y. (2003). Japanese named entity extraction with redundant morphological analysis. Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology, Volume 1, Association for Computational Linguistics, pp. 8–15. https://aclanthology.org/N03-1002.pdf

McCallum, A., & Li, W. (2003). Early results for named entity recognition with conditional random fields, feature induction and web-enhanced lexicons. Proceedings of the Seventh Conference on Natural Language Learning at HLT-NAACL 2003, Volume 4, Association for Computational Linguistics, pp. 188–191. https://aclanthology.org/W03-0430.pdf

Bennett, M. (2013). The financial industry business ontology: Best practice for big data. Journal of Banking Regulation 14(3–4). https://www.researchgate.net/publication/263328952_The_financial_industry_business_ontology_Best_practice_for_big_data

Miller, G. A. (1995). WordNet: a lexical database for English. Communications of the ACM 38(11), pp. 39–41. https://doi.org/10.1145/219717.219748

Wang, S.-J., Fuh, J.-L., Young, Y.-H., & Lu, S.-R. (2000). Prevalence of Migraine in Taipei, Taiwan: A Population-Based Survey. Cephalalgia 20(6):566–572. https://journals.sagepub.com/doi/10.1046/j.1468-2982.2000.00085.x
電子全文 Fulltext
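This electronic full text is licensed to users solely for personal, non-profit searching, reading, and printing for academic research purposes. Please comply with the Copyright Act of the Republic of China. Do not reproduce, distribute, adapt, repost, or broadcast it without authorization, to avoid breaking the law.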
論文使用權限 Thesis access permission:校內校外完全公開 unrestricted
開放時間 Available:
校內 Campus: 已公開 available
校外 Off-campus: 已公開 available


紙本論文 Printed copies
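Availability information for printed theses is relatively complete from academic year 102 (ROC calendar) onward. To look up availability information for printed theses from academic year 101 or earlier, please contact the printed thesis service desk of the Office of Library and Information Services. We apologize for any inconvenience.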
開放時間 available 已公開 available
