論文使用權限 Thesis access permission:自定論文開放時間 user defined
開放時間 Available:
校內 Campus: 已公開 available
校外 Off-campus: 已公開 available
論文名稱 Title | 以深度學習為基礎的自然語言交談系統:以金融領域為例 A Deep Learning Based Natural Language Conversation System: Studies on Financial Datasets
系所名稱 Department |
畢業學年期 Year, semester |
語文別 Language |
學位類別 Degree |
頁數 Number of pages | 64
研究生 Author |
指導教授 Advisor |
召集委員 Convenor |
口試委員 Advisory Committee |
口試日期 Date of Exam | 2017-06-22
繳交日期 Date of Submission | 2017-08-16
關鍵字 Keywords | Long Short-Term Memory (LSTM), Recurrent Neural Network (RNN), Question Answering System, Conversation System, Deep Learning, Natural Language Processing (NLP)
統計 Statistics | The thesis/dissertation has been browsed 5953 times and downloaded 35 times.
中文摘要 Chinese Abstract |
In recent years, advances in deep learning have improved data prediction, classification, and recognition, and long short-term memory (LSTM), a variant of recurrent neural networks, has been shown to perform well on natural language processing tasks. This study uses a deep learning model to build a natural language conversation system: the model is trained on question-answer data so that the system can converse with users and solve their problems, and the system grows through reasoning and knowledge-base expansion.

This study combines word2vec, deep learning models, and an external knowledge base into a complete natural language conversation system. Financial question-answer pairs were collected from the StackExchange service; the text was converted into numeric vectors, and word2vec was used to train a relation network that is embedded into the deep learning model to improve predictive training. Using DBpedia as an external knowledge base allows the system to answer questions that were not originally in the knowledge base and to expand the knowledge base further, so the system gains knowledge through the course of conversation.

The experiments examine how different strategies for selecting wrong answers during training, different question-answer datasets, different model structures, and data coverage each affect the model's predictive ability. Exposure to more wrong answers during training yields better predictive ability. Some usage limitations of the external knowledge base were also found; when its conditions are met, the system can answer questions and expand the knowledge base successfully. A question-similarity model successfully implements the reasoning function, allowing the system, upon receiving a question, to first find a similar question in the knowledge base and then locate its answer.
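The DBpedia lookup described above can be sketched as a SPARQL query against DBpedia's public endpoint. This is a minimal illustration, not the thesis's actual query templates: the entity `Bond_(finance)` and the use of the `dbo:abstract` predicate are assumptions for the sake of the example.

```python
DBPEDIA_ENDPOINT = "https://dbpedia.org/sparql"

def build_dbpedia_abstract_query(entity: str, lang: str = "en") -> str:
    """Build a SPARQL query that fetches the abstract of one DBpedia
    resource (e.g. entity="Bond_(finance)").  A full IRI is used for
    the resource so characters such as parentheses need no
    prefixed-name escaping."""
    return (
        "PREFIX dbo: <http://dbpedia.org/ontology/>\n"
        "SELECT ?abstract WHERE {\n"
        f"  <http://dbpedia.org/resource/{entity}> dbo:abstract ?abstract .\n"
        f'  FILTER (lang(?abstract) = "{lang}")\n'
        "}"
    )

# The query string can be sent to DBPEDIA_ENDPOINT (for example with the
# SPARQLWrapper package); the returned abstract text can then be stored
# as a new entry when expanding the knowledge base.
query = build_dbpedia_abstract_query("Bond_(finance)")
```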
Abstract |
Over the last few years, many studies have shown that deep learning models can substantially improve data prediction, classification, and identification. Recurrent neural networks (RNNs) and long short-term memory (LSTM) models have also been shown to perform well in natural language processing. Following these successes, this study uses deep learning to develop a natural language conversation system. The system is trained on question-answering data so that it can converse with users and solve their problems, and it uses reasoning techniques and knowledge-base expansion to enrich the human-machine dialogue.

In this thesis, I employed several computational methods, including word2vec, deep learning models, and external knowledge transfer, to construct a complete natural language conversation system. I also conducted extensive experiments with different strategies for choosing wrong answers during model training, different question-answer datasets, different model structures, and different degrees of data coverage, to investigate their influence on the model's predictive ability.

The results show that the more wrong answers are used in the training process, the better the predictive ability of the model. An external knowledge base was beneficial to the system, although it imposed some restrictions; with extra effort to satisfy these restrictions, the knowledge base could be expanded successfully. Additionally, by building a model of similar questions, the system can perform reasoning: in response to a user's query, it looks for a similar question and retrieves the corresponding answer from the knowledge base.
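The wrong-answer training objective described above can be illustrated with a minimal sketch: answers are ranked by cosine similarity to the question, and a max-margin loss adds one ranking constraint per sampled wrong answer, which is why sampling more wrong answers constrains the model more. The bag-of-words encoder below is a deliberately simplified stand-in for the thesis's word2vec/LSTM encoders.

```python
import math
from collections import Counter

def bow(text):
    """Bag-of-words vector; a stand-in for a word2vec/LSTM encoder."""
    return Counter(text.lower().split())

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[w] * v[w] for w in u if w in v)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def margin_loss(question, good, wrongs, margin=0.2):
    """Max-margin loss: each sampled wrong answer must score at least
    `margin` below the correct answer, so every additional wrong answer
    contributes one more ranking constraint for this question."""
    q = bow(question)
    pos = cosine(q, bow(good))
    return sum(max(0.0, margin - pos + cosine(q, bow(w))) for w in wrongs)

def rank_answers(question, candidates):
    """Return candidate answers ordered by similarity to the question."""
    q = bow(question)
    return sorted(candidates, key=lambda a: cosine(q, bow(a)), reverse=True)
```

With a trained encoder in place of `bow`, answer selection reduces to calling `rank_answers` and returning the top candidate.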
目次 Table of Contents |
Thesis Approval
Acknowledgements
Abstract (Chinese)
Abstract (English)
Chapter 1. Introduction
  1.1 Research Background
  1.2 Research Motivation and Objectives
Chapter 2. Literature Review
  2.1 Question Answering Systems
  2.2 Deep Learning
    2.2.1 Recurrent Neural Networks (RNNs)
    2.2.2 Long Short-Term Memory (LSTM)
  2.3 Question Answering Systems Based on Deep Learning
Chapter 3. Research Methods and Procedures
  3.1 System Architecture
  3.2 Natural Language Processing and QA Model Construction
    3.2.1 Deep Learning Models
    3.2.2 Research Process and Steps
  3.3 Reasoning and Knowledge Base Expansion
    3.3.1 Reasoning
    3.3.2 Expanding the Knowledge Base
  3.4 Dialogue Processing
Chapter 4. Results
  4.1 Data Preparation and Text Processing
    4.1.1 Financial QA Dataset
    4.1.2 Insurance QA Dataset
    4.1.3 Similar Question Dataset
  4.2 Shuffling the Training Data
  4.3 Effects of Wrong-Answer Selection Strategies
    4.3.1 Financial Data: Embedding Model Results
    4.3.2 Financial Data: Convolutional LSTM Model Results
    4.3.3 Wrong-Question Coverage of Each Strategy
    4.3.4 Insurance Data: Convolutional LSTM Model Results
  4.4 Data Coverage Issues
    4.4.1 Embedding Model Results
    4.4.2 Convolutional LSTM Model Results
  4.5 External Knowledge Base: DBpedia
  4.6 Reasoning: Question Similarity Model
  4.7 General Discussion
Chapter 5. Conclusions and Future Work
  5.1 Conclusions
  5.2 Future Work
References
電子全文 Fulltext |
This electronic full text is licensed only for personal, non-commercial academic research: searching, reading, and printing. Please observe the relevant provisions of the Copyright Act of the Republic of China; do not reproduce, distribute, adapt, repost, or broadcast it, to avoid infringement.
紙本論文 Printed copies |
Public-access information for printed theses is relatively complete from academic year 102 (2013–2014) onward. To look up access information for printed theses from academic year 101 or earlier, please contact the printed thesis service counter of the Office of Library and Information Services. We apologize for any inconvenience. 開放時間 Available: 已公開 available