Title page for etd-0710117-150036
Title: A Deep Learning Based Natural Language Conversation System: Studies on Financial Datasets (以深度學習為基礎的自然語言交談系統:以金融領域為例)
Department:
Year, semester:
Language:
Degree:
Number of pages: 64
Author:
Advisor:
Convenor:
Advisory Committee:
Date of Exam: 2017-06-22
Date of Submission: 2017-08-16
Keywords: Natural Language Processing (NLP), Deep Learning, Conversation System, Question Answering System, Recurrent Neural Network (RNN), Long Short-Term Memory (LSTM)
統計 Statistics
The thesis has been browsed 5953 times and downloaded 35 times.
中文摘要 Chinese Abstract
In recent years, advances in deep learning have improved data prediction, classification, and recognition, and long short-term memory (LSTM), a variant of the recurrent neural network, has been shown to perform well on natural language processing tasks. This study aims to build a natural language conversation system with deep learning models, training them on question-answer data so that the system can converse with users and solve their problems, and to let the system grow through reasoning and knowledge-base expansion.
This study combines word2vec, deep learning models, and an external knowledge base into a complete natural language conversation system. Financial question-answer pairs are collected from the StackExchange service and the text is converted into numeric vectors; word2vec is then used to learn a network of word relations that is embedded into the deep learning model, improving the training of the prediction model. Using DBpedia as an external knowledge base lets the system answer questions that were not originally in its knowledge base and further expand that knowledge base, so the system can acquire more knowledge through conversation.
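The text-to-vector step described above can be illustrated with a minimal sketch: tokenize each question or answer, map every word to an integer index, and pad to a fixed length so the indices can later look up rows of a word2vec-style embedding matrix. All names and the toy vocabulary here are illustrative assumptions; the thesis does not publish its code.

```python
# Sketch of converting QA text into fixed-length integer vectors,
# the form consumed by an embedding layer of a deep learning model.

def build_vocab(sentences):
    """Assign a unique index to every word; 0 is reserved for padding/unknown."""
    vocab = {"<pad>": 0}
    for sent in sentences:
        for word in sent.lower().split():
            vocab.setdefault(word, len(vocab))
    return vocab

def encode(sentence, vocab, max_len=8):
    """Turn a sentence into a fixed-length vector of word indices."""
    ids = [vocab.get(w, 0) for w in sentence.lower().split()]
    return (ids + [0] * max_len)[:max_len]

questions = ["What is compound interest", "How do index funds work"]
vocab = build_vocab(questions)
vec = encode("What is an index fund", vocab)
```

Out-of-vocabulary words (here "an" and "fund") fall back to index 0; in a real pipeline they would be handled by the word2vec vocabulary built from the full corpus.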
The experiments examine how the strategy for selecting wrong answers during training, the choice of question-answer dataset, the model structure, and the degree of data coverage all substantially affect the model's predictive ability. Exposing the model to more wrong answers during training yields better predictions. The external knowledge base was found to have some usage restrictions, but when those restrictions are met the system can answer questions and expand the knowledge base successfully. A question-similarity model successfully implements the reasoning function: upon receiving a question, the system first finds a similar question in the knowledge base and then looks for its answer.
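The wrong-answer idea above can be sketched as negative sampling with a max-margin objective: for each (question, correct answer) pair, draw k wrong answers and penalize any wrong answer whose similarity to the question comes within a margin of the correct answer's. The cosine similarity, the margin value, and all function names are illustrative assumptions, not the thesis's exact model.

```python
# Sketch of the "more wrong answers" training signal: a hinge loss over
# k sampled negative answers per question.
import random

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = sum(a * a for a in u) ** 0.5
    nv = sum(b * b for b in v) ** 0.5
    return dot / (nu * nv) if nu and nv else 0.0

def hinge_loss(q_vec, pos_vec, neg_vecs, margin=0.2):
    """Sum of margin violations over the sampled wrong answers."""
    pos = cosine(q_vec, pos_vec)
    return sum(max(0.0, margin - pos + cosine(q_vec, n)) for n in neg_vecs)

def sample_negatives(answers, correct_idx, k, rng=random):
    """One possible strategy: draw k answers other than the correct one."""
    pool = [a for i, a in enumerate(answers) if i != correct_idx]
    return rng.sample(pool, min(k, len(pool)))
```

Increasing k exposes the model to more wrong answers per question, which is the knob the experiments vary.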
Abstract
Over the last few years, many studies have shown that deep learning models can substantially improve data prediction, classification, and identification. Recurrent neural networks (RNNs) and long short-term memory (LSTM) models have also been shown to achieve good results in natural language processing. Following these successes, this study uses deep learning to develop a natural language conversation system. The system is trained on question-answering data so that it can converse with users and solve their problems, and it uses reasoning techniques and knowledge-base expansion to enrich the human-machine dialogue.
In this thesis, I combined several computational methods, including word2vec, deep learning models, and external knowledge transfer, to construct a complete natural language conversation system. I conducted extensive experiments with different strategies for choosing wrong answers during model training, different question-answer datasets, different model structures, and different levels of data coverage to investigate their influence on predictive ability. The results show that the more wrong answers used in training, the better the model's predictive ability. I also found that an external knowledge base benefited the system, subject to some restrictions; with extra effort to meet those restrictions, the knowledge base could be expanded successfully. Additionally, with a model of similar questions, the system can perform reasoning by looking for a similar known question and retrieving its corresponding answer from the knowledge base in response to the user's query.
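The reasoning step described above can be sketched as similarity-based retrieval: given a new query, find the most similar stored question and return that question's answer. Here word-overlap (Jaccard) similarity stands in for the thesis's question-similarity model, and the toy knowledge base and threshold are illustrative assumptions.

```python
# Sketch of reasoning via a similar-question lookup in the knowledge base.

def jaccard(a, b):
    """Word-overlap similarity between two sentences."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def answer_by_similar_question(query, knowledge_base, threshold=0.3):
    """Return the answer of the most similar known question, if any."""
    best_q = max(knowledge_base, key=lambda q: jaccard(query, q))
    if jaccard(query, best_q) < threshold:
        return None  # no similar question: fall back to the external KB
    return knowledge_base[best_q]

kb = {
    "what is an index fund": "A fund tracking a market index.",
    "how does compound interest work": "Interest earned on interest.",
}
ans = answer_by_similar_question("what is a stock index fund", kb)
```

When no stored question clears the threshold, the system would consult the external knowledge base (DBpedia in the thesis) and add the new question-answer pair, which is the knowledge-base expansion loop.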
目次 Table of Contents
Thesis Certification i
Acknowledgements ii
Chinese Abstract iii
Abstract iv
Chapter 1 Introduction 1
1.1. Research Background 1
1.2. Research Motivation and Objectives 2
Chapter 2 Literature Review 4
2.1. Question Answering Systems 4
2.2. Deep Learning 5
2.2.1. Recurrent Neural Networks (RNNs) 6
2.2.2. Long Short-Term Memory (LSTM) 6
2.3. Question Answering Systems Based on Deep Learning 7
Chapter 3 Research Methods and Procedures 8
3.1. System Architecture 8
3.2. Natural Language Processing and QA Model Construction 9
3.2.1. Deep Learning Models 9
3.2.2. Research Process and Steps 10
3.3. Reasoning and Knowledge-Base Expansion 14
3.3.1. Reasoning 14
3.3.2. Knowledge-Base Expansion 15
3.4. Dialogue Processing 16
Chapter 4 Results 19
4.1. Data Preparation and Text Processing 19
4.1.1. Financial QA Dataset 19
4.1.2. Insurance QA Dataset 21
4.1.3. Similar-Question Dataset 22
4.2. Shuffling the Training Data 23
4.3. Effects of Wrong-Answer Selection Strategies 24
4.3.1. Financial Data: Embedding Model Results 24
4.3.2. Financial Data: Convolutional LSTM Model Results 27
4.3.3. Wrong-Question Coverage of Each Strategy 31
4.3.4. Insurance Data: Convolutional LSTM Model Results 32
4.4. Data Coverage 35
4.4.1. Embedding Model Results 36
4.4.2. Convolutional LSTM Model Results 38
4.5. External Knowledge Base: DBpedia 41
4.6. Reasoning: Question Similarity Model 42
4.7. General Discussion 44
Chapter 5 Conclusions and Future Work 46
5.1. Conclusions 46
5.2. Future Work 46
References 48
參考文獻 References
Bengio, Y. (2009). Learning Deep Architectures for AI. Foundations and Trends® in Machine Learning, 2(1), 1–127.
Bengio, Y., Courville, A., & Vincent, P. (2012). Representation Learning: A Review and New Perspectives. arXiv:1206.5538 [Cs]. Retrieved from http://arxiv.org/abs/1206.5538
Bengio, Y., Simard, P., & Frasconi, P. (1994). Learning long-term dependencies with gradient descent is difficult. IEEE Transactions on Neural Networks, 5(2), 157–166.
Bordes, A., Chopra, S., & Weston, J. (2014). Question Answering with Subgraph Embeddings. arXiv:1406.3676 [Cs]. Retrieved from http://arxiv.org/abs/1406.3676
Collobert, R., Weston, J., Bottou, L., Karlen, M., Kavukcuoglu, K., & Kuksa, P. (2011). Natural Language Processing (Almost) from Scratch. J. Mach. Learn. Res., 12, 2493–2537.
Deng, L. (2014). Deep Learning: Methods and Applications. Foundations and Trends® in Signal Processing, 7(3–4), 197–387.
Graves, A. (2013). Generating Sequences With Recurrent Neural Networks. arXiv:1308.0850 [Cs]. Retrieved from http://arxiv.org/abs/1308.0850
Graves, A., & Jaitly, N. (2014). Towards End-To-End Speech Recognition with Recurrent Neural Networks (pp. 1764–1772). Presented at the Proceedings of the 31st International Conference on Machine Learning (ICML-14). Retrieved from http://machinelearning.wustl.edu/mlpapers/papers/icml2014c2_graves14
Graves, A., Liwicki, M., Fernández, S., Bertolami, R., Bunke, H., & Schmidhuber, J. (2009). A Novel Connectionist System for Unconstrained Handwriting Recognition. IEEE Trans. Pattern Anal. Mach. Intell., 31(5), 855–868.
Graves, A., Mohamed, A., & Hinton, G. (2013). Speech Recognition with Deep Recurrent Neural Networks. arXiv:1303.5778 [Cs]. Retrieved from http://arxiv.org/abs/1303.5778
Green, B. F., Jr., Wolf, A. K., Chomsky, C., & Laughery, K. (1961). Baseball: An Automatic Question-answerer. In Papers Presented at the May 9-11, 1961, Western Joint IRE-AIEE-ACM Computer Conference (pp. 219–224). New York, NY, USA: ACM.
Green, B., Wolf, A., Chomsky, C., & Laughery, K. (1986). Readings in Natural Language Processing. In B. J. Grosz, K. Sparck-Jones, & B. L. Webber (Eds.) (pp. 545–549). San Francisco, CA, USA: Morgan Kaufmann Publishers Inc. Retrieved from http://dl.acm.org/citation.cfm?id=21922.24354
Han, L., Yu, Z.-T., Qiu, Y.-X., Meng, X.-Y., Guo, J.-Y., & Si, S.-T. (2008). Research on passage retrieval using domain knowledge in Chinese question answering system. In 2008 International Conference on Machine Learning and Cybernetics (Vol. 5, pp. 2603–2606).
Hao, X., Chang, X., & Liu, K. (2007). A Rule-based Chinese Question Answering System for Reading Comprehension Tests. In Third International Conference on Intelligent Information Hiding and Multimedia Signal Processing, 2007. IIHMSP 2007 (Vol. 2, pp. 325–329).
Hihi, S. E., & Bengio, Y. (1996). Hierarchical Recurrent Neural Networks for Long-Term Dependencies. In D. S. Touretzky & M. E. Hasselmo (Eds.), Advances in Neural Information Processing Systems 8 (pp. 493–499). MIT Press. Retrieved from http://papers.nips.cc/paper/1102-hierarchical-recurrent-neural-networks-for-long-term-dependencies.pdf
Hochreiter, S., & Schmidhuber, J. (1997). Long Short-Term Memory. Neural Computation, 9(8), 1735–1780.
Hu, B., Lu, Z., Li, H., & Chen, Q. (2014). Convolutional Neural Network Architectures for Matching Natural Language Sentences. In Z. Ghahramani, M. Welling, C. Cortes, N. D. Lawrence, & K. Q. Weinberger (Eds.), Advances in Neural Information Processing Systems 27 (pp. 2042–2050). Curran Associates, Inc. Retrieved from http://papers.nips.cc/paper/5550-convolutional-neural-network-architectures-for-matching-natural-language-sentences.pdf
Huang, J., Zhou, M., & Yang, D. (2007). Extracting Chatbot Knowledge from Online Discussion Forums. In Proceedings of the 20th International Joint Conference on Artifical Intelligence (pp. 423–428). San Francisco, CA, USA: Morgan Kaufmann Publishers Inc. Retrieved from http://dl.acm.org/citation.cfm?id=1625275.1625342
Ittycheriah, A., Franz, M., Zhu, W., Ratnaparkhi, A., & Mammone, R. J. (2001). IBM’s Statistical Question Answering System. ResearchGate. Retrieved from https://www.researchgate.net/publication/2875435_IBM’s_Statistical_Question_Answering_System
Jean, S., Cho, K., Memisevic, R., & Bengio, Y. (2014). On Using Very Large Target Vocabulary for Neural Machine Translation. arXiv:1412.2007 [Cs]. Retrieved from http://arxiv.org/abs/1412.2007
Kiros, R., Salakhutdinov, R., & Zemel, R. S. (2014). Unifying Visual-Semantic Embeddings with Multimodal Neural Language Models. arXiv:1411.2539 [Cs]. Retrieved from http://arxiv.org/abs/1411.2539
Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet Classification with Deep Convolutional Neural Networks. In F. Pereira, C. J. C. Burges, L. Bottou, & K. Q. Weinberger (Eds.), Advances in Neural Information Processing Systems 25 (pp. 1097–1105). Curran Associates, Inc. Retrieved from http://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf
Meng, Y., Rumshisky, A., & Romanov, A. (2017). Temporal Information Extraction for Question Answering Using Syntactic Dependencies in an LSTM-based Architecture. arXiv:1703.05851 [Cs]. Retrieved from http://arxiv.org/abs/1703.05851
Mikolov, T., Deoras, A., Povey, D., Burget, L., & Černocký, J. (2011). Strategies for training large scale neural network language models. In 2011 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU) (pp. 196–201).
Mikolov, T., Sutskever, I., Chen, K., Corrado, G., & Dean, J. (2013). Distributed Representations of Words and Phrases and their Compositionality. arXiv:1310.4546 [Cs, Stat]. Retrieved from http://arxiv.org/abs/1310.4546
Nguyen, M.-T., Phan, V.-A., Nguyen, T.-S., & Nguyen, M.-L. (2016). Learning to rank questions for community question answering with ranking svm. arXiv Preprint arXiv:1608.04185. Retrieved from https://arxiv.org/abs/1608.04185
Pascanu, R., Mikolov, T., & Bengio, Y. (2012). On the difficulty of training Recurrent Neural Networks. arXiv:1211.5063 [Cs]. Retrieved from http://arxiv.org/abs/1211.5063
Ravichandran, D., & Hovy, E. (2002). Learning Surface Text Patterns for a Question Answering System. In Proceedings of the 40th Annual Meeting on Association for Computational Linguistics (pp. 41–47). Stroudsburg, PA, USA: Association for Computational Linguistics.
Riloff, E., & Thelen, M. (2000). A Rule-based Question Answering System for Reading Comprehension Tests. In Proceedings of the 2000 ANLP/NAACL Workshop on Reading Comprehension Tests As Evaluation for Computer-based Language Understanding Systems - Volume 6 (pp. 13–19). Stroudsburg, PA, USA: Association for Computational Linguistics.
Rumelhart, D. E., Hinton, G. E., & Williams, R. J. (1986). Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Vol. 1. In D. E. Rumelhart, J. L. McClelland, & C. PDP Research Group (Eds.) (pp. 318–362). Cambridge, MA, USA: MIT Press. Retrieved from http://dl.acm.org/citation.cfm?id=104279.104293
Sainath, T. N., Mohamed, A.-R., Kingsbury, B., & Ramabhadran, B. (2013). Deep convolutional neural networks for LVCSR. In 2013 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 8614–8618).
Schmidhuber, J. (2015). Deep Learning in Neural Networks: An Overview. Neural Networks, 61, 85–117.
Song, H. A., & Lee, S.-Y. (2013). Hierarchical Representation Using NMF. In Neural Information Processing (ICONIP 2013) (pp. 466–473). Springer Berlin Heidelberg.
SPARQL Query Language for RDF. (n.d.). Retrieved November 19, 2016, from https://www.w3.org/TR/rdf-sparql-query/
Sutskever, I. (2013). Training recurrent neural networks. University of Toronto. Retrieved from https://www.cs.utoronto.ca/~ilya/pubs/ilya_sutskever_phd_thesis.pdf
Sutskever, I., Martens, J., & Hinton, G. E. (2011). Generating Text with Recurrent Neural Networks. In Proceedings of the 28th International Conference on Machine Learning (ICML-11) (pp. 1017–1024). Retrieved from https://www.researchgate.net/publication/221345823_Generating_Text_with_Recurrent_Neural_Networks
Sutskever, I., Vinyals, O., & Le, Q. V. (2014). Sequence to Sequence Learning with Neural Networks. In Z. Ghahramani, M. Welling, C. Cortes, N. D. Lawrence, & K. Q. Weinberger (Eds.), Advances in Neural Information Processing Systems 27 (pp. 3104–3112). Curran Associates, Inc. Retrieved from http://papers.nips.cc/paper/5346-sequence-to-sequence-learning-with-neural-networks.pdf
Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., … Rabinovich, A. (2014). Going Deeper with Convolutions. arXiv:1409.4842 [Cs]. Retrieved from http://arxiv.org/abs/1409.4842
Unger, C., Bühmann, L., Lehmann, J., Ngonga Ngomo, A.-C., Gerber, D., & Cimiano, P. (2012). Template-based Question Answering over RDF Data. In Proceedings of the 21st International Conference on World Wide Web (pp. 639–648). New York, NY, USA: ACM.
Wang, B., Liu, K., & Zhao, J. (2016). Inner attention based recurrent neural networks for answer selection. In The Annual Meeting of the Association for Computational Linguistics. Retrieved from http://www.aclweb.org/anthology/P/P16/P16-1122.pdf
Woods, W. A. (1973). Progress in Natural Language Understanding: An Application to Lunar Geology. In Proceedings of the June 4-8, 1973, National Computer Conference and Exposition (pp. 441–450). New York, NY, USA: ACM.
Yih, W., Chang, M.-W., Meek, C., & Pastusiak, A. (2013). Question Answering Using Enhanced Lexical Semantic Models. Microsoft Research. Retrieved from https://www.microsoft.com/en-us/research/publication/question-answering-using-enhanced-lexical-semantic-models/
Yu, L., Hermann, K. M., Blunsom, P., & Pulman, S. (2014). Deep Learning for Answer Sentence Selection. arXiv:1412.1632 [Cs]. Retrieved from http://arxiv.org/abs/1412.1632
Zhang, K., & Zhao, J. (2010). A Chinese question-answering system with question classification and answer clustering. In 2010 Seventh International Conference on Fuzzy Systems and Knowledge Discovery (FSKD) (Vol. 6, pp. 2692–2696).
電子全文 Fulltext
The electronic full text is licensed for personal, non-profit retrieval, reading, and printing by users for academic research purposes only. Please comply with the relevant provisions of the Copyright Act of the Republic of China (Taiwan); do not reproduce, distribute, adapt, repost, or broadcast it, to avoid violating the law.
論文使用權限 Thesis access permission: user-defined release date
開放時間 Available:
Campus: available
Off-campus: available


紙本論文 Printed copies
Availability information for printed copies is relatively complete from academic year 102 (2013–2014) onward. To inquire about the availability of printed copies from academic year 101 or earlier, please contact the printed thesis service desk of the Office of Library and Information Services. We apologize for any inconvenience.
Available: available
