Master's/Doctoral Thesis etd-0728119-135427: Detailed Record



Name  黃國忠 (Kuo-Chung Huang)    E-mail  Not publicly available
Department  Department of Information Management
Degree  Master    Graduation  Academic year 107, semester 2 (spring 2019)
Title (Chinese)  基於深度卷積遞歸神經網路的文本分類
Title (English)  Document Classification Based on Deep Convolutional Recurrent Neural Networks
Files
  • etd-0728119-135427.pdf
  • This electronic full text is licensed for personal, non-profit academic research only: searching, reading, and printing.
    Please comply with the Copyright Act of the Republic of China; do not reproduce, distribute, adapt, repost, or broadcast it without authorization.
    Thesis Availability

    Print copy: publicly available immediately

    Electronic copy: fully open to on-campus and off-campus access

    Language / Pages  Chinese / 69
    Statistics  This thesis has been viewed 5,655 times and downloaded 125 times.
    Abstract (Chinese)  With the rise of deep learning, this thesis uses the 20 Newsgroups dataset to run computer experiments with machine learning and deep learning techniques under the scikit-learn and Keras frameworks. Different machine learning and deep learning approaches yield different text classification accuracies; with deep learning, accuracy reaches as high as 96%.
    For machine learning, we perform text classification and measure accuracy with models such as the Naive Bayes classifier, SVM (Support Vector Machine) classifier, Logistic Regression classifier, and Random Forest classifier.
    For deep learning, we build neural networks: a fully connected neural network, LSTM (Long Short-Term Memory), BiLSTM (Bidirectional LSTM), CNN (Convolutional Neural Network), CNN_LSTM, LSTM_CNN, CNN_BiLSTM, and BiLSTM_CNN models. We embed pre-trained word vectors, built with GloVe (glove.6B), Word2vec, and fastText, into the networks, then perform text classification and measure accuracy.
    We find that neural network models achieve excellent text classification, especially the two mainstream architectures: the convolutional neural network (CNN) and the recurrent neural network with memory (LSTM). Combining the strengths of these two architectures, we propose a deep convolutional recurrent neural network model for text classification.
    A CNN extracts higher-level phrases, which are fed into an LSTM to obtain a global sentence representation; alternatively, an LSTM first obtains the global sentence representation, which is then fed into a CNN to extract higher-level phrases. We evaluate the proposed architectures on text classification tasks.
    Experiments show that the deep convolutional recurrent network with bidirectional memory (CNN_BiLSTM) outperforms the other models, with accuracy close to 96%.
    We also derive a recurrence relation between the feature map and max-pooling output sizes.
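
    For readers who want to reproduce the machine-learning baselines above, here is a minimal sketch, not the thesis's own code: one of the compared classifiers as a TF-IDF plus Naive Bayes pipeline on 20 Newsgroups in scikit-learn, with all parameter choices left at assumed defaults.

    from sklearn.datasets import fetch_20newsgroups
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.naive_bayes import MultinomialNB
    from sklearn.pipeline import make_pipeline
    from sklearn.metrics import accuracy_score

    # Fetch the standard train/test split of 20 Newsgroups.
    train = fetch_20newsgroups(subset="train")
    test = fetch_20newsgroups(subset="test")

    # TF-IDF document-term matrix feeding a multinomial Naive Bayes classifier.
    model = make_pipeline(TfidfVectorizer(), MultinomialNB())
    model.fit(train.data, train.target)

    print("accuracy:", accuracy_score(test.target, model.predict(test.data)))

    Swapping MultinomialNB for LogisticRegression, LinearSVC, or RandomForestClassifier reproduces the other baselines in the same pipeline.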
    Abstract (English)  Due to the rise of deep learning, our study uses the 20 Newsgroups dataset to run computer experiments with machine learning and deep learning techniques under the scikit-learn and Keras frameworks. Different machine learning and deep learning techniques yield different text classification accuracies; up to 96% accuracy can be achieved via deep learning.
    Regarding machine learning, we classified documents with the Naive Bayes, SVM, Logistic Regression, and Random Forest classifiers and measured their accuracies.
    Regarding deep learning, we constructed neural networks and used pre-trained word vectors, generated by GloVe (glove.6B), Word2vec, and fastText, as inputs to the networks, then carried out text classification and measured the accuracies.
    Our study shows that neural networks achieve outstanding performance in document classification, in particular the two mainstream architectures, CNN and LSTM. We combined the strengths of both and propose deep convolutional recurrent neural networks for document classification: the CNN extracts high-level phrases, which serve as the input of an LSTM that produces a global sentence representation, and vice versa. We evaluated the proposed architectures on document classification tasks.
    The experimental results show that the deep convolutional recurrent neural networks outperform the other models on these tasks; in particular, CNN_BiLSTM achieved 96% accuracy in document classification.
    We also derive the recurrence relation between the feature map and max-pooling output sizes.
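
    As an illustration of the CNN_BiLSTM idea described above, here is a minimal Keras sketch. It is a reconstruction under stated assumptions, not the thesis's exact model: vocabulary size, sequence length, kernel width, and layer sizes are hypothetical, and the pre-trained GloVe/Word2vec/fastText embedding weights are omitted. The comments track the sequence length, which follows the recurrence L -> floor((L - k + 1) / p) for a 'valid' convolution of kernel size k followed by max pooling of size p, the kind of feature-map/max-pooling relation the abstract mentions.

    from tensorflow.keras import layers, models

    MAX_LEN = 1000   # padded document length (assumption)
    VOCAB = 20000    # vocabulary size (assumption)
    EMB_DIM = 100    # e.g. 100-dimensional glove.6B vectors

    model = models.Sequential([
        # In the thesis's setting the embedding weights would come from
        # pre-trained GloVe/Word2vec/fastText vectors; only the shape is fixed here.
        layers.Embedding(VOCAB, EMB_DIM),
        # 'valid' Conv1D with kernel k=5: length 1000 -> 1000 - 5 + 1 = 996.
        layers.Conv1D(128, 5, activation="relu"),
        # Max pooling with p=2: length 996 -> floor(996 / 2) = 498; stacking
        # such blocks gives the recurrence L_{i+1} = floor((L_i - k + 1) / p).
        layers.MaxPooling1D(2),
        # The BiLSTM turns the pooled phrase features into a global sentence
        # representation (its concatenated final hidden states).
        layers.Bidirectional(layers.LSTM(64)),
        layers.Dense(20, activation="softmax"),  # 20 newsgroup classes
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])

    Reversing the order of the convolutional block and the recurrent layer gives the LSTM_CNN variants the abstract describes.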
    Keywords (Chinese)
  • Recurrent Neural Network
  • Deep Learning
  • Document Classification
  • Convolutional Neural Network
  • Recurrence Relation of Feature Maps and Max-Pooling Outputs
  • Long Short-Term Memory
  • Deep Convolutional Recurrent Neural Networks
    Keywords (English)
  • Deep Learning
  • Document Classification
  • Long short-term memory
  • CNN_BiLSTM
  • Deep Convolutional Recurrent Neural Networks
  • Convolutional Neural Network
    Table of Contents
    Thesis Approval Form i
    Acknowledgements ii
    Chinese Abstract iii
    English Abstract iv
    Chapter 1  Introduction 1
    Chapter 2  Background and Related Work 3
    2.1 Machine Learning Models 3
    2.2 Deep Learning Models 4
    Chapter 3  Methods and Architecture 7
    3.1 Machine Learning Methods and Architecture 7
    3.1.1 Document-Term Matrix 7
    3.1.2 The TF-IDF Algorithm 8
    3.2 Deep Learning Methods and Architecture 11
    3.3 The CNN_BiLSTM Model 15
    Chapter 4  Experimental Results 18
    4.1 Datasets 18
    4.2 Word Vector Datasets 19
    4.3 Experimental Results of Machine Learning Methods 20
    4.4 Experimental Results of Deep Learning Methods 21
    Chapter 5  Conclusion 30
    Chapter 6  References 33
    Appendix 36
    References
    Bengio, Y., Simard, P., & Frasconi, P. (1994). Learning long-term dependencies with gradient descent is difficult. IEEE Transactions on Neural Networks, 5(2), 157-166.
    Breiman, L. (2001). Random forests. Machine Learning, 45(1), 5-32.
    Cho, K., Van Merriënboer, B., Gulcehre, C., Bahdanau, D., Bougares, F., Schwenk, H., & Bengio, Y. (2014). Learning phrase representations using RNN encoder-decoder for statistical machine translation. arXiv preprint arXiv:1406.1078.
    Collobert, R., Weston, J., Bottou, L., Karlen, M., Kavukcuoglu, K., & Kuksa, P. (2011). Natural language processing (almost) from scratch. Journal of Machine Learning Research, 12(Aug), 2493-2537.
    Devlin, J., Zbib, R., Huang, Z., Lamar, T., Schwartz, R., & Makhoul, J. (2014). Fast and robust neural network joint models for statistical machine translation. Paper presented at the Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1370-1380.
    Haonan, L., Huang, S. H., Ye, T., & Xiuyan, G. (2019). Graph Star Net for Generalized Multi-Task Learning. arXiv preprint arXiv:1906.12330v1.
    Hearst, M. A., Dumais, S. T., Osuna, E., Platt, J., & Scholkopf, B. (1998). Support vector machines. IEEE Intelligent Systems, 13(4), 18-28.
    Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural Computation, 9(8), 1735-1780.
    Ikonomakis, M., Kotsiantis, S., & Tampakas, V. (2005). Text classification using machine learning techniques. WSEAS Transactions on Computers, 4(8), 966-974.
    Kalchbrenner, N., Grefenstette, E., & Blunsom, P. (2014). A convolutional neural network for modelling sentences. arXiv preprint arXiv:1404.2188.
    Kilimci, Z. H., & Akyokus, S. (2018). Deep learning- and word embedding-based heterogeneous classifier ensembles for text classification. Complexity, 2018.
    Kim, S.-B., Rim, H.-C., Yook, D., & Lim, H.-S. (2002). Effective methods for improving naive Bayes text classifiers. Paper presented at the Pacific Rim International Conference on Artificial Intelligence, LNAI 2417, pp. 414-423.
    Kim, Y. (2014). Convolutional neural networks for sentence classification. arXiv preprint arXiv:1408.5882.
    Klema, J., & Almonayyes, A. (2006). Automatic categorization of fanatic texts using random forests. Kuwait Journal of Science and Engineering, 33(2), 1-18.
    Kowsari, K., Heidarysafa, M., Brown, D. E., Meimandi, K. J., & Barnes, L. E. (2018). RMDL: Random multimodel deep learning for classification. Paper presented at the Proceedings of the 2nd International Conference on Information System and Data Mining. arXiv preprint arXiv:1805.01890v2.
    Le, Q., & Mikolov, T. (2014). Distributed representations of sentences and documents. Paper presented at the 31st International Conference on Machine Learning, pages 1188-1196.
    LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436.
    McCallum, A., & Nigam, K. (1998). A comparison of event models for naive Bayes text classification. Paper presented at the AAAI-98 Workshop on Learning for Text Categorization, vol. 752, pp. 41-48.
    Mikolov, T., Sutskever, I., Chen, K., Corrado, G. S., & Dean, J. (2013). Distributed representations of words and phrases and their compositionality. Paper presented at Advances in Neural Information Processing Systems, pages 3111-3119.
    Sainath, T. N., Vinyals, O., Senior, A., & Sak, H. (2015). Convolutional, long short-term memory, fully connected deep neural networks. Paper presented at the 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
    Schuster, M., & Paliwal, K. K. (1997). Bidirectional recurrent neural networks. IEEE Transactions on Signal Processing, 45(11), 2673-2681.
    Sebastiani, F. (2002). Machine learning in automated text categorization. ACM Computing Surveys, 34(1), 1-47.
    Shanahan, J. G., & Roma, N. (2003). Improving SVM text classification performance through threshold adjustment. Paper presented at the 14th European Conference on Machine Learning, LNAI 2837, 361-372.
    Sundermeyer, M., Ney, H., & Schlüter, R. (2015). From feedforward to recurrent LSTM neural networks for language modeling. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 23(3), 517-529.
    Tang, D., Qin, B., & Liu, T. (2015). Document modeling with gated recurrent neural network for sentiment classification. Paper presented at the Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing.
    Xu, K., Ba, J., Kiros, R., Cho, K., Courville, A., Salakhudinov, R., ... & Bengio, Y. (2015). Show, attend and tell: Neural image caption generation with visual attention. Paper presented at the 2015 International Conference on Machine Learning.
    Wu, F., Zhang, T., Souza Jr., A. H. d., Fifty, C., Yu, T., & Weinberger, K. Q. (2019). Simplifying graph convolutional networks. arXiv preprint arXiv:1902.07153v2.
    Yao, L., Mao, C., & Luo, Y. (2018). Graph convolutional networks for text classification. arXiv preprint arXiv:1809.05679.

    These datasets and tools are all available on the Internet (a minimal GloVe-loading sketch follows this list).
    The 20 Newsgroups Dataset. http://qwone.com/~jason/20Newsgroups/
    Jeffrey Pennington, Richard Socher, Christopher D. Manning. GloVe: Global Vectors for Word Representation. https://nlp.stanford.edu/projects/glove/

    Word2vec: https://code.google.com/archive/p/word2vec/
    fastText: https://fasttext.cc/docs/en/english-vectors.html
    scikit-learn: Machine Learning in Python. https://scikit-learn.org/stable/
    Keras Documentation. https://keras.io/
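
    The GloVe files linked above are plain text: each line holds a token followed by its vector components, separated by spaces. Here is a minimal loading sketch; the 100-dimensional glove.6B file name is an assumption, not something specified by the record.

    import numpy as np

    # Read GloVe vectors (word, then floats, space-separated) into a dict.
    embeddings = {}
    with open("glove.6B.100d.txt", encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            embeddings[parts[0]] = np.asarray(parts[1:], dtype="float32")

    print(len(embeddings), "vectors; dimension", len(embeddings["the"]))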
    Oral Defense Committee
  • 黃三益 - Convener
  • 李珮如 - Member
  • 康藝晃 - Advisor
    Defense Date  2019-07-22    Submission Date  2019-08-28



    If you have any questions, please contact the thesis review team.