國立中山大學,National Sun Yat-sen University,學位論文,thesis/dissertation,應用文字探勘分析建立知識分類之研究- 以伺服器管理韌體問題追蹤系統為例,The Research of Knowledge Classification by Using Text Mining Technologies: Application on Server Management Firmware Issue Tracking System

論文名稱 Title	應用文字探勘分析建立知識分類之研究- 以伺服器管理韌體問題追蹤系統為例 The Research of Knowledge Classification by Using Text Mining Technologies: Application on Server Management Firmware Issue Tracking System
系所名稱 Department	資訊管理學系 Department of Information Management
畢業學年期 Year, semester	109 學年度第 2 學期 The spring semester of Academic Year 109	語文別 Language	中文 Chinese
學位類別 Degree	碩士 Master	頁數 Number of pages	49
研究生 Author	吳善恆 Shan-Heng Wu
指導教授 Advisor	林信惠 Lin,Hsin-hui
召集委員 Convenor	林芬慧 Lin,Fen-Hui
口試委員 Advisory Committee	鄭菲菲 Cheng,Fei-Fei
口試日期 Date of Exam	2021-06-17	繳交日期 Date of Submission	2021-07-07
關鍵字 Keywords	伺服器遠端管理晶片、文字探勘、詞頻-逆檔案頻率、問題追蹤系統、階層式分群法、知識模型、知識分類 Baseboard Management Controller, Text Mining, TF-IDF, Bug Tracking System, Hierarchy Clustering Algorithm, Knowledge Model, Knowledge Classification

中文摘要
由於伺服器遠端管理晶片 (BMC, Baseboard Management Controller)韌體開發的過程中，開發人員會將功能架構、研究思路、問題癥結、解決方法、測試方法、測試結果保留下來，累積了許多伺服器管理晶片軟體的研發知識文件。但這些技術文件並未針對技術、功能、應用等特性，建立足夠的索引(Index)及標籤(Tags)以及分類，以至於缺乏有效的檢索方式且無法使相關技術文件互相產生關聯。面對過往曾經發生過的問題時，研發人員僅憑藉個人經驗埋頭研究問題，或者在諸多未分類的技術文件中憑藉自身與他人記憶尋找有用的參考資料，因此，如何利用分類及標籤精準地取用現有知識庫(Knowledge Base)中累積的大量知識，輔助簡化問題，甚至更迅速地解決問題，是對於BMC研發團隊更重要的議題，也是本研究所關注的主要議題。本研究藉由詞頻-逆檔案頻率(TF-IDF，Term Frequency-Inverse Document Frequency)之文字探勘(Text Mining)方法以及聚合型的階層式分群法 (AGNES，Agglomerate Hierarchy Clustering Algorithm)將相似的研發技術文件分門別類，同時賦予每一份技術文件相關的知識類別與特徵關鍵字。利用此方法使技術文件不再提供單純的文字資訊，亦將該技術文件之隱含知識及特徵關鍵字做為文件的屬性，同時，本研究建立了BMC研發技術之知識模型，使研發人員能更快速地獲取相關知識，分享知識並重複利用。
Abstract
During Baseboard Management Controller (BMC) firmware development, developers record many technical documents in a bug tracking system, covering system structure, ideas, bug symptoms, root causes, solutions, and test cases. However, these documents lack the classification, keywords, or tags needed to identify what kind of knowledge each one contains. When developers encounter a problem, they can only search a huge document database by themselves. How to find useful information efficiently in a large number of documents is therefore the important topic addressed in this research. This study uses the TF-IDF (Term Frequency-Inverse Document Frequency) algorithm to extract features from each document, clusters the documents with AGNES (an agglomerative hierarchical clustering algorithm), and then names the resulting clusters as knowledge types, so that each document not only provides text information but also serves as a technical knowledge document. A knowledge model of BMC development is also created, which lets developers search for knowledge faster and share it easily.
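The pipeline described in the abstract (TF-IDF features, agglomerative clustering, then naming clusters by their characteristic keywords) can be illustrated with a minimal Python sketch. It is not the thesis implementation: the whitespace tokenizer, the cosine distance, the average-linkage rule, the helper names (`tfidf_vectors`, `agnes`, `top_terms`), and the four sample documents are all assumptions made for illustration, since the abstract does not specify them.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document, weight = tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in docs]  # assumed tokenizer
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return vectors


def cosine_distance(a, b):
    """1 - cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def agnes(vectors, n_clusters):
    """Bottom-up clustering with average linkage (an assumed linkage rule).

    Starts with one cluster per document and repeatedly merges the two
    closest clusters until n_clusters remain. Returns lists of indices.
    """
    clusters = [[i] for i in range(len(vectors))]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                dists = [
                    cosine_distance(vectors[p], vectors[q])
                    for p in clusters[i]
                    for q in clusters[j]
                ]
                avg = sum(dists) / len(dists)
                if best is None or avg < best[0]:
                    best = (avg, i, j)
        _, i, j = best
        clusters[i] = sorted(clusters[i] + clusters[j])
        del clusters[j]
    return sorted(clusters)


def top_terms(vectors, cluster, k=3):
    """Candidate tags for a cluster: terms with the highest summed TF-IDF."""
    totals = Counter()
    for idx in cluster:
        totals.update(vectors[idx])
    return [term for term, _ in totals.most_common(k)]


# Hypothetical issue summaries, invented for this example.
docs = [
    "sensor reading threshold fan sensor",
    "fan sensor threshold event",
    "kvm video redirection session",
    "kvm session video timeout",
]
vectors = tfidf_vectors(docs)
clusters = agnes(vectors, n_clusters=2)
for cluster in clusters:
    print(cluster, top_terms(vectors, cluster))
```

In this sketch the highest-weighted terms of each cluster play the role of the tags or knowledge-type labels mentioned in the abstract. Recording the merge order instead of stopping at a fixed cluster count would give the data needed to draw a dendrogram.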

目次 Table of Contents
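Thesis Approval Form
Acknowledgements
Abstract (Chinese)
Abstract (English)
Contents
List of Figures
List of Tables
Chapter 1: Introduction
    Section 1: Research Background and Environment
    Section 2: Research Motivation and Objectives
Chapter 2: Literature Review
    Section 1: Baseboard Management Controller (BMC)
    Section 2: IPMI (Intelligent Platform Management Interface)
    Section 3: Bug Tracking System
    Section 4: Text Mining
    Section 5: Document Clustering
    Section 6: Data-Information-Knowledge-Wisdom Hierarchy
Chapter 3: Research Design
    Section 1: Research Process
    Section 2: Research Scope and Data Collection
    Section 3: Extracting and Quantifying Data Features
Chapter 4: Analysis of Research Results
    Section 1: Analysis Method
    Section 2: Relationship Between the Clustering Method and Clustering Quality
    Section 3: Presentation of the Results and Their Significance
    Section 4: Analysis of Cluster Contents
    Section 5: Dendrogram
Chapter 5: Conclusions and Future Directions
    Section 1: Choice of Research Method
    Section 2: Knowledge Model and Tags
    Section 3: Effect on Decision Support
    Section 4: Future Directions
Chapter 6: References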


