National Sun Yat-sen University (國立中山大學), thesis/dissertation, Design and Implementation of Fast Trainer and Multi-label-supported Classifier for Ensemble Learning Chip for assisting anomaly detection in Cybersecurity

Title	Design and Implementation of Fast Trainer and Multi-label-supported Classifier for Ensemble Learning Chip for assisting anomaly detection in Cybersecurity (用於輔助網路安全異常偵測的快速訓練和可支援多分類之集成式學習晶片設計與實現)
Department	Department of Electrical Engineering
Year, semester	Spring semester, Academic Year 113	Language	Chinese
Degree	Master	Number of pages	77
Author	Ming-Xian Cai (蔡明憲)
Advisor	Shih, Xin-Yu (施信毓)
Convenor	Chung, Ching-Che (鍾菁哲)
Advisory Committee	Huang, Ing-Jer (黃英哲); Hsu, Ruei-Hau (徐瑞壕); Chu, Shao-I (朱紹儀)
Date of Exam	2025-03-03	Date of Submission	2025-03-05
Keywords	Cybersecurity Attacks, Machine Learning, Quick Training, Multi-label-supported, Chip Implementation
Statistics	The thesis/dissertation has been browsed 42 times and downloaded 3 times.

Abstract
In recent years, Taiwan has faced a wide range of cyberattacks, including ransomware, distributed denial-of-service (DDoS) attacks, and phishing. Because attack techniques are diverse and evolve rapidly, traditional manual analysis struggles to keep up, and automated detection and defense based on machine learning (ML) has become an important research direction. To keep pace with rapidly evolving attacks and to accelerate model training, this study builds an ML prototype that combines fast training with support for multi-class classification (rendered as "multi-label" in the English title and keywords). Based on this prototype, it designs and implements a chip architecture that assists information security threat detection and enables a rapid response. The proposed architecture rests on three core techniques:
(1) Fast split-point search algorithm: a two-stage split-point search shortens the time needed to find split points and speeds up feature processing.
(2) Hardware sharing scheme for multi-class support: the hardware is analyzed and shared so that the design supports multi-class classification while balancing area against training time.
(3) Feature selection and self-validation mechanism: feature selection and self-validation reduce the amount of feature storage while maintaining a certain level of accuracy.
For the chip implementation, the layout uses the TSMC 40 nm process; the post-layout area is 0.635 mm² and the maximum operating frequency is 500 MHz. Accuracy and computing performance were evaluated on multiple cybersecurity datasets, and the results show that the proposed architecture trains very efficiently while retaining a certain level of accuracy, so it can assist a system in information security threat detection.
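The abstract names a two-stage split-point search and a shared resource for multi-class support but gives no algorithmic detail. The Python sketch below is only an illustration under stated assumptions, not the thesis's method: it assumes Gini impurity as the split criterion, assumes the two stages are a coarse pass over every `stride`-th candidate followed by a fine pass around the coarse winner, and assumes multi-class support via one-vs-rest decision stumps that all reuse the same search routine. The names `gini_of_split`, `two_stage_split`, `train_one_vs_rest`, `predict`, and the `stride` parameter are hypothetical.

```python
from typing import Dict, Sequence, Tuple


def gini_of_split(values: Sequence[float], labels: Sequence[int], threshold: float) -> float:
    """Weighted Gini impurity of the split `value <= threshold`, for 0/1 labels."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    score = 0.0
    for side in (left, right):
        if side:
            p = sum(side) / len(side)  # fraction of positive samples on this side
            score += len(side) / len(labels) * 2.0 * p * (1.0 - p)
    return score


def two_stage_split(values: Sequence[float], labels: Sequence[int], stride: int = 4) -> Tuple[float, float]:
    """Assumed coarse-to-fine search; a heuristic that may miss the global optimum."""
    candidates = sorted(set(values))
    # Stage 1 (coarse): evaluate only every `stride`-th candidate threshold.
    coarse = candidates[::stride]
    best = min(coarse, key=lambda t: gini_of_split(values, labels, t))
    # Stage 2 (fine): evaluate all candidates near the coarse winner.
    i = candidates.index(best)
    fine = candidates[max(0, i - stride + 1): i + stride]
    best = min(fine, key=lambda t: gini_of_split(values, labels, t))
    return best, gini_of_split(values, labels, best)


def train_one_vs_rest(values: Sequence[float], labels: Sequence[int], stride: int = 4) -> Dict[int, Tuple[float, float, float]]:
    """One stump per class; every class reuses the same two_stage_split routine."""
    model = {}
    for cls in sorted(set(labels)):
        binary = [1 if y == cls else 0 for y in labels]  # class `cls` versus the rest
        threshold, _ = two_stage_split(values, binary, stride)
        left = [b for x, b in zip(values, binary) if x <= threshold]
        right = [b for x, b in zip(values, binary) if x > threshold]
        # The fraction of class `cls` on each side serves as a confidence score.
        p_left = sum(left) / len(left) if left else 0.0
        p_right = sum(right) / len(right) if right else 0.0
        model[cls] = (threshold, p_left, p_right)
    return model


def predict(model: Dict[int, Tuple[float, float, float]], x: float) -> int:
    """Return the class whose stump is most confident for input x."""
    def score(cls: int) -> float:
        threshold, p_left, p_right = model[cls]
        return p_left if x <= threshold else p_right
    return max(model, key=score)


if __name__ == "__main__":
    xs = list(range(24))
    ys = [0] * 8 + [1] * 8 + [2] * 8
    stumps = train_one_vs_rest(xs, ys)
    print([predict(stumps, x) for x in (3, 10, 20)])
```

With `n` candidate thresholds, the coarse pass evaluates about `n / stride` of them and the fine pass at most `2 * stride - 1`, compared with `n` for an exhaustive scan. This is the kind of saving a staged search aims for, at the cost of possibly settling on a local optimum.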

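Table of Contents
Thesis Approval Form
Acknowledgements
Abstract (Chinese)
Abstract (English)
Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Research Background
1.2 Research Motivation and Objectives
1.3 Thesis Organization
Chapter 2 Overview of Machine Learning and Datasets
2.1 Overview of Machine Learning
2.1.1 Optimal Split-Point Calculation
2.2 Overview of Datasets
Chapter 3 Design and Implementation of the Fast-Training, Multi-class-supported Ensemble Learning Chip
3.1 System Architecture
3.1.1 Dataset Preprocessing
3.1.2 Overview of the System Architecture
3.1.3 Operation Command Input Format
3.1.4 Operating Modes
3.2 Technique 1: Fast Split-Point Search Algorithm
3.2.1 Conventional Split-Point Search Algorithm
3.2.2 Fast Split-Point Search Algorithm
3.2.3 Hardware Architecture
3.2.4 Performance Analysis and Comparison
3.3 Technique 2: Hardware Sharing Scheme for Multi-class Support
3.3.1 One-vs-Rest Classification
3.3.2 Model Description
3.3.3 Hardware Design
3.3.4 Performance Analysis and Area Comparison
3.4 Technique 3: Feature Selection and Self-Validation Mechanism
3.4.1 Feature Selection Mechanism
3.4.2 Self-Validation Mechanism
3.4.3 Performance Analysis
Chapter 4 Chip Implementation
4.1 Chip Design Flow
4.2 Analysis of Standard-Cell Synthesis Results
4.2.1 Design for Testability (DFT)
4.2.2 Analysis of Synthesis Results
4.3 Automatic Test Pattern Generation (ATPG) Results
4.4 Chip Layout Results
4.5 Logic Equivalence Checking
4.6 Performance Comparison
Chapter 5 Conclusion and Future Work
5.1 Conclusion
5.2 Future Work
References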


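Fulltext
The electronic full text is licensed to users solely for personal, non-profit retrieval, reading, and printing for academic research purposes. Please comply with the Copyright Act of the Republic of China; do not reproduce, distribute, adapt, repost, or broadcast it without authorization. Thesis access permission: unrestricted (fully public on and off campus). Available: on campus: available; off campus: available. etd-0205125-103155.pdf
Printed copies
Availability information for printed theses is relatively complete from Academic Year 102 onward. For printed theses from Academic Year 101 or earlier, please contact the printed thesis service desk of the Office of Library and Information Services. Available: available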


Office of Library and Information Services, National Sun Yat-sen University │ Contact Us: 2453, Thesis Format Review Team, Mail │ Development and operations: Knowledge Innovation Division, LIS