National Sun Yat-sen University (國立中山大學), thesis/dissertation, Attack Trend Analysis Based on Detection Rules and Incident Reports

Title	Attack Trend Analysis Based on Detection Rules and Incident Reports (基於偵測規則與事件通報之攻擊趨勢分析)
Department	Department of Information Management
Year, semester	Academic Year 111, second (spring) semester	Language	Chinese
Degree	Master	Number of pages	75
Author	Yu-Jie Chen (陳禹傑)
Advisor	Chen, Chia-Mei (陳嘉玫)
Convenor	Kuo, Wen-Chung (郭文中)
Advisory Committee	Yang, Huei-Fang (楊惠芳); Fan, Chun-I (范俊逸); Lee, Chung-Nan (李宗南)
Date of Exam	2023-07-04	Date of Submission	2023-07-07
Keywords	Cyber threat intelligence (CTI), detection rules, incident report, natural language processing (NLP), short text clustering

Abstract
Large network infrastructures and organizations deploy Intrusion Detection Systems (IDS) in their network environments to defend against a wide range of cyber threats. An IDS uses detection rules to detect suspicious behavior in the network. To understand attack trends, security analysts must manually examine the attack behavior behind a large number of triggered detection rules when consolidating their analysis. Moreover, if IDSs from different vendors are deployed in the same environment, the same attack behavior may be described differently by each IDS. Detection rules that cover the same attack behavior but are triggered less often may then be overlooked in the statistics, so the overall attack trend is represented incompletely. This research proposes STARS (Security Threat Analysis and Related Service), an attack trend analysis system, to address these problems. STARS clusters the detection rules adopted by the organization's IDSs and defines the attack behavior of each cluster, so that security analysts can map each triggered detection rule directly to its attack behavior. STARS also crawls IDS vendors' websites to collect the CVE IDs and information assets associated with the detection rules, so that for attack behaviors on a rising trend it can identify the relevant patch targets for the rules involved. In an analysis of past cases in a single network environment, the system reduces the need to manually review each triggered detection rule and makes it easier for security analysts to consolidate internal threats. For attack behavior trends across the overall network environment, this research compares the trends generated by STARS with an annual cybersecurity report; the comparison confirms that the system can also provide useful attack trend intelligence for the overall network environment.
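The aggregation idea in the abstract can be illustrated with a minimal sketch. It is not the STARS implementation: the rule names, the hand-written rule-to-behavior mapping (which the thesis obtains by clustering rule descriptions), the trigger counts, and the helper functions `behavior_counts`, `rising_behaviors`, and `cves_for` are all assumptions made for illustration. The sketch shows how rules from two hypothetical vendors that each trigger relatively rarely can add up to a prominent attack behavior once they are counted by cluster, and how rising behaviors can be linked to associated CVE IDs.

```python
from collections import Counter

# Hypothetical rule -> attack behavior mapping. In the thesis this comes from
# clustering rule descriptions; here it is written by hand for illustration.
RULE_TO_BEHAVIOR = {
    "vendor_a/log4j-jndi-lookup": "Log4j remote code execution",
    "vendor_b/log4j-rce": "Log4j remote code execution",
    "vendor_a/ssh-brute-force": "SSH brute force",
    "vendor_b/ssh-login-flood": "SSH brute force",
    "vendor_a/sqli-union-select": "SQL injection",
}

# Hypothetical rule -> CVE IDs, standing in for data crawled from vendor sites.
RULE_TO_CVES = {
    "vendor_a/log4j-jndi-lookup": {"CVE-2021-44228"},
    "vendor_b/log4j-rce": {"CVE-2021-44228"},
}

# Made-up trigger counts per rule for two consecutive periods.
LAST_WEEK = {
    "vendor_a/log4j-jndi-lookup": 3,
    "vendor_b/log4j-rce": 2,
    "vendor_a/ssh-brute-force": 40,
    "vendor_b/ssh-login-flood": 35,
    "vendor_a/sqli-union-select": 10,
}
THIS_WEEK = {
    "vendor_a/log4j-jndi-lookup": 30,
    "vendor_b/log4j-rce": 25,
    "vendor_a/ssh-brute-force": 38,
    "vendor_b/ssh-login-flood": 30,
    "vendor_a/sqli-union-select": 50,
}


def behavior_counts(triggers):
    """Sum trigger counts per attack behavior; unmapped rules go to 'unclustered'."""
    totals = Counter()
    for rule, count in triggers.items():
        totals[RULE_TO_BEHAVIOR.get(rule, "unclustered")] += count
    return totals


def rising_behaviors(previous, current):
    """Return behaviors whose aggregated count increased, sorted by name."""
    prev, curr = behavior_counts(previous), behavior_counts(current)
    return sorted(b for b in curr if curr[b] > prev[b])


def cves_for(behavior):
    """Collect the CVE IDs attached to any rule mapped to the given behavior."""
    cves = set()
    for rule, mapped in RULE_TO_BEHAVIOR.items():
        if mapped == behavior:
            cves |= RULE_TO_CVES.get(rule, set())
    return sorted(cves)


if __name__ == "__main__":
    for behavior, total in behavior_counts(THIS_WEEK).most_common():
        print(behavior, total)
    for behavior in rising_behaviors(LAST_WEEK, THIS_WEEK):
        print("rising:", behavior, cves_for(behavior))
```

In this made-up data the single most frequently triggered rule belongs to SQL injection, yet after aggregation the two Log4j rules together outrank it, which is the effect the abstract describes for rules that would otherwise be overlooked when counted one by one.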

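Table of Contents
Thesis Approval Form
Abstract (Chinese)
Abstract
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
  1.1 Research Background
  1.2 Research Motivation
Chapter 2 Literature Review
  2.1 Cyber Threat Intelligence and Related Background Research
  2.2 Natural Language Processing
    2.2.1 Word2Vec
    2.2.2 Doc2Vec
    2.2.3 Transformer
    2.2.4 BERT and SBERT
  2.3 Clustering Algorithms
    2.3.1 Partitional Clustering
    2.3.2 Hierarchical Clustering
    2.3.3 Density-Based Clustering
  2.4 Short Text Data Analysis
    2.4.1 Word co-occurrence
    2.4.2 Autoencoder
Chapter 3 Research Method
  3.1 Detection Rule Clustering
    3.1.1 Rule Embedding Module
    3.1.2 Rule Clustering Module
  3.2 Internal Threat Analysis
  3.3 External Threat Intelligence Collection
Chapter 4 System Evaluation
  4.1 Experiment 1: Comparison of Rule Embedding and Clustering Methods
    4.1.1 Experiment 1-1: S+F-Rules + Affinity Propagation
    4.1.2 Experiment 1-2: S+F-Rules + Hierarchical Clustering
    4.1.3 Experiment 1-3: S+F-Rules + HDBSCAN
    4.1.4 Experiment 1-4: Snort-Rules + Affinity Propagation
    4.1.5 Experiment 1-5: Snort-Rules + Hierarchical Clustering
    4.1.6 Experiment 1-6: Snort-Rules + HDBSCAN
    4.1.7 Experiment 1-7: Vendor F-Rules + Affinity Propagation
    4.1.8 Experiment 1-8: Vendor F-Rules + Hierarchical Clustering
    4.1.9 Experiment 1-9: Vendor F-Rules + HDBSCAN
    4.1.10 Summary of Experiment 1
  4.2 Experiment 2: Comparison with Internal Cases on the Academic Network
  4.3 Experiment 3: Real-World Attack Trend Analysis
Chapter 5 Contributions and Future Work
References
Appendix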

Fulltext
This electronic full text is licensed to users solely for personal, non-profit searching, reading, and printing for academic research purposes. Please comply with the Copyright Act of the Republic of China; do not reproduce, distribute, adapt, repost, or broadcast it without authorization.
Thesis access permission: user-defined release date. Available for download on campus: 2028-07-07. Available for download off campus: 2028-07-07.
Printed copies
Public information on printed theses is relatively complete from Academic Year 102 onward. For public information on printed theses from Academic Year 101 or earlier, please contact the printed thesis service desk of the Office of Library and Information Services. Available: 2028-07-07.
