National Sun Yat-sen University (國立中山大學), thesis/dissertation: Graph-Based Anomaly APT Attack Detection via Threat Intelligence

Title	Graph-Based Anomaly APT Attack Detection via Threat Intelligence (植基於圖異常偵測並以情資為特徵之APT攻擊偵測系統)
Department	Master Program in Information Security, Department of Computer Science and Engineering
Year, semester	Spring semester of Academic Year 110	Language	Chinese
Degree	Master	Number of pages	59
Author	Ying-Chan Chang (張郢展)
Advisor	Fan, Chun-I (范俊逸)
Convenor	Chen, Chia-Mei (陳嘉玫)
Advisory Committee	Wang, Chih-Hung (王智弘); Huang, Jheng-Jia (黃政嘉)
Date of Exam	2022-07-12	Date of Submission	2022-07-16
Keywords	Advanced Persistent Threat (APT), Living Off the Land (LOL), Endpoint Detection and Response (EDR), Threat Intelligence, Provenance Graph, Anomaly Detection, Graph Neural Network
Statistics	This thesis/dissertation has been browsed 379 times and downloaded 0 times.

Abstract
In recent Advanced Persistent Threat (APT) attacks, adversaries combine multiple defense evasion techniques to avoid detection by traditional antivirus software, for example fileless malware combined with Living Off the Land (LOL) techniques and the abuse of legitimate cloud services. As a result, enterprises have increasingly adopted Endpoint Detection and Response (EDR) tools, which match event logs against known attack techniques to detect potential threats. However, EDR tools have the drawback of producing a large number of false alarms, which forces security maintainers and analysts to spend substantial additional effort on analysis. In this thesis, we propose a graph-based anomaly detection system. A provenance graph that is built from normal behavior and enriched with threat intelligence is fed to the system, which learns the latent structural information in the graph in order to detect abnormal behavior on a host. The results show that the proposed system effectively detects abnormal event logs and reduces the number of false alarms by up to 97.67%. This greatly reduces the burden that log analysis places on security maintainers, and it also shows that anomaly detection based on a graph neural network outperforms detection based on a traditional neural network.
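
The abstract's claim that a graph neural network can outperform a traditional neural network rests on the model seeing a node's neighborhood in the provenance graph, not only the node's own features. The following is a minimal sketch of that idea, not the thesis's implementation: it performs one round of GraphSAGE-style mean neighbor aggregation without learned weights or a nonlinearity. The graph, the node names, and the two-dimensional feature layout (`[is_process, technique_hits]`) are hypothetical and chosen only for illustration.

```python
from typing import Dict, List

# Hypothetical toy provenance graph: each node maps to its neighbors.
EDGES: Dict[str, List[str]] = {
    "powershell_a": ["explorer", "report.docx"],
    "powershell_b": ["winword", "run_key"],
    "explorer": ["powershell_a"],
    "winword": ["powershell_b"],
    "report.docx": ["powershell_a"],
    "run_key": ["powershell_b"],
}

# Hypothetical node features: [is_process, technique_hits], where
# technique_hits stands in for a threat-intelligence signal such as the
# number of known attack techniques matched by the node's events.
FEATURES: Dict[str, List[float]] = {
    "powershell_a": [1.0, 1.0],
    "powershell_b": [1.0, 1.0],
    "explorer": [1.0, 0.0],
    "winword": [1.0, 2.0],
    "report.docx": [0.0, 0.0],
    "run_key": [0.0, 3.0],
}


def mean_vector(vectors: List[List[float]]) -> List[float]:
    """Element-wise mean of a non-empty list of equal-length vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def aggregate(node: str,
              edges: Dict[str, List[str]],
              features: Dict[str, List[float]]) -> List[float]:
    """Concatenate a node's own features with the mean of its neighbors'.

    This is only the aggregation step of a GraphSAGE-style layer; a real
    layer would follow it with a learned linear map and a nonlinearity.
    """
    neighbor_features = [features[n] for n in edges[node]]
    if neighbor_features:
        neighborhood = mean_vector(neighbor_features)
    else:
        neighborhood = [0.0] * len(features[node])
    return features[node] + neighborhood


emb_a = aggregate("powershell_a", EDGES, FEATURES)
emb_b = aggregate("powershell_b", EDGES, FEATURES)
print(emb_a)
print(emb_b)
```

The two PowerShell nodes have identical features of their own, so a model that looks at each node in isolation cannot tell them apart. After aggregation their representations differ because their neighborhoods differ, which is the kind of structural signal a graph-based detector can learn from.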

Table of Contents
Thesis Approval Form
Abstract (Chinese)
Abstract
List of Figures
List of Tables
List of Listings
Chapter 1 Introduction
  1.1 Contributions
  1.2 Thesis Organization
Chapter 2 Related Work
  2.1 Pattern-Matching-Based Approaches
  2.2 Anomaly-Detection-Based Approaches
Chapter 3 Background
  3.1 MITRE ATT&CK
  3.2 Log Collection
    3.2.1 Sysmon
    3.2.2 Sysmon Modular
    3.2.3 Open Source Security Events Metadata (OSSEM)
  3.3 Machine Learning Algorithms
    3.3.1 Multi-layer Perceptron
    3.3.2 GraphSAGE
    3.3.3 Sentence-BERT
Chapter 4 Threat Model and System Architecture
  4.1 Threat Model
  4.2 System Design
    4.2.1 Event Log Collection
    4.2.2 Provenance Graph Database
    4.2.3 Data Preprocessing
      4.2.3.1 Feature Extraction
    4.2.4 Model
      4.2.4.1 Training Phase
      4.2.4.2 Execution Phase
Chapter 5 Experimental Results and Analysis
  5.1 Implementation
  5.2 Experimental Design and Datasets
  5.3 Evaluation Method
  5.4 Effectiveness
  5.5 Runtime Overhead
  5.6 Comparison
  5.7 Attack Case Studies
    5.7.1 Process Loading a Module
    5.7.2 Process Creating a Registry Entry
Chapter 6 Conclusion and Future Work
References
Appendix: Datasets

Fulltext
This electronic full text is licensed to users solely for personal, non-profit searching, reading, and printing for academic research purposes. Please comply with the Copyright Act of the Republic of China and do not reproduce, distribute, adapt, repost, or broadcast it without authorization.
Thesis access permission: user-defined release date
Available: Campus: 2027-07-16; Off-campus: 2027-07-16
Printed copies
Public information on printed theses is relatively complete from Academic Year 102 onward. For information on printed theses from Academic Year 101 or earlier, please contact the printed thesis service desk of the Office of Library and Information Services. Available: 2027-07-16

Office of Library and Information Services, National Sun Yat-sen University │ Contact Us : 2453 Thesis Format Review Team , Mail │ Development and operations : Knowledge Innovation Division, LIS