博碩士論文 etd-0630122-032559 詳細資訊
Title page for etd-0630122-032559
論文名稱
Title
對抗式攻擊擾動異常偵測模型的穩健性與防禦
Robustness and Defense of Anomaly Detection Model Against Adversarial Attack
系所名稱
Department
畢業學年期
Year, semester
語文別
Language
學位類別
Degree
頁數
Number of pages
70
研究生
Author
指導教授
Advisor
召集委員
Convenor
口試委員
Advisory Committee
口試日期
Date of Exam
2022-07-19
繳交日期
Date of Submission
2022-07-30
關鍵字
Keywords
對抗式攻擊、黑箱攻擊、表格式資料、超參數調整演算法、異常事件偵測系統
Adversarial Attack, Black-box Attack, Tabular Data, Hyperparameter Tuning Algorithm, Outlier Detection
統計
Statistics
本論文已被瀏覽 283 次,被下載 0
This thesis/dissertation has been viewed 283 times and downloaded 0 times.
中文摘要
隨著資料量增長開啟大數據時代,機器學習與深度學習等人工智慧方法受到重視,
並在資料探勘、自然語言處理、電腦視覺與異常事件偵測各領域開啟廣泛的研究及深
入的應用。人工智慧方法比起人類專家更具優勢解決複雜且高重複性的任務,但根據
研究[1, 2]指出此類模型容易受到對抗式攻擊影響,攻擊者能擾動偵測結果甚至操弄預
測標籤,一旦瞄準目標系統中關鍵且脆弱的核心演算法發起對抗式攻擊,便會威脅整
體系統的完整性與資訊安全。
系統開發人員開發基於機器學習的資訊系統時,為求加速開發時程並具有完整的
理論證明方法的有效性,因而採用其他研究者公開的標記資料集、預訓練模型、套件
及文獻程式碼。倘若上述開源專案受到網路駭客汙染、植入後門或本身存在漏洞,開
發人員誤用後該系統將暴露於威脅中。滲透測試(Penetration Test, PT)模擬攻擊者
方法,為測試系統安全性最直接的做法,利用各種手法測試目標系統弱點,協助資訊
系統開發人員強化目標系統的抗擾動性。
本研究提出基於真實企業 Active Directory(AD)事件記錄檔的可循環對抗式樣本
訓練方法,屬於黑箱攻擊。所設計的對抗式樣本訓練方法能確保擾動性品質並合乎事
件紀錄檔的規範,目的在於挑戰有效且複雜的異常事件偵測系統,藉由訓練並產生的
對抗式樣本找出目標偵測模型的潛在弱點。實驗結果證明本方法訓練出的對抗式樣本
能成功攻擊複雜的異常事件偵測系統,且造成的擾動性優於其他研究提出的生成對抗
式樣本擾動方法,並在最後依據攻擊結果實地提出目標系統提升抵禦對抗式攻擊擾動
的方法。
Abstract
As data volumes have kept growing, the era of big data has arrived. Artificial
intelligence (AI) technologies, including machine learning, deep learning, and natural
language processing, have been applied to anomaly detection and many other fields,
where they deliver efficient solutions. Compared with human experts, AI approaches are
better suited to complicated, highly repetitive tasks. However, previous research [1, 2]
shows that deep learning models are vulnerable to adversarial attacks, in which
an adversary manipulates the output of a detection model by injecting adversarial samples.
Once the adversary exploits a vulnerability in the core algorithm of the target model, the
integrity and correctness of the model may be at risk.
To accelerate the development of information systems and to rest on sound theoretical
support, system developers tend to use open-source resources, including labeled datasets,
pre-trained models, libraries, and code published by other researchers. If these open
resources have been contaminated by attackers, the security of the deployed system is
compromised. Penetration testing simulates cyber-attacks against the target system. It is
the most direct way to help developers find exploitable vulnerabilities and protect the
target system from threats.
This study proposes a cyclic adversarial sample training method based on real-world
Active Directory (AD) event logs; the method is a black-box attack. To challenge
well-designed anomaly detection systems and uncover the potential weaknesses of the
target system, the proposed method trains strongly perturbing adversarial samples that
conform to the specification of the event log. The experimental results show that the
trained adversarial samples attack the target systems successfully and that the attack
outperforms the perturbation methods proposed in other studies. Finally, based on the
attack results, this study proposes a method to improve the robustness of the anomaly
detection system against adversarial perturbation.
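The black-box setting described in the abstract can be illustrated with a generic query-only evasion loop. This is a minimal sketch under stated assumptions, not the thesis's cyclic training method: the attacker sees only the detector's verdict, events are lists of categorical codes, and every perturbed value must come from a per-feature list of allowed values (a stand-in for the event log specification). The names flip_perturb, black_box_attack, toy_detector, and p_flip are hypothetical and chosen for illustration only.

```python
import random
from typing import Callable, List, Optional, Sequence

Event = List[int]  # one event record encoded as categorical feature codes


def flip_perturb(event: Event, allowed_values: Sequence[Sequence[int]],
                 p_flip: float, rng: random.Random) -> Event:
    """With probability p_flip per feature, swap it for another allowed value."""
    out = list(event)
    for i, choices in enumerate(allowed_values):
        if len(choices) > 1 and rng.random() < p_flip:
            # Only values permitted for this feature are used, so the
            # perturbed record stays within the assumed specification.
            out[i] = rng.choice([v for v in choices if v != event[i]])
    return out


def black_box_attack(event: Event, detector: Callable[[Event], bool],
                     allowed_values: Sequence[Sequence[int]],
                     p_flip: float = 0.3, max_queries: int = 200,
                     seed: int = 0) -> Optional[Event]:
    """Random search using only detector verdicts (no gradients, no internals).

    Returns the first perturbed event the detector labels as normal,
    or None if the query budget runs out.
    """
    rng = random.Random(seed)
    for _ in range(max_queries):
        candidate = flip_perturb(event, allowed_values, p_flip, rng)
        if not detector(candidate):
            return candidate
    return None


def toy_detector(event: Event) -> bool:
    # Hypothetical stand-in for a real anomaly detector:
    # flags an event when its first two features both equal 1.
    return event[0] == 1 and event[1] == 1


allowed = [(0, 1), (0, 1), (0, 1, 2)]
malicious = [1, 1, 2]
adversarial = black_box_attack(malicious, toy_detector, allowed)
```

Here adversarial is a variant of malicious that toy_detector no longer flags. A real evaluation would replace toy_detector with the deployed detection system and would also measure how far each candidate drifts from the original record.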
目次 Table of Contents
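Thesis Approval Form
Abstract (Chinese)
Abstract (English)
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Research Background
1.2 Research Motivation
1.3 Research Objectives
Chapter 2 Literature Review
2.1 Adversarial Attacks
2.2 Generating Adversarial Samples
2.2.1 Sampling Methods
2.2.2 Tabular Data Synthesis Methods
2.2.3 Hyperparameters and Hyperparameter Tuning Algorithms
2.2.4 Adversarial Perturbation Methods and Substitute Models
2.3 Anomaly Detection Methods
2.4 Evaluation Methods
2.4.1 Defender's Assessment of Attacker Capability
2.4.2 Assessment of the Attacker's Knowledge of the Target Model
2.4.3 Assessment of the Target Model's Robustness to Perturbation
Chapter 3 Methodology
3.1 System Architecture
1. Sampling and Perturbation Module
2. Attack Sample Training Module
3. Attack Sample Evaluation Submodule
4. Target Attack Module
3.2 Sampling and Perturbation Module
3.3 Attack Sample Training Module
3.4 Attack Sample Evaluation Submodule
3.5 Target Attack Module
Chapter 4 System Evaluation
4.1 Sampling and Preprocessing of Training Data
4.1.1 Sampling Ratio Ps Experiment
4.1.2 Flip Perturbation Degree Experiment
4.1.3 Sequence Perturbation Experiment
4.2 Adversarial Sample Training
4.2.1 Training Evaluation Method Performance Experiment
4.2.2 Genetic Algorithm Hyperparameter Tuning Experiment
4.3 Adversarial Attack Experiments on Target Models
4.3.1 Huang System Perturbation Experiment
4.3.2 Kang System Perturbation Experiment
4.4 Comparison with Adversarial Attack Methods from the Literature
4.4.1 Artificial adversary Method
4.4.2 FGSM Method
4.4.3 LowProFool Method
4.5 Improving the Detection Systems' Robustness to Perturbation
Chapter 5 Contributions and Future Work
References
Appendix A Training Time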

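圖次 List of Figures
Figure 1-1. Interference attack
Figure 2-1. Effect of perturbation on classification [1]
Figure 2-2. Poisoning attacks and evasion attacks [14]
Figure 2-3. CTGAN architecture [27]
Figure 2-4. Genetic algorithm
Figure 2-5. Attack strategy diagram [49]
Figure 3-1. System overview
Figure 3-2. System architecture diagram
Figure 4-1. Comparison of sampling method results (Recall)
Figure 4-2. Pf results at 0% perturbation
Figure 4-3. Pf results at 30% perturbation
Figure 4-4. Pf results at 70% perturbation
Figure 4-5. Pf results at 100% perturbation
Figure 4-6. Sequence perturbation results (Recall)
Figure 4-7. Sequence perturbation results (Precision)
Figure 4-8. Convergence of the hyperparameter tuning algorithms
Figure 4-9. Huang system perturbation results (Scenario 1)
Figure 4-10. Kang system perturbation results (Scenario 1)
Figure 4-11. Kang system robustness experiment with multiples of the original detection dataset
Figure 4-12. Huang system robustness experiment with multiples of the original detection dataset
Figure A-1. Training time complexity
Figure A-2. Relationship between data volume and training time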

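表次 List of Tables
Table 3-1. Algorithm variables and descriptions
Table 3-2. Event IDs of general and high-risk events
Table 3-3. Imperceptibility rules defined in this study
Table 4-1. Confusion matrix
Table 4-2. Summary of experiments
Table 4-3. Experimental equipment
Table 4-4. Sampling strategies
Table 4-5. Results for the best sampling ratio Ps
Table 4-6. Event log sequences
Table 4-7. Results of the training evaluation method experiment
Table 4-8. Hyperparameter settings for the genetic tuning algorithm
Table 4-9. Composite evaluation items and weights for the hyperparameter tuning experiment
Table 4-10. Results of the hyperparameter tuning algorithm experiment
Table 4-11. Hyperparameters proposed by the hyperparameter tuning algorithms
Table 4-12. Huang system perturbation results (Scenario 2)
Table 4-13. Kang system perturbation results (Scenario 2)
Table 4-14. Comparison with the Artificial adversary method
Table 4-15. Comparison with the FGSM method
Table 4-16. Comparison with the LowProFool method
Table A-1. Training time experiment results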

參考文獻 References
[1] I. J. Goodfellow, J. Shlens, and C. Szegedy, "Explaining and harnessing adversarial
examples," arXiv preprint arXiv:1412.6572, 2014.
[2] C. Szegedy et al., "Intriguing properties of neural networks," arXiv preprint
arXiv:1312.6199, 2013.
[3] Cisco, "2020 全球網路趨勢報告," 2020. [Online]. Available:
https://www.cisco.com/c/dam/m/zh_tw/solutions/enterprise-networks/networkingreport/files/Cisco_BlockBuster_2020-Global-Networking-TrendsReport_ZHTW.pdf
[4] McKinsey, "The state of AI in 2020," McKinsey, Ed., ed.
https://www.mckinsey.com/business-functions/mckinsey-analytics/ourinsights/global-survey-the-state-of-ai-in-2020: mckinsey.com, 2020.
[5] "2020 年網路資安威脅偵測數量成長 20%,突破 626 億," vol. 2021, ed: Trend
Micro, 2021.
[6] "Exploiting AI: How Cybercriminals Misuse and Abuse AI and ML," UNICRI,Trend
Micro,Europol, 2020, vol. 2022. [Online]. Available:
https://www.trendmicro.com/vinfo/us/security/news/cybercrime-and-digitalthreats/exploiting-ai-how-cybercriminals-misuse-abuse-ai-and-ml
[7] mitre.org. "CVE-2021-44228." https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2021-44228 (accessed November 22, 2022).
[8] T. L. 趨勢科技全球技術支援與研發中心, "Apache Log4j 爆十年來最嚴重的漏
洞,而且人人都有危險,Google、Apple、Amazon、 Netflix 等等也都無法倖免,"
ed, 2021.
[9] R. Feinman, R. R. Curtin, S. Shintre, and A. B. Gardner, "Detecting adversarial
samples from artifacts," arXiv preprint arXiv:1703.00410, 2017.
[10] A. Chakraborty, M. Alam, V. Dey, A. Chattopadhyay, and D. Mukhopadhyay,
"Adversarial attacks and defences: A survey," arXiv preprint arXiv:1810.00069, 2018.
[11] A. Aldahdooh, W. Hamidouche, S. A. Fezza, and O. Déforges, "Adversarial example
detection for DNN models: A review and experimental comparison," Artificial
Intelligence Review, pp. 1-60, 2022.
[12] S. Wang, S. Nepal, C. Rudolph, M. Grobler, S. Chen, and T. Chen, "Backdoor attacks
against transfer learning with pre-trained deep learning models," IEEE Transactions
on Services Computing, 2020.
[13] B. Dickson, "Adversarial AI: Blocking the hidden backdoor in neural networks,"
vol. 2020, ed. bdtechtalks.com, 2020.
[14] Y. Deldjoo, T. D. Noia, and F. A. Merra, "A survey on adversarial recommender
systems: from attack/defense strategies to generative adversarial networks," ACM
Computing Surveys (CSUR), vol. 54, no. 2, pp. 1-38, 2021.
[15] V. Ballet, X. Renard, J. Aigrain, T. Laugel, P. Frossard, and M. Detyniecki,
"Imperceptible adversarial attacks on tabular data," arXiv preprint arXiv:1911.03274,
2019.
[16] B. Biggio et al., "Evasion attacks against machine learning at test time," in Joint
European conference on machine learning and knowledge discovery in databases,
2013: Springer, pp. 387-402.
[17] G. L. Wittel and S. F. Wu, "On Attacking Statistical Spam Filters," in CEAS, 2004.
[18] M. Sharif, S. Bhagavatula, L. Bauer, and M. K. Reiter, "Accessorize to a crime: Real
and stealthy attacks on state-of-the-art face recognition," in Proceedings of the 2016
acm sigsac conference on computer and communications security, 2016, pp. 1528-
1540.
[19] K. D. Gupta and D. Dasgupta, "Using Negative Detectors for Identifying Adversarial
Data Manipulation in Machine Learning," in 2021 International Joint Conference on
Neural Networks (IJCNN), 2021: IEEE, pp. 1-8.
[20] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, "Gradient-based learning applied to
document recognition," Proceedings of the IEEE, vol. 86, no. 11, pp. 2278-2324,
1998.
[21] A. Krizhevsky and G. Hinton, "Learning multiple layers of features from tiny
images," 2009.
[22] A. I. Newaz, N. I. Haque, A. K. Sikder, M. A. Rahman, and A. S. Uluagac,
"Adversarial attacks to machine learning-based smart healthcare systems," in
GLOBECOM 2020-2020 IEEE Global Communications Conference, 2020: IEEE, pp.
1-6.
[23] F. Cartella, O. Anunciacao, Y. Funabiki, D. Yamaguchi, T. Akishita, and O. Elshocht,
"Adversarial attacks for tabular data: Application to fraud detection and imbalanced
data," arXiv preprint arXiv:2101.08030, 2021.
[24] I. Goodfellow et al., "Generative adversarial nets," Advances in neural information
processing systems, vol. 27, 2014.
[25] N. Park, M. Mohammadi, K. Gorde, S. Jajodia, H. Park, and Y. Kim, "Data synthesis
based on generative adversarial networks," arXiv preprint arXiv:1806.03384, 2018.
[26] E. Choi, S. Biswal, B. Malin, J. Duke, W. F. Stewart, and J. Sun, "Generating multi-label discrete patient records using generative adversarial networks," in Machine
learning for healthcare conference, 2017: PMLR, pp. 286-305.
[27] L. Xu, M. Skoularidou, A. Cuesta-Infante, and K. Veeramachaneni, "Modeling
tabular data using conditional gan," arXiv preprint arXiv:1907.00503, 2019.
[28] R. Agarwal, T. Thapliyal, and S. K. Shukla, "Detecting malicious accounts showing
adversarial behavior in permissionless blockchains," arXiv preprint
arXiv:2101.11915, 2021.
[29] Y. Mathov, E. Levy, Z. Katzir, A. Shabtai, and Y. Elovici, "Not all datasets are born
equal: On heterogeneous data and adversarial examples," arXiv preprint
arXiv:2010.03180, 2020.
[30] M. Chalé and N. D. Bastian, "Challenges and opportunities for generative methods in
the cyber domain," in 2021 Winter Simulation Conference (WSC), 2021: IEEE, pp. 1-
12.
[31] T. Yu and H. Zhu, "Hyper-parameter optimization: A review of algorithms and
applications," arXiv preprint arXiv:2003.05689, 2020.
[32] J. Bergstra and Y. Bengio, "Random search for hyper-parameter optimization,"
Journal of machine learning research, vol. 13, no. 2, 2012.
[33] J. Snoek, H. Larochelle, and R. P. Adams, "Practical bayesian optimization of
machine learning algorithms," Advances in neural information processing systems,
vol. 25, 2012.
[34] J. Luketina, M. Berglund, K. Greff, and T. Raiko, "Scalable gradient-based tuning of
continuous regularization hyperparameters," in International conference on machine
learning, 2016: PMLR, pp. 2952-2960.
[35] S. R. Young, D. C. Rose, T. P. Karnowski, S.-H. Lim, and R. M. Patton, "Optimizing
deep learning hyper-parameters through an evolutionary algorithm," in Proceedings
of the workshop on machine learning in high-performance computing environments,
2015, pp. 1-5.
[36] N. Gorgolis, I. Hatzilygeroudis, Z. Istenes, and L. G. Gyenne, "Hyperparameter
optimization of LSTM network models through genetic algorithm," in 2019 10th
International Conference on Information, Intelligence, Systems and Applications
(IISA), 2019: IEEE, pp. 1-4.
[37] T. Elsken, J. H. Metzen, and F. Hutter, "Neural architecture search: A survey," The
Journal of Machine Learning Research, vol. 20, no. 1, pp. 1997-2017, 2019.
[38] C. Liu et al., "Progressive neural architecture search," in Proceedings of the European
conference on computer vision (ECCV), 2018, pp. 19-34.
[39] C. Liu et al., "Auto-deeplab: Hierarchical neural architecture search for semantic
image segmentation," in Proceedings of the IEEE/CVF conference on computer
vision and pattern recognition, 2019, pp. 82-92.
[40] Z. Guo et al., "Single path one-shot neural architecture search with uniform
sampling," in European conference on computer vision, 2020: Springer, pp. 544-560.
[41] D. Soni. "artificial-adversary." https://github.com/airbnb/artificial-adversary
(accessed May 20, 2022).
[42] W. Wang et al., "Delving into data: Effectively substitute training for black-box
attack," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern
Recognition, 2021, pp. 4761-4770.
[43] A. Botta, "Getting to know a black-box model:A two-dimensional example of
Jacobian-based adversarial attacks and Jacobian-based data augmentation," ed.
towardsdatascience, 2018.
[44] W. Matsuda, M. Fujimoto, and T. Mitsunaga, "Detecting apt attacks against active
directory using machine leaning," in 2018 IEEE Conference on Application,
Information and Network Security (AINS), 2018: IEEE, pp. 60-65.
[45] Q. Cao, Y. Qiao, and Z. Lyu, "Machine learning to detect anomalies in web log
analysis," in 2017 3rd IEEE International Conference on Computer and
Communications (ICCC), 2017: IEEE, pp. 519-523.
[46] R. Chen et al., "Logtransfer: Cross-system log anomaly detection for software
systems with transfer learning," in 2020 IEEE 31st International Symposium on
Software Reliability Engineering (ISSRE), 2020: IEEE, pp. 37-47.
[47] 黃嵩育, "基於 Active Directory 事件紀錄偵測系統," 碩士論文, 資訊管理學系,
國立中山大學, 2021.
[48] 康為傑, "以非監督式分群及風險分析偵測暴力破解攻擊," 碩士論文, 資訊管理
學系研究所, 國立中山大學, 2021.
[49] R. Bhargava and C. Clifton, "Anomaly detection under poisoning attacks," in
Proceedings of the ODD v5. 0: Outlier Detection De-constructed Workshop, 24th
ACM SIGKDD international conference on Knowledge Discovery and Data Mining
(KDD), 2018.
[50] P. J. Huber, Robust statistical procedures. SIAM, 1996.
[51] L. Perini, C. Galvin, and V. Vercruyssen, "A Ranking Stability Measure for
Quantifying the Robustness of Anomaly Detection Methods," in Joint European
Conference on Machine Learning and Knowledge Discovery in Databases, 2020:
Springer, pp. 397-408.
電子全文 Fulltext
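This electronic full text is licensed to users solely for personal, non-profit searching, reading, and printing for the purpose of academic research. Please comply with the Copyright Act of the Republic of China and do not reproduce, distribute, adapt, repost, or broadcast it without authorization, so as to avoid breaking the law.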
論文使用權限 Thesis access permission:自定論文開放時間 user define
開放時間 Available:
校內 Campus:開放下載的時間 available 2027-07-30
校外 Off-campus:開放下載的時間 available 2027-07-30


紙本論文 Printed copies
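Public information on printed theses is relatively complete from academic year 102 onward. To look up public information on printed theses from academic year 101 or earlier, please contact the printed thesis service desk of the Office of Library and Information Services. We apologize for any inconvenience.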
開放時間 available 2027-07-30
