論文使用權限 Thesis access permission:校內校外完全公開 unrestricted
開放時間 Available:
校內 Campus: 已公開 available
校外 Off-campus: 已公開 available
論文名稱 Title: 基於超像素與圖神經網路的局部影像解釋器 Local Image Explainer based on Superpixel Algorithms and Graph Neural Networks
系所名稱 Department:
畢業學年期 Year, semester:
語文別 Language:
學位類別 Degree:
頁數 Number of pages: 41
研究生 Author:
指導教授 Advisor:
召集委員 Convenor:
口試委員 Advisory Committee:
口試日期 Date of Exam: 2024-08-01
繳交日期 Date of Submission: 2024-08-02
關鍵字 Keywords: 圖神經網路、可解釋性、超像素、電腦視覺、圖卷積網路 Graph Neural Network, Graph Convolutional Network, Interpretability, Superpixel, Computer Vision
統計 Statistics: 本論文已被瀏覽 104 次,被下載 3 次 The thesis/dissertation has been browsed 104 times and downloaded 3 times.
中文摘要 Chinese Abstract
In recent years, researchers have continued to explore the potential of deep learning, and deep learning has been successfully applied across many domains; a large body of research has recently emerged on improving the interpretability of deep learning methods. It has been shown that an AI model's ability to provide the rationale and supporting evidence behind its decisions is essential for earning human trust. We propose the Superpixel Graph Image Explainer, an innovative method that uses superpixel algorithms and graph neural networks to provide local, easy-to-understand explanations for image classification tasks. The explainer applies a superpixel algorithm to convert the two-dimensional pixel matrix into superpixels, collecting low-level information (such as edges, textures, or contours) and converting it into graph data. A graph neural network is then trained on the classification task to obtain local explanations: our algorithm detects the key nodes that influence the GNN's decisions, and mapping these key nodes back to pixel groups in the original image composes the final explanation. Across several datasets, our method achieves strong explanatory power and the fastest runtime compared with traditional image explanation methods, and its accuracy is also competitive with convolutional neural networks.
Abstract
Deep learning models have demonstrated remarkable potential and found extensive applications across various domains. Recently, there has been a surge in research on enhancing the interpretability of deep learning methods: an AI model's ability to offer the rationale and the relevant evidence behind specific decisions has proven essential for acquiring human trust. In this context, we present the Superpixel Graph Image Explainer (SPGIE), an innovative interpretable model designed to generate localized, understandable explanations for images using superpixel algorithms and a Graph Neural Network (GNN). SPGIE employs superpixel algorithms to convert a two-dimensional matrix of pixels into superpixels, aggregating low-level information such as edges, textures, and contours, and converts the result into graph data. After training a GNN on the classification task, our algorithm identifies the critical nodes that influence the GNN's decision-making and traces these superpixel nodes back to the original pixels to compose the final explanation. In experiments, our approach achieves notable interpretability and significantly lower runtime than traditional computer-vision image explainers, and demonstrates accuracy competitive with, or superior to, Convolutional Neural Networks.
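The pipeline described in the abstract (superpixels become graph nodes, adjacency comes from shared pixel boundaries, and node-importance scores are traced back to a pixel mask) can be illustrated with a minimal pure-Python sketch. This is not the thesis's actual implementation: the function names, the 4-connected adjacency rule, and the mean-colour node features are illustrative assumptions.

```python
def superpixels_to_graph(labels, image):
    """Build a region adjacency graph from a superpixel label map.

    labels: H x W list of lists, a superpixel id (0..K-1) per pixel.
    image:  H x W list of lists, a feature tuple (e.g. RGB) per pixel.
    Returns (node_features, edges): the mean feature vector of each
    superpixel, and sorted pairs of superpixels sharing a boundary.
    """
    h, w = len(labels), len(labels[0])
    n = max(max(row) for row in labels) + 1
    dim = len(image[0][0])
    sums = [[0.0] * dim for _ in range(n)]
    counts = [0] * n
    edges = set()
    for y in range(h):
        for x in range(w):
            sp = labels[y][x]
            counts[sp] += 1
            for c in range(dim):
                sums[sp][c] += image[y][x][c]
            # 4-connected neighbours to the right and below
            for ny, nx in ((y, x + 1), (y + 1, x)):
                if ny < h and nx < w and labels[ny][nx] != sp:
                    a, b = sorted((sp, labels[ny][nx]))
                    edges.add((a, b))
    feats = [[s / counts[i] for s in sums[i]] for i in range(n)]
    return feats, sorted(edges)


def importance_to_pixel_mask(labels, node_scores, top_k=1):
    """Trace the top-k highest-scoring superpixel nodes back to pixels."""
    order = sorted(range(len(node_scores)), key=lambda i: -node_scores[i])
    top = set(order[:top_k])
    return [[lab in top for lab in row] for row in labels]
```

In practice the label map would come from a superpixel algorithm such as SLIC (Achanta et al., 2012) and the node scores from the trained GNN's importance estimation; here both are supplied as plain inputs to keep the sketch self-contained.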
目次 Table of Contents
論文審定書 (Thesis Approval) i
摘要 (Chinese Abstract) ii
Abstract iii
List of Figures v
List of Tables vi
1. Introduction 1
2. Background and Related Works 3
2.1 Superpixel Algorithm 3
2.2 Explainer of Computer Vision 4
2.3 Graph Neural Network 6
2.4 Graph Attention Network (GAT) 7
2.5 Dynamic Reduction Network (DRN) 8
3. Methodology 9
3.1 Superpixel Graph Image Classification 9
3.2 Importance Estimation of GNN Models 10
3.3 Converting Superpixels Back to Pixels 15
4. Experiments 15
4.1 Performance of Models 16
4.2 Quantitative Evaluation via Faithfulness of Image Recognition 18
4.3 DRN Distance Regularization 22
4.4 Qualitative Results 24
4.5 Discussion 29
5. Conclusion 30
6. References 31
參考文獻 References
Achanta, R., Shaji, A., Smith, K., Lucchi, A., Fua, P., & Süsstrunk, S. (2012). SLIC Superpixels Compared to State-of-the-Art Superpixel Methods. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34(11), 2274–2282. https://doi.org/10.1109/TPAMI.2012.120
Adebayo, J., Gilmer, J., Goodfellow, I., & Kim, B. (2018). Local Explanation Methods for Deep Neural Networks Lack Sensitivity to Parameter Values. arXiv. https://arxiv.org/abs/1810.03307
Avelar, P. H. C., Tavares, A. R., da Silveira, T. L. T., Jung, C. R., & Lamb, L. C. (2020). Superpixel Image Classification with Graph Attention Networks. 2020 33rd SIBGRAPI Conference on Graphics, Patterns and Images (SIBGRAPI), 203–209. https://doi.org/10.1109/SIBGRAPI51738.2020.00035
Chattopadhyay, A., Sarkar, A., Howlader, P., & Balasubramanian, V. N. (2018). Grad-CAM++: Improved Visual Explanations for Deep Convolutional Networks. 2018 IEEE Winter Conference on Applications of Computer Vision (WACV), 839–847. https://doi.org/10.1109/WACV.2018.00097
Comaniciu, D., & Meer, P. (2002). Mean shift: A robust approach toward feature space analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(5), 603–619. https://doi.org/10.1109/34.1000236
Dwivedi, V. P., Joshi, C. K., Luu, A. T., Laurent, T., Bengio, Y., & Bresson, X. (2022). Benchmarking Graph Neural Networks. arXiv. https://doi.org/10.48550/arXiv.2003.00982
Fey, M., Lenssen, J. E., Weichert, F., & Müller, H. (2018). SplineCNN: Fast Geometric Deep Learning with Continuous B-Spline Kernels. arXiv. http://arxiv.org/abs/1711.08920
Ganin, Y., Ustinova, E., Ajakan, H., Germain, P., Larochelle, H., Laviolette, F., Marchand, M., & Lempitsky, V. (2016). Domain-Adversarial Training of Neural Networks. arXiv. https://doi.org/10.48550/arXiv.1505.07818
Gray, L., Klijnsma, T., & Ghosh, S. (2020). A Dynamic Reduction Network for Point Clouds. arXiv. http://arxiv.org/abs/2003.08013
LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436–444. https://doi.org/10.1038/nature14539
LeCun, Y., Bottou, L., Bengio, Y., & Haffner, P. (1998). Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11), 2278–2324. https://doi.org/10.1109/5.726791
Meyer, F. (1992). Color image segmentation. 1992 International Conference on Image Processing and Its Applications, 303–306. https://ieeexplore.ieee.org/abstract/document/785528
Monti, F., Boscaini, D., Masci, J., Rodola, E., Svoboda, J., & Bronstein, M. M. (2017). Geometric Deep Learning on Graphs and Manifolds Using Mixture Model CNNs. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 5115–5124. https://openaccess.thecvf.com/content_cvpr_2017/html/Monti_Geometric_Deep_Learning_CVPR_2017_paper.html
Ren, X., & Malik, J. (2003). Learning a classification model for segmentation. Proceedings Ninth IEEE International Conference on Computer Vision, 10–17. https://doi.org/10.1109/ICCV.2003.1238308
Ribeiro, M. T., Singh, S., & Guestrin, C. (2016). "Why Should I Trust You?": Explaining the Predictions of Any Classifier. arXiv. http://arxiv.org/abs/1602.04938
Selvaraju, R. R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., & Batra, D. (2017). Grad-CAM: Visual Explanations From Deep Networks via Gradient-Based Localization. Proceedings of the IEEE International Conference on Computer Vision (ICCV), 618–626. https://openaccess.thecvf.com/content_iccv_2017/html/Selvaraju_Grad-CAM_Visual_Explanations_ICCV_2017_paper.html
Stutz, D., Hermans, A., & Leibe, B. (2018). Superpixels: An Evaluation of the State-of-the-Art. Computer Vision and Image Understanding, 166, 1–27. https://doi.org/10.1016/j.cviu.2017.03.007
Vedaldi, A., & Soatto, S. (2008). Quick Shift and Kernel Methods for Mode Seeking. In D. Forsyth, P. Torr, & A. Zisserman (Eds.), Computer Vision – ECCV 2008 (pp. 705–718). Springer Berlin Heidelberg.
Veličković, P., Cucurull, G., Casanova, A., Romero, A., Liò, P., & Bengio, Y. (2018). Graph Attention Networks. arXiv. http://arxiv.org/abs/1710.10903
Wang, Y., Sun, Y., Liu, Z., Sarma, S. E., Bronstein, M. M., & Solomon, J. M. (2019). Dynamic Graph CNN for Learning on Point Clouds. arXiv. https://doi.org/10.48550/arXiv.1801.07829
Zhou, B., Khosla, A., Lapedriza, A., Oliva, A., & Torralba, A. (2015). Learning Deep Features for Discriminative Localization. arXiv. http://arxiv.org/abs/1512.04150
電子全文 Fulltext
This electronic full text is licensed to users solely for personal, non-profit searching, reading, and printing for the purpose of academic research. Please comply with the relevant provisions of the Copyright Act of the Republic of China; do not reproduce, distribute, adapt, repost, or broadcast it without authorization.
紙本論文 Printed copies
Public-access information for printed theses is relatively complete from academic year 102 onward. To inquire about the public-access status of printed theses from academic year 101 or earlier, please contact the printed-thesis service desk of the Office of Library and Information Services. We apologize for any inconvenience. 開放時間 Available: 已公開 available