Title page for etd-0729124-150159
Title
Continual Semi-Supervised Instance Segmentation with Instance Replay and Dual Teachers
Department
Year, semester
Language
Degree
Number of pages
51
Author
Advisor
Convenor
Advisory Committee
Date of Exam
2024-07-24
Date of Submission
2024-08-29
Keywords
Instance segmentation, continual learning, semi-supervised instance segmentation, continual semi-supervised instance segmentation, copy-paste augmentation
Statistics
The thesis/dissertation has been browsed 88 times and downloaded 0 times.
Abstract (Chinese)
Instance segmentation has advanced remarkably in recent years. However, these studies typically train models on fully annotated, fixed datasets. In the real world, data grows continuously, and retraining a model from scratch whenever new classes are introduced is both time-consuming and expensive. This thesis therefore aims to equip the model with continual learning capability: the model should retain previously learned knowledge while being exposed only to new data, so that it can recognize both old and new classes, preserving performance on past knowledge while integrating new information. This setting frequently suffers from catastrophic forgetting, in which the model forgets previously learned knowledge as it learns new data, because each learning step provides annotations only for the new classes.

To better retain old knowledge, this thesis proposes an instance replay strategy based on copy-paste, which pastes instances of old classes onto images of the current learning step so that the model sees old and new classes in the same image. Unlike earlier strategies that replay entire images, instance replay substantially reduces storage, and the copy-paste augmentation preserves old knowledge while increasing the diversity of new-class images, thereby improving overall performance.

Another problem for continual instance segmentation is that it requires pixel-level annotations, which are very costly to obtain. This thesis therefore proposes continual semi-supervised learning: each learning step provides a small amount of fully annotated data together with a large amount of unlabeled data, greatly reducing the need for pixel-level annotation. A novel dual-teacher architecture is then introduced, in which two specialized teacher models generate pseudo-labels for the old and new classes of the unlabeled data, respectively, so that old-class knowledge is fully preserved while new-class knowledge is thoroughly learned.

Finally, by combining the dual-teacher model with copy-paste replay, this thesis outperforms existing methods on the Pascal SBD, COCO, and ADE20K datasets. The gains are especially pronounced over longer sequences of continual learning steps, which validates the method's superiority and makes it better suited to real-world scenarios.
Abstract
Instance segmentation has made significant advancements in the past few years. However, most methods train models on fully annotated, fixed datasets. In the real world, data is continuously being added, and retraining a model each time new classes are introduced is time-consuming and costly. This thesis therefore aims to equip the model with continual learning capability. The goal of continual learning is to enable a model to retain previously learned knowledge while having access only to new data. The major challenge is catastrophic forgetting, which occurs when a model forgets previously learned knowledge while learning new data. To better retain old knowledge, this thesis proposes an instance replay strategy based on copy-paste, in which instances of old classes are pasted onto images from the current learning step so that the model sees both new and old classes in the same image. Unlike previous replay strategies that store entire images, the proposed instance replay markedly reduces storage while preserving old knowledge and enhancing the diversity of new-class images, thereby improving model performance. Another issue is that continual instance segmentation requires pixel-level annotations, which are time-consuming and costly to obtain. This thesis therefore introduces continual semi-supervised instance segmentation: at each learning step, a small amount of fully labeled data is provided along with a large amount of unlabeled data, which reduces the need for pixel-level annotation. To provide pseudo-labels for the unlabeled data, a dual-teacher strategy is employed, in which two specialized teachers generate pseudo-labels for new and old classes, respectively. This strategy improves performance by fully retaining old-class knowledge while enabling better learning of new classes.
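The dual-teacher pseudo-labelling idea described above can be sketched as follows. This is a hypothetical NumPy illustration, not the thesis implementation: the `Detection` class, `merge_pseudo_labels`, and the 0.7 confidence threshold are all assumptions made for the sketch. The key point it shows is that each teacher is trusted only for its own class set.

```python
from dataclasses import dataclass
import numpy as np

# Hypothetical sketch of dual-teacher pseudo-labelling: the old teacher is
# trusted only for old classes, the new teacher only for new classes, and
# the two filtered sets are merged to supervise the student. Names and the
# threshold are illustrative assumptions, not the thesis code.

@dataclass
class Detection:
    class_id: int
    score: float
    mask: np.ndarray  # binary instance mask, H x W

def merge_pseudo_labels(old_dets, new_dets, old_classes, new_classes,
                        score_threshold=0.7):
    """Keep old-class detections from the old teacher and new-class
    detections from the new teacher, filtered by confidence."""
    keep_old = [d for d in old_dets
                if d.class_id in old_classes and d.score >= score_threshold]
    keep_new = [d for d in new_dets
                if d.class_id in new_classes and d.score >= score_threshold]
    return keep_old + keep_new

# Toy example: each teacher also fires (spuriously) outside its specialty.
m = np.ones((4, 4), dtype=bool)
old_teacher = [Detection(0, 0.90, m), Detection(5, 0.95, m)]
new_teacher = [Detection(5, 0.80, m), Detection(0, 0.60, m)]
merged = merge_pseudo_labels(old_teacher, new_teacher,
                             old_classes={0, 1}, new_classes={5})
# merged keeps class 0 from the old teacher and class 5 from the new one
```

Restricting each teacher to its own class set is what prevents the old teacher's stale (or the new teacher's immature) predictions on the other class set from polluting the student's supervision.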
The proposed method, which combines copy-paste replay with the dual-teacher strategy, outperforms existing methods on the Pascal SBD, COCO, and ADE20K datasets, particularly over long sequences of tasks.
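The copy-paste instance replay underlying the method can be illustrated with a minimal sketch, assuming each stored old-class exemplar is kept as an (image patch, binary mask) pair; `paste_instance` and the toy arrays are assumptions for illustration, not the thesis code.

```python
import numpy as np

# Minimal sketch of copy-paste instance replay: a stored old-class exemplar
# (patch + binary mask) is written into the current-step image, touching
# only the masked pixels so the surrounding scene stays intact. Function
# and variable names are illustrative assumptions, not the thesis code.

def paste_instance(image, patch, mask, top, left):
    """Paste a stored old-class instance onto `image` at (top, left)."""
    h, w = mask.shape
    region = image[top:top + h, left:left + w]  # view into the target image
    region[mask] = patch[mask]                  # copy only instance pixels
    return image

# Toy example: paste a 2x2 exemplar with a diagonal mask into a blank image.
canvas = np.zeros((6, 6, 3), dtype=np.uint8)
exemplar = np.full((2, 2, 3), 255, dtype=np.uint8)
inst_mask = np.array([[True, False],
                      [False, True]])
out = paste_instance(canvas, exemplar, inst_mask, top=1, left=1)
# only the two masked pixels at (1, 1) and (2, 2) are overwritten
```

Because only the instance crop and its mask are stored per exemplar, rather than the full source image, a replay buffer built this way needs far less memory than whole-image replay.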
Table of Contents
Thesis Approval Form
Abstract (Chinese)
Abstract (English)
Chapter 1 Introduction
Chapter 2 Related Work
2.1 Instance Segmentation
2.2 Continual Learning
2.3 Continual Instance Segmentation
2.4 Semi-Supervised Learning
2.5 Semi-Supervised Instance Segmentation
2.6 Continual Semi-Supervised Learning
2.7 Copy-Paste Data Augmentation
Chapter 3 Method
3.1 Problem Definition
3.2 Stage 1: Copy-Paste Replay with Continual Learning
3.2.1 Continual-Learning Instance Segmenter
3.2.2 Output-Level Knowledge Distillation
3.2.3 Feature-Level Knowledge Distillation
3.2.4 Pseudo-Labeling
3.2.5 Replaying Old Knowledge with Copy-Paste
3.3 Stage 2: Continual Semi-Supervised Learning with Specialized Teachers
Chapter 4 Experimental Results and Analysis
4.1 Evaluation Protocols
4.2 Implementation Details
4.3 Results on Pascal SBD 2012
4.4 Results on COCO
4.5 Results on ADE20K
4.6 Ablation Study
4.7 Visualization Results
Chapter 5 Conclusion
References
Fulltext
This electronic full text is licensed only for personal, non-profit academic research: searching, reading, and printing. Please comply with the Copyright Act of the Republic of China (Taiwan); do not reproduce, distribute, adapt, repost, or broadcast it without authorization.
Thesis access permission: user-defined release date
Available:
Campus: available 2025-08-29
Off-campus: available 2025-08-29


Printed copies
Public-access information for printed theses is relatively complete from academic year 102 onward. For printed theses from academic year 101 or earlier, please contact the printed-thesis service desk of the Office of Library and Information Services. We apologize for any inconvenience.
Available: 2025-08-29
