Title page for etd-0729124-150159
Title
Continual Semi-Supervised Instance Segmentation with Instance Replay and Dual Teachers
Department
Year, semester
Language
Degree
Number of pages
51
Author
Advisor
Convenor
Advisory Committee
Date of Exam
2024-07-24
Date of Submission
2024-08-29
Keywords
Instance segmentation, continual learning, semi-supervised instance segmentation, continual semi-supervised instance segmentation, copy-paste augmentation
Statistics
The thesis/dissertation has been browsed 88 times and downloaded 0 times.
Abstract (Chinese)
Instance segmentation has advanced remarkably in recent years. However, these studies typically train models on fully annotated, fixed datasets. In the real world, data grows continuously, and retraining a model from scratch whenever new classes are introduced is both time-consuming and expensive. This thesis therefore aims to equip the model with continual learning capability: the model should retain previously learned knowledge while being exposed only to new data, so that it can recognize both old and new classes, preserving performance on past knowledge while integrating new information. This setting frequently suffers from catastrophic forgetting, in which the model forgets previously learned knowledge as it learns new data, because each learning step provides annotations only for the new classes.

To better retain old knowledge, this thesis proposes an instance replay strategy based on copy-paste, which pastes instances of old classes onto images of the current learning step so that the model sees old and new classes in the same image. Unlike earlier strategies that replay entire images, instance replay substantially reduces storage, and the copy-paste augmentation preserves old knowledge while increasing the diversity of new-class images, thereby improving overall performance.

Another problem for continual instance segmentation is that it requires pixel-level annotations, which are very costly to obtain. This thesis therefore proposes continual semi-supervised learning: each learning step provides a small amount of fully annotated data together with a large amount of unlabeled data, greatly reducing the need for pixel-level annotation. A novel dual-teacher architecture is then introduced, in which two specialized teacher models generate pseudo-labels for the old and new classes of the unlabeled data, respectively, so that old-class knowledge is fully preserved while new-class knowledge is thoroughly learned.

Finally, by combining the dual-teacher model with copy-paste replay, this thesis outperforms existing methods on the Pascal SBD, COCO, and ADE20K datasets. The gains are especially pronounced over longer sequences of continual learning steps, which validates the method's superiority and makes it better suited to real-world scenarios.
Abstract
Instance segmentation has made significant advancements in the past few years. However, most methods train models on fully annotated, fixed datasets. In the real world, data is continuously being added, and retraining a model each time new classes are introduced is time-consuming and costly. This thesis therefore aims to equip the model with continual learning capability. The goal of continual learning is to enable a model to retain previously learned knowledge while having access only to new data. The major challenge is catastrophic forgetting, which occurs when a model forgets previously learned knowledge while learning new data. To better retain old knowledge, this thesis proposes an instance replay strategy based on copy-paste, in which instances of old classes are pasted onto images from the current learning step so that the model sees both new and old classes in the same image. Unlike previous replay strategies that store entire images, the proposed instance replay markedly reduces storage while preserving old knowledge and enhancing the diversity of new-class images, thereby improving model performance. Another issue is that continual instance segmentation requires pixel-level annotations, which are time-consuming and costly to obtain. This thesis therefore introduces continual semi-supervised instance segmentation: at each learning step, a small amount of fully labeled data is provided along with a large amount of unlabeled data, which reduces the need for pixel-level annotation. To provide pseudo-labels for the unlabeled data, a dual-teacher strategy is employed, in which two specialized teachers generate pseudo-labels for new and old classes, respectively. This strategy improves performance by fully retaining old-class knowledge while enabling better learning of new classes.
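The dual-teacher pseudo-labelling idea described above can be sketched as follows. This is a hypothetical NumPy illustration, not the thesis implementation: the `Detection` class, `merge_pseudo_labels`, and the 0.7 confidence threshold are all assumptions made for the sketch. The key point it shows is that each teacher is trusted only for its own class set.

```python
from dataclasses import dataclass
import numpy as np

# Hypothetical sketch of dual-teacher pseudo-labelling: the old teacher is
# trusted only for old classes, the new teacher only for new classes, and
# the two filtered sets are merged to supervise the student. Names and the
# threshold are illustrative assumptions, not the thesis code.

@dataclass
class Detection:
    class_id: int
    score: float
    mask: np.ndarray  # binary instance mask, H x W

def merge_pseudo_labels(old_dets, new_dets, old_classes, new_classes,
                        score_threshold=0.7):
    """Keep old-class detections from the old teacher and new-class
    detections from the new teacher, filtered by confidence."""
    keep_old = [d for d in old_dets
                if d.class_id in old_classes and d.score >= score_threshold]
    keep_new = [d for d in new_dets
                if d.class_id in new_classes and d.score >= score_threshold]
    return keep_old + keep_new

# Toy example: each teacher also fires (spuriously) outside its specialty.
m = np.ones((4, 4), dtype=bool)
old_teacher = [Detection(0, 0.90, m), Detection(5, 0.95, m)]
new_teacher = [Detection(5, 0.80, m), Detection(0, 0.60, m)]
merged = merge_pseudo_labels(old_teacher, new_teacher,
                             old_classes={0, 1}, new_classes={5})
# merged keeps class 0 from the old teacher and class 5 from the new one
```

Restricting each teacher to its own class set is what prevents the old teacher's stale (or the new teacher's immature) predictions on the other class set from polluting the student's supervision.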
The proposed method, which combines copy-paste replay with the dual-teacher strategy, outperforms existing methods on the Pascal SBD, COCO, and ADE20K datasets, particularly over long sequences of tasks.
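The copy-paste instance replay underlying the method can be illustrated with a minimal sketch, assuming each stored old-class exemplar is kept as an (image patch, binary mask) pair; `paste_instance` and the toy arrays are assumptions for illustration, not the thesis code.

```python
import numpy as np

# Minimal sketch of copy-paste instance replay: a stored old-class exemplar
# (patch + binary mask) is written into the current-step image, touching
# only the masked pixels so the surrounding scene stays intact. Function
# and variable names are illustrative assumptions, not the thesis code.

def paste_instance(image, patch, mask, top, left):
    """Paste a stored old-class instance onto `image` at (top, left)."""
    h, w = mask.shape
    region = image[top:top + h, left:left + w]  # view into the target image
    region[mask] = patch[mask]                  # copy only instance pixels
    return image

# Toy example: paste a 2x2 exemplar with a diagonal mask into a blank image.
canvas = np.zeros((6, 6, 3), dtype=np.uint8)
exemplar = np.full((2, 2, 3), 255, dtype=np.uint8)
inst_mask = np.array([[True, False],
                      [False, True]])
out = paste_instance(canvas, exemplar, inst_mask, top=1, left=1)
# only the two masked pixels at (1, 1) and (2, 2) are overwritten
```

Because only the instance crop and its mask are stored per exemplar, rather than the full source image, a replay buffer built this way needs far less memory than whole-image replay.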
Table of Contents
Thesis Approval Form
Abstract (Chinese)
Abstract (English)
Chapter 1 Introduction
Chapter 2 Related Work
2.1 Instance Segmentation
2.2 Continual Learning
2.3 Continual Instance Segmentation
2.4 Semi-Supervised Learning
2.5 Semi-Supervised Instance Segmentation
2.6 Continual Semi-Supervised Learning
2.7 Copy-Paste Data Augmentation
Chapter 3 Method
3.1 Problem Definition
3.2 Stage 1: Copy-Paste Replay with Continual Learning
3.2.1 Continual-Learning Instance Segmenter
3.2.2 Output-Level Knowledge Distillation
3.2.3 Feature-Level Knowledge Distillation
3.2.4 Pseudo-Labeling
3.2.5 Replaying Old Knowledge with Copy-Paste
3.3 Stage 2: Continual Semi-Supervised Learning with Specialized Teachers
Chapter 4 Experimental Results and Analysis
4.1 Evaluation Protocols
4.2 Implementation Details
4.3 Results on Pascal SBD 2012
4.4 Results on COCO
4.5 Results on ADE20K
4.6 Ablation Study
4.7 Visualization Results
Chapter 5 Conclusion
References
Fulltext
This electronic full text is licensed only for personal, non-profit academic research: searching, reading, and printing. Please comply with the Copyright Act of the Republic of China (Taiwan); do not reproduce, distribute, adapt, repost, or broadcast it without authorization.
Thesis access permission: user-defined release date
Available:
Campus: available 2025-08-29
Off-campus: available 2025-08-29


Printed copies
Public-access information for printed theses is relatively complete from academic year 102 onward. For printed theses from academic year 101 or earlier, please contact the printed-thesis service desk of the Office of Library and Information Services. We apologize for any inconvenience.
Available: 2025-08-29
