National Sun Yat-sen University (國立中山大學), thesis/dissertation: Identification and Source Ranging for Underwater Target in Taiwan's waters (台灣水域鯨豚聲紋辨識與定位研究)

Title	Identification and Source Ranging for Underwater Target in Taiwan's waters (台灣水域鯨豚聲紋辨識與定位研究)
Department	Institute of Undersea Technology (海下科技研究所)
Year, semester	Spring semester of Academic Year 108 (ROC calendar; 2019-2020)	Language	Chinese
Degree	Master	Number of pages	97
Author	Chun-Wei Yang (楊濬維)
Advisor	Linus Yung-Sheng Chiu (邱永盛)
Convenor	Hin-Kiu Mok (莫顯蕎)
Advisory Committee	Ching-Jer Huang (黃清哲); Hsin-Hung Chen (陳信宏)
Date of Exam	2020-05-27	Date of Submission	2020-06-29
Keywords	End-point Detection, Mel-frequency Cepstrum Coefficient, Source Ranging, Voiceprint Recognition, Underwater Acoustics
Statistics	The thesis/dissertation has been browsed 5931 times and downloaded 80 times.

Abstract
As renewable energy gains international importance, offshore wind power has drawn increasing attention, and a large number of offshore wind turbine foundation piles are expected to be installed and operated along the west coast of Taiwan. The resulting harm to the marine ecological environment should not be neglected: persistent low-frequency ambient noise may cause severe hearing impairment in marine mammals and can even endanger their lives. Real-time monitoring of cetacean activity during pile driving is therefore especially important. This study uses passive acoustic monitoring together with Mel-frequency cepstral coefficients (MFCC) to perform voiceprint recognition of cetacean whistles. The frequency-band distribution of this feature extraction method resembles the nonlinear response of the human auditory system, with higher resolution at low frequencies than at high frequencies, and the method is widely used in speech and voiceprint recognition. Beyond identifying the target signal, the study also aims to estimate the range between source and receiver. All received acoustic data contain travel-time differences, that is, delays between arrivals of the target signal that reached the receiver along different paths. Because underwater propagation is bounded by the sea surface and the seabed, boundary reflections produce multipath arrivals whose delay times differ with the spatial position of the source. Once the underwater impulse (channel) responses for different positions are known and compiled into a database, voiceprint recognition can be used to range the source. Analysis of the active acoustic experiment data shows that, with three different types of transmitted signal, the voiceprint recognition success rate reached 97.86%. In the source ranging experiment conducted in the South China Sea, the ranging error was within 0.5 km when the source and receiver were within 2 km of each other. The results indicate that MFCC feature extraction is very promising for voiceprint recognition and classification of underwater targets and transient signals, and that the travel-time-difference characteristics of the received data can also serve as features for source ranging: through recognition and matching, the distance between source and receiver can be estimated effectively.
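To illustrate why multipath delay carries range information, the following is a minimal sketch and not the method used in the thesis, which matches MFCC features of received data against a database. It assumes a constant sound speed of 1500 m/s, a flat sea surface, and only two arrivals (direct and surface-reflected). The depths, ranges, and function names are hypothetical, chosen only for illustration.

```python
import math

SOUND_SPEED = 1500.0  # m/s; assumed constant (isovelocity water), not from the thesis


def surface_delay(range_m, src_depth_m, rcv_depth_m, c=SOUND_SPEED):
    """Delay (s) of the surface-reflected arrival relative to the direct arrival."""
    direct = math.hypot(range_m, src_depth_m - rcv_depth_m)
    # Image-source method: mirror the source across the flat sea surface.
    reflected = math.hypot(range_m, src_depth_m + rcv_depth_m)
    return (reflected - direct) / c


def build_delay_table(ranges_m, src_depth_m, rcv_depth_m):
    """Hypothetical 'database': one (range, delay) pair per candidate range."""
    return [(r, surface_delay(r, src_depth_m, rcv_depth_m)) for r in ranges_m]


def estimate_range(measured_delay_s, table):
    """Return the candidate range whose stored delay is closest to the measurement."""
    return min(table, key=lambda entry: abs(entry[1] - measured_delay_s))[0]


if __name__ == "__main__":
    # Illustrative geometry: source at 20 m depth, receiver at 50 m depth.
    table = build_delay_table(range(100, 2001, 100), 20.0, 50.0)
    measured = surface_delay(1230.0, 20.0, 50.0)  # simulated measurement
    print("estimated range (m):", estimate_range(measured, table))
```

In this simplified geometry the delay shrinks steadily as range grows, so each candidate range maps to a distinct delay. A real channel has more arrivals (seabed and multiple bounces) and a depth-dependent sound speed, which is why a measured or modelled impulse-response database is needed in practice.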

Table of Contents
Thesis Approval Certificate
Acknowledgements
Abstract (Chinese)
Abstract
Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Preface
1.2 Research Motivation and Objectives
1.3 Literature Review
1.4 Thesis Organization
Chapter 2 Signal Processing and Numerical Methods
2.1 Band-pass Filter
2.2 Modulation and Resampling
2.3 End-Point Detection
2.4 Feature Extraction
2.5 Voiceprint Matching and Recognition
2.6 Energy Analysis of Received Acoustic Data
Chapter 3 Voiceprint Recognition Experiment for Underwater Targets off Kaohsiung Harbor
3.1 Experimental Instruments
3.2 Acoustic Signals Used in the Experiment
3.3 Analysis Results of the Experimental Data
3.4 Summary
Chapter 4 Underwater Target Detection and Source Localization Experiment in the South China Sea
4.1 Experimental Instruments
4.2 Acoustic Signals Used in the Experiment
4.3 Voiceprint Database Construction
4.4 Analysis Results of the South China Sea Experiment
Chapter 5 Conclusions
References


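Fulltext
This electronic full text is licensed to users only for personal, non-profit searching, reading, and printing for the purpose of academic research. Please observe the Copyright Act of the Republic of China and do not reproduce, distribute, adapt, repost, or broadcast it without authorization. Thesis access permission: user-defined release date. Available: Campus: available; Off-campus: available. File: etd-0529120-164307.pdf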
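Printed copies
Public availability information for printed theses is relatively complete from Academic Year 102 onward. For printed theses from Academic Year 101 or earlier, please contact the printed thesis service desk of the Office of Library and Information Services. Available: available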

Office of Library and Information Services, National Sun Yat-sen University │ Contact Us : 2452 Thesis Format Review Team , Mail │ Development and operations : Knowledge Innovation Division, LIS