論文使用權限 Thesis access permission:校內立即公開,校外一年後公開 off campus withheld
開放時間 Available:
校內 Campus: 已公開 available
校外 Off-campus: 已公開 available
論文名稱 Title |
多重序列的最長共同子序列之混合式演算法 A Hybrid Algorithm for the Longest Common Subsequence of Multiple Sequences |
||
系所名稱 Department |
|||
畢業學年期 Year, semester |
語文別 Language |
||
學位類別 Degree |
頁數 Number of pages |
67 |
|
研究生 Author |
|||
指導教授 Advisor |
|||
召集委員 Convenor |
|||
口試委員 Advisory Committee |
|||
口試日期 Date of Exam |
2009-07-07 |
繳交日期 Date of Submission |
2009-08-19 |
關鍵字 Keywords |
最長共同子序列、螞蟻演算法、基因演算法 Ant Colony Optimization Algorithm, genetic algorithm, longest common subsequence |
||
統計 Statistics |
本論文已被瀏覽 5796 次,被下載 1594 次 The thesis/dissertation has been browsed 5796 times, has been downloaded 1594 times. |
中文摘要 |
k-LCS問題是為了找出k條序列的最長共同子序列(LCS),當序列的數量很大時,此問題是困難的。以往,研究者都著重於尋找兩條序列的最長共同子序列 (2-LCS) 之研究,然而對於找出k條序列的最長共同子序列,從古至今仍未有好的演算法能夠找到最佳解。為了解決k-LCS問題,在本論文中,我們首先提出一個混合式的演算法,它結合啟發式演算法、基因演算法(GA)和螞蟻演算法(ACO)。 此外,我們也提出一個增強式螞蟻演算法(EACO),它以螞蟻演算法為主體結合啟發式演算法和最佳配對演算法(MPA)。在我們實驗中,我們的演算法與expansion演算法、best next for maximal available symbol演算法、基因演算法和螞蟻演算法相互比較,對於多組基因序列與蛋白質序列的資料集實驗,我們的演算法(EACO)的結果都優於其他的演算法。 |
Abstract |
The k-LCS problem is to find the longest common subsequence (LCS) of k input sequences. It is difficult while the number of input sequences is large. In the past, researchers focused on finding the LCS of two sequences (2-LCS). However, there is no good algorithm for finding the optimal solution of k-LCS up to now. For solving the k-LCS problem, in this thesis, we first propose a mixed algorithm, which is a combination of a heuristic algorithm, genetic algorithm (GA) and ant colony optimization (ACO) algorithm. Then, we propose an enhanced ACO (EACO) algorithm, composed of the heuristic algorithm and matching pair algorithm (MPA). In our experiments, we compare our algorithms with expansion algorithm, best next for maximal available symbol algorithm, GA and ACO algorithm. The experimental results on several sets of DNA and protein sequences show that our EACO algorithm outperforms other algorithms in the lengths of solutions. |
目次 Table of Contents |
LIST OF FIGURES ii LIST OF TABLES iii ABSTRACT v 1 Introduction 1 2 Preliminaries 4 2.1 The Longest Common Subsequence Problem . . . . . . . . . . . . . . . 4 2.2 Dynamic Programming Algorithm for 2-LCS . . . . . . . . . . . . . . . 5 2.3 Algorithms for k-LCS . . . . . . . . . . . . . . . . . . . . . . . . 9 2.3.1 The Expansion Algorithm . . . . . . . . . . . . . . . . . . 9 2.3.2 The Best Next for Maximal Available Symbols . . . . . . . . 10 2.3.3 Genetic Algorithm for k-LCS . . . . . . . . . . . . . . . . 11 2.3.4 Ant Colony Optimization Algorithm for k-LCS . . . . . . . . 13 3 Hybrid Algorithms for k-LCS 17 3.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17 3.2 Mixed Algorithm with ACO and GA . . . . . . . . . . . . . . . . . . . 18 3.3 Enhanced ACO Algorithm with MPA . . . . . . . . . . . . . . . . . . . 26 4 Experimental Results 31 5 Conclusion 50 BIBLIOGRAPHY 52 |
參考文獻 References |
[1] H.-Y. Ann, C.-B. Yang, C.-T. Tseng, and C.-Y. Hor, “A fast and simple algorithm for computing the longest common subsequence of run-length encoded strings,” Information Processing Letters, Vol. 108, pp. 360–364, 2008. [2] P. Bonizzoni, G. D. Vedova, and G. Mauri, “Experimenting an approximation algorithm for the LCS,” Discrete Applied Mathematics, Vol. 110, No. 1, pp. 13–24, 2001. [3] C.-H. Chiang, “A genetic algorithm for the longest common subsequence of multiple sequences,” Master Thesis, Department of Computer Science and Engineering, National Sun Yat-sen University, 2009. [4] M. Dorigo, V. Maniezzo, and A. Colorni, “Ant system: Optimization by a colony of cooperating agents,” IEEE Transactions on Systems, Man, and Cybernetics-Part B, Vol. 26, No. 1, pp. 29–41, 1996. [5] D. S. Hirschberg, “A linear space algorithm for computing maximal common subsequence,” Communications of the ACM, Vol. 18, No. 6, pp. 341–343, 1975. [6] D. S. Hirschberg, “Algorithms for the longest common subsequence problem,” Journal of ACM, Vol. 24, pp. 664–675, 1977. [7] K.-F. Huang, C.-B. Yang, and K.-T. Tseng, “An efficient algorithm for multiple sequence alignment,” Proc. of the 19th Workshop on Combinatorial Mathematics and Computation Theory, Kaohsiung, Taiwan, pp. 50–59, 2002. [8] K.-S. Huang, C.-B. Yang, and K.-T. Tseng, “Fast algorithms for finding the common subsequence of multiple sequences,” Proceeings of International Computer Symposium, Taipei, Taiwan, pp. 90–95, 2004. [9] K.-S. Huang, C.-B. Yang, K.-T. Tseng, H.-Y. Ann, and Y.-H. Peng,“Efficient algorithms for finding interleaving relationship between sequences,”Information Processing Letters, Vol. 105, No. 5, pp. 188–193, 2008. [10] K.-S. Huang, C.-B. Yang, K.-T. Tseng, Y.-H. Peng, and H.-Y. Ann, “Dynamic programming algorithms for the mosaic longest common subsequence problem,” Information Processing Letters, Vol. 102, pp. 99–103, 2007. [11] J. W. Hunt and T. G. Szymanski, “A fast algorithm for computing longest common subsequences,” Communications of the ACM, Vol. 20, No. 5, pp. 350–353, 1977. [12] R. C. T. Lee, R. C. Chang, S. S. Tseng, and Y. T. Tsai, Introduction to the Design and Analysis of Algorithm - a strategic approach. ISBN 007-124346-1, McGraw Hill, 2005. [13] D. Maier, “The complexity of some problems on subsequences and supersequences,”Journal of the ACM, Vol. 25, No. 2, pp. 322–336, 1978. [14] D. B. Needleman and C. D. Wunsch, “A general method applicable to the search for similarities in the amino acid sequence of two proteins,”Journal of Molecular Biology, Vol. 48, No. 3, pp. 443–453, 1970. [15] Y.-H. Peng, C.-B. Yang, K.-S. Huang, C.-T. Tseng, and C.-Y. Hor,“Efficient sparse dynamic programming for the merged LCS problem with block constraints,” Accepted by International Journal of Innovative Computing, Information and Control, 2009. [16] S.-J. Shyu and C.-Y. Tsai, “Finding the longest common subsequence for multiple biological sequences by ant colony optimization,” Computers & Operations Research, Vol. 36, pp. 73–91, 2009. [17] R. E. Smith, B. A. Dike, and S. A. Stegmann, “Fitness inheritance in genetic algorithms,” Proceedings of the 1995 ACM symposium on Applied computing, Nashville, TN, USA, pp. 345–350, 1995. [18] R. A. Wagner and M. J. Fischer, “The string-to-string correction problem,”Journal of the ACM, Vol. 21, No. 1, pp. 168–173, 1974. [19] C.-B. Yang and R. C. T. Lee, “Systolic algorithms for the longest common subsequence problem,” Journal of the Chinese Institute of Engineers, Vol. 10, No. 6, pp. 691–699, 1987. |
電子全文 Fulltext |
本電子全文僅授權使用者為學術研究之目的,進行個人非營利性質之檢索、閱讀、列印。請遵守中華民國著作權法之相關規定,切勿任意重製、散佈、改作、轉貼、播送,以免觸法。 論文使用權限 Thesis access permission:校內立即公開,校外一年後公開 off campus withheld 開放時間 Available: 校內 Campus: 已公開 available 校外 Off-campus: 已公開 available |
紙本論文 Printed copies |
紙本論文的公開資訊在102學年度以後相對較為完整。如果需要查詢101學年度以前的紙本論文公開資訊,請聯繫圖資處紙本論文服務櫃台。如有不便之處敬請見諒。 開放時間 available 已公開 available |
QR Code |