Master's and Doctoral Theses: Detailed Record for etd-0719111-152616



Name: 彭煜庭 (Yu-Ting Peng)    E-mail: not publicly available
Department: Electrical Engineering (電機工程學系研究所)
Degree: Master    Graduation Term: 2nd semester of academic year 99 (spring 2011)
Thesis Title (Chinese): 分散式奇異值分解最小平方估計演算法
Thesis Title (English): Distributed Algorithms for SVD-based Least Squares Estimation
Files
  • etd-0719111-152616.pdf
  • This electronic full text is licensed for retrieval, reading, and printing by individual users for non-profit academic research purposes only. Please observe the Copyright Act of the Republic of China; do not reproduce, distribute, adapt, repost, or broadcast it without authorization.

Access Rights: electronic thesis available to the public, both on and off campus, one year after submission

Language/Pages: Chinese / 82
Statistics: this thesis has been viewed 5633 times and downloaded 1844 times
Abstract (Chinese): Singular value decomposition (SVD) is commonly used to solve least-squares estimation problems, but computing least-squares solutions via SVD consumes a great deal of time and memory. This thesis therefore proposes an iterative divide-and-merge SVD algorithm (IDMSVD), whose goal is to reduce the heavy time and memory costs of SVD-based parameter estimation. The idea of IDMSVD is to first reduce the data through SVD over several levels, and then estimate the parameters from the reduced data using SVD. Each level of data reduction consists of three steps: the input data are divided into many blocks, each block is decomposed by SVD, and the decomposition results are merged to form the input matrix of the next level. These three steps are repeated until the reduced data are small enough, and the least-squares solution is then obtained by the SVD-based least-squares estimator. For large datasets, however, the running time of IDMSVD still leaves room for improvement: at each level, IDMSVD processes the data blocks sequentially, yet the blocks are mutually independent, so processing all blocks simultaneously can save considerable time. This thesis therefore proposes two accelerated versions of IDMSVD, implemented on two distributed systems: the Hadoop cloud-computing platform and NVIDIA graphics processing units (GPUs). The algorithm implemented with MapReduce on the Hadoop platform is called the distributed IDMSVD algorithm, and the one implemented on GPUs is called the parallel IDMSVD algorithm. Experimental results show that IDMSVD effectively reduces the execution time and memory that SVD requires to obtain least-squares solutions, and that the distributed and parallel IDMSVD algorithms further improve the execution time of IDMSVD.
Abstract (English): Singular value decomposition (SVD) is a popular decomposition method for solving least-squares estimation problems. However, for large datasets, SVD is very time-consuming and memory-demanding when computing least-squares solutions. In this thesis, we propose a least-squares estimator based on an iterative divide-and-merge scheme for large-scale estimation problems. The estimator consists of several levels. At each level, the input matrix is subdivided into submatrices, each submatrix is decomposed by SVD, and the results are merged into smaller matrices that become the input of the next level. The process is iterated until the resulting matrices are small enough to be solved directly and efficiently by the SVD algorithm. However, the iterative divide-and-merge algorithm executed on a single machine is still time-demanding on large-scale datasets. We propose two distributed algorithms that overcome this shortcoming by allowing several machines to perform the decomposition and merging of the submatrices at each level in parallel. The first is implemented in MapReduce on the Hadoop distributed platform, which runs the tasks in parallel on a collection of computers. The second is implemented in CUDA, which runs the tasks in parallel on NVIDIA GPUs. Experimental results demonstrate that the proposed distributed algorithms greatly reduce the time required to solve large least-squares problems.
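The divide-and-merge reduction described in the abstracts can be sketched on a single machine in a few lines of NumPy. This is a minimal illustration under my own assumptions, not the thesis's implementation: the function names, the block size, and the final call to numpy.linalg.lstsq are hypothetical. Each block M is replaced by S·Vᵀ from its thin SVD, which preserves MᵀM and therefore the block's contribution to the least-squares normal equations:

```python
import numpy as np

def reduce_block(block):
    # Thin SVD of the block M = U S V^T; keep only S V^T.
    # (S V^T)^T (S V^T) = V S^2 V^T = M^T M, so the block's
    # contribution to the normal equations is preserved exactly.
    _, s, vt = np.linalg.svd(block, full_matrices=False)
    return s[:, None] * vt

def idmsvd_lstsq(A, b, block_rows=50):
    # Append b as an extra column so A and b are reduced together.
    n = A.shape[1]
    assert block_rows > n + 1, "blocks must be taller than they are wide"
    M = np.hstack([A, b.reshape(-1, 1)])
    # One level per pass: split into row blocks, reduce each block by
    # SVD, merge (re-stack) the results, and repeat until the merged
    # matrix is small enough to solve directly.
    while M.shape[0] > block_rows:
        M = np.vstack([reduce_block(M[i:i + block_rows])
                       for i in range(0, M.shape[0], block_rows)])
    # Solve the small reduced least-squares problem.
    x, *_ = np.linalg.lstsq(M[:, :n], M[:, n], rcond=None)
    return x
```

Because the blocks within a level are mutually independent, the loop over blocks is exactly the part the thesis distributes, as MapReduce map tasks on Hadoop or as parallel work on NVIDIA GPUs via CUDA.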
Keywords (Chinese)
  • matrix decomposition (矩陣分解)
  • singular value decomposition (奇異值分解)
  • least-squares estimation (最小平方估計)
  • distributed systems (分散式系統)
  • large-scale datasets (大型資料集)
  • parallel processing (平行處理)

Keywords (English)
  • CUDA
  • Matrix decomposition
  • large-scale dataset
  • least-squares solution
  • SVD
  • MapReduce
  • Distributed
Table of Contents
Thesis Approval i
Acknowledgments iii
Abstract (Chinese) iv
Abstract (English) v
Chapter 1  Introduction 1
1.1 Motivation and Literature Review 1
1.2 Thesis Organization 3
Chapter 2  Least Squares Estimation 4
2.1 The Least Squares Problem 4
2.2 SVD-based Least Squares Estimation 5
2.3 Recursive SVD-based Least Squares Estimation 7
2.4 Examples of Least Squares Estimation 9
2.4.1 SVD-based Least Squares Estimation 10
2.4.2 Recursive SVD-based Least Squares Estimation 11
Chapter 3  The Iterative Divide-and-Merge Algorithm 13
3.1 Iterative Divide-and-Merge SVD-based Least Squares Estimation 13
3.2 Complexity Analysis and Comparison 19
3.3 An Example of Iterative Divide-and-Merge SVD-based Least Squares Estimation 21
Chapter 4  The Distributed Iterative Divide-and-Merge Algorithm 25
4.1 Hadoop 25
4.1.1 Job Assignment in Hadoop 26
4.1.2 Hadoop Cluster Architecture 26
4.1.3 HDFS 28
4.2 MapReduce 28
4.2.1 The MapReduce Programming Model 28
4.3 Distributed Iterative Divide-and-Merge SVD-based Least Squares Estimation 32
Chapter 5  The Parallel Iterative Divide-and-Merge Algorithm 36
5.1 General-Purpose GPUs 36
5.2 CUDA 36
5.2.1 CUDA Architecture 38
5.2.2 The CUDA Memory Model 40
5.3 Parallel Iterative Divide-and-Merge SVD-based Least Squares Estimation 41
Chapter 6  Experimental Results 44
6.1 Distributed Iterative Divide-and-Merge SVD-based Least Squares Estimation 44
6.1.1 Experimental Data 44
6.1.2 Experimental Environment 44
6.1.3 MapReduce Experiment 1 45
6.1.4 MapReduce Experiment 2 50
6.1.5 MapReduce Experiment 3 52
6.2 Parallel Iterative Divide-and-Merge SVD-based Least Squares Estimation 53
6.2.1 Experimental Data 54
6.2.2 Experimental Environment 54
6.2.3 GPU Experiment 1 54
6.2.4 GPU Experiment 2 61
Chapter 7  Conclusion and Future Work 67
7.1 Conclusion 67
7.2 Future Work 68
References 69
References
[1] G. H. Golub and C. Reinsch, “Singular value decomposition and least squares solutions,” Numerische Mathematik, vol. 14, no. 6, pp. 403–420, April 1970.
[2] G. H. Golub and C. F. Van Loan, Matrix Computations, 3rd ed. Baltimore, MD, USA: The Johns Hopkins University Press, October 1996.
    [3] D. C. Montgomery, E. A. Peck, and G. G. Vining, Introduction to Linear Regression Analysis, 4th ed. Hoboken, N.J., USA: Wiley-Interscience, July 2006.
    [4] R. H. Myers, D. C. Montgomery, G. G. Vining, and T. J. Robinson, Generalized Linear Models: with Applications in Engineering and the Sciences, 2nd ed. Hoboken, N.J., USA: Wiley-Interscience, March 2010.
    [5] O. Bretscher, Linear Algebra With Applications, 3rd ed. Upper Saddle River, N.J., USA: Prentice Hall, July 2004.
[6] A. Bjorck, Numerical Methods for Least Squares Problems, 1st ed. Philadelphia, PA, USA: Society for Industrial and Applied Mathematics (SIAM), December 1996.
[7] S. S. Niu, L. Ljung, and A. Bjorck, “Decomposition methods for solving least-squares parameter estimation,” IEEE Transactions on Signal Processing, vol. 44, no. 11, pp. 2847–2862, November 1996.
    [8] A. Bjorck and J. Y. Yuan, “Preconditioners for least squares problems by LU factorization,” Electronic Transactions on Numerical Analysis, vol. 8, pp. 26–36, November 1999.
    [9] S.-J. Lee and C.-S. Ouyang, “A neuro-fuzzy system modeling with self-constructing rule generation and hybrid SVD-based learning,” IEEE Transactions on Fuzzy Systems, vol. 11, no. 3, pp. 341–363, June 2003.
[10] L. V. Foster, “Solving rank-deficient and ill-posed problems using UTV and QR factorizations,” SIAM Journal on Matrix Analysis and Applications, vol. 26, no. 2, pp. 582–600, February 2003.
[11] C. B. Moler, Numerical Computing with MATLAB. Philadelphia, PA, USA: Society for Industrial and Applied Mathematics, January 2004.
    [12] W. H. Press, B. P. Flannery, S. A. Teukolsky, and W. T. Vetterling, Numerical Recipes in C: The Art of Scientific Computing, 2nd ed. Cambridge, UK: Cambridge University Press, October 1992.
    [13] L. Giraud, S. Gratton, and J. Langou, “A rank-k update procedure for reorthogonalizing the orthogonal factor from modified Gram-Schmidt,” SIAM Journal on Matrix Analysis and Applications, vol. 26, no. 4, pp. 1163–1177, April 2004.
[14] V. Hari, “Accelerating the SVD block-Jacobi method,” Computing, vol. 76, no. 1, pp. 27–63, March 2006.
    [15] Y. Yamamoto, T. Fukaya, T. Uneyama, M. Takata, K. Kimura, M. Iwasaki, and Y. Nakamura, “Accelerating the singular value decomposition of rectangular matrices with the CSX600 and the integrable SVD,” in Lecture Notes in Computer Science, vol. 4671, 2007, pp. 340–346.
    [16] S. Lahabar and P. J. Narayanan, “Singular value decomposition on GPU using CUDA,” in Proceedings of the 2009 IEEE International Symposium on Parallel and Distributed Processing, 2009, pp. 1–10.
[17] T. Konda and Y. Nakamura, “A new algorithm for singular value decomposition and its parallelization,” Parallel Computing, vol. 36, no. 6, pp. 331–344, June 2009.
[18] M. Bečka, G. Okša, M. Vajteršic, and L. Grigori, “On iterative QR pre-processing in the parallel block-Jacobi SVD algorithm,” Parallel Computing, vol. 36, no. 5-6, pp. 297–307, June 2009.
    [19] H. Ltaief, J. Kurzak, and J. Dongarra, “Parallel two-sided matrix reduction to band bidiagonal form on multicore architectures,” IEEE Transactions on Parallel and Distributed Systems, vol. 21, no. 4, pp. 417–423, April 2010.
[20] D. Peleg, Distributed Computing: A Locality-Sensitive Approach. Philadelphia, PA, USA: Society for Industrial and Applied Mathematics (SIAM), 2000.
[21] S. Ghosh, Distributed Systems: An Algorithmic Approach. Chapman & Hall/CRC, 2006.
[22] J. Dean and S. Ghemawat, “MapReduce: simplified data processing on large clusters,” Communications of the ACM, vol. 51, no. 1, pp. 107–113, January 2008.
    [23] C.-T. Chu, S. K. Kim, Y.-A. Lin, Y. Yu, G. Bradski, A. Y. Ng, and K. Olukotun, “Map-reduce for machine learning on multicore,” in Advances in Neural Information Processing Systems, 2007, pp. 281–288.
[24] W. Zhao, H. Ma, and Q. He, “Parallel k-means clustering based on MapReduce,” in Lecture Notes in Computer Science, vol. 5931, 2009, pp. 674–679.
    [25] A. Verma, X. Llora, D. E. Goldberg, and R. H. Campbelly, “Scaling genetic algorithms using mapreduce,” in Proceedings of the 9th International Conference on Intelligent Systems Design and Applications, 2009, pp. 13–18.
[26] J. Cohen, “Graph twiddling in a MapReduce world,” Computing in Science & Engineering, vol. 11, no. 4, pp. 29–41, January 2009.
[27] S. J. Matthews and T. L. Williams, “MrsRF: an efficient MapReduce algorithm for analyzing large collections of evolutionary trees,” BMC Bioinformatics, vol. 11, suppl. 1, January 2010.
[28] W. Fang, B. He, Q. Luo, and N. K. Govindaraju, “Mars: Accelerating MapReduce with graphics processors,” IEEE Transactions on Parallel and Distributed Systems, vol. 22, no. 4, pp. 608–620, April 2011.
[29] P. Harish and P. J. Narayanan, “Accelerating Large Graph Algorithms on the GPU Using CUDA,” in Proceedings of the IEEE International Conference on High Performance Computing, December 2007.
    [30] S. Ghemawat, H. Gobioff, and S.-T. Leung, “The Google File System,” in Proceedings of the 19th ACM Symposium on Operating Systems Principles, 2003.
[31] T. White, Hadoop: The Definitive Guide. Sebastopol, CA, USA: O’Reilly Media, 2009.
    [32] http://hadoop.apache.org/
    [33] http://developer.nvidia.com/
Oral Defense Committee
  • 吳志宏 - Committee Chair
  • 侯俊良 - Committee Member
  • 歐陽振森 - Committee Member
  • 蔡賢亮 - Committee Member
  • 李錫智 - Advisor

Defense Date: 2011-07-05    Submission Date: 2011-07-19


