National Sun Yat-sen University (國立中山大學), thesis/dissertation: Application of Machine Learning in Asset Pricing Models (運用機器學習於資產定價模型之研究-以台灣股市為例)

Title	Application of Machine Learning in Asset Pricing Models - New Evidence from Taiwan (運用機器學習於資產定價模型之研究-以台灣股市為例)
Department	Department of Finance (財務管理學系)
Year, semester	Spring semester of Academic Year 112	Language	English
Degree	Ph.D.	Number of pages	83
Author	Hung-Tsung Hsiao (蕭弘宗)
Advisor	Wang, Chou-Wen (王昭文)
Convenor	Wang, Ming-Chun (王銘駿)
Advisory Committee	Chiou-Wei, Song-Zan (邱魏頌正); Su*, Xuan-Qi (蘇玄啟); Chen, Yi-Ling (陳宜伶)
Date of Exam	2024-07-13	Date of Submission	2024-07-16
Keywords	Asset Pricing, Machine Learning, Ridge Regression, Neural Networks, XGBoost
Statistics	This thesis/dissertation has been browsed 128 times and downloaded 7 times.

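Abstract (translated from the Chinese)
This study examines the asset pricing of exchange-listed and OTC-listed stocks, using the Taiwan stock market as the empirical setting. It applies nine machine learning methods and compares them with a traditional linear model along several dimensions. First, using the Ward method and the average silhouette method, it identifies 14 distinct thematic clusters from 133 feature variables. These clusters describe the characteristics that influence stock behavior and offer market segmentation insights for trading strategies. Second, a linear regression model and machine learning methods (LASSO, Ridge, and Enet) highlight the important predictive role of variables such as Profitability/Quality (V4), Low Leverage (V6), and Debt Issuance (V9) for stock performance after screening on positive past returns. Conversely, Value (V10) plays an important predictive role under negative-return screening. Variables such as Low Risk/Momentum (V12) and Momentum (V14) show smaller regression coefficients, indicating a weaker predictive role. Third, a comparison of machine learning performance shows that Ridge is superior in explanatory power and error metrics, neural networks excel in recall, and XGBoost leads in precision. Based on these findings, the study designs a trading strategy that consistently outperforms the return of the 0050 ETF. Overall, the study improves the understanding of the Taiwan stock market and provides practical insights.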
Abstract
This study investigates the asset pricing of listed stocks in the Taiwan stock market using machine learning methods. Specifically, it employs nine machine learning techniques and one linear model. First, using the Ward method and the average silhouette method, we identify 14 distinct thematic clusters from 133 feature variables. These clusters describe the characteristics that influence stock behavior and provide market segmentation insights for strategic decision-making. Second, OLS linear regression and machine learning methods (LASSO, Ridge, and Enet) highlight the significant predictive roles of variables such as Profitability/Quality (V4), Low Leverage (V6), and Debt Issuance (V9) for stock performance after positive return screening. Conversely, Value (V10) shows important predictive power under negative return screening. Variables such as Low Risk/Momentum (V12) and Momentum (V14) exhibit smaller regression coefficients, indicating a weaker predictive role. Third, the comparison of machine learning models shows that Ridge regression excels in explanatory power and error metrics, neural networks perform well in recall, and XGBoost slightly outperforms the other models in precision. Based on these findings, we design a trading strategy that consistently outperforms the 0050 ETF. Overall, this study enhances the understanding of dynamic changes in the Taiwan stock market and provides practical insights and decision support for investors.
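
The clustering step in the abstract (Ward linkage, with the number of clusters chosen by the average silhouette) can be illustrated with a minimal sketch. This record does not show the thesis's actual implementation, so everything below is an assumption made for illustration: the toy two-dimensional `points`, the helper names `ward_labels` and `average_silhouette`, and the candidate range of cluster counts are all hypothetical. The code uses only the Python standard library; in practice one would more likely rely on `scipy.cluster.hierarchy.linkage(..., method="ward")` and `sklearn.metrics.silhouette_score`.

```python
import math
from itertools import combinations

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def centroid(points, members):
    """Mean vector of the points whose indices are in `members`."""
    dims = range(len(points[0]))
    return [sum(points[i][d] for i in members) / len(members) for d in dims]

def ward_labels(points, k):
    """Greedy agglomerative clustering with Ward's criterion, stopped at k clusters."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            na, nb = len(clusters[a]), len(clusters[b])
            gap = sq_dist(centroid(points, clusters[a]), centroid(points, clusters[b]))
            cost = na * nb / (na + nb) * gap  # increase in within-cluster sum of squares
            if best is None or cost < best[0]:
                best = (cost, a, b)
        _, a, b = best  # a < b, so deleting index b leaves index a valid
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    labels = [0] * len(points)
    for c, members in enumerate(clusters):
        for i in members:
            labels[i] = c
    return labels

def average_silhouette(points, labels):
    """Mean silhouette width over all points (singleton clusters score 0)."""
    scores = []
    for i, p in enumerate(points):
        dists = {}
        for j, q in enumerate(points):
            if i != j:
                dists.setdefault(labels[j], []).append(math.sqrt(sq_dist(p, q)))
        own = dists.pop(labels[i], None)
        if not own or not dists:
            scores.append(0.0)
            continue
        a = sum(own) / len(own)                            # cohesion
        b = min(sum(d) / len(d) for d in dists.values())   # separation
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)

# Hypothetical toy data: three well-separated groups of four points each.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 0), (10, 1), (11, 0), (11, 1),
          (5, 10), (5, 11), (6, 10), (6, 11)]

# Score each candidate cluster count and keep the one with the highest average silhouette.
scores = {k: average_silhouette(points, ward_labels(points, k)) for k in range(2, 7)}
best_k = max(scores, key=scores.get)
print(best_k, round(scores[best_k], 3))
```

On this toy data the highest average silhouette is reached at three clusters, matching the three groups built into `points`. The same selection rule, applied to feature vectors for the 133 variables, is the kind of procedure that would yield a cluster count such as the 14 reported in the abstract.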

Table of Contents
Thesis Approval Form (論文審定書)
Acknowledgements (誌謝)
Abstract in Chinese (摘要)
Abstract
1. Introduction
  1.1 Research Motivation
  1.2 Research Objectives
  1.3 Scope of Research
  1.4 Research Contribution
  1.5 Structure of the Study
2. Literature Review
  2.1 Asset Pricing Models
  2.2 Machine Learning in Asset Pricing
  2.3 Research on Emerging Markets
3. Data and Methodology
  3.1 Data
  3.2 Methodology
    3.2.1 Simple Linear
    3.2.2 Penalized Linear Models
    3.2.3 Tree Models
    3.2.4 Logistic Regression Models
    3.2.5 Vanilla Neural Networks: An Extension of Linear Regression
    3.2.6 Support Vector Machines
  3.3 Appraising the Efficacy of Machine Learning Models in Regression and Classification Tasks
4. Empirical Analysis and Results
  4.1 Result of Thematic Cluster
  4.2 Results of Evaluating Models via Machine Learning
  4.3 Trading Strategy
5. Conclusion
References
Table List
  Table 4-1: COMPARISON TABLE of JENSEN'S and the STUDY'S CATEGORY
  Table 4-2: LIST of the STUDY'S CATEGORY VARIABLES
  Table 4-3: More Details of the STUDY'S CATEGORY VARIABLES
  Table 4-4: NEW CLUSTERS BASED on JENSEN'S CATEGORIZATION
  Table 4-5: DESCRIPTIVE STATISTICS of TRAINING SAMPLES
  Table 4-6: DESCRIPTIVE STATISTICS of TEST SAMPLES
  Table 4-7: MODEL COEFFICIENTS
  Table 4-8: MACHINE LEARNING MODELS with REGRESSION
  Table 4-9: MACHINE LEARNING MODELS with CLASSIFICATION
  Table 4-10: PERFORMANCE ANALYSIS OF VARIOUS STOCK PICKS USING XGBoost
Figure List
  Figure 4-1: AVERAGE SILHOUETTE METHOD for WARD'S
  Figure 4-2: CLUSTERING RESULTS of DATA FEATURES
  Figure 4-4: MODEL COEFFICIENTS
  Figure 4-5: CONFUSION MATRIX OF ML MODELS
  Figure 4-6: COMPARATIVE 5-YEAR COMPOUNDED RETURNS of XGBoost-SELECTED STOCKS
Appendix A: Details of Jensen's Variables
Glossary of Terms

References
Asness, C. S., & Frazzini, A. (2014). The devil in HML's details. SSRN.
Banz, R. W. (1981). The relationship between return and market value of common stocks. Journal of Financial Economics, 9(1), 3-18.
Basu, S. (1977). Investment performance of common stocks in relation to their price-earnings ratios: A test of the efficient market hypothesis. The Journal of Finance, 32(3), 663-682.
Carhart, M. M. (1997). On persistence in mutual fund performance. The Journal of Finance, 52(1), 57-82.
Cochrane, J. H. (2011). Presidential address: Discount rates. The Journal of Finance, 66(4), 1047-1108.
Fama, E. F., & French, K. R. (1992). The cross-section of expected stock returns. The Journal of Finance, 47(2), 427-465.
Fama, E. F., & French, K. R. (1993). Common risk factors in the returns on stocks and bonds. Journal of Financial Economics, 33(1), 3-56.
Fama, E. F., & French, K. R. (1996). Multifactor explanations of asset pricing anomalies. The Journal of Finance, 51(1), 55-84.
Fama, E. F., & French, K. R. (2004). The capital asset pricing model: Theory and evidence. Journal of Economic Perspectives, 18(3), 25-46.
Fama, E. F., & French, K. R. (2015). A five-factor asset pricing model. Journal of Financial Economics, 116(1), 1-22.
Fama, E. F., & French, K. R. (2017). International tests of a five-factor asset pricing model. Journal of Financial Economics, 123(3), 441-463.
Feng, G., Giglio, S., & Xiu, D. (2020). Taming the factor zoo: A test of new factors. The Journal of Finance, 75(3), 1327-1370.
Green, J., Hand, J. R., & Zhang, X. F. (2017). The characteristics that provide independent information about average US monthly stock returns. The Review of Financial Studies, 30(12), 4389-4436.
Gu, S., Kelly, B., & Xiu, D. (2020). Empirical asset pricing via machine learning. The Review of Financial Studies, 33(5), 2223-2273.
Hsiao, H.-T., Wang, C.-W., Liu, I.-C., & Kung, K.-L. (2024). Mortality improvement neural-network models with autoregressive effects. The Geneva Papers on Risk and Insurance-Issues and Practice, 1-21.
Harvey, C. R., Liu, Y., & Zhu, H. (2016). … and the cross-section of expected returns. The Review of Financial Studies, 29(1), 5-68.
Huber, P. J. (2004). Robust statistics (Vol. 523). John Wiley & Sons.
Jegadeesh, N., & Titman, S. (1993). Returns to buying winners and selling losers: Implications for stock market efficiency. The Journal of Finance, 48(1), 65-91.
Jensen, T. I., Kelly, B., & Pedersen, L. H. (2023). Is there a replication crisis in finance? The Journal of Finance, 78(5), 2465-2518.
Kelly, B., & Xiu, D. (2023). Financial machine learning. Foundations and Trends® in Finance, 13(3-4), 205-363.
Leippold, M., Wang, Q., & Zhou, W. (2022). Machine-learning in the Chinese factor zoo. Journal of Financial Economics, 145(2), 64-82.
Lintner, J. (1965). Security prices, risk, and maximal gains from diversification. The Journal of Finance, 20(4), 587-615.
McLean, R. D., & Pontiff, J. (2016). Does academic research destroy stock return predictability? The Journal of Finance, 71(1), 5-32.
Sharpe, W. F. (1964). Capital asset prices: A theory of market equilibrium under conditions of risk. The Journal of Finance, 19(3), 425-442.
Stattman, D. (1980). Book values and stock returns. The Chicago MBA: A Journal of Selected Papers, 4(1), 25-45.
Welch, I., & Goyal, A. (2008). A comprehensive look at the empirical performance of equity premium prediction. The Review of Financial Studies, 21(4), 1455-1508.
Yao, H., Xia, S., & Liu, H. (2022). Six-factor asset pricing and portfolio investment via deep learning: Evidence from Chinese stock market. Pacific-Basin Finance Journal, 76, 101886.

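Fulltext
This electronic full text is licensed to users only for personal, non-profit retrieval, reading, and printing for academic research purposes. Please comply with the Copyright Act of the Republic of China and do not reproduce, distribute, adapt, repost, or broadcast the work without authorization.
Thesis access permission: unrestricted (fully open on and off campus)
Available: Campus: available; Off-campus: available
etd-0616124-130303.pdf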
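Printed copies
Public availability information for printed theses is relatively complete from Academic Year 102 onward. To check the availability of printed theses from Academic Year 101 or earlier, please contact the printed thesis service desk of the Office of Library and Information Services. We apologize for any inconvenience.
Available: available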

Office of Library and Information Services, National Sun Yat-sen University │ Contact Us : 2453 Thesis Format Review Team , Mail │ Development and operations : Knowledge Innovation Division, LIS