Title page for etd-0614122-091537

Title: 金融科技的投資研究與其延伸探討 (Application of Financial Technology to Investment and Other Extensions)
Department:
Year, semester:
Language:
Degree:
Number of pages: 181
Author:
Advisor:
Convenor:
Advisory Committee:
Date of Exam: 2022-07-11
Date of Submission: 2022-07-14
Keywords: Financial technology, Algorithmic trading, DCF valuation, Cross-lingual system, Loss divergence, Label correction
Statistics: This thesis/dissertation has been viewed 430 times and downloaded 0 times.
Abstract (Chinese)
With the rise of artificial intelligence, the application of machine learning to financial technology has become an emerging topic. Although financial technology covers a broad range of areas, its core issues still revolve around the trading and valuation of financial instruments. This dissertation comprises five studies built around financial technology. The first study values stocks with the discounted cash flow (DCF) method; the difficulty lies in estimating reasonable future cash flows and discount factors for each period, so imitation learning and guided policy search are used to fit the cash flows and discount rates to historical data modeled by a Levy distribution. The second study concerns high-frequency foreign exchange trading data; because higher-frequency data are harder to predict, it examines how to select the trading data (trading frequency and currency pair) so that the test data yield positive returns. The remaining three studies extend these two topics. The first extension investigates whether, when financial technology is offered to users of different languages, its functionality and its input language can be developed separately, with the two connected through a language-independent vector set that we call the Rosetta Stone. The second extension stems from the observation that the deep learning models used in the earlier foreign exchange trading and stock valuation studies sometimes diverge during training: neural networks trained with second-moment optimizers can suffer from loss divergence, which can ruin a long training run. We analyze and compare the strengths and weaknesses of various second-moment optimizers and then propose methods to detect loss divergence early. The last extension addresses the labeling problem in machine learning: if financial data (for example, labels marking stock valuations as too high or too low) are labeled by users, labeling errors may occur, so we investigate how to identify, in the most efficient way, the data that need to be relabeled. This dissertation examines these five issues in turn, proposes solutions, and evaluates their effectiveness through a series of experiments.
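For orientation, the quantity the first study estimates is the textbook two-stage DCF identity below; the notation (V_0, FCF_t, r, TV_T, g) is standard valuation notation, not necessarily the dissertation's own.

```latex
% Textbook two-stage DCF identity (standard notation, not the thesis's own):
% V_0: present value, FCF_t: free cash flow in period t, r: discount rate,
% TV_T: terminal value at horizon T, g: perpetual growth rate (g < r).
V_0 = \sum_{t=1}^{T} \frac{FCF_t}{(1+r)^t} + \frac{TV_T}{(1+r)^T},
\qquad TV_T = \frac{FCF_{T+1}}{r - g}
```

Estimating the FCF_t sequence and the rate r is exactly where the imitation learning and guided policy search described above enter.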
Abstract (English)
Following the rise of artificial intelligence, the application of machine learning to financial technology has become a promising topic. Financial technology spans a broad spectrum, and its core subjects include trading and valuation. This dissertation covers five research topics that address essential issues in trading and valuation. The first topic concerns discounted cash flow (DCF) valuation, which requires estimating free cash flow (FCF) and the discount rate. Here, the FCF series over time is assumed to follow a Levy distribution, and imitation learning and guided policy search are employed to fit the parameters from which the FCF and discount rate are derived. The second topic covers machine learning for intraday foreign exchange trading, which often suffers from overfitting. Data selection provides a way to avoid it; this study therefore develops new data selection methods to tackle overfitting. Three extended topics follow from the first two. The third topic explores how a financial system's input language can be separated from its functionality through a vector set that acts as a Rosetta Stone: to translate a sentence written in a new language, only the alignment of that sentence with the vector set is required. The fourth topic concerns the loss divergence that can occur while training a financial model with the RMSProp optimizer. After identifying the causes of the problem, this study introduces methods for detecting and countering loss divergence in the early stage of training. The last topic addresses labeling errors that humans introduce into financial data. For example, DCF valuation with supervised learning requires labeled data, but data incorrectly labeled by humans degrades training performance; a method is therefore introduced to select the data to be relabeled and so mitigate human errors. Sets of experiments were conducted to evaluate the proposed methods for the five issues, and the results show better performance than existing methods.
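To make the fourth topic concrete, the following is a minimal, self-contained sketch of early divergence detection from a stream of training losses. The window size, ratio threshold, and patience are assumed heuristics for illustration only, not the detection criteria derived in Chapter 5.

```python
import math
from collections import deque

class DivergenceGuard:
    """Flags suspected loss divergence from a stream of training losses."""

    def __init__(self, window=50, ratio=3.0, patience=5):
        self.history = deque(maxlen=window)  # recent healthy losses
        self.ratio = ratio                   # spike threshold vs. baseline mean
        self.patience = patience             # consecutive spikes before alarm
        self.strikes = 0

    def update(self, loss):
        """Record one loss value; return True when divergence is suspected."""
        if not math.isfinite(loss):          # NaN/inf: training already blew up
            return True
        if len(self.history) == self.history.maxlen:
            baseline = sum(self.history) / len(self.history)
            if loss > self.ratio * baseline:
                self.strikes += 1            # spikes are kept out of the baseline
                return self.strikes >= self.patience
            self.strikes = 0
        self.history.append(loss)
        return False
```

In a training loop, a True return would trigger a countermeasure such as restoring the last checkpoint or lowering the learning rate before resuming; Section 5.2.3 of the dissertation covers countermeasures, and the specifics here are illustrative.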
Table of Contents
Dissertation Validation Letter
Abstract (Chinese)
Abstract (English)
Chapter 1 Introduction
  1.1 DCF valuation
  1.2 FX Intraday Trading
  1.3 Extended research
Chapter 2 DCF valuation
  2.1 Related work
  2.2 Method
    2.2.1 Dividing positive FCF into fraction and exponent
    2.2.2 Modeling the exponents
    2.2.3 Modeling the fraction
    2.2.4 Ranges of tick, c, and pivot
    2.2.5 Demonstrations of pivot, c, tick, and rate
    2.2.6 Loss function
    2.2.7 Trading strategy and performance metrics
    2.2.8 Baseline models
  2.3 Results
    2.3.1 Loss and accuracy for the exponent
    2.3.2 Distributions of the agent and teacher outputs
    2.3.3 Valuation performance
  2.4 Summary
Chapter 3 FX Intraday Trading
  3.1 Related work
  3.2 Method
    3.2.1 Objectives
    3.2.2 Selection of currency pair and trading frequency
    3.2.3 Training and testing of the machine learning model
    3.2.4 Moving window
  3.3 Results
    3.3.1 Performance of metrics in selecting trading currency pairs and frequencies
    3.3.2 Comparison of trading strategies
    3.3.3 Effect of fine-tuning hyperparameters
    3.3.4 Evaluation of machine learning models
    3.3.5 Further analysis
  3.4 Summary
Chapter 4 Deciphering the Rosetta Stone
  4.1 Related work
  4.2 Method
    4.2.1 Imitation learning to decouple encoder and decoder
    4.2.2 Transfer learning to decouple encoder and decoder
  4.3 Results
    4.3.1 Visualization of vector set
    4.3.2 Translation quality
    4.3.3 Accuracy over nth character
    4.3.4 Results on large dataset
    4.3.5 Results of the Multi30K dataset and transformer
    4.3.6 Application of imitation learning on letter to our shareholders of Volkswagen AG 2021
  4.4 Summary
Chapter 5 Loss divergence
  5.1 Related work
  5.2 Method
    5.2.1 Stability analysis
    5.2.2 Detecting divergence
    5.2.3 Countermeasures
  5.3 Results
    5.3.1 Loss divergence of RMSProp and countermeasures
    5.3.2 Applying countermeasures to loss divergence of Adam and Adam variants
    5.3.3 Ensemble method to avoid loss divergence
  5.4 Summary
Chapter 6 Active label correction
  6.1 Related work
  6.2 Method
    6.2.1 Sample inspection
    6.2.2 Label correction with relabeling
    6.2.3 Definition of performance
    6.2.4 Dataset
  6.3 Results
    6.3.1 Uncertainty sampling with and without sample inspection
    6.3.2 Comparison of human error rate
    6.3.3 Comparison of strong and weak model
  6.4 Summary
Chapter 7 Conclusion
References
Appendix A: Account of financial statements used in DCF valuation
Appendix B: Comparison of log loss and linear loss
Fulltext
This electronic fulltext is licensed to users solely for personal, non-commercial retrieval, reading, and printing for the purpose of academic research. Please comply with the Copyright Act of the Republic of China; do not reproduce, distribute, adapt, repost, or broadcast it without authorization.
Thesis access permission: user-defined embargo period
Available:
On campus: available for download from 2025-07-14
Off campus: available for download from 2025-07-14

Printed copies
Public access information for printed theses is relatively complete from ROC academic year 102 (2013-14) onward. To inquire about printed theses from academic year 101 or earlier, please contact the printed thesis service desk of the Office of Library and Information Services. We apologize for any inconvenience.
Available: 2025-07-14
