機器學習模型在台股期貨價格漲跌預測分析 Empirical Study on TAIEX Futures Price Prediction with Machine Learning Models |
66 |
2022-07-05 |
2022-07-25 |
TAIEX futures, Technical indicators, Multinomial logistic regression model, Extreme gradient boosting model, Light gradient boosting model |
This study uses 5-minute TAIEX futures data from January 2019 to June 2021. Technical indicators, namely the simple moving average (SMA), the stochastic oscillator (KD), the relative strength index (RSI), and the moving average convergence divergence (MACD), serve as input features to a multinomial logistic regression model, an extreme gradient boosting model, and a light gradient boosting model, which predict whether the TAIEX futures price will rise, fall, or remain flat over each 20-minute interval. We also group the training data to compare predictions across models, across training periods, and across price-movement thresholds. Overall, as the threshold for a predicted move rises from more than 0 points to 5 points and then to 10 points, precision declines, and precision for predicted rises is higher than for predicted falls in every case. Across thresholds, the precision of the three models varies without any large gap, so no model is clearly superior. To test whether the training period affects the results, we apply the walk-forward method and the anchored walk-forward method with training periods of 12, 15, 18, 21, 24, and 27 months. Precision for both rises and falls shows no significant difference across these lengths, indicating that a longer training period does not materially improve prediction. Finally, the best training period from the prediction results is used to train the investment models: three single models (multinomial logistic regression, extreme gradient boosting, and light gradient boosting) plus two multi-model combinations, one combining any two single models and one combining all three. These are backtested out of sample on 5-minute data from July 2021 to September 2021. With thresholds of 0, 5, and 10 points, each of the five investment models takes long and short positions, giving thirty sets of investment results. Only one of them earns a positive return: shorting when the multinomial logistic regression model alone predicts that the TAIEX futures price will fall by more than 10 points, with a win rate of 55.56% and an average profit of NT$1,361 per trade. |
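The labelling and windowing described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation. It assumes that a 20-minute move corresponds to 4 bars of 5-minute data, that "flat" means a change within the threshold, and that windows advance one month at a time. The function names `label_moves` and `walk_forward_splits` and all parameters are hypothetical.

```python
from typing import List, Tuple


def label_moves(closes: List[float], horizon: int = 4,
                threshold: float = 0.0) -> List[int]:
    """Label each bar by the price change `horizon` bars ahead.

    1 = rise of more than `threshold` points, -1 = fall of more than
    `threshold` points, 0 = flat. horizon=4 assumes 5-minute bars and a
    20-minute prediction interval (an assumption, not from the thesis).
    """
    labels = []
    for i in range(len(closes) - horizon):
        change = closes[i + horizon] - closes[i]
        if change > threshold:
            labels.append(1)
        elif change < -threshold:
            labels.append(-1)
        else:
            labels.append(0)
    return labels


def walk_forward_splits(n_periods: int, train_len: int, test_len: int = 1,
                        anchored: bool = False) -> List[Tuple[int, int, int]]:
    """Return (train_start, train_end, test_end) triples, end-exclusive.

    Rolling walk-forward keeps the training window at `train_len` periods
    and slides it forward; the anchored variant fixes the start at period 0
    so the training window grows with each step.
    """
    splits = []
    train_end = train_len
    while train_end + test_len <= n_periods:
        train_start = 0 if anchored else train_end - train_len
        splits.append((train_start, train_end, train_end + test_len))
        train_end += test_len
    return splits


if __name__ == "__main__":
    # Hypothetical prices, purely for illustration.
    prices = [100, 101, 102, 103, 104, 95, 104]
    print(label_moves(prices, threshold=0))  # threshold of more than 0 points
    print(label_moves(prices, threshold=5))  # threshold of more than 5 points
    # 30 months of data with a 27-month training window.
    print(walk_forward_splits(30, 27))
    print(walk_forward_splits(30, 27, anchored=True))
```

Raising the threshold moves more bars into the flat class, which is one reason results can differ between the 0-, 5-, and 10-point settings. The two windowing variants differ only in whether the start of the training window moves.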
Thesis Approval Certificate
Acknowledgements
Abstract (Chinese)
Abstract (English)
Chapter 1 Introduction
  1.1 Research Background and Motivation
  1.2 Research Objectives
  1.3 Research Structure
Chapter 2 Literature Review
  2.1 Research on Technical Analysis of the Stock Market
  2.2 Predicting Stock Market Movements with Machine Learning
Chapter 3 Methodology
  3.1 Data Sources
  3.2 Models
    3.2.1 Multinomial Logistic Regression
    3.2.2 Extreme Gradient Boosting (XGBoost)
    3.2.3 Light Gradient Boosting (LightGBM)
  3.3 Feature Data
    3.3.1 Simple Moving Average (SMA)
    3.3.2 Moving Average Convergence and Divergence (MACD)
    3.3.3 Stochastic Oscillator (KD)
    3.3.4 Relative Strength Index (RSI)
    3.3.5 Open, High, Low, and Close Prices
  3.4 Research Design
Chapter 4 Empirical Results
  4.1 Descriptive Statistics of the Sample
  4.2 Analysis of Model Prediction Results
  4.3 Empirical Backtest
    4.3.1 Backtest Data
    4.3.2 Trading Setup
    4.3.3 Backtest Results
Chapter 5 Conclusions and Suggestions
  5.1 Conclusions
  5.2 Limitations and Suggestions
References |
