論文使用權限 Thesis access permission:校內校外完全公開 unrestricted
開放時間 Available:
校內 Campus: 已公開 available
校外 Off-campus: 已公開 available
論文名稱 Title |
利用Spotify和YouTube特徵值預測Billboard百大排行榜之熱門歌曲 Using Spotify and YouTube features to predict the Billboard Hot 100 chart’s popular songs |
||
系所名稱 Department |
|||
畢業學年期 Year, semester |
語文別 Language |
||
學位類別 Degree |
頁數 Number of pages |
53 |
|
研究生 Author |
|||
指導教授 Advisor |
|||
召集委員 Convenor |
|||
口試委員 Advisory Committee |
|||
口試日期 Date of Exam |
2024-07-19 |
繳交日期 Date of Submission |
2024-08-28 |
關鍵字 Keywords |
熱門歌曲科學、Spotify、Billboard、YouTube、機器學習、消融實驗 Hit Song Science, Spotify, Billboard, YouTube, Machine Learning, Ablation Experiment |
||
統計 Statistics |
本論文已被瀏覽 57 次,被下載 2 次 The thesis/dissertation has been viewed 57 times and downloaded 2 times. |
中文摘要 |
近年來,隨著多元化的音樂產生,音樂產業正蓬勃發展,在現今這個串流音樂為主的時代,普羅大眾大多也使用手機上的串流平台APP收聽音樂,許多串流音樂平台也陸續興起,廣被大眾使用來收聽音樂和廣播電台的Spotify,其中更是有數據分析人員針對每一首歌曲進行其音頻特徵的分析。而這些在串流平台的音樂也大多都會將其上傳至目前最大的影音平台YouTube,以此得到更多的流量。 而音樂人們所創作的音樂也會依照流量在音樂排行榜上進行排名,其中最有名的排行榜就是美國的billboard,透過實體販售和串流平台流量,能夠準確地反映一首歌曲的熱門程度。即透過排名可以表現出歌曲是否為所謂的熱門歌曲,而近幾年,人們也開始利用機器學習的模型來針對歌曲特徵還有較具指標性的排行榜進行預測,了解甚麼樣的特徵會讓一首歌曲達到指標,成為一首熱門歌曲,這類型的相關研究又被命名為「Hit Song Science」。 本研究將會透過在開放平台Kaggle以及串流平台相關的API取得以Spotify Hot 100為主的相關歌曲資料集還有這些歌曲在Spotify跟YouTube上面的特徵值,來針對其是否進入Billboard Hot 100排行進行熱門歌曲的預測研究,透過機器學習軟件和消融實驗來了解特徵值是否對歌曲入榜造成影響,最後得出Importance Rank來了解特徵值在不同模型下的重要程度。讓音樂產業的工作者能更了解甚麼樣的音樂可以符合市場需求。 |
Abstract |
In recent years, the music industry has thrived as increasingly diverse music has emerged. In today’s streaming-dominated era, most listeners use streaming apps on their smartphones, and many streaming platforms have risen to meet that demand. Spotify, widely used for listening to music and radio stations, has data analysts who analyze the audio features of every song. Most music on these platforms is also uploaded to YouTube, currently the largest video platform, to gain more traffic. Music is also ranked on charts according to that traffic, the best known being the U.S. Billboard chart, which can accurately reflect a song’s popularity through physical sales and streaming data. A song’s ranking therefore indicates whether it is a so-called "hit song." In recent years, researchers have begun applying machine learning models to song features and benchmark charts to learn which characteristics help a song reach those charts and become a hit; this line of research is known as "Hit Song Science." This study uses song datasets based mainly on the Spotify Hot 100, obtained from the open platform Kaggle and from streaming-platform APIs, together with the features of these songs on Spotify and YouTube, to predict whether a song enters the Billboard Hot 100. Using machine learning software and ablation experiments, we examine whether individual features affect chart entry, and we derive an importance rank showing how much each feature matters under different models. The results can help music industry professionals better understand what kind of music meets market demand. |
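The ablation idea described in the abstract can be illustrated with a small, self-contained sketch. This is not the thesis's code or data: the feature names, the synthetic data generator, and the from-scratch logistic regression are all assumptions chosen so the example runs with the Python standard library alone. It trains a baseline model on all features, retrains with each feature removed, and ranks features by how much held-out accuracy falls.

```python
import math
import random

# Hypothetical feature names; the thesis's actual feature set is not listed here.
FEATURES = ["youtube_views", "danceability", "energy"]

def make_data(n, rng):
    """Synthetic songs: the label depends strongly on feature 0, weakly on 1, not on 2."""
    rows, labels = [], []
    for _ in range(n):
        x = [rng.uniform(-1.0, 1.0) for _ in FEATURES]
        score = 2.0 * x[0] + 0.5 * x[1] + rng.gauss(0.0, 0.3)
        rows.append(x)
        labels.append(1 if score > 0 else 0)
    return rows, labels

def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to keep math.exp well-behaved
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(rows, labels, epochs=300, lr=0.5):
    """Plain batch gradient descent on the logistic loss."""
    d, n = len(rows[0]), len(rows)
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, y in zip(rows, labels):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for j in range(d):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def accuracy(model, rows, labels):
    w, b = model
    hits = sum(
        (sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) >= 0.5) == bool(y)
        for x, y in zip(rows, labels)
    )
    return hits / len(rows)

def keep_columns(rows, cols):
    return [[x[j] for j in cols] for x in rows]

def ablation_rank(train, test):
    """Return baseline accuracy and (feature, accuracy drop) pairs, largest drop first."""
    (xtr, ytr), (xte, yte) = train, test
    all_cols = list(range(len(FEATURES)))
    baseline = accuracy(fit_logistic(xtr, ytr), xte, yte)
    drops = []
    for j, name in enumerate(FEATURES):
        cols = [c for c in all_cols if c != j]  # remove one feature
        model = fit_logistic(keep_columns(xtr, cols), ytr)
        drops.append((name, baseline - accuracy(model, keep_columns(xte, cols), yte)))
    drops.sort(key=lambda pair: pair[1], reverse=True)
    return baseline, drops

rng = random.Random(0)  # fixed seed so the run is repeatable
train = make_data(300, rng)
test = make_data(200, rng)
baseline, ranking = ablation_rank(train, test)
print(f"baseline accuracy: {baseline:.3f}")
for name, drop in ranking:
    print(f"{name}: accuracy drop {drop:+.3f}")
```

A larger drop means the model relied more on the removed feature. Repeating the loop with a different classifier in place of `fit_logistic` would give one ranking per model, which is the kind of per-model comparison an importance rank summarizes.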
目次 Table of Contents |
Table of Contents
Thesis Approval Form i
Chinese Abstract ii
Abstract iii
Figure Directory vi
Table Directory vii
Chapter 1 Introduction 1
1.1 Research Background 1
1.2 Research Motivation and Objectives 2
Chapter 2 Literature Review 6
2.1 Hit Song Science 6
2.2 The Rise of Social Media Platforms and Streaming Platforms 7
2.3 Current Trends in the Music Industry 10
2.4 Introduction to NLTK and Sentiment Analysis 13
2.5 Selenium Automated Script Web Scraping 14
Chapter 3 Research Methodology and Framework 16
3.1 Data Collection 16
3.2 Data Merging 24
3.3 Data Preprocessing 24
3.3.1 Handling Missing Values for Binary Variables 24
3.3.2 Handling Missing Values for Other Variables 25
3.3.3 Variable Normalization 27
3.3.4 Imbalanced learn 28
3.4 Model Building 29
Chapter 4 Research Results 32
Chapter 5 Conclusion and Research Recommendations 41
References 43
Chinese References 45 |
參考文獻 References |
Ahmad, I. S., Bakar, A. A., & Yaakub, M. R. (2020). Movie Revenue Prediction Based on Purchase Intention Mining Using YouTube Trailer Reviews. Information Processing & Management, 57(5). https://doi.org/10.1016/j.ipm.2020.102278
Breiman, L. (2001). Random forests. Machine Learning, 45, 5-32.
Chen, T., & Guestrin, C. (2016). XGBoost: A scalable tree boosting system. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.
Dhanaraj, R., & Logan, B. (2005). Automatic Prediction of Hit Songs. ISMIR.
Dietterich, T. G. (1998). An experimental comparison of three methods for constructing ensembles of decision trees: Bagging, boosting and randomization. Machine Learning, 32, 1-22.
Dimolitsas, I., Kantarelis, S., & Fouka, A. (2023). SpotHitPy: A Study For ML-Based Song Hit Prediction Using Spotify. arXiv preprint arXiv:2301.07978.
Elbagir, S., & Yang, J. (2019). Twitter sentiment analysis using Natural Language Toolkit and VADER sentiment. Proceedings of the International MultiConference of Engineers and Computer Scientists.
Fang, J. (2013). Why logistic regression analyses are more reliable than multiple regression analyses. Journal of Business and Economics, 4(7), 620-633.
Herremans, D., Martens, D., & Sörensen, K. (2014). Dance Hit Song Prediction. Journal of New Music Research, 43(3), 291-302. https://doi.org/10.1080/09298215.2014.881888
Mauch, M., MacCallum, R. M., Levy, M., & Leroi, A. M. (2015). The evolution of popular music: USA 1960-2010. Royal Society Open Science, 2(5), 150081. https://doi.org/10.1098/rsos.150081
Middlebrook, K., & Sheik, K. (2019). Song hit prediction: Predicting Billboard hits using Spotify data. arXiv preprint arXiv:1908.08609.
Morgan, S. P., & Teachman, J. D. (1988). Logistic regression: Description, examples, and comparisons. Journal of Marriage and Family, 50(4), 929-936.
Natekin, A., & Knoll, A. (2013). Gradient boosting machines, a tutorial. Frontiers in Neurorobotics, 7, 21.
Nguyen, D. P., & Maag, S. (2020). Codeless web testing using Selenium and machine learning. ICSOFT 2020: 15th International Conference on Software Technologies.
Ni, Y., Santos-Rodriguez, R., Mcvicar, M., & De Bie, T. (2011). Hit song science once again a science. 4th International Workshop on Machine Learning and Music.
Ochi, V., Estrada, R., Gaji, T., Gadea, W., & Duong, E. (2021). Spotify danceability and popularity analysis using SAP. arXiv preprint arXiv:2108.02370.
Pooransingh, A., & Dhoray, D. (2021). Similarity Analysis of Modern Genre Music Based on Billboard Hits. IEEE Access, 9, 144916-144926. https://doi.org/10.1109/access.2021.3122386
Raza, A. H., & Nanath, K. (2020). Predicting a Hit Song with Machine Learning: Is there an apriori secret formula? 2020 International Conference on Data Science, Artificial Intelligence, and Business Analytics (DATABIA).
Saragih, H. S. (2023). Predicting song popularity based on Spotify's audio features: insights from the Indonesian streaming users. Journal of Management Analytics, 10(4), 693-709.
Yao, J. (2019). Automated sentiment analysis of text data with NLTK. Journal of Physics: Conference Series.
Yee, Y. K., & Raheem, M. (2022). Predicting Music Popularity Using Spotify and YouTube Features. Indian Journal of Science and Technology, 15(36), 1786-1799. https://doi.org/10.17485/IJST/v15i36.2332
Zabor, E. C., Reddy, C. A., Tendulkar, R. D., & Patil, S. (2022). Logistic regression in clinical studies. International Journal of Radiation Oncology*Biology*Physics, 112(2), 271-277.
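Wikipedia (Chinese). 「告示牌百大排行榜」 [Billboard Hot 100]. https://zh.wikipedia.org/wiki/%E5%91%8A%E7%A4%BA%E7%89%8C%E7%99%BE%E5%A4%A7%E5%96%AE%E6%9B%B2%E6%A6%9C
SoundLife. 「Spotify年度總結」 [Spotify year in review]. https://zh.soundoflife.com/blogs/experiences/spotify-wrapped-2020-music-genre
社會科學 (Social Sciences). 「從Billboard排行榜來分析流行音樂的趨勢演變」 [Analyzing the evolution of pop music trends through the Billboard charts]. https://case.ntu.edu.tw/blog/?p=21521
文策院 (TAICCA) interview. 「數位浪潮加速進擊,音樂產業的未來將走向何方」 [As the digital wave accelerates, where is the music industry headed?]. https://taicca.tw/article/7332ed67
工商時報 (Commercial Times). 「全球最大的音樂串流平台,解密Spotify創業的幕後故事」 [The world's largest music streaming platform: the story behind Spotify's founding]. https://www.ctee.com.tw/news/20210621700309-431001
Line科技 (LINE Tech). 「YouTube帶領素人走向國際,隱藏的大明星不在有志難伸」 [YouTube takes amateurs international; hidden stars no longer go unrecognized]. https://today.line.me/tw/v2/article/MZRxaz
台北流行音樂中心 (Taipei Music Center). 「音樂興發現”Billboard Hot 100”美國百大單曲榜超過65年歷史」 [The "Billboard Hot 100" U.S. singles chart: more than 65 years of history]. https://tmc.taipei/media/article/billboard-hot-100-surpasses-65-years-of-us-singles-chart/
商業週刊 (Business Weekly). 「年輕人都在反算法,沒想到他先站出來了」 [Young people are pushing back against algorithms; unexpectedly, he stepped forward first]. https://www.businessweekly.com.tw/business/blog/3006800 |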
電子全文 Fulltext |
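This electronic full text is licensed to users solely for personal, non-profit searching, reading, and printing for academic research purposes. Please comply with the Copyright Act of the Republic of China; do not reproduce, distribute, adapt, repost, or broadcast it without authorization, to avoid violating the law. |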
紙本論文 Printed copies |
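Availability information for printed theses is relatively complete from academic year 102 (ROC calendar) onward. To look up availability information for printed theses from academic year 101 or earlier, please contact the printed thesis service desk of the Office of Library and Information Services. We apologize for any inconvenience. 開放時間 Available: 已公開 available |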