國立中山大學,National Sun Yat-sen University,學位論文,thesis/dissertation,食品闢謠查核輔助系統,Detection Support of Food Rumor Veracity

論文名稱 Title	食品闢謠查核輔助系統 Detection Support of Food Rumor Veracity
系所名稱 Department	資訊管理學系 Department of Information Management
畢業學年期 Year, semester	109 學年度第 2 學期 The spring semester of Academic Year 109	語文別 Language	英文 English
學位類別 Degree	碩士 Master	頁數 Number of pages	59
研究生 Author	陳崇華 Chong-Hua Chen
指導教授 Advisor	張德民 Chang, Te-Min
召集委員 Convenor	林耕霈 Lin, Keng-Pei
口試委員 Advisory Committee	蕭文峰 Hsiao, Wen-Feng
口試日期 Date of Exam	2021-06-21	繳交日期 Date of Submission	2021-07-01
關鍵字 Keywords	食品謠言、分類、分群、K-medoid、PAM、詞嵌入、同義詞 food rumor, classification, cluster, K-medoid, PAM, word embedding, synonym
統計 Statistics	本論文已被瀏覽 603 次，被下載 0 次 The thesis/dissertation has been browsed 603 times, has been downloaded 0 times.

中文摘要
假新聞、謠言一直是全球最重大的問題之一，在台灣，社群媒體也深受假新聞之害，自從 2014 年台灣食安風暴後，台灣人對於食品健康與安全的重視度也隨之上升。食品假新聞與謠言的數量也隨著人們對於食品安全的恐懼感上升，這些食品謠言不但會影響大眾對於飲食的觀念，嚴重的情況下甚至會導致聽信偏方的患者延誤就醫，造成不可挽回的傷害。然而，假新聞危害如此嚴重，日常中所使用的 LINE 社群軟體卻還是有各種食品假新聞在傳播；遺憾的是，在台灣食品藥物管理署 (FDA) 的闢謠資訊更新的速度還遠遠不及假新聞增長的速度。為了解決此問題，本文提出一個系統輔助架構，利用分類、分群、詞嵌入等機器學習演算法，讓組織端可以藉由使用者查詢系統的資訊，不但能增進澄清謠言的速度，還能淘汰非謠言的查詢，降低組織端人力成本及增加闢謠效率。對於使用者端，倚靠相似度查詢以及適當的前處理，可以解決同義字食品被查詢的問題，此外，使用 K-medoid 分群演算法，降低每次查詢的複雜度，提升使用者查詢的速度。
Abstract
Fake news and rumors have long been among the world's most serious problems, and social media in Taiwan has been hit hard by them. Since the 2014 food safety crisis, people in Taiwan have paid greater attention to food health and safety, and the volume of food-related fake news and rumors has risen along with public fear about food safety. These rumors not only distort the public's understanding of diet; in severe cases they lead patients who trust folk remedies to delay medical treatment, causing irreparable harm. Despite this harm, food-related fake news still circulates on LINE, a messaging app in everyday use, and the anti-rumor information published by the Taiwan Food and Drug Administration (FDA) is updated far more slowly than fake news grows. To address this problem, this thesis proposes a support system architecture that uses machine learning techniques such as classification, clustering, and word embedding. On the organization side, information from the queries users submit to the system helps clarify rumors faster and filters out non-rumor queries, which reduces labor costs and makes rumor debunking more efficient. On the user side, similarity-based search combined with appropriate preprocessing handles queries that refer to the same food by synonymous names. In addition, the K-medoids clustering algorithm reduces the complexity of each query and speeds up user queries.
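The query-speed idea in the abstract can be illustrated with a small sketch. It is not the thesis's implementation, which is not available in this record: the 2-D vectors below are invented toy values standing in for trained word embeddings, and the function names (`pam`, `assign`, `lookup`) are hypothetical. The sketch clusters food terms with a PAM-style swap search under cosine distance, then answers a query by comparing it with the medoids first and only afterwards with the members of the nearest cluster.

```python
import math
from itertools import product

# Toy 2-D "embeddings" invented for illustration only; real vectors would
# come from a trained word-embedding model.
TOY_VECTORS = {
    "tomato": (0.9, 0.1),
    "cherry_tomato": (0.85, 0.2),
    "potato": (0.8, 0.3),
    "milk": (0.1, 0.9),
    "yogurt": (0.2, 0.85),
    "cheese": (0.3, 0.8),
}


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def total_cost(vectors, medoids):
    """Sum of each point's distance to its nearest medoid."""
    return sum(
        min(cosine_distance(v, vectors[m]) for m in medoids)
        for v in vectors.values()
    )


def pam(vectors, k):
    """PAM-style search: swap a medoid for a non-medoid while cost drops."""
    words = sorted(vectors)
    medoids = words[:k]  # simple deterministic start, an assumption
    best = total_cost(vectors, medoids)
    improved = True
    while improved:
        improved = False
        for i, candidate in product(range(k), words):
            if candidate in medoids:
                continue
            trial = medoids[:i] + [candidate] + medoids[i + 1:]
            cost = total_cost(vectors, trial)
            if cost < best - 1e-12:
                medoids, best, improved = trial, cost, True
    return medoids


def assign(vectors, medoids):
    """Group every word under its nearest medoid."""
    clusters = {m: [] for m in medoids}
    for word, vec in vectors.items():
        nearest = min(medoids, key=lambda m: cosine_distance(vec, vectors[m]))
        clusters[nearest].append(word)
    return clusters


def lookup(query_vec, vectors, clusters):
    """Pick the nearest medoid, then the nearest word inside its cluster."""
    medoid = min(clusters, key=lambda m: cosine_distance(query_vec, vectors[m]))
    word = min(
        clusters[medoid], key=lambda w: cosine_distance(query_vec, vectors[w])
    )
    return medoid, word


medoids = pam(TOY_VECTORS, 2)
clusters = assign(TOY_VECTORS, medoids)
print(clusters)
print(lookup((0.88, 0.12), TOY_VECTORS, clusters))
```

With two clusters of three terms each, the lookup compares the query against two medoids and three cluster members instead of all six terms; on a larger vocabulary the saving grows with the number of clusters. Because medoids are actual vocabulary items, each cluster is also represented by a real food term rather than an averaged vector.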

目次 Table of Contents
論文審定書 Thesis Approval Form
Acknowledgements
摘要 Abstract (in Chinese)
Abstract
Chapter 1 Introduction
  1.1 Overview
  1.2 Objective Of The Research
  1.3 Organization Of The Research
Chapter 2 Literature Review
  2.1 Fake news detection
  2.2 Rumor detection
  2.3 Food fake news
  2.4 Text-based Vector Space Model
  2.5 Classifier
    2.5.1 Random Forest
    2.5.2 SVM
    2.5.3 XGBoosting
  2.6 Partitional clustering
    2.6.1 K-means
    2.6.2 K-medoids
Chapter 3 Proposed Approach
  3.1 Relevant word embeddings from Wiki
    3.1.1 Corpus Sources
    3.1.2 Word Embedding
  3.2 Classification-based query matching
    3.2.1 Cluster Analysis on Food
    3.2.2 Query Classification
  3.3 Matching results and post-processing
    3.3.1 Query Detection Support
Chapter 4 Applications
  4.1 Vector Space Model
    4.1.1 Corpus
    4.1.2 Word2Vec Representations
  4.2 Food Clusters and Query Classification
    4.2.1 Food Clusters
    4.2.2 Query Classifiers
  4.3 New Query Detection
    4.3.1 Query Detection Results
Chapter 5 Validation
  5.1 Validation Process
  5.2 Validation Results
Chapter 6 Conclusions
  6.1 Concluding Remarks
  6.2 Managerial Implications
  6.3 Research Limitations and Future Works
References


電子全文 Fulltext
The electronic full text is licensed to users solely for personal, non-profit searching, reading, and printing for academic research purposes. Please comply with the Copyright Act of the Republic of China; do not reproduce, distribute, adapt, repost, or broadcast the work without authorization.
論文使用權限 Thesis access permission：自定論文開放時間 user-defined
開放時間 Available：校內 Campus：2026-07-01；校外 Off-campus：2026-07-01
紙本論文 Printed copies
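Public information on printed theses is relatively complete from Academic Year 102 onward. For information on printed theses from Academic Year 101 or earlier, please contact the printed thesis service desk of the Office of Library and Information Services. We apologize for any inconvenience. 開放時間 Available：2026-07-01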

Office of Library and Information Services, National Sun Yat-sen University │ Contact Us : 2453 Thesis Format Review Team , Mail │ Development and operations : Knowledge Innovation Division, LIS