Thesis/Dissertation etd-0723120-225123: Detailed Record



Author: 張薰月 (Hsun-Yueh Chang) E-mail: not disclosed
Department: 資訊管理學系研究所 (Department of Information Management)
Degree: Master Graduation term: academic year 108 (2019-2020), 2nd semester
Title (Chinese): 基於輔助性多工與可解釋多標籤學習的食材辨識系統
Title (English): Food Ingredients Recognition via Interpretable Multi-label Learning with Auxiliary tasks
Files
  • etd-0723120-225123.pdf
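  • This electronic full text is licensed to users solely for personal, non-profit searching, reading, and printing for academic research purposes.
    Please comply with the Copyright Act of the Republic of China and do not reproduce, distribute, adapt, repost, or broadcast this work without authorization, so as to avoid breaking the law.
    Access permissions

    Print thesis: available immediately

    Electronic thesis: fully open on and off campus

    Language/pages: English/41
    Statistics: this thesis has been viewed 5673 times and downloaded 54 times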
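    Abstract (translated from Chinese): With the rise of big data, deep learning has been widely used to solve a variety of classification problems. In food-related fields, ingredient recognition is a popular and challenging application. One challenge is that ingredients are hard to recognize after cooking; another is multi-label learning. In this study, we apply multi-label learning to recipe data from the BBC Food website and try to identify the corresponding ingredients in food images. We propose a multi-task learning method to address the multi-label problem. The text of the cooking steps is first converted into a vector, which serves as one output of the multi-task model, while the other output is the ingredients. Through multi-task learning, the two tasks share learned information with each other and can learn information that single-task learning cannot, thereby improving the accuracy of ingredient prediction and providing an understandable explanation of the model.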
    Abstract (English): With the rise of big data in recent years, deep learning has been used extensively to solve various classification problems. In food-related fields, ingredient recognition is one of the popular and challenging applications. One challenge is that ingredients are difficult to recognize after cooking; another is multi-label learning.
    In this thesis, we try to find the ingredient set corresponding to each food image, using recipe data from the BBC Food website, by proposing a deep multi-task learning algorithm to solve this multi-label problem. The method first converts the cooking instruction text into a vector and uses it as one of the outputs of multi-task learning; the other output is the ingredient set.
    With multi-task learning, the two tasks share learned information with each other and learn patterns that single-task learning may not, thereby improving the accuracy of ingredient prediction and providing an understandable explanation for the model.
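    The two-output setup described in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the averaging of word vectors, the mean-squared-error auxiliary loss, the `aux_weight` parameter, and all function names below are assumptions chosen only to show how a multi-label ingredient loss and an instruction-vector loss might be combined into one training objective.

```python
import math


def average_embedding(tokens, word_vectors, dim):
    """Turn instruction tokens into one vector by averaging known word vectors.

    `word_vectors` is a hypothetical lookup table (token -> list of floats).
    Tokens missing from the table are skipped.
    """
    known = [word_vectors[t] for t in tokens if t in word_vectors]
    if not known:
        return [0.0] * dim
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def binary_cross_entropy(probs, labels, eps=1e-7):
    """Mean binary cross-entropy over all ingredient labels (multi-label loss)."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels)


def mean_squared_error(pred, target):
    """Mean squared error between predicted and target instruction vectors."""
    return sum((a - b) ** 2 for a, b in zip(pred, target)) / len(target)


def multitask_loss(ingredient_probs, ingredient_labels,
                   pred_embedding, target_embedding, aux_weight=0.5):
    """Main task (ingredients) plus weighted auxiliary task (instruction vector)."""
    main = binary_cross_entropy(ingredient_probs, ingredient_labels)
    aux = mean_squared_error(pred_embedding, target_embedding)
    return main + aux_weight * aux


# Toy example with made-up values: three candidate ingredients, 2-d word vectors.
toy_vectors = {"fry": [1.0, 0.0], "onion": [0.0, 1.0]}
target = average_embedding(["fry", "the", "onion"], toy_vectors, dim=2)
loss = multitask_loss([0.9, 0.2, 0.7], [1, 0, 1], [0.4, 0.6], target)
```

    In a real model both outputs would come from a shared image encoder, so that gradients from the auxiliary instruction-vector task also shape the features used for ingredient prediction.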
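    Keywords (Chinese)
  • 詞嵌入 (word embedding)
  • 多標籤學習 (multi-label learning)
  • 卷積神經網絡 (convolutional neural network)
  • 多任務學習 (multi-task learning)
  • 機器學習可解釋性 (machine learning interpretability)
    Keywords (English)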
  • Word-embedding
  • Convolutional neural network
  • Multi-task learning
  • Machine Learning Interpretability
  • Multi-label Learning
    Table of contents
    Thesis approval form (論文審定書)
    Acknowledgements (誌謝)
    Abstract in Chinese (摘要)
    Abstract
    Table of contents (目錄)
    List of Figures
    List of Table
    1. Introduction
    2. Background and Related Work
    2.1 Convolutional neural network
    2.2 Food Understanding
    2.3 Multi-label classification
    2.4 Multi-task Learning
    2.5 Word embedding
    2.6 Explainable AI
    3. Proposed approach
    4. Experiments
    4.1 Dataset
    4.2 Evaluation metrics
    4.3 Comparison Methods
    4.4 Experimental Results
    5. Conclusion
    6. References
    Oral defense committee
  • 黃三益 - Convener
  • 李珮如 - Committee member
  • 康藝晃 - Advisor
    Oral defense date: 2020-07-30 Submission date: 2020-08-23


