A Research on the Automatic Classification of CSR Reports In Accordance To GRI Environment Standards
Topic Detection, Text Classification, Corporate Social Responsibility, Corporate Social Responsibility Standards, Text Mining
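In recent years, amid the wave of corporate governance and sustainable development, enterprises have devoted effort to contributing to society and disclose their social responsibility goals, results, commitments, and plans in the form of reports. These reports are written with reference to multiple GRI indicators and disclose a company's impact on the economy, the environment, and society; they are an important criterion in CSR evaluation. To automatically and quickly predict which indicators a text refers to, this thesis selects seed words for each environmental GRI indicator and uses GuidedLDA to detect the environmental indicator items disclosed in the text. We collected the CSR reports of companies listed in Taiwan and extracted the passages written with reference to GRI indicators. Our experiments show that, with small samples, our method identifies the environmental indicator items disclosed in a text more accurately than existing classifiers, and that it is able to judge whether a text refers to the GRI environmental indicators.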
Corporate governance and sustainable development are taken seriously by many enterprises. Enterprises strive to contribute to society and publish Corporate Social Responsibility (CSR) reports every year to record their goals, results, commitments, and plans. Following the GRI Standard, CSR reports describe a company's impact on economic, environmental, and social aspects. Each aspect has several criteria that can be used to evaluate a CSR report, and it is essential that a report meets each of them. In this thesis, we propose an approach that automatically predicts whether a given CSR report meets each criterion in the environmental aspect. Our approach applies GuidedLDA to this task. We collected the CSR reports of companies listed in Taiwan and extracted the paragraphs that potentially meet the GRI Standard. Our experiments show that, on small data sets, our approach performs better than all the other traditional learning methods. On larger data sets, our approach outperforms the unsupervised learning approaches and has performance comparable to the supervised learning approaches, while also being able to provide interpretations.
Table of Contents
Thesis Approval Certificate
Acknowledgements
Abstract (Chinese)
Abstract (English)
Table of Contents
List of Figures
List of Tables
CHAPTER 1 – Introduction
CHAPTER 2 – Related Work
1. Corporate social responsibility
2. Corpus modeling
3. Text mining in CSR reports
CHAPTER 3 – Methodology
1. Task description
2. Dataset collection
3. Our approach
CHAPTER 4 – Experimental Evaluation
1. Unsupervised learning methods
2. Supervised learning methods
3. The impact of seed word quality on the method
4. Adding data irrelevant to the environmental indicators to the data sets
CHAPTER 5 – Conclusion
References
