Assessing ESG Compliance and Impact: A Zero-Shot Learning Approach to Analyzing Fortune 500 Companies' Sustainability Reports

Table: COMP1
Experimentation location: Home
Regulated Research (Form 1c): No
Project continuation (Form 7): No


This is a working list of the references I have so far. It will most likely grow as I finish my research paper on the project.

  • “Predicting Companies’ ESG Ratings from News Articles Using Multivariate Timeseries Analysis” (Aue et al.)
  • “Analyzing Sustainability Reports Using Natural Language Processing” (Luccioni et al.)
  • “Detecting ESG topics using domain-specific language models and data augmentation approaches” (Nugent et al.)
  • “ESGBERT: Language Model to Help with Classification Tasks Related to Companies Environmental, Social, and Governance Practices” (Mehra

Additional Project Information

Project website: -- No project website --
Research paper:
Additional Resources: -- No resources provided --

Research Plan:

In the last decade, the evaluation of companies has evolved to include environmental, social, and governance (ESG) factors, reflecting an increased societal demand for sustainable and socially responsible investing. Although many companies align their strategies with ESG principles and publish detailed sustainability reports, stakeholders struggle to review these documents efficiently and to verify the claims against actual performance. This research aims to simplify the evaluation of company sustainability reports so that a company's commitment to ESG goals can be assessed more directly. Addressing this issue is crucial for enhancing corporate accountability, ensuring that companies are promoting sustainability and diversity, and supporting informed investment decisions. The following summarizes the methods and procedures I will use to carry out my research project:

  • Data Collection: I collect sustainability reports from Fortune 500 companies over five years, then preprocess the text by splitting it into sentences, removing unwanted characters, extra spaces, and URLs, and applying lemmatization and tokenization to standardize and refine the data.
  • Text Analysis: Using Facebook's pretrained BART-large-mnli model, I classify sentences into 19 ESG-related categories through zero-shot learning, enabling me to categorize content the model hasn't explicitly been trained on. Each sentence is assigned a category and a confidence score, indicating the model's certainty in its classification.
  • Category Analysis: The distribution of categorized sentences is analyzed to identify the most frequently discussed ESG topics within the reports. This involves using visualization tools like Matplotlib to highlight the primary focus areas of the companies' sustainability efforts.
  • Quantitative Data Gathering: Based on the identified key ESG topics, I collect quantitative data from the companies' official reports and public disclosures, focusing on metrics relevant to the most discussed categories.
  • Correlation Analysis: I compare the qualitative insights from the sustainability reports with the quantitative data to assess alignment and discrepancies. This step involves looking for trends, patterns, and changes over time to evaluate whether companies' actions match their stated ESG commitments.
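
The preprocessing and category-analysis steps above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the project's actual code: the regex-based sentence splitter stands in for a proper tokenizer, lemmatization is omitted, and the function names and category names are hypothetical.

```python
import re
from collections import Counter

URL_PATTERN = re.compile(r"https?://\S+|www\.\S+")


def clean_text(raw: str) -> str:
    """Remove URLs and unwanted characters, then collapse whitespace."""
    text = URL_PATTERN.sub(" ", raw)
    # Crude cleanup (assumption): keep printable ASCII only. This also drops
    # legitimate accented characters, so a real pipeline may want a gentler rule.
    text = re.sub(r"[^\x20-\x7E]", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def split_sentences(text: str) -> list[str]:
    """Naive splitter: break after '.', '!' or '?' when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text)
    return [part for part in parts if part]


def category_shares(labels: list[str]) -> dict[str, float]:
    """Fraction of sentences assigned to each category, most frequent first."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.most_common()}
```

The dictionary returned by `category_shares` could then be passed to a Matplotlib bar chart to show which ESG topics dominate a report.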


For some reason, I do not see a section for the abstract, so I am putting it here: 

In the past decade, there has been a significant shift in how companies are evaluated, with a growing emphasis on factors beyond financial performance. Three of the metrics that have emerged in sustainable and socially conscious investing are especially prominent: environmental, social, and governance factors, collectively known as ESG. Many companies have worked to align their strategies with ESG principles, recognizing the increasing importance of these factors to stakeholders, investors, and regulatory bodies. Companies release yearly corporate sustainability reports specifying their ESG goals, such as reducing their carbon footprint or increasing employee diversity, and their progress toward those goals. This paper presents a comprehensive analysis of how Fortune 500 companies across various industries are integrating ESG considerations into their operations and reporting, and the extent to which they are abiding by their publicly stated commitments. To do this, we extract and tokenize statements from company sustainability reports, classify each into one of nineteen ESG subcategories, and then compare the resulting ESG focus areas to actual statistical data in order to evaluate the authenticity and effectiveness of these reports, scrutinizing the gap between stated objectives and actual outcomes. This examination not only reveals the current state of ESG compliance among leading corporations but also offers critical insights into the challenges and successes of implementing sustainable and socially responsible practices.
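
The classification step described above can be sketched with the Hugging Face `transformers` zero-shot pipeline and the `facebook/bart-large-mnli` model named in the research plan. This is only an illustrative sketch: the helper names are hypothetical, and the three labels shown are placeholders, not the project's nineteen subcategories.

```python
# Placeholder labels (assumption): the real project uses nineteen subcategories.
EXAMPLE_LABELS = ["climate change", "employee diversity", "board governance"]


def load_classifier():
    """Load the zero-shot pipeline; the model is downloaded on first use."""
    # Imported lazily so the rest of this sketch runs without transformers installed.
    from transformers import pipeline

    return pipeline("zero-shot-classification", model="facebook/bart-large-mnli")


def classify_sentences(sentences, candidate_labels, classifier):
    """Return (sentence, best_label, confidence) for each sentence.

    `classifier` is any callable that behaves like the zero-shot pipeline:
    it returns a dict whose "labels" and "scores" lists are sorted from
    most to least likely.
    """
    results = []
    for sentence in sentences:
        output = classifier(sentence, candidate_labels=candidate_labels)
        results.append((sentence, output["labels"][0], output["scores"][0]))
    return results
```

With the library installed, `classify_sentences(["We reduced our carbon emissions."], EXAMPLE_LABELS, load_classifier())` would return one tuple per sentence containing the top category and its confidence score.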

Questions and Answers

1. The major objective of the project was to analyze how Fortune 500 companies integrate environmental, social, and governance (ESG) considerations into their operations and reporting, and to evaluate the authenticity and effectiveness of their publicly stated commitments. The goal was to shed light on the current state of ESG compliance and the challenges and successes in implementing sustainable practices. This objective was driven by the increasing importance of ESG factors in corporate evaluation and the need for transparency in sustainability reporting. To achieve this goal, I planned to create a tool using data analytics and natural language processing that could automatically review company sustainability reports and derive insights from them, which a human can then analyze further.

2. The major tasks involved extracting and tokenizing statements from company sustainability reports, classifying them into nineteen different ESG subcategories using zero-shot learning with a pretrained large language model, and comparing the resulting ESG focus areas to actual statistical data in order to conclude whether a company had been making progress on its ESG-related goals.

3. My project's innovation lies in its use of zero-shot classification for the rapid and automatic analysis of sustainability reports, allowing for a detailed examination of ESG integration across industries without the need for training data. This approach enables efficient categorization of ESG statements into nineteen subcategories, providing a granular understanding of corporate sustainability efforts and their authenticity. Additionally, comparing these categorizations to actual statistical data to evaluate the authenticity of reports adds a layer of rigor to the analysis, which is key in a world where ESG is becoming more important in comparison to financial metrics.

4. The most challenging part of the project was the extraction and classification of statements from a large number of sustainability reports, as applying a large language model to each individual statement was very time-consuming. Comparing the results with actual statistical data in order to evaluate authenticity has also been a challenge. Overcoming these challenges required developing a systematic approach.

5. If I were to do this project again, I would consider finding a more efficient way to compare the categorized ESG statements with actual statistical data. This could involve developing a more automated process or utilizing advanced data analysis tools to streamline the evaluation of the authenticity and effectiveness of the sustainability reports, as for the most part, I did this portion of my project manually.

6. Working on this project has motivated me to explore the development of machine learning algorithms for more nuanced ESG analysis, as well as the potential for real-time ESG monitoring systems that could provide dynamic insights into corporate sustainability efforts.

7. The completion of my project was not affected by COVID-19.
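
Answer 5 notes that the comparison between categorized statements and statistical data was done mostly by hand. One possible way to automate part of it is to compute a correlation between how much a report discusses a topic each year and a related reported metric. The sketch below uses a plain Pearson correlation; the function name is hypothetical, and the numbers in the usage note are made-up placeholders, not project data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Placeholder values (assumption): yearly share of sentences about a topic
# and a related yearly metric for the same five years.
topic_share_by_year = [0.10, 0.15, 0.20, 0.25, 0.30]
metric_by_year = [100.0, 90.0, 80.0, 70.0, 60.0]
r = pearson(topic_share_by_year, metric_by_year)
```

A strongly negative `r` here would mean the metric fell as the topic received more attention. With only five yearly points, such a value is a prompt for closer manual review rather than evidence on its own.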

I want to note that I am still completing some aspects of my project and the accompanying paper. In the presentation I attached as part of my application, I did not detail the results of verifying the model's accuracy against testing data because I am still evaluating the model. I also wanted to inform you that I put the abstract in the Research Plan section above since I did not see a section where I could enter it.
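
As a sketch of what checking the model against testing data could look like, the functions below compute overall accuracy and per-category precision and recall from a hand-labeled test set. The function names are hypothetical, the labeled test set is an assumption, and no results are implied here.

```python
from collections import Counter


def accuracy(true_labels, predicted_labels):
    """Share of sentences whose predicted category matches the hand label."""
    if len(true_labels) != len(predicted_labels) or not true_labels:
        raise ValueError("need two non-empty label lists of equal length")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


def precision_recall(true_labels, predicted_labels):
    """Map each category to (precision, recall); 0.0 when undefined."""
    true_counts = Counter(true_labels)
    pred_counts = Counter(predicted_labels)
    hits = Counter(t for t, p in zip(true_labels, predicted_labels) if t == p)
    scores = {}
    for label in sorted(set(true_labels) | set(predicted_labels)):
        precision = hits[label] / pred_counts[label] if pred_counts[label] else 0.0
        recall = hits[label] / true_counts[label] if true_counts[label] else 0.0
        scores[label] = (precision, recall)
    return scores
```

Per-category scores matter here because, with nineteen subcategories, a high overall accuracy can hide categories the model rarely predicts correctly.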