No. 202: Assessing Corporate Sustainability with Large Language Models: Evidence from Europe
Abstract
Companies play a crucial role in reaching global sustainability goals, yet evidence of their progress along environmental, social, and governance (ESG) dimensions remains limited. Here, we develop a machine learning (ML) framework to systematically track ESG indicators from corporate reports. Applying this framework to the annual and sustainability reports of the 600 largest listed corporations in Europe over the 2014–2023 period, we collect 2,880,249 observations of ESG indicators across environmental (e.g., scope 1, 2, and 3 greenhouse gas emissions, water consumption, waste), social (e.g., employee turnover, women in top management, gender pay gap), and governance (e.g., lobbying expenses) topics. We use this dataset to conduct two analyses over time and across industries. First, we assess ESG transparency, measured as a firm’s disclosure of the ESG indicators defined by the newly mandated European Sustainability Reporting Standards (ESRS). Second, we analyze ESG performance by extracting the numerical values of these indicators. Our results reveal a pronounced transparency gap: companies in the top decile of ESG ratings provided 22% more ESG indicators on average than those in the bottom decile. This gap narrowed substantially in later years, indicating a gradual convergence in ESG disclosure practices. ESG performance improved unevenly: while some environmental performance indicators showed notable improvements, most social indicators remained largely stagnant, with the exception of progress on gender equality. Changes in reported totals also partly reflect changes in disclosure. For example, during 2021–2023, total scope 3 emissions increased by a factor of 5.6, which is largely explained by an increase in scope 3 transparency. Our open-source ML framework enables policy-makers, investors, and financial markets to systematically track corporate ESG efforts, which, in turn, helps identify and drive progress toward sustainability goals.
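To make the transparency-gap figure concrete, the following is a minimal sketch of how a relative gap between the top and bottom rating deciles could be computed. It is an illustration under stated assumptions, not the paper's implementation: the function name `transparency_gap`, the input layout (one pair of ESG rating and number of disclosed indicators per firm), and the toy numbers are all hypothetical, and the actual analysis may aggregate differently (for instance per year or per industry).

```python
from statistics import mean


def transparency_gap(firms, groups=10):
    """Relative gap in mean disclosed-indicator counts, top vs. bottom group.

    firms  : list of (esg_rating, n_indicators_disclosed) pairs (hypothetical layout)
    groups : number of equal-sized rating groups; 10 gives deciles

    Returns (top_mean - bottom_mean) / bottom_mean.
    """
    if len(firms) < 2:
        raise ValueError("need at least two firms")

    # Rank firms from lowest to highest ESG rating.
    ranked = sorted(firms, key=lambda pair: pair[0])

    # Size of one group; at least one firm per group.
    k = max(1, len(ranked) // groups)

    bottom_mean = mean(count for _, count in ranked[:k])
    top_mean = mean(count for _, count in ranked[-k:])

    if bottom_mean == 0:
        raise ValueError("bottom group discloses no indicators; gap undefined")

    return (top_mean - bottom_mean) / bottom_mean


# Toy data (made up for illustration): 20 firms with ratings 1..20.
counts = [50, 50] + [55] * 16 + [60, 62]
toy = list(zip(range(1, 21), counts))

gap = transparency_gap(toy)
print(f"Top decile discloses {gap:.0%} more indicators than the bottom decile")
```

In this toy example the bottom decile averages 50 indicators and the top decile 61, so the relative gap is 11/50. The sketch measures the gap relative to the bottom decile, which matches the wording "22% more ... than those in the bottom decile", but that choice of baseline is an assumption about how the figure was derived.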