
How AI Can Empower ESG Analysis

Iceberg Data Lab CTO Pierre-Olivier Haye outlines the ways technology can help financial institutions better adapt to complex and rapidly evolving sustainability obligations.

The ever-increasing number of ESG regulations – including everything from the EU’s Corporate Sustainability Reporting Directive (CSRD) to California’s Climate Corporate Data Accountability Act – means the amount of sustainability-related company data is predicted to grow exponentially.

However, the vast majority of this information is unstructured, meaning it isn’t stored in a specific format or organised in a pre-defined manner. This includes sensor data, text files such as emails, and audio and video files.

“The ‘crime’ of reported sustainability data is that the majority of it – and likely the most interesting – is not structured,” Pierre-Olivier Haye, Chief Technology Officer and Co-founder of ESG data provider Iceberg Data Lab (IDL), told ESG Investor. “The important bits are typically buried in a 200-page annual report.”

Haye believes that data extraction is the leading use case for AI in the sustainability industry.

Recently, the emergence of large language models (LLMs), such as OpenAI’s ChatGPT and Google’s Gemini – which can perform natural language processing tasks – has turbocharged the ability to retrieve, comprehend and categorise data.

But as with all machine-learning algorithms, the challenge with LLMs is the error rate that arises when the models are trained on a data set. “There is always some uncertainty with the results,” according to Haye. “Many refer to ‘hallucinations’ when the LLM results are nonsensical, but we call it uncertainty.”

IDL is attempting to address the inconsistencies intrinsic to LLMs used for sustainability data by restricting the information universe to specific documents when training the model.

“It is a great use case because many industries are very interested in the specific extraction of ESG text. However, the other challenge we face is that ESG sits at the intersection of many topics – such as finance, environment and nature, human rights and diversity,” said Haye.

“To train the model we need people who have this range of expertise, which is quite rare because it is a relatively new field. As such, they need to be able to learn, discover, ask the right questions, and essentially improve their own understanding.”
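
In practice, the document-restricted approach Haye describes can be sketched very simply: give the model only the report text it is allowed to use, and instruct it to answer from nothing else. The snippet below is a minimal illustration rather than IDL’s implementation – it assumes the OpenAI Python client, and the model name, prompt and report excerpt are placeholders.

```python
# Minimal sketch: ESG data extraction restricted to a supplied document excerpt.
# Assumes the OpenAI Python client (v1+); the model name, prompt and excerpt are
# illustrative placeholders, not IDL's actual setup.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

report_excerpt = """
In 2023 the group reduced Scope 1 and 2 emissions by 12% against its 2021
baseline and committed to a 40% reduction by 2030.
"""  # stand-in for a page of a 200-page annual report

response = client.chat.completions.create(
    model="gpt-4o-mini",  # any capable chat model
    temperature=0,        # deterministic output reduces run-to-run variation
    messages=[
        {
            "role": "system",
            "content": (
                "You extract ESG data points. Answer only from the document "
                "provided; if the answer is not in it, reply 'not reported'."
            ),
        },
        {
            "role": "user",
            "content": f"Document:\n{report_excerpt}\n\n"
                       "Question: What emissions reduction target does the "
                       "company state, and for which year?",
        },
    ],
)

print(response.choices[0].message.content)
```

Constraining the model to the supplied text is one common way of containing the ‘uncertainty’ Haye refers to, because the model cannot draw on unrelated material when the answer is not in the document.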

Tutor bot

To address this issue, IDL launched an ESG AI assistant, Barbatus, in July 2023.

“It is a chatbot tuned to be an ESG specialist, so people can have a conversation that will help educate them on ESG topics,” Haye explained.

He outlined a scenario in which an investor, scrutinising a company’s annual report, wants to know where it stands relative to industry peers. To benchmark the company, the investor also needs to understand the challenges facing the sector.

“For example, an investor might be looking at investing in French multinational Alstom [which provides rolling stock, services and maintenance for the rail sector],” said Haye. “They may want to understand the environmental issues the transport sector is facing to get a better understanding of how Alstom is dealing with those specific challenges.”

In this case, the chatbot would scan Alstom’s published annual report and compare it against the sector’s ambitions.

“The investor is then able to have a proper view and discussion about whether the company is going in the right direction,” he continued. “They can also expand their knowledge by interacting with the chatbot. We are just at the beginning of the ESG topic, so it’s an interesting way to learn more.”

Data governance

One of the biggest challenges facing the finance industry when using tools like AI or machine learning is data confidentiality, according to Haye. As such, many institutions feel they need to operate LLMs themselves.

But with AI evolving at an exceedingly fast pace, staying abreast of the latest market developments can be difficult. Financial institutions must also invest in expensive hardware and attract people with the right skillset at a time when there is a chronic talent shortage.

“It is a costly proposition that doesn’t come with a guarantee that they will be able to fix the specific problem,” Haye said.

Hiring is a challenge, especially as ESG expertise requires both a scientific and a finance background. “They need to have an open mind, but often the finance and ESG worlds don’t mix and mingle easily,” he added.

Another problem for large companies – and one not specific to banks or investment firms – is the capacity to innovate, which is often stifled by the weight of internal processes and hierarchy, Haye explained. As such, he advises financial institutions against starting a large ESG-focused AI project themselves.

“They will spend a lot of money but most likely won’t be able to release the product [in a timely manner] due to complex internal bureaucracy,” he said.

He believes that start-ups are better placed to drive innovation. For example, IDL is currently working on a hosted solution to ensure data security for financial institutions, due to be released soon.

“They don’t want OpenAI using their internal reports or documents for training, so we wanted to be able to guarantee confidentiality,” said Haye. “Our objective was to achieve the same results as OpenAI, but with an internally hosted solution.”
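
The article does not detail IDL’s hosted architecture, but one common confidentiality pattern is to point the same client code at an internally hosted, OpenAI-compatible endpoint so that documents never leave the institution’s own infrastructure. A minimal sketch, in which the server URL and model name are hypothetical:

```python
# Sketch of swapping a public API for an internally hosted, OpenAI-compatible
# endpoint. The base_url and model name are hypothetical; the article does not
# describe IDL's actual hosted solution.
from openai import OpenAI

client = OpenAI(
    base_url="https://llm.internal.example.com/v1",  # hypothetical in-house server
    api_key="internal-placeholder-key",              # no data sent to OpenAI
)

response = client.chat.completions.create(
    model="local-esg-model",  # whatever model the internal server exposes
    messages=[{"role": "user", "content": "Summarise the report's climate targets."}],
)
print(response.choices[0].message.content)
```

Because requests stay on internal infrastructure, the reports and documents are never exposed to a third-party provider’s training pipeline.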

Where to start?

Though most financial institutions are exploring how to apply AI and LLMs in different parts of their business, Haye’s advice to those looking to experiment with AI for ESG is to establish a baseline to build from.

“You only control what you measure – so if you are starting from nothing, independent of your data provider or strategy, begin by measuring,” he said. “Then you can define the axes of improvement, which is the path to doing something. If you jump in too fast, you might head in the right direction, but it’s not guaranteed that you will continue taking the right steps or truly understand what you are doing – so you won’t be in control.”

Once the organisation starts measuring, it should define its strategy and aims, and then document its computation methodology, Haye suggested.

“As Winston Churchill said: ‘The only statistics you can trust are those you falsified yourself’,” he added. “Therefore, you need to understand the data you have and how it has been computed because there will be bias. Most of the data provided by external sources has bias.”

A prime example of this is Scope 3 emissions, for which companies typically depend on data reported by third parties.

“Therefore, we have no other choice but to model [Scope 3]. But to use the model properly, we need to understand the data flows in the calculation,” explained Haye, stressing the importance of using high-quality data providers.

“We help our clients understand the computation methodology, so that they can grasp any limitations and go beyond just the number,” he added.
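
To make the Scope 3 point concrete, one widely used modelling approach applies a sector-average emission intensity to company revenue. The sketch below is a toy illustration of that technique, not IDL’s methodology; the intensity factors and revenue figure are invented.

```python
# Toy revenue-based Scope 3 estimate. The sector intensities and revenue are
# invented for illustration; they are not IDL data or real company figures.

# Sector-average Scope 3 intensity (tonnes CO2e per EUR million of revenue)
SECTOR_INTENSITY_TCO2E_PER_MEUR = {
    "rail_equipment": 950.0,   # hypothetical factor
    "utilities": 2_400.0,      # hypothetical factor
}

def estimate_scope3(revenue_meur: float, sector: str) -> float:
    """Estimate Scope 3 emissions (tCO2e) as revenue x sector intensity."""
    return revenue_meur * SECTOR_INTENSITY_TCO2E_PER_MEUR[sector]

# Hypothetical company with EUR 16,500m revenue in the rail equipment sector
print(f"{estimate_scope3(16_500, 'rail_equipment'):,.0f} tCO2e")  # 15,675,000 tCO2e
```

The estimate is only as good as the intensity factor and the revenue mapping behind it – which is exactly why Haye stresses understanding the data flows in the calculation rather than taking the final number at face value.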

Environmental impact

Spiralling energy demand from AI has already affected the carbon footprints of many technology companies. Microsoft, for example, reported its carbon emissions were up almost 30% in 2023 compared to 2020 levels, mainly due to the energy requirements of its AI business.

However, Haye is confident that innovation will solve the negative environmental impact of AI.

“Today, we are working with big models, which require big data and big energy – big everything,” he said. “But as we move to working on specific use cases, we can use much smaller, tailored models. By virtue of being smaller, they will use less energy to train and run. In my opinion, this is something to explore.”

In addition, having highly specialised companies like IDL developing solutions can reduce the amount of energy required, as they can train the AI model once and serve many clients with it. “Most of the energy consumption is due to the training process, so sharing the model training will help reduce carbon emissions,” explained Haye.

He also highlighted the efforts cloud providers are making in recycling their hardware. For example, French cloud computing company OVHcloud has a system for recycling servers when they depreciate by disassembling them and reusing components. The company offers recycled servers for less than €10 per month.

“It is improving the lifetime of the hardware, which is the type of solution that cloud providers need to work on,” Haye argued.

Supportive regulation

The new regulations coming down the pipeline may be critical for transitioning to a low-carbon economy, but companies may struggle with the associated reporting burden. Due to come into force in 2025, the CSRD is already proving a challenge for many – despite a clearly defined reporting framework.

Generally, Haye – like many others – believes that regulators should provide guidance on the end goal of regulations, without diving too deeply into how to reach it.

“There is an important distinction between identifying the target result and prescribing the way it should be attained – it’s better to let those companies [in the market] innovate in terms of methodology, technology and so on,” he argued. “While regulation is necessary and we need to have a framework to speak the same language, sometimes regulators stray into the ‘how’, which blocks innovation.”

AI is also a good solution for interpreting new regulations, according to Haye. For example, IDL has created a prototype solution to extract ESG data from CSRD reports.

“ESG teams need to harvest and analyse the data to create the report, so this could be an interesting solution for them,” he added.
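
The article does not describe how the prototype works, but a structured-extraction sketch gives a flavour of the task: ask the model to return specific data points from a CSRD-style passage as machine-readable JSON. The field names, passage and model below are illustrative assumptions, not IDL’s prototype.

```python
# Sketch of structured extraction from a CSRD-style disclosure into JSON.
# Field names, passage and model are illustrative; this is not IDL's prototype.
import json
from openai import OpenAI

client = OpenAI()

passage = "Gross Scope 1 GHG emissions were 210 ktCO2e in 2023 (2022: 240 ktCO2e)."

response = client.chat.completions.create(
    model="gpt-4o-mini",
    response_format={"type": "json_object"},  # ask for machine-readable output
    temperature=0,
    messages=[
        {"role": "system",
         "content": "Return JSON with keys: metric, value_ktco2e, year. "
                    "Use only the passage provided."},
        {"role": "user", "content": passage},
    ],
)

record = json.loads(response.choices[0].message.content)
print(record)  # e.g. {'metric': 'Scope 1 GHG emissions', 'value_ktco2e': 210, 'year': 2023}
```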

