Credit: Rob Dobi

Can generative AI provide trusted financial advice?

Betsy Vereckey

Apr 8, 2024

Artificial intelligence is already a mainstay in financial technology, where it underpins everything from budgeting apps to customer service chatbots. But can AI provide smart, personalized financial advice the way financial advisers do?

The reality is closer to that than you might think, said Andrew Lo, a professor of finance at MIT Sloan and director of the MIT Laboratory for Financial Engineering. Lo recently presented an update on research he’s conducting that sheds light on whether generative AI has the potential to provide sound financial advice unique to each individual.

Many people who want bespoke financial advice meet with a trusted financial adviser. Can large language models — the bedrock for AI systems like GPT-4 — step in to replace them? Lo is collaborating on a three-part project with graduate students Jillian Ross and Nina Gerszberg to better understand the role of LLMs in providing financial advice. Lo presented their findings thus far at the 2024 MIT AI Conference, sponsored by the MIT Industrial Liaison Program.

“Financial advice, in my view, is an ideal test bed because the stakes are really high,” Lo said. “There’s a lot of money in this space, a lot of financial advisers, and a lot of problems with having bad financial advice.”

Here’s what the researchers have learned so far by asking the large language models to perform certain tasks.

Large language models have the acumen to provide financial advice, but only with the addition of supplemental modules.

LLMs can already be used to offer financial advice. The important question, Lo said, is whether the advice is good — i.e., that it reflects the domain-specific knowledge that humans demonstrate in passing the CFA exam and obtaining other certifications.

Lo said his team’s research thus far shows that AI does a pretty good job of this — provided that a supplemental module that incorporates finance-specific knowledge is added to the mix.

“The preliminary analysis is that with a relatively light module — not a lot of data and not a lot of analytics — we’re actually able to generate passing domain-specific knowledge among large language models,” Lo said.

Without a module, ChatGPT “doesn’t quite pass, but it’s close,” Lo said. “It’s actually remarkably close.” But even so, Lo anticipates that his ongoing research will find that supplemental modules or finance-specific LLMs will continue to be required to navigate the sector’s complex legal, ethical, and regulatory landscape.

AI has the potential to personalize financial advice in tone and content.

Numerous large language models, such as ChatGPT 4.0, are targeted toward individuals with at least a college-level education, but Lo and his research team are working to see whether they can “bring it down to a high school level, where, for financial purposes, that actually would be ideal.”

As they develop the ability to “talk” to both an elderly retiree who never finished high school and a professional regulator, LLMs will not just be able to answer questions satisfactorily but do so in a relatable tone, the way a human financial adviser would.

Out of the box, large language models adopt either a neutral or slightly positive tone, but Lo believes that this, too, can be personalized to help cultivate a relationship with a client, which in turn may increase the chance that the client will follow the LLM’s financial advice.

Lo’s work on tone builds on his earlier research on why some investors “freak out” and exit the stock market after significant losses. “If your client is neutral, then you should adopt a neutral tone,” he said. “If your client is slightly positive, adopt a positive tone.”

However, if a client is exhibiting wild optimism or pessimism, an adviser should adopt the opposite tone — with the idea of hitting a middle ground.

“When we are engaging in extreme reactions, those are the times when you need this kind of opposite reinforcement to be able to moderate individuals,” Lo said. “When the investor is exuberant, you want to bring that investor back down to planet Earth so that they don’t engage in any kind of extreme investment behavior.”
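
The tone rule Lo describes can be sketched in a few lines of Python. This is only an illustration, not the researchers' implementation: the function name `adviser_tone`, the sentiment score in the range -1 to 1, and the two thresholds are all hypothetical, and the handling of a mildly negative client (mirrored here) is an assumption the talk did not address.

```python
def adviser_tone(client_sentiment: float,
                 mild: float = 0.2,
                 extreme: float = 0.7) -> str:
    """Pick an adviser tone for a client sentiment score in [-1, 1].

    The thresholds are illustrative assumptions, not values from the research.
    """
    if not -1.0 <= client_sentiment <= 1.0:
        raise ValueError("client_sentiment must be between -1 and 1")

    # Extreme reactions: adopt the opposite tone to pull toward a middle ground.
    if client_sentiment >= extreme:
        return "negative"
    if client_sentiment <= -extreme:
        return "positive"

    # Moderate reactions: mirror the client.
    if client_sentiment > mild:
        return "positive"
    if client_sentiment < -mild:
        return "negative"  # assumption: mild pessimism is mirrored too
    return "neutral"


if __name__ == "__main__":
    for score in (0.0, 0.4, 0.9, -0.9):
        print(score, adviser_tone(score))
```

In practice the sentiment score would have to come from somewhere, such as a classifier run over the client's messages, and that step is outside this sketch.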

Generative AI has the potential to act ethically, but bias remains a concern.

The final point that Lo explored was what he considers a “complicated question” about whether generative AI can be trusted. Can it adhere to a fiduciary duty to engage in ethical financial behavior, as human advisers are required to do?

“That’s a whole can of worms,” Lo said. “Some people would argue that financial ethics is an oxymoron, but I would disagree with that. We have to talk about the notion of fiduciary duty.”

To get more specific about what “engaging in the best interest of your investors” actually means for LLMs, Lo and his research team turned to retrieval-augmented generation, known as RAG, which retrieves domain-specific data from external knowledge bases. They built a RAG knowledge base of financial lawsuits as a way to steer the technology toward ethical behavior by default.

“It turns out that when you apply this setting to large language models, ChatGPT 4.0 ends up being relatively fair, but the other large language models have biases — and there are a number of other biases, including gender bias, that large language models exhibit,” Lo said.

In their working paper, Lo and his co-authors note, “Training data can come from all corners of the internet, which contains a glut of biased and toxic content. When trained on this data, LLMs can exhibit harmful biases that are difficult to preemptively identify and control, such as ‘parroting’ historical prejudices about race, ethnicity, and gender, an obviously undesirable outcome.”
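
For readers unfamiliar with retrieval-augmented generation, the retrieval step can be sketched as follows. This is a toy illustration under stated assumptions, not the team's system: the two case summaries are made-up placeholders rather than real lawsuits, the function names are hypothetical, and simple word-count cosine similarity stands in for the embedding search a production system would likely use. No language model is called; the sketch stops at building the prompt.

```python
import math
import re
from collections import Counter

# Placeholder summaries for illustration only; these are not real lawsuits.
CASES = [
    "Placeholder case A: adviser recommended high-fee products "
    "without disclosing commissions.",
    "Placeholder case B: broker executed trades without client authorization.",
]


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, documents: list[str], k: int = 1) -> list[str]:
    """Return the k documents most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(documents,
                    key=lambda d: cosine(q, Counter(tokenize(d))),
                    reverse=True)
    return ranked[:k]


def build_prompt(query: str, documents: list[str], k: int = 1) -> str:
    """Prepend the retrieved cases to the question before it goes to an LLM."""
    context = "\n".join(f"- {d}" for d in retrieve(query, documents, k))
    return f"Relevant cases:\n{context}\n\nQuestion: {query}"


if __name__ == "__main__":
    print(build_prompt("Should I disclose commissions on products I recommend?",
                       CASES))
```

The point of the design is that the model's weights are left unchanged: the relevant cases are looked up at question time and placed in front of the model alongside the client's question.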

The findings could have implications for other industries.

Lo said that lessons from studying the application and usefulness of LLMs in finance can be applied across different sectors, such as the medical, accounting, and legal fields.

“Looking at domain-specific applications is a really useful way of developing a better appreciation of some of the theoretical challenges to generative AI and general intelligence,” Lo said.