Credit: Rob Dobi
Why finance is deploying natural language processing
Three years into his stint teaching machine learning at MIT Sloan, finance lecturer Shulman has just one complaint: It’s hard to keep up. “It’s such a fast-moving field, a lot of what’s state-of-the-art now wasn’t invented when I taught the course a year ago,” he said.
Officially titled Advanced Data Analytics and Machine Learning in Finance, the course reflects a move in finance, normally a tech-cautious industry, to embrace machine learning to help make faster, better-informed decisions.
Specifically, financial analytics firms are turning to natural language processing to parse textual data hundreds of thousands of times faster and more accurately than humans can, said Shulman, head of machine learning at Kensho. (The startup, which specializes in artificial intelligence and analytics for the finance and U.S. intelligence communities, was acquired by S&P Global in 2018.)
A casual observer might assume financial data to be more numerical than textual, but Shulman said that’s not the case. “Especially in finance, data that can help make timely decisions comes in text,” he said.
Text is unstructured data, and it’s inherently harder to use unstructured data, which is where natural language processing comes into play, Shulman said. A type of machine learning, NLP is able to parse the complexities of audio related to business and finance — including industry jargon, numbers, currencies, and product names.
Earnings reports are one example. “A company will release its report in the morning, and it will say, ‘Our earnings per share were $1.12.’ That’s text,” Shulman said. “By the time that data makes its way into a database of a data provider where you can get it in a structured way, you’ve lost your edge. Hours have passed.” NLP can deliver those transcriptions in minutes, giving analysts a competitive advantage.
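To make the earnings example concrete, here is a minimal sketch of pulling that one number out of raw text. It is only an illustration: the pattern `EPS_PATTERN` and the function `extract_eps` are hypothetical names, not anything Kensho describes, and a production NLP system would likely rely on trained models rather than a single regular expression.

```python
import re
from decimal import Decimal
from typing import Optional

# Hypothetical pattern (an assumption for illustration): matches phrases such as
# "earnings per share were $1.12" or "earnings per share of $0.98".
EPS_PATTERN = re.compile(
    r"earnings per share\s+(?:were|was|of)\s+(?:a\s+)?\$(\d+(?:\.\d+)?)",
    re.IGNORECASE,
)


def extract_eps(text: str) -> Optional[Decimal]:
    """Return the first earnings-per-share figure found in text, or None."""
    match = EPS_PATTERN.search(text)
    return Decimal(match.group(1)) if match else None


release = "Our earnings per share were $1.12, up from the prior quarter."
print(extract_eps(release))
```

Returning a `Decimal` rather than a float keeps the dollar amount exact, which matters once the value lands in a structured database.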
3 use cases for finance
Finance may be relatively new to natural language processing, but as it ramps up, the industry is able to piggyback off years of research and development by tech giants like Google and Facebook, said Kucsko, an MIT Sloan lecturer in finance who teaches the class with Shulman.
“They’ve all worked with language now for decades; that’s their business,” said Kucsko, head of machine learning research and development at Kensho. The same information-sifting tools that allow people to filter out toxic tweets or query the internet from a single search bar hold significant promise for finance, he said.
“Whether you’re doing research on a company or mining some vast data sets on a country you’re interested in that no single human being could ever read, you start to need those same types of technologies,” Kucsko said.
Shulman and Kucsko laid out three instances where NLP can improve decision-making and speed inside financial organizations:
- Automation. NLP can replace the manual processes financial institutions employ to turn unstructured data into a more usable form — for example, automating capture of earnings calls, management presentations, and acquisition announcements.
- Data enrichment. Once unstructured data is captured, adding context makes it more searchable and actionable. “Imagine I get a transcript of that earnings call, and I want to find places where they’re talking about environmental impact,” Shulman said. Machine learning can enrich that raw text with metadata — flagging sections that address environmental impact, financial impact, or other topics of interest.
- Search and discovery. Finance is on a mission to find competitive advantage in broader and more varied types of data, but what’s missing is a search experience that’s as effortless and effective as the Google search bar consumers are accustomed to.
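The data enrichment step in the list above can be sketched in a few lines. This is a deliberately simple stand-in: the keyword table `TOPIC_KEYWORDS` and the function `tag_passages` are assumptions made for illustration, and a real system would more likely use a trained text classifier than substring matching.

```python
from typing import Dict, List

# Hypothetical topic vocabulary, chosen only for this illustration.
TOPIC_KEYWORDS: Dict[str, List[str]] = {
    "environmental_impact": ["emissions", "carbon", "sustainability"],
    "financial_impact": ["revenue", "margin", "earnings"],
}


def tag_passages(passages: List[str]) -> List[dict]:
    """Attach a list of matching topic labels to each transcript passage."""
    enriched = []
    for passage in passages:
        lowered = passage.lower()
        topics = sorted(
            topic
            for topic, keywords in TOPIC_KEYWORDS.items()
            if any(keyword in lowered for keyword in keywords)
        )
        enriched.append({"text": passage, "topics": topics})
    return enriched


transcript = [
    "We cut carbon emissions at our plants by a meaningful amount.",
    "Revenue grew while operating margin held steady.",
    "Thank you all for joining the call today.",
]
for row in tag_passages(transcript):
    print(row["topics"], "|", row["text"])
```

Each passage keeps its original text and gains a `topics` field, so an analyst can filter a transcript down to the sections tagged with a topic of interest.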
“If you're working at a bank or a hedge fund, and you're trying to search over your proprietary data, it can be a nightmare,” Shulman said. “An analyst might want to type, ‘Show me Obamacare in 10-K filings,’ but no 10-K filing would ever call it Obamacare,” he said. “We need systems that are intelligent enough to know that it's probably going to be called the PPACA.”
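One simple way to approximate the behavior Shulman describes is query expansion: before searching, rewrite the analyst's term into the names a filing would actually use. The sketch below assumes a hand-built alias table (`ALIASES`) and made-up filing snippets. Real systems would more likely learn these relationships from data, so treat this as an illustration of the idea rather than a description of any particular product.

```python
from typing import Dict, List

# Hypothetical alias table: colloquial term -> formal names used in filings.
ALIASES: Dict[str, List[str]] = {
    "obamacare": [
        "PPACA",
        "Patient Protection and Affordable Care Act",
        "Affordable Care Act",
    ],
}


def expand_query(query: str) -> List[str]:
    """Return the query plus any known formal aliases for it."""
    return [query] + ALIASES.get(query.lower(), [])


def search(query: str, documents: Dict[str, str]) -> List[str]:
    """Return IDs of documents that mention the query or one of its aliases."""
    terms = [term.lower() for term in expand_query(query)]
    return [
        doc_id
        for doc_id, text in documents.items()
        if any(term in text.lower() for term in terms)
    ]


# Made-up filing snippets, for illustration only.
filings = {
    "10-K-A": "Compliance with the PPACA may increase our benefit costs.",
    "10-K-B": "Our results depend on commodity prices.",
}
print(search("Obamacare", filings))
```

A plain keyword search for "Obamacare" would return nothing here. The expanded query finds the filing that refers to the PPACA.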
Machine learning for the masses
For financial institutions interested in gaining those benefits, the barriers to entry are considerably lower than in the past, thanks to what Shulman called a “democratization of tools” that has made once-arcane computer code available in less expensive, easy-to-learn formats.
“It’s actually pretty feasible now to do cutting-edge, state-of-the-art NLP in finance, or any domain, without a PhD in machine learning,” said Shulman, whose own PhD from Harvard, like Kucsko’s, is in physics.
Competition in the marketplace between Google and Facebook improves the machine learning ecosystem for all players. The tech giants are “pouring oodles of money” into competing machine learning frameworks, TensorFlow and PyTorch. In their quest for market dominance, the rivals have made both frameworks open source.
“It’s really easy now to Google around a little bit, grab 10 lines of code, and get some pretty cool machine learning results,” Shulman said.
Early adopters lead the way
As for who in the organization should serve as the code-grabber, and what department should manage the code-grabbers, right now it’s all over the map. “Companies are still trying to figure out the most effective ways to jump into machine learning and not lose out,” Kucsko said. “Across the spectrum in finance, there’s not really one unique solution.”
Further reading: How to build a data analytics dream team
Companies can bring in machine learning products, build out a data science team, or, for large companies, buy the expertise they’re looking for — as when S&P Global purchased Kensho.
In many instances, firms are likely to see machine learning seed itself into the organization through multiple channels, thanks to a proliferation of both interest and accessible tools. “You can apply machine learning pretty much anywhere, whether it’s in low-level data collection or high-level client-facing products,” Kucsko said.
Often, the impetus comes from individuals who realize they have valuable data being underutilized. “It could be documents. It could be a time series. It could be a list of buy-versus-sell decisions,” said Shulman. “And they say, ‘We have all of this data, but it’s too big for a human to make use of. How can we use machine learning and natural language processing to do that?’”
For financial institutions, which can be reluctant to deploy cutting-edge techniques like machine learning, this socialization process is an important step. “As more and more people see it work and understand the lingo, they see that it’s not a dark art — it’s math,” said Shulman. “You get more and more people on board and excited to be part of it.”