If your organization struggles to corral and analyze unstructured data, you’re not alone. Only 18% of organizations in a 2019 survey by Deloitte reported being able to take advantage of such data.
In fact, a majority of data (80% to 90%, according to multiple analyst estimates) is unstructured information like text, video, audio, web server logs, social media, and more. That’s a huge untapped resource with the potential to create competitive advantage for companies that figure out how to use it.
Unlike structured data — which is organized in a searchable format, like a database — unstructured data doesn’t adhere to conventional data models. These forms of data are often more challenging to interpret, the Deloitte report said, but can deliver a more comprehensive and holistic understanding of the bigger picture.
“Because structured data is easier to work with, companies have already been able to do a lot with it,” said Shulman, a finance lecturer at MIT Sloan and head of machine learning at Kensho, which specializes in artificial intelligence and analytics for the finance and U.S. intelligence communities.
“But since most of the world’s data, including most real-time data, is unstructured, an ability to analyze and act on it presents a big opportunity.”
Retail and finance lead the way
The industries that have most capitalized on that opportunity thus far are retail and finance, said Althea Davis, director of data practice, data strategy, and CDO advisory services at NXN, a consultancy in Abu Dhabi that specializes in smart cities.
In the mid-2000s, retailers were the first to combine and analyze data from customer emails, images, voice, and store-traffic records to market to particular customers. And in the last few years, finance has made substantial progress. “Retail could run circles around the average bank back then,” Davis said. “But the banks have really caught up.”
Now other industries, including shipping, transportation, legal, and real estate, are leaning into unstructured data.
Parsing financial text
Companies approach unstructured data in two main ways. Some organize and analyze their own unstructured data. Others have built a business on structuring unstructured data and selling the result as a service.
“There is a large and growing contingent of companies doing that,” said Shulman. One of them is Kensho, which S&P Global (formerly Standard & Poor’s) acquired in 2018 and whose ability to analyze unstructured data S&P Global uses to offer insights to its clients.
Kensho uses natural language processing, a type of machine learning, to parse unstructured finance data, quickly pulling numbers from earnings report documents, for example.
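To make the idea of pulling numbers out of earnings text concrete, here is a minimal sketch in Python. It is not Kensho’s method: it swaps in a hand-written regular expression for a learned language model, and the sample sentence, the function name `extract_amounts`, and the pattern itself are all illustrative assumptions.

```python
import re

# Hypothetical pattern: a dollar amount followed by an optional scale word.
AMOUNT = re.compile(r"\$(\d+(?:\.\d+)?)\s*(million|billion)?", re.IGNORECASE)
SCALE = {"million": 1_000_000, "billion": 1_000_000_000}


def extract_amounts(text):
    """Return the dollar amounts mentioned in free text as a list of floats."""
    amounts = []
    for number, scale in AMOUNT.findall(text):
        # An absent scale word comes back as an empty string, so default to 1.
        amounts.append(float(number) * SCALE.get(scale.lower(), 1))
    return amounts


# Made-up sentence in the style of an earnings report.
sample = "Revenue rose to $4.5 billion, while net income was $320 million."
print(extract_amounts(sample))
```

A rule like this breaks as soon as the wording changes (for instance, it ignores amounts written with thousands separators), which is part of why a model that learns from many examples is attractive for this kind of work.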
Machine learning is a method by which computers learn to perform tasks by analyzing examples of those tasks. S&P Global uses that capability to automate what were formerly manual and time-intensive processes.
Kensho uses machine learning to increase both the speed and scale of S&P Global's data offerings. “In finance, time is money,” said Shulman. “If you can make decisions faster than your competitors, then you can make money off of that.”
Shipping, legal, and banking
S&P Global offers a similar service for the shipping industry through another acquired startup, Panjiva. The company analyzes shipping manifests, structures the data, and then sells it. “Think about the amazing supply chain insights you might get from data about what goods are shipping where in the world,” said Shulman. That’s valuable not only for Wall Street analysts but also for governments and other organizations involved in geopolitics, he said.
Another company using the data-as-a-service business model is LexisNexis, whose Lex Machina unit provides a similar service for the legal industry. It scans legal briefs and court decisions, pulling and analyzing data that can provide insights into judges, lawyers, and lawsuits, which various parties can use to their advantage in a legal case.
Meanwhile, banks, especially digital-native banks, are using unstructured data to market new products. For example, the pioneering online bank ING partnered with AXA to sell insurance online, Davis said.
As a contractor, Davis helped build an analytics platform for the partnership that uses a variety of data, much of it unstructured, to identify and engage potential customers, she said.
For example, a bank customer’s data may indicate that her child (with whom she has a joint checking account) is moving away from home to attend college. Based on data available on the particular university town, ING can offer her theft or renter’s insurance at a click of the mouse.
“They can look at so many different types of data, including social media posts and even images of various types of housing — and pull it into their predictive models, which then spout out how they can best target this product to students and parents,” said Davis.
The policy can be sold — and the appropriate legal documents signed — all electronically. To identify potential customers for travel insurance, the partnership can collect and analyze data from travel agencies and websites to find people who are planning a vacation. “So much of that data is not structured,” Davis said. “They are getting it from many different places, and it’s not in bits and bytes.”
Diagnosing airplane health
Although the airline industry is not very advanced in using unstructured data to target customers, Davis said, Etihad Airways in Abu Dhabi, where she worked as enterprise data governance manager, not only analyzed its own unstructured data to improve its business but also sells that capability as a service.
Airlines operate all day every day of the year, so equipment problems that interrupt operations can sap profits. Etihad used advanced analytics to monitor maintenance, repair, and operations data, including sensor data from aircraft, and predict potential problems so the company can take preventative measures, she said.
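As a rough illustration of how sensor data can point to trouble before it happens, the sketch below flags readings that stray far from the recent average. It is an assumption-laden toy, not Etihad’s system: the function name `flag_anomalies`, the window size, the threshold, and the temperature values are all made up for the example.

```python
from statistics import mean, stdev


def flag_anomalies(readings, window=5, threshold=3.0):
    """Return indexes of readings far from the mean of the preceding window."""
    flagged = []
    for i in range(window, len(readings)):
        history = readings[i - window:i]
        mu, sigma = mean(history), stdev(history)
        # Skip perfectly flat windows, where any deviation would look extreme.
        if sigma > 0 and abs(readings[i] - mu) > threshold * sigma:
            flagged.append(i)
    return flagged


# Made-up engine temperature readings (degrees Celsius) with one spike.
temps = [600, 602, 599, 601, 600, 603, 598, 640, 601, 600]
print(flag_anomalies(temps))
```

A production system would combine many sensors with maintenance and repair records, but the basic move is the same: compare new readings with recent history and raise a flag early enough to schedule preventative work.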
Etihad then created a new business unit to provide the service to other airlines. “It became a cash cow for Etihad,” she noted.
Changing the future
Davis expects more industries to learn how to leverage the power of unstructured data. NXN, her current employer, is creating a reference-model platform to gather and analyze data for smart cities, she said.
Such data would be valuable to governments, citizens, and businesses. Landlords could better monitor and manage their properties and improve the quality of life for tenants by using information from social media, video cameras, and police reports, for example. And governments could use a combination of structured and unstructured data to improve cities and better engage with citizens.
“The integration of this data could be beyond tactical,” Davis said. “We are working toward turning data into strategic information you could use to literally change the future of a city.”