Machine learning can help streamline your business, but without the right approach and implementation, it could end up a waste of your time and resources.
That’s where Feature Labs, a Boston-based company spun out of MIT’s Computer Science and Artificial Intelligence Lab, hopes to help. Feature Labs builds software to make machine learning easier to use.
The company recently released a white paper, Machine Learning 2.0 for the Uninitiated, highlighting six steps a company can take to build and deploy a machine learning solution.
“The reason we called it ‘2.0’ is we wanted to position it as starkly different than what a lot of companies are doing today,” said Feature Labs CEO and cofounder Max Kanter. “The way most of them are approaching machine learning is very much as exploratory projects, research papers; it’s very much proof of concept. What ends up happening is the project they do might get some initial results, but they don’t transition to deploying and operationalizing the model.”
The white paper, Kanter said, serves as a blueprint that ranges from organizing raw data to defining a prediction problem to validating and deploying the machine learning model. Machine learning algorithms need information to be formatted in a particular way: a single table of data with one historical training example per row and explanatory variables, or features, as columns.
The algorithm’s job is to use all of those features to predict outcomes, Kanter said. For example, a very simple feature for credit card fraud could be the purchase amount or time of the transaction. Anything above a certain amount, or beyond a certain time, would be considered fraud.
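To make the table format and the simple rule above concrete, here is a minimal Python sketch. The column names, sample rows, and thresholds are invented for illustration and do not come from Feature Labs or its white paper.

```python
# Illustrative only: column names, rows, and thresholds are made up.
FEATURES = ["amount_usd", "hour_of_day"]

# One row per historical transaction; the last value is the known outcome.
rows = [
    (12.50, 14, False),
    (980.00, 10, True),
    (45.10, 19, False),
    (30.00, 23, True),
]

X = [row[:2] for row in rows]  # feature columns
y = [row[2] for row in rows]   # label column: was the charge fraud?


def naive_rule(amount_usd, hour_of_day, max_amount=500.0, latest_hour=22):
    """Flag anything above a set amount or after a set hour (assumed cutoffs)."""
    return amount_usd > max_amount or hour_of_day > latest_hour


predictions = [naive_rule(*features) for features in X]
```

A trained model would play the role of `naive_rule` here, learning its cutoffs from `X` and `y` rather than having them written by hand.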
There are more specific and accurate signals for predicting fraud, like the currency used for a purchase, the country where a charge is made, and the frequency of purchases. But each of those values must be brainstormed and then added to the dataset for the algorithm.
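As a rough sketch of what adding those values to the dataset can involve, the Python below derives currency, country, and purchase-frequency features from a raw transaction log. The field names, the sample transactions, the `HOME_COUNTRY` lookup, the assumed home currency, and the one-hour window are all hypothetical choices made for this example, not details from Feature Labs.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical raw transaction log; field names and values are invented.
transactions = [
    {"card": "A", "time": datetime(2018, 3, 1, 9, 0), "amount": 20.0,
     "currency": "USD", "country": "US"},
    {"card": "A", "time": datetime(2018, 3, 1, 9, 5), "amount": 950.0,
     "currency": "EUR", "country": "FR"},
    {"card": "A", "time": datetime(2018, 3, 1, 9, 7), "amount": 900.0,
     "currency": "EUR", "country": "FR"},
    {"card": "B", "time": datetime(2018, 3, 1, 12, 0), "amount": 35.0,
     "currency": "USD", "country": "US"},
]

HOME_COUNTRY = {"A": "US", "B": "US"}  # assumed per-card lookup
HOME_CURRENCY = "USD"                  # assumed for every card


def build_feature_rows(transactions, window=timedelta(hours=1)):
    """Turn raw transactions into one feature row per transaction."""
    history = defaultdict(list)  # earlier transactions, grouped by card
    rows = []
    for tx in sorted(transactions, key=lambda t: t["time"]):
        prior = history[tx["card"]]
        # Earlier purchases on the same card within the time window
        recent = [p for p in prior if tx["time"] - p["time"] <= window]
        rows.append({
            "amount": tx["amount"],
            "is_foreign_currency": tx["currency"] != HOME_CURRENCY,
            "is_foreign_country": tx["country"] != HOME_COUNTRY[tx["card"]],
            "purchases_last_hour": len(recent),
        })
        prior.append(tx)
    return rows


feature_rows = build_feature_rows(transactions)
```

Each new idea for a feature means another hand-written entry like the ones in `rows.append(...)`, which is the manual effort the article describes.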
The challenge then, Kanter said, is hiring people with machine learning expertise, who understand what makes a good feature, or hiring people who know how to write the code to extract those features. Without the right features, machine learning algorithms won’t have the information they need to work well.
The idea for automating the feature engineering process came from Kanter and fellow cofounders Ben Schreck, BS ’15 and MS ’16, and Kalyan Veeramachaneni, a principal research scientist and director of the Data to AI Lab at MIT, after reflecting on their own time as data scientists.
“What we’re trying to do is automate that process, so we can automatically recommend to our customers what features they should be using,” Kanter said. “Help them streamline the process without spending a lot of time manually doing it.”
Kanter said Feature Labs’ clients run the gamut, from companies like Accenture that are using software to build project managers powered by artificial intelligence to large international banks working to protect against credit card fraud.
Millie Liu, MFin ’12, founder and managing partner at investment firm First Star, said she was drawn to Feature Labs for several reasons, particularly the company’s potential to fulfill a need and the team’s success in turning corporate sponsorship of its research at MIT into actual customers.
“We see that machine learning has a really broad impact in our industry,” said Liu, who is an advisor for MIT CSAIL. “Not only the Facebooks and Googles, but really no matter if you’re a big bank, an investor, or a Procter & Gamble, almost any industry out there generates a huge amount of data.”
The problem, Liu said, is that unless you’re a Google or Facebook, you can’t really afford, or don’t have access to, those machine learning experts.
“That’s where we see a gap,” Liu said. “There is a huge gap in resources and talents, and that’s what Feature Labs technology is solving. We’re really excited about that.”
Feature Labs is also funded by the Defense Advanced Research Projects Agency’s Data-Driven Discovery of Models [D3M] program.
The company also has an open source library called Featuretools, which is designed for data scientists to experiment with its technology.
“We’re trying to make it easier for people to do machine learning,” Kanter said. “Not only people who are doing feature engineering today, but everyone who wants to get involved in machine learning but doesn’t have the skills yet to do it.”