7 lessons to ensure successful machine learning projects
When Michelle K. Lee, ’88, SM ’89, was sworn in as the director of the U.S. Patent and Trademark Office in 2015, she saw an opportunity. The agency was behind on digital transformation and on adopting technologies like cloud computing and artificial intelligence, but it had mountains of data, including the more than 10 million patents the office has issued since opening in 1802 and the 600,000 patent applications it receives each year.
Lee led a project to modernize the agency with data and analytics, including AI solutions to improve patent searches and the speed and quality of patents issued. By gathering data about how patent examiners make decisions and identifying outlying behavior, the office could also pinpoint areas in which examiners would benefit from targeted training.
“If the U.S. Patent and Trademark Office, a 200-plus-year-old governmental agency, has a machine learning opportunity, so too does every organization,” Lee said during a presentation at EmTech Digital, hosted by MIT Technology Review. “The challenge is in identifying those opportunities, and having a team and plan to implement them.”
Lee, who is now the vice president of machine learning at Amazon Web Services and a full-term member of the MIT Corporation, said she’s seen businesses in a wide range of industries successfully using machine learning. She’s also seen some common stumbling blocks: businesses struggling to find the best use cases for machine learning, lacking easy access to their own data, or lacking the necessary technical talent and expertise.
Here are her insights on how to ensure successful machine learning projects:
1. Make sure you have easy access to necessary data — and a comprehensive data strategy
Successful machine learning solutions start with a strong data strategy. “Your machine learning model is only as good as the data it's trained on, and data is often cited as the number one challenge to adopting machine learning,” Lee said.
If there are problems with the data, machine learning scientists will end up spending their time doing data cleanup and management, or they’ll get frustrated because they don’t have the data they need, she said.
Companies should make sure they have the three hallmarks of a strong data strategy:
- Data is viewed as an organizational asset, not the property of individual departments that created or collected the data.
- Data is democratized, meaning it’s available easily, securely, and in compliance with legal and regulatory requirements. Many companies face hurdles in accessing data they already have, Lee said. Some of the most powerful machine learning models are drawn from a combination of disparate data sets.
- Data is put to work through analytics and machine learning to make better decisions, create efficiencies, and drive new innovations.
In addition, Lee suggested four questions to ask when beginning machine learning projects:
- What data is available to me today?
- What data is not quite available, but through modest effort could become available?
- What data don’t I have today, but I might want to have in six months or a year — and what steps can I take to begin gathering that data?
- Is there any potential bias in my data and data sources?
2. Carefully select machine learning use cases, and set success metrics
Businesses should start by defining their business problems, seeing which ones could be solved with machine learning, and outlining clear metrics to measure success, Lee said. Things to keep in mind include data readiness, business impact, and machine learning applicability. A high-impact business use case, without much data or machine learning applicability, will result in frustrated data scientists. A use case with lots of data and high machine learning applicability but low business impact probably won’t be adopted. Success metrics could include impact on revenue and efficiency.
“Every company has a machine learning opportunity,” Lee said. “Not every problem is solvable by machine learning.”
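The three screening criteria above can be made concrete with a simple scoring exercise. The sketch below is only an illustration, not a method Lee described: the 1-to-5 scale, the `UseCase` class, the `shortlist` function, the cutoff of 3, and the example candidates are all assumptions. It drops any use case that is weak on even one criterion, since the article notes that a gap in data, applicability, or business impact tends to sink a project, and then ranks what remains.

```python
from dataclasses import dataclass


@dataclass
class UseCase:
    """A candidate project scored on the three criteria (hypothetical 1-5 scale)."""
    name: str
    data_readiness: int    # 1 (low) to 5 (high)
    business_impact: int   # 1 (low) to 5 (high)
    ml_applicability: int  # 1 (low) to 5 (high)

    def weakest(self) -> int:
        # The lowest of the three scores; one weak criterion is enough to stall a project.
        return min(self.data_readiness, self.business_impact, self.ml_applicability)

    def total(self) -> int:
        return self.data_readiness + self.business_impact + self.ml_applicability


def shortlist(use_cases, minimum=3):
    """Keep use cases that meet the minimum on every criterion, strongest first."""
    viable = [u for u in use_cases if u.weakest() >= minimum]
    return sorted(viable, key=UseCase.total, reverse=True)


# Made-up candidates mirroring the situations described in the text.
candidates = [
    UseCase("high impact, little data", 2, 5, 2),
    UseCase("lots of data, low impact", 5, 1, 5),
    UseCase("balanced candidate", 4, 4, 4),
    UseCase("strong candidate", 5, 4, 5),
]

for use_case in shortlist(candidates):
    print(use_case.name, use_case.total())
```

In practice the scores would come from a discussion among technical and business stakeholders, and the success metrics for each shortlisted project, such as impact on revenue or efficiency, would still need to be set separately.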
3. Make sure technical experts and domain experts work side by side
The right team is critical both to choosing the right use case for machine learning and to making sure the project is successfully implemented. Involving domain experts alongside technical experts also helps close cultural gaps, Lee said: If relevant stakeholders are part of the entire process, everyone is more likely to accept, adopt, and implement the solution. When data scientists work in silos, the machine learning models they create are rarely implemented.
4. Ensure executive sponsorship and a culture of experimentation
“Top level guidance and prioritization is really critical,” Lee said — if she hadn’t led the digital transformation project at the U.S. Patent and Trademark Office as the organization’s top leader, it wouldn’t have succeeded. At Amazon, she said, every business leader within the organization was asked about 10 years ago how they planned to leverage machine learning, which forced everyone to work together and answer the question.
The right culture is also important. “Machine learning can be hard and it takes time,” Lee said. If an organization treats failure as something to be avoided at all costs, and not as a learning experience, that will be a barrier. Organizations instead need to take a longer-term view, understanding that models often don't work right away.
5. Assess and address any skills gaps
Many organizations don’t have all the data scientists they need, and may not be able to find or afford those employees. Companies should focus on training the existing workforce in addition to hiring. For example, Amazon developed a Machine Learning University program to train engineers (the content is available for free), and courses are also available through online platforms like Coursera.
Business leaders should also undergo training so they can start looking at business opportunities through a machine learning lens, Lee said.
6. Free your team from unnecessary heavy lifting and invest in the right infrastructure
Companies should focus on creating a machine learning infrastructure that works for employees regardless of their knowledge level, Lee said, and take advantage of existing tools instead of reinventing the wheel. Employees should be freed from undifferentiated heavy lifting — that is, hard work that doesn’t necessarily add value. For example, Amazon Web Services offers suites of machine learning tools that can be used by companies and employees at different levels of knowledge.
“Focus your team's efforts where they can add the greatest value, often in the business domain,” she said.
7. Plan for the long term
Adopting machine learning is not a “one and done” project, Lee said. Machine learning models need to be updated, retrained, and maintained as data changes. Companies shouldn’t think about implementing everything at once — instead start with a small project, show results, get buy-in, and work toward broader goals.
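One way to act on the point that models need retraining as data changes is to compare recent input data against the data the model was trained on. The following is a minimal sketch under assumptions the article does not make: it tracks a single numeric feature, and the `needs_retraining` function, its threshold of 2.0 standard deviations, and the sample numbers are all hypothetical. Real monitoring would typically cover many features and the model's accuracy as well.

```python
from statistics import mean, stdev


def needs_retraining(training_values, recent_values, threshold=2.0):
    """Flag retraining when the recent average drifts far from the training average.

    The shift is measured in units of the training data's standard deviation.
    training_values needs at least two data points for stdev() to work.
    """
    baseline_mean = mean(training_values)
    baseline_std = stdev(training_values)
    recent_mean = mean(recent_values)
    if baseline_std == 0:
        # Training data had no spread, so any change at all counts as drift.
        return recent_mean != baseline_mean
    shift = abs(recent_mean - baseline_mean) / baseline_std
    return shift > threshold


# Hypothetical values of one input feature at training time and in two later periods.
training = [10, 11, 9, 10, 12, 8]
print(needs_retraining(training, [10, 11, 9]))
print(needs_retraining(training, [15, 16, 14]))
```

A check like this can run on a schedule, so that retraining becomes a routine maintenance step and not a one-time event.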
Leaders should also set the right expectations — and get started right away. “It's a long-term investment, and be tolerant of initial under-performance because it can produce better results for you in the end,” Lee said. “Given that machine learning requires data gathering, data cleansing, and prep, given that machine learning requires experimentation and a longer-term horizon … the time to get started is now.”