A new study on the impact of generative AI on highly skilled workers finds that when artificial intelligence is used within the boundary of its capabilities, it can improve a worker’s performance by as much as 40% compared with workers who don’t use it.
But when AI is used outside that boundary to complete a task, worker performance drops by an average of 19 percentage points.
The findings have implications not only for worker productivity but for organizations looking to successfully navigate what the researchers have termed the “jagged technological frontier” of AI — specifically, generative pretrained transformers (GPT), which produce text after being given prompts. Understanding the upper limits of AI’s abilities is imperative, particularly as those abilities continue to expand, the researchers write.
It’s important for managers to maintain awareness of this jagged frontier, said Harvard Business School’s Fabrizio Dell’Acqua, the lead author of the paper, because the researchers found that it was not obvious to highly skilled knowledge workers which of their everyday tasks could easily be performed by AI and which tasks would require a different approach.
A multidisciplinary team of researchers authored the paper, including Karim Lakhani and Edward McFowland III from Harvard Business School; Ethan Mollick from The Wharton School; Hila Lifshitz-Assaf from Warwick Business School; and MIT Sloan’s
Here’s a closer look at the study, along with steps that Dell’Acqua, Kellogg, and Lifshitz-Assaf suggest organizations take when introducing generative AI to a highly skilled workforce.
For best results, use cognitive effort and experts’ judgment
The study was conducted in collaboration with Boston Consulting Group and involved more than 700 consultants who were assigned a skills assessment task and an experimental task.
The participants were sorted into two groups: One group was given a task designed to fall within GPT-4’s capabilities of producing humanlike outputs, while the other group received a task “designed so that GPT-4 would make an error when conducting the analysis, ensuring the work fell just outside the frontier,” according to the paper.
Within those two groups, study participants were sorted into three conditions: no access to AI, access to GPT-4 AI, and access to GPT-4 AI with an overview of how to use GPT.
The “inside the frontier” group was asked to imagine that they worked for a shoe company and that their manager had asked them to come up with a new product and present it at a meeting. Participants in this group were also instructed to complete several other actions, including coming up with a list of steps from pitch to launch, creating a marketing slogan, and writing a 2,500-word article describing the end-to-end process for developing the shoe and lessons learned.
AI had a positive effect on that group of participants: The GPT-only participants saw a 38% increase in performance compared with the control condition (no access to AI), while those who were provided with both GPT and an overview saw a 42.5% increase.
Interestingly, the researchers observed a bigger jump in performance scores for the participants in the lower half of assessed skills who used GPT-4 compared with those in the top half of assessed skills — at 43% and 17%, respectively — when they were compared with their baseline scores (i.e., no AI use).
The “outside the frontier” group was asked to imagine that they worked for a company with three brands. They were tasked with writing a 500-to-750-word memo to their CEO explaining which of the brands the CEO should invest in to drive revenue, and with suggesting innovative actions the CEO could take to improve the selected brand. The memo needed to include the rationale for their recommendation, and group participants were provided with interview comments and financial data from which they could draw. Additionally, participants were graded on the “correctness” of their recommendation — their reasoning or justification — by human evaluators.
AI had a negative effect on participants in this group. The GPT-only condition saw a decrease in performance of 13 percentage points compared with the control condition, while participants who had been provided with GPT and an overview showed a decrease of 24 percentage points compared with the control condition.
Dell’Acqua said that for the “outside the frontier” group, the researchers observed a performance decrease because people would “kind of switch off their brains and follow what AI recommends,” which was more likely to be incorrect. However, even when an incorrect recommendation was made under one of the AI conditions, the quality of the participant’s recommendation justification improved.
Taken together, the improvement in justification quality and the decrease in performance indicate that, rather than blindly adopting AI outputs, highly skilled workers need to continue to validate AI and exert “cognitive effort and experts’ judgement when working with AI,” the researchers write.
Interface design, onboarding, role reconfiguration, and a culture of accountability
According to the researchers, there are several things organizations and managers should consider as they integrate AI into their employee workflows.
While it’s tempting to use AI for knowledge work because it is fast, can boost rapid idea generation, and produces persuasive text, managers and professionals need to be cautious when using it for important tasks, Lifshitz-Assaf said.
Because some AI-generated answers look credible even when they’re incorrect, “that suggests there’s a role for internal or ‘wrapper’ developers to help design the interface in a way that makes it less likely that people fall into some of these traps,” Kellogg said.
Developers can also help with figuring out where AI can be inserted into workflows and how to design technology for doing that.
Kellogg and Dell’Acqua also recommended that organizations have an onboarding phase so workers can get a sense of how and where the AI works well and where it doesn’t and receive performance feedback. Relatedly, some people are very good at upskilling themselves, and those workers can be helpful as peer trainers. But they should be rewarded and recognized for their work.
Managers will also need to reconfigure roles.
“To use generative AI well, it’s important to investigate the specific tasks along the work process,” Lifshitz-Assaf said. “Some may be within the jagged frontier, and others outside.”
Kellogg said that leaders can encourage role reconfiguration by having people from different positions experiment together to find the most productive structure.
Leaders should also encourage a culture of accountability. In talking with study participants, Kellogg said, one suggestion was that “we need to teach people to be able to explain what they did without using the term ‘generative AI.’”
“Managers and workers need to collectively develop new expectations and work practices to ensure that any work done in collaboration with generative AI meets the values, goals, and standards of their key stakeholders,” Kellogg said.