MIT Study: Only 5% of Enterprise AI Pilots Deliver ROI - The Prompting Gap Explains Why
Forbes published an article yesterday asking the question on every executive's mind: "Everybody is using AI. Why aren't businesses seeing meaningful returns?"
The answer, according to MIT research, is brutal: only 5% of enterprise AI pilots deliver measurable financial impact. The other 95% stall.
That's not a technology problem. It's the prompting gap.
The MIT Data Everyone's Ignoring
MIT's "The GenAI Divide: State of AI in Business 2025" report reveals something most companies don't want to admit: AI adoption is sky-high, but returns are invisible.
Employees are experimenting. Pilots are live. Vendors are embedded. From the outside, it looks like progress.
The results tell a different story.
According to Kene Anoliefo, founder of HEARD and former product leader at Spotify, Netflix, and Google, the pattern is clear: "Most companies think this is a technology problem. It isn't. It's a leadership and workflow problem."
Here's what she means. AI tools aren't failing because they're not smart enough. They're failing because companies are using them the same way they'd use a search engine: type a few words, hope for something useful.
That works for finding information. It doesn't work for generating it.
Failure Mode #1: AI Has No Context, So Teams Don't Trust It
Forbes quotes Anoliefo describing AI as it actually works: "AI is like a brilliant but clueless new hire. It can do great work, but only if it understands your company's context."
Most companies hand out AI access without onboarding it. No instruction manual. No training on product principles, brand guidelines, or decision logic. No examples of what good output looks like.
So the AI generates content that feels generic. Or subtly wrong. And once teams stop trusting the output, they stop using the tool.
Here's what that looks like in practice:
Generic Prompt: "Write a customer follow-up email"
What AI Returns: Bland, could-be-anyone content. "Thank you for your interest. Please let us know if you have questions." Zero personality. Zero context. Zero understanding of your customer, your product, or what actually happened in the conversation.
Your team reads it, realizes it's useless, and goes back to writing emails manually. AI adoption stays high on paper. Actual usage quietly drops to zero.
Expert Prompt: "WHO: Account manager at B2B SaaS company, working with 8-month prospect who attended our webinar last week and asked detailed questions about enterprise security features during Q&A
WHAT: Prospect's CTO expressed concerns about data residency during the webinar chat, mentioned they're evaluating three vendors including us, decision timeline is end of Q1. They requested a follow-up call but haven't responded to calendar invite sent 3 days ago.
NEED: Re-engagement email that references their specific security questions from the webinar, offers to connect them with our security architect for a technical deep-dive, acknowledges the decision timeline without creating pressure, and provides two specific calendar options for next week."
What AI Returns: An email that sounds like you actually wrote it. It references the real conversation. It addresses the actual concern. It moves the deal forward.
Same AI. Completely different output quality.
The difference isn't the technology. It's the instructions.
Failure Mode #2: Companies Don't Know What to Prompt For
The second failure mode hits deeper. According to MIT's research, AI doesn't just automate tasks. It exposes unresolved questions leaders didn't realize they were avoiding.
What counts as "done" for this task? Who owns the decision if AI suggests something? What happens when the AI output conflicts with someone's judgment?
In most companies, work runs on tacit knowledge: informal rules, historical exceptions, judgment calls that never make it into documentation. Humans navigate this ambiguity instinctively.
AI cannot.
So when teams try to use AI, they realize they don't actually know what they want it to do. The prompt "write a project status update" seems clear until you try to explain what information should be included, what tone matches your company culture, and what level of detail your VP actually reads.
Most teams give up at this point. They decide AI "doesn't understand our business" when the real problem is they haven't articulated what good output looks like.
This is the prompting gap: the space between having AI tools and knowing how to instruct them effectively.
Failure Mode #3: Generic Outputs Kill Adoption
MIT found that more than half of generative AI budgets flow into sales and marketing, even though the strongest returns appear in operations, compliance, finance, and support.
The reason is predictable. Front-office AI is easy to showcase. Back-office AI quietly saves money.
But here's what's actually happening: marketing teams use AI to generate content, realize it sounds like every other piece of AI-generated marketing content, and either heavily edit it or abandon it entirely.
The excitement follows visibility. The returns follow fundamentals.
And the fundamentals are this: AI generates useful output when you give it useful context. Most people don't.
Forbes quotes Anoliefo's prescription: "Build a context library: the documents you'd share with a new hire, summarized in plain language, stored somewhere accessible, and attached when teams brainstorm or draft with AI."
That's exactly right. But most companies won't do it because it requires the work they've been avoiding: documenting how decisions actually get made.
Failure Mode #4: AI Is Kept Away From Real Work
The few companies seeing results embed AI directly into decisions that already affect outcomes: pricing, risk review, customer resolution, compliance triage.
Most enterprises do the opposite. They spread AI thin across dozens of pilots without changing how any real decisions are made.
Why? Because integrating AI into real workflows requires answering hard questions about roles, accountability, and quality standards.
It's easier to run pilots that look impressive in board meetings but don't change how anyone actually works.
What the 5% Do Differently
The companies crossing the divide don't deploy more AI. They deploy it with discipline.
They start with one workflow tied to a measurable outcome. They redesign roles so AI removes work instead of adding review layers. And they judge success in financial terms, not dashboards or usage metrics.
According to MIT researcher Aditya Challapally: "The gap isn't between companies with AI and companies without it. It's between those who integrate it into how work actually happens, and those who don't."
Here's what that integration actually looks like:
Not Integration: "Everyone has access to ChatGPT. Use it however you want." Result: 95% of pilots stall.
Integration: "Customer success uses this specific prompt structure for escalation responses. We measure response time, customer satisfaction, and resolution rate. Here's the training, here's the template, here's how we evaluate quality." Result: Measurable financial impact.
The difference isn't access. It's structure.
The Prompting Gap No One's Naming
Forbes, MIT, and Anoliefo are all describing the same fundamental problem without naming it directly: the prompting gap.
It's the space between widespread AI adoption and the ability to use those tools effectively. Between access and execution. Between having AI and knowing how to instruct it.
The article talks about "context libraries" and "onboarding AI" and "clarifying decision logic." All of that is correct. All of that is necessary.
But none of it addresses what happens when someone sits down to actually use ChatGPT, Claude, or Gemini for real work.
They know they need an email, a report, a project plan, a customer response. They don't know how to translate that need into instructions an AI can execute.
So they type something generic:
"Write a sales email"
"Create a project timeline"
"Draft a performance review"
And they get something generic back. Something that requires so much editing they might as well have written it themselves.
The 5% of companies seeing returns aren't doing something magical. They're bridging the prompting gap: teaching people how to give AI the context it needs to produce useful output.
Why This Isn't a Training Problem
The natural response to "people don't know how to prompt" is "we need training."
That won't work.
Training teaches concepts. What people need is structure. They need a framework that works in the moment, when they're trying to get something done, not when they have time to think about prompt engineering theory.
The WHO + WHAT + NEED framework does exactly that:
WHO = your role, your audience, your relationship context
WHAT = the specific situation, complete with relevant details
NEED = the exact output you want, with tone, length, and format specifications
This isn't a course. It's a system. You use it every time you need AI to do something useful.
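For teams that want to make that system repeatable, here is a minimal sketch of what a WHO + WHAT + NEED prompt template could look like in Python. The function name, field names, and example text are illustrative assumptions, not part of any published framework specification.

```python
# Minimal sketch of a WHO + WHAT + NEED prompt template.
# Names and example content are hypothetical, for illustration only.

def build_prompt(who: str, what: str, need: str) -> str:
    """Assemble a structured prompt from role/audience, situation, and output spec."""
    return (
        f"WHO: {who}\n\n"
        f"WHAT: {what}\n\n"
        f"NEED: {need}"
    )

# Example usage, mirroring the follow-up email scenario above:
prompt = build_prompt(
    who="Account manager at a B2B SaaS company, writing to an 8-month prospect "
        "who attended last week's webinar and asked about enterprise security.",
    what="The prospect's CTO raised data residency concerns, they are evaluating "
         "three vendors, and they have not responded to a calendar invite sent 3 days ago.",
    need="A re-engagement email that references their security questions, offers a "
         "call with our security architect, and proposes two calendar slots for next week.",
)
print(prompt)
```

The point of the sketch isn't the code. It's that the structure forces you to supply the context before you hit send, every time.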
The 95% Are Using AI Wrong Right Now
While you're reading this, thousands of employees at companies that "adopted AI" are typing generic prompts into ChatGPT and getting mediocre results.
They're writing "help me respond to this customer complaint" instead of explaining who the customer is, what they're upset about, what resolution authority the employee has, and what tone matches the company's brand.
They're asking AI to "write a project update" without specifying what information stakeholders actually need, what decisions are pending, or what format their VP prefers.
They're requesting "a marketing email" without describing the audience, the offer, the desired action, or the relationship context.
Every single one of those prompts will produce output that feels wrong. And every time that happens, trust in AI drops a little more.
That's the prompting gap. And that's why 95% of pilots stall.
Not because the AI isn't smart enough. Because the instructions aren't clear enough.
What Actually Closes the Gap
Forbes is right: AI needs context. MIT is right: integration requires discipline. Anoliefo is right: you need structure.
But none of that happens without changing how people prompt.
You can build context libraries. You can redesign workflows. You can clarify decision logic. And teams will still type "write an email" and wonder why the output is terrible.
The prompting gap exists because most people treat AI like a search engine when it works like a brilliant but clueless new hire. They give one-sentence instructions when the AI needs complete context. They expect useful output from generic input.
The companies in the 5% figured this out. They built systems that make effective prompting automatic instead of optional.
The other 95% are still hoping better AI will solve the problem.
The Question Forbes Asked
"Everybody is using AI. Why aren't businesses seeing meaningful returns?"
The answer is simpler than MIT's four failure modes, clearer than Anoliefo's context libraries, and more actionable than enterprise workflow redesign:
The prompting gap.
Most people don't know how to give AI the instructions it needs to produce useful output. They're using powerful tools with insufficient context, unclear objectives, and generic structure.
The technology works. The instructions don't.
Every executive reading that Forbes article is wondering why their AI investment isn't paying off. The answer is sitting in their employees' ChatGPT history: thousands of generic prompts producing thousands of mediocre outputs.
You can fix the workflows. You can build the context libraries. You can redesign the roles.
Or you can close the prompting gap by teaching people how to structure their instructions so the AI they're already using actually produces results worth using.
ROCKETS transforms basic prompts into expert instructions. Not through training. Not through documentation. Through a framework that turns "write an email" into a structured prompt that actually produces useful output.
That's the difference between the 5% and the 95%.