Why your AI agreeing with you might be a problem.

Most of the conversation about AI risk in an enterprise context is about accuracy. Is the model hallucinating? Did it get the facts right? These are real concerns that get a lot of attention. We’ve written about some of the reasons why.

But there's another kind of risk that's gotten a lot less attention. And recent research suggests it might be the bigger one.

AI has been fine-tuned to be extremely pleasant to work with. Who doesn’t like to be told that every idea they have is brilliant? The problem is, AI agrees with you even when you're wrong. And it does this convincingly enough that even experienced practitioners begin to believe the echo chamber.

Three studies published in the last year tackle this from different angles. An MIT CSAIL team provided a mathematical proof that sycophantic AI causes "delusional spiraling" in users. A Stanford team published peer-reviewed empirical evidence in Science confirming this happens across every major model. And an MIT Media Lab team used EEG brain imaging to measure what happens to cognitive engagement when people write with LLM assistance.

What’s interesting is that these findings apply even to highly experienced AI users who are fully aware that AI is designed to respond the way it does. The way these tools are trained to align with humans makes them systematically prone to telling you what you want to hear. And that tendency has measurable downstream effects on how people think, make decisions, and come to trust misleading answers.

What the research found

In February 2026, Kartik Chandra and colleagues at MIT CSAIL, the University of Washington, and MIT's Department of Brain and Cognitive Sciences published a paper called "Sycophantic Chatbots Cause Delusional Spiraling, Even in Ideal Bayesians."

It’s a mouthful. But the title tells you a lot. They didn't model gullible people. They modeled what they call an "ideal Bayesian," a hypothetical agent who updates beliefs perfectly rationally based on new evidence. In other words, an agent with zero cognitive bias and perfect logic.

That agent still ended up with high confidence in false beliefs after extended interaction with a sycophantic chatbot. And it did so every time they ran the simulation.

You share a thought. The AI agrees. You share a stronger version. It agrees harder. Your confidence climbs. You go further. It follows you. Each step feels rational because the AI is providing what looks like confirmatory evidence. But the direction of the conversation is being shaped by the AI's training objective, which is to produce responses that users rate positively. And users rate agreement positively.

The MIT team draws an analogy to "Bayesian persuasion" from behavioral economics. A strategic prosecutor can raise a judge's conviction rate even if the judge knows the prosecutor is cherry-picking evidence. The selective presentation of information changes the beliefs of even a perfectly rational observer.

Sycophantic AI does the same thing.
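To make the mechanism concrete, here is a toy simulation of our own (not the MIT team's model): a "factual sycophant" never lies, it just quietly drops evidence that cuts against the user's idea, and the user updates on what they see as if it were a fair sample. The probabilities and round count are made up for illustration, and note that the paper goes further, showing that spiraling persists even when the user accounts for the possibility of filtering. This sketch only shows the simpler dynamic.

```python
import random

# Toy sketch (not the MIT paper's formal model): a "factual sycophant"
# only relays evidence that supports the user's idea, and the user updates
# on what it sees as if it were an unfiltered sample.

TRUE_STATE_GOOD = False        # in reality, the user's plan is flawed
P_SUPPORT_IF_GOOD = 0.8        # chance evidence looks supportive if the plan is good
P_SUPPORT_IF_BAD = 0.3         # chance evidence looks supportive even though the plan is bad

def update(prior: float, supportive: bool) -> float:
    """Bayes' rule on a single piece of (reported) evidence."""
    like_good = P_SUPPORT_IF_GOOD if supportive else 1 - P_SUPPORT_IF_GOOD
    like_bad = P_SUPPORT_IF_BAD if supportive else 1 - P_SUPPORT_IF_BAD
    return like_good * prior / (like_good * prior + like_bad * (1 - prior))

def run(sycophantic: bool, rounds: int = 40, seed: int = 0) -> float:
    rng = random.Random(seed)
    belief = 0.5  # user starts undecided about "my plan is good"
    for _ in range(rounds):
        p = P_SUPPORT_IF_GOOD if TRUE_STATE_GOOD else P_SUPPORT_IF_BAD
        supportive = rng.random() < p        # what the evidence actually says
        if sycophantic and not supportive:
            continue                          # sycophant quietly drops disconfirming evidence
        belief = update(belief, supportive)   # user treats the report as a fair sample
    return belief

print(f"belief after honest reporting:      {run(sycophantic=False):.2f}")
print(f"belief after sycophantic filtering: {run(sycophantic=True):.2f}")
```

With honest reporting, the belief drifts toward the truth (here, near zero). With filtering, the same kind of true evidence drives confidence in the wrong conclusion toward certainty, even though nothing the user saw was false.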

One month after the MIT paper, Myra Cheng, Dan Jurafsky, and colleagues at Stanford published the empirical confirmation in Science. They tested 11 state-of-the-art models on 12,000 prompts and ran studies with 2,400 human participants.

Every model they tested was sycophantic. Compared to how humans respond to the same situations, the AI models affirmed users' positions about 50% more often.

In one particularly pointed test, they took 2,000 posts from Reddit's "Am I The Asshole" community, selecting only cases where the entire community agreed the poster was in the wrong. When they gave the same posts to ChatGPT, Claude, Gemini, and the other leading models, the AI sided with the poster 51% of the time.

They also tested prompts involving harmful actions. Manipulation, deception, illegal behavior. Across all 11 models, the AI endorsed the harmful behavior 47% of the time.

And here's the issue: the users couldn't tell. The Stanford study found participants rated sycophantic and non-sycophantic AI as equally "objective." The effect is invisible to the people experiencing it. And so users who interacted with sycophantic AI became more convinced they were in the right. Less willing to apologize or make amends. More likely to want to use that AI again.

In other words, people preferred the AI that was degrading their judgment. The feature that causes the harm is the same feature that drives the engagement.

The Stanford team calls this a "perverse incentive" loop. AI companies train models on human feedback. Users give higher ratings to responses they like. Users like agreement. So the training loop pushes toward more agreement. Which drives more engagement. Which generates more positive feedback.

What happens to your brain

MIT Media Lab recently reported the results of a study called "Your Brain on ChatGPT" that approaches the question from a neuroscience angle. After the 95% AI failure rate debacle around the MIT Nanda State of AI in Business report, Kosmyna et al. take great pains to point out the small sample size and other limitations. Still, their results are suggestive, and they provide some actual neural evidence of what might be happening.

54 participants were assigned to three groups. One group wrote essays using an LLM. One used a search engine. One used no tools at all. The researchers recorded brain activity using EEG throughout the writing process, and ran the study across multiple sessions over four months.

The results were unambiguous. Brain connectivity, the measure of how actively different regions of the brain are communicating during a task, scaled down systematically with the amount of external AI support. The brain-only group showed the strongest and widest-ranging neural networks. The search engine group showed intermediate engagement. The LLM group showed the weakest overall neural coupling.

Now, you'd expect less mental effort when a tool is doing some of the work. But the more interesting finding came in the fourth session. They swapped conditions. LLM users were asked to write without any tools. Brain-only users were given access to the LLM.

The LLM-to-brain participants showed weaker neural connectivity and under-engagement of alpha and beta networks compared to the brain-only group. In other words, their brains didn't just turn off while using the tool. They appeared to have trouble turning back on when the tool was removed. The LLM users also reported lower ownership of what they'd written and had difficulty recalling their own essay content minutes after finishing.

Why the obvious fixes don't work

“So just tell people the AI might be sycophantic, and make sure it doesn't hallucinate.” Problem solved, right?

Doesn’t work. The MIT CSAIL paper tested both of those mitigations explicitly.

First, they constrained the chatbot to only present true information. No hallucinations. A "factual sycophant" that uses something like retrieval-augmented generation to ground its responses in real data.

It still caused delusional spiraling. In fact, the factual sycophant was more effective at inducing spiraling than one that hallucinated, because selectively presented true information is harder to detect as biased than outright fabrication. The AI didn't have to make things up. It just had to choose which truths to emphasize.

Second, they informed users that the AI might be sycophantic. Gave them full knowledge of the chatbot's strategy. Again, informed users still spiraled. The rate was somewhat reduced, but the vulnerability remained.

Disclaimers, transparency labels, and "factual grounding" don’t appear to be very effective. So what will work?

What to do about it

The issue is that the tools are doing what they’re supposed to do. The training objective that makes LLMs helpful, the responsiveness to user feedback, is the same mechanism that makes them sycophantic. You can't fully separate the two. Which means the mitigation has to happen at the organizational level, not the model level.

So what do you do? We’ll be the first to admit that we don’t have all the answers. And as models become more and more capable, there’s a complexity issue: every time adjustments are made in one dimension, unintended behavior and results pop up in others. Still, we certainly don’t recommend not using AI. The productivity and insight gains are real, even if the tools’ eagerness to please is a little Machiavellian.

But we think all this means you need to be more deliberate about how your teams interact with these tools. Here are a few quick-and-dirty suggestions:

Classify your AI use cases by cognitive risk.

Not every AI task carries the same sycophancy risk. Data retrieval, formatting, code generation against a well-defined spec, scheduling. These are "AI for speed" tasks where the output is verifiable, and the human isn't asking the AI to think for them.

On the other hand, strategy recommendations, diagnostic reasoning, evaluating options, competitive analysis... these are "AI for judgment" tasks where the sycophancy risk concentrates.

Consider running a workshop with your teams. Have them list every AI-assisted workflow and tag each one as a speed task or judgment task. The judgment tasks get governance rules. The speed tasks get left alone.

Build disagreement into your AI workflows.

One major reason delusional spiraling loops work is that there's no pushback. But you can break that loop.

For example, before acting on any AI-assisted recommendation, require the user to run a counter-prompt in a completely different session. "What are the three strongest arguments against this conclusion?" or "Assume this plan failed completely in six months. What went wrong?" We’ve even seen some teams building this into their internal tooling as a required step.
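For teams that do want to wire this in, here is a minimal sketch of what a required counter-prompt step might look like. The `complete` callable, the dataclass fields, and the gate itself are illustrative assumptions, not a description of any particular product.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical sketch of a required counter-prompt gate in internal tooling.
# `complete` stands in for whatever LLM client your stack already uses.

COUNTER_PROMPTS = [
    "What are the three strongest arguments against this conclusion?",
    "Assume this plan failed completely in six months. What went wrong?",
]

@dataclass
class Recommendation:
    question: str
    ai_answer: str
    critiques: List[str] = field(default_factory=list)

def gather_critiques(rec: Recommendation, complete: Callable[[str], str]) -> Recommendation:
    """Run each counter-prompt in a fresh context so the critique isn't
    anchored on the original, agreeable conversation."""
    for counter in COUNTER_PROMPTS:
        prompt = f"{counter}\n\nConclusion under review:\n{rec.ai_answer}"
        rec.critiques.append(complete(prompt))  # ideally a brand-new session per call
    return rec

def ready_to_submit(rec: Recommendation) -> bool:
    # The workflow refuses to move forward until the critiques exist.
    return len(rec.critiques) >= len(COUNTER_PROMPTS)
```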

Another approach: for high-stakes decisions, run the same question through multiple models. The Stanford study showed all 11 models are sycophantic, but they're sycophantic in different ways. Comparing the outputs side by side will surface disagreements you wouldn't see with a single model.
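And a rough sketch of that cross-check. The model callables here are placeholders for whatever clients you already use; nothing in this snippet is a real vendor API.

```python
from typing import Callable, Dict

def cross_check(question: str, models: Dict[str, Callable[[str], str]]) -> Dict[str, str]:
    """Ask every model the same question and return the answers side by side,
    so a reviewer can look specifically for points of disagreement."""
    return {name: ask(question) for name, ask in models.items()}

# Example wiring; each lambda stands in for a real client call.
answers = cross_check(
    "Should we enter market X next quarter?",
    {
        "model_a": lambda q: "...",
        "model_b": lambda q: "...",
        "model_c": lambda q: "...",
    },
)
for name, answer in answers.items():
    print(f"--- {name} ---\n{answer}\n")
```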

Redesign review processes to catch AI-assisted overconfidence.

The Stanford study found that sycophantic AI made people more convinced they were right and less willing to change course. That's a specific problem for any organization with peer review or deal approval processes.

If someone used AI to build a recommendation, the reviewer shouldn't know which parts were AI-generated. That forces evaluation on the merits. For AI-assisted decisions above a certain dollar or risk threshold, require the presenter to explain the reasoning chain in their own words without looking at the AI output. If they can't articulate it, the AI did the thinking, not them.

In strategy sessions or deal reviews where AI was used to prep, consider assigning someone the explicit role of questioning the AI-informed conclusions.

Measure cognitive engagement.

These might be hard to do. But just as a thought exercise (a rough sketch of tracking the first two follows the list):

  • Track your error catch rate: are people finding and correcting mistakes in AI output? If the rate is near zero, that probably doesn't mean the AI is perfect.
  • Track how often team members reject or significantly revise AI recommendations. You want that number meaningfully above zero.
  • Do periodic recall tests. Ask someone who just completed an AI-assisted analysis to summarize the conclusions from memory. The MIT Media Lab data suggests this is a reasonable proxy for cognitive engagement.
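As a sketch only: the review log and its field names below are assumptions for illustration, but the point is that both numbers reduce to simple counts you could pull from whatever review tooling you already have.

```python
from dataclasses import dataclass
from typing import List

# Illustrative only: the "review log" and its fields are assumptions,
# not an existing system.

@dataclass
class ReviewRecord:
    used_ai: bool             # was AI used to produce the draft?
    errors_found: int         # mistakes caught in the AI output during review
    materially_revised: bool  # was the AI recommendation rejected or heavily reworked?

def error_catch_rate(log: List[ReviewRecord]) -> float:
    ai_reviews = [r for r in log if r.used_ai]
    with_catches = sum(1 for r in ai_reviews if r.errors_found > 0)
    return with_catches / len(ai_reviews) if ai_reviews else 0.0

def revision_rate(log: List[ReviewRecord]) -> float:
    ai_reviews = [r for r in log if r.used_ai]
    revised = sum(1 for r in ai_reviews if r.materially_revised)
    return revised / len(ai_reviews) if ai_reviews else 0.0

# If both numbers sit near zero quarter after quarter, the likelier explanation
# is that nobody is really checking, not that the AI is never wrong.
```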

Add training around sycophancy to your change management processes.

The MIT Media Lab study showed cognitive effects within weeks of regular use. AI usage guidelines should cover not just how to use the tools but when to use them, when not to, and what to watch for.

Your people aren’t stupid. But they need to be aware.

None of this is to suggest your team is dumb or easily manipulated. MIT's whole point is that perfectly rational people are vulnerable. This is a structural problem built into how these models are trained. AI tools are trained to agree with you.

Implement AI with your eyes wide open about how it affects human cognition and decision quality.
