
The Human Biases AI Is Exposing

  • Writer: Lynda Elliott
  • Dec 21, 2025
  • 7 min read

Updated: Jan 15


There is a particular hot, stinking pile of embarrassment that AI has a habit of dropping people into.


Before AI, most mistakes at work were small and human-sized: you missed a detail or CC'd in the wrong James. Occasionally, you misspelt someone’s name.


At its worst, you may have sent a WhatsApp to your colleague, saying you wouldn’t be at work that day, and autocorrect decided to change your groggy message to: “I’m not coming in today. I’m feeling dick”.


Embarrassing, sometimes gut-wrenchingly so… but recognisably human. The sort of thing you could plausibly blame on tiredness, context, a bad Tuesday - or autocorrect.


AI enters the room and, boy oh boy, the landscape shifts to a whole new level of 21st-century angst.


Unless you specifically constrain it to available, verifiable evidence, AI confidently invents some without even blinking. No verified source? No problemo, let’s imagine one into being! And when that happens, the embarrassment isn’t really about being wrong. It’s about how you were wrong.


Because what gets exposed isn’t a lapse in attention, it’s the moment when you handed part of your thinking over to a confident assistant and trusted it to run the shop.

Most people using AI at work already know this, even if they don’t own up to it. Jakob Nielsen’s recent research shows that 69% of professionals use AI by stealth. They paste AI text in, then hide the evidence. Strip out those pesky em dashes. Tweak the rhythm, wiggle a few phrases into place until it creaks like human effort. A final grooming of the ghost in the machine before sharing their work. All good so far.


When AI makes a small faux pas, it’s irritating. But when it makes a big one (inventing research, misrepresenting sources, producing things that never existed) it doesn’t just fail.


It outs you.


Which brings us, inevitably, to Deloitte.



The moment private AI use became very public


In late 2024, Deloitte was commissioned by the Australian government’s welfare department to review a major IT system.


Let’s be clear. This wasn’t a blog post or an internal draft. It was a formal, government-commissioned report intended to inform public policy. The kind of document that’s supposed to survive contact with scrutiny: marinated in process, sign-off, and institutional reassurance. And it didn’t come cheap. It was published in July 2025.


At first, nothing much happened. It sounded authoritative, and did what these reports are designed to do: exist quietly and not cause alarm.


Then, in October, it hit the news.


Reviewers had started digging into the references and discovered something surprising: court judgments that didn’t exist. Academic papers that couldn’t be found. Citations that looked perfectly respectable until someone tried to follow them.

Er… awkward.


Deloitte corrected the report, acknowledged that generative AI had been used in parts of the document, and partially refunded the contract.


Embarrassing? Certainly. Public? Painfully so.


On its own, this could have been written off as an isolated failure, an early AI growing pain.


But in May 2025, Deloitte had also published a separate, government-commissioned report in Canada, which focused on healthcare workforce planning. Once more, serious subject matter with serious impacts. This is the kind of work that’s supposed to be robust by design.


In November, just weeks after the Australian story had finished doing the rounds, journalists uncovered similar issues in the Canadian report.


A different report, a different country, but the same failure mode. And uncomfortably close together.


Ouch.


What makes this sequence interesting isn’t that mistakes happened. It’s that the same kind of mistake happened twice, inside environments explicitly designed to prevent exactly this type of outcome.

These were high-stakes, public artefacts produced inside organisations whose entire value proposition is rigour. The failures didn’t happen because AI was casually or recklessly deployed.


And yet the same errors slipped through, which points to something quite revealing. To the people doing the work, it probably didn’t feel like a failure. It felt finished.



Why this didn’t feel reckless at the time


At this point, it’s tempting to reach for explanations like incompetence, negligence, or corner-cutting. Those explanations are emotionally satisfying. But if the answer were simply “people didn’t care enough”, we wouldn’t see this pattern repeat in places staffed by highly competent, well-meaning professionals.


A more uneasy explanation is that, at the moment the work was being written, reviewed, and signed off, it didn’t feel reckless.


Most professional work doesn’t happen in calm, spacious conditions where someone sits back and asks, “Have I interrogated every assumption here?” It happens between meetings, under time pressure, with competing priorities, across multiple team members, and a steady background hum of other things that also need attention.


In those conditions, nobody consciously decides to lower their standards. What happens instead is unconscious: we look for signals that tell us it’s safe to move on.


Does it read clearly? Does it sound confident? Does it look like the kind of thing that usually passes review?


When the answer is yes, the work progresses. Not because anyone is careless, but because this is how human cognition manages effort under load. And crucially, that trust didn’t come out of nowhere.


Automation bias trained us to trust machines long before generative AI arrived: calculators, spellcheck, GPS. Systems that were narrow, predictable, and reliably correct. Over time, that trust became a reflex.

Then generative AI folded in a new dimension - the fluency heuristic. Smooth language, confident structure; the feeling of completed thought. Fluency is a powerful cue. It doesn’t just make text pleasant and polished. It tells the brain that further effort is unnecessary - that the output can be trusted.


Put the two together and you get something subtle and dangerous: a fluent output that feels trustworthy and finished, even when it isn’t.



The brain isn’t built for rigour. It’s built for efficiency.


There’s a tacit belief that expertise makes us more thorough. In reality, it often makes us faster.


The human brain evolved under constraints that rewarded speed over perfection: limited energy, limited time, incomplete information. In that environment, a decision that’s good enough and timely beats one that’s theoretically optimal but arrives too late.


This ancient optimisation strategy is still running. Cognitive scientists call it the Principle of Least Effort; anthropologists call it Economy of Effort: the human tendency to conserve mental energy by relying on shortcuts that usually work well enough.


These shortcuts aren’t flaws, but adaptations. They allow us to function in complex environments without grinding to a halt.


The problem is that modern AI systems plug directly into those shortcuts.



Why AI output slips past our internal alarms


AI doesn’t just save time. It changes how effort feels.


When you ask a language model for help, it doesn’t hand you fragments or half-formed thoughts. It hands you something that looks complete: proper sentences, logical flow, a beginning, a middle, an end. References sitting politely where references are supposed to sit.


And that matters, because the human brain is not a validation engine. It’s a filtering system shaped by prediction, deciding what deserves attention, and what can be safely ignored. We scan for clues and mental breadcrumbs that tell us we're on the “successful” path.


The “AI can make mistakes” disclaimer is technically present, but suffers from classic banner blindness: it sits outside the interaction and gets filtered out, while the coherent output triggers a fluency override that the brain weights far more heavily in the moment.

There’s another force at work here - humans have a deep aversion to uncertainty. Not knowing is disquieting. It keeps the mind open, unsettled, and working. Certainty, by contrast, feels like relief - even when it’s false. A confident answer closes the loop.


When uncertainty disappears, so does the impulse to check.


Verification (checking sources, interrogating claims, tracing things back) is a different cognitive mode entirely. It’s slower, and it burns energy. Unless something actively forces us into it, we tend not to go there. Rigour is expensive.


So when an AI-assisted document looks tidy and confident, a quiet signal fires: this has already been thought through.


It hasn’t, of course. What’s happened is pattern completion by the AI, not understanding. But the brain doesn’t experience that distinction viscerally. It experiences fluency.


And fluency has always been a reliable cue, until now.



When authority makes things worse, not better


There’s another layer here, too. The Deloitte reports didn’t just come out of “an AI”. They came wrapped in institutional credibility: a prestigious firm, formal processes, professional formatting; the implicit assurance that someone, somewhere, had already done the hard work.


In complex environments, authority doesn’t belong to just one person. It’s spread across teams, processes, and approvals. Each round of eyeballing communicates the same message: this has already been handled.


As those signals compound, scrutiny fades. By the time a document reaches its final form, responsibility is diffuse, and checking no longer feels like diligence. It can feel more like overkill.

So for the Deloitte team, it’s likely nothing stood out as reckless or lazy. All the right steps appeared to have been taken. Authority acted like a watermark everyone assumed was present. And this is precisely why these failures are so unsettling when they surface.



This isn’t a mistake. It’s a collision.


What we’re seeing is the collision between three things that usually work well on their own:


  1. Human cognitive efficiency, which defaults to “good enough” under load

  2. AI systems that produce outputs that look finished and authoritative

  3. Organisational incentives that reward speed, scale, and visible progress more reliably than slow, quiet verification


When those line up, errors don’t announce themselves; they become camouflaged.


The work looks done. It reads cleanly and sounds confident. The seams are smooth and the joins don’t show. Nothing jars, nothing trips an alarm. So the brain stays in progress mode, not verification mode.


These documents pass review not because nobody is responsible, but because responsibility is being exercised through surface signals that suggest the thinking has already been done. Somewhere, by someone else.



Why the embarrassment cuts so deep


Which brings us back to that pile of hot, stinking embarrassment.


The discomfort people feel when AI-assisted work goes wrong isn’t just about being incorrect. It’s about exposure. It's about being seen to have outsourced something that sits too close to judgement, synthesis, or sense-making. These are the very things we still like to believe are distinctly ours.


Perhaps this is why so many people use AI behind the scenes, why they smooth over the tells and smudge the AI polish. Not because the output is useless, but because being seen to rely on it carries social risk. It reveals the shortcut. And shortcuts, once visible, are hard to defend.



Expect more of this, not less


None of this means AI shouldn’t be used, but it does mean we should stop assuming these failures indicate human incompetence.


AI doesn’t create a new human flaw, but it does expose an optimisation strategy that has served us well for a very long time, now misfiring in a new cognitive environment.


Left to default cognition, economy of effort wins.

Slowing down to interrogate the superficially fluent output of probabilistic text generators is not something our brains naturally want to do.


This behaviour has to be learned, supported, and designed for - because nowadays, our shortcuts can have unexpected consequences.



