WEEK 6 · AI FRIDAY · MAY 22, 2026

Confident
Wrong.

Why your assistant gets it wrong with a straight face — and the verification discipline that catches it before it ships.

Preston Magouirk · DC CAP Enterprise AI Leadership Pilot · Diligence in depth.

The setup

Hallucination is what the model does by design.
Verification is what you do by design.

Week 5 named the mechanics. Today we go deeper on one — Claude getting things confidently wrong — and build the stack newer users can run on every draft. Eight questions you've all asked once you've watched a hallucination land.

Why does Claude hallucinate?

Why does it sound so confident?

Where is it most likely wrong?

What is context overload?

Why do my early rules fade?

What manual checks should I run?

How do I ground in real sources?

How do I build verification gates?

You have all seen a confident wrong answer. Today we put names on what to do about it.

Where confident-wrong hides

Five danger zones.
Check first. Every time.

External memory — what the model absorbed in training — is strong on patterns and weak on specifics. These five are where the pattern fails most reliably. They are also where the verification stack lives.

Names

People, programs, partners.

Numbers

Counts, percentages, dollars.

Dates

Deadlines, events, history.

Citations

Papers, quotes, sources.

Your World

DC CAP partners, scholars, internal facts.

The closer the question gets to your specific world, the further it lands from Claude's strong patterns. Your world is the highest-risk zone of all — it never appeared in training data.

Q1 · Diligence

Why does Claude hallucinate?

A real work scenario

Eyewitness reconstruction.

When eyewitnesses describe a crime, they don't replay a recording. They reconstruct what likely happened from fragments and patterns. Elizabeth Loftus's classic work showed witnesses inserting details, such as recalling a yield sign that wasn't there in place of the stop sign that was, because the pattern of a traffic accident filled the gaps.

The witnesses weren't lying. They were generating. And the gaps filled in the most plausible-sounding way, with full subjective confidence.

Loftus & Palmer, "Reconstruction of automobile destruction" (1974). Replicated for fifty years across forensic and lab settings.

In your Claude work

Every answer is a completion.

Claude predicts the next word that fits the pattern of an answer. There's no database query happening underneath. When the pattern is strong — "the capital of France is" — the prediction is reliable. When the pattern is weak — "the most-cited paper on nonprofit AI adoption is" — the model generates something that looks like an answer.

A plausible title. A plausible author. A plausible journal. None of which need to exist.

The takeaway: Every confident answer is a completion. Same mechanism the witness uses — at machine speed and machine scale.

Q2 · Diligence

Why does Claude sound confident when it's wrong?

A real work scenario

Death by GPS.

Between 2009 and 2011, Death Valley National Park logged a cluster of incidents — drivers following GPS routes onto closed roads, dry lakebeds, and dead-end canyons. People died. Park rangers started calling it "death by GPS." The National Park Service eventually posted warning signs at the boundaries.

The GPS didn't say "I'm 30% sure this turn is right." It said "Turn right in 200 feet" in the same confident tone whether the road existed or not. The interface had one register — certain — regardless of what was actually underneath.

Death Valley NPS incident reports, 2009–2011; New York Times, "Death by GPS in the Desert" (2011).

In your Claude work

Confidence is the default register.

Claude has no internal sense of how sure it is. There's no quiet "I'm 30% confident" threaded through every sentence. Training data is mostly written by humans who sound confident, so the model does too. A fabricated case citation reads the same as a real one — to the model, both are answers shaped like answers.

The fix is to ask: "For each claim in this draft, mark high / medium / low confidence and tell me why." The hedging the model wouldn't volunteer comes out when you ask.

The takeaway: Confidence is the model's default register. Calibration is something you have to request.

Q3 · Diligence

Where is Claude most likely to be wrong?

A real work scenario

Simultaneous interpreters at the UN.

A simultaneous interpreter carries roughly 95% of general meaning across a fast speech. But the 5% loss isn't even. Studies of UN-level interpreters — Gerver in the 1970s, Christoffels and others since — show the failure mode is specific. Names, numbers, and dates are where interpreters trip. Even at the highest level, those three categories produce the most errors and the most "I'll come back to that" pauses.

Why those three? They carry no surrounding pattern. A name is itself or it isn't. A number has no synonym. Specifics have nothing to recover from.

Gerver, Empirical Studies of Simultaneous Interpretation (1976). Christoffels, Simultaneous Interpreting: A Cognitive Perspective (2004).

In your Claude work

Five danger zones. Same logic.

Names, numbers, dates, citations, your world: that's where Claude's pattern-matching fails most reliably. The interpreter's three weak spots are three of yours. Citations sit alongside as a related risk: a paper title or a citation number is just another specific.

Your world is the deepest weakness. Claude wasn't trained on your partner list, scholar count, or renewal calendar. It pattern-matches to the nearest thing — "education nonprofit in DC" — and invents plausible specifics.

The takeaway: Specifics carry no pattern to recover from. That's why the five danger zones are where every verification pass starts.

Q4 · Description

What is context overload?

A real work scenario

The middle of a long trial.

Mock-jury research has shown a stable pattern for decades. Jurors recall opening arguments and closing arguments at high fidelity. Middle testimony — even when it carries the decisive evidence — fades. Reid Hastie's mock-jury studies found jurors' final verdicts correlated more strongly with opening framing than with witness specifics, in part because the decisive witnesses landed in the attention-weakest part of the trial.

The mechanism is serial position. The first and last items in any sequence get more attention than the middle — Bennet Murdock named the effect in 1962 and it has replicated under hundreds of conditions since.

Hastie, Inside the Juror (1993). Murdock, "The Serial Position Effect of Free Recall," J. Experimental Psychology (1962).

In your Claude work

Lost in the middle.

Loading a 50-page PDF doesn't mean Claude reads it evenly. Attention degrades through long contexts. The "Lost in the Middle" finding (Liu et al., 2023) showed models score highest when the answer sits at the start or end of a long context, and noticeably worse when it sits in the middle.

Three plays: excerpt the three pages that matter rather than dumping the PDF; surface key facts twice (top and bottom of long context); make Claude quote before answering. If it can't quote it cleanly, the answer isn't grounded.

The takeaway: Sharper context produces sharper answers. The middle is where errors hide.
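
The quote-before-answering play can be checked mechanically. Here is a minimal Python sketch, assuming you have the source text and the quote Claude returned as plain strings. The function name `quote_is_grounded` and the normalization rules are illustrative choices, not a Claude feature.

```python
import re


def _normalize(text: str) -> str:
    """Lowercase, straighten curly quotes, and collapse whitespace.

    PDF extraction often breaks lines mid-sentence, so exact string
    comparison would miss quotes that are actually present.
    """
    text = text.replace("\u2018", "'").replace("\u2019", "'")
    text = text.replace("\u201c", '"').replace("\u201d", '"')
    return re.sub(r"\s+", " ", text).strip().lower()


def quote_is_grounded(quote: str, source: str) -> bool:
    """Return True only if the quote appears verbatim in the source
    after normalization. An empty quote proves nothing, so it fails."""
    normalized_quote = _normalize(quote)
    if not normalized_quote:
        return False
    return normalized_quote in _normalize(source)
```

A quote that passes is only shown to exist in the source; whether it supports the answer is still your call. A quote that fails is the signal to reopen the document.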

Q5 · Description carried forward

Why do my earlier rules fade?

A real work scenario

Bartlett's War of the Ghosts, 1932.

Frederic Bartlett gave English subjects a Native American folktale — "The War of the Ghosts" — and asked each to retell it from memory after a delay, then to retell their retelling. Each pass drifted further from the original. Specifics dropped out. Concepts that didn't fit European cultural patterns — canoes, ghosts, hunting magic — got smoothed into "boats," "phantoms," "fishing."

Bartlett called this serial reproduction. Each retelling pulled the story toward the listener's prior expectations. Drift wasn't error. It was pattern completion running on people.

Bartlett, Remembering: A Study in Experimental and Social Psychology (1932). Foundational text in cognitive psychology, replicated continuously since.

In your Claude work

Same mechanism, faster.

By message 30, the rule you set at message 2 has usually faded. Attention weighting tends to favor recent tokens; that's the architecture. And the model's own outputs become its future inputs, so a small wobble in message 10 feeds the wobble in message 15. By message 30, the wobbles have compounded into a different voice and a different set of working assumptions.

Three plays: re-anchor every 8–10 messages; start fresh more often (one task per chat); move durable rules into a Project's reference files — internal memory beats re-anchoring every time.

The takeaway: Drift is built into the architecture. Build the workflow that fights it.
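
The re-anchoring cadence above can be written down as a rule rather than left to memory. This is a rough Python sketch, assuming turn 1 is where you first set the rules and that you restate them every 8 of your messages; `should_reanchor` and `with_anchor` are hypothetical helper names, and the default of 8 is simply the low end of the 8–10 range.

```python
def should_reanchor(turn: int, every: int = 8) -> bool:
    """True on the turns where the standing rules should be restated.

    Turn 1 sets the rules, so with every=8 the reminders land on
    turns 9, 17, 25, and so on.
    """
    if every < 1:
        raise ValueError("every must be at least 1")
    return turn > 1 and (turn - 1) % every == 0


def with_anchor(message: str, rules: list[str], turn: int, every: int = 8) -> str:
    """Prepend the rules to the outgoing message when a reminder is due."""
    if not should_reanchor(turn, every):
        return message
    reminder = "\n".join(f"- {rule}" for rule in rules)
    return f"Standing rules, restated:\n{reminder}\n\n{message}"
```

If the reminders start to feel constant, that is usually the cue for the other two plays: a fresh chat, or moving the rules into a Project's reference files.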

Q6 · Diligence

What manual checks should I always run?

A real work scenario

The WHO Surgical Safety Checklist.

In 2008, Atul Gawande led the development of a 19-item checklist for surgical teams. Three pause-points: before anesthesia, before incision, before the patient leaves the operating room. The checks themselves were obvious: verify the patient, confirm the procedure, count the instruments.

The result, published in the New England Journal of Medicine: a 36% drop in major complications and a 47% drop in deaths across eight hospitals from Tanzania to Toronto. The discipline was the small fast check that catches the obvious mistake.

Haynes et al., "A Surgical Safety Checklist to Reduce Morbidity and Mortality" (NEJM, 2009). Gawande, The Checklist Manifesto (2009).

In your Claude work

Ninety seconds. Every time.

Three checks. Scan for the danger zones — names, numbers, dates, citations, your-world specifics. Ask Claude for the claim table — "list every factual claim, your confidence, the source." Open one source. Just one. The one you check is worth more than the ten you assume.

Manual verification is the floor. The checks always run; heavier gates stack on top for higher stakes.

The takeaway: A checklist works because you run it every time.
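
The first of the three checks, the danger-zone scan, can be roughed out in code. The Python sketch below is illustrative only: the regular expressions are crude assumptions that over-flag on purpose, and the "your world" zone is handled by a term list you supply, since no pattern can know your partners or programs. A flag means "check this," not "this is wrong."

```python
import re

# Illustrative patterns, deliberately loose. All groups are non-capturing
# so findall() returns the full matched text.
PATTERNS = {
    "numbers": re.compile(r"\$?\d+(?:,\d{3})*(?:\.\d+)?%?"),
    "dates": re.compile(
        r"\b(?:(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[a-z]*\.?"
        r"\s+\d{1,2}(?:,\s*\d{4})?|(?:19|20)\d{2})\b"
    ),
    # Two or more capitalized words in a row: people, organizations, courts.
    "names": re.compile(r"\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+)+\b"),
    # Common citation markers: "et al.", a parenthesized year, "v. " in case names.
    "citations": re.compile(r"\bet al\.|\(\d{4}\)|\bv\. "),
}


def scan_danger_zones(
    draft: str, your_world_terms: tuple[str, ...] = ()
) -> dict[str, list[str]]:
    """Return every span in the draft that falls in one of the five zones."""
    flags = {zone: pattern.findall(draft) for zone, pattern in PATTERNS.items()}
    lowered = draft.lower()
    flags["your_world"] = [t for t in your_world_terms if t.lower() in lowered]
    return flags
```

Years show up under both numbers and dates, which is fine for a scan whose job is to point your eyes at the right words. The other two checks, the claim table and opening one source, still need a person.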

Q7 · Description

How do I ground Claude in real sources?

A real work scenario

Evidence in the record.

A judge cannot rule on a fact that isn't entered into evidence. A witness who testifies but isn't sworn doesn't count. A document referenced but not introduced is invisible. The record is what the court can use. Everything outside the record is hearsay.

The rule governs what's available for use. The same fact — out of the record, useless; in the record, decisive.

Federal Rules of Evidence, esp. Rule 802 (hearsay). Foundational principle across common-law systems for over a century.

In your Claude work

Internal memory is your record.

Most hallucinations happen when Claude is recalling. Loading the source, so the answer is retrieved from internal memory instead of generated from external memory, is the single highest-impact move in the stack.

Three patterns: paste the source inline; constrain the answer to the source ("if a fact isn't in the document, say 'not in source' instead of guessing"); two-pass self-audit ("for every claim, mark whether you traced it to a source I gave you or generated it from training").

The takeaway: Grounded answers are retrievals dressed as generations. Move the fact from training to context, and the model can find it.
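
Once the two-pass self-audit comes back, the claim table can be triaged in a few lines. This Python sketch assumes you have copied each claim into a small record; the `Claim` fields and the `needs_check` helper are hypothetical names, and the labels are the model's own self-report, so they are a starting point for checking, not proof.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    text: str
    confidence: str  # "high", "medium", or "low", as reported by the model
    origin: str      # "source" if traced to a loaded document, "training" if recalled


def needs_check(claims: list[Claim]) -> list[Claim]:
    """Return the claims a person should verify first: anything generated
    from training, and anything not marked high confidence."""
    return [c for c in claims if c.origin != "source" or c.confidence != "high"]


# Example rows are made up for illustration.
table = [
    Claim("The case was heard in 2023.", "low", "training"),
    Claim("The chatbot described a bereavement-fare policy.", "high", "source"),
    Claim("The ruling has been cited widely.", "medium", "source"),
]
to_verify = needs_check(table)  # the first and third rows
```

Claims traced to a source at high confidence are not exempt from checking; they simply move to the back of the queue.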

Q8 · Diligence carried forward

How do I build verification gates into the workflow?

A gate is a checkpoint where work doesn't pass until specific checks have run. Three gates, light to heavy. The cadence you pick is what your team's defaults become — for every artifact, every week.

Light · pre-send check

Run before drafting.

"List every name, number, date, and citation you intend to use, with the source. Wait for me to confirm before drafting."

Forces the danger zones open before the model commits to a draft. Runs in about a minute. Default for every Tier 4 (public) artifact.

Standard · two-window method

Drafting chat + audit chat.

One chat drafts. A second, fresh chat audits the draft cold.

The audit chat has no investment in the draft and catches what the drafting chat protected. Closest thing to a peer in one person's hands. Default for Tier 3 internal strategy.

Full · adversarial audit

Four lenses, blind to each other.

The adversarial-audit Skill from Week 4. Convergent flags are the verdict.

Two lenses landing on the same line for different reasons = the line is the problem. Reserved for board, funder, external publication. Tier 2 sensitive partner work earns this gate.

The takeaway: Manual checks are what you do. Gates are what your team inherits.
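
The tier defaults above are small enough to write down as a lookup. A minimal Python sketch follows, with `GATE_BY_TIER` and `gate_for` as hypothetical names. Only the three tiers named on this slide are mapped; Tier 1 is deliberately left out because no default is given for it here.

```python
# Defaults as stated above: Tier 4 (public) -> light, Tier 3 (internal
# strategy) -> standard, Tier 2 (sensitive partner work) -> full.
GATE_BY_TIER = {4: "light", 3: "standard", 2: "full"}


def gate_for(tier: int) -> str:
    """Return the default verification gate for a tier, or raise if the
    tier has no stated default so the choice is made deliberately."""
    try:
        return GATE_BY_TIER[tier]
    except KeyError:
        raise ValueError(
            f"No default gate is stated for Tier {tier}; decide it explicitly."
        ) from None
```

Raising an error for an unmapped tier is a design choice: a silent fallback to the light gate would be the confident-wrong answer in code form.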

The stack in action

Five hallucinations against the public record.
Three minutes catches all five.

A recap note for pilot teammates who couldn't make Friday's session, summarizing the Moffatt v. Air Canada case from Week 5. Generated from recall on the left. The same paragraph after the danger-zone scan, claim table, and source-loaded rewrite on the right. Every v2 fact is in the public ruling — go check it yourself.

Draft v1 — generated from recall

What Claude returned with no source loaded.

For teammates who couldn't make Friday's session: the standout case we walked through was Moffatt v. Air Canada. In 2023, attorney David Moffatt argued before the Canadian Supreme Court that Air Canada's chatbot had hallucinated a bereavement-fare policy. The court ordered the airline to refund approximately $2,400 and held that Canadian companies are liable for AI-generated statements on their websites. The ruling has been cited in at least a dozen subsequent cases.

Five highlighted spans (year, name and role, court, dollar amount, citation scope), all flagged by the danger-zone scan and confessed as low confidence in the claim table.

Draft v2 — source-loaded rewrite

Same paragraph after the stack.

For teammates who couldn't make Friday's session: the standout case we walked through was Moffatt v. Air Canada, 2024 BCCRT 149. In early 2024, passenger Jake Moffatt represented himself before the British Columbia Civil Resolution Tribunal after Air Canada's chatbot fabricated a bereavement-fare policy and the airline refused to honor it. The Tribunal ordered Air Canada to pay Moffatt $812.02 and held that the airline is responsible for the information its chatbot generates. The ruling is being cited as Canadian precedent on AI liability [verify scope].

Sources loaded: the actual BCCRT decision and the Week 5 deck slide with the verified facts. Five fixes in three minutes. Same draft, grounded in the public record.

The compound: The diligence stack — danger-zone scan, claim table, source-loaded rewrite — is what every leader at this table can run on every draft, every week. Q8's heavier gates stack on top for higher stakes.

Your turn

Bring a draft. Run the stack.

One artifact you wrote with Claude this week — counselor email, partner update, internal memo, LinkedIn post. The simpler the better. We walk through one together.

01

Name the zone.

Which of the five danger zones (names, numbers, dates, citations, your world) did the stack catch on your draft? Which one is your blind spot?

02

Show the catch.

What did the claim table reveal that the draft hid? What was different about the prompt that produced the sourced rewrite?

03

Pass it forward.

The verification cadence for your capstone build is ______ gate, run by ______, before ______ ships. Where does it sit in your scope?

Preston Magouirk · dccapinnovation.org
