
No technical experience required. This is a learning environment where questions are welcomed, mistakes are expected, and every participant brings expertise that matters. You are here because your leadership perspective shapes how DC CAP uses AI responsibly.
This pilot produces leaders who use AI effectively, govern it responsibly, and build the unit-level environments that scale to the full organization. This initiative is funded by a $600K KPMG grant and represents DC CAP's position at the leading edge of nonprofit AI adoption.
Advance all participants toward regular AI use in daily workflows, measured by pre/post assessment across five fluency dimensions. 70%+ demonstrate three or more observable fluency behaviors by Day 60.
Create conditions where participants experiment openly, share failures productively, and build confidence through practice. Weekly confidence scores trend upward. Facilitation transitions from leader-led to champion-facilitated.
Build a peer learning ecosystem where knowledge transfers through champion networks and distributed facilitation. Each unit produces at least one champion candidate. Participants teach techniques to peers.
Document measurable time savings and workflow improvements. Each participant captures at least one before/after comparison. Minimum two workflow redesigns per unit, six total across the pilot.
Move beyond doing the same things faster toward building fundamentally new capabilities. At least one unit demonstrates an AI-enabled capability that could not exist at current quality or scale without it.
All participants complete governance orientation and demonstrate responsible data classification. Zero Tier 1 or Tier 2 data incidents. AI-assisted outputs carry clear human ownership and review checkpoints.
Full goal detail and measurement framework: KPI Framework | Governance goals: AI Governance Framework
100% of participants have attempted Claude on a real work task
Each participant has identified at least one workflow where Claude saves time or improves quality
Measurable fluency growth on pre/post assessment; 3+ reusable workflow templates documented; capstone presentations completed
Funded by a $600K KPMG AI Innovation Grant. Designed for replication.
4-tier data classification, acceptable use policy, incident response protocol, and security configuration. Built once, applies to every AI tool the organization adopts.
10 organizational leaders across 6 functional units trained through the Anthropic 4D framework (Delegation, Description, Discernment, Diligence) with measurable pre/post assessment.
24+ custom Cowork Skills encoding institutional knowledge into repeatable AI-assisted processes. Student outreach, partner communications, data analysis, and governance QA.
KPI framework with SCALE/PAUSE/PIVOT decision thresholds, weekly pulse data, and pre/post fluency assessment. Evidence base for board reporting and funder accountability.
Claude Enterprise licenses for 10 pilot seats, Innovation Hub staff capacity for program design and technical architecture, interactive training platform development, custom skill engineering, and a dedicated analytics pipeline for pilot measurement.
The governance framework, the 4D competency model (developed by Anthropic), the phased adoption sequence, the facilitation-to-champion handoff model, and the KPI structure. These are the hardest parts to build from scratch. DC CAP built them, tested them, and is documenting the results.
95% of enterprise AI pilots fail. The primary reasons: no governance structure, no measurement discipline, and no plan for how adoption scales beyond early adopters. This pilot was designed to address all three.
Onboarding runs the week of April 6–10. Complete these prerequisites at your own pace, then join lunchtime office hours to discuss, collaborate, and get started.
Captures your baseline across 5 constructs: AI orientation, learning orientation, current AI use, AI knowledge, and applied skills. Takes about 8 minutes. Complete this first so we have your starting point before you dive in.
~8 min
Your operational foundation. One guide covers how Claude works at DC CAP: interfaces, file management, Skills, and responsible use. The other covers data handling, approved use cases, and organizational guardrails. Review both before moving to Step 3.
~30 min
Two free Anthropic Academy courses. Claude 101 teaches the basics of working with Claude: features, prompts, and navigation. AI Fluency for Nonprofits applies the 4D framework to mission-driven work. Both earn certificates. Required before your first Claude session.
~1 hour
Verify each course separately. Paste your certificate URL or enter the name shown on your certificate.
Lunchtime Office Hours
Once you've completed your prerequisites, jump into our lunchtime office hours with Angela Cammack and Preston Magouirk. These are open, collaborative sessions where pilot leaders discuss the process, ask questions, showcase what they're thinking about and creating, and learn from each other. Bring your ideas, your questions, and your first experiments. This is a leadership pilot — the energy comes from the room.
Account provisioning is conditioned on completing all three prerequisites.
By Week 8, each leader delivers a working Claude Project configured for their team — with at least one documented workflow and a governance configuration that fits their unit's data. First, you become a fluent user. Then, you build the tool your team inherits.
Try Claude on real deliverables from your actual week. Build confidence through practice, notice what works and what doesn't, and start seeing where AI fits your team's work.
Develop judgment through peer exchange. Build your first Claude Project. Notice how your work connects to colleagues' — that awareness shapes what your team inherits.
Present your project design to the cohort. Cross-unit reviewers evaluate governance compliance, workflow completeness, and team readiness. The cohort's collective judgment makes every project stronger.
Finalize your Claude Project, workflow documentation, and governance configuration. Present at the capstone session. Your team inherits what you built.
Starting Week 2
Begins post-break
35 Minutes
Structured session time
Optional, Highly Encouraged
We are stronger when learning together.
Check your calendar for the recurring invite. Week 1 is asynchronous onboarding; AI Fridays begin Week 2. Come when you can — each session builds on the last, and we are stronger when learning together.
Opening (5 min)
One question from this week's challenge: "What did you try? What happened?"
Core (20 min)
Share what you noticed: patterns, surprises, connections to your team's work. Peer exchange and cross-unit learning.
Close (10 min)
Governance pulse check + preview of next week's challenge.
Discussion Prompts by Week
WEEK 2 — DELEGATION
Personal output (H1) — What is something you did with AI this week to strengthen your own work output? What worked, what did you have to fix?
Delegation — Was it Automation or Augmentation? How did you decide? Where did you hesitate?
Team & enterprise (H2 / H3) — Did this also create something your team could use or benefit from? Did it make possible an organization-wide skill or capability DC CAP doesn't have today? If not, could it have — and how?
WEEK 4 — BUILD WORTH KEEPING
The filter — Walk us through your five scores. Which criterion almost killed the build, and why did you build anyway?
The customization — What outside source did you load? Where did it raise the floor on what Claude could do?
The hand-off — Where did the teammate get confused? What does that tell you about what's still tribal knowledge?
What you need, when you need it.
Interfaces, files, skills, governance, responsible use
View Reference →
Comprehensive guide to Chat, Cowork, Projects, Skills, and Connectors
View Guide →
Every time you hand work to Claude, you're making a delegation decision. Are you automating (giving Claude a defined task and reviewing the result)? Or augmenting (thinking alongside Claude to shape something together)? Start with the work your team does every week and notice which mode fits.
Leadership Lens — Three Horizons
Every leader in this pilot is climbing the same ladder:
This week is H1 by design. Get reps on your own deliverables. H2 arrives in Week 5 when you design your Claude Project. H3 is the capstone — what your team ships becomes a pattern the enterprise can adopt.
The 4D model, three modalities (Automation, Augmentation, Agency), and the competency-to-modality matrix
View Framework →
Data classification is the first delegation filter. Know what you can and can't hand to Claude.
View Framework →
Task: Prepare DC CAP's quarterly funder update for KPMG.
H1 · Personal (this week)
Paste this quarter's three data points — retention at 85%, the April 6 pilot launch, the new partnership — into Claude: "Draft a one-page KPMG update. Professional tone. Include a subject line." You review, edit, send. The draft is faster. The judgment stays yours.
Delegation: Automation. Your review is where judgment enters.
H2 · Team (within 4–6 weeks)
You notice every funder update your team writes follows the same shape. You turn the prompt into a Claude Project: system instructions that name KPMG's priorities, the quarterly cadence, DC CAP's tone, and a link to the latest retention data. Anyone on the development team now produces a funder-ready first draft in ten minutes.
Delegation: Augmentation. Your judgment shapes the Project so the team inherits a pattern, not a one-off.
H3 · Enterprise (this pilot cycle)
Three units surface the same pattern — development, academic affairs, operations. DC CAP stands up one external-update engine: a shared Claude Project with governance-tier rules, voice calibration, and required review steps. The whole organization produces on-brand external communications at speed.
Delegation: Agency. Claude operates inside guardrails leadership sets. This is what the capstone climbs toward.
Takeaway: The same task lives at three horizons. Week 2 is H1 reps. The H2 moment is when your own work starts looking like a pattern. Notice it — that's the capstone signal.
This Week's Challenge
Take a real deliverable from your week and run it through Claude. Then bring three answers to Friday:
Governance Pulse
Before you start, check: what data tier does this task involve? If Tier 1 or 2, stop and consult the framework. Leader scope: what tier does most of your team's recurring work sit in — and where's the biggest gain if you handle that tier well?
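The pulse check above works as a gate in front of any Claude task. Here is a minimal sketch in Python, assuming tiers are numbered 1 to 4 as in the 4-tier classification. The function name and return strings are hypothetical and are not taken from the AI Governance Framework. How Tiers 3 and 4 are handled is an assumption, since the text only states the rule for Tiers 1 and 2.

```python
def governance_pulse(tier: int) -> str:
    """Return the pulse-check outcome for a task's data tier (1 to 4)."""
    if tier not in (1, 2, 3, 4):
        raise ValueError(f"unknown data tier: {tier}")
    if tier in (1, 2):
        # Rule stated in the text: Tier 1 or 2 means stop and consult the framework.
        return "stop and consult the framework"
    # Assumption: Tiers 3 and 4 do not trigger the stop in this particular check.
    return "no stop required by this check"


for tier in (1, 2, 3, 4):
    print(tier, governance_pulse(tier))
```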
Week 2 was reps. By now you've seen Claude help with real work — and you've also seen drafts that fell flat. This week is the move that turns reps into patterns: describing the unit of work precisely enough to know what kind of thing you're building. A one-shot? A Project? A Skill? An automation? Each one reuses something different — the context, the move, or the trigger. Naming what's reusable is how your H1 reps become H2 inheritance.
Leadership Lens — The Four Containers
Every Claude move you make falls into one of four containers. The container is defined by what gets reused:
The Description sub-skill: you can't pick the right container until you describe the unit of work precisely. What is the work? Who's it for? How often does it happen? What does good look like? Description is the design step that comes before container choice.
A four-step decision tool for this week's challenge. Describe the work, name what's reusable, pick the container, draft the artifact. Print or fill in the browser.
Start Worksheet →
The 4D model and three modalities. Container choice maps directly onto Automation, Augmentation, and Agency.
View Framework →
Containers bake governance in. A Project's reference files carry the data tier of whatever you load. Know the rules before you build.
View Framework →
The job: Stay in genuine, on-brand contact with DC CAP's 13 university partners.
One-shot
"Draft a note to Georgetown's interim dean explaining why our timeline shifted this cycle." The dean is interim. The timeline shift was a one-time event. You'll never write this exact note again.
What's reused: nothing. The discipline: name why it's a one-shot, so you don't accidentally rebuild it next week.
Project — "Partner Engagement"
Stand up a Claude Project loaded with the partner contact list, MOU terms, every email sent in the last year, the DC CAP voice guide, and current cycle dates. Now any partner-related ask runs against the same body of knowledge: a draft email, a meeting brief, "what did Howard say in February?", a side-by-side of how three partners responded to the same ask.
What's reused: the context, across many different tasks, for weeks.
Skill — "Draft a Partner Email in DC CAP Voice"
Write the move down once. When to trigger it (any partner email, any topic). What good looks like (warm, professional, names the relationship, never asks twice). What to avoid (generic openers, formal closers that read cold). Now any teammate can run the same move for any partner — they describe the partner and the topic inline; the procedure stays consistent.
What's reused: the move, across all 13 partners and every topic.
Automation — "Quiet Partner Watch"
Every Monday morning, surface partners DC CAP hasn't touched in 30 days, with a one-line suggested reach-out for each. The trigger fires on its own. The reach-outs still pass through a human — but no one has to remember to look.
What's reused: the trigger. The work runs on a cadence, not on memory.
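The selection step behind a "Quiet Partner Watch" can be sketched in a few lines of Python. This is an illustration only, not DC CAP's actual automation. The contact log, its dates, and the function name are hypothetical, and "hasn't touched in 30 days" is read here as 30 or more days since last contact.

```python
from datetime import date, timedelta

QUIET_AFTER = timedelta(days=30)  # threshold taken from the description above


def quiet_partners(last_touch: dict[str, date], today: date) -> list[str]:
    """Return partners whose last contact was 30 or more days before `today`."""
    return sorted(
        name
        for name, touched in last_touch.items()
        if today - touched >= QUIET_AFTER
    )


# Hypothetical contact log; the dates are invented for illustration.
log = {
    "Georgetown": date(2026, 3, 1),
    "Howard": date(2026, 4, 20),
}
print(quiet_partners(log, today=date(2026, 5, 4)))  # ['Georgetown']
```

The function only surfaces names. Drafting and sending the reach-out stays with a human, as the description requires.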
Takeaway: Same job, four shapes. The Project is the workspace. The Skill is the move. The Automation is the trigger. The one-shot is the deliberate exception. Picking the wrong container is how teams end up rebuilding the same prompt every Tuesday.
This Week's Challenge — Container Design
Pick one Claude move from your last two weeks. Answer in three steps:
Bring the artifact to AI Friday. Friday's question: whose Skill should we steal? Whose Project should be a Skill? What did someone automate that the rest of us are still doing by hand?
Governance Pulse
Containers bake data classification in. The reference files in your Project carry their tier with them — anyone with Project access has access to that data. Before you build, ask: what tier is the most sensitive thing I'm about to load? Who should and shouldn't be in this Project? Leader scope: if your team builds three Projects this month, that's three governance decisions you own.
Week 3 named four containers. Week 4 answers the harder question: not everything that could be a Project should be a Project. This week you decide what passes the bar — then you build one team-level asset that earns its keep. The sharper your customization, the more your build replaces tribal knowledge instead of duplicating ChatGPT.
Leadership Lens — The Build Filter
Five questions decide whether the work in front of you deserves a Project or Skill. If three or more are clearly yes, build it. If not, put it down — pick a different candidate.
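The threshold above reduces to a count. A minimal sketch in Python, assuming each of the five questions is answered with a plain yes/no, where anything short of a clear yes is recorded as `False`. The function name is hypothetical, and the questions themselves are not encoded here.

```python
def passes_build_filter(answers: list[bool]) -> bool:
    """Return True when three or more of the five answers are a clear yes."""
    if len(answers) != 5:
        raise ValueError("expected one answer per question, five in total")
    return sum(answers) >= 3


print(passes_build_filter([True, True, False, True, False]))   # True
print(passes_build_filter([True, False, False, True, False]))  # False
```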
The Discernment cut: the wrong build wastes the team's time. The deliberate "no" is just as much of a leadership move as the build itself.
The Customization Play — Make it the team's, not Claude's
A team build is only worth keeping if it carries something Claude doesn't already know. Three vectors decide depth:
Score your candidate against the five criteria. Three "yes" or more, you build. Print or fill in the browser.
Run the Filter →
Three real anatomies (a team Project, a team Skill, and a deeply customized Skill) with the actual specs. Steal the structure for your own.
Open Gallery →
Custom instructions and reference files carry data tiers. Verify the tier of every file you load before the team inherits it.
View Framework →
The candidate: a Skill that drafts and edits anything Preston will sign (board memos, donor briefs, LinkedIn posts, conference remarks) in his voice.
Without customization
Generic AI prose. Banned words sprinkled through every paragraph. Em-dash chains. Hedged closes. Drafts that sound like every other LinkedIn post — and require a full rewrite before Preston will put his name on them.
Three sources loaded
Org context (DC CAP voice rules; the explicit banned-word list; the accessibility-and-impact framing). Outside authority (a curated .md archive of Preston's own published writing — fifty paragraphs from prior work that the voice was actually built on; the BLUF structural pattern from executive-communication standards). Workflow shape (BLUF order enforced, banned constructions named, the human checkpoint before anything goes out).
What the team inherits
First-pass drafts arrive with the voice intact. Light edits replace structural rewrites. The structural pattern — BLUF, separate banned-words and banned-constructions lists, a curated archive of own work as reference — is portable. Any DC CAP author can lift the shape and fill it with their own voice. The Skill replaces the implicit judgment that used to mean re-reading every draft three times.
Takeaway: the depth of customization is the difference between a Skill that saves five minutes and one that replaces a whole layer of tribal knowledge.
See three full anatomies →
Three real Preston-owned skill files in active use at DC CAP — the SKILL.md is the part you can copy. Show the structure to Claude, describe your own work, and Claude will help you build a version of it for your team. Reference files are named so you can see they exist; their contents live in BRAIN.
Build 01 · One skill inside a 5-agent system · Reuses context
dev-office-director
Compiles the week's grant prospecting activity into a 5–7 slide branded briefing deck for Eric. Reads from a shared workspace (pipeline tracker, weekly scan, recent proposals, strategic brief, last week's briefing) and produces the Friday CEO deck.
This is one of five agents. The Director sits inside a larger Development Office system that also runs Scout (finds funding opportunities each Monday), Fit Analyst (scores opportunities + builds funder dossiers), Grant Writer (drafts proposals), and Strategic Auditor (4-lens review of every draft). Each agent is its own SKILL.md with strict context boundaries — no agent reads everything. The system runs on a Mon 8am scan / Fri 3pm briefing cadence with a shared .learn/ memory so the whole system gets smarter over time.
SKILL.md (condensed from the real 173-line file — copy this structure and Claude can help you adapt it to your team's weekly artifact)
---
name: dev-office-director
description: Development Director agent for DC CAP's Grant Prospecting & Proposal Pipeline — compiles the week's activity into a branded CEO briefing deck for Eric. Use whenever someone asks to build the weekly briefing, compile the CEO update, summarize what happened this week, or run the director.
---

# Development Director

You compile the week's development activity into a clear, branded briefing deck for Eric. You read pipeline state, scan results, and proposal progress, then produce a 5–7 slide deck Eric can read in five minutes. You don't search for grants, assess fit, write proposals, or audit drafts.

## Pre-Run: Load Learning Files (MANDATORY)

Before any other context loading, read these four files:

.learn/errors.md — past failure patterns to screen against
.learn/canonical.md — single source of truth for every DC CAP figure
.learn/glossary.md — load-bearing phrases to use verbatim
.learn/lessons.md — durable lessons from prior runs

These files are the durable memory of the Development Office. Reading them at session start prevents recurring errors. Do not skip.

## Context Loading

Read these files (and only these):

1. pipeline.md — living pipeline tracker
2. scans/YYYY-WNN.md — this week's scan (most recent)
3. proposals/ (last 7 days) — recent proposal activity
4. strategic_brief.md — pipeline table only
5. briefings/ (last) — for tracking week-over-week movement

Do NOT read strategy.md, preston.md, org_intelligence, or scan_criteria.md.

## Skills to Invoke

- dccap-brand → colors, typography, logo (CEO palette)
- pptx → build the actual file
- preston-writing → MANDATORY voice check before saving
- data-interpreter → metric framing for CEO audience
- checking-communications → final policy/voice/brand pass

## Briefing Deck Structure (5–7 slides, skip if empty)

1. Title — week of [Mon] – [Fri], DC CAP branded
2. Executive Summary — BLUF: what happened → so what → decision needed
3. New Opportunities — Funder | Opportunity | Lane | Deadline | Amount
4. Fit Assessment — Funder | Tier | Rationale (highlight Pursue tier)
5. Pipeline Status — full pipeline + week-over-week movement
6. Proposals in Progress — Funder | Status | Version | Findings | Next
7. Recommended Actions — numbered, specific, owner-named

## What You Do Not Do

- Search for grants (Scout's job)
- Assess fit (Analyst's job)
- Write proposals (Writer's job)
- Audit drafts (Auditor's job)
- Editorialize beyond the data — report, recommend, stop

## Verification Gate (MANDATORY — Gate 4)

No Tier 4–5 claim reaches Eric without an explicit flag.

1. Cross-check every dollar figure against the dossier's Confidence Summary. Unconfirmed figures get [unconfirmed] in the deck OR get omitted.
2. Cross-check every deadline. Unverified deadlines get [unconfirmed].
3. Append a ## Verification Note section listing flagged claims, any figures corrected since last week, and the [T1-3 audited] sentinel if every material claim is Tier 1–3 sourced.
4. The PostToolUse hook blocks saves missing the sentinel.

## When You're Done

Report: "Weekly briefing complete for Week [NN]. Deck saved. [N] new opportunities, [M] assessed, [K] proposals in flight. Gate 4: [T1-3 audited / X claims flagged]. [X] actions for Eric."
Filled example — what the Director's actual output looks like (W18 briefing, funder names sanitized)
# Development Office Weekly Briefing
# Week of April 27 – May 1, 2026 | W18 Friday Operating Briefing
# Read time: 4 minutes

## Bottom Line Up Front

- Pursue-tier pipeline holds at 4 funders. Confirmed ceiling ~$1.2M. No new Pursue-tier opportunities this week. July 1 submission window is 61 days out.
- W18 scan added 3 Watch prospects. [Foundation A] (June 19 deadline, 49 days), [Foundation B] (rolling), [Trust C] (rolling). All Tier 5 sourced. Tier 2 verification required before actioning any.
- [Pursue Funder X] cold intro must go out today. No reply by mid-May = July 1 application window closes.
- Two Eric actions remain open from W17. [Funder Y] intro email (sign and send) and [Funder Z] Q10 paragraph are outstanding.

## Pursue Tier — Full Status

| # | Funder | Fit | Status | Deadline | Ask | Next Action |
|---|--------|-----|--------|----------|-----|-------------|
| 1 | [Foundation 1] | 37/40 | In Review | July 1 (61d) | $1M | Q10 paragraph from Eric |
| 2 | [Foundation 2] | 35.5/40 | Greenlit — intro drafted | July 1 (61d) | Up to $125K | Preston sends cold intro |
| 3 | [Foundation 3] | 32/40 | Writer Unblocked | July 1 (portal June 1) | $75K | Writer drafts LOI when open |
| 4 | [Foundation 4] | 30.5/40 | Greenlit — intro ready | Rolling | Est. $25K–$250K [UNCONFIRMED — T5 only] | Eric reviews and sends |

Confirmed Pursue-tier ceiling: ~$1.2M (Foundation 1 $1M + Foundation 2 up to $125K + Foundation 3 $75K. Foundation 4 excluded — dollar range is Tier 5 only.)

## Verification Note

All material claims Tier 1-3 sourced. [T1-3 audited]
Tier 5 unverified items flagged in-place with [UNCONFIRMED — T5 only].
Steal this: a Project-flavored Skill earns its keep when (a) it loads from a fixed list of files, not "all of BRAIN," (b) it has an explicit "What You Do Not Do" so it doesn't drift into other agents' lanes, (c) it ships with a verification gate that blocks saves of unverified claims, and (d) the output flags its own confidence — the [UNCONFIRMED — T5 only] tag in the table is the discipline that lets Eric trust the briefing cadence. The boundaries are the design.
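Two of these disciplines are easy to check mechanically. The Python sketch below is a hypothetical stand-in, not the actual PostToolUse hook (which is not shown here). It tests for the `[T1-3 audited]` sentinel and recomputes the confirmed ceiling from the W18 table, leaving out the one ask flagged as unconfirmed. The function names and the tuple layout are assumptions.

```python
from datetime import date

SENTINEL = "[T1-3 audited]"  # sentinel string named in the Verification Gate


def may_save(briefing_text: str) -> bool:
    """Allow a save only when the briefing carries the audit sentinel."""
    return SENTINEL in briefing_text


def confirmed_ceiling(asks: list[tuple[str, int, bool]]) -> int:
    """Sum the asks whose dollar figure is marked confirmed."""
    return sum(amount for _, amount, confirmed in asks if confirmed)


# Figures from the W18 table: (funder, ask in dollars, confirmed?)
pursue_tier = [
    ("Foundation 1", 1_000_000, True),
    ("Foundation 2", 125_000, True),   # "up to $125K"
    ("Foundation 3", 75_000, True),
    ("Foundation 4", 250_000, False),  # top of the Tier 5-only estimate, excluded
]

print(confirmed_ceiling(pursue_tier))                 # 1200000
print((date(2026, 7, 1) - date(2026, 5, 1)).days)     # 61
print(may_save("## Verification Note\n[T1-3 audited]"))  # True
```

The total matches the briefing's ~$1.2M ceiling, and the day count matches its "61 days out" for the July 1 window.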
Build 02 · Skill · Reuses the move
adversarial-audit
Multi-agent adversarial review for any artifact — grants, strategy docs, board materials, presentations, emails, thought leadership. Deploys 3–4 expert lenses (matched to the artifact type) to evaluate independently, then triangulates findings into a scored assessment with actionable fixes. Stage-based progressive disclosure: rapid triage first, full breakdown on request, literal rewrites in deep mode. The skill that lets you stress-test your own work before anyone else sees it.
SKILL.md (condensed from the real 147-line file — copy this structure and Claude can help you adapt the panels to your team's artifacts)
---
name: adversarial-audit-cowork
description: Multi-agent adversarial review for nonprofit and education teams — deploys 3-4 expert lenses to evaluate any artifact (grants, strategy docs, presentations, emails, thought leadership), then triangulates findings into scored assessment with actionable fixes. Includes "The Adversary" panel for general argument, logic, and language QA. Trigger on "audit this," "stress test," "QA this," "poke holes," "run the adversary," "check this before I send it," "does this hold up," "is this AI-sounding," "full audit," "quick audit."
---

# THE ADVERSARY — Adversarial Audit

You are an expert multi-lens auditor. When triggered, run this protocol exactly.

## STEP 1: INTAKE

Read the artifact silently. Then ask, before anything else:

1. Purpose & audience: What is this for, and who reads it?
2. Stakes & deadline: How high are the stakes? Any deadline?
3. Anything I should know? (criteria, sensitivities, prior feedback)

Or say "just go" and proceed on the artifact alone.

If "just go" → proceed, note missing context briefly at top of output.
If answered → use answers to select panel and mode. Then go. No follow-ups.

## STEP 2: SELECT MODE

| Mode | When | Agents | Output |
|------|------|--------|--------|
| Light | "quick audit," low stakes, short content | 3 | Stage 1 triage only — 600–900 wds |
| Standard | Default. Reports, applications, strategy. | 3–4 | Stage 1 → ask → Stage 2 |
| Full | "full audit," high stakes (>$100K, board) | 4 | Stage 1 → Stage 2 → offer Stage 3 |

## STEP 3: SELECT PANEL (match artifact to panel)

Grant / LOI / Funding Proposal:
- Program Officer → "Would I advance this or decline it?"
- Evidence Reviewer → "Can I verify every claim?"
- Budget Analyst → "Does the budget prove they can do this?"
- Persuasion Editor → "Does this read like a winner or a compliance exercise?"

Strategy Document:
- Logic Auditor → "If I challenge every 'why this not that,' do answers hold?"
- Evidence Checker → "What's the evidence this strategy will work?"
- Operator → "Can they actually pull this off with what they have?"
- Stakeholder Reader → "When partners and funders read this, do they see themselves in it?"

Board / Executive / Funder Presentation:
- Board Veteran → "Can I make a decision from this?"
- Data Integrity Auditor → "Is every number traceable to a source I could check?"
- Strategy Translator → "Does this tell a coherent story?"
- Ask Evaluator → "Am I ready to say yes, or do I need more?"

Written Content / Thought Leadership:
- Editor → "Is there an argument here, or just observations?"
- Fact-Checker → "Which claims survive a challenge?"
- Audience Proxy → "Did I learn something, or did I read someone thinking out loud?"
- Originality Critic → "Have I read this before under a different byline?"

THE ADVERSARY — works on any artifact:
- Logician → "If I challenge every 'therefore,' which survive?"
- Claim Auditor → "Which claims is the author hoping I won't check?"
- Language Critic → "If I removed every AI or committee-written sentence, what's left?"
- Devil's Advocate → "If I were trying to discredit this publicly, where would I start?"

(Full skill includes panels for Application, Data Analysis, and a custom-agent path.)

## STEP 4: RUN AGENTS (sequentially, in strict character discipline)

For each agent: write the header, commit fully to the lens, complete the evaluation, close before opening the next. Agents do NOT see each other's outputs.

Each produces:
- Verdict (one sentence answering the driving question)
- Score (X/10 with one-sentence justification)
- Top 3 strengths + top 3 weaknesses (each with direct quote + specific fix)
- Critical flags — anything that would cause rejection or embarrassment

## STEP 5: PRESENT — PROGRESSIVE DISCLOSURE

Stage 1 — Rapid Triage (always first):
Verdict: [safe to send / needs work / do not send yet]
Score: [X/10 composite]
Critical flags: [bulleted; "None" if clean]

Light mode + no critical flags → end here. Two specific fixes. Done.
Standard mode → ask: "Want the full breakdown, or is this enough?"
Full mode → proceed to Stage 2.

Stage 2 — Full Synthesis:
1. Scorecard table (Agent | Score | Rationale | Critical flag Y/N)
2. Where agents agreed (label "Convergent (N agents)")
3. Where agents disagreed (each position + your read)
4. Action items ranked: Critical → Recommended → Polish
5. Residual risk — what this audit can't catch

Stage 3 — Deep Dive (Full mode, on request only):
Per-agent full evaluations + literal rewrites for requested weaknesses. Show the improved version. Don't describe it.

## OPERATING RULES

1. Quote the artifact — every finding needs a direct quote, not a paraphrase
2. No cheerleading — if it scores 9/10, justify it rigorously
3. No scaffolding in output — output reads as expert human panel, not AI process
4. Verify before citing — never fabricate org facts; flag uncertainty
5. External research — when the artifact names a funder/program, search their current priorities before running agents. Surface contradictions as critical flags
6. Full mode = literal rewrites, not "this could be stronger"
7. One intake, then go — make reasonable assumptions, note them, proceed
Filled example — what Stage 1 Rapid Triage actually looks like (sample run on a hypothetical Q3 strategy memo, Standard mode)
Audit run: "Q3 Strategy Update" memo
Mode: Standard | Panel: Strategy Document (4 agents)
Intake: Audience = senior leadership; stakes = mid; deadline = Friday.
────────────────────────────────────────────────────────
Stage 1 — Rapid Triage
────────────────────────────────────────────────────────
Verdict: Needs work — convergent flag on the central growth claim.
Score: 6.5/10 composite
(Logic Auditor 7 · Evidence Checker 5 · Operator 6 · Stakeholder Reader 8)
Critical flags:
• CONVERGENT (2 agents): Logic Auditor + Evidence Checker independently
challenged "we expect 30% growth in Q4." Logic Auditor: "the 'therefore'
doesn't follow from Q1–Q2 trend." Evidence Checker: "no source, no
precedent, no model." When two lenses flag the same sentence cold, the
sentence is the problem.
• Operator: the August timeline assumes hiring two roles. No hiring plan
attached. The strategy is downstream of a capacity assumption that
hasn't been made. Not fatal — but the dependency needs to be named.
• Stakeholder Reader: borderline. Liked the framing for the leadership
audience; questioned whether the partner section reads as collaborative
or transactional. Flag for tone review, not a blocker.
Two specific fixes before going further:
1. Source the 30% Q4 growth claim, OR downgrade to "we are aiming for"
and name the assumptions.
2. Add a one-line capacity dependency under the August milestone.
Want the full breakdown, or is this enough to work from?
Steal this: a review Skill earns its keep when it (a) matches the panel to the artifact (a grant gets a Program Officer + Budget Analyst, a strategy doc gets a Logic Auditor + Operator), (b) runs each lens in character discipline — agents don't see each other's outputs, so disagreement becomes a signal, not a wash, and (c) presents in stages — triage first, full breakdown on request, rewrites only in deep mode. The convergence math is the magic: when 2+ independent agents flag the same line cold, the line is the problem.
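The "convergence math" can be made concrete with a short Python sketch. It assumes the composite is a plain average of the agent scores, which matches the 6.5 in the sample run but is not confirmed by the skill file. It also assumes a flag is stored as an (agent, quoted sentence) pair. The Operator quote below is a hypothetical placeholder, since the sample run does not quote it.

```python
from collections import defaultdict
from statistics import mean


def composite(scores: dict[str, float]) -> float:
    """Average the per-agent scores (assumed composite rule)."""
    return mean(scores.values())


def convergent_flags(
    flags: list[tuple[str, str]], min_agents: int = 2
) -> dict[str, list[str]]:
    """Return quoted sentences flagged independently by at least `min_agents` agents."""
    by_quote: dict[str, set[str]] = defaultdict(set)
    for agent, quote in flags:
        by_quote[quote].add(agent)
    return {q: sorted(a) for q, a in by_quote.items() if len(a) >= min_agents}


scores = {"Logic Auditor": 7, "Evidence Checker": 5, "Operator": 6, "Stakeholder Reader": 8}
flags = [
    ("Logic Auditor", "we expect 30% growth in Q4."),
    ("Evidence Checker", "we expect 30% growth in Q4."),
    ("Operator", "<August timeline sentence>"),  # placeholder quote, not from the memo
]

print(composite(scores))        # 6.5
print(convergent_flags(flags))  # {'we expect 30% growth in Q4.': ['Evidence Checker', 'Logic Auditor']}
```

Grouping by the exact quoted sentence only works because the operating rules require every finding to carry a direct quote. Paraphrased findings would not line up.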
Build 03 · Deeply customized Skill · Reuses the move with depth
preston-writing
The voice engine for everything Preston signs — board memos, LinkedIn posts, donor briefs, policy comments, conference remarks. Voice DNA extracted from his actual published writing, an explicit anti-AI rules file, three voice registers, and a curated archive of fifty published paragraphs as outside reference. Encodes the layer of judgment that used to live only in re-reading every draft three times.
SKILL.md — frontmatter + Voice DNA spine (condensed from the real 273-line file)
---
name: preston-writing
description: >
  Preston Magouirk's writing voice, style, and anti-AI rules for all content
  creation. Use whenever drafting, editing, or reviewing any written content —
  LinkedIn posts, grant narratives, board materials, thought leadership, policy
  briefs, blog posts, conference remarks, donor communications, emails, or any
  deliverable that will carry Preston's name or voice. Trigger on any content
  creation task.
---

# Preston Writing

Two jobs: (1) Match Preston's voice. (2) Eliminate AI-generated patterns.

## Voice DNA (extracted from Preston's actual published writing)

### How Preston Opens
Lead with a concrete fact, outcome, or situation. Never a question, never a sweeping claim, never throat-clearing.
✓ "DC CAP had a breakthrough year in 2025."
✗ "In today's rapidly changing education landscape..."

### How Preston Closes
Conditional invitation that positions the writer as a learner.
Pattern: "If you [share this context], I'd [welcome/love to hear]..."

### Pronouns
"We" is the default for actions and outcomes. "I" is reserved for personal reflection or learning. Roughly 4:1 we-to-I ratio in org content.

### How Preston Handles Data
Numbers bare, no adjectives. Signature phrase: "For context."
✓ "75-95% earned degrees. For context, ~37% of D.C. students do."
✗ "An impressive 75-95% — a staggering improvement..."

### The Deflation Principle
Where AI inflates, Preston deflates. Where AI declares, Preston invites.
Diagnostic: if a sentence would sound good in a TED talk, rewrite it.

### Transitions
Zero explicit transition words. No However, Moreover, Furthermore, Additionally, That said. Transition by starting a new paragraph with a new concrete fact.

## Anti-AI Rules (the short version)
Read reference/anti_ai_rules.md for the complete list. Quick scan:

Forbidden phrases: "doing the heavy lifting," "the real question is," "here's the thing nobody is talking about," "it's not about X, it's about Y," "load-bearing" (metaphorical).
Vocabulary red flags: leverage, utilize, delve, navigate (metaphorical), landscape (metaphorical), ecosystem, synergy, innovative, groundbreaking, transformative, game-changing, revolutionary, robust, holistic, unpack, deep dive.
Structure red flags: question openers, "In today's..." openers, "As someone who..." credibility claims, triple parallel constructions, em-dashes >2 per piece.

## Three Voice Registers
1. External Research — policy briefs, research reports, partner evals
2. Internal Strategy — strategic plans, board materials, vision docs
3. Leadership Reflection — LinkedIn, thought leadership, conference remarks

## Forbidden Content
- No partner or system disparagement (use "preparation gaps" not "fails")
- No emergency/crisis language (DC CAP is academic support, not crisis services)
- Frame around accessibility and impact
Reference files (the SKILL.md points to these; team copies the structure, builds their own)
Lives at BRAIN/skills/skills/preston-writing/reference/ — four guide files (voice_style_guide.md, anti_ai_rules.md, email_style_guide.md, source_paths.yml) plus eight sample files (LinkedIn drafts and finals, blog posts, policy briefs, research papers, reports). The SKILL.md routes to whichever sample matches the deliverable type.
Mostly Tier 4. Drafts in progress: Tier 3.
Steal this: depth means every failure mode you have ever fixed by hand has a named home in the file. Voice rules go in their own list. Anti-AI rules go in their own file. Forbidden content patterns (partner disparagement, crisis language) get their own section because they fail differently. Walk through your last ten edits to AI drafts and ask: which corrections are in the file, and which are still in your head? The corrections still in your head are the next layer to encode.
Want the full side-by-side comparison and the cost/value table? Open the gallery →
This Week's Challenge — Build Worth Keeping
Take last week's container candidate. Three steps:
Bring the build to AI Friday. Friday's question: whose build do you want to inherit, and what would you change before adopting it?
Governance Pulse
Outside sources have data tiers too. Before you load a research PDF, a partner playbook, or a dataset, check: is it Tier 4 (public), Tier 3 (org strategy), or Tier 2 (sensitive)? Anyone with access to your build inherits access to its references. Leader scope: the build you ship this week is a governance decision, not just a productivity decision.
Four weeks of reps. You've been operating on principles — the 4Ds and 3As — without naming the layer underneath. Week 5 is that layer. Eight questions every early user trips over, answered plainly, with the move that fixes what's broken. Same principles, sturdier reasons. Better chats. Better tasks. Better Projects. Better Skills. Better outcomes.
Grounding our efforts in the principles for AI use and the background for how AI works.
The Principles You've Been Operating
The 4Ds are how we operate. The 3As are the shape your delegation takes. The eight questions below are why those moves work — or fail.
The plain answer. The context window is Claude's working memory for one conversation. It holds everything you've pasted, everything Claude has said back, and any reference files loaded. It has a ceiling — not a target.
Where it breaks. People treat the window like email — long, sprawling, accumulating for weeks. Past a certain point, things on page three get missed.
The move. Treat each chat like a focused work session. One task, clean window. When the task ends, the chat ends. For anything that needs to persist across sessions, use a Project — that's what Projects exist for.
Description quality is bounded by what sits in this window.
The plain answer. Context is everything you load into the window before you ask. The single biggest predictor of a good Claude answer is whether you gave it the right context up front.
Where it breaks. "Write me an email to a parent" gets the average of the internet. "Draft a warm-professional email to a parent of a 12th grader who missed the May 1 renewal deadline, framed around the next step, signed by me" gets DC CAP work.
The move. Load names, numbers, dates, voice rules, audience, and format every time. Or load it once into a Project and stop retyping. Anytime you find yourself thinking "Claude should already know this," that's the signal — Claude doesn't, and the gap belongs in a file.
Empty context, generic answer. Rich context, sharp answer.
The plain answer. Claude doesn't remember you across chats. Each new conversation starts fresh unless you've built something that carries memory forward. Three layers: in this chat (yes), in a Project (yes, via reference files), across chats in general (no).
Two kinds of memory — and why the difference matters
External memory
What Claude absorbed from training data — vast but fuzzy. Lower precision and recall on the specifics of your world: names, numbers, dates, recent events.
This is where most hallucination comes from.
Internal memory
What's in the context window right now — your prompt, the chat history, the Project's reference files, anything you paste in.
High precision, high recall. Claude can quote it back verbatim.
The leader move: when accuracy matters, load the fact. Don't ask Claude to recall it. Internal memory beats external memory every time.
Where it breaks. The months-long chat. People keep one conversation open because they're trying to compensate for missing memory. The chat degrades long before the month is up. Performance thins, errors compound, and the team inherits a brittle artifact.
The move. Stop making one chat remember everything. Memory belongs in files. A Project carries context across tasks. A Skill carries a procedure across runs. If you retype the same instructions every week, that's a Skill waiting to be written.
Every Skill you write is Description carried forward.
The plain answer. Two meanings. The one to know first: within a long conversation, Claude's output drifts from your original intent. Early instructions get diluted. Tone shifts. Errors compound. (The second meaning — model behavior changes across versions — matters at the org level. DC CAP pins Opus 4.6 and Sonnet 4.6 as current defaults.)
Where it breaks. You correct Claude on something at message 5. By message 30, the same error is back. Newer signals in the window crowd out older ones.
The move. When the trail feels off, start fresh. Re-anchor inside long chats with an explicit reminder: "remember — audience is counselors, voice is warm-professional, deadline is May 1." A Project loads the foundational context fresh every time you open a new chat inside it.
When the answer thins, that's a Diligence signal — not a Claude failure.
The plain answer. Claude comes in three sizes. Each handles a different kind of job: fast, balanced, careful.
Where it breaks. Defaulting to Opus by reflex (slow, expensive, often unnecessary) or to Haiku by habit (ships sloppy work).
The move. Start with Sonnet. Upgrade to Opus when stakes warrant the depth. Drop to Haiku when speed matters more than judgment. Counselor email → Sonnet. Gates LOI → Opus. Summarize a 10-page PDF → Haiku or Sonnet. Voice scan on a board memo → Sonnet with thinking on. Open the Model Picker →
Model selection is a Delegation decision, not a vanity choice.
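For teams that wire the model choice into a Project or Skill, the rule of thumb above can be written down as a tiny router. This is a sketch under stated assumptions: `pick_model` and its two boolean inputs are hypothetical names, and letting high stakes win over speed when both apply is a judgment call made for the example, not a DC CAP policy.

```python
def pick_model(high_stakes: bool = False, speed_over_judgment: bool = False) -> str:
    """Hypothetical router that follows the 'start with Sonnet' rule of thumb."""
    if high_stakes:
        return "Opus"    # stakes warrant the depth
    if speed_over_judgment:
        return "Haiku"   # speed matters more than judgment
    return "Sonnet"      # the default starting point


# The example tasks named above, routed through the sketch.
examples = {
    "Counselor email": pick_model(),
    "Gates LOI": pick_model(high_stakes=True),
    "Summarize a 10-page PDF": pick_model(speed_over_judgment=True),
}

for task, model in examples.items():
    print(f"{task} -> {model}")
```

The point of writing it down is that the default is explicit: anyone who wants Opus or Haiku has to name the reason.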
The plain answer. Hallucination is when Claude confidently states something that isn't true. A fabricated quote. A made-up statistic. A paper that doesn't exist. A person who never said the thing.
Why it happens. Large language models are pattern completers. When they don't know, they generate text that sounds like the right kind of answer. Without grounding, they invent.
The move — four-part discipline:
This is the discipline behind the dev-office-director's [T1-3 audited] sentinel from Week 4.
The plain answer. Two different checks. Both required before anything ships.
Where it breaks. People verify and skip validate — facts are right, tone is wrong for the audience. Or validate and skip verify — it reads great, the numbers are made up.
The move. Design verification into the workflow, not bolted on at the end. Pull at least one external check for any high-stakes claim. Read the output as the audience before you send. For the highest-stakes work, run it through the adversarial-audit Skill from Week 4's gallery — or hand it to a teammate.
Diligence is two moves, not one.
The plain answer. A great prompt has five parts.
Where it breaks. "Write me an email" gets nothing useful. "Make a deck about our pilot" gets nothing useful. Vague in, vague out.
Before / After — same task, different prompt
VAGUE
"Write a follow-up email to a counselor."
SHARP
"Draft a warm-professional follow-up to Ms. Reyes at McKinley Tech HS. We met last Friday about a Class of 2026 applicant who missed the May 1 renewal deadline. Audience: counselor with a heavy caseload, mid-stress. Format: 3 short paragraphs, signed Preston. Facts: renewal window reopens 7/15, our office hours M–F 9–5, my direct line is on file. Goal: confirm the next step without blame, invite her to forward to the family."
The move. Every great prompt is a SKILL.md waiting to be written. The more reps you put in on the five-part structure, the less you have to rewrite it the next time.
Description is the most valuable move in this whole pilot.
The AI Friday session deck. Each mechanic paired with a real work scenario outside DC CAP — from Mata v. Avianca and Air Canada to attorney case files, hospital triage, and spam filters.
Open Deck →
Opus, Sonnet, Haiku — with named DC CAP example tasks for each. Printable one-pager.
Open Picker →
The 4Ds matrix anchor. How Delegation, Description, Discernment, and Diligence interact across Automation, Augmentation, and Agency.
View Framework →
Plain-language definitions for context window, tokens, model, memory, and the rest of the terminology in this week.
View Glossary →
The capstone door opens this week. Use the mechanics above to scope Section 1 of your team's build.
Open Template →
The task: draft the quarterly counselor partnership update for DC CAP's 28 high school counselor partners.
Q1–Q2 · Context window + Context
A fresh chat, not a six-week running thread. Loaded into the window: the partner list (28 names + schools), DC CAP voice rules, this cycle's calendar, the prior quarter's update for tone reference.
Q3 · Memory
This sits inside a Project — "Counselor Partnerships" — so the next twelve quarterly updates inherit the partner list, voice rules, and cycle context. No retyping.
Q4 · Drift watch
Fresh chat inside the Project. Re-anchor mid-conversation: "remember — counselors with heavy caseloads, warm-professional tone, signed by Preston."
Q5 · Model
Sonnet 4.6. Stakes don't earn Opus — this is recurring partnership comms, not a board memo. Haiku won't hold the voice.
Q6 · Hallucination check
Every partner name verified against the live partner list. Any quoted counselor reply pasted in directly so Claude doesn't fabricate. Cycle dates checked against the cycle calendar.
Q7 · Verify + validate
Verify: names, dates, claims. Validate: tone reads warm-professional, format matches prior updates, the ask at the end is actionable.
Q8 · Prompt
"Draft the quarterly partnership update to our 28 high school counselors. Audience: counselors with heavy caseloads, mid-cycle. Format: 4 short paragraphs + a bulleted action list of three asks. Facts: the data table I pasted above, the cycle calendar in this Project, the prior quarter's update at the bottom for tone reference. Voice: warm-professional, signed Preston. Iterate with me on the opening before drafting the full thing."
Takeaway: the same partnership update you might have written in Week 2 with a one-shot prompt now has a back-end stack underneath. Same output, sturdier reasons. And the next time this update needs to go out, the Project does most of the work — because the answers to Q1, Q2, and Q3 are already in files.
Governance Pulse — Model Selection IS a Governance Decision
Defaulting everything to Opus burns nonprofit-tier compute and slows the team. Defaulting to Haiku ships AI-flavored slop. Sonnet is the responsible default; Opus is reserved for the highest-stakes drafting; Haiku is for lookups and classifications.
Leader scope: as your team builds Projects and Skills, you're picking the model for everyone who runs them. Pick on purpose.
Week 5 named the mechanics. Week 6 deepens one of them — Claude getting things confidently wrong — and builds the verification discipline that catches it before it ships. Eight questions every newer user asks once they've watched a hallucination land. Plain-answer structure. Heavier focus on what to do, in what order, with what tools.
Hallucination is what the model does by design. Verification is what you do by design.
Where Confident-Wrong Hides — The Five Danger Zones
External memory is fuzzy on specifics. These five are where you spot-check first — every time, before anything ships.
The plain answer. Claude predicts the next word that fits the pattern of an answer. There's no database query happening underneath. When the pattern is strong — "the capital of France is" — the prediction is reliable. When the pattern is weak — "the most-cited paper on nonprofit AI adoption is" — the model generates something that looks like an answer. A plausible title. A plausible author. A plausible journal. None of which need to exist.
Where it breaks. People treat Claude like a search engine. Search engines retrieve. Claude generates. A retrieved fact is either there or it isn't. A generated fact is there whether or not it's true.
The move. Build the felt sense: every Claude answer is a generation. Hold one question in your head — "is the pattern strong enough here for the answer to be right?" Pattern is strong on broad, well-repeated things (definitions, well-known events, common code). Pattern is weak on specifics, recent events, and anything inside your own organization.
Real-world parallel — Mata v. Avianca, June 2023
Two attorneys filed a federal brief citing six prior cases. None of the six existed. The model had generated the kind of thing the prompt asked for — plausible case names, plausible courts, plausible citation numbers — and the lawyers never opened a single one. The court fined them and the case became the canonical hallucination story. The lesson is simpler than "AI is bad": the model completed the pattern, and no one checked.
Every confident answer is a completion.
The plain answer. Claude has no internal sense of how sure it is. There's no quiet "I'm 30% confident" threaded through every sentence. The training data is mostly written by humans who sound confident, so Claude does too. A fabricated case citation reads the same as a real one because, to the model, both are answers shaped like answers.
Where it breaks. Newer users use tone as a quality signal. A confident answer feels true. An uncertain one feels weak. With Claude, that intuition inverts: the most confident-sounding answer is sometimes the one with the least grounding underneath it.
The move. Two reps to rebuild the reflex:
Confidence is the model's default register. Calibration is something you ask for.
The plain answer. Week 5 named external memory — the fuzzy, recall-heavy memory of everything in the training data. External memory is strong on patterns and weak on specifics. The five danger zones at the top of this week — names, numbers, dates, citations, your world — are exactly where specifics live. So those are the five places to check first, every time.
Why your world is the highest-risk of all. Claude wasn't trained on your partner list, your scholar count, your renewal calendar, or your team's voice. When you ask, it pattern-matches to the nearest thing it knows — "education nonprofit in DC" — and invents plausible specifics. Plausible is the dangerous register: not obviously wrong, easy to miss, sometimes embarrassingly close to right.
The move. Two halves — one before the question, one after.
The closer the question gets to your specific world, the further it lands from Claude's strong patterns.
The plain answer. Loading a 50-page PDF doesn't mean Claude reads all 50 pages equally. Attention degrades through long contexts. The first and last pages stay sharp. The middle thins. The "Lost in the Middle" finding (Liu et al., 2023) showed this with strong evidence — models scored highest when the relevant info sat at the start or end of a long document, and noticeably worse when it sat in the middle.
Where it breaks. "I gave Claude the whole report and asked about page 14. The answer was wrong." Right — page 14 is the middle. The model pattern-completed something that fit the question because the middle was hazy. The fix lives in the context.
The move. Three plays, in order of value:
Sharper context produces sharper answers.
The plain answer. Week 5 named the symptom: by message 30, the rule you set at message 2 is gone. Here's the mechanic. Attention weighting tends to favor recent tokens — that's how the architecture works. And the model's own outputs become its future inputs, so any small wobble in message 10 feeds the wobble in message 15. By message 30, the wobbles have compounded into a different voice and a different set of working assumptions.
Where it breaks. A chat that started precise ends generic. A draft that started with three specific constraints ends with two general ones. People notice the result ("Claude got worse") and rarely the cause (the early rules got crowded out by the conversation that followed).
The move. Three plays you can run today:
Drift is built into the architecture. Build the workflow that fights it.
The plain answer. Three checks. Ninety seconds total. Every time.
Where it breaks. Two failure modes. Most common: skipping all three. The draft looks good and people send it. Less common but counterproductive: running heavy verification on everything. Manual checks are the floor. They run fast on every artifact; heavier gates stack on top for higher stakes.
The move. Make the ninety seconds a habit — small enough to run every time. The three checks always run. Q7 and Q8 are how the discipline scales up from there.
Manual verification is the floor. Heavier gates stack from there.
The plain answer. Most hallucinations happen when Claude is recalling. Loading the source — making the answer retrieve from internal memory instead of generate from external memory — is the single highest-impact move in the verification stack.
Three grounding patterns — easiest to most disciplined
1. Paste the source inline
Before the question, paste the partner email, the policy page, the data row. Twenty seconds. Cuts hallucination on those specifics to near zero.
2. Constrain the answer to the source
Add a one-line guardrail: "Only use facts from the document above. If a fact isn't there, say 'not in source' instead of guessing." Gives Claude permission to say it doesn't know — without permission, the model fills the gap.
3. Two-pass self-audit
First prompt drafts. Second prompt: "For every factual claim in your draft, mark whether you can trace it to a source I gave you, or whether you generated it from training. List the generated ones." The model will surprise you: asked directly, it can often tell the difference, but it doesn't volunteer it. Treat the list as a lead to check, not as proof.
Where it breaks. People paste the source and still ask the recall question. ("Here's the partnership agreement — what year did we sign with Georgetown?") If the source has the answer, ask Claude to quote it. If it doesn't, paste the source that does.
Grounded answers are retrievals dressed as generations.
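Patterns 1 and 2 combine into a reusable prompt shape: source first, guardrail second, question last. The sketch below only assembles the text; it does not call Claude. The `build_grounded_prompt` helper and the "SOURCE DOCUMENT:" label are assumptions for illustration, and the sample source sentence is a placeholder, not a real policy page. The guardrail wording is taken from pattern 2 above.

```python
# Guardrail sentence from pattern 2 above.
GUARDRAIL = (
    "Only use facts from the document above. "
    "If a fact isn't there, say 'not in source' instead of guessing."
)


def build_grounded_prompt(source_text: str, question: str) -> str:
    """Assemble a prompt: pasted source, then the guardrail, then the question."""
    parts = [
        "SOURCE DOCUMENT:",
        source_text.strip(),
        "",
        GUARDRAIL,
        "",
        question.strip(),
    ]
    return "\n".join(parts)


# Placeholder source text, for illustration only.
prompt = build_grounded_prompt(
    "Renewal window reopens 7/15. Office hours are M-F 9-5.",
    "When does the renewal window reopen? Quote the sentence that says so.",
)
print(prompt)
```

Asking for the quoted sentence, rather than just the date, keeps the answer a retrieval: if Claude cannot quote it, the fact is not in the source.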
The plain answer. Manual checks (Q6) and source grounding (Q7) are moves you make on one draft. Verification gates are the cadence those moves live inside — what your team inherits every time, every artifact. A gate is a checkpoint where work doesn't pass until specific checks have run.
The three gates — light, standard, full
Where it breaks. Two failure modes. Skipping the light gate because it "sounds basic" — light is what catches the obvious hallucinations on every draft. Running full on everything because it feels thorough — full takes ten minutes per artifact, and the team that runs full on everything stops running it on anything.
The team move. A verification gate is a leader decision. The cadence you pick is what your team's defaults become. Skip the gate on Tier 2 work and you've moved the exposure from one draft to a thousand. Open the Audit Picker →
Manual checks are what you do. Gates are what your team inherits.
The AI Friday session deck. Each mechanic paired with a real-world parallel — eyewitness reconstruction, death by GPS, UN interpreters, jury attention, Bartlett's serial reproduction, the WHO surgical checklist, evidence in the record. Ends with a live stack run that catches five hallucinations in one paragraph.
Open Deck →
The full gate — light, standard, full — mapped to artifact type and data tier. The Model Picker's sibling. Printable.
Open Picker →
Plain-language definitions for hallucination, attention, context window, external memory, and the rest of this week's terms.
View Glossary →
Week 4's gallery. The adversarial-audit Skill is Build 02 — the full SKILL.md is the structure behind the Full gate in Q8.
Open Gallery →
Worked example with hallucinations planted against the public record. Every v2 fact is verifiable in Moffatt v. Air Canada, 2024 BCCRT 149 — go check it yourself after.
The artifact: a short recap note for pilot teammates who couldn't make Friday's session, summarizing the standout case from Week 5. Internal, mid-stakes — standard gate.
Draft v1 — what Claude generated from recall
For teammates who couldn't make Friday's session: the standout case we walked through was Moffatt v. Air Canada. In 2023, attorney David Moffatt argued before the Canadian Supreme Court that Air Canada's chatbot had hallucinated a bereavement-fare policy. The court ordered the airline to refund approximately $2,400 and held that Canadian companies are liable for AI-generated statements on their websites. The ruling has been cited in at least a dozen subsequent cases.
Step 1 — Danger-zone scan (Q3 + Q6)
Flagged on a 30-second read:
Step 2 — Claim table (Q6)
Prompt: "List every factual claim in the draft above as a table —
claim, your confidence (high/medium/low), source. Be honest."
Claim                                                         | Confidence | Source
--------------------------------------------------------------|------------|--------------
Case discussed Friday was Moffatt v. Air Canada               | HIGH       | Week 5 deck
"2023" ruling                                                 | LOW        | Generated
"Attorney David Moffatt"                                      | LOW        | Generated
Argued before "Canadian Supreme Court"                        | LOW        | Generated
Refund of "approximately $2,400"                              | LOW        | Generated
Air Canada bound by chatbot's hallucinated bereavement policy | HIGH       | Public ruling
Cited in "at least a dozen subsequent cases"                  | LOW        | Generated
Notes: Five of seven claims are LOW confidence and generated. Only the
topic (Moffatt was Friday's case) and the underlying principle (Air
Canada held liable for chatbot output) are HIGH and grounded.
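If the claim table is kept as data rather than prose, the tally in the notes is a two-line count. The sketch below is illustrative only: the `claims` list is a hand-typed copy of the table rows above, and the `to_check` name is an assumption for the example.

```python
from collections import Counter

# Rows copied from the claim table above: (claim, confidence, source).
claims = [
    ("Case discussed Friday was Moffatt v. Air Canada", "HIGH", "Week 5 deck"),
    ('"2023" ruling', "LOW", "Generated"),
    ('"Attorney David Moffatt"', "LOW", "Generated"),
    ('Argued before "Canadian Supreme Court"', "LOW", "Generated"),
    ('Refund of "approximately $2,400"', "LOW", "Generated"),
    ("Air Canada bound by chatbot's hallucinated bereavement policy", "HIGH", "Public ruling"),
    ('Cited in "at least a dozen subsequent cases"', "LOW", "Generated"),
]

# Count rows per confidence level.
counts = Counter(confidence for _, confidence, _ in claims)

# Anything LOW or generated goes on the list to check against a source.
to_check = [
    claim for claim, confidence, source in claims
    if confidence == "LOW" or source == "Generated"
]

print(f"{counts['LOW']} of {len(claims)} claims are LOW; {counts['HIGH']} are HIGH")
for claim in to_check:
    print("check:", claim)
```

The `to_check` list is the work queue for Step 3: each item is either confirmed against a pasted source or bracketed for follow-up.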
Step 3 — Source-loaded rewrite (Q7)
Pasted into the chat: the actual Moffatt v. Air Canada, 2024 BCCRT 149 decision (Jake Moffatt, self-represented passenger; British Columbia Civil Resolution Tribunal; $812.02 award) and the Week 5 deck slide with the verified facts.
New prompt: "Rewrite the paragraph using only facts from the BCCRT decision and the Week 5 deck slide I pasted above. If a fact is not in either source, replace it with [verify] instead of guessing."
Draft v2 — what shipped
For teammates who couldn't make Friday's session: the standout case we walked through was Moffatt v. Air Canada, 2024 BCCRT 149. In early 2024, passenger Jake Moffatt represented himself before the British Columbia Civil Resolution Tribunal after Air Canada's chatbot fabricated a bereavement-fare policy and the airline refused to honor it. The Tribunal ordered Air Canada to pay Moffatt $812.02 and held that the airline is responsible for the information its chatbot generates. The ruling is being cited as Canadian precedent on AI liability [verify scope].
What the stack caught: the year (2023 → 2024), the name and role (attorney David → passenger Jake, self-represented), the court (Canadian Supreme Court → BC Civil Resolution Tribunal), the amount (~$2,400 → $812.02), and the citation-scope claim (generated → bracketed for follow-up). Five fixes from three minutes of verification against the public record.
For higher-stakes drafts going to a board or funder, Q8's full gate (the adversarial-audit Skill) layers four expert lenses on top of this. The Audit Picker linked above maps the gates to artifact types.
This Week's Challenge — Run the Stack on One Real Draft
Pick one draft you wrote with Claude this week — a counselor email, a partner update, an internal memo, a LinkedIn post. Run the Q6 ninety-second pass: danger-zone scan, claim table, open one source.
Bring to Friday: the artifact (sanitized if needed), the claim table Claude produced, and the one fact the stack caught that you didn't see yourself. Friday's question: which of the five danger zones is your blind spot?
Capstone Bridge — Name Your Verification Cadence
One sentence in your capstone scope. Fill the blanks:
"The verification cadence for this build is ______ gate (light / standard / full), run by ______, before ______ ships."
That sentence is the Diligence stub for Section 1. Week 7's peer review pulls from it directly.
Governance Pulse — Verification Cadence by Data Tier
Tier 4 (public) work runs Q6 manual checks every time. Tier 3 (internal strategy) layers in Q7 source-grounding by default. Tier 2 (sensitive partner or operational) requires Q8's standard or full gate, with the verdict in the record. Tier 1 (FERPA/PII) doesn't touch Claude at all. The data tier sets the verification floor; heavier gates can stack on top.
Leader scope: the cadence you pick for your team's build is the cadence every artifact inherits. Picking "light gate, run by the author" is a real decision with a real shape. Skipping the decision means the team picks for itself, one artifact at a time.
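The tier-to-cadence rule above is small enough to write down as a lookup table, which makes the floor explicit for anyone who inherits the build. This is a sketch, not an official policy encoding: `VERIFICATION_FLOOR` and `verification_floor` are hypothetical names, and treating each tier as keeping the checks of the tier below it is an assumption drawn from the "heavier gates stack on top" language.

```python
# Minimum checks per data tier, following the Governance Pulse above.
# Assumption: each tier keeps the lower tier's checks and adds its own.
VERIFICATION_FLOOR = {
    4: ["Q6 manual checks"],
    3: ["Q6 manual checks", "Q7 source grounding"],
    2: ["Q6 manual checks", "Q7 source grounding", "Q8 standard or full gate"],
}


def verification_floor(tier: int) -> list:
    """Return the minimum checks for a data tier (hypothetical helper)."""
    if tier == 1:
        # Tier 1 (FERPA/PII) never goes into Claude, so there is no cadence to return.
        raise ValueError("Tier 1 (FERPA/PII) does not touch Claude at all")
    if tier not in VERIFICATION_FLOOR:
        raise ValueError(f"Unknown data tier: {tier}")
    return list(VERIFICATION_FLOOR[tier])


for tier in (4, 3, 2):
    print(f"Tier {tier}: {', '.join(verification_floor(tier))}")
```

Raising an error for Tier 1 instead of returning an empty list is deliberate: an empty list would read as "no checks needed," which is the opposite of the rule.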
You CAN:
Check First:
If your task involves student names, financial details, or partner-specific terms → consult the governance framework
Always:
Review every output before sending. Claude drafts; you own what goes out.
Small, digestible content to build your AI fluency — one day at a time.
Curated voices, tools, and case studies to keep you sharp between sessions.
Practical AI thinking for knowledge workers and leaders
Stay current on the frontier of AI development
Official Anthropic documentation for getting the most from Claude
How education, nonprofits, and philanthropy are deploying AI now