An AI Gave Me an A3 Evaluation. I Needed a Coach.

TL;DR: AI tools can evaluate your A3, but evaluation creates dependence, while coaching develops thinking. I used A3 thinking on the Lean Hospitals Coach itself to close that gap.

There are AI tools out there that will evaluate your A3. Upload the document, get a report back. Strengths, weaknesses, suggestions for improvement. Some of them are pretty good at it.

You get a report. You don't get a conversation.

It's like going to therapy and having your therapist just hand you a printed evaluation. “You scored a 6 out of 10 on coping mechanisms. Good luck out there.”

No follow-up question. No “what did you notice?” No pushing back when you say “the root cause is training” for the third time. No holding you in clarification when you want to jump to countermeasures.

A report tells you what someone else sees. A coaching conversation helps you see it yourself. One makes you dependent on the evaluator. The other makes you a better problem solver.

That's what I'm building with the Lean Hospitals Coach — and more recently, a generalized Lean Coach for people outside of healthcare. And the story of how I got there is itself an A3 problem-solving exercise, because the first version of the coach wasn't coaching. It was consulting. I had to retrain it.

My AI Coach Was Consulting, Not Coaching

I uploaded a real healthcare team's A3 — a perioperative services kaizen from a children's hospital. Solid work. Data-driven current conditions, named countermeasures with owners and deadlines, results showing a 137% increase in urgent block time and a 30-minute reduction in patient wait times.

I asked the Lean Hospitals Coach to review it in “Coach Me” mode. It produced a 400-word response covering every section, citing chapters from my book, identifying strengths and gaps. Ended with a thoughtful question.

I didn't like it. Here's why — this is the actual output, trimmed:

A few key strengths stand out: The problem is framed as future-based: demand outstripping current capacity… Targets are concrete: 16,000 cases now, 18,000 in the future… The countermeasures go beyond classic “find time, fill time”…

What happened after these initial changes — did the gains stick beyond the first year?

That's an evaluation with a question stapled to the end. After I retrained the coaching behavior, the same A3 produced this:

There is a lot of work in this A3. Before I share what I notice, what section are you most confident about, and where do you think the thinking is weakest?

The first response tells me what the coach sees. The second asks what I see. One develops dependence. The other develops thinking.

John Shook teaches us to resist the instinct to evaluate. The A3 is not a document to be judged — it's a tool for revealing gaps in thinking. A coach who evaluates before asking has done the thinking for the learner. The A3 author walks away knowing what the coach thinks but never had to develop their own eye for quality.

This is the hardest coaching discipline to maintain. The more you know about A3s, the more tempting it is to demonstrate that knowledge. And AI models — trained on millions of examples of helpful, comprehensive responses — are deeply predisposed to demonstrate.

Think about what this costs. Every A3 that gets a thumbs-up when it should have gotten three hard questions. Every root cause that stops at “training” because nobody pushed back. That's the cost of evaluation without coaching. It compounds.

Coaching the Coach

There's a second layer that comes up all the time: someone brings you an A3 that their team member wrote. They want to know what you think.

I built the coach to recognize this scenario. When a user says “my staff member brought me an A3” or “my team leader thinks the root cause is training,” the AI treats it as a meta-coaching opportunity. Don't coach the underlying problem. Coach their coaching.

If I say “I think the root cause section is weak,” a coach who evaluates would say “I agree, here's why.” A coach who coaches the coach asks: “What questions did you ask the team when they presented it? What did you learn about their thinking?”

That's the higher-leverage move. Help the manager resist the same righting reflex the AI itself had to learn to resist. Most CI directors have nobody coaching their coaching. That's the real gap in most organizations — not the A3 template, not the training class. It's that the people developing others have nobody developing them.

How I Fixed It (Using A3 Thinking on the AI Itself)

The AI had the same failure mode as a new A3 coach: it knew too much and showed it too early. An uploaded A3 gave it deep context about the document — but told it nothing about the person's thinking.

I treated the behavior as a problem to solve:

Current condition: User uploads an A3 in Coach mode, AI produces a 400-word evaluation with a question tacked on.

Target condition: AI asks what the user sees before sharing what it sees. 80-150 words. One question.

Root cause: The coaching instructions had been written for one mode — basically Tell Me mode. They said “deliver feedback one section at a time.” So it did. The specific instruction overrode the general rule to ask questions first.

Countermeasure: Rewrite the instructions to treat Coach mode A3 review as a separate behavior. Add a sample exchange showing the wrong response and the right one. Reinforce the rule: “Document uploads are not rich context about the person's thinking.”
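
The target condition above is concrete enough to check mechanically. Here is a minimal Python sketch, assuming the response is plain text and that counting question marks is a fair proxy for "one question." The function name `check_target_condition` and its defaults are hypothetical, made up for illustration. They are not part of the Lean Hospitals Coach.

```python
import re


def check_target_condition(response: str, min_words: int = 80, max_words: int = 150) -> list[str]:
    """Return a list of ways the response misses the target condition.

    Assumptions (illustrative only): words are whitespace-separated tokens,
    and each run of '?' characters marks one question.
    """
    problems = []

    word_count = len(response.split())
    if word_count < min_words:
        problems.append(f"too short: {word_count} words (target {min_words}-{max_words})")
    elif word_count > max_words:
        problems.append(f"too long: {word_count} words (target {min_words}-{max_words})")

    # Count question marks, treating "??" as a single question.
    question_count = len(re.findall(r"\?+", response))
    if question_count != 1:
        problems.append(f"expected exactly 1 question, found {question_count}")

    return problems
```

An empty list means the response landed inside the target. A check like this cannot tell whether the question is a good one. It only tells you whether the response has the right shape, which is where a human still has to read the output.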

The first iteration fixed the big problem. The AI stopped dumping evaluations. But it introduced a smaller one: my own sample response had included a brief evaluative observation — meant to feel warm and human. The AI imitated it and expanded it. Three observations instead of one. It learned from my example, but learned the wrong lesson.

This is worth understanding about how these AI tools work: the specific examples you give them shape their behavior more than the rules you write. Tell the AI “don't evaluate” but show it an example that evaluates, and it follows the example almost every time.

Same as in any organization. Leaders who evaluate A3s in front of their team are training their team to evaluate. Leaders who ask questions are training their team to think.

I tightened the example, removed the evaluative observation, tested again. Clean.
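
Because the example carries more weight than the rule, it can help to lint the sample exchange itself before testing. What follows is a rough Python sketch, not how I actually caught the problem. The marker list and the function name `evaluative_statements` are assumptions for illustration, and the "warm" sample in the usage note is invented, not my original prompt text.

```python
import re

# Hypothetical marker words that often signal a judgment; tune for your own prompts.
EVALUATIVE_MARKERS = ("strength", "weak", "solid", "impressive", "well done", "good job")


def evaluative_statements(sample_response: str, markers=EVALUATIVE_MARKERS) -> list[str]:
    """Return the non-question sentences in a sample response that contain a marker."""
    # Split after ., ! or ? followed by whitespace; crude but enough for short examples.
    sentences = re.split(r"(?<=[.!?])\s+", sample_response.strip())
    flagged = []
    for sentence in sentences:
        if sentence.endswith("?"):
            continue  # questions are what the coach is supposed to produce
        lowered = sentence.lower()
        if any(marker in lowered for marker in markers):
            flagged.append(sentence)
    return flagged
```

Run against an invented sample like "The data in your current condition is impressive. What section are you most confident about?", it flags the first sentence and leaves the question alone. A keyword list will miss plenty of judgments and flag some innocent sentences, so treat anything it returns as a prompt to reread the example, not as a verdict.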

I spent more time defining the problem and understanding why the AI behaved this way than I did writing the fix. The fix itself was a few sentences. That's always the case with good A3 thinking.

An evaluation report is not a substitute for coaching. Not for your team. Not for your A3s. Not for the person trying to develop as a problem solver.

If you want to get better at A3 thinking, or at coaching others through it, try the Lean Hospitals Coach. If you work outside healthcare, the Lean Coach works the same way. Upload an A3 and see what happens. It won't hand you a report. It'll ask you a question.

Mark Graban
Mark Graban is an internationally recognized consultant, author, professional speaker, and podcaster with experience in healthcare, manufacturing, and startups. Mark's latest book is The Mistakes That Make Us: Cultivating a Culture of Learning and Innovation, a recipient of the Shingo Publication Award. He is also the author of Measures of Success: React Less, Lead Better, Improve More, Lean Hospitals and Healthcare Kaizen, and the anthology Practicing Lean, previous Shingo recipients. Mark is also a Senior Advisor to the technology company KaiNexus.

1 COMMENT

  1. This really resonated with me, especially the idea that a report can’t replace the value of a coaching conversation. The shift you described—from evaluating to asking better questions—feels like a small change but clearly has a huge impact on how people develop their thinking. I also found the “coaching the coach” concept interesting because it highlights a gap that isn’t talked about enough in organizations. It makes me curious to see how far AI can go in actually building those coaching habits instead of just delivering polished feedback.
