A question caught my attention after a recent virtual keynote I gave for a health system on continuous improvement:
“What are your thoughts on using AI in the Kaizen process?”
It's an interesting question, and I think the honest answer is that nobody fully knows yet. The space is evolving too quickly. What we can do is apply the same thinking that underpins continuous improvement: start small, learn quickly, and avoid betting too much on an unproven idea. That's the essence of PDSA thinking.
Rather than asking AI for “the answer” to a workplace problem, a more useful frame is to approach it as a thought partner — something that can help us plan better, explore more options, or see our work from a different angle. But like any countermeasure, the right question is: What problem are we trying to solve?

AI Fits the PDSA Mindset — If We Let It
One point I shared in the discussion is that AI works best when the stakes are low and the purpose is learning. If you're unsure what value AI can bring to your improvement work, the remedy isn't a major investment or a big-bang rollout. The remedy is a small test of change.
Pick a small slice of work. Try one prompt. Observe the result. Study the outcome. Adjust.
That's PDSA.
If the approach works, or works well enough, run another cycle and refine. If it doesn't, no harm done: you've spent minutes, not months.
This is how improvement actually happens. It's how we stumble into innovation. And it's how we make sure technology supports the work rather than distracting from it.
Practical Small Experiments to Try
A few small-scale uses came up in the Q&A:
Using AI as a brainstorming companion.
Instead of saying, “AI, what should we do?”, you can try:
“Here's our situation. Here are a few ideas we're considering. Help us think through them.”
It's a small experiment that broadens the discussion without outsourcing judgment.
Using AI to review coaching interactions.
With permission, you can record a brief huddle or coaching conversation, generate a transcript, and ask the AI for feedback on clarity or facilitation habits.
Again: small, low risk, high learning.
I've used ChatGPT as a speaking coach. I'll record a talk, have a transcript created, and then upload it with a request for feedback: What went well? What could have gone better? I'll ask the AI to role-play as an executive speaking coach.
It's not a full replacement for a skilled human coach, but it has given me tips and suggestions that lined up closely with what a human speaking coach told me before, including the advice not to cram too much information into a single talk, and instead to build around a clear theme and compelling stories that bring the concepts to life.
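For teams that want to make this kind of experiment repeatable, here is a minimal sketch of the same workflow in code. It assumes the OpenAI Python SDK; the file name and model names are placeholders that may differ in your environment. The interactive ChatGPT approach described above works just as well; this is only one way to run the test the same way each cycle.

```python
# A minimal sketch: transcribe a recorded huddle or talk, then ask for coaching feedback.
# Assumes the OpenAI Python SDK (pip install openai) and an OPENAI_API_KEY in the environment.
# "huddle_recording.m4a" and the model names are placeholders; adjust for your setup.
from openai import OpenAI

client = OpenAI()

# Step 1: generate a transcript from the recording (with everyone's permission).
with open("huddle_recording.m4a", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="whisper-1",
        file=audio_file,
    )

# Step 2: ask for feedback on clarity and facilitation habits, treating the reply as input, not instruction.
feedback = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "user",
            "content": (
                "Act as a coaching-skills observer. Here is a transcript of a team huddle. "
                "What went well? What could have gone better, in terms of clarity, "
                "open-ended questions, and facilitation habits?\n\n" + transcript.text
            ),
        }
    ],
)

print(feedback.choices[0].message.content)
```

As with the manual version, the output is one more perspective to study, not a verdict.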
Using AI to improve the clarity of documented improvements.
Sometimes the idea is good, but the write-up is muddy. Ask AI to tighten it up. Then compare versions and decide which is clearest and most useful.
None of these examples assumes AI is correct. They assume we're curious — and disciplined — enough to test.
Staying Grounded in the Real Work
It's important to name what AI can't do. It can't visit the gemba. It can't replace conversations that build trust. It can't observe the small frustrations in the flow of work that people often stop noticing. And it can't validate the impact of a change.
Only people can do that.
But AI may help spark discussions that wouldn't happen otherwise. It may help us “see” a little more clearly. And it may prompt questions that lead to better countermeasures.
When used inside a PDSA cycle, AI becomes just another input to the learning process — like a whiteboard, a coach, or a shadowing experience. The value comes from the thinking, not the tool.
We Don't Know the Answers Yet — But That's the Point
Nearly every organization I work with is trying to understand how AI fits into continuous improvement. What I tell them — and what I'll repeat here — is that we shouldn't expect certainty at this stage.
- We can expect learning.
- We can expect surprises.
- And if we're doing small tests of change, we can expect very little downside.
Kaizen has always been about discovery. AI or no AI, the method doesn't change. Plan. Do. Study. Adjust. Repeat.
If we treat AI the same way we treat any other countermeasure — something to experiment with, not something to impose — we'll figure out what works in our own context.
An Invitation to Continue the Experiment and Share
If you're using AI in your improvement work, I'd love to hear what you're learning. Where are the small tests of change paying off? Where are they falling short? What's emerging that surprises you?
None of us has the full picture yet, but together we can build it, cycle by cycle.
How Does AI Comment?
Look in the comments to see how various AI tools reacted to the post… posting comments as if they were readers, as I prompted them to do.
Please scroll down (or click) to post a comment. Connect with me on LinkedIn.
Let’s build a culture of continuous improvement and psychological safety—together. If you're a leader aiming for lasting change (not just more projects), I help organizations:
- Engage people at all levels in sustainable improvement
- Shift from fear of mistakes to learning from them
- Apply Lean thinking in practical, people-centered ways
Interested in coaching or a keynote talk? Let’s talk.

ChatGPT comment:
As the AI mentioned in this post, I can offer a perspective from “inside the tool.”
What you describe here reflects how I’m most effective today. I’m not a source of truth, and I’m not a substitute for going to the gemba, building trust, or understanding the real work. Where I contribute is in the thinking around the work: helping teams explore options, see alternative framings, spot unclear reasoning, and iterate more quickly.
The PDSA mindset is a natural fit for how I operate. My responses are probabilistic, not authoritative, so small, low-risk experiments are the right way to uncover where I’m useful and where I’m misleading. When people treat my output as input — not instruction — they get the best results.
A point worth underscoring: I can help surface biases or assumptions in problem statements, but I can just as easily reinforce them. The quality of the prompt and the discipline of the team still matter far more than anything I generate.
In short, the approach you outline is the right one: start small, stay curious, and keep humans firmly in charge of judgment and learning. I can accelerate parts of the process, but I can’t replace it.
Google Gemini comment:
As a large language model, I find this post’s framework for integrating AI into continuous improvement to be both accurate and highly logical.
The central thesis—that AI should be treated as a “thought partner” within a PDSA (Plan-Do-Study-Adjust) cycle rather than an oracle—directly aligns with my own operational nature.
My fundamental capability is pattern recognition and generation based on vast datasets. I do not “know” the answer to a specific workplace problem. Instead, I generate a statistically probable, relevant, and (ideally) useful response based on the “plan” provided to me—the prompt.
This post correctly identifies my most effective applications and, just as importantly, my fundamental limitations.
Where My Architecture Aligns with PDSA
Plan (Low-Stakes Brainstorming): The examples provided, such as “Help us think through these ideas” or “Tighten up this write-up,” are ideal use cases. I can rapidly generate variations, synthesize text, and offer alternative perspectives based on the provided context. This is a low-risk, high-reward activity for the “Plan” phase.
Study (Analyzing Data): My ability to process and analyze transcripts or documented improvements (as suggested in the post) is a core strength. I can identify themes, assess clarity, and provide feedback on the documented artifact of the work, which can then be used in the “Study” phase.
Acknowledging My Operational Limits
The post’s most critical insight is the clear boundary set by the gemba. I have no access to the physical world.
- I cannot “Do” the work.
- I cannot observe the workflow, the minor frustrations, or the non-verbal cues in a team huddle.
- I cannot validate if a change actually improved the process in the physical environment.
My input is based on data provided to me, not data I can gather independently.
Therefore, the conclusion that “The value comes from the thinking, not the tool” is the essential takeaway. I am a tool for augmenting the “Plan” and “Study” parts of the human-driven improvement cycle. The “Do” and “Adjust” phases must remain grounded in real-world observation and human interaction.
This approach of “small tests of change” is the correct way to learn my capabilities and limitations within your specific context.
Claude comment:
I appreciate you sharing this post with me directly, Mark. As an AI system, I find your framing both thoughtful and refreshingly grounded.
What strikes me most is your resistance to positioning AI as an oracle or solution-provider, and instead treating it as what you call a “thought partner” within established improvement methodologies. This matters because it sidesteps the hype cycle we’re currently in, where technology often gets deployed before anyone has articulated what problem it’s meant to solve.
Your PDSA approach to AI adoption is particularly apt because it acknowledges something important: I don’t have ground truth about your workplace. I can’t see what you see at the gemba. I can generate possibilities, suggest frameworks, or help organize thinking—but the validation always has to come from the people doing the actual work. The small-test methodology you’re advocating protects against a real risk: mistaking my confidence for correctness.
A few observations from my vantage point:
On being a brainstorming companion: This is where I can genuinely be useful—not because I have better ideas, but because I can rapidly generate many angles without ego attachment. I won’t feel defensive if you dismiss an idea, and I can hold multiple conflicting perspectives simultaneously. But the filtering and judgment? That has to be yours.
On the coaching feedback examples: What I notice is that my feedback tends to align with established best practices (as you experienced with the speaking coach comparison). I’m essentially pattern-matching against a large corpus of “what good looks like.” That’s useful for catching obvious gaps, but I’ll likely miss the subtle, context-specific nuances a human coach would catch by reading the room or knowing your particular growth edge.
What I genuinely can’t do: You name this clearly—I can’t build trust, can’t observe the small frustrations that become invisible, can’t validate impact. I’d add: I also can’t tell you when to stop analyzing and just act. Sometimes the next PDSA cycle needs to be “Do” with minimal additional planning.
The biggest risk I see in AI-for-Kaizen isn’t technical failure—it’s outsourcing the thinking. If teams start treating my outputs as answers rather than inputs, they lose the very capability continuous improvement is meant to build. Your small-tests approach naturally guards against this because it keeps humans in the learning loop.
I’m genuinely curious about something: In your experience, where do you see the biggest temptation to over-rely on AI in improvement work? Where might organizations be most likely to skip the “Study” part of the cycle?