Iowa Democratic Caucus 2020: “App Problem” or “System Problem”? Implications for Our Work?
Let's start today's post with Point #8 from The Toyota Way management philosophy:
“Use only reliable, thoroughly tested technology that serves your people and processes.”
Today's blog post isn't about politics or candidates — it's about process.
Initial reports described the inability to quickly (or accurately) tabulate votes as an “app problem.”
The company that developed the app was quickly thrown under the bus, but my initial reaction was that the problem probably can't be simply explained by blaming that company. What about training? What about backup plans? What risks or problems did they anticipate?
Yes, the app had many flaws (which included problems connecting to the central Iowa Democratic Party database), but somebody chose the vendor. Somebody helped define requirements or use cases for the vendor.
In a complex situation like this, there's not likely any single “root cause” to be found.
What's described as “an app problem” could be described, more broadly, as “a technology problem.”
It could also be described as a “system-design problem” (which would include backup plans that didn't work).
Even if the technology had functioned perfectly, there were also “training problems” and, more broadly, “planning problems.”
The situation could be described as a “procurement problem” or even “a management problem.” Who owns the system? Top party leaders (in this case, probably at the state level).
Here is one summary:
“An untested technology, novel reporting requirements, nearly a dozen competitors to tally across 1,600 precincts — what could go wrong? As the Iowa Democratic Party discovered Monday night, nearly everything.”
I'll repeat Toyota Way Principle #8 so you don't have to scroll up:
“Use only reliable, thoroughly tested technology that serves your people and processes.”
The caucus app was, per reports, “…quickly put together in the past two months and was not properly tested at a statewide scale.”
That's not an app problem, that's a system-design problem, a planning problem, and a management problem.
Why was there a rush? Good question. One factor involves the national DNC, which means they can't simply throw the Iowa state party under the bus (although the state leader did resign afterward):
“…the party decided to use the app only after another proposal for reporting votes — which entailed having caucus participants call in their votes over the phone — was abandoned, on the advice of Democratic National Committee officials.”
I haven't seen a good reason why this “plan” was put into motion just two months ago. They've known the date of the caucus for years.
This report showed other planning flaws:
“…people were struggling to even log in or download it in the first place. After all, there had never been any app-specific training for the many precinct chairs.”
The iOS and Android apps weren't simple installations from the Apple or Google app stores. Each was basically a “beta” or private app that required jumping through a number of hoops before you could even download it. That might be good for security, but it probably wasn't realistic to assume users could jump through those hoops. It was also a bad assumption that everybody had a smartphone.
“…as Dan Callahan, the chair in Buchanan County, put it, ‘Some of our chairs use flip phones.’”
A longer and more interesting way of saying it was from this editorial:
It's enough to make you wonder: Have these party officials ever been to a polling site or a caucus venue?
They are not pristine WeWorks with blazing fast internet connections and an army of Geek Squad workers on call. They are mostly high school gyms, nursing homes and church basements with cinder-block walls and horrible cellphone service. The people who work at them are volunteers, and many are — how can I put this delicately? — members of the generation that still refers to the TV remote as “the clicker.”
Did the people who built the app (and those who designed the system) ever “go and see” at “the gemba” (the actual workplace) as we'd say in the Lean lingo?
From another report:
“I got access to the app on Friday evening. And we were just given access to the app and told, you know, play around in there a little bit. And that was about as much training as we got.”
That's starting to sound outright irresponsible. Planning problem. Management problem.
In the event that the app didn't work (or if they couldn't use it), caucus leaders were told to just call in their results, but the phone bank was apparently understaffed (due to a probable assumption that the technology — and the system design as a whole — would be effective):
“Hold times stretched past 90 minutes…”
The lack of training and the lack of planning reared their heads in the call center:
“Soon, a backlog of calls developed inside the boiler room as volunteers struggled to answer questions related to the app and as precinct leader after precinct leader said they would instead plan to call in results later that night, after their caucus.
The volunteers answering phones had no official directive for how to adjust their plans as a result of the meltdown.”
One small positive is that there was some data collection being done during the day (before the reporting chaos really began):
“Paper signs hung from the wall of the room listing categories of phone calls. They included things like, “chairperson not present,” “delegate misallocation,” and “where is my caucus location?” Each had a handful of tally marks beneath the corresponding heading.
Keeping a tally of problems is exactly like the old quality improvement tool called a “check sheet.” That's one of the “seven basic Q.I. tools” that were a major part of the Total Quality Management movement (and these tools are taught to everyone at Toyota).
But volunteers said there were between 75 and 100 tally marks noted under the headline, “the app isn't working.”
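For readers who haven't used a check sheet, here is a minimal sketch of the idea in Python. The category names come from the report quoted above, but the call log and its counts are invented for illustration, and `tally` is just a hypothetical helper name.

```python
from collections import Counter


def tally(calls):
    """Count calls per category, most frequent category first."""
    return Counter(calls).most_common()


# Hypothetical call log: the categories are from the quoted report,
# the counts are made up purely for illustration.
calls = (
    ["the app isn't working"] * 5
    + ["chairperson not present"] * 2
    + ["where is my caucus location?"] * 1
)

# Print one row of tally marks per category, like the paper signs on the wall.
for category, count in tally(calls):
    print(f"{category:<30} {'|' * count} ({count})")
```

The point of the tool isn't the software. It's that the biggest problem category becomes visible while there's still time to react to it.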
“Our initial instructions were if someone was having problems with the app to tell them to just call in their results,” another volunteer said.
There's that bad assumption that the call center would be able to handle the resulting volume of calls. Maybe that should have been considered as a “worst-case scenario.”
As an engineer and a system designer, you have it drilled into your head that you have to be proactive and think about what could go wrong. It would have been prudent (if more costly) to staff the phone bank more heavily in case the technology crashed and burned.
With everything being so rushed (including a lack of testing and a lack of training), I would have planned for that “crash and burn” scenario.
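Here's a rough back-of-the-envelope sketch of what planning for that scenario might look like. The 1,600 precincts figure comes from the summary quoted earlier. Everything else (five minutes per call, a two-hour reporting window, 10% of precincts calling in under the “app works” plan) is my own assumption for illustration, not a number from any report. The calculation also ignores calls arriving in bunches, so real staffing would need to be higher.

```python
import math


def min_operators(calls, minutes_per_call, window_minutes):
    """Fewest operators needed to clear all calls within the window.

    Assumes calls are spread evenly over the window, which is optimistic.
    """
    total_minutes = calls * minutes_per_call
    return math.ceil(total_minutes / window_minutes)


PRECINCTS = 1600        # from the quoted summary
MINUTES_PER_CALL = 5    # assumption
WINDOW_MINUTES = 120    # assumption

# "App works" plan: assume only 10% of precincts phone in (160 calls).
planned = min_operators(160, MINUTES_PER_CALL, WINDOW_MINUTES)

# "Crash and burn" scenario: every precinct phones in.
worst_case = min_operators(PRECINCTS, MINUTES_PER_CALL, WINDOW_MINUTES)

print(f"Planned case: {planned} operators")
print(f"Worst case:   {worst_case} operators")
```

Under these made-up numbers, the worst case needs roughly ten times the staffing of the planned case (67 operators versus 7). The exact figures don't matter; the gap between the two scenarios is the thing to plan around.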
There's also a complicating factor that I wouldn't have anticipated: “internet trolls” publishing the hotline phone number and then flooding it with calls.
This made the wait times even worse:
“The phone number to report Iowa caucus results was posted on a fringe internet message board on Monday night along with encouragement to “clog the lines,” an indication that jammed phone lines that left some caucus managers on hold for hours may have in part been due to prank calls.”
Once it became clear that many, if not most (or all), precincts were going to be calling in results, the “system” that had been put in place seemed to also be very poorly designed.
Again, this wasn't just an “app problem” — it was a system design problem, a planning problem, and a management problem.
Implications and Lessons
It's easy for me to sit back and second-guess what happened in Iowa. I think it's more constructive to think about how I go about the design and implementation of new processes, whether new technology is involved or not. I can think of situations in the last year where I was able to coach others to:
- Go to the gemba and understand the current state, along with the people who are doing the work
- Don't choose a technology just because it seems cool (or because the organization is already paying for it) — follow the Toyota principle of using reliable, tested technology that serves your people and your process
- Start a new process with a backup plan and “what could go wrong?” in mind
- Check key assumptions: “What must be true for this to work?” is a great question
- Be ready to adjust when something goes wrong… because something will go wrong, because no system is perfect (and the people designing systems are never perfect)
I'm not perfect, but I've gotten better at those things over time. I've had good coaches.
I wonder if you see parallels to work that you or your organization have done. I read about similar problems with Electronic Health Record (EHR) systems in healthcare — the apps can be hard to use, training budgets are kept to the bare minimum, bad assumptions are made, and the rollout is rushed.
People involved in the Iowa debacle claim to be looking to learn from their mistakes. I hope that's the case — and we can learn from those mistakes, as well. What can you do differently in your work?