A shared-services team that processes high volumes of routine requests for several business units came to us after a first attempt at automation had quietly failed. Someone had built a bot for a repetitive, rules-based task — copying fields between two systems and applying a few eligibility checks. It worked in the demo. Then a form changed, the bot kept going against the old layout, and for the better part of a week it wrote plausible-looking nonsense into a live system before anyone noticed. The clean-up cost more than the manual work it had replaced.
The demo is the easy ten percent. The exceptions are the job.
The challenges we had to solve
- The underlying process was not standardised — three analysts did the same task three slightly different ways, so there was no single rule set to automate.
- The original bot had no concept of an exception; anything it did not expect, it guessed at, silently.
- There was no audit trail, so when something went wrong nobody could see what the bot had touched or why.
- The team had lost trust in automation and needed a reason to believe a second attempt would be different.
How we approached it
We did not write a line of automation for the first few weeks. We sat with the analysts, documented the task as it was actually done, and agreed one standard way to do it — including what counts as a normal case and what does not. Automating a messy process only gives you a faster mess, so the standardising was the real work. Only once the rules were written down and agreed did we start building, and we built for the unhappy path first: every input the bot could not handle with confidence goes to an exception queue for a person, rather than being forced through.
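As an illustration of building for the unhappy path first, here is a minimal sketch in Python. It is not the production bot: the field names (`customer_id`, `amount`, `region`), the set of known regions, and the `check` and `route` helpers are all hypothetical, standing in for whatever rule set the analysts actually agreed.

```python
from dataclasses import dataclass, field

# Hypothetical required fields and values; the real ones come from the agreed standard.
REQUIRED_FIELDS = ("customer_id", "amount", "region")
KNOWN_REGIONS = {"north", "south"}


@dataclass
class Outcome:
    processed: list = field(default_factory=list)
    exceptions: list = field(default_factory=list)  # (item, reason) pairs for a person


def check(item):
    """Return None for a normal case, otherwise the reason it is not one."""
    for name in REQUIRED_FIELDS:
        if name not in item or item[name] in ("", None):
            return f"missing field: {name}"
    if item["region"] not in KNOWN_REGIONS:
        return f"unknown region: {item['region']}"
    amount = item["amount"]
    if isinstance(amount, bool) or not isinstance(amount, (int, float)) or amount <= 0:
        return "amount is not a positive number"
    return None


def route(items):
    """Process only the items that pass every check; queue the rest, never guess."""
    outcome = Outcome()
    for item in items:
        reason = check(item)
        if reason is None:
            outcome.processed.append(item)
        else:
            outcome.exceptions.append((item, reason))
    return outcome
```

The design choice worth noting is that `check` is written before any processing logic exists, and that the default for anything unrecognised is the exception queue with a stated reason, so a person can see why the item was held back.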
Every action the bot takes is logged (what it read, what it wrote, which rule fired) so the team can reconstruct any item after the fact. We set it to fail loudly and stop, not to improvise, the moment a screen looks different from what it expects. We measure against a target the team set themselves: the share of items that flow through untouched, watched week by week so that a rising exception rate is an early warning that something upstream has changed.
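The three safeguards in this paragraph (the audit log, the stop on an unexpected screen, and the weekly untouched-share measure) can be sketched in a few lines of Python. Everything named here is an assumption made for illustration: the `EXPECTED_LABELS` layout, the `LayoutChanged` exception, the shape of the audit entry, and the target passed to `rate_alerts` are hypothetical, not the team's actual values.

```python
import json
from datetime import datetime, timezone

# Hypothetical layout the bot was built against.
EXPECTED_LABELS = ["Customer ID", "Amount", "Region"]


class LayoutChanged(Exception):
    """Raised when the screen no longer matches the expected layout."""


def assert_layout(labels_on_screen):
    # Fail loudly and stop: any difference from the expected layout halts the run.
    seen = list(labels_on_screen)
    if seen != EXPECTED_LABELS:
        raise LayoutChanged(f"expected {EXPECTED_LABELS}, saw {seen}")


def audit_entry(item_id, read, wrote, rule):
    # One JSON line per action: what was read, what was written, which rule fired.
    return json.dumps({
        "at": datetime.now(timezone.utc).isoformat(),
        "item": item_id,
        "read": read,
        "wrote": wrote,
        "rule": rule,
    }, sort_keys=True)


def straight_through_rate(untouched, total):
    # Share of items that flowed through without a person touching them.
    return untouched / total if total else 0.0


def rate_alerts(weekly_counts, target):
    """weekly_counts holds (week, untouched, total); return the weeks below target."""
    return [week for week, untouched, total in weekly_counts
            if straight_through_rate(untouched, total) < target]
```

In use, `assert_layout` would run before the bot reads anything from a screen, `audit_entry` would be appended to a log after every write, and `rate_alerts` would be checked once a week against whatever target the team has set.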
Where it stands
The routine majority now passes through without a person touching it, and the analysts spend their time on the exceptions — the cases that genuinely need judgement. When a form changed again recently, the bot stopped and flagged it instead of carrying on; the fix took an afternoon, not a week of clean-up. The team’s measure of success is no longer how much the bot does, but how quietly it does it and how fast they hear about it when something breaks.