The pressure to add AI to operations rarely comes from operations itself. It comes from a board slide, a competitor’s announcement, a vendor demo that ran a little too smoothly. So the question on the table becomes “where can we use AI”, which is the wrong question and the reason so many of these efforts quietly stall.

The better starting point is a problem you can already describe without mentioning AI at all. A report your team rebuilds by hand every Monday. An approval queue that backs up whenever one person is on leave. A decision that depends on three systems that do not talk to each other. Name the problem first. The technology, if it belongs anywhere, comes second.

Start from the problem, not the model

Applied AI works the same way any other tool does. It earns its place when it removes a real cost — time, errors, delay, or the strain of judgement work that falls on too few people — and it does not when it is added because the category felt unavoidable.

A useful test, before anything is built: write down what the process looks like today, who touches it, and where it actually hurts. If you cannot fill that page without reaching for the word “AI”, the work is not ready. The clearest projects can be explained to a sceptical manager in two sentences, and the payoff is obvious enough that nobody needs a chart to believe it.

Data readiness decides most of it

The factor that quietly determines success is rarely the model. It is the state of the data the work depends on. AI applied to inconsistent, undocumented, or contested data does not fix the mess — it industrialises it, producing wrong answers faster and with more confidence than before.

Before committing, it is worth being honest about a few questions. They are unglamorous, and they decide the outcome.

  • Is the data in one place, or stitched together from systems that disagree with each other?
  • Does someone own its accuracy, or has nobody trusted these numbers for years?
  • Are the records consistent enough that two people would read them the same way?
  • Can you explain how a figure was produced when someone challenges it?
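
The first and third questions can be checked directly before any model is involved. The sketch below is one way to do that, assuming the same field can be exported from two systems as a mapping from record ID to value. The names `check_readiness`, `crm`, and `billing` are hypothetical, and the data is invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class ReadinessReport:
    shared_keys: int      # records present in both systems
    only_in_a: int        # records the second system does not hold
    only_in_b: int        # records the first system does not hold
    disagreements: int    # shared records whose values differ
    missing_values: int   # shared records with an empty value on either side

    @property
    def disagreement_rate(self) -> float:
        return self.disagreements / self.shared_keys if self.shared_keys else 0.0


def check_readiness(system_a: dict, system_b: dict) -> ReadinessReport:
    """Compare one field as held by two systems, keyed by record ID."""
    shared = system_a.keys() & system_b.keys()
    empty = (None, "")
    # A value missing on one side only is also counted as a disagreement.
    missing = sum(1 for k in shared if system_a[k] in empty or system_b[k] in empty)
    differing = sum(1 for k in shared if system_a[k] != system_b[k])
    return ReadinessReport(
        shared_keys=len(shared),
        only_in_a=len(system_a.keys() - system_b.keys()),
        only_in_b=len(system_b.keys() - system_a.keys()),
        disagreements=differing,
        missing_values=missing,
    )


# Hypothetical data: customer status as recorded in a CRM and a billing system.
crm = {"C1": "active", "C2": "closed", "C3": "active", "C4": "active"}
billing = {"C1": "active", "C2": "active", "C3": None, "C5": "active"}

report = check_readiness(crm, billing)
print(report)
print(f"disagreement rate: {report.disagreement_rate:.0%}")
```

A high disagreement rate does not say which system is right. It says that somebody has to own that answer before anything is built on top of the data.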

Honest candidates, and the hype

Set against the noise, the genuinely good use-cases in operations are narrower than the marketing suggests, and more durable. They tend to fall into three groups.

  • Automating repetitive judgement: classifying, routing, and matching the high-volume, low-ambiguity decisions that currently consume skilled people’s attention
  • Decision support: drafting, summarising, and flagging exceptions so a person decides faster, with the person still deciding
  • Surfacing what matters: pulling the signal out of records, tickets, and documents that nobody has the hours to read in full

The work that does not fit this shape (open-ended judgement with no clear right answer, decisions that carry legal or safety weight, anything where a confident wrong answer is worse than no answer) is where the hype runs ahead of what the tools can responsibly do. That is not a permanent verdict. It is a reason to wait until the case is real rather than aspirational.

Keep a person in the loop

The most reliable applied-AI work in operations does not replace the person making the call. It does the gathering, the sorting, and the first pass, then hands a clearer picture to someone accountable for the outcome.

This is not caution for its own sake. It is what keeps the system correctable. When a person reviews the edge cases, you find out where the tool is wrong before it becomes a habit — and you keep the institutional knowledge that lets you tell good output from plausible nonsense.
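
As a minimal sketch of that hand-off, assume the tool reports a confidence score between 0 and 1 with each suggestion. The names `Suggestion`, `route`, and `record_decision`, the two queue labels, and the 0.85 threshold are all assumptions made for illustration, not a recommended setting.

```python
from dataclasses import dataclass

REVIEW_THRESHOLD = 0.85  # assumed cut-off; tune it against real override data


@dataclass
class Suggestion:
    item_id: str
    label: str         # the tool's first-pass suggestion
    confidence: float  # score between 0 and 1 reported by the tool


def route(s: Suggestion, threshold: float = REVIEW_THRESHOLD) -> str:
    """Pick the queue for an item; a person stays accountable in both."""
    return "spot_check" if s.confidence >= threshold else "full_review"


def record_decision(s: Suggestion, final_label: str, log: list) -> None:
    """Store the reviewer's decision next to the suggestion so errors stay visible."""
    log.append({
        "item_id": s.item_id,
        "suggested": s.label,
        "final": final_label,
        "overridden": final_label != s.label,
    })


# Hypothetical tickets with the tool's suggested handling.
queue = [Suggestion("T1", "refund", 0.97), Suggestion("T2", "escalate", 0.55)]
for s in queue:
    print(s.item_id, "->", route(s))

# A reviewer disagrees with the second suggestion; the override is recorded.
log = []
record_decision(queue[1], final_label="refund", log=log)
print(log)
```

Nothing in this sketch acts on a suggestion without a person seeing it. The threshold only decides how much of that person's attention each item gets.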

How to tell whether it actually helped

Decide how you will measure the result before you build, using the same numbers you would have used to judge any process change. Time taken end to end. Error and rework rates. The size of the backlog. How often the output is overridden by the people who rely on it — a number that tells you more than any accuracy score from a demo.
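
These numbers are cheap to compute once each case is logged. The sketch below assumes every case is recorded with its end-to-end time in hours and two flags, one for rework and one for whether a person overrode the output. The field names and the sample data are hypothetical.

```python
from statistics import median


def summarise(cases: list) -> dict:
    """Summarise logged cases using ordinary process metrics."""
    if not cases:
        raise ValueError("no cases to summarise")
    n = len(cases)
    return {
        "median_hours": median(c["hours"] for c in cases),
        "rework_rate": sum(c["reworked"] for c in cases) / n,
        "override_rate": sum(c["overridden"] for c in cases) / n,
    }


# Hypothetical pilot log: four cases handled with the tool in place.
pilot_cases = [
    {"hours": 2.0, "reworked": False, "overridden": False},
    {"hours": 3.0, "reworked": True, "overridden": True},
    {"hours": 5.0, "reworked": False, "overridden": True},
    {"hours": 4.0, "reworked": False, "overridden": False},
]

summary = summarise(pilot_cases)
print(summary)
```

The same function can be run over cases handled the old way, with the override flag left false, so both sides are judged on identical definitions.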

Run it alongside the existing way of working for long enough to compare like with like, including the weeks when the data is messy and the volume spikes. A pilot that only shines on clean inputs has not been tested; it has been flattered. Give the comparison enough running time to survive a full cycle of real conditions before anyone calls it a success.
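
One way to keep the comparison honest is to look at it week by week instead of as a single average. The sketch below assumes a lower-is-better metric, such as an error rate, recorded per week for both ways of working. The function name, the week labels, and the figures are invented for illustration.

```python
def compare_by_week(baseline: dict, pilot: dict) -> dict:
    """Compare a lower-is-better metric over the weeks both runs cover."""
    weeks = sorted(baseline.keys() & pilot.keys())
    if not weeks:
        raise ValueError("the two runs share no weeks")
    return {
        "weeks_compared": len(weeks),
        # Weeks where the pilot did worse than the existing process.
        "weeks_pilot_worse": [w for w in weeks if pilot[w] > baseline[w]],
        "mean_baseline": sum(baseline[w] for w in weeks) / len(weeks),
        "mean_pilot": sum(pilot[w] for w in weeks) / len(weeks),
    }


# Hypothetical weekly error rates; W03 stands in for a messy, high-volume week.
baseline_errors = {"W01": 0.08, "W02": 0.09, "W03": 0.15, "W04": 0.08}
pilot_errors = {"W01": 0.03, "W02": 0.04, "W03": 0.21, "W04": 0.03}

result = compare_by_week(baseline_errors, pilot_errors)
print(result)
```

In this invented data the pilot wins on average and still loses the one difficult week, which is exactly the pattern an average on its own would hide.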

If the honest answer after that is that nothing measurable improved, the right move is to stop — and that is a successful result, not a failure. Knowing where these systems break, and where they do not belong, is the part that took us the longest to learn. It is also the part that saves you the most.
