There is a specific kind of anxiety that operations directors at established Belgian companies feel when the board asks about AI. Not excitement. Anxiety.
The anxiety has a name: the thing that works.
Your ERP runs your order flow. Your CRM has ten years of customer data. Your finance team has a spreadsheet model that nobody fully understands but everyone trusts. You have workflows that took years to tune. You have staff who know exactly how to navigate the edges.
Nobody wants to break any of that.
And yet the pressure to "do something with AI" keeps building — from the board, from competitors, from the LinkedIn posts about companies that automated their back-office and cut headcount by 40%.
The result is paralysis, or worse: a rushed pilot that runs fine in a sandbox and then causes a quiet mess in production.
This post is about a different approach.
Why most AI integrations break things
The failure mode is almost always the same. A team — internal or external — builds an AI feature in isolation. It works beautifully in the demo. It handles the happy path. The model returns confident outputs. Everyone is impressed.
Then it hits production.
Production has edge cases that no sandbox anticipated. Data formats that drift. Input that users don't sanitise. Downstream systems that expect exact string formats that the LLM occasionally reformats. Error handling that was never written because the demo never errored.
The integration doesn't break loudly. It breaks quietly — a wrong address here, a miscategorised document there, a support ticket that takes twice as long to resolve because the AI summary was confidently wrong. The kind of degradation that is hard to measure until it's already a habit.
The root cause is almost never the AI model itself. It is the integration layer — the code that connects the AI output to the rest of the system. That layer was written fast, tested superficially, and monitored not at all.
The two principles that change this
1. Integrate at the edge, not at the core
The safest AI integrations are ones that sit alongside existing workflows rather than inside them.
An AI that drafts a document for a human to review is low risk. The human is still the checkpoint. The existing system still processes the approved output. If the AI drafts something wrong, the cost is a few seconds of review time and a correction, not a corrupted record.
An AI that writes directly to your ERP is high risk — even if it is technically correct 99% of the time. The 1% needs to be caught somewhere, and if the human checkpoint was removed to save time, it won't be.
This sounds obvious. In practice, it gets ignored because the "interesting" version of the integration is the fully automated one. The edge-integration version feels less impressive in the demo. But it is the version that survives production.
Start at the edge. Add human checkpoints. Remove them only when you have real data showing the failure rate is low enough to justify it.
2. Instrument before you automate
The second principle is monitoring. Most AI integrations are deployed without any systematic observation of what the model actually does in production.
Before you automate a step, you need to know:
- How often does the model output something the downstream system can't handle?
- How often does a human override the AI suggestion?
- What is the distribution of input quality? Do edge cases come up daily or once a quarter?
Without this data, you are guessing at whether the integration is working. With it, you can make a genuine decision: the failure rate is 0.3%, which costs us X — that is acceptable at this volume. Or: the failure rate is 4%, which means we need a human checkpoint for document type Y.
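As a rough illustration of how the first two questions could be answered from logs, here is a minimal Python sketch. The `Observation` record and its field names are hypothetical; the only assumption is that each AI suggestion is logged together with whether the downstream system rejected it and whether a human overrode it.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Observation:
    """One logged AI suggestion (hypothetical record shape)."""
    doc_type: str
    rejected_downstream: bool  # downstream system could not handle the output
    human_overrode: bool       # reviewer changed the AI suggestion


def rates_by_doc_type(observations):
    """Return {doc_type: (rejection_rate, override_rate, sample_count)}."""
    totals, rejected, overridden = Counter(), Counter(), Counter()
    for obs in observations:
        totals[obs.doc_type] += 1
        rejected[obs.doc_type] += obs.rejected_downstream  # True counts as 1
        overridden[obs.doc_type] += obs.human_overrode
    return {
        doc_type: (rejected[doc_type] / n, overridden[doc_type] / n, n)
        for doc_type, n in totals.items()
    }
```

Splitting the rates per document type is a deliberate choice here: an overall average can hide the one category that needs to keep its human checkpoint.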
Instrumentation is not glamorous. But it is the difference between an AI integration you can trust and one you are quietly afraid of.
A practical sequence
Here is the sequence WDC uses with clients. It is not the only approach, but it has the advantage of being conservative where it matters.
Step 1 — Map the workflow, not the technology. Before touching any AI, document the current workflow: inputs, outputs, human decisions, failure modes, and downstream dependencies. This map is what tells you where an AI integration has value and where it introduces unacceptable risk.
Step 2 — Identify the highest-value, lowest-risk insertion point. Look for tasks that are repetitive, well-defined, and where errors are catchable before they propagate. Document classification is a classic example: an AI that suggests a category, which a human confirms in one click, before the document is routed. High value (saves sorting time), low risk (human catches errors), existing workflow unchanged.
Step 3 — Build the integration with explicit output contracts. Define exactly what the AI output must look like for the downstream system to accept it. Write validation. Log every output. Log every rejection. This is the integration layer that most pilots skip.
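A minimal sketch of what such an output contract might look like in Python, assuming the model is asked to return JSON with a `category` and a `total`. The category list, field names, and logger name are illustrative assumptions, not a description of any particular ERP or of WDC's tooling.

```python
import json
import logging
from decimal import Decimal, InvalidOperation

logger = logging.getLogger("ai_integration")  # hypothetical logger name

# Hypothetical contract: the only categories the downstream system accepts.
ALLOWED_CATEGORIES = {"utilities", "logistics", "raw_materials", "services"}


def _reject(raw_output, reason):
    """Log the rejection and return the (payload, reason) failure shape."""
    logger.warning("rejected model output (%s): %r", reason, raw_output)
    return None, reason


def validate_invoice_output(raw_output):
    """Return (payload, None) if the output meets the contract, else (None, reason)."""
    logger.info("model output: %r", raw_output)  # log every output
    try:
        data = json.loads(raw_output)
    except json.JSONDecodeError:
        return _reject(raw_output, "not valid JSON")
    if not isinstance(data, dict):
        return _reject(raw_output, "not a JSON object")

    category = data.get("category")
    if not isinstance(category, str) or category not in ALLOWED_CATEGORIES:
        return _reject(raw_output, "unknown category")

    try:
        total = Decimal(str(data.get("total")))
    except InvalidOperation:
        return _reject(raw_output, "total is not a number")
    if not total.is_finite() or total <= 0:
        return _reject(raw_output, "total must be a positive amount")

    return {"category": category, "total": total}, None
```

The point of the design is that nothing reaches the downstream system unless it passed the contract, and every rejection leaves a log line with a reason that can be counted later.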
Step 4 — Deploy with a shadow period. Run the AI in shadow mode — it processes real inputs and produces real outputs, but humans still do the task manually. Compare. Measure disagreement rate. Only switch to live mode once you have confidence data.
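Shadow mode can be sketched in a few lines of Python. The function names and the in-memory log below are hypothetical; in practice the log would be a database table or log stream. The assumption is that the human result is always the one the workflow uses, and the AI result is only recorded for comparison.

```python
def run_in_shadow(item, human_result, ai_classify, shadow_log):
    """Record the AI answer next to the human one; always return the human one."""
    try:
        ai_result = ai_classify(item)
    except Exception as exc:
        # A failure in the shadow path must never block the real workflow.
        ai_result = f"ERROR: {exc}"
    shadow_log.append((ai_result, human_result))
    return human_result


def disagreement_rate(shadow_log):
    """Fraction of shadow observations where the AI and the human differ."""
    if not shadow_log:
        raise ValueError("no shadow observations yet")
    disagreements = sum(1 for ai, human in shadow_log if ai != human)
    return disagreements / len(shadow_log)
```

Counting AI errors as disagreements is a conservative choice: a model that times out on a document is, from the workflow's point of view, a model that got it wrong.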
Step 5 — Remove checkpoints based on data, not faith. As the disagreement rate drops, consider removing checkpoints — but only the specific ones where the data supports it. Keep logs. Review periodically. The goal is not full automation. The goal is the right amount of automation for the risk profile of each task.
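One way to make "based on data, not faith" concrete is a per-task rule with two thresholds: a maximum disagreement rate and a minimum sample size. The Python sketch below is illustrative only; the default values (0.3% and 1,000 samples) are assumptions chosen for the example, and the right numbers depend on what an uncaught error costs for each task.

```python
def checkpoints_safe_to_remove(stats, max_rate=0.003, min_samples=1000):
    """Return the tasks whose checkpoint the data supports removing.

    stats maps a task name to (disagreements, total_observations).
    A task qualifies only if it has enough observations AND a low enough
    disagreement rate; both thresholds are hypothetical defaults.
    """
    required = max(min_samples, 1)  # also avoids dividing by zero
    return sorted(
        task
        for task, (disagreements, total) in stats.items()
        if total >= required and disagreements / total <= max_rate
    )
```

For example, with `{"supplier_invoice": (2, 1500), "contract": (60, 1500), "credit_note": (0, 40)}`, only `supplier_invoice` qualifies: `contract` disagrees too often, and `credit_note` has too few observations to say anything yet.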
The KMO reality
Belgian mid-market companies, the KMOs (small and medium-sized enterprises) in the 30–200 employee range, face a specific version of this challenge. They do not have a dedicated AI team. They do not have the luxury of a six-month pilot with a dedicated QA engineer. They have a busy operations team, a tight budget, and a board that wants results.
For these organisations, the right AI integration is almost never the most ambitious one. It is the one that is narrow enough to do well, conservative enough to be safe, and useful enough that staff actually use it rather than quietly working around it.
The biggest waste of money in KMO AI projects is not building the wrong model. It is building the right capability in the wrong place — integrated too deeply, too soon, with too little instrumentation — and watching the team lose trust in it within three months.
What to do before your next AI project
Before you commission any AI work, it is worth spending a day on three questions:
- What is the exact task? Not "improve our document processing" but "classify incoming supplier invoices by category and extract the total amount, to feed into the ERP approval queue."
- What happens when it is wrong? If the AI misclassifies an invoice, who catches it? How? What is the downstream consequence if no one catches it?
- How will you know if it is working? What metric will you look at, at what frequency, to confirm the integration is performing as expected in production?
If you cannot answer the third question, you are not ready to build yet.
WDC's AI Opportunity Assessment is designed exactly for this kind of pre-build clarity — mapping your workflows, ranking the AI projects worth doing, and classifying their risk level under the EU AI Act. Fixed price, three weeks, written output. If you would rather start with a conversation, book a 30-minute call.