Why Companies Are Spending More Money on AI and Seeing Little Return

Jun 14

Recently we've been helping a client — a leading global data infrastructure company — bring AI into their HR function. This isn't a company that's skeptical of AI; it's the opposite. There's executive support, real budget, strong technical talent, and a genuine appetite to transform operations around it. Which is what made what happened next so interesting: very soon after we started, I noticed that instead of simplifying work, AI was making some processes more complex.

On reflection, this is almost human nature. Hand someone a tool that feels powerful and unconstrained and they'll want to use it on everything, because it's fresh and the possibilities feel endless — and in the process they lose sight of the problem they actually set out to solve. Many business teams have spent years feeling boxed in by their systems of record — HRIS, ERP, CRM, ATS — rigid platforms full of guardrails, approval flows, and data models that are painful to change. So when AI shows up, it feels like a release: a flexible layer above the old systems that can answer any question, automate any process, generate any report on demand. In theory that's empowering. In practice it gets expensive and unfocused fast.

When AI Creates More Work Instead of Less

The first place this shows up is scope. AI projects tend to start with ambition rather than operational discipline.

Take a recruiting team with a simple, measurable goal: cut the time recruiters spend manually reviewing candidates. Practical. But once AI is in the room, the project rarely stays there. Now it should also write job descriptions, score candidates, generate interview questions, recommend comp bands, draft hiring-manager feedback, predict fit, and build workforce-planning dashboards. Each idea sounds reasonable on its own. Together they turn "summarize resumes" into a full redesign of the recruiting operating model — more stakeholders, more integrations, more compliance questions, more testing.

The office of the CFO does exactly the same thing. A finance team starts with one concrete objective — reduce manual effort in invoice processing, or shorten the month-end close. Then AI enters and the scope balloons into automated variance commentary, conversational dashboards, predictive forecasting, budget-owner assistants, contract review, and procurement analysis. Some of that may genuinely pay off. But when every idea joins the roadmap, you end up with more pilots, more tools, and more technical debt without necessarily removing a single hour of manual work.

This is how AI quietly increases headcount pressure rather than relieving it. The pitch was fewer people; the reality is you now need product managers, business analysts, data engineers, security reviewers, and change-management owners just to run the AI program itself. To be fair, traditional software needs care too — the honest figure is the delta, not the gross — but a sprawling AI transformation layered on top of already-complex systems tends to add net new coordination work, not remove it.

The fix isn't to do less AI. It's two things: discipline and focus, and a clear-eyed understanding of where AI actually earns its keep.

1. Discipline and Focus

For back-office functions — the office of the CFO, the office of the CHRO, and their equivalents in legal, procurement, and IT — the objective is almost always the same: drive efficiency to lower long-term cost. There are only two real levers.

The first is to eliminate manual work: data entry, copy-paste, repetitive lookups, status checks, reconciliation, the hours lost searching across systems. The second is to make the existing team more scalable: support more employees, more transactions, more complexity without growing headcount at the same rate.

Everything else is secondary. A more polished interface is nice, but it isn't efficiency. An impressive AI-generated dashboard that no middle manager actually acts on isn't efficiency. A chatbot that answers broad HR questions feels modern, but if employees still open tickets and verify the answers by hand, the business impact is close to zero. This is where a lot of enterprise AI loses the plot: it prioritizes what's visible over what's valuable. A flashy assistant is easier to demo than a behind-the-scenes workflow that removes 40% of manual ticket triage — but the second one is worth far more.

A useful gut-check before greenlighting anything: does this remove manual work, or does it let us scale the team? If the honest answer is "neither, but it's exciting," that's the moment for discipline.

Concretely, instead of a general-purpose HR chatbot that employees can ask anything, it's often more valuable to fully automate one high-volume process, such as employment verification requests, as a deterministic workflow that:

receives the request and validates the employee's identity
checks policy eligibility
pulls the right data from the HRIS
generates the correct document
routes exceptions to a human
logs everything for audit

AI may help build or enhance parts of that. But the end result is a reliable operational system, not an open-ended AI experience.
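To make the shape of such a workflow concrete, here is a minimal Python sketch of the six steps listed above. Everything in it is an assumption for illustration: the in-memory HRIS dictionary, the field names, the "active employees only" eligibility rule, and the function names are hypothetical, and a real implementation would call the actual HRIS and identity provider.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Hypothetical stand-in for an HRIS; a real system would call the vendor's API.
HRIS = {
    "E100": {"name": "Ada Park", "title": "Analyst", "start_date": "2021-03-01", "active": True},
    "E200": {"name": "Sam Ortiz", "title": "Engineer", "start_date": "2019-07-15", "active": False},
}

AUDIT_LOG = []  # every step is appended here for audit


@dataclass
class Result:
    status: str          # "completed" or "needs_human"
    document: str = ""
    reason: str = ""


def log_step(request_id, step, detail):
    AUDIT_LOG.append({
        "request_id": request_id,
        "step": step,
        "detail": detail,
        "at": datetime.now(timezone.utc).isoformat(),
    })


def escalate(request_id, reason):
    # Exceptions go to a human instead of being guessed at.
    log_step(request_id, "exception", reason)
    return Result("needs_human", reason=reason)


def handle_verification(request_id, employee_id, identity_verified):
    log_step(request_id, "received", employee_id)
    if not identity_verified:
        return escalate(request_id, "identity not verified")
    record = HRIS.get(employee_id)
    if record is None:
        return escalate(request_id, "employee not found in HRIS")
    # Assumed policy rule: only active employees get an automatic letter.
    if not record["active"]:
        return escalate(request_id, "not eligible for automatic letter")
    document = (
        f"This letter confirms that {record['name']} has been employed "
        f"as {record['title']} since {record['start_date']}."
    )
    log_step(request_id, "completed", "letter generated")
    return Result("completed", document=document)
```

Note that no model runs when a request is handled: the same input always produces the same letter or the same escalation, and the audit log records which path was taken.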

2. Not Every Problem Needs Generative AI

The second thing is understanding where AI actually earns its keep, and the gap in that understanding runs through both business and technical teams. The most common, and most expensive, mistake is reaching for a model when a plain API call, a rules engine, or a database query would be more reliable and far cheaper.

If an employee asks "how many vacation days do I have left?", the system shouldn't infer the answer from a policy document — it should call the HRIS and return the actual balance. If a manager asks "has this candidate cleared their background check?", don't generate a probabilistic answer; query the ATS. If a finance user asks "which invoices are overdue?", don't summarize a spreadsheet that may be stale; pull structured data from the ERP. For facts that already live in a system of record, deterministic retrieval wins on accuracy, cost, and trust.
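As a rough illustration of what deterministic retrieval means for the three questions above, the Python sketch below maps each known question type to a direct lookup. The dictionaries stand in for the HRIS, ATS, and ERP, and the intent names, record IDs, and values are invented for the example.

```python
from datetime import date

# Hypothetical in-memory stand-ins for the systems of record.
HRIS_BALANCES = {"E100": 12.5}
ATS_CHECKS = {"C77": "cleared"}
ERP_INVOICES = [
    {"id": "INV-1", "due": date(2024, 5, 1), "paid": False},
    {"id": "INV-2", "due": date(2024, 7, 1), "paid": False},
    {"id": "INV-3", "due": date(2024, 4, 1), "paid": True},
]


def vacation_days_left(employee_id):
    return HRIS_BALANCES[employee_id]


def background_check_status(candidate_id):
    return ATS_CHECKS[candidate_id]


def overdue_invoices(today):
    # Unpaid invoices whose due date has already passed.
    return [inv["id"] for inv in ERP_INVOICES
            if not inv["paid"] and inv["due"] < today]


# Each known question type maps to exactly one lookup; nothing is generated.
HANDLERS = {
    "vacation_balance": vacation_days_left,
    "background_check": background_check_status,
    "overdue_invoices": overdue_invoices,
}


def answer(intent, **params):
    handler = HANDLERS.get(intent)
    if handler is None:
        raise LookupError(f"no deterministic handler for intent: {intent}")
    return handler(**params)
```

A call such as answer("vacation_balance", employee_id="E100") returns whatever the system of record holds, so two people asking the same question get the same answer.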

This isn't an argument against LLMs — it's an argument for using them where they're genuinely strong: interpreting messy or unstructured inputs, summarizing documents, classifying ambiguous requests, drafting first-pass responses. Extracting key terms from a stack of inconsistently formatted vendor contracts is a fair use of a model. Extracting fields from a clean, fixed-schema HRIS payload is not — that data is already structured, and handing it to a model just adds cost, latency, and a small chance of getting it wrong, for no benefit.

The cost consequences of getting this line wrong are real. Every time you route a question through an LLM that a structured query could have answered, you pay token cost, you add latency, and you invite inconsistency — and unlike a one-off mistake, that cost recurs on every interaction and grows with adoption. A single bad design decision in a high-traffic path can quietly multiply your AI bill.
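A back-of-the-envelope calculation shows why the recurrence matters. The figures below (tokens per request, price per thousand tokens, request volumes) are made up purely for illustration and do not reflect any vendor's pricing.

```python
def monthly_llm_cost(requests_per_month, tokens_per_request, usd_per_1k_tokens):
    """Token spend for one path, assuming every request goes through a model."""
    return requests_per_month * tokens_per_request / 1000 * usd_per_1k_tokens


# Hypothetical inputs: 2,000 tokens per request at 0.01 USD per 1,000 tokens.
pilot = monthly_llm_cost(50_000, 2_000, 0.01)
rollout = monthly_llm_cost(500_000, 2_000, 0.01)

print(f"pilot:   {pilot:,.2f} USD per month")
print(f"rollout: {rollout:,.2f} USD per month")
```

With these invented inputs, ten times the adoption means ten times the token bill, roughly 1,000 versus 10,000 USD per month. A structured query on the same path would have no per-token charge at all.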

The Problem with Putting AI in Front of Every User

This is also why I'm cautious about the instinct to put AI directly in front of every user — chatbots, copilots, assistants everywhere. There's a place for those, but they shouldn't be the default for back-office work, because open-ended AI interfaces bring two recurring problems.

The first is inconsistency. Back-office processes run on policies, approvals, audit trails, and compliance: environments where you do not want a slightly different answer each time. If two employees ask the same policy question and get different answers, trust erodes. If two managers get different guidance on a performance review, risk goes up. If finance gets inconsistent explanations of the same budget variance, the process becomes harder to control. To be precise: this variance isn't strictly inevitable, since a low sampling temperature, grounded retrieval, and tight prompting can largely tame it. But it's a risk you have to engineer against deliberately, and most teams badly underestimate how much work that takes.

The second is scalability. AI cost tracks usage: the more people use the assistant, the more tokens you burn, so success and spend rise together — the opposite of what efficiency is supposed to do. (And tokens aren't the whole bill; embeddings, vector storage, evaluation, and monitoring all add up, especially for anything self-hosted.) More use cases mean more prompts, more evals, more monitoring, more support.

AI Should Help Build Systems, Not Become the System

The reframe that tends to unlock real returns is this: in most back-office workflows, the highest-value use of AI is helping your technical teams build better systems — not becoming the system itself.

AI is genuinely excellent at the build stage. It can help engineers write code, generate test cases, draft documentation, map data, and accelerate workflow design, so robust systems get built much faster. But once that system is live, the production workflow should usually be deterministic, rules-based, and wired into the systems of record. The payoff is in the contrast: you capture AI's speed while you're building, without paying a per-interaction token tax every time the system runs afterward. A deterministic system isn't free to operate — it still needs hosting, maintenance, and monitoring — but you've taken the model out of the hot path, and that's where the runaway cost lives.

The most valuable enterprise AI work, in other words, often doesn't look like a chatbot. It looks like cleaner workflows, automated exception handling, faster reconciliation, and internal tooling that lets a smaller team support more volume. AI may classify an incoming HR request, but routing and resolution follow structured logic. AI may extract terms from a vendor contract, but payment approval, budget validation, and accounting treatment run through controlled systems and rules. AI may interpret an IT support ticket, but access provisioning still flows through identity, security, and approval workflows. The goal isn't to avoid AI — it's to put it in the right place.
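One way to picture "AI classifies, structured logic routes" is the Python sketch below. The classifier is a keyword-matching placeholder standing in for a model call so the example runs offline; the labels, queue names, and confidence threshold are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Classification:
    label: str
    confidence: float


def classify_request(text):
    """Placeholder for a model call; keyword matching stands in for it here."""
    lowered = text.lower()
    if "verification" in lowered:
        return Classification("employment_verification", 0.95)
    if "vacation" in lowered:
        return Classification("leave_balance", 0.90)
    return Classification("unknown", 0.30)


# Deterministic routing table: the model never chooses the destination itself.
ROUTES = {
    "employment_verification": "hr_ops_verification_queue",
    "leave_balance": "hris_self_service",
}
MIN_CONFIDENCE = 0.8  # assumed threshold; below it, a human triages


def route(text):
    result = classify_request(text)
    if result.confidence < MIN_CONFIDENCE or result.label not in ROUTES:
        return "human_triage"
    return ROUTES[result.label]
```

The model's only job is to produce a label. Where the request goes, and what happens when the label is uncertain, is decided by a table and a threshold that can be reviewed and audited like any other rule.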

Conclusion

Enterprise AI will create enormous value — but not for the companies that treat it as a magic layer for redesigning every workflow at once. For back-office operations, the aim isn't to make every process feel AI-powered; it's to make the business more efficient, scalable, and reliable.

That takes two different postures from two different groups. Business teams have to stay disciplined about the outcomes that actually matter (fewer manual hours, lower long-term cost, higher throughput, a team that scales without proportional headcount) and walk away from the merely exciting. Technical teams, the builders, need both the judgment and the authority to decide when to build a deterministic system and when, sparingly, to actually generate.

Get both right and AI becomes the cost-saver it was sold as. The companies that see the best returns won't be the ones using AI most visibly. They'll be the ones using it most precisely.

Pengfei Chen