Build vs buy AI: when to build the custom layer and when to buy the model

The build vs buy AI question is usually posed as a binary, and that framing is the first mistake. Almost nobody trains a frontier model anymore, and almost nobody gets a durable edge from a thin wrapper over someone else’s product. The real decision is which layer you buy and which layer you build. This guide draws that line with the evidence behind each axis.

By Kanika Mathur, Head of Service Delivery

Reviewed by Resourcifi engineering · Published Mar 23, 2026 · Updated Mar 23, 2026 · 11 min read

Strategy

Key takeaways

The short version

Build vs buy AI is a layering decision, not a binary. Buy the commodity layer (the model, the generic capability), build the differentiated layer (your product, your data loop, your orchestration).
The model layer is commoditizing fast, so buy the model. Inference cost for equivalent quality falls roughly 10x a year, which makes renting per token the near-universal default over owning weights.
Bought solutions reach successful deployment far more often than internal builds: vendor and partnership efforts succeed about 67% of the time, internal builds about one-third as often (MIT NANDA 2025).
But off-the-shelf agents rarely unlock strategic advantage (McKinsey 2025). The value ceiling lives in what you build, so build only where you hold a real edge at the volume to justify it.
Buying defers the build risk; it does not delete the risk surface or vendor diligence. The OWASP LLM Top 10 and NIST AI RMF responsibilities stay with you either way, and Gartner judges only about 130 of thousands of agentic vendors real.

Build vs buy AI is a layering decision, not a binary

Posing build vs buy AI as a single yes-or-no is the first mistake. The market evidence points in two directions at once, and the honest answer is the synthesis: buy the commodity layer (the foundation model, the infrastructure, the generic horizontal capability) and build the differentiated layer (your product surface, your data loop, your orchestration, your workflow). The real question is not whether to build or buy, but which layer goes in which column.

Two facts have to be held together. On the buy side, internal-build AI projects fail far more often than purchased ones, and enterprises have visibly shifted toward buying off-the-shelf apps as the ecosystem matured.¹ On the build side, off-the-shelf agents and horizontal copilots "rarely unlock strategic advantage," and the organizations capturing the most value show a strong preference for customized, bespoke solutions tied to their own processes and data.² These are not contradictory. They resolve into one rule: buy where the capability is generic, build where it is yours.

There is hype to strip out before any of this is useful. Vendor "agent washing" inflates the buy option: Gartner estimates only about 130 of the thousands of self-described agentic-AI vendors are real, with much of the rest rebadged chatbots and RPA.⁴ So buy is not automatically the safe, fast path, and build is not automatically the brave one. This is a genuine engineering decision that deserves a real framework rather than a procurement reflex; our production-first AI cornerstone takes the longer-horizon view.

The build vs buy AI decision matrix

Six axes decide it: differentiation, data, control and risk, speed to value, total cost of ownership, and lock-in. For each one, ask what pulls toward building the custom layer and what pulls toward buying the API or off-the-shelf product. The plain decision rule that falls out: build where you have an unfair advantage (proprietary data, a workflow moat, or a hard control or compliance requirement) and the volume to justify owning it, and buy everywhere else, especially the model itself.

Build vs buy AI: what pulls each way, per axis
Axis	Favors build (custom)	Favors buy (API / off-the-shelf)
Differentiation	The capability is the product, or a moat tied to a process rivals cannot copy	The capability is table stakes: a generic copilot, summarization, generic support
Data	You hold proprietary, hard-to-replicate data and a workflow that generates more of it	The problem is solvable with public or general knowledge, no proprietary data edge
Control & risk	You must own security, residency, auditability, latency and model behavior	The vendor controls, SLAs and roadmap are acceptable to you
Speed to value	You can fund a dedicated team and tolerate a longer path to production	You need a working capability now, and the vendor app ships faster
TCO	Volume is high enough that vendor per-seat or per-call pricing exceeds owning it	Volume is low or spiky, and you avoid hiring, MLOps, eval and on-call
Lock-in	You want portability across models and protection from price or roadmap shifts	You accept dependence in exchange for someone carrying the upgrade burden

The matrix is not a scorecard you tally to a number. It is a way to find the one or two axes that actually decide your case. A regulated workload may be settled by control and risk alone. A high-volume internal tool may turn entirely on TCO, and that arithmetic should be run at your own volume in our AI TCO calculator, not read off a universal crossover point. The failure evidence in the next section is overwhelmingly about teams that built where they had no edge on any of these axes, which is exactly the case the rule tells you to buy.
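The plain decision rule above can be written down as a small check per layer. This is a minimal sketch and not a scoring tool: the `Layer` fields and the `decide` function are hypothetical names introduced here, and each flag is a judgment you supply after working through the six axes.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    """One layer of the system, described by the edges named in the rule."""
    name: str
    proprietary_data: bool = False          # hard-to-replicate data edge
    workflow_moat: bool = False             # a process rivals cannot copy
    hard_control_requirement: bool = False  # security, residency, audit, etc.
    volume_justifies_ownership: bool = False


def decide(layer: Layer) -> str:
    """Build only with a named edge AND the volume to own it; otherwise buy."""
    has_edge = (
        layer.proprietary_data
        or layer.workflow_moat
        or layer.hard_control_requirement
    )
    return "build" if has_edge and layer.volume_justifies_ownership else "buy"


# Illustrative layers; the flag values are assumptions for the example.
layers = [
    Layer("foundation model"),
    Layer("data loop", proprietary_data=True, volume_justifies_ownership=True),
    Layer("generic support copilot", workflow_moat=True),  # edge, no volume
]
decisions = {layer.name: decide(layer) for layer in layers}
```

The third layer shows the half of the rule that is easiest to forget: an edge without the volume to justify owning it still lands in the buy column.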

Why internal AI builds fail more often than bought ones

Internal builds fail more often because teams build where they have no edge. MIT Project NANDA’s 2025 study of enterprise GenAI found vendor and partnership efforts reach successful deployment about 67% of the time, while internal builds succeed only about one-third as often.¹ The pattern behind the gap is consistent: a from-scratch build needs proprietary data and deployment infrastructure that teams routinely underestimate.

Bought solutions reach production far more often than internal builds

Share of efforts reaching successful deployment in MIT Project NANDA’s 2025 study of enterprise GenAI. The internal-build figure is derived from the report’s "about one-third as often" relative to the 67% buy rate, so read it as directional and not a per-project probability.

Data behind this chart
Path	Reaches successful deployment
Buy via vendor or partnership	about 67%
Internal build (derived)	about 22% (one-third as often)

Source: MIT Project NANDA, The GenAI Divide: State of AI in Business (2025). The 22% internal-build figure is derived from the report’s relational "one-third as often," not a directly reported number.
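For readers who want to see where the derived figure comes from, the arithmetic is a single multiplication. This sketch only restates the derivation described in the source note; the variable names are ours, and the result inherits the "directional, not per-project" caveat.

```python
# Reported by the study: share of vendor/partnership efforts that deploy.
buy_success_rate = 0.67

# Reported relationally: internal builds succeed "one-third as often".
relative_internal_rate = 1 / 3

# Derived, not directly reported.
internal_build_rate = buy_success_rate * relative_internal_rate

print(f"{internal_build_rate:.1%}")  # 22.3%, quoted in the table as about 22%
```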

The wider failure context says where the risk concentrates. RAND found more than 80% of AI projects fail, about twice the rate of non-AI IT work, driven by stakeholder problem-misalignment, poor data, chasing technology over the problem, and inadequate deployment infrastructure.⁵ Gartner predicted at least 30% of generative-AI projects would be abandoned after proof of concept by end of 2025,⁶ and expects over 40% of agentic-AI projects to be canceled by end of 2027, on cost, unclear value and weak risk controls.⁴ MIT NANDA reports that about 95% of organizations see no measurable P&L return from their GenAI pilots.¹ S&P Global recorded AI-initiative abandonment jumping from 17% in 2024 to 42% in 2025.⁷ Informatica’s top three obstacles, data quality and readiness at 43%, technical maturity at 43%, and a skills shortage at 35%, all make a from-scratch build harder than estimated.⁸ None of this says do not build. It says do not build where you have no data edge, no infrastructure, and no clear problem.

Build where you have an unfair advantage and the volume to own it. Buy everything else, especially the model.

The decision rule, in one line

Buy the model, build the product around it

The hybrid resolves the two halves. Buy the model because the model layer is commoditizing and getting cheaper fast, so renting per token beats owning weights for almost everyone. Build the orchestration, the product surface and the data flywheel, because that is where the durable advantage lives. Almost every real system is a hybrid, and the skill is drawing the line in the right place for your specific edge.

The case for buying the model is the cost curve. Inference cost for equivalent performance has fallen roughly 10x a year, on the order of 1,000x over three years for GPT-3-class quality, which is why training or owning a frontier model is almost never the build decision.³ The case for building is where the value sits. a16z's enterprise work finds the moat is the orchestration across models and the domain workflow, never the model itself: apps combine cutting-edge models, domain-specific interfaces, and a feature surface that has become cheap to build.¹¹ The buildable moat is that orchestration plus the data loop, never the weights.
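The two cost figures above are the same claim compounded: a 10x annual drop is 1,000x over three years. A minimal sketch of that compounding follows; the function name is hypothetical and the constant 10x rate is a simplification of an observed trend, not a forecast.

```python
def relative_inference_cost(years: int, annual_drop: float = 10.0) -> float:
    """Cost for equivalent quality, relative to year 0, at a constant annual drop."""
    return 1.0 / (annual_drop ** years)


# Year 0 is normalized to 1.0; later years are fractions of that price.
cost_curve = {year: relative_inference_cost(year) for year in range(4)}

# Total reduction factor after three years at 10x a year.
three_year_factor = 1.0 / cost_curve[3]
```

Weights you train or license today are benchmarked against a rental price that, on this trend, is a thousandth of today's within three years, which is the arithmetic behind renting per token.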

The mature posture is an explicit portfolio rather than a one-time choice. McKinsey names the target architecture an "agentic AI mesh" capable of integrating both custom-built and off-the-shelf agents: off-the-shelf for routine and horizontal work, custom for the high-impact, proprietary processes, composed together.² Designing that mesh for production from day one is the same discipline our production-first AI guide argues for, and it is the work our AI application development team does on the custom layer once the line is drawn.

The hybrid, layer by layer
Layer	Default call	Why
Foundation model	Buy (rent per token)	Commodity, falling about 10x a year; owning weights rarely pays off
Infrastructure / hosting	Buy or rent	Generic, undifferentiated, heavy to operate
Orchestration / routing	Build	Multi-model routing is the differentiator and hedges lock-in
Product surface / workflow	Build	Domain UI and workflow fit are cheap to build and hard to copy
Data loop / flywheel	Build	Proprietary data and the loop that grows it are the moat
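To make the orchestration row concrete, here is a minimal sketch of a routing layer that you own while the models behind it stay rented. Everything in it is illustrative: `ModelRouter`, the provider names and the stub functions are hypothetical, and a real provider would wrap a vendor SDK call behind the same `prompt -> text` signature.

```python
from typing import Callable, Dict, List

# A provider is any callable that maps a prompt to completion text.
Provider = Callable[[str], str]


class ModelRouter:
    """Routes each task to an ordered list of providers, falling back on failure."""

    def __init__(self, providers: Dict[str, Provider], routes: Dict[str, List[str]]):
        self.providers = providers
        self.routes = routes  # task name -> provider names in preference order

    def complete(self, task: str, prompt: str) -> str:
        errors = []
        for name in self.routes[task]:
            try:
                return self.providers[name](prompt)
            except Exception as exc:  # a production router would catch narrower errors
                errors.append(f"{name}: {exc}")
        raise RuntimeError("all providers failed: " + "; ".join(errors))


# Stub providers standing in for two rented models (assumptions for the demo).
def flaky_vendor(prompt: str) -> str:
    raise TimeoutError("simulated outage")


def backup_vendor(prompt: str) -> str:
    return f"[backup] {prompt}"


router = ModelRouter(
    providers={"primary": flaky_vendor, "backup": backup_vendor},
    routes={"summarize": ["primary", "backup"]},
)
result = router.complete("summarize", "Summarize the contract.")
```

Swapping or reordering vendors is then a change to `routes`, not to the product code that calls `complete`. Prompts and evaluations still have to be re-validated per model, so this lowers switching cost without removing it.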

The risks build vs buy advice gets wrong

Five things this decision must not over-claim. The 67% and 22% split is a sampled success-rate observation, never a per-project law. Buying defers the build risk; it does not delete the risk surface. Lock-in is a cost to price, never an automatic disqualifier. TCO depends on volume and time horizon, with no universal crossover. And vendor vetting on the buy side is real diligence, not a shortcut. Hold all five or the framework reads as a sales pitch for one column.

The success-rate split reflects MIT NANDA’s sampled 2025 GenAI pilots and conflates many use cases. It does not mean your build will fail at that rate; treat it as evidence about where risk concentrates rather than as a probability for your project.
Whether you build or buy, the running system carries the full risk surface. The OWASP LLM Top 10 failure modes, including prompt injection, sensitive-information disclosure, supply chain, excessive agency and unbounded consumption, apply to vendor products too.⁹ The four NIST AI RMF governance functions (govern, map, measure and manage) stay with you either way.¹⁰ Buying does not outsource accountability.
Lock-in trades control for maintenance relief. The honest move is to price the switching cost (re-prompting, re-evaluation, integration rework) instead of treating dependence as fatal. Multi-model orchestration reduces single-vendor exposure without eliminating it.
TCO is scenario-dependent. Buy looks cheap at low or spiky volume and expensive at scale; build is the reverse, plus a large fixed team cost. Any TCO claim depends on volume and time horizon, so model it rather than assume a universal crossover.
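The volume and time-horizon dependence of TCO can be modeled in a few lines. The sketch below is a deliberately simple linear model, and every input is a hypothetical placeholder to replace with your own numbers; the function names are ours, and real build costs (evaluation, on-call, rework) are rarely this tidy.

```python
from typing import Optional


def buy_cost(calls_per_month: float, price_per_call: float, months: int) -> float:
    """Total vendor spend on per-call pricing over the horizon."""
    return calls_per_month * price_per_call * months


def build_cost(calls_per_month: float, unit_cost: float,
               fixed_monthly: float, upfront: float, months: int) -> float:
    """Upfront build cost plus monthly team cost and per-call running cost."""
    return upfront + months * (fixed_monthly + calls_per_month * unit_cost)


def breakeven_calls_per_month(price_per_call: float, unit_cost: float,
                              fixed_monthly: float, upfront: float,
                              months: int) -> Optional[float]:
    """Monthly volume at which buy and build cost the same over the horizon."""
    margin = price_per_call - unit_cost
    if margin <= 0:
        return None  # owning never catches up if each call costs more to run
    return (upfront / months + fixed_monthly) / margin


# Hypothetical scenario: all figures are placeholders, not benchmarks.
scenario = dict(price_per_call=0.02, unit_cost=0.005,
                fixed_monthly=60_000.0, upfront=240_000.0, months=24)
threshold = breakeven_calls_per_month(**scenario)
```

Re-running with `months=12` or a lower `price_per_call` moves the threshold substantially, which is the reason to model your own scenario instead of assuming a crossover.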

The buy-side blind spot is worth stating once more. With Gartner judging only about 130 of thousands of agentic vendors real, buying can mean buying agent-washed RPA, so the buy column is not the low-diligence option it looks like.⁴

How to decide for your case

Run your workload through the six axes and find the one or two that decide it. Start by assuming you buy the model and the generic capability, then justify each thing you choose to build by a specific edge: a data moat, a workflow rivals cannot copy, or a hard control requirement. If you cannot name the edge, that part should be bought.

In practice the sequence is short. Separate the layers first, model the TCO at your real volume and time horizon, and pressure-test every build candidate against the differentiation and data axes. Then design the hybrid for production rather than as two disconnected procurement tracks. Drawing that line is precisely what our AI consulting engagements open with, and building the custom layer once it is drawn is what our AI application development team takes on. The goal is one coherent system where the bought commodity and the built moat fit together, instead of a binary you regret in either direction.

Frequently asked

Build vs buy AI questions

Should I build or buy AI?

Buy the commodity layer, the foundation model and generic horizontal capabilities, and build only where you hold a real edge (proprietary data, a workflow moat, or a hard control requirement) at enough volume to justify it. The evidence is that internal builds succeed far less often than vendor solutions (about 67% versus one-third as often, MIT NANDA 2025), but off-the-shelf rarely unlocks strategic advantage (McKinsey 2025), so the real answer is a hybrid drawn along your specific moat.

Is it cheaper to build or buy AI?

It depends on volume and time horizon, never on a fixed rule. Buying, on per-call or per-seat pricing, is cheaper at low or spiky volume because you avoid hiring a team and standing up MLOps. Building can be cheaper at sustained high volume once that fixed cost amortizes, but it adds maintenance, evaluation and on-call that teams routinely under-count. Model the real total cost of ownership before deciding, and do not assume a universal crossover point.

Why do so many internal AI builds fail?

Because teams build where they have no edge. RAND found more than 80% of AI projects fail, about twice the non-AI IT rate, driven by problem-misalignment, poor data, chasing technology over the problem, and inadequate deployment infrastructure. MIT NANDA found internal builds succeed only about one-third as often as bought solutions. The pattern is that a from-scratch build needs proprietary data and infrastructure that most teams underestimate, so building without that edge is the common failure.

What does buy the model, build the product mean?

Frontier-model inference is a commodity getting roughly 10x cheaper a year, so almost no one should train their own; you rent the model per token instead. What you build is everything around it: the orchestration across models, the domain interface, the workflow integration, and the data flywheel. a16z's enterprise finding is that the moat is the orchestration across models and the domain workflow, never the model itself, so the model is bought and the product around it is built.

Does buying AI mean I avoid the risk?

No. Buying defers the build risk, but the running system still carries the full risk surface. The OWASP LLM Top 10 failure modes, including prompt injection, data leakage, excessive agency and unbounded consumption, apply to vendor products too, and the NIST AI RMF governance responsibilities stay with you. You also take on vendor risk: Gartner estimates only about 130 of thousands of agentic-AI vendors are real, so buying requires genuine diligence and is no shortcut around it.

Kanika Mathur

Head of Service Delivery, Resourcifi

Kanika Mathur runs Service Delivery at Resourcifi, where her pods sit on both sides of this call: integrating bought vendor agents and building the custom orchestration and data loops that vendors cannot sell. She spends most of her scoping time pulling teams off the binary and onto the harder question of where their actual moat is, because that is the line that decides what gets built.


Sources

1. MIT Project NANDA, The GenAI Divide: State of AI in Business 2025 (2025). Buy via vendor or partnership succeeds about 67% of the time versus one-third as often for internal builds; about 95% of organizations see no measurable P&L return.
2. McKinsey QuantumBlack, Seizing the agentic AI advantage (2025). Off-the-shelf agents "rarely unlock strategic advantage"; the "agentic AI mesh" integrates custom-built and off-the-shelf agents.
3. Guido Appenzeller / a16z, Welcome to LLMflation (2024). Inference cost for equivalent quality falls roughly 10x a year, on the order of 1,000x over three years for GPT-3-class models.
4. Gartner, Over 40% of Agentic AI Projects Will Be Canceled by End of 2027 (2025). Flags "agent washing" and estimates only about 130 of thousands of agentic-AI vendors are real.
5. Ryseff, De Bruhl and Newberry / RAND Corporation, The Root Causes of Failure for AI Projects and How They Can Succeed (2024). More than 80% of AI projects fail, about twice the non-AI IT rate, with five root causes.
6. Gartner, 30% of Generative AI Projects Will Be Abandoned After Proof of Concept by End of 2025 (2024). Abandonment driven by poor data quality, weak risk controls, escalating cost and unclear value.
7. S&P Global Market Intelligence, Voice of the Enterprise: AI & Machine Learning (2025). AI-initiative abandonment rose from 17% in 2024 to 42% in 2025; about 46% of proofs of concept scrapped before production.
8. Informatica, CDO Insights 2025 (2025). Top obstacles to AI success: data quality and readiness 43%, technical maturity 43%, skills shortage 35%.
9. OWASP, Top 10 for LLM Applications 2025 (2024). Failure modes from prompt injection (LLM01) through unbounded consumption (LLM10) apply to built and bought systems alike.
10. NIST, Artificial Intelligence Risk Management Framework (AI RMF 1.0), NIST AI 100-1 (2023). Four governance functions: govern, map, measure and manage.
11. a16z (Sarah Wang, Shangda Xu, Justin Kahl, Tugce Erten), How 100 Enterprise CIOs Are Building and Buying Gen AI in 2025 (2025), and Notes on AI Apps in 2026. Differentiation is the orchestration across models plus the domain workflow, never the model itself; model differentiation by use case is the main reason enterprises buy from multiple vendors.