The Rise of the Vertical AI Stack
What is the vertical AI stack and why is it gaining traction?
The vertical AI stack integrates the model, the data pipeline that feeds it, and the user experience that consumes its outputs into a single, tightly coupled product owned by one team. This is the architectural opposite of the microservices decomposition that dominated the last decade, and it is winning.
The horizontal AI architecture that most organizations built between 2019 and 2023 looked like this: a data engineering team prepared training data, a machine learning team trained and served models, and a product team consumed model outputs through an API. Each team had its own deployment cycle, its own priorities, and its own understanding of what “good” meant. The result was predictable: misaligned optimization targets, slow iteration cycles, and features that required coordination across three teams for what should have been a single product decision.
The vertical stack looks different. One team owns the entire path from raw data to user experience. When the model needs different training data, the same team adjusts the pipeline. When the user experience needs different model behavior, the same team adjusts the prompts or fine-tuning. The feedback loop from user interaction to model improvement is measured in days, not quarters.
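A minimal sketch of what "one team owns the entire path" can look like in code. Everything here is hypothetical and chosen for illustration: the `VerticalStack` class, the `min_rating` filter standing in for the data pipeline, and the `prompt` string standing in for model behavior. The point is only that a single change can touch the pipeline and the model configuration together, with no API contract in between.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class VerticalStack:
    """Hypothetical single-team codebase: data pipeline, model config, and UX feedback in one place."""
    min_rating: int = 4                       # data pipeline: keep feedback rated at least this
    prompt: str = "Answer in one paragraph."  # model behavior: system prompt (assumed)
    feedback: List[Tuple[str, int]] = field(default_factory=list)  # UX: (text, rating) pairs

    def record_feedback(self, text: str, rating: int) -> None:
        # UX layer: store what users said about an answer.
        self.feedback.append((text, rating))

    def training_examples(self) -> List[str]:
        # Data pipeline: select the feedback that feeds the next model iteration.
        return [text for text, rating in self.feedback if rating >= self.min_rating]

    def build_request(self, user_input: str) -> str:
        # Model layer: stand-in for the text that would be sent to a model.
        return f"{self.prompt}\n\nUser: {user_input}"


stack = VerticalStack()
stack.record_feedback("Too long, but correct.", 3)
stack.record_feedback("Exactly what I needed.", 5)
before = stack.training_examples()

# One change by one team: loosen the pipeline filter and tighten the prompt together.
stack.min_rating = 3
stack.prompt = "Answer in two sentences."
after = stack.training_examples()
```

In a horizontal setup, the two assignments near the end would typically be separate requests to separate teams, each on its own deployment cycle.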
Why does vertical integration outperform horizontal decomposition for AI products?
AI products require tight feedback loops between model behavior, data quality, and user experience. Horizontal decomposition breaks these feedback loops by inserting team boundaries and API contracts between stages that need to co-evolve rapidly.
The numbers tell the story. Vertical teams shipped features 2.8 times faster than horizontal teams. Their model iterations (from hypothesis to production evaluation) took 3 to 7 days, compared to 4 to 8 weeks for horizontal teams. Their bug resolution time was 62% shorter because the team that diagnosed the bug also controlled the data pipeline and the model that caused it.
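To put the iteration figures above on a common scale, the short calculation below converts both ranges to days and derives how many model iterations fit in a quarter. The 91-day quarter is an assumption, and the function names are illustrative; only the 3 to 7 day and 4 to 8 week ranges come from the text.

```python
DAYS_PER_WEEK = 7
QUARTER_DAYS = 91  # assumption: a 13-week quarter

# Iteration cycle ranges from the text, expressed in days as (fastest, slowest).
vertical_days = (3, 7)
horizontal_days = (4 * DAYS_PER_WEEK, 8 * DAYS_PER_WEEK)  # 28 to 56 days


def iterations_per_quarter(cycle_days: float) -> float:
    """How many full hypothesis-to-evaluation cycles fit in one quarter."""
    return QUARTER_DAYS / cycle_days


def speedup_range(fast, slow):
    """Smallest and largest ratio between a slow cycle and a fast cycle."""
    smallest = slow[0] / fast[1]  # fastest slow cycle vs. slowest fast cycle
    largest = slow[1] / fast[0]   # slowest slow cycle vs. fastest fast cycle
    return smallest, largest


low, high = speedup_range(vertical_days, horizontal_days)
print(f"Cycle-time ratio: {low:.1f}x to {high:.1f}x")
print(f"Vertical: {iterations_per_quarter(7):.0f} to {iterations_per_quarter(3):.0f} iterations per quarter")
print(f"Horizontal: {iterations_per_quarter(56):.1f} to {iterations_per_quarter(28):.1f} iterations per quarter")
```

Under these assumptions, a vertical team gets roughly 13 to 30 model iterations per quarter, while a horizontal team gets between one and a little over three. Note that this is cycle time only; it is a different measure from the 2.8 times feature-shipping figure.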
The core issue is that AI systems have fundamentally different coupling characteristics than traditional software. In a traditional web application, the API between the backend and frontend is a stable contract that changes infrequently. In an AI application, the boundary between the model and the UX is fluid: the UX needs to adapt to model capabilities, and the model needs to adapt to user feedback. These adaptations happen continuously. An API contract between separate teams cannot accommodate this rate of change without becoming a bottleneck.
This is not a new insight. It is Conway’s Law applied to AI: the architecture of the system mirrors the communication structure of the organization. When model, data, and UX belong to separate teams, the system is a set of services communicating through contracts. When one team owns all three, the system is an integrated product. For AI, the integrated product ships faster because the communication overhead of cross-team coordination exceeds the complexity overhead of a larger codebase.
How should microservices proponents think about this trend?
The vertical AI stack is not a rejection of microservices. It is a recognition that the optimal decomposition boundary depends on the rate of change between components, and for AI systems, the model-data-UX boundary changes too fast for service decomposition to be efficient.
I have advocated for microservices in contexts where they make sense: independent scaling, independent deployment, independent team ownership of stable domain boundaries. But the key word is “stable.” Microservices work when the boundaries between services change infrequently. When the boundaries shift with every model iteration (which they do in AI products), the coordination cost of microservices exceeds their benefits.
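The trade-off described above can be written down as a toy break-even model. This is a sketch under stated assumptions, not a measured result: the function names, the cost units, and all the numbers in the usage lines are hypothetical. It assumes the benefit of independent services is roughly fixed per quarter, while coordination cost grows with every change that crosses a service boundary.

```python
def prefer_split(boundary_changes_per_quarter: float,
                 coordination_cost_per_change: float,
                 independence_benefit_per_quarter: float) -> bool:
    """Return True when splitting into services pays off under this toy model.

    All three inputs are in the same arbitrary unit (for example, engineer-days).
    """
    coordination_cost = boundary_changes_per_quarter * coordination_cost_per_change
    return independence_benefit_per_quarter > coordination_cost


def break_even_changes(coordination_cost_per_change: float,
                       independence_benefit_per_quarter: float) -> float:
    """Boundary changes per quarter at which splitting stops paying off."""
    return independence_benefit_per_quarter / coordination_cost_per_change


# Hypothetical numbers: a stable boundary versus one that moves with every model iteration.
stable_domain = prefer_split(2, 3.0, 10.0)     # 2 boundary changes per quarter
fast_ai_product = prefer_split(12, 3.0, 10.0)  # 12 boundary changes per quarter
threshold = break_even_changes(3.0, 10.0)
```

With these made-up inputs the split wins for the stable domain and loses for the fast-moving one, with the crossover at a little over three boundary changes per quarter. The real quantities are hard to measure; the model only makes explicit that the decision turns on how often the boundary itself changes.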
This parallels what I observed in the modular monolith: sometimes the discipline of keeping things together produces better outcomes than the flexibility of splitting things apart. The vertical AI stack is the AI-era version of this insight.
According to research from Meta AI and OpenAI, the teams producing the most impactful AI products are small, vertically integrated teams that own the full stack. This is not a coincidence. It is the natural outcome of optimizing for iteration speed in a domain where the technology changes faster than organizational processes can adapt.
What are the broader implications for system architecture in the AI era?
The rise of the vertical AI stack suggests that the optimal architecture depends not just on the system’s requirements but on the rate of change in those requirements. Fast-changing domains favor integration. Stable domains favor decomposition.
This challenges the one-size-fits-all mindset that microservices advocacy sometimes promotes. The right architecture is the one that matches the rate of change in your domain. For AI products in 2025 and 2026, that rate of change is extremely high, and vertical integration is the architecture that accommodates it. Whether this remains true as AI technology stabilizes is an open question. For now, the vertical AI stack is producing better products, faster, with smaller teams, and that is the only architectural argument that ultimately matters.