Streaming Architectures for Teams That Do Not Need Real-Time

After implementing Kafka for a team that processed 2 million events per day, I found that 94% of their analytics consumers queried data at hourly or daily granularity. The streaming infrastructure cost $4,200 per month. An equivalent micro-batch architecture would have cost $800 per month and delivered the same analytical outcomes. Most teams adopting streaming do not need real-time. They need “fast enough.”

Why do teams adopt streaming when they do not need it?

Teams adopt streaming because “real-time” sounds like an obvious improvement, because vendor marketing conflates real-time ingestion with real-time value, and because the social cost of saying “we don’t need real-time” in a technical organization is higher than the financial cost of implementing it unnecessarily.

A streaming architecture processes data continuously as it arrives, typically using message brokers (Kafka, Kinesis, Pulsar) and stream processing engines (Flink, Spark Streaming, ksqlDB). It provides sub-second data availability but at higher operational complexity and cost compared to batch or micro-batch alternatives.

I have been in 7 architecture discussions where someone proposed Kafka. In 5 of those, I asked: “What decision will be different if data arrives in 1 second versus 15 minutes?” In 3 of the 5, the honest answer was “none.” The dashboards refreshed hourly. The reports ran daily. The ML models retrained weekly. Sub-second data delivery was a technical capability without a business requirement.

Real-time and batch processing are genuinely converging at the infrastructure level. But at the requirements level, most analytical workloads are fundamentally batch: they aggregate, summarize, and compare data over time windows. Streaming those workloads adds complexity without improving outcomes.

What are the hidden costs of unnecessary streaming?

The hidden costs include operational complexity (24/7 monitoring of brokers, consumers, and partitions), staffing requirements (Kafka expertise is specialized and expensive), debugging difficulty (distributed system failures are harder to diagnose than batch failures), and the opportunity cost of engineering time spent managing infrastructure instead of delivering analytical value.

I tracked the operational burden of the Kafka deployment over 6 months:

Incidents: 14 production incidents related to the streaming layer, including consumer lag alerts, partition rebalancing failures, and serialization errors. The equivalent batch pipeline had 2 incidents in the same period.
Engineering time: 12 hours per week went to streaming infrastructure maintenance. The batch alternative required approximately 2 hours per week.
Cost: $4,200 per month for Kafka infrastructure (brokers, ZooKeeper, monitoring). The micro-batch alternative (scheduled Airflow jobs processing S3 files) was estimated at $800 per month.
Debugging complexity: Average incident resolution time was 3.2 hours for streaming issues versus 45 minutes for batch. Distributed state, consumer offsets, and exactly-once semantics create debugging challenges that batch processing avoids entirely.
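To make the gap concrete, the figures above can be rolled up into total engineering hours over the tracking period. This is a rough sketch rather than the accounting I actually used: the `Burden` class and its field names are made up for illustration, and treating 6 months as 26 weeks is an assumption.

```python
from dataclasses import dataclass

WEEKS_IN_PERIOD = 26  # assumption: 6 months is roughly 26 weeks


@dataclass(frozen=True)
class Burden:
    """Operational burden of one pipeline (hypothetical structure)."""
    incidents: int                     # production incidents in the period
    avg_resolution_hours: float        # average time to resolve one incident
    maintenance_hours_per_week: float  # routine infrastructure upkeep

    def incident_hours(self) -> float:
        # Time spent purely on resolving incidents.
        return self.incidents * self.avg_resolution_hours

    def total_hours(self, weeks: int = WEEKS_IN_PERIOD) -> float:
        # Incident time plus routine maintenance over the whole period.
        return self.incident_hours() + self.maintenance_hours_per_week * weeks


# Figures taken from the list above (45 minutes = 0.75 hours).
streaming = Burden(incidents=14, avg_resolution_hours=3.2, maintenance_hours_per_week=12)
batch = Burden(incidents=2, avg_resolution_hours=0.75, maintenance_hours_per_week=2)

print(f"streaming: {streaming.total_hours():.1f} engineering hours")
print(f"batch:     {batch.total_hours():.1f} engineering hours")
print(f"ratio:     {streaming.total_hours() / batch.total_hours():.1f}x")
```

Under these assumptions, routine maintenance dominates the total for both pipelines; the incident hours widen a gap that the weekly upkeep already creates.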

When does streaming actually make sense?

Streaming makes sense when the business requirement genuinely demands sub-minute data availability: fraud detection, real-time bidding, operational monitoring, live personalization, and any system where delayed data means missed decisions with immediate financial or safety consequences.

I maintain a simple test: “What is the cost of the data being 15 minutes old?” If the answer is “nothing meaningful changes,” streaming is overengineering. If the answer is “we lose money, miss fraud, or endanger safety,” streaming is justified. In my experience, fewer than 20% of data use cases pass this test. The fundamental tradeoff in stream processing is latency versus complexity. Organizations should only accept the complexity when the latency reduction has demonstrable value.
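The 15-minute test can be written down as a small decision function. What follows is only a sketch: the `UseCase` structure, the `recommend` function, and the staleness tolerances in the examples are hypothetical values chosen for illustration, not measurements.

```python
from dataclasses import dataclass
from enum import Enum

MICRO_BATCH_INTERVAL_S = 15 * 60  # the 15-minute threshold from the test


class Recommendation(Enum):
    STREAMING = "streaming"
    MICRO_BATCH = "micro-batch"


@dataclass(frozen=True)
class UseCase:
    name: str
    # How old data may be before a decision changes or value is lost.
    max_tolerable_staleness_s: float


def recommend(use_case: UseCase,
              interval_s: float = MICRO_BATCH_INTERVAL_S) -> Recommendation:
    """Recommend streaming only if the micro-batch interval is too slow."""
    if use_case.max_tolerable_staleness_s < interval_s:
        return Recommendation.STREAMING
    return Recommendation.MICRO_BATCH


# Illustrative tolerances (assumed): fraud checks need seconds,
# an hourly dashboard tolerates an hour.
for case in (UseCase("fraud detection", 1), UseCase("hourly dashboard", 3600)):
    print(f"{case.name}: {recommend(case).value}")
```

The hard part is not the comparison but getting an honest number for the tolerance, which is why the question has to be asked of the people who make the decisions.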

For the 80% or more that do not pass, micro-batch (processing data every 5 to 15 minutes using scheduled jobs) provides “near-real-time” at a fraction of the cost and complexity. The boring technology principle applies: a scheduled SQL query that runs every 10 minutes is easier to build, debug, monitor, and maintain than a Kafka consumer group with exactly-once processing guarantees.
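The core of a micro-batch job is small: on each scheduled run, work out the most recent complete time window and process only the events inside it. A minimal sketch, assuming timezone-aware UTC timestamps and events held as `(timestamp, payload)` tuples; the function names are hypothetical, and a real job would read from storage rather than a list.

```python
from __future__ import annotations

from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def last_complete_window(now: datetime,
                         interval: timedelta) -> tuple[datetime, datetime]:
    """Return [start, end) of the most recent fully elapsed window."""
    # Align window boundaries to multiples of the interval since the epoch.
    windows_elapsed = (now - EPOCH) // interval
    end = EPOCH + windows_elapsed * interval
    return end - interval, end


def events_in_window(events: list[tuple[datetime, str]],
                     start: datetime,
                     end: datetime) -> list[tuple[datetime, str]]:
    """Keep events with start <= timestamp < end (half-open interval)."""
    return [e for e in events if start <= e[0] < end]


# A run triggered at 12:07:30 UTC with a 10-minute interval
# processes the window from 11:50 to 12:00.
now = datetime(2024, 1, 1, 12, 7, 30, tzinfo=timezone.utc)
start, end = last_complete_window(now, timedelta(minutes=10))
print(start.isoformat(), "->", end.isoformat())
```

Because each window is half-open and derived from the clock rather than from stored state, rerunning a failed window is just a matter of calling the job again with the same boundaries. That is much of what makes batch failures easier to debug than consumer offset problems.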

What should teams consider before adopting streaming?

Before adopting streaming, teams should document the specific business decisions that require sub-minute data, calculate the total cost of ownership (not just infrastructure but staffing and operational overhead), and prototype with micro-batch first to establish whether the latency is actually insufficient.
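A total cost of ownership estimate does not need to be elaborate. The sketch below adds staffing to the infrastructure bill, using the infrastructure costs and maintenance hours from the Kafka deployment described earlier. The `monthly_tco` function and the $100 hourly engineering rate are assumptions made for illustration; substitute your own loaded rate.

```python
WEEKS_PER_MONTH = 52 / 12  # average number of weeks in a month


def monthly_tco(infra_usd: float,
                maintenance_hours_per_week: float,
                hourly_rate_usd: float) -> float:
    """Monthly infrastructure cost plus the cost of maintenance time."""
    staffing_usd = maintenance_hours_per_week * WEEKS_PER_MONTH * hourly_rate_usd
    return infra_usd + staffing_usd


HOURLY_RATE_USD = 100  # assumed loaded engineering rate, not from the deployment

streaming_tco = monthly_tco(4200, 12, HOURLY_RATE_USD)
micro_batch_tco = monthly_tco(800, 2, HOURLY_RATE_USD)

print(f"streaming:   ${streaming_tco:,.0f} per month")
print(f"micro-batch: ${micro_batch_tco:,.0f} per month")
```

At that assumed rate, staffing costs more than the streaming infrastructure itself, which is why an infrastructure-only comparison understates the difference.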

The prototype-first approach has saved teams I have worked with significant cost. In 3 cases, the micro-batch prototype met business requirements without modification. In 1 case, it revealed that only 2 of 12 data streams needed real-time processing, allowing a hybrid architecture where 10 streams used batch and 2 used streaming. The total cost was $1,800 per month instead of $4,200 for the full streaming deployment. The “architecture without Netflix’s problems” mindset applies: build for your actual scale and requirements, not for the requirements of organizations 100x your size.

Streaming is a powerful pattern for the problems that genuinely require it. For everything else, it is expensive complexity masquerading as technical sophistication. The question is not “can we build this with streaming?” The question is “do we need to?” Most teams, if they answer honestly, do not.

