
Software architecture guide: monolith vs microservices (2026)

Count your developers. That number – not your traffic – decides your architecture. Real cost data from Amazon, Shopify, and Monzo.

A founder walked into our office last year with a 12-service Kubernetes cluster, a service mesh, a message queue, and 340 users. Four developers. AWS bill: $4,200 a month. Separate services for auth, notifications, user profiles, analytics, payments – the works. He wanted us to add a feature. We told him to delete eleven services first.

He thought we were joking. We weren’t. We consolidated everything into a single application on Cloudflare Workers. Same features. Same 340 users. New infrastructure cost: $45 a month. His CTO had spent six months building the architecture Netflix uses – for a product with fewer users than a mid-size restaurant has on a Saturday night.

The only question that matters

How many developers will work on this codebase at the same time?

Not how many users. Not how much traffic. Not what you read on Hacker News. How many developers. Martin Fowler spent years studying microservices projects and landed on this: “Almost all the successful microservice stories have started with a monolith that got too big and was broken up. Almost all the cases where I’ve heard of a system that was built as a microservice system from scratch, it has ended up in serious trouble” [1].

Four case studies follow – including the one company where microservices were the right call from day one.

What this decision actually costs

Before the numbers: serverless is a deployment target, not an architecture. You can deploy a modular monolith to Cloudflare Workers just as easily as to a VPS. The founder we mentioned? His consolidated monolith runs on Workers. It’s serverless. It’s also a monolith. These aren’t opposites.

We’ve built or audited 40+ products in three years. Here’s what architecture choice does to your budget, based on our project data:

|  | Monolith (VPS/container) | Monolith (serverless) | Microservices |
| --- | --- | --- | --- |
| MVP development | $25,000-$45,000 | $20,000-$40,000 | $60,000-$140,000 |
| Infrastructure (0-1K users) | $50-$200/mo | $5-$100/mo | $500-$2,000/mo |
| Infrastructure (10K+ users) | $400-$2,000/mo | $200-$3,000/mo | $2,000-$18,000/mo |
| Minimum team size | 1-3 | 1-3 | 5-15+ |
| Time to first deploy | 8-12 weeks | 6-10 weeks | 16-24 weeks |

The microservices column isn’t more expensive because the product is bigger. It’s more expensive because you’re building two things: your product and the plumbing to connect a dozen small applications that used to be one. Service discovery, distributed tracing, container orchestration, separate deployment pipelines. None of that ships a feature. All of it costs money forever.

We break down every line item in our software development cost guide.

Amazon built it the “right” way. Then the bill arrived.

In 2023, Amazon’s Prime Video team published a case study that broke the internet [2]. Their video quality monitoring tool – the thing that detects glitches in your stream – was built exactly how AWS tells you to build things: Step Functions orchestrating Lambda functions, S3 passing data between steps. The architecture worked. It shipped fast, each stage could be developed and tested independently, and it handled their initial load.

Then they scaled to real production traffic and hit account limits at 5% of expected load [2].

The architecture didn’t crash. It became financially unsustainable. Two things killed the economics. First, AWS charges $0.025 per 1,000 state transitions in Step Functions [3]. Every second of video triggered multiple transitions – extract frame, analyze audio, analyze video, compare, log. Millions of seconds of video per day. Those fractions of a cent became real money.

Second – and this was worse – S3 was the glue between every step. Each Lambda function wrote its intermediate result (video frames, audio buffers) to an S3 bucket. The next function read it back. Thousands of Tier-1 S3 API calls per second, and the high volume of GET and LIST operations to that temporary bucket became the dominant cost [2]. They hit AWS account limits before they hit traffic limits.
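To see how fractions of a cent compound at this frequency, here's back-of-envelope arithmetic. The unit prices are public AWS rates ([3] for Step Functions; S3 Tier-1 requests such as LIST run about $0.005 per 1,000, GETs about $0.0004 per 1,000), but every volume below is an assumption for illustration – not Prime Video's actual numbers:

```typescript
// Back-of-envelope cost math. Unit prices are public AWS rates; all
// volumes are assumed for illustration, not Prime Video's real figures.
const secondsPerDay = 86_400;

// Step Functions: assume 2M seconds of video analyzed daily,
// 5 state transitions each (extract, analyze audio/video, compare, log).
const stepFnDaily = 2_000_000 * 5 * (0.025 / 1000);

// S3 glue: assume 2,000 LIST and 3,000 GET requests per second, sustained.
const s3Daily =
  2_000 * secondsPerDay * (0.005 / 1000) +  // LIST (Tier-1 rate)
  3_000 * secondsPerDay * (0.0004 / 1000);  // GET

console.log(stepFnDaily);         // 250 per day -> ~$7,500/month on orchestration alone
console.log(Math.round(s3Daily)); // 968 per day -> and S3 glue dwarfs even that
```

In a single process, both line items are zero: a function call costs nothing per invocation, and in-memory data passing has no request fee.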

They consolidated everything into a single process on ECS. Data that used to bounce through S3 now passed in-memory – just function calls. Orchestration that used to cost money per step became free.

90%
cost reduction after consolidating a serverless pipeline into a single process
[2] Prime Video Tech Blog, 2023

This wasn’t a failure of serverless deployment. It was a failure of serverless orchestration at high frequency. The initial architecture gave them fast iteration during development – exactly what it should have. The mistake was keeping it past the point where the unit economics stopped working. Step Functions is brilliant for workflows that fire thousands of times a day. It’s financially catastrophic for anything that runs millions of times per day.

Segment: 140 services, three engineers, barely shipping

Segment routes your analytics data to every tool you use – Google Analytics, Mixpanel, Amplitude, 140+ destinations. By 2016, each destination was its own microservice. Its own queue. Its own deployment [4].

By early 2017, three full-time engineers spent most of their time keeping the lights on. A shared library bug meant deploying a patch to 140+ services, each with drifted dependencies – a morning fix became a two-day marathon. One slow destination API caused head-of-line blocking across every other pipeline. The test suite took an hour because integration tests hit live third-party endpoints with expired keys and flaky networks.

They built Centrifuge – a single service that replaced everything. Testing went from hours to minutes. They shipped 46 improvements to shared libraries in the first year – up from 32 under the old architecture [4].

Same lesson as Prime Video. An architecture chosen for a scale they hadn’t reached, kept past the point where it stopped paying for itself.

284 million requests per minute. One codebase.

Every “monolith vs microservices” debate eventually gets the same response: “But what about scale? You’ll have to rewrite everything when you grow!”

Shopify’s answer: 2.8 million lines of Ruby. One codebase. 1,000+ engineers merging 400 commits per day. 1.19 trillion edge requests over Black Friday weekend 2024, with origin traffic peaking at 284 million requests per minute [5][6].

A deliberate engineering choice – and one that required years of investment they started before they needed it.

Packwerk. Around 2016-2017, Shopify assembled a team to make their monolith modular [7]. Three years later they released Packwerk, an open-source static analysis tool that enforces module boundaries at build time [8]. The codebase has 37 components, each with defined public APIs. If your payments code tries to import something from the shipping module’s internals, the build fails. Not “someone will catch it in code review.” The CI pipeline rejects it automatically.
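Packwerk itself is a Ruby tool, but the core idea – declared public APIs checked mechanically, not by reviewers – fits in a few lines. This TypeScript sketch is only an illustration of that idea; the module names and import paths are invented:

```typescript
// Toy boundary check in the spirit of Packwerk (which is a Ruby static
// analysis tool; this sketch only illustrates build-time enforcement).
// Module names and paths below are hypothetical.
type ModuleBoundary = { name: string; publicApi: Set<string> };

const boundaries: ModuleBoundary[] = [
  { name: "shipping", publicApi: new Set(["shipping/rates", "shipping/labels"]) },
  { name: "payments", publicApi: new Set(["payments/charge"]) },
];

// An import is allowed if it stays inside the importer's own module,
// or targets another module's declared public API.
function importAllowed(importer: string, imported: string): boolean {
  const importerModule = importer.split("/")[0];
  const importedModule = imported.split("/")[0];
  if (importerModule === importedModule) return true; // internal import, fine
  const boundary = boundaries.find((b) => b.name === importedModule);
  return boundary ? boundary.publicApi.has(imported) : false;
}

console.log(importAllowed("payments/charge", "shipping/rates"));    // true
console.log(importAllowed("payments/charge", "shipping/internal")); // false: build fails
```

Run a check like this in CI and "don't reach into other modules' internals" stops being a convention and becomes a gate.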

Pods. Each pod serves a subset of merchants with isolated failure domains. One store’s Black Friday traffic spike can’t take down another store. It’s horizontal isolation without splitting the codebase.

This is a modular monolith. One deployment. ACID transactions without saga patterns. One set of logs when something breaks at 3am. But internally, the modules are separated as if they were independent services.

This is the architecture we default to. When you hire us, your modules own their own tables, expose clean internal APIs, and your database design already supports extracting a service later. The extraction path is a strangler fig pattern: an API gateway or reverse proxy sits in front, routing requests by path. When you need to extract billing, /billing/* routes to the new service while everything else stays on the monolith. The new service starts with a read replica of the shared database, then migrates to its own. No rewrite. No flag day.
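The routing step is the whole trick, and it's small. A framework-free TypeScript sketch – the `/billing` prefix and upstream hosts are hypothetical examples, not a fixed scheme:

```typescript
// Strangler fig routing sketch: extracted modules route to their new
// service by path prefix; everything else keeps hitting the monolith.
// All names and URLs here are hypothetical.
type Upstream = { name: string; baseUrl: string };

const monolith: Upstream = { name: "monolith", baseUrl: "https://app.internal" };

const extracted: Record<string, Upstream> = {
  "/billing": { name: "billing-service", baseUrl: "https://billing.internal" },
};

function routeRequest(path: string): Upstream {
  for (const [prefix, upstream] of Object.entries(extracted)) {
    // Match the prefix exactly or as a full path segment, so that
    // "/billingx" does not accidentally route to the billing service.
    if (path === prefix || path.startsWith(prefix + "/")) return upstream;
  }
  return monolith;
}

console.log(routeRequest("/billing/invoices/42").name); // billing-service
console.log(routeRequest("/users/7").name);             // monolith
```

Extracting the next module means adding one entry to the map – the monolith never notices.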

When microservices are actually right: Monzo’s 2,800 services

It’s easy to read this and conclude microservices are always wrong. They’re not. They’re wrong for most teams at most stages.

Monzo – the UK neobank with over 9 million customers – runs 2,800 microservices managed by roughly 300 engineers [9]. Every new service starts in default-deny: it can’t talk to other services, can’t access databases, can’t reach the internet until access is explicitly granted through Kubernetes network policies. Here’s why this works for them and almost certainly wouldn’t work for you:

They’re a bank. PCI-DSS requires network segmentation for systems that touch cardholder data. You can achieve this with VLANs and firewalls in a monolith, but Monzo’s microservices give them granular isolation: the PCI audit scope covers only the handful of services that process payments, not all 2,800. That’s not an architecture preference – it’s a regulatory strategy that microservices made simpler to enforce.

They have 300 engineers. Dozens of product teams, each owning specific services, shipping 100+ deployments per day. The overhead of distributed tracing, deployment automation, and service mesh is justified because 300 people can’t deploy to one codebase without stepping on each other constantly.

They standardized everything. All 2,800 services are written in Go, stored in a monorepo, running on Kubernetes. No polyglot mess. No “this service is in Python, that one’s in Node, and nobody knows who maintains the Java one.” One language. One repo. One deployment system.
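Monzo's actual policies aren't public, but the default-deny baseline they describe maps onto a standard Kubernetes NetworkPolicy. A generic sketch (the namespace is hypothetical): an empty pod selector matches every pod in the namespace, and declaring both policy types with no allow rules blocks all traffic in and out until a separate policy grants it explicitly.

```yaml
# Standard Kubernetes default-deny; a generic sketch, not Monzo's config.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: payments        # hypothetical namespace
spec:
  podSelector: {}            # empty selector = every pod in the namespace
  policyTypes:
    - Ingress                # deny all inbound traffic
    - Egress                 # deny all outbound traffic
```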

To be clear: 300 engineers alone don’t justify 2,800 services. Shopify has more engineers and runs a monolith. What makes Monzo different is the combination – team size, regulatory isolation, and near-religious standardization. Remove any one of those three and the same architecture becomes a money pit. If you have compliance needs but fewer than 50 engineers, extract only the regulated components and keep everything else monolithic.

Kelsey Hightower said it best: organizations that couldn’t maintain discipline in a monolith split into microservices expecting to find the engineering discipline they never had [10]. They don’t find it. Monzo had the discipline first. If you have fewer than 50 engineers and no regulatory isolation requirement, microservices will cost you more than they save – and we’ll build you a monolith that’s ready to split the day you actually need to.

Serverless in 2026: deployment, not doctrine

Half the “serverless limitations” articles floating around are three to five years out of date. Cold starts? Cloudflare Workers use V8 isolates instead of containers – startup is sub-millisecond in warm datacenters, low single-digit milliseconds even on cold starts [11]. Invisible to end users. WebSockets? Durable Objects support persistent connections with hibernation – the connection stays open while the runtime sleeps, waking on the next message [12]. Lambda runs for up to 15 minutes [13]. Google Cloud Run offers serverless GPUs with sub-5-second startup for ML inference [14].

The 2020 version of “serverless can’t do X” is mostly gone. What remains is a pricing question.

Where serverless still bleeds money: sustained high-throughput compute. If your workload runs at constant high CPU for hours – batch ETL, video encoding, ML training – per-invocation pricing exceeds always-on containers. That’s the Prime Video lesson. High-frequency orchestration on serverless is financially catastrophic; the same code in a single process on a container costs a fraction.
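Rough crossover math makes the break-even concrete. Both prices below are assumptions for illustration – a Lambda-style compute rate of roughly $0.0000167 per GB-second against a flat $30/month for a 1 GB always-on container; plug in your provider's real rates:

```typescript
// Break-even sketch: per-invocation serverless compute vs a flat-rate
// container. Both prices are assumed for illustration; check current rates.
const perGbSecond = 0.0000167;          // USD, assumed serverless compute rate
const containerMonthly = 30;            // USD, assumed 1 GB always-on container
const secondsPerMonth = 30 * 24 * 3600; // 2,592,000

// GB-seconds at which the serverless bill matches the container:
const breakEvenGbSeconds = containerMonthly / perGbSecond;

// Fraction of the month a 1 GB workload must run flat-out before
// the always-on container becomes the cheaper option:
console.log((breakEvenGbSeconds / secondsPerMonth).toFixed(2)); // 0.69
```

Spiky, request-driven traffic rarely gets near that utilization, which is why serverless wins for APIs – and why a transcoding job pegging a CPU for hours loses badly.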

The other real limitation: cost predictability. Usage-based billing scales with traffic. Great when traffic is low. Terrifying when a viral moment turns your $50/month bill into $5,000 overnight. Budget caps and spending alerts aren’t optional – they’re mandatory.

Our default stack for MVPs and on-demand platforms: a modular monolith deployed to the edge on Cloudflare Workers, a managed database, static hosting for the frontend. When a workload needs sustained compute, we pair the serverless API with always-on containers for the heavy processing. If you’re building a progressive web app or single-page application, this handles everything – including real-time.

Five signs it’s time to extract your first service

Don’t extract preemptively. Don’t extract because a blog post said to. Extract when you see this:

  1. Two teams blocked on the same deployment pipeline for more than 2 sprints straight.
  2. One module’s load is radically different from everything else – 100x more compute, different scaling curve.
  3. A module needs a different technology – ML inference in Python while your app is TypeScript.
  4. Compliance requires physical isolation – a separate audit boundary for payments or health data.
  5. Your test suite takes 45+ minutes because one module’s integration tests drag down everyone else.

Zero or one of these? Stay monolithic. Two or three? Extract that specific module using the strangler fig approach described above – route its traffic to a new service, keep everything else on the monolith. All five? Plan a phased migration with domain boundaries mapped to team ownership.


Most founders we talk to are about to overspend on infrastructure they won’t need for years. We build modular monoliths with clean boundaries, deployed to the edge – the architecture that handles 99% of startups through Series B and beyond. Tell us what you’re building and we’ll scope it.

References

[1] M. Fowler, “Monolith First,” martinfowler.com, 2015.

[2] M. Kolny, “Scaling up the Prime Video audio/video monitoring service and reducing costs by 90%,” Prime Video Tech Blog, 2023. primevideotech.com

[3] AWS, “Step Functions Pricing – $0.025 per 1,000 state transitions,” 2025. aws.amazon.com

[4] A. Noonan, “Goodbye Microservices: From 100s of Problem Children to 1 Superstar,” Segment Engineering Blog, 2018. segment.com

[5] P. Downey, “Under Deconstruction: The State of Shopify’s Monolith,” Shopify Engineering, 2020. shopify.engineering

[6] Shopify, “Black Friday Cyber Monday Data 2024,” shopify.com/news/bfcm-data-2024

[7] Shopify, “Q4 2017 Financial Results – nearly 3,000 employees,” shopify.com; Shopify Engineering, “Deconstructing the Monolith,” 2019. shopify.engineering

[8] Shopify Engineering, “Enforcing Modularity in Rails Apps with Packwerk,” 2020. shopify.engineering

[9] Monzo, “How we run migrations across 2,800 microservices,” 2024. monzo.com/blog; InfoQ, “Planning, Automation and Monorepo,” 2024. infoq.com; The Register, “How does Monzo keep 1,600 microservices spinning?” 2020. theregister.com

[10] Changelog, “Monoliths are the future – Kelsey Hightower discussion,” 2023. changelog.com

[11] Cloudflare, “Eliminating cold starts with Cloudflare Workers,” 2024. blog.cloudflare.com

[12] Cloudflare, “WebSocket Hibernation – Durable Objects,” 2024. developers.cloudflare.com

[13] AWS, “Lambda quotas – execution timeout: 900 seconds,” 2025. docs.aws.amazon.com

[14] Google Cloud, “Cloud Run GPUs are now generally available,” 2025. cloud.google.com/blog

Frequently asked questions

What is software architecture?

How your code is organized and deployed. The main architectural choice is monolith vs microservices; serverless is a deployment target for either. The choice determines development cost, team velocity, and operational complexity.

Should a startup use microservices or a monolith?

A monolith. Martin Fowler's rule: almost all successful microservice stories started with a monolith that got too big. For most startups, that day never comes.

When should you migrate from monolith to microservices?

When teams block each other on deploys – usually around 15+ developers. Start with the strangler fig pattern: route new features to a new service while the monolith stays running.

We build the architecture this article describes.

Modular monolith. Clean module boundaries. Database schema ready to split. Deployed to the edge. Ready to extract services the day you actually need them – not six months before.

Or leave your details – we'll reach out within 24h.

Build the right architecture.