How to architect your MVP so it doesn't need a full rewrite at 10k users
The rewrite story nobody talks about honestly
There's a version of this that every engineering team goes through. You build the MVP fast, it works, users show up, and then around 5k to 10k users something starts to feel wrong. Response times creep up. A background job blocks an API call. The codebase has grown in ways that made sense three months ago and make no sense now. And then comes the conversation nobody wants to have: we need to rewrite this.
The frustrating part is that most of the time, the rewrite wasn't inevitable. It came from architecture decisions made on day one that were never revisited.
I'm 23, two years into the industry, currently building VedaStack on the side while transitioning to a new role. My previous company was RabbitLoader, a web performance and optimisation product. When I joined, the entire backend was in PHP, frontend in Bootstrap and jQuery. In my first year I was part of rewriting both: backend in Go, frontend in React. The difference was immediate. Same workloads, smaller AWS VMs, with headroom to spare. That's not a theoretical argument for Go's resource efficiency; I watched it happen on real traffic.
So when I talk about architecture decisions that avoid rewrites, I'm not working from blog posts. I'm working from having done the rewrite.
Why most MVPs break before they need to
The most common culprit isn't the language or the framework. It's that everything runs in one process and nobody thought about what happens when that process gets busy.
Sending a welcome email inside an API request handler. Resizing an image synchronously before returning a response. Running a report generation job on the same server that handles user logins. All of these are the same problem: work that doesn't need to happen immediately is blocking work that does.
- Everything in one service. Works fine at 100 users. At 5k, one slow endpoint degrades everything else.
- Background work mixed into request handlers. The moment a job takes 2 seconds, your API latency looks like garbage.
- No queue, just direct function calls. If the background task fails, it's gone. No retry, no visibility, no recovery.
- Serverless for everything, forever. Great at the start, expensive and limiting past a few thousand users. Covered this in the last post.
- Premature microservices. Splitting into ten services before you understand the domain boundaries just gives you a distributed monolith that's harder to debug.
The fix for most of these is the same: two services, a clean separation of concerns, and a language that makes both cheap to run.
The architecture I'd actually use
Two Go binaries on one VPS, a frontend on Vercel, and a database. No service mesh, no container orchestration, no infrastructure team required. This setup will take you comfortably to 10k users and, depending on your workload, well beyond.
Service one: the API service
The API service handles everything user-facing. HTTP requests, authentication checks, reading and writing to the database, returning responses. Its only job is to be fast and reliable for the person on the other end of the connection.
The rule is strict: the API service does not do slow work. If something takes more than a few hundred milliseconds, it doesn't belong here. It gets pushed to a queue and handed to the worker.
```go
func main() {
	db := database.Connect()
	queue := jobqueue.New(db)
	router := chi.NewRouter()

	router.Use(middleware.Logger)
	router.Use(middleware.Recoverer)

	router.Post("/api/users/signup", handlers.Signup(db, queue))
	router.Get("/api/users/me", handlers.Me(db))
	router.Post("/api/images/upload", handlers.Upload(db, queue))

	log.Println("API service running on :8080")
	http.ListenAndServe(":8080", router)
}
```
The queue gets passed into handlers that need to dispatch background work. The handler does the fast part — validate, save to DB, return 200 — and drops a job on the queue for the slow part: send email, process image, generate report. The user gets a fast response. The work still happens.
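Here's roughly what one of those handlers looks like. This is a sketch, not our production code: the Enqueuer interface, the users table columns, and the Postgres-style placeholders are all assumptions you'd swap for your own.

```go
package handlers

import (
	"database/sql"
	"encoding/json"
	"log"
	"net/http"
)

// Enqueuer is whatever your job queue exposes; one method is enough here.
type Enqueuer interface {
	Enqueue(jobType string, payload any) error
}

// Signup does the fast work inline and hands the slow work to the queue.
func Signup(db *sql.DB, queue Enqueuer) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		var req struct {
			Email string `json:"email"`
			Name  string `json:"name"`
		}
		if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
			http.Error(w, "invalid body", http.StatusBadRequest)
			return
		}

		// Fast part: persist the user and get an ID back.
		var userID int64
		err := db.QueryRowContext(r.Context(),
			`INSERT INTO users (email, name) VALUES ($1, $2) RETURNING id`,
			req.Email, req.Name).Scan(&userID)
		if err != nil {
			http.Error(w, "could not create user", http.StatusInternalServerError)
			return
		}

		// Slow part: the welcome email goes on the queue, not in this request.
		if err := queue.Enqueue("send_welcome_email", map[string]any{"user_id": userID}); err != nil {
			log.Printf("enqueue welcome email: %v", err)
		}

		w.Header().Set("Content-Type", "application/json")
		json.NewEncoder(w).Encode(map[string]any{"id": userID, "email": req.Email})
	}
}
```

The shape is the same for the image upload handler: accept, persist, enqueue, return.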
For routing, the sketch above uses chi, which is lightweight, idiomatic, and handles middleware cleanly; fiber is another option I like. No strong opinion here though, use what you're comfortable with.
Service two: the background worker
The worker runs separately from the API. Its job is to pick jobs off the queue, process them, and mark them done.
Because it's a separate process, a slow or expensive job in the worker has zero impact on API response times. If the worker falls behind, the API doesn't care. If the worker crashes, the API keeps running. The jobs stay in the queue and the worker picks them up when it comes back.
```go
// worker/main.go — background worker entry point
func main() {
	db := database.Connect()

	for {
		job, err := queue.Dequeue(db)
		if err != nil || job == nil {
			time.Sleep(2 * time.Second)
			continue
		}

		switch job.Type {
		case "send_welcome_email":
			workers.SendWelcomeEmail(job)
		case "process_image":
			workers.ProcessImage(job)
		case "generate_report":
			workers.GenerateReport(job)
		}

		queue.MarkDone(db, job.ID)
	}
}
```
For early-stage products, your database is a perfectly good job queue. A jobs table with a status column, a type, a payload, and a created_at is all you need. The worker polls it on an interval. When job volume grows to the point where polling feels wasteful, you swap in Redis or something like River, a Go-native job queue built on Postgres. But that's a future problem.
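If you take the Postgres-as-queue route, the dequeue side is roughly one query. A minimal sketch, assuming a jobs table with id, type, payload, status, and created_at columns (this is not River's API): FOR UPDATE SKIP LOCKED is what lets more than one worker poll the same table without claiming the same job.

```go
package queue

import (
	"database/sql"
	"errors"
)

// Job mirrors a row in the hypothetical jobs table.
type Job struct {
	ID      int64
	Type    string
	Payload []byte
}

// Dequeue claims the oldest pending job, or returns nil if the queue is empty.
func Dequeue(db *sql.DB) (*Job, error) {
	row := db.QueryRow(`
		UPDATE jobs
		SET status = 'running'
		WHERE id = (
			SELECT id FROM jobs
			WHERE status = 'pending'
			ORDER BY created_at
			FOR UPDATE SKIP LOCKED
			LIMIT 1
		)
		RETURNING id, type, payload`)

	var j Job
	if err := row.Scan(&j.ID, &j.Type, &j.Payload); err != nil {
		if errors.Is(err, sql.ErrNoRows) {
			return nil, nil // nothing pending, the worker just sleeps
		}
		return nil, err
	}
	return &j, nil
}
```

MarkDone is the mirror image: a single UPDATE setting status to done for that ID.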
We're using this exact pattern at VedaStack for the image optimisation service. Upload request comes in, API returns immediately with a job ID, worker picks it up, processes it with Go on Vultr, writes the result. The user polls for status or gets a notification when it's done. Clean, boring, works.
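The polling half is equally small. A hypothetical status endpoint against the same jobs table; the route parameter and column names are illustrative, not what VedaStack actually exposes.

```go
package handlers

import (
	"database/sql"
	"encoding/json"
	"errors"
	"net/http"

	"github.com/go-chi/chi/v5"
)

// JobStatus is the endpoint the frontend polls while the worker runs,
// registered as something like router.Get("/api/jobs/{id}", handlers.JobStatus(db)).
func JobStatus(db *sql.DB) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		id := chi.URLParam(r, "id")

		var status string
		err := db.QueryRowContext(r.Context(),
			`SELECT status FROM jobs WHERE id = $1`, id).Scan(&status)
		if errors.Is(err, sql.ErrNoRows) {
			http.Error(w, "not found", http.StatusNotFound)
			return
		}
		if err != nil {
			http.Error(w, "lookup failed", http.StatusInternalServerError)
			return
		}

		w.Header().Set("Content-Type", "application/json")
		json.NewEncoder(w).Encode(map[string]string{"id": id, "status": status})
	}
}
```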
The frontend: Vercel and SSR
Put your frontend on Vercel and use server-side rendering. Next.js is the obvious choice, though Nuxt works just as well if you're Vue-inclined. SSR gives you better SEO out of the box, faster initial loads, and you don't have to think about CDN configuration because Vercel handles it.
The frontend talks to your Go API service. Keep that boundary clean. The frontend should not know anything about your database, your queue, or your worker. It knows about API endpoints and nothing else. This means you can change either side without touching the other.
Vercel's free and hobby tiers are generous enough for early traction. As I wrote in the build vs buy post, serverless makes sense up to around 5k MAU. Past that, move the API to a VPS while keeping the frontend on Vercel — that part genuinely benefits from edge distribution.
Why Go specifically
Because two Go binaries on a cheap VPS will handle a level of traffic that would bring most other setups to their knees, and they'll do it quietly.
I saw this firsthand at RabbitLoader. After the rewrite from PHP to Go, the same traffic ran on smaller, cheaper VMs with resource headroom left over. Go's memory usage per request is tiny. A goroutine starts with about 2KB of stack, and you can have tens of thousands of them active at once. A 2-core, 4GB VPS running a Go API service will handle thousands of concurrent connections without complaint. I went into the full detail in the Golang vs Node.js post.
For the worker specifically, Go's concurrency model is a natural fit. Ten or twenty goroutines each processing a different job in parallel, clean cancellation via context, almost no memory overhead. The deployment story is just as good — each service compiles to a single static binary. Copy it to the server, run it. Put Caddy in front for TLS, wire up GitHub Actions to deploy on push, and you're done.
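To make that concrete, here's a sketch of the worker pool shape: a handful of goroutines pulling from the queue, all stopped cleanly through a context when the process gets a shutdown signal. The dequeue and handle functions are stand-ins for your own queue and dispatch code, and ten workers is an arbitrary number.

```go
package main

import (
	"context"
	"log"
	"os/signal"
	"sync"
	"syscall"
	"time"
)

// Job is a stand-in for whatever your queue returns.
type Job struct {
	ID   int64
	Type string
}

// dequeue and handle are placeholders for your queue.Dequeue and job dispatch.
func dequeue(ctx context.Context) (*Job, error) { return nil, nil }
func handle(ctx context.Context, j *Job) error  { return nil }

func main() {
	// Cancel the context on SIGINT/SIGTERM so workers stop between jobs.
	ctx, stop := signal.NotifyContext(context.Background(), syscall.SIGINT, syscall.SIGTERM)
	defer stop()

	const numWorkers = 10
	var wg sync.WaitGroup

	for i := 0; i < numWorkers; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			for {
				select {
				case <-ctx.Done():
					return // clean shutdown between jobs
				default:
				}

				job, err := dequeue(ctx)
				if err != nil || job == nil {
					time.Sleep(2 * time.Second)
					continue
				}
				if err := handle(ctx, job); err != nil {
					log.Printf("worker %d: job %d failed: %v", id, job.ID, err)
				}
			}
		}(i)
	}

	wg.Wait()
}
```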
What this looks like in practice
A user signs up on your Next.js frontend. The form posts to your Go API service. The API validates the input, creates the user in Turso or self-hosted Postgres, inserts a send_welcome_email job into the jobs table, and returns a 200 with the user object. Total time: under 100ms.
Meanwhile the worker is polling the jobs table. It picks up the welcome email job, calls Resend's API, marks the job done. If Resend is slow or down, the job retries. The user never knows. The API was fast regardless.
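The retry behaviour is one extra column and one conditional. A sketch assuming an attempts column on the same hypothetical jobs table; capping at a fixed number of attempts before parking the job as failed is a policy choice, not a requirement.

```go
package queue

import "database/sql"

// MarkFailed puts a job back to 'pending' for another attempt, or parks it
// as 'failed' once it has been tried maxAttempts times.
func MarkFailed(db *sql.DB, jobID int64, maxAttempts int) error {
	_, err := db.Exec(`
		UPDATE jobs
		SET attempts = attempts + 1,
		    status = CASE WHEN attempts + 1 >= $2 THEN 'failed' ELSE 'pending' END
		WHERE id = $1`,
		jobID, maxAttempts)
	return err
}
```

The worker calls this instead of MarkDone when the email provider errors out; the job becomes pending again and gets picked up on a later poll.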
A week later you add profile photo uploads. The API accepts the file, saves it to Cloudflare R2, inserts a process_image job, returns immediately. The worker resizes and compresses it in Go, saves the result, updates the user record. The API was fast, the slow work happened separately.
This pattern handles most of what a typical SaaS product needs. Reports, emails, notifications, file processing, webhook deliveries, scheduled tasks. All of it goes through the worker. The API stays fast.
When this setup stops working
Later than you think. A well-structured two-service Go setup on a decent VPS can handle significant traffic. The signals that it's actually time to split further are concrete: one feature is consuming all the worker capacity and starving other jobs, one endpoint accounts for 80% of database load, or the team is large enough that people are stepping on each other in the same codebase. Those are real reasons. "We might need to scale this independently someday" is not.
When that time comes, the split is clean because you've already separated concerns. The worker becomes two workers. The API splits along domain lines. The foundation holds.
Start boring. The architecture that gets you to 10k users is not the one that impresses people at conferences. It's two services, one database, one VPS, and a frontend on Vercel. It works.
Related reading
- Golang vs Node.js for Backend APIs in 2026: A Founding Engineer's Take. I've shipped both at scale. Here's the honest breakdown: concurrency models, benchmarks, DX, and when I actually reach for Go over Node in 2026.
- Build vs buy: the API integration question every startup gets wrong. Most startups overbuild. Auth, billing, database: here's what I actually buy (Clerk, Turso, Stripe) vs what I build, and when that changes as you scale.

Building a backend and unsure which stack fits your scale? VedaStack helps teams make these calls before they become expensive. We can help you choose, implement, and scale the right path.