Building CloudArc: Lessons from a 7-Tier REST API
December 18, 2025 · 6 min read
When I started CloudArc, I told myself I'd keep it simple. That didn't last long.
What began as a straightforward Node.js API for managing cloud resource metadata turned into a seven-layer system with Docker Compose orchestration, Nginx as a reverse proxy, Prometheus + Grafana for observability, a Redis caching layer, and a full CI/CD pipeline running on a live VPS. This post is about why each of those layers exists, the decisions I got wrong, and what I'd do differently.
The 7 Layers
1. Reverse Proxy (Nginx) — All traffic enters through Nginx. It terminates TLS, handles rate limiting at the edge, and routes to the correct upstream service. I initially skipped this layer and exposed my Node server directly. That was a mistake — Nginx gives you compression, security headers, and connection pooling essentially for free.
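A minimal sketch of what that layer looks like in nginx config — the upstream name, certificate paths, and rate-limit numbers are placeholders, not CloudArc's actual config:

```nginx
# Edge rate limiting: 10 req/s per client IP, 10 MB of tracking state.
limit_req_zone $binary_remote_addr zone=api_limit:10m rate=10r/s;

upstream cloudarc_api {
    server api:3000;
    keepalive 32;                # pooled connections to the Node upstream
}

server {
    listen 443 ssl;
    server_name example.com;
    ssl_certificate     /etc/nginx/certs/fullchain.pem;
    ssl_certificate_key /etc/nginx/certs/privkey.pem;

    gzip on;                                             # compression for free
    add_header X-Content-Type-Options nosniff always;    # one of the security headers

    location / {
        limit_req zone=api_limit burst=20 nodelay;
        proxy_pass http://cloudarc_api;
        proxy_http_version 1.1;      # required for upstream keepalive
        proxy_set_header Connection "";
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
```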
2. Router (Express) — Thin routing layer. One responsibility: map an HTTP method + path to a controller. No business logic, no database calls. I learned this the hard way when I had 200-line route handlers early on.
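To make "thin" concrete, here is the routing responsibility reduced to a dependency-free dispatch table — a stand-in for Express, with illustrative handlers rather than CloudArc's real controllers:

```javascript
// A "thin router" in miniature: map method + path to a controller, nothing else.
// No business logic, no database calls — just delegation.
const routes = new Map([
  ["GET /resources", (req) => ({ status: 200, body: ["vm-1", "vm-2"] })],
  ["POST /resources", (req) => ({ status: 201, body: { id: "vm-3" } })],
]);

function dispatch(method, path, req = {}) {
  const handler = routes.get(`${method} ${path}`);
  if (!handler) return { status: 404, body: { error: "not found" } };
  return handler(req);
}
```

In Express the Map becomes `router.get(...)` / `router.post(...)` calls, but the constraint is the same: if a route handler grows past a few lines, the logic belongs in a lower layer.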
3. Controller — Validates the incoming request shape using Zod. If validation fails, it returns a 400 immediately. If it passes, it calls the service layer and maps the result to an HTTP response. Controllers don't know how data is stored.
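The controller's contract can be sketched like this — a hand-rolled check stands in for Zod so the example is dependency-free, and `createResourceService` is a hypothetical service-layer function:

```javascript
// Validate the request shape; on failure return 400 immediately,
// on success delegate to the service and map its result to HTTP.
function validateCreateBody(body) {
  const errors = [];
  if (typeof body.name !== "string" || body.name.length === 0)
    errors.push("name must be a non-empty string");
  if (!["vm", "bucket", "db"].includes(body.kind))
    errors.push("kind must be one of vm | bucket | db");
  return errors;
}

function createResourceController(req, createResourceService) {
  const errors = validateCreateBody(req.body ?? {});
  if (errors.length > 0) return { status: 400, body: { errors } }; // fail fast
  const result = createResourceService(req.body); // no storage details here
  return { status: 201, body: result };
}
```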
4. Service — This is where the business logic lives. Services are pure functions as much as possible — they take validated input, apply domain rules, and return a result. They call the repository for persistence, never the database directly.
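A sketch of that shape — the domain rule and repository interface here are invented for illustration, not CloudArc's actual logic:

```javascript
// Service layer: validated input in, domain rule applied, repository called
// for persistence. It never touches the database directly.
function createResource(input, repo) {
  // Example domain rule: production resources must carry an owner tag.
  if (input.env === "prod" && !input.owner) {
    return { ok: false, reason: "prod resources require an owner" };
  }
  const record = { ...input, createdAt: new Date().toISOString() };
  repo.save(record); // persistence goes through the repository abstraction
  return { ok: true, resource: record };
}
```

Passing the repository in as an argument keeps the function trivially testable: a test can hand it an in-memory stub and assert on the result alone.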
5. Repository — The data access layer. Every database query lives here. Swapping Postgres for MySQL should only require touching this layer. In practice this abstraction saved me twice: once when I migrated from raw pg to Prisma, and once when I added read replicas.
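The point of the abstraction is that the service depends only on a small interface. An in-memory version (sketched below, with illustrative method names) and a Postgres/Prisma version are then interchangeable:

```javascript
// In-memory repository with the same shape a Postgres-backed one would have.
// Every "query" lives behind these methods; callers never see storage details.
function makeInMemoryResourceRepo() {
  const rows = new Map();
  return {
    save(resource) { rows.set(resource.id, resource); },
    findById(id) { return rows.get(id) ?? null; },
    listByKind(kind) {
      return [...rows.values()].filter((r) => r.kind === kind);
    },
  };
}
```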
6. Cache (Redis) — Sits between the service and repository for read-heavy endpoints, using the cache-aside pattern: check the cache first, fall through to the database on a miss, and write the result back to the cache so the next read hits. TTLs are set per-resource type rather than globally.
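Cache-aside in miniature — a Map stands in for Redis here so the sketch is self-contained, and the TTL values are examples, not CloudArc's real numbers:

```javascript
// Per-resource-type TTLs instead of one global value (illustrative numbers).
const TTL_MS = { resource: 60_000, metrics: 5_000 };
const cache = new Map(); // key -> { value, expiresAt }

async function getCached(type, key, loadFromDb) {
  const entry = cache.get(key);
  if (entry && entry.expiresAt > Date.now()) return entry.value; // hit
  const value = await loadFromDb(key); // miss: fall through to the DB
  cache.set(key, { value, expiresAt: Date.now() + (TTL_MS[type] ?? 30_000) });
  return value; // written back on the miss, so the next read is a hit
}
```

With Redis, the Map operations become GET/SET with an EX argument, but the read path is the same three steps.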
7. Observability (Prometheus + Grafana) — Every service exposes a /metrics endpoint. Prometheus scrapes every 15 seconds. I track request duration by route, error rates, cache hit ratios, and DB query latency. This layer has already saved me twice: once by surfacing an N+1 query I'd missed in code review, and once by catching a memory leak before it became an incident.
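The scrape side of that setup is a few lines of Prometheus config — job name and target are placeholders matching the 15-second interval described above:

```yaml
# Illustrative Prometheus scrape config; job and target names are examples.
scrape_configs:
  - job_name: cloudarc-api
    scrape_interval: 15s
    metrics_path: /metrics
    static_configs:
      - targets: ["api:3000"]
```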
What I Got Wrong
Over-engineering the repository abstraction early. I spent two days building a generic repository interface before I had enough queries to understand the actual patterns. Write real queries first. Abstract later, when the duplication becomes obvious.
Not thinking about connection pooling until it hurt. Under load, my Postgres connection count spiked. The fix was adding pg-pool with sensible limits — but I should have configured this from day one.
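The fix amounts to a small options object. These numbers are hypothetical, not CloudArc's actual limits; in the real app this object would be passed to `new Pool(poolConfig)` from the pg package:

```javascript
// Sensible pool limits (example values) — the point is that they exist at all.
const poolConfig = {
  max: 10,                        // hard cap on concurrent connections
  idleTimeoutMillis: 30_000,      // release idle clients back to the pool
  connectionTimeoutMillis: 2_000, // fail fast instead of queueing forever
};
```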
Stateless assumptions in CI/CD. My first pipeline rebuilt every Docker layer on every commit, even when dependencies hadn't changed. Adding Docker layer caching to the workflow cut build times from ~4 minutes to under 1 minute.
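The Dockerfile side of layer caching is mostly ordering: copy the manifests and install before copying source, so the dependency layer is reused whenever only code changes. A sketch, assuming a Node app with an entry point like server.js:

```dockerfile
FROM node:20-slim
WORKDIR /app

# Dependency layer: invalidated only when the manifests change.
COPY package.json package-lock.json ./
RUN npm ci --omit=dev

# Source layer: changes on every commit, but the layers above stay cached.
COPY . .
CMD ["node", "server.js"]
```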
The Part That Surprised Me
Observability was the highest return-on-investment thing I added. It felt like overhead at first — extra setup, extra services to run, extra things to maintain. But having real metrics meant every performance investigation started from data rather than guesswork. I've started adding it much earlier in projects now.
If you're building a personal project and skipping observability because "it's just for me" — add it anyway. Future you will be grateful.
CloudArc is open source. View on GitHub →