01
The reverse proxy — your app's front door
A reverse proxy sits in front of your application servers, takes every client request, forwards it to a backend, and relays the response back. To the outside world it looks like one server.
You already met it in Part 1: remember FD 51, the connection NGINX opened out to the backend in the request round-trip? That outbound leg was reverse proxying. Here’s why you’d want a middleman there at all.
The problem — a naked app server
Picture your Node/Python/Ruby app server with a public IP, exposed directly. It now has to personally do everything:
internet ──►[ app server ] ← personally responsible for EVERYTHING:
• TLS handshakes (crypto-heavy, not its job)
• serving static files (logo, CSS, JS)
• app logic (the only thing it SHOULD do)
• being the single thing everyone hits
slow client ·····(dribbles request over 8s on bad 3G)·····► ties up a
whole app worker for the ENTIRE 8 seconds
Four concrete pains: it burns cycles on jobs it’s bad at (TLS, static files); slow clients are poison (one bad-3G client on an 8-second request holds an expensive app worker hostage the whole time); no zero-downtime deploys (restart the app and everyone hits a dead server); and you can’t scale out cleanly (one server is a single point of failure).
The reverse proxy is the maître d’ (our anchor from Part 1). Guests never march into the kitchen and shout their order at a chef — they tell the host, who relays it to the right kitchen and brings the plate back. The kitchen is shielded; the guest only ever talks to the front.
“Proxy” hides two opposite jobs. A forward proxy stands in front of clients and hides them from servers (a corporate web filter; your browser routed through a company proxy) — it acts on the client’s behalf. A reverse proxy stands in front of servers and hides them from clients — it acts on the server’s behalf. Forward proxy = a personal assistant who makes calls for you. Reverse proxy = the company receptionist who answers every incoming call. Our maître d’ is the receptionist. (The forward/reverse distinction is settled — NGINX glossary, MDN.)
The payoff that ties back to Part 1 — the slow-client buffer
Without NGINX: app worker is BUSY the full 8s the slow client dribbles.
With NGINX: NGINX (event-driven, idle connections ~free — remember epoll?)
absorbs all 8s of dribble, BUFFERS the whole request, then
hands it to the app server in one fast burst.
──► the expensive app worker is busy for MILLISECONDS, not 8s.
This is exactly why NGINX’s cheap-per-connection architecture (the whole epoll story) makes it a perfect front door: it can babysit thousands of slow clients for almost nothing, shielding costly backends. (Concept settled — NGINX request buffering, proxy_buffering; nginx.org.)
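A minimal sketch of that front-door setup in NGINX config terms. The upstream name app_backend and the address 127.0.0.1:3000 are placeholders, not taken from the text. Both buffering directives are documented as on by default, so they are spelled out here only to make the behaviour visible; check nginx.org for your version.

```nginx
# Fragment for the http { } context of nginx.conf.
# Assumption: the app server listens on 127.0.0.1:3000 (hypothetical address).
upstream app_backend {
    server 127.0.0.1:3000;
}

server {
    listen 80;
    server_name example.com;

    location / {
        proxy_pass http://app_backend;

        # Read the whole client request before contacting the backend,
        # so a slow uploader occupies NGINX, not an app worker.
        proxy_request_buffering on;

        # Buffer the backend's response as well, so the app worker is
        # released quickly even when the client reads slowly.
        proxy_buffering on;
    }
}
```

Turning proxy_request_buffering off is sometimes done for large streaming uploads, at the cost of exposing the backend to slow clients again.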
When to use / when not to
Use it for anything past a toy in production: more than one backend (or might be), TLS termination, mixed static + dynamic content, zero-downtime deploys, hiding app servers on a private network. Reconsider when a CDN already fronts a pure static site (the CDN is your edge proxy), a managed cloud LB already covers you, or it’s a trivial internal tool where the extra hop and ops aren’t worth it.
02
Load balancing — many kitchens, one host
Load balancing distributes requests across multiple backends so no single one is overwhelmed, and so the system survives any one dying. The reverse proxy is the position; load balancing is the policy.
The problem — the single-server wall
traffic grows ──► [ one app server ] ──► (1) CAPACITY CEILING: maxes out
                                             CPU/RAM, starts dropping requests
                  [ one app server ] ──► (2) SINGLE POINT OF FAILURE:
                          x                  it dies, the ENTIRE site is down.
So you buy a second server — now you have capacity and a spare. But a new problem appears instantly: a browser only knows one address, so something has to sit in front and divvy up the work. And a beautiful bonus: that something can keep a mental note of which kitchens are alive, and stop sending orders to one that caught fire — that’s health checking, which makes load balancing about survival, not just speed.
The alternatives
| Approach | How it splits traffic | The catch |
|---|---|---|
| Scale up (vertical) | Don’t split — buy a bigger box | Still one box = still a single point of failure; hardware ceiling |
| DNS round-robin | DNS returns a different IP each lookup | Clients cache DNS (lumpy); no health awareness — hands out dead servers’ IPs |
| Hardware LB | Dedicated appliance (F5, Citrix) | Expensive; the enterprise “before” answer |
| Client-side LB | The client picks a backend itself | Client must know all backends; common inside microservices |
| Reverse-proxy LB (NGINX) | Proxy in front picks per-request | The flexible, cheap, health-aware default for web |
Spreading requests collides with “this user must keep talking to the same backend.” If a user logs in and their session lives in server A’s memory, but their next request lands on server B, they’re suddenly logged out. Fixes exist — sticky sessions (pin a user to one backend) or a shared session store (Redis/DB so any backend can serve any user) — and which you pick is a real architecture decision.
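One way the sticky-session option can look in open-source NGINX, as a sketch. The backend addresses are placeholders. ip_hash pins by client IP address, which is only an approximation of "one user": many users behind a single NAT will land on the same backend. The shared-session-store option needs no special NGINX config at all, since any backend can then serve any request.

```nginx
# Fragment for the http { } context. Addresses are hypothetical.
upstream app_backend {
    # Requests from the same client address go to the same backend,
    # so an in-memory session on that backend keeps working.
    ip_hash;

    server 10.0.0.11:3000;
    server 10.0.0.12:3000;
}

server {
    listen 80;
    location / {
        proxy_pass http://app_backend;
    }
}
```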
Health checks — active vs passive
How does NGINX know a backend is dead? Two strategies. Passive: it notices when a forwarded request fails and backs that server off for a while — meaning some unlucky request had to fail first. Active: it proactively pings each backend on a schedule, catching a dead server before a real user hits it.
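The passive strategy can be sketched with per-server parameters in an upstream block. The addresses and thresholds below are illustrative assumptions, not recommendations.

```nginx
# Fragment for the http { } context. Addresses and numbers are hypothetical.
upstream app_backend {
    # 3 failed attempts within 30s mark the server unavailable for 30s.
    server 10.0.0.11:3000 max_fails=3 fail_timeout=30s;
    server 10.0.0.12:3000 max_fails=3 fail_timeout=30s;

    # Only receives traffic when the servers above are unavailable.
    server 10.0.0.13:3000 backup;
}

server {
    listen 80;
    location / {
        proxy_pass http://app_backend;

        # Which outcomes count as a failed attempt and trigger a retry
        # on the next server in the group.
        proxy_next_upstream error timeout http_502 http_503;
    }
}
```

With proxy_next_upstream, the request that discovers the dead backend can often be retried on a healthy one instead of failing outright. By default NGINX does not retry non-idempotent requests such as POST, so those can still be the unlucky ones.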
Open-source NGINX does passive health checks (the max_fails / fail_timeout parameters in an upstream block). Active health checks (the health_check directive) are a feature of the commercial NGINX Plus. Confidence: medium-high; verify against current nginx.org docs before relying on this, as the OSS/Plus feature line can shift.
03
L4 vs L7 — the two ways to balance
Two layers, two fundamentally different jobs. The single most useful thing to internalize: L4 balances connections; L7 balances requests.
L4 is the doorman who glances at your wristband — “blue band, section 3” — and waves you in without ever hearing your order. L7 is the maître d’ who listens to your full order — “vegan tasting menu, no nuts” — and routes you to the kitchen that can make it. The doorman is faster; the maître d’ is smarter.
| | L4 (transport layer) | L7 (application layer) |
|---|---|---|
| Reads | The “envelope”: src/dst IP + port, protocol (the 4-tuple). Never opens the payload. | The actual HTTP: method, path, headers, cookies, Host. |
| Decision unit | Per connection (pins a whole connection to one backend) | Per request (each request can go to a different backend) |
| NGINX block | stream { } | http { } |
| Speed vs smarts | Fast, low overhead, protocol-agnostic (any TCP/UDP) | Smarter (route by path/header), can do WAF/cache/retries; slightly more latency |
| Good for | Databases, game UDP, SMTP, DNS, raw TCP, max throughput | Web apps, microservices, canary/blue-green, multi-domain hosting |
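The "NGINX block" row can be sketched as two sibling contexts in one nginx.conf. All addresses and pool names are placeholders, the two PostgreSQL servers are assumed to be interchangeable read replicas, and the stream module must be present in your build.

```nginx
# Top level of nginx.conf: the two contexts sit side by side.

stream {
    # L4: forwards bytes per connection and never parses the payload.
    upstream postgres_pool {
        server 10.0.0.21:5432;
        server 10.0.0.22:5432;
    }

    server {
        listen 5432;
        proxy_pass postgres_pool;   # no scheme: raw TCP
        proxy_timeout 1h;           # idle timeout; the documented default is 10m
    }
}

http {
    # L7: parses each HTTP request and can route on path, headers, Host.
    upstream api_pool { server 10.0.0.31:8080; }
    upstream web_pool { server 10.0.0.41:8080; }

    server {
        listen 80;
        server_name example.com;

        location /api/ { proxy_pass http://api_pool; }
        location /     { proxy_pass http://web_pool; }
    }
}
```

The stream server picks a backend once per connection; the http server picks one per request, which is the whole difference the table describes.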
How L4 agrees with itself at scale (the non-obvious part)
At edge scale you run many L4 boxes, and new connections arrive too fast to share state between them. So instead of syncing, each load balancer reaches the same forwarding decision independently, by hashing the connection’s 4-tuple — every node computes the same backend for the same flow with zero coordination. Two more L4 staples worth knowing: Direct Server Return (DSR) — the backend replies straight to the client, bypassing the LB on the way out (great when traffic is outbound-heavy) — and kernel-bypass for raw speed (DPDK, XDP/eBPF). These show up in the war stories next. (All from the engineering posts cited in §4 and §References.)
04
Real-world war stories
This is where L4/L7 stops being a definition and becomes a set of scars. These are drawn from public engineering write-ups (full links in References) — paraphrased, not quoted.
The #1 “you need L7” lesson. HTTP/2 and gRPC multiplex many requests over one long-lived connection. An L4 LB pins a connection to one backend — so a gRPC client that opens one connection and fires 10,000 requests sends all 10,000 to a single backend while the others idle. L7 sees the individual streams and spreads them. If you’ve deployed gRPC behind a plain network (L4) LB and watched one pod redline while peers nap — this is why. (CloudRPS, Gravitee.)
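A sketch of the L7 fix using NGINX's gRPC proxying. The pool name, addresses, and port are assumptions; the http2 directive shown here exists from NGINX 1.25.1, and older versions put http2 on the listen line instead.

```nginx
# Fragment for the http { } context. Addresses are hypothetical.
upstream grpc_pool {
    server 10.0.0.51:50051;
    server 10.0.0.52:50051;
}

server {
    listen 50051;
    http2 on;    # NGINX 1.25.1+; older versions: "listen 50051 http2;"

    location / {
        # Each gRPC call arrives as its own HTTP/2 stream, and NGINX
        # chooses a backend per request rather than per connection.
        grpc_pass grpc://grpc_pool;
    }
}
```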
The flip side. Databases speak their own wire protocols (PostgreSQL, MySQL) — not HTTP — so an HTTP-parsing L7 proxy literally can’t understand them. Use L4 (NGINX stream) with TCP health checks, and watch idle timeouts since DB connections are long-lived. For protocol-aware routing, reach for specialized proxies (PgBouncer, ProxySQL). (CloudRPS.)
GitHub’s GLB splits into a director tier (L4) and a proxy tier (L7, HAProxy). Routers ECMP-shard traffic to directors by consistent hashing; directors forward to the proxy tier. The operational payoff: the L4 tier rarely changes, but the L7 tier changes hourly/daily, so the split contains config-change risk — and because the director keeps existing connections pinned to their proxy, you can drain an L7 proxy gracefully without breaking live connections. They rewrote the director on DPDK for line-rate packet processing, use Direct Server Return, and added “second-chance” rendezvous hashing (pick a primary and secondary backend) so a node change doesn’t break in-flight flows. (GitHub Engineering — GLB.)
Cloudflare runs both with a clean division of labor: the L7 stack applies CDN, WAF, bot management, and DDoS protection to HTTP(S); L4 (Unimog / Spectrum) handles arbitrary TCP/UDP at the transport layer. Unimog runs in XDP and lets each LB reach consistent forwarding decisions independently (no state sharing). A neat detail: for their WARP VPN (UDP/WireGuard), they extended the protocol with a session ID and added a custom flow dissector so the hash stays stable, after which server load became much more uniform. (Cloudflare: Unimog, Spectrum, Maglev.)
A post-mortem that captures L4’s subtlest failure: the dashboard showed even distribution — and technically it was, by connection count. But one backend caught three 40-minute video exports while another caught thirty 50ms health checks. Same connection count, wildly different load. Symptom: video exports timing out, trials lost, ~120 hours burned debugging “random failures.” (“The Speed Engineer”, Medium.)
L4 balances connections; it has no idea how much work or how many requests ride inside each one. In the old world (one short HTTP/1.1 request per connection) that distinction was academic. In the modern world — long-lived HTTP/2, gRPC, WebSockets, big asymmetric jobs — “connection-balanced” and “load-balanced” have diverged. Trusting L4 to balance load is how you get one machine on fire while the rest nap.
The verdict
The blogs converge: for most web apps, use an L7 load balancer for HTTP and an L4 load balancer for everything else. They coexist happily, solving different problems at different layers.
05
What TLS actually is
TLS (Transport Layer Security, the modern name for what was called SSL) does three jobs at once. People tend to think of it as just “the privacy lock,” but privacy is only the first of the three.
| Job | Plain meaning | Without it… |
|---|---|---|
| Encryption | Nobody on the path can read your data | A coffee-shop snoop reads your password |
| Authentication | You’re really talking to who you think | You hand your password to an impostor |
| Integrity | Nobody altered the data in flight | An attacker silently changes “$10” to “$10,000” |
TLS wraps your message in a sealed envelope (encryption), stamped with a verified ID proving who the recipient is (authentication), using a tamper-evident seal that visibly breaks if touched (integrity). All three, or it isn’t TLS.
06
The handshake — how the lock gets set up
Before any real data flows, client and server negotiate. This one sequence makes everything afterward make sense.
Two things to lock in. The certificate is an ID card binding a hostname to a public key, signed by a Certificate Authority (CA) the browser trusts; the server proves it owns the matching private key. And after the handshake the two sides share a secret session key only they have — which is the crux of §8.
07
The three TLS models — where does the envelope get opened?
A load balancer can handle TLS three ways. The whole question is: who opens the sealed envelope, and does it get re-sealed afterward?
In NGINX config terms:
| Model | NGINX | Cert on NGINX? | Backend hop |
|---|---|---|---|
| Passthrough | stream { } + ssl_preread on; | None | Encrypted (end-to-end) |
| Termination | http { listen 443 ssl; } → proxy_pass http:// | Yes | Plaintext |
| Bridging | http { listen 443 ssl; } → proxy_pass https:// | Yes | Re-encrypted (new session) |
The passthrough (L4) model can route without decrypting because the hostname (SNI) rides in the ClientHello in cleartext, before encryption begins; ssl_preread peeks at it. Of the three configurations above, it is also the only one that works for non-HTTP TLS (e.g. RDP), where there’s no HTTP to read anyway.
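A sketch of SNI-based passthrough routing. The hostnames, upstream names, and addresses are placeholders, and the variable name $tls_backend is invented for this example. No certificate or key appears anywhere in it, because NGINX never decrypts.

```nginx
# Top-level stream { } context. Names and addresses are hypothetical.
stream {
    # Choose an upstream from the SNI hostname read out of the ClientHello.
    map $ssl_preread_server_name $tls_backend {
        a.example.com   pool_a;
        b.example.com   pool_b;
        default         pool_a;
    }

    upstream pool_a { server 10.0.0.61:443; }
    upstream pool_b { server 10.0.0.62:443; }

    server {
        listen 443;
        ssl_preread on;            # peek at the ClientHello, do not decrypt
        proxy_pass $tls_backend;   # forward the still-encrypted bytes
    }
}
```

The backend behind each pool holds the certificate and private key and completes the handshake with the client itself.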
Concept settled: the cleartext SNI in the ClientHello is what ssl_preread exploits. Directive names (ssl_preread, proxy_pass https://, the proxy_ssl_* family) confirmed against current write-ups, but verify exact spelling/defaults on nginx.org before shipping.
08
Why “recreate” TLS — you can’t forward a session
This is the deepest point, and it answers “why a whole new connection?”
A TLS session is point-to-point between exactly two endpoints. You cannot forward, extend, or pass it on. The shared secret session key belongs to the client and NGINX only — the backend was never in that handshake and has none of those keys. So once NGINX terminates (decrypts), the client’s session is over; there’s nothing left to forward, just plaintext in NGINX’s memory. To encrypt the next hop, NGINX must start a fresh handshake as a brand-new client to the backend — new keys, new session. That’s why it’s “recreate,” not “forward.”
client ══🔒 session A ═══► NGINX ══🔒 session B ═══► backend
   (keys: client+NGINX)    opens A,    (keys: NGINX+backend,
                           reads,       a totally separate session)
                           re-seals as B
The maître d’ opens the guest’s sealed letter and reads the order. He can’t just hand the kitchen that same envelope — it was sealed to him, and he already broke the seal to read it. So he writes the order on a fresh sheet and seals a new envelope for the runner. Decrypt-then-re-encrypt = TLS bridging.
09
Which certificate is used where
There are two completely separate certificate relationships, and people conflate them constantly.
┌─ relationship 1: client ↔ NGINX ─┐   ┌─ relationship 2: NGINX ↔ backend ─┐
  client ═══🔒══► NGINX                   NGINX ═══🔒══► backend
  client validates NGINX's cert           NGINX validates BACKEND's cert
Relationship 1 (public-facing): NGINX presents the public site certificate for example.com, issued by a publicly trusted CA (Let’s Encrypt, DigiCert…). It must be publicly trusted — real browsers validate it.
Relationship 2 (NGINX → backend): now NGINX is the client and the backend presents its own certificate. The freeing part: this cert does not need a public CA — it can be self-signed or from your internal CA, because only NGINX validates it (no browser ever sees it). It often carries an internal name like backend.internal. NGINX checks it with proxy_ssl_verify on; + proxy_ssl_trusted_certificate.
You can also make NGINX present a client certificate to the backend (proxy_ssl_certificate / _key), so the backend verifies “is this really my load balancer, or an impostor?” Now both sides prove identity — the backbone of zero-trust internal networks.
Direct answer: on the backend hop, the backend uses its own (often internal/self-signed) server cert; NGINX optionally uses a separate client cert for mTLS. Neither is the public cert the browser saw.
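Both relationships, plus the optional mTLS client certificate, can be sketched in one bridging config. The example.com paths follow the ones used later in this part; the internal CA file, the client cert/key paths, the upstream name, and port 8443 are assumptions for illustration.

```nginx
# Fragment for the http { } context. Internal paths and names are hypothetical.
upstream secure_backend {
    server backend.internal:8443;
}

server {
    listen 443 ssl;
    server_name example.com;

    # Relationship 1: the public certificate that browsers validate.
    ssl_certificate     /etc/nginx/example.com.crt;
    ssl_certificate_key /etc/nginx/example.com.key;

    location / {
        # https:// makes NGINX open a new TLS session to the backend.
        proxy_pass https://secure_backend;

        # Relationship 2: NGINX validates the backend's certificate
        # against the internal CA instead of a public one.
        proxy_ssl_verify              on;
        proxy_ssl_trusted_certificate /etc/nginx/internal-ca.crt;
        proxy_ssl_name                backend.internal;  # name expected in the backend cert
        proxy_ssl_server_name         on;                # also send that name as SNI

        # Optional mTLS: NGINX presents its own client certificate.
        proxy_ssl_certificate         /etc/nginx/lb-client.crt;
        proxy_ssl_certificate_key     /etc/nginx/lb-client.key;
    }
}
```

proxy_ssl_name is set explicitly because the name checked would otherwise come from the proxy_pass URL, which here is the upstream group name rather than the backend's real hostname.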
10
When something’s wrong while decrypting
The golden rule: TLS fails closed. There is no “looks a bit off, I’ll pass it anyway.” Three failure points:
| Where | What happens |
|---|---|
| (a) Handshake fails: wrong/expired cert, untrusted CA, hostname mismatch, no shared cipher | TLS aborts with an alert (e.g. certificate_expired, unknown_ca, handshake_failure); no application data ever flows. Browser shows “Your connection is not private.” |
| (b) Integrity check fails mid-stream: a byte was altered/corrupted | TLS uses authenticated encryption (AEAD): every record carries an auth tag. A bad tag → bad_record_mac alert → connection torn down immediately. It will never hand over tampered plaintext. |
| (c) Backend hop fails: proxy_ssl_verify on and backend cert is bad | NGINX refuses to talk to it and returns 502 Bad Gateway. The client sees an error, never broken plaintext. |
The principle in one line: anything wrong → abort, never degrade. No partial trust, ever.
11
How NGINX gets the private key (and a myth to kill)
First, a precise correction people need: the server proves it owns its cert by using the private key; the client validates using the CA’s public key and no secret of its own. Different jobs, different sides.
So where does NGINX get the private key? You put it there yourself, ahead of time. It’s a local file. It was never generated by, sent over, or derived from the network.
1. You (or certbot) GENERATE a key pair ON the server:
private key → /etc/nginx/example.com.key (SECRET, stays put)
public key → baked into a CSR (cert signing request)
2. You send the CSR (NOT the private key) to a CA. It signs and returns:
certificate → /etc/nginx/example.com.crt (public, shareable)
3. Point NGINX at both:
ssl_certificate /etc/nginx/example.com.crt; # public, OK to share
ssl_certificate_key /etc/nginx/example.com.key; # SECRET, never leaves
The private key is generated locally and never travels anywhere — not to the CA, not to the client, not across the wire. Only the certificate (containing the public key) is shared. A private key that crosses a network is considered compromised.
The private key is a unique engraved stamp die locked in NGINX’s drawer. The certificate is a public notice: “genuine example.com documents bear this stamp; the notary (CA) vouches.” You forge the die once, locally; it never leaves the drawer. You only ever circulate the public notice.
12
Can anyone spoof? (and the public-cert myth)
The most common misconception: “I’ll just copy the public certificate and impersonate the site.” It fails, every time.
YOUR FAKE example.com server VICTIM'S BROWSER
"Hi, I'm example.com!" ◄──── "Prove it. Do the operation
[sends the COPIED public cert] only example.com's PRIVATE
KEY can perform."
[you have the cert...
but NOT the private key. ────► proof missing/invalid?
You CANNOT produce the proof.] x → CONNECTION REFUSED
"Your connection is not private"
“Public” means meant to be shared, not a secret that leaked. The certificate is a public claim, designed to be copied to everyone. Its job isn’t to be secret — it’s to let anyone verify a proof that only the private-key holder can generate. Possessing the claim doesn’t let you make the proof. Showing a photo of a lock doesn’t open the door.
The honest caveat: if you steal both the cert and the matching private key, you can impersonate the server — which is exactly why the private key is the crown jewel. But that requires breaching the server and exfiltrating a secret file that by design never travels — a break-in, not a copy-paste. Backstops still apply: Certificate Transparency (public logs of issued certs), revocation, and forward secrecy (a stolen key can’t decrypt yesterday’s captured traffic).
The threat model, briefly
- Spoof the server to a client? No (outsider) — needs a CA-signed cert and the private key.
- Eavesdrop on the path / passthrough? Sees metadata (the cleartext SNI = which site, packet sizes/timing) but can’t read contents (encryption) or alter them undetected (integrity).
- The L7 box is an authorized man-in-the-middle. When NGINX terminates TLS it legitimately reads plaintext and holds the real key — so whoever compromises that box sees all plaintext. It’s a concentration of trust; secure it like a vault. (This is also literally how corporate “TLS inspection” works.)
- Spoof the backend internally? Plain termination (plaintext hop) lets an internal attacker read and impersonate. Bridging with proxy_ssl_verify on (+ mTLS) blocks it. The classic mistake, bridging with verify off, gives encryption without authentication: a false sense of security.
Check the proxy_ssl_verify default against nginx.org for your version: it is documented as off, so verification has to be enabled explicitly, which is precisely why “encryption without verification” is such a common trap.
13
The SaaS-user case — who provides the certificate?
A natural but backwards leap: “I’m a community manager using mypanel.saas.com; to make it safe, do I generate a cert and share the public half with the SaaS provider?” Let’s untangle it.
Everything hangs on this. mypanel.saas.com is the SaaS company’s domain, their servers, their NGINX. They are the server; you are the client. The certificate for that hostname is theirs to generate and hold — that padlock you see on the portal is already their cert, already working. You generating one would be like printing a “Genuine SaaS Co.” stamp in your garage; no trusted CA would sign it for a domain you don’t control. TLS certs secure a server’s identity to its visitors — not a visitor’s account on someone else’s server.
So what actually keeps your access safe?
A different layer entirely — authentication, not TLS: a strong unique password (+ password manager), MFA/2FA, verifying the exact real domain (anti-phishing), and SSO if offered. The TLS cert (theirs) secures the pipe; your password/MFA secures your account. Two different jobs.
When would you generate a pair and share the public half?
Your instinct is right in general — it’s just a different situation, one where you prove identity to them. The giveaway is always: share the public half, never move the private half. Examples: SSH (public key on the server, private key on your laptop — probably what you were half-remembering); API request signing (upload a public key, sign with your private one); mTLS (register a client cert so a high-security portal verifies you cryptographically instead of by password). You’d do this only if the portal explicitly asks — and even then your private key stays on your machine.
14
Glossary (Part 2 terms)
| Term | One line |
|---|---|
| Reverse proxy | Server in front of app servers; forwards requests, hides backends |
| Forward proxy | Proxy in front of clients; hides them (corporate filter) |
| L4 / L7 | Transport-layer (connections, 4-tuple) vs application-layer (requests, HTTP) |
| 4-tuple | src IP, src port, dst IP, dst port — what L4 hashes on |
| Consistent hashing | Lets many LBs agree on a backend independently, with minimal disruption on change |
| DSR | Direct Server Return — backend replies straight to client, bypassing the LB outbound |
| Sticky session | Pin a user to one backend so their in-memory session survives |
| TLS / SSL | Encryption + authentication + integrity for a connection |
| SNI | Hostname in the ClientHello, sent in cleartext — lets L4 route without decrypting |
| Passthrough / Termination / Bridging | Forward sealed / decrypt to plaintext backend / decrypt then re-encrypt |
| Certificate | Public ID card binding a hostname to a public key, signed by a CA |
| Private key | The secret that proves cert ownership; never leaves the server |
| CA | Certificate Authority — the trusted notary that signs certs after validating domain control |
| mTLS | Both sides present certificates; mutual authentication |
| AEAD | Authenticated encryption — the tamper-evident seal; bad tag → connection killed |
15
Recall questions & diffuse-mode seeds
1. A bad-3G client takes 8s to send a request. Why is the app worker busy for milliseconds with NGINX in front, but 8 full seconds without it?
2. You deploy gRPC behind an L4 network LB; one pod is at 95% CPU while two idle, yet “connections are balanced.” What’s happening and what’s the fix?
3. GitHub runs an L4 director tier in front of its L7 proxy tier. Give the one operational reason. (Which tier changes hourly, and what does the L4 tier let them do safely?)
4. Why can’t NGINX “forward” the client’s TLS session to the backend instead of doing a fresh handshake? (One sentence — about who holds the keys.)
5. The browser shows a green padlock for example.com; the backend presents a self-signed cert for backend.internal. Why is that fine — and who’d have screamed if the client-facing cert were self-signed?
6. You copy example.com’s public certificate onto your server. Why does impersonation still fail the instant a browser connects — what are you missing?
7. For mypanel.saas.com: who generates the TLS cert, what does it make safe, and what separate mechanism actually stops someone logging in as you?
• Consistent hashing: plain hash(client) % N seems fine — until N goes 4→5 and nearly every client remaps. Why does naive modulo fall apart on one node change, and what does consistent hashing do differently to keep most clients pinned? (Gateway to how LBs, caches, and sharded DBs survive scaling.)
• Health checks: active (ping on a schedule) vs passive (notice a failed request). What can active catch that passive can’t — and who’s the unlucky user whose request had to fail first under passive?
• ECH: SNI is cleartext, so anyone on the path sees which site you visit. Encrypted Client Hello (ECH) hides it — but then what happens to every L4 SNI-based passthrough router, and who’s still positioned to route the traffic? (Verify ECH’s current real-world deployment state — it was still rolling out as of early 2026.)
• The CA trust problem: your browser trusts hundreds of CAs, any one of which could issue a cert for your domain. So your bank’s security depends on the weakest CA. How would you ever detect a fraudulently issued cert for your domain? (Gateway to Certificate Transparency.)
• Protecting the key: the private key is a plain file NGINX must read but that must never leak. How do serious shops square that? (File permissions → secret managers → HSMs that sign without revealing the key → short-lived certs & rotation to shrink the blast radius of a leak.)
§
References
Tier-1 sources: the relevant RFCs, official NGINX docs, and primary engineering blogs. Claims in §4 are paraphrased from these.
| Topic | Source |
|---|---|
| TLS 1.3 (handshake, alerts, AEAD, authentication) | RFC 8446 — rfc-editor.org/rfc/rfc8446; TLS 1.2: RFC 5246 |
| ACME / domain-control validation (how CAs verify ownership) | RFC 8555 — rfc-editor.org/rfc/rfc8555 |
| Certificate Transparency | RFC 6962 — rfc-editor.org/rfc/rfc6962 |
| SSH public-key auth (the share-public/keep-private pattern) | RFC 4252 |
| NGINX reverse proxy, proxy_pass, proxy_ssl_*, buffering | nginx.org · ngx_http_proxy_module |
| NGINX TLS, ssl_certificate / _key | nginx.org · ngx_http_ssl_module |
| NGINX L4 / SNI passthrough (stream, ssl_preread) | nginx.org · ngx_stream_ssl_preread_module |
| NGINX load balancing & upstream (health checks, algorithms) | nginx.org · ngx_http_upstream_module |
| GitHub GLB — split L4/L7, DSR, DPDK, rendezvous hashing | github.blog · GLB |
| Cloudflare Unimog (L4LB), Spectrum (L4), Maglev (consistent hashing) | blog.cloudflare.com · Unimog & tag: load balancing |
| gRPC/HTTP-2 multiplexing trap, WebSockets, DB → L4 | CloudRPS & Gravitee L4-vs-L7 write-ups |
| “Connections balanced, load wasn’t” post-mortem | “The Speed Engineer” (Medium) |
| NGINX TLS passthrough / termination / bridging configs | OneUptime, Smartango, dev.to (TLS termination models) |
Verify before relying on them: the proxy_ssl_verify default (check it for your version), the OSS-vs-NGINX-Plus line for active health checks and some algorithms, current ECH deployment state, and the precise TLS 1.2-vs-1.3 handshake message sequence. No throughput/latency numbers were asserted, as those are setup-specific.