API Gateway: The Front Door to Your Microservices
An API Gateway is a server that acts as the single entry point for all client requests into a microservices architecture. Instead of clients communicating directly with dozens of backend services, every request flows through the gateway, which handles routing, security, rate limiting, and request transformation before forwarding traffic to the appropriate service.
Think of it as a smart reverse proxy on steroids. While a traditional reverse proxy simply forwards requests, an API gateway adds cross-cutting concerns like authentication, logging, circuit breaking, and protocol translation, all in one centralized layer. Companies like Netflix, Amazon, and Uber rely heavily on API gateways to manage billions of requests per day across hundreds of microservices.
In this guide, we will cover everything you need to know about API gateways, from core responsibilities and popular tools to the BFF pattern, security best practices, and real-world architecture examples. For a broader understanding of system design fundamentals, check out swehelper.com system design topics.
Core Responsibilities of an API Gateway
An API gateway shoulders a wide range of responsibilities that would otherwise be duplicated across every microservice. Here are the key functions:
1. Request Routing
The gateway inspects the incoming request path, headers, or query parameters and routes it to the correct backend service. For instance, /api/users/* might route to the User Service while /api/orders/* goes to the Order Service. This decouples clients from the internal service topology: services can be split, merged, or moved without any client-side changes.
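The prefix-based routing described above can be sketched in a few lines. This is a minimal illustration, not how any particular gateway implements it; the route table and the resolve_upstream helper are hypothetical names, and the upstream URLs are reused from the Kong example later in this guide.

```python
from typing import Optional

# Hypothetical route table: path prefix -> upstream base URL.
ROUTES = {
    "/api/users": "http://user-svc.internal:8080",
    "/api/orders": "http://order-svc.internal:8080",
}


def resolve_upstream(path: str) -> Optional[str]:
    """Return the upstream for the longest matching path prefix, or None."""
    best = None
    for prefix in ROUTES:
        # Match the prefix itself or anything below it ("/api/users/42"),
        # but not a sibling such as "/api/users-export".
        if path == prefix or path.startswith(prefix + "/"):
            if best is None or len(prefix) > len(best):
                best = prefix
    return ROUTES[best] if best is not None else None
```

Because clients only ever see the public prefix, moving the User Service to a new host means changing one entry in the table.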
2. Authentication and Authorization
Rather than each service implementing its own auth logic, the gateway validates JWTs, API keys, or OAuth2 tokens at the edge. It can attach user identity information to downstream requests via headers like X-User-Id or X-User-Roles, so backend services can rely on the gateway's verification, provided they accept traffic only from the gateway. Learn more about securing distributed systems on the authentication patterns page.
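As a rough sketch of edge authentication, the code below verifies an HS256-signed JWT with only the standard library and then builds the identity headers for the downstream request. It assumes a shared secret and the sub, roles, and exp claims; the function names are hypothetical, and a production gateway would normally use a vetted JWT library and asymmetric keys instead.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


def _b64url_decode(segment: str) -> bytes:
    # JWT segments are base64url without padding; restore it before decoding.
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def sign_hs256(claims: dict, secret: bytes) -> str:
    """Build a token for the demo; a real identity provider would issue it."""
    header = _b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url_encode(json.dumps(claims).encode())
    sig = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url_encode(sig)}"


def verify_hs256(token: str, secret: bytes, now: Optional[float] = None) -> dict:
    """Return the claims if the signature and expiry check out, else raise ValueError."""
    header_b64, payload_b64, sig_b64 = token.split(".")
    if json.loads(_b64url_decode(header_b64)).get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    expected = hmac.new(
        secret, f"{header_b64}.{payload_b64}".encode(), hashlib.sha256
    ).digest()
    if not hmac.compare_digest(expected, _b64url_decode(sig_b64)):
        raise ValueError("bad signature")
    claims = json.loads(_b64url_decode(payload_b64))
    # Tokens without an exp claim are treated as expired.
    if claims.get("exp", 0) <= (time.time() if now is None else now):
        raise ValueError("token expired")
    return claims


def identity_headers(client_headers: dict, claims: dict) -> dict:
    """Replace any client-supplied identity headers with verified values."""
    spoofable = ("x-user-id", "x-user-roles")
    headers = {k: v for k, v in client_headers.items() if k.lower() not in spoofable}
    headers["X-User-Id"] = str(claims["sub"])
    headers["X-User-Roles"] = ",".join(claims.get("roles", []))
    return headers
```

Stripping the incoming X-User-Id and X-User-Roles headers before setting them matters: otherwise a client could send its own values and impersonate another user.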
3. Rate Limiting and Throttling
Rate limiting protects backend services from being overwhelmed. The gateway enforces limits like "100 requests per minute per API key" using algorithms such as token bucket, sliding window, or fixed window. This is essential for both preventing abuse and ensuring fair resource allocation among tenants.
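Of the algorithms named above, the token bucket is probably the easiest to sketch. The class below is an illustrative, single-process version with an injected clock (the now argument) so it can be tested deterministically; the class and parameter names are assumptions, not any gateway's API.

```python
class TokenBucket:
    """Holds up to `capacity` tokens, refilled at `refill_rate` tokens per second."""

    def __init__(self, capacity: float, refill_rate: float, now: float = 0.0):
        self.capacity = capacity
        self.refill_rate = refill_rate
        self.tokens = capacity  # start full so an initial burst is allowed
        self.updated_at = now

    def allow(self, now: float, cost: float = 1.0) -> bool:
        # Credit tokens for the time elapsed since the last call, capped at capacity.
        elapsed = max(0.0, now - self.updated_at)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.updated_at = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

The capacity controls how large a burst is tolerated, while the refill rate sets the sustained request rate, which is why the two are usually tuned separately.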
4. Request/Response Transformation
The gateway can modify requests before forwarding them (adding headers, rewriting paths, converting protocols) and transform responses before returning them to the client (stripping internal fields, aggregating results from multiple services, converting XML to JSON).
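A response transformation such as stripping internal fields might look like the sketch below. The strip_internal_fields helper and the internal_metadata field name are illustrative assumptions; this version recurses into nested objects, which a given gateway plugin may or may not do.

```python
import json


def strip_internal_fields(body: str, blocked=("internal_metadata",)) -> str:
    """Remove blocked keys from a JSON document, including nested objects and lists."""

    def scrub(value):
        if isinstance(value, dict):
            return {k: scrub(v) for k, v in value.items() if k not in blocked}
        if isinstance(value, list):
            return [scrub(item) for item in value]
        return value

    return json.dumps(scrub(json.loads(body)))
```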
5. Load Balancing
API gateways distribute incoming traffic across multiple instances of a backend service using strategies like round-robin, least connections, or weighted routing. This complements service-level load balancers and is especially useful for canary deployments and A/B testing. For deeper coverage, visit swehelper.com load balancing guide.
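Weighted routing for a canary release can be sketched with a smooth weighted round-robin, which spreads the minority picks evenly instead of sending them in a clump. The class and the "stable"/"canary" names below are hypothetical, chosen only to illustrate a 90/10 split.

```python
class WeightedRoundRobin:
    """Smooth weighted round-robin over named upstreams."""

    def __init__(self, weights: dict):
        self.weights = dict(weights)
        self.current = {name: 0 for name in weights}
        self.total = sum(weights.values())

    def pick(self) -> str:
        # Every upstream gains its weight; the leader is chosen and pays back the total.
        for name, weight in self.weights.items():
            self.current[name] += weight
        chosen = max(self.current, key=self.current.get)
        self.current[chosen] -= self.total
        return chosen


balancer = WeightedRoundRobin({"stable": 9, "canary": 1})
picks = [balancer.pick() for _ in range(10)]
```

Over each cycle of ten picks, nine go to the stable version and one to the canary, so raising the canary weight gradually shifts traffic without touching clients.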
6. Circuit Breaking and Resilience
When a downstream service is failing, the gateway can open a circuit breaker to fail fast rather than letting requests pile up. This prevents cascading failures across the entire system, a critical concern in distributed architectures.
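The fail-fast behaviour can be sketched as a small state machine. This is a simplified illustration with assumed names and thresholds: it counts consecutive failures, and once the reset timeout has passed it lets requests through again as probes, whereas real implementations usually limit the half-open state to a single trial request.

```python
class CircuitBreaker:
    """Opens after consecutive failures; allows probes once the timeout elapses."""

    def __init__(self, failure_threshold: int = 3, reset_timeout: float = 30.0):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def allow_request(self, now: float) -> bool:
        if self.opened_at is None:
            return True
        # Open: fail fast until the timeout elapses, then let probes through.
        return now - self.opened_at >= self.reset_timeout

    def record_success(self) -> None:
        self.failures = 0
        self.opened_at = None  # close the circuit again

    def record_failure(self, now: float) -> None:
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = now  # open, or re-open after a failed probe
```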
7. Caching
Gateways can cache responses for idempotent GET requests, reducing load on backend services and improving latency. Cache invalidation strategies (TTL-based, event-driven) are configured per route.
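A TTL-based response cache for GET requests could be sketched as follows. The TTLCache class is an illustrative in-memory stand-in (a real gateway would typically honor Cache-Control headers and use a shared store); the invalidate method shows where an event-driven strategy would hook in.

```python
class TTLCache:
    """Caches GET responses per path until their time-to-live expires."""

    def __init__(self, ttl_seconds: float):
        self.ttl = ttl_seconds
        self.entries = {}  # path -> (expires_at, response)

    def get(self, method: str, path: str, now: float):
        if method != "GET":
            return None  # only GET responses are ever cached
        entry = self.entries.get(path)
        if entry is None or entry[0] <= now:
            self.entries.pop(path, None)  # drop the expired entry, if any
            return None
        return entry[1]

    def put(self, method: str, path: str, response, now: float) -> None:
        if method == "GET":
            self.entries[path] = (now + self.ttl, response)

    def invalidate(self, path: str) -> None:
        """Event-driven invalidation: call when the underlying resource changes."""
        self.entries.pop(path, None)
```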
Popular API Gateways: A Comparison
Choosing the right API gateway depends on your infrastructure, team expertise, and scale requirements. Here is a comparison of the most widely used gateways:
| Feature | Kong | AWS API Gateway | Azure APIM | Nginx |
|---|---|---|---|---|
| Type | Open-source / Enterprise | Managed (AWS) | Managed (Azure) | Open-source / Commercial |
| Deployment | Self-hosted / Cloud | Fully managed | Fully managed | Self-hosted |
| Plugin Ecosystem | Rich (100+ plugins) | Lambda authorizers | Policy expressions | Lua / NJS modules |
| Protocol Support | REST, gRPC, GraphQL, WebSocket | REST, WebSocket, HTTP | REST, SOAP, GraphQL, WebSocket | REST, gRPC, WebSocket |
| Rate Limiting | Built-in plugin | Usage plans + API keys | Built-in policies | ngx_http_limit_req |
| Best For | Multi-cloud, Kubernetes | AWS-native workloads | Azure-native, enterprise | High-perf reverse proxy |
| Pricing | Free (OSS) / Enterprise $$ | Pay per request | Tiered plans | Free (OSS) / Plus $$ |
Use swehelper.com comparison tools to evaluate these gateways against your specific requirements.
Code Examples: Configuration in Practice
Kong Gateway: Declarative Configuration
Kong uses a declarative YAML format (or its Admin API) to define services, routes, and plugins. Below is a configuration that sets up a service with JWT authentication and rate limiting:
    _format_version: "3.0"

    services:
      - name: user-service
        url: http://user-svc.internal:8080
        routes:
          - name: user-routes
            paths:
              - /api/users
            methods:
              - GET
              - POST
              - PUT
            strip_path: true
        plugins:
          - name: jwt
            config:
              claims_to_verify:
                - exp
              header_names:
                - Authorization
          - name: rate-limiting
            config:
              minute: 100
              hour: 5000
              policy: redis
              redis_host: redis.internal
              redis_port: 6379
          - name: correlation-id
            config:
              header_name: X-Request-ID
              generator: uuid

      - name: order-service
        url: http://order-svc.internal:8080
        routes:
          - name: order-routes
            paths:
              - /api/orders
            methods:
              - GET
              - POST
        plugins:
          - name: key-auth
            config:
              key_names:
                - X-API-Key
          - name: response-transformer
            config:
              remove:
                headers:
                  - X-Internal-Trace-Id
                json:
                  - internal_metadata
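AWS API Gateway: CloudFormation / SAM Template
AWS API Gateway is configured through the AWS Console, CLI, or Infrastructure as Code. Here is a SAM template that defines an HTTP API with CORS, a Lambda-backed route, and a usage plan. Note that the template does not include a Lambda authorizer, and that usage plans apply to REST APIs rather than HTTP APIs, so the UsagePlan resource only takes effect once it is attached to a REST API stage through ApiStages: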
    AWSTemplateFormatVersion: '2010-09-09'
    Transform: AWS::Serverless-2016-10-31

    Resources:
      ApiGateway:
        Type: AWS::Serverless::HttpApi
        Properties:
          StageName: prod
          CorsConfiguration:
            AllowOrigins:
              - "https://app.example.com"
            AllowMethods:
              - GET
              - POST
              - PUT
              - DELETE
            AllowHeaders:
              - Authorization
              - Content-Type

      GetUsersFunction:
        Type: AWS::Serverless::Function
        Properties:
          Handler: handlers/users.getAll
          Runtime: nodejs18.x
          MemorySize: 256
          Timeout: 30
          Events:
            GetUsers:
              Type: HttpApi
              Properties:
                ApiId: !Ref ApiGateway
                Path: /api/users
                Method: GET

      UsagePlan:
        Type: AWS::ApiGateway::UsagePlan
        Properties:
          UsagePlanName: StandardPlan
          Throttle:
            BurstLimit: 200
            RateLimit: 100
          Quota:
            Limit: 10000
            Period: DAY
Rate Limiting with a Sliding Window Log
Understanding how rate limiting works under the hood is essential. Here is a sliding window log implementation, one of the approaches commonly used inside gateways. It keeps one timestamp per accepted request, which makes it exact but more memory-hungry than a sliding window counter:

    import time
    from collections import deque


    class SlidingWindowRateLimiter:
        """Sliding window log: keeps one timestamp per accepted request."""

        def __init__(self, max_requests, window_seconds):
            self.max_requests = max_requests
            self.window_seconds = window_seconds
            self.request_log = {}  # client_id -> deque of timestamps

        def is_allowed(self, client_id):
            now = time.monotonic()  # monotonic clock: unaffected by wall-clock jumps
            window_start = now - self.window_seconds
            log = self.request_log.setdefault(client_id, deque())
            # Remove timestamps outside the current window
            while log and log[0] < window_start:
                log.popleft()
            if len(log) < self.max_requests:
                log.append(now)
                return True   # Request allowed
            return False      # Rate limit exceeded


    # Usage: 100 requests per 60-second window
    limiter = SlidingWindowRateLimiter(max_requests=100, window_seconds=60)


    def handle(client_id):
        """Return the status code the gateway would send for this request."""
        if limiter.is_allowed(client_id):
            return 200  # forward the request to the backend here
        return 429      # Too Many Requests
The BFF (Backend for Frontend) Pattern
The Backend for Frontend pattern is an evolution of the API gateway concept where you create separate gateway layers tailored to each client type: one for web, one for mobile, one for third-party integrations, and so on.
Why BFF?
Different clients have different needs. A mobile app on a slow 3G connection needs minimal, compressed payloads. A web dashboard needs rich, nested data. A third-party API consumer needs stable, versioned endpoints. A single monolithic gateway trying to serve all these needs becomes bloated and difficult to maintain.
BFF Architecture
    +-----------+     +---------------+     +----------------+
    |  Web App  |     |    Mobile     |     |  Partner API   |
    |  (React)  |     | (iOS/Android) |     |  (3rd Party)   |
    +-----+-----+     +-------+-------+     +-------+--------+
          |                   |                     |
          v                   v                     v
    +-----------+     +---------------+     +----------------+
    |  Web BFF  |     |  Mobile BFF   |     |  Partner BFF   |
    |  GraphQL  |     |   REST/slim   |     | REST/versioned |
    +-----+-----+     +-------+-------+     +-------+--------+
          |                   |                     |
          +---------+---------+----------+----------+
                    |                    |
                    v                    v
             +-------------+      +-------------+
             |  User Svc   |      |  Order Svc  |
             +-------------+      +-------------+
Each BFF is owned by the frontend team it serves. The Web BFF might use GraphQL to let the frontend query exactly what it needs, while the Mobile BFF returns pre-shaped, minimal JSON payloads. The Partner BFF provides stable versioned REST endpoints with strict rate limits.
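To make the difference between BFFs concrete, here is a small sketch in which two BFFs shape the same upstream data differently. The payload fields and the web_view and mobile_view functions are hypothetical; the point is only that each BFF owns its own response shape.

```python
def web_view(user: dict, orders: list) -> dict:
    """Web BFF: return the rich, nested data a dashboard can render directly."""
    return {"user": user, "orders": orders}


def mobile_view(user: dict, orders: list) -> dict:
    """Mobile BFF: return a pre-shaped, minimal payload for slow connections."""
    return {
        "name": user["name"],
        "order_count": len(orders),
        "last_order_id": orders[-1]["id"] if orders else None,
    }


# Hypothetical data as it might come back from the User and Order services.
user = {"id": 7, "name": "Ada", "email": "ada@example.com"}
orders = [{"id": 101, "total": 25.0}, {"id": 102, "total": 40.0}]
```

With this data, mobile_view(user, orders) returns only the name, an order count of 2, and the last order ID 102, while web_view passes everything through.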
When to Use BFF
- Multiple client types with significantly different data needs
- Teams organized by client (web team, mobile team) that want autonomy
- Performance optimization: mobile BFF can aggressively cache and compress
- Avoid if you only have one client type or your gateway logic is simple
For more architectural patterns, explore microservices patterns on swehelper.com.
Security Features and Best Practices
The API gateway is your first line of defense. Here are critical security capabilities and practices:
- TLS Termination: Terminate HTTPS at the gateway and use mTLS for internal service-to-service communication. Never pass unencrypted traffic beyond the gateway.
- Input Validation: Validate request schemas (OpenAPI spec enforcement) at the gateway to reject malformed requests before they reach backend services.
- IP Whitelisting / Blacklisting: Block known malicious IPs or restrict access to specific CIDR ranges for internal APIs.
- CORS Enforcement: Centralize CORS policy at the gateway so individual services do not need to manage it.
- Bot Detection: Integrate with WAF (Web Application Firewall) services to detect and block automated attacks, SQL injection, and XSS attempts.
- API Key Rotation: Enforce key expiration policies and provide self-service key management portals for consumers.
- Payload Size Limits: Set maximum request body sizes to prevent denial-of-service attacks via large payloads.
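Two of the checks in this list, CIDR-based access control and payload size limits, are simple enough to sketch with the standard library. The network ranges, the 1 MiB limit, and the admit function are illustrative assumptions rather than recommended values.

```python
import ipaddress

# Hypothetical policy for an internal API.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("192.168.1.0/24"),
]
MAX_BODY_BYTES = 1_048_576  # 1 MiB


def admit(client_ip: str, content_length: int) -> int:
    """Return the HTTP status the gateway would answer with before proxying."""
    address = ipaddress.ip_address(client_ip)
    if not any(address in network for network in ALLOWED_NETWORKS):
        return 403  # caller is outside the allowed CIDR ranges
    if content_length > MAX_BODY_BYTES:
        return 413  # payload too large
    return 200
```

Running cheap checks like these first means oversized or unauthorized requests are rejected before any token validation or backend work happens.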
A common security architecture layers an API gateway behind a CDN/WAF (like CloudFront + AWS WAF or Azure Front Door) for DDoS protection, while the gateway handles application-level security. See more at swehelper.com security patterns.
Monitoring and Observability
Since all traffic flows through the gateway, it is the ideal place to instrument observability. Key metrics and practices include:
| Metric | What It Tells You | Alert Threshold Example |
|---|---|---|
| Request Rate (RPS) | Traffic volume and trends | Spike above 3x baseline |
| Error Rate (4xx/5xx) | Client or server-side failures | 5xx rate exceeds 1% |
| P50/P95/P99 Latency | Response time distribution | P99 above 500ms |
| Rate Limit Hits | Clients hitting throttle limits | Consistent 429 responses |
| Circuit Breaker State | Downstream service health | Circuit opens for any service |
The gateway should inject a correlation ID (e.g., X-Request-ID) into every request so you can trace a single user action across all downstream services. Integrate with distributed tracing systems like Jaeger, Zipkin, or AWS X-Ray. Export metrics to Prometheus/Grafana or your cloud provider's monitoring stack.
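The sketch below illustrates two of the ideas above: injecting a correlation ID only when the caller did not supply one, and computing latency percentiles with the nearest-rank method. Both function names are hypothetical, and real metrics systems usually estimate percentiles from histograms rather than from raw samples.

```python
import math
import uuid


def ensure_request_id(headers: dict) -> dict:
    """Copy the headers, adding X-Request-ID only when none is present."""
    out = dict(headers)
    if not any(name.lower() == "x-request-id" for name in out):
        out["X-Request-ID"] = str(uuid.uuid4())
    return out


def percentile(samples_ms, p: int):
    """Nearest-rank percentile: the smallest sample with at least p% at or below it."""
    ordered = sorted(samples_ms)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


latencies_ms = list(range(1, 101))  # synthetic samples: 1 ms .. 100 ms
p99_breached = percentile(latencies_ms, 99) > 500  # threshold from the table above
```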
Use swehelper.com latency calculator to model the impact of gateway overhead on your end-to-end response times.
Patterns and Anti-Patterns
Good Patterns
- Gateway as a thin layer: Keep business logic out of the gateway. It should only handle cross-cutting concerns. If your gateway has if/else branches based on business rules, something is wrong.
- Declarative configuration: Define routes, plugins, and policies as code (YAML/JSON) in version control. This enables code review, rollback, and reproducibility.
- Health check endpoints: Configure the gateway to actively probe backend health (/health or /ready) and remove unhealthy instances from the routing pool automatically.
- Graceful degradation: When a non-critical service is down, return cached or default responses instead of propagating errors. For example, if the recommendation service is down, show popular items instead of an error.
- API versioning at the gateway: Route /v1/users to the legacy service and /v2/users to the new service. This allows gradual migration without client disruption.
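The health-check pattern from the list above can be sketched as a pool that drops an instance after a number of consecutive failed probes and restores it on the next successful one. The UpstreamPool class and its threshold are assumptions for illustration; the probing itself (for example, a periodic GET to a health endpoint) is left out.

```python
class UpstreamPool:
    """Tracks probe results and exposes only the instances considered healthy."""

    def __init__(self, instances, unhealthy_after: int = 2):
        self.failures = {instance: 0 for instance in instances}
        self.unhealthy_after = unhealthy_after

    def report(self, instance: str, healthy: bool) -> None:
        # A successful probe resets the counter; a failed one increments it.
        self.failures[instance] = 0 if healthy else self.failures[instance] + 1

    def healthy_instances(self) -> list:
        return [i for i, count in self.failures.items() if count < self.unhealthy_after]
```

Requiring more than one failed probe avoids ejecting an instance because of a single dropped packet.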
Anti-Patterns to Avoid
- God Gateway: Stuffing business logic, data transformation, orchestration, and even database calls into the gateway. This creates a monolithic bottleneck that defeats the purpose of microservices.
- Single point of failure: Running a single gateway instance without redundancy. Always deploy at least two instances behind a load balancer with health checks.
- Tight coupling to gateway vendor: Embedding vendor-specific constructs deeply into your service contracts. Use standard protocols (OpenAPI, gRPC) so you can swap gateways if needed.
- Ignoring gateway latency: Every hop adds latency. An API gateway typically adds 5-20ms. If you chain multiple gateways (edge gateway to internal gateway to service mesh sidecar), the cumulative overhead can become significant.
- No canary or blue-green strategy: Deploying gateway configuration changes to all traffic at once. Always use traffic splitting to roll out changes gradually.
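The latency point above is easy to quantify with back-of-the-envelope arithmetic. The sketch assumes, purely for illustration, that every hop in the chain costs the same 5-20ms quoted for a single gateway; a sidecar proxy will often cost less, so treat the result as a rough upper range.

```python
PER_HOP_OVERHEAD_MS = (5, 20)  # range quoted above for a single gateway
HOPS = ["edge gateway", "internal gateway", "service mesh sidecar"]


def cumulative_overhead_ms(hop_count: int, per_hop=PER_HOP_OVERHEAD_MS):
    """Return the (best case, worst case) added latency for a chain of hops."""
    low, high = per_hop
    return hop_count * low, hop_count * high


best_ms, worst_ms = cumulative_overhead_ms(len(HOPS))
```

Under that assumption, three hops add between 15 and 60ms before any backend work starts.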
Real-World Architecture Example
Here is a production-grade API gateway architecture for an e-commerce platform handling 50,000 requests per second:
    Internet Traffic
           |
           v
    +--------------+
    |  CloudFront  |   CDN + DDoS protection
    |  + AWS WAF   |   Edge caching for static content
    +------+-------+
           |
           v
    +--------------+
    |   AWS ALB    |   Layer 7 load balancer
    |  (multi-AZ)  |   SSL termination, health checks
    +------+-------+
           |
           v
    +--------------------------------------+
    |         Kong Gateway Cluster         |
    |  (3 nodes, Kubernetes, PostgreSQL)   |
    |                                      |
    |  Plugins: JWT auth, rate limiting,   |
    |  correlation-id, prometheus,         |
    |  response-transformer, bot-detect    |
    +---+-------+-------+-------+-------+--+
        |       |       |       |       |
        v       v       v       v       v
      User    Order  Payment Catalog Search
       Svc     Svc     Svc     Svc     Svc
       (x4)    (x3)    (x2)    (x6)    (x3)
Key design decisions in this architecture:
- Three-tier entry: CDN/WAF handles volumetric attacks and caching, ALB provides high-availability load balancing, Kong handles application-level gateway logic.
- Kong on Kubernetes: Auto-scales gateway pods based on CPU and request rate. PostgreSQL stores gateway configuration (routes, plugins, consumers).
- Service-level scaling: The Catalog service has 6 replicas because it handles the most read traffic. Payment has only 2 because it processes fewer but more critical requests.
- Redis-backed rate limiting: Shared Redis cluster ensures rate limits are enforced consistently across all Kong nodes, not per-node.
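The last design decision is worth a quick simulation. In the sketch below, a plain dict stands in for Redis (an assumption made to keep the example self-contained), and three limiter objects stand in for the three Kong nodes. Sharing one store enforces the limit once; giving each node its own store effectively multiplies it by the node count.

```python
class FixedWindowLimiter:
    """Fixed-window counter; `store` is a dict standing in for Redis."""

    def __init__(self, limit: int, store: dict):
        self.limit = limit
        self.store = store

    def allow(self, client_id: str, window_id: int) -> bool:
        key = (client_id, window_id)
        self.store[key] = self.store.get(key, 0) + 1  # an INCR in Redis terms
        return self.store[key] <= self.limit


def allowed_count(nodes, requests: int) -> int:
    """Spread requests round-robin over the nodes and count how many pass."""
    return sum(
        nodes[i % len(nodes)].allow("client-a", window_id=0) for i in range(requests)
    )


shared_store = {}
shared_nodes = [FixedWindowLimiter(100, shared_store) for _ in range(3)]
local_nodes = [FixedWindowLimiter(100, {}) for _ in range(3)]

print(allowed_count(shared_nodes, 300))  # 100: the limit holds across nodes
print(allowed_count(local_nodes, 300))   # 300: each node allows its own 100
```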
For hands-on practice designing architectures like this, try the swehelper.com system design simulator.
API Gateway vs Service Mesh
A common source of confusion is the overlap between API gateways and service meshes (like Istio or Linkerd). Here is the distinction:
| Aspect | API Gateway | Service Mesh |
|---|---|---|
| Traffic Scope | North-south (external to internal) | East-west (service to service) |
| Deployment | Centralized edge proxy | Sidecar per service instance |
| Primary Focus | External API management | Internal service communication |
| Use Together? | Yes, at the edge | Yes, internally; the two are complementary |
In mature architectures, both work together. The API gateway handles external client concerns (API keys, public rate limits, request shaping) while the service mesh handles internal concerns (mTLS between services, retry policies, circuit breaking). Explore service mesh patterns on swehelper.com for deeper coverage.
Frequently Asked Questions
Q1: When should I introduce an API gateway?
Introduce a gateway when you have more than 2-3 microservices that clients consume directly. If you are building a monolith or have a single backend, a simple reverse proxy (Nginx) is sufficient. The tipping point is when you find yourself duplicating auth, rate limiting, or CORS configuration across multiple services; that is when a gateway pays for itself.
Q2: Does an API gateway add significant latency?
A well-configured gateway adds 5-20ms of latency in most cases. Managed gateways like AWS API Gateway add slightly more (10-30ms) due to their multi-tenant infrastructure. Self-hosted gateways like Kong or Envoy running close to your services add minimal overhead. The latency tradeoff is almost always worth it for the security, observability, and operational benefits you gain.
Q3: Can I use an API gateway with GraphQL?
Yes. You have two common approaches: (1) Place a GraphQL server (like Apollo Gateway) behind the API gateway, which handles auth and rate limiting at the REST/HTTP level, or (2) use a gateway that natively supports GraphQL (Kong has a GraphQL plugin, and Azure APIM supports GraphQL APIs). The BFF pattern works particularly well here: your Web BFF can expose GraphQL while other BFFs expose REST.
Q4: How do I handle API versioning through the gateway?
Three common strategies: URL path versioning (/v1/users, /v2/users) is the simplest, because the gateway routes each version to the appropriate backend. Header-based versioning (Accept: application/vnd.api.v2+json) keeps URLs clean but is harder to debug. Query parameter versioning (?version=2) is the least recommended. Whichever you choose, the gateway should handle the routing so backend services only need to serve their own version.
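The three strategies can be combined in a single resolution step, sketched below. The precedence order (path, then Accept header, then query parameter), the default of version 1, and the resolve_version name are all assumptions made for the example.

```python
import re


def resolve_version(path: str, headers: dict, query: dict, default: int = 1) -> int:
    """Pick the API version from the path, then the Accept header, then the query."""
    match = re.match(r"^/v(\d+)(/|$)", path)
    if match:
        return int(match.group(1))
    match = re.search(r"application/vnd\.api\.v(\d+)\+json", headers.get("Accept", ""))
    if match:
        return int(match.group(1))
    if query.get("version", "").isdigit():
        return int(query["version"])
    return default
```

Once the version is resolved at the gateway, it can select the upstream, so each backend only ever sees requests for the version it serves.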
Q5: What is the difference between an API gateway and a reverse proxy?
A reverse proxy (Nginx, HAProxy) forwards requests to backend servers with basic load balancing. An API gateway does everything a reverse proxy does plus application-aware features: authentication, rate limiting, request transformation, API analytics, developer portal, and more. Think of an API gateway as a reverse proxy with an extensive plugin system designed specifically for API management. Many organizations start with Nginx and evolve to Kong or a managed gateway as their API surface grows.
API gateways are a foundational building block in modern distributed systems. Whether you choose a managed service like AWS API Gateway for simplicity or a self-hosted solution like Kong for flexibility, the key is keeping the gateway thin, observable, and focused on cross-cutting concerns. For more system design topics and interview preparation, visit swehelper.com system design.