
System Design Cheat Sheets: Quick Reference for Every Key Concept

This is your one-stop quick reference for system design interviews. Every formula, number, pattern, and decision framework you need — condensed into scannable tables and lists. Bookmark this page and review it before your interview. For deeper coverage, see our System Design Interview Guide and Common Questions.

Numbers Every Engineer Should Know

| Operation | Latency | Notes |
|---|---|---|
| L1 cache reference | 0.5 ns | Fastest memory access |
| L2 cache reference | 7 ns | 14x L1 |
| Main memory (RAM) | 100 ns | 200x L1 |
| SSD random read | 150 μs | ~1,500x RAM |
| HDD seek | 10 ms | ~100,000x RAM |
| Network round trip (same DC) | 500 μs | 0.5 ms |
| Network round trip (cross-continent) | 150 ms | Use a CDN to reduce |
| Read 1 MB from SSD | 1 ms | ~1 GB/s throughput |
| Read 1 MB from network (1 Gbps) | 10 ms | ~100 MB/s |
| Redis GET | 0.1-0.2 ms | In-memory, very fast |
| Simple DB query (indexed) | 1-5 ms | With warm cache |

Back-of-Envelope Estimation Formulas

// Time conversions
1 day    = 86,400 seconds ≈ 100,000 seconds (for estimation)
1 month  = 2.5 million seconds
1 year   = 31.5 million seconds

// QPS (Queries Per Second)
QPS = DAU × (avg queries per user per day) / 86,400
Peak QPS = QPS × 2-3 (peak factor)
Write QPS = QPS × write_ratio

// Storage
Storage per year = daily_new_records × 365 × avg_record_size
Total storage (5 years) = Storage per year × 5

// Bandwidth
Incoming BW = write_QPS × avg_request_size
Outgoing BW = read_QPS × avg_response_size

// Cache (80/20 rule)
Cache size = daily_read_requests × 0.2 × avg_response_size
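The formulas above can be sketched as a small script. All the inputs here (10M DAU, 10 reads and 1 write per user per day, 1M new 1 KB records daily) are illustrative numbers, not from any real system:

```python
# Back-of-envelope sketch for a hypothetical service with 10M DAU.
# Every input below is an assumed, illustrative number.

SECONDS_PER_DAY = 86_400

def qps(dau: int, actions_per_user_per_day: float) -> float:
    return dau * actions_per_user_per_day / SECONDS_PER_DAY

dau = 10_000_000
read_qps = qps(dau, 10)        # ~1,157 QPS
write_qps = qps(dau, 1)        # ~116 QPS
peak_read_qps = read_qps * 3   # peak factor of 3

# Storage: 1M new records/day at 1 KB each
storage_per_year = 1_000_000 * 365 * 1_024   # bytes, ~0.37 TB

# Cache (80/20 rule): 20% of daily reads at 1 KB per response
cache_size = dau * 10 * 0.2 * 1_024          # bytes, ~20 GB
```

In an interview, round aggressively (86,400 ≈ 100,000) and sanity-check the result against the bandwidth formulas above.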

Power of 2 Quick Reference

| Power | Exact | Approx | Name |
|---|---|---|---|
| 2^10 | 1,024 | 1 Thousand | 1 KB |
| 2^20 | 1,048,576 | 1 Million | 1 MB |
| 2^30 | 1,073,741,824 | 1 Billion | 1 GB |
| 2^40 | 1,099,511,627,776 | 1 Trillion | 1 TB |
| 2^50 | 1,125,899,906,842,624 | 1 Quadrillion | 1 PB |

CAP Theorem Quick Reference

| Property | Meaning | Example |
|---|---|---|
| Consistency | All nodes see the same data at the same time | Bank transactions |
| Availability | Every request receives a response | Social media feed |
| Partition Tolerance | System works despite network partitions | Required in distributed systems |

| Choose | Trade-off | Databases |
|---|---|---|
| CP | May be unavailable during a partition | MongoDB, HBase, Redis |
| AP | May return stale data during a partition | Cassandra, DynamoDB, CouchDB |
| CA | Not partition tolerant (single node only) | Traditional RDBMS (single-node PostgreSQL) |

Consistency Patterns

| Pattern | Guarantee | Use Case |
|---|---|---|
| Strong Consistency | Reads always return the latest write | Banking, inventory |
| Eventual Consistency | Reads eventually return the latest write | Social feeds, analytics |
| Read-your-writes | User sees their own writes immediately | User profile updates |
| Causal Consistency | Causally related writes seen in order | Comment threads |

Caching Strategies

| Strategy | How It Works | Best For |
|---|---|---|
| Cache-Aside (Lazy) | App reads cache first; on miss, reads DB and populates cache | Read-heavy, general purpose |
| Write-Through | Write to cache and DB simultaneously | When data freshness is critical |
| Write-Behind (Back) | Write to cache; async write to DB later | Write-heavy with eventual consistency OK |
| Write-Around | Write directly to DB; cache populated on read | Data rarely re-read after write |
| Read-Through | Cache loads from DB on miss transparently | Simplified application code |
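Cache-aside, the most common default, can be sketched in a few lines. Here a plain dict stands in for Redis, and `fetch_from_db` is a hypothetical loader, not a real client call:

```python
# Cache-aside sketch: the application owns the cache logic.
# A dict stands in for Redis; fetch_from_db is a hypothetical DB loader.

cache: dict[str, str] = {}

def fetch_from_db(key: str) -> str:
    return f"db-value-for-{key}"   # placeholder for a real query

def get(key: str) -> str:
    if key in cache:               # 1. try the cache first
        return cache[key]
    value = fetch_from_db(key)     # 2. on miss, read the DB
    cache[key] = value             # 3. populate the cache for next time
    return value

def update(key: str) -> None:
    # on write: update the DB (omitted), then invalidate the cached entry
    cache.pop(key, None)
```

The invalidation in `update` is the part interviewers probe: deleting the stale entry and letting the next read repopulate it is simpler and safer than trying to update the cache in place.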

Load Balancing Algorithms

| Algorithm | How It Works | Best For |
|---|---|---|
| Round Robin | Cycles through servers sequentially | Equal-capacity servers, stateless |
| Weighted Round Robin | More traffic to higher-capacity servers | Mixed-capacity servers |
| Least Connections | Route to server with fewest active connections | Long-lived connections, varying request times |
| IP Hash | Hash client IP to determine server | Session affinity without cookies |
| Consistent Hashing | Minimize redistribution when servers change | Caches, distributed databases |
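Consistent hashing is worth being able to sketch from memory. This is a minimal ring with virtual nodes (node and key names are illustrative); adding or removing a server only remaps the keys between it and its neighbor on the ring:

```python
import bisect
import hashlib

# Consistent-hash ring sketch with virtual nodes. Each physical node gets
# `vnodes` positions on the ring so keys spread evenly.

class HashRing:
    def __init__(self, nodes: list[str], vnodes: int = 100):
        self._ring: list[tuple[int, str]] = []
        for node in nodes:
            for i in range(vnodes):
                self._ring.append((self._hash(f"{node}#{i}"), node))
        self._ring.sort()

    @staticmethod
    def _hash(key: str) -> int:
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def get_node(self, key: str) -> str:
        # walk clockwise to the first virtual node at or after the key's hash
        hashes = [h for h, _ in self._ring]
        idx = bisect.bisect(hashes, self._hash(key)) % len(self._ring)
        return self._ring[idx][1]

ring = HashRing(["cache-a", "cache-b", "cache-c"])
node = ring.get_node("user:42")   # deterministic server choice for this key
```

A production version would precompute the hash list and use a stronger spread of virtual nodes, but the ring-plus-binary-search shape is the core idea.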

Database Selection Guide

See our Database Cheatsheet for detailed comparison.

| Use Case | Database Type | Examples |
|---|---|---|
| Structured data, ACID transactions | Relational (SQL) | PostgreSQL, MySQL |
| Flexible schema, rapid iteration | Document | MongoDB, CouchDB |
| High write throughput, horizontal scale | Wide Column | Cassandra, HBase |
| Caching, sessions, leaderboards | Key-Value | Redis, Memcached, DynamoDB |
| Relationships, social graphs | Graph | Neo4j, Amazon Neptune |
| Full-text search | Search Engine | Elasticsearch, Solr |
| Metrics, monitoring, IoT | Time Series | InfluxDB, TimescaleDB |

Message Queue Comparison

| Feature | Kafka | RabbitMQ | SQS |
|---|---|---|---|
| Model | Log-based (pull) | Queue (push) | Queue (pull) |
| Throughput | Very high (millions/sec) | High (~100K/sec) | High (managed) |
| Ordering | Per-partition | Per-queue | FIFO option |
| Retention | Configurable (days/weeks) | Until consumed | 14 days max |
| Best for | Event streaming, log aggregation | Task queues, RPC | Serverless, simple decoupling |

Microservices Patterns

| Pattern | Purpose |
|---|---|
| API Gateway | Single entry point, routing, auth, rate limiting |
| Service Discovery | Services find each other dynamically |
| Circuit Breaker | Prevent cascading failures |
| Saga Pattern | Distributed transactions via compensating actions |
| CQRS | Separate read and write models |
| Event Sourcing | Store state as sequence of events |
| Sidecar | Attach helper process alongside main service |
| Strangler Fig | Incrementally migrate from monolith |
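The circuit breaker is the pattern most often asked about in detail. A minimal sketch of the state machine (closed → open → half-open), with illustrative thresholds:

```python
import time

# Circuit-breaker sketch: after `threshold` consecutive failures the
# breaker opens and fails fast; after `reset_after` seconds it goes
# half-open and lets one trial call through.

class CircuitBreaker:
    def __init__(self, threshold: int = 3, reset_after: float = 30.0):
        self.threshold = threshold
        self.reset_after = reset_after
        self.failures = 0
        self.opened_at: float | None = None

    def call(self, fn, *args):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_after:
                raise RuntimeError("circuit open: failing fast")
            self.opened_at = None        # half-open: allow one trial call
        try:
            result = fn(*args)
        except Exception:
            self.failures += 1
            if self.failures >= self.threshold:
                self.opened_at = time.monotonic()
            raise
        self.failures = 0                # any success closes the breaker
        return result
```

The key talking point: failing fast while open protects the struggling downstream service and frees the caller's threads, which is what stops the cascade.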

API Design Checklist

API Design:
[ ] RESTful resource naming (nouns, not verbs)
[ ] Consistent HTTP methods (GET=read, POST=create, PUT=update, DELETE=delete)
[ ] Proper status codes (200, 201, 400, 401, 403, 404, 429, 500)
[ ] Pagination (cursor-based for real-time data, offset for static)
[ ] Versioning strategy (URL path: /v1/ recommended)
[ ] Rate limiting headers (X-RateLimit-*)
[ ] Authentication (OAuth 2.0, JWT, API keys)
[ ] Input validation and sanitization
[ ] Error response format (consistent JSON structure)
[ ] HATEOAS links (optional, for discoverability)

Security Quick Reference

For detailed security guides, see Authentication vs Authorization, OAuth 2.0, JWT, and Encryption.

| Topic | Key Points |
|---|---|
| Authentication | OAuth 2.0 + OIDC for user-facing; mTLS for service-to-service |
| Authorization | RBAC for most apps; ABAC for fine-grained policies |
| Encryption | AES-256-GCM at rest; TLS 1.3 in transit |
| Passwords | bcrypt or Argon2id, never SHA-256 or MD5 |
| API Security | Rate limiting, input validation, CORS, security headers |

Practice with Security Crypto Tools and API Network Tools. Visit swehelper.com/tools for all interactive tools.

Monitoring Checklist

The Four Golden Signals (Google SRE):
1. Latency    — Time to process a request (p50, p95, p99)
2. Traffic    — Requests per second
3. Errors     — Rate of failed requests (5xx)
4. Saturation — How full the system is (CPU, memory, disk, connections)

RED Method (for request-driven services):
- Rate:     Requests per second
- Errors:   Number of failed requests
- Duration: Distribution of request durations

USE Method (for resources):
- Utilization: % of time resource is busy
- Saturation:  Amount of work queued
- Errors:      Count of error events
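The latency percentiles mentioned under the golden signals (p50, p95, p99) are easy to compute by hand in an interview. A sketch using a nearest-rank approach over a window of request durations (the sample values are made up):

```python
# Latency percentile sketch (p50/p99) using a nearest-rank approach
# over a window of request durations in milliseconds.

def percentile(samples: list[float], p: float) -> float:
    ordered = sorted(samples)
    # smallest value with roughly p% of samples at or below it
    rank = max(1, round(p / 100 * len(ordered)))
    return ordered[rank - 1]

latencies_ms = [12.0, 15.0, 11.0, 250.0, 14.0, 13.0, 16.0, 12.5, 900.0, 14.5]
p50 = percentile(latencies_ms, 50)   # typical request: 14.0 ms
p99 = percentile(latencies_ms, 99)   # tail dominated by outliers: 900.0 ms
```

This is why averages mislead: the mean of this window is ~126 ms, which describes no actual request, while p50 and p99 expose the typical case and the tail separately.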

Availability SLA Reference

| SLA | Downtime/Year | Downtime/Month | Downtime/Day |
|---|---|---|---|
| 99% (two 9s) | 3.65 days | 7.3 hours | 14.4 minutes |
| 99.9% (three 9s) | 8.76 hours | 43.8 minutes | 1.44 minutes |
| 99.99% (four 9s) | 52.6 minutes | 4.38 minutes | 8.6 seconds |
| 99.999% (five 9s) | 5.26 minutes | 26.3 seconds | 0.86 seconds |
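These numbers all come from one formula, which is worth deriving on the spot rather than memorizing:

```python
# Downtime allowed by an availability SLA: total period × (1 - SLA).

SECONDS_PER_YEAR = 365 * 24 * 3600   # 31,536,000

def downtime_per_year_seconds(sla_percent: float) -> float:
    return SECONDS_PER_YEAR * (1 - sla_percent / 100)

four_nines = downtime_per_year_seconds(99.99) / 60    # ~52.6 minutes/year
five_nines = downtime_per_year_seconds(99.999) / 60   # ~5.26 minutes/year
```

A useful rule of thumb that falls out of it: each extra nine cuts the allowed downtime by 10x.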

Frequently Asked Questions

What is the most important concept for system design interviews?

Trade-offs. Every design decision involves trade-offs between consistency and availability, latency and throughput, simplicity and scalability, cost and performance. The ability to articulate why you chose one approach over another is the single most important skill. See our Interview Guide for the full framework.

How do I decide between SQL and NoSQL?

Default to SQL (PostgreSQL) unless you have a specific reason for NoSQL. Use NoSQL when you need: horizontal write scaling (Cassandra), flexible schemas (MongoDB), extreme read speed (Redis), or graph queries (Neo4j). See our Database Cheatsheet for detailed decision criteria.

When should I introduce caching in my design?

Introduce caching when: reads significantly outnumber writes (10:1+), data changes infrequently, latency requirements are strict, or you need to reduce database load. Cache-aside with Redis is the safest default. Always discuss cache invalidation strategy — it is one of the hardest problems in computer science.

What is the difference between horizontal and vertical scaling?

Vertical scaling (scale up) means adding more CPU/RAM to a single machine. It is simpler but has a hard ceiling. Horizontal scaling (scale out) means adding more machines. It requires distributed systems thinking (load balancing, sharding, consistency) but scales nearly infinitely. Most interview designs should plan for horizontal scaling. See our Scalability Cheatsheet for patterns.
