Common System Design Interview Questions: Top 20 with Approaches
This is your practical guide to the most frequently asked system design interview questions. For each question, you will find the problem scope, key requirements, a high-level approach, essential components, and the main challenges to address. Use this alongside the System Design Interview Guide for the overall framework and our Cheat Sheets for quick reference on building blocks.
Easy Level
1. URL Shortener (like bit.ly)
Problem: Design a service that takes long URLs and generates short, unique aliases that redirect to the original URL.
| Aspect | Details |
|---|---|
| Key Requirements | Generate unique short codes, redirect with low latency, analytics tracking |
| Scale | 100:1 read:write ratio, billions of URLs |
| Key Components | Application servers, key generation service, cache (Redis), database |
| Main Challenge | Unique ID generation at scale (Base62-encoded counter, Snowflake-style IDs, or pre-generated keys) |
| Concepts to Emphasize | Hashing, caching, database indexing, 301 vs 302 redirects |
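One common way to turn a numeric ID into a short code is Base62 encoding. The sketch below assumes the ID comes from some upstream source, such as a counter or a Snowflake-style generator; the function names and alphabet ordering are illustrative choices, not a fixed standard.

```python
import string

# 62 symbols: 0-9, a-z, A-Z (the ordering is an arbitrary but fixed choice)
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase


def encode_base62(n: int) -> str:
    """Encode a non-negative integer ID as a Base62 short code."""
    if n < 0:
        raise ValueError("ID must be non-negative")
    if n == 0:
        return ALPHABET[0]
    chars = []
    while n > 0:
        n, rem = divmod(n, 62)
        chars.append(ALPHABET[rem])
    return "".join(reversed(chars))


def decode_base62(code: str) -> int:
    """Decode a Base62 short code back to its integer ID."""
    n = 0
    for ch in code:
        n = n * 62 + ALPHABET.index(ch)
    return n
```

Seven Base62 characters cover 62^7 (about 3.5 trillion) IDs, which is why short codes of that length are enough for billions of URLs.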
2. Pastebin
Problem: Design a service where users can store and share plain text or code snippets via a unique URL.
- Very similar to the URL shortener, but with content storage
- Key addition: object storage (S3) for paste content, content size limits
- Discuss: content expiration, abuse prevention, syntax highlighting
- Database stores metadata; actual content goes to blob storage
3. Rate Limiter
Problem: Design a distributed rate limiting system that controls the number of requests a client can make.
- Algorithms: token bucket (most common), sliding window, fixed window
- Distributed rate limiting requires shared state (Redis)
- Discuss: race conditions with concurrent requests, multi-data-center synchronization
- Key trade-off: accuracy vs performance (eventual consistency may be acceptable)
- Concepts: API security, distributed caching, atomic operations
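The token bucket algorithm mentioned above can be sketched in a few lines. This is a single-process illustration only; the class name and parameters are assumptions, and a distributed version would keep the token count in shared storage and update it atomically.

```python
import time
from typing import Optional


class TokenBucket:
    """Single-process token bucket: holds up to `capacity` tokens, refilled at `refill_rate` per second."""

    def __init__(self, capacity: float, refill_rate: float, now: Optional[float] = None):
        self.capacity = capacity
        self.refill_rate = refill_rate
        self.tokens = capacity  # start with a full bucket
        self.last_refill = time.monotonic() if now is None else now

    def allow(self, now: Optional[float] = None, cost: float = 1.0) -> bool:
        """Return True and consume tokens if the request fits, otherwise False."""
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.last_refill)
        # Refill lazily based on elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_refill = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

The `now` argument exists only to make the behavior easy to test with a simulated clock; in normal use it can be omitted.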
4. Key-Value Store
Problem: Design a distributed key-value store like Redis or DynamoDB.
- Partitioning: consistent hashing to distribute keys across nodes
- Replication: configurable replication factor for durability
- Consistency: quorum reads/writes (with N replicas, read quorum R, and write quorum W, choosing R + W > N makes every read overlap the latest write)
- Failure handling: gossip protocol, hinted handoff, anti-entropy repair
- Concepts: CAP theorem, consistent hashing, vector clocks
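A minimal sketch of consistent hashing with virtual nodes follows. The class name, the choice of SHA-256, and the default of 100 virtual nodes per physical node are all assumptions made for illustration.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    # 64-bit position on the ring derived from SHA-256 (an arbitrary choice of hash)
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class HashRing:
    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes
        self._hashes = []  # sorted virtual-node positions
        self._owner = {}   # position -> physical node
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        for i in range(self.vnodes):
            pos = _hash(f"{node}#{i}")
            self._owner[pos] = node
            bisect.insort(self._hashes, pos)

    def remove_node(self, node):
        self._hashes = [h for h in self._hashes if self._owner[h] != node]
        self._owner = {h: n for h, n in self._owner.items() if n != node}

    def get_node(self, key):
        """Return the node owning the first virtual node clockwise from the key."""
        if not self._hashes:
            raise LookupError("ring is empty")
        idx = bisect.bisect_right(self._hashes, _hash(key)) % len(self._hashes)
        return self._owner[self._hashes[idx]]
```

The property worth pointing out in an interview is that removing a node only remaps the keys that node owned; every other key stays where it was.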
Medium Level
5. Twitter / X Feed
Problem: Design the home timeline feature where users see tweets from people they follow.
| Approach | How It Works | Trade-off |
|---|---|---|
| Pull (Fan-out on read) | Query all followed users' tweets on timeline load | Slow reads, no pre-computation |
| Push (Fan-out on write) | When user tweets, push to all followers' timelines | Slow writes for celebrities (millions of followers) |
| Hybrid | Push for regular users, pull for celebrities | More complex, but avoids slow celebrity writes while keeping reads fast for most users |
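The hybrid approach in the table can be sketched with in-memory dictionaries. Everything here is hypothetical: the `FeedService` class, the tiny `CELEBRITY_THRESHOLD`, and the use of plain tuples for posts are chosen only to keep the example small.

```python
from collections import defaultdict

CELEBRITY_THRESHOLD = 3  # hypothetical cutoff; tiny so the demo is easy to follow


class FeedService:
    def __init__(self):
        self.followers = defaultdict(set)   # author -> users following them
        self.following = defaultdict(set)   # user -> authors they follow
        self.posts = defaultdict(list)      # author -> [(ts, text)]
        self.timelines = defaultdict(list)  # user -> pushed [(ts, author, text)]

    def follow(self, user, author):
        self.followers[author].add(user)
        self.following[user].add(author)

    def is_celebrity(self, author):
        return len(self.followers[author]) >= CELEBRITY_THRESHOLD

    def post(self, author, ts, text):
        self.posts[author].append((ts, text))
        if not self.is_celebrity(author):
            # Fan-out on write: push into each follower's precomputed timeline.
            for follower in self.followers[author]:
                self.timelines[follower].append((ts, author, text))

    def home_timeline(self, user, limit=10):
        merged = set(self.timelines[user])
        # Fan-out on read: pull celebrity posts at request time.
        for author in self.following[user]:
            if self.is_celebrity(author):
                merged.update((ts, author, text) for ts, text in self.posts[author])
        return sorted(merged, reverse=True)[:limit]
```

A production design would also need to handle accounts that cross the threshold in either direction; this sketch only deduplicates posts that end up both pushed and pulled.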
6. Instagram
Problem: Design a photo-sharing social network with feed, stories, likes, and comments.
- Image storage: object storage (S3) with CDN for delivery
- Image processing: resize, compress, generate thumbnails asynchronously
- Feed generation: similar to Twitter but with ranking algorithm
- Key components: CDN, object storage, message queue for async processing, search service
- Discuss: image upload flow, feed ranking signals, sharding by user ID
7. WhatsApp / Chat System
Problem: Design a real-time messaging system supporting 1:1 and group chats.
- Connection: WebSocket for real-time bidirectional communication
- Message delivery: at-least-once delivery with deduplication
- Offline messages: store-and-forward when recipient is offline
- Group chat: fan-out messages to all group members
- End-to-end encryption: Signal Protocol (Double Ratchet)
- Key challenge: presence system (online/offline/typing indicators) at scale
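At-least-once delivery with deduplication can be illustrated with a sender that retries until it sees an acknowledgment and a receiver that ignores message IDs it has already stored. The names below and the `drop_acks` parameter, which simulates lost acknowledgments, are assumptions for the sake of the example.

```python
class Receiver:
    def __init__(self):
        self.seen_ids = set()
        self.inbox = []

    def deliver(self, msg_id, body):
        """Store the message once; always acknowledge so the sender can stop retrying."""
        if msg_id not in self.seen_ids:
            self.seen_ids.add(msg_id)
            self.inbox.append(body)
        return msg_id


def send_with_retry(receiver, msg_id, body, drop_acks=0, max_attempts=5):
    """Resend until an ack arrives. The first `drop_acks` acks are treated as lost in transit."""
    for attempt in range(1, max_attempts + 1):
        ack = receiver.deliver(msg_id, body)
        if attempt > drop_acks and ack == msg_id:
            return attempt  # number of attempts it took
    raise TimeoutError(f"no ack for {msg_id} after {max_attempts} attempts")
```

Even though the message reaches the receiver several times when acks are lost, it appears in the inbox only once.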
8. Notification System
Problem: Design a system that sends push notifications, emails, and SMS to millions of users.
- Components: notification service, priority queue, rate limiter, template engine, delivery services
- Multi-channel: push (APNS/FCM), email (SES/SendGrid), SMS (Twilio)
- Reliability: retry with exponential backoff, dead letter queue for failed deliveries
- User preferences: per-user notification settings, quiet hours, channel preferences
- Discuss: rate limiting to prevent spam, deduplication, analytics
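Retry with exponential backoff and a dead letter queue might look like the sketch below. `DeliveryError`, `deliver_with_retry`, and the list used as a dead letter queue are hypothetical stand-ins; real systems usually add random jitter to the delays and use a durable queue.

```python
import time


class DeliveryError(Exception):
    """Raised by a channel sender when a delivery attempt fails."""


def deliver_with_retry(send, notification, dead_letters, max_attempts=4,
                       base_delay=1.0, max_delay=60.0, sleep=time.sleep):
    """Try to send; wait base_delay * 2^attempt between failures, then dead-letter."""
    for attempt in range(max_attempts):
        try:
            send(notification)
            return True
        except DeliveryError:
            if attempt < max_attempts - 1:
                sleep(min(base_delay * 2 ** attempt, max_delay))
    # All attempts failed: park the notification for inspection or later replay.
    dead_letters.append(notification)
    return False
```

Passing `sleep` as a parameter keeps the function testable without real waiting.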
9. Web Crawler
Problem: Design a web crawler that systematically browses the internet and indexes web pages.
- Core loop: pick URL from frontier, download page, extract links, add to frontier
- URL frontier: priority queue with politeness constraints (for example, at most one request per domain per second)
- Deduplication: URL bloom filter + content fingerprinting (SimHash)
- Distributed: partition by domain, coordinate with consistent hashing
- Robots.txt compliance, handling traps (infinite URLs), DNS resolution caching
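The core loop and the politeness rule can be sketched with a simulated clock. The `fetch_links` callback is a hypothetical stand-in for downloading a page and extracting its links, and a plain set replaces the bloom filter for URL deduplication.

```python
from collections import deque
from urllib.parse import urlparse


def crawl(seeds, fetch_links, max_pages=100, per_domain_delay=1.0):
    """Breadth-first crawl. Returns [(simulated_time, url)] in visit order."""
    frontier = deque(seeds)
    seen = set(seeds)   # exact set here; a bloom filter would trade accuracy for memory
    next_allowed = {}   # domain -> earliest time the next request may go out
    clock = 0.0         # simulated clock, so the sketch never really sleeps
    visited = []
    while frontier and len(visited) < max_pages:
        url = frontier.popleft()
        domain = urlparse(url).netloc
        # Politeness: "wait" until this domain may be contacted again.
        clock = max(clock, next_allowed.get(domain, 0.0))
        next_allowed[domain] = clock + per_domain_delay
        visited.append((clock, url))
        for link in fetch_links(url):
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return visited
```

A real frontier would pick a URL from a different domain instead of waiting, and `max_pages` is only a crude guard against crawler traps.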
10. Typeahead / Autocomplete
Problem: Design a search autocomplete system that suggests queries as the user types.
- Data structure: Trie (prefix tree) with frequency counts
- Pre-compute top suggestions for each prefix, update periodically
- Serve from cache (Redis) for low latency (<50ms)
- Update suggestions asynchronously based on search analytics
- Discuss: personalization, trending queries, multi-language support
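A trie with frequency counts can be sketched as follows. This version walks the subtree on every lookup for clarity; the pre-computation described above would instead store the top suggestions on each node. Class and method names are illustrative.

```python
class TrieNode:
    def __init__(self):
        self.children = {}
        self.count = 0  # how often a query ending at this node was searched


class Autocomplete:
    def __init__(self):
        self.root = TrieNode()

    def record(self, query, count=1):
        node = self.root
        for ch in query:
            node = node.children.setdefault(ch, TrieNode())
        node.count += count

    def suggest(self, prefix, k=3):
        """Return up to k completions of prefix, most frequent first."""
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []
        results = []
        stack = [(node, prefix)]
        while stack:
            cur, text = stack.pop()
            if cur.count:
                results.append((cur.count, text))
            for ch, child in cur.children.items():
                stack.append((child, text + ch))
        # Highest count first; ties broken alphabetically for stable output.
        results.sort(key=lambda item: (-item[0], item[1]))
        return [text for _, text in results[:k]]
```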
11. Distributed Task Scheduler
Problem: Design a system that schedules and executes tasks at specified times or intervals.
- Task storage: database with scheduled execution time
- Polling vs event-driven: workers poll for due tasks or use delay queues
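- Avoiding duplicate runs: distributed locking (for example, Redis SET with NX and an expiry) so only one worker claims a task; pair it with idempotent tasks, since locks alone cannot guarantee exactly-once execution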
- Failure handling: heartbeats, task timeout, automatic retry
- Concepts: leader election, distributed locking, idempotency
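The sketch below shows how a lock with an expiry keeps two polling workers from running the same task. `LeaseLock` is a hypothetical in-memory stand-in for a shared lock service, and the `executed` dictionary stands in for a durable record of completed tasks.

```python
class LeaseLock:
    """In-memory stand-in for a distributed lock whose holder expires after a TTL."""

    def __init__(self):
        self._held = {}  # key -> (owner, expires_at)

    def acquire(self, key, owner, now, ttl):
        holder = self._held.get(key)
        if holder is not None and holder[1] > now:
            return False  # someone else still holds an unexpired lease
        self._held[key] = (owner, now + ttl)
        return True


def run_due_tasks(tasks, worker, now, lock, executed, ttl=30.0):
    """tasks maps task_id -> run_at. Run each due, unfinished task this worker can claim."""
    for task_id, run_at in sorted(tasks.items(), key=lambda kv: kv[1]):
        if run_at > now or task_id in executed:
            continue
        if lock.acquire(task_id, worker, now, ttl):
            executed[task_id] = worker  # stand-in for actually executing the task
```

The lease expiry matters for failure handling: if a worker dies while holding the lock, another worker can claim the task once the TTL has passed.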
Hard Level
12. YouTube / Video Streaming Platform
Problem: Design a video upload, processing, and streaming platform.
| Component | Purpose | Technology |
|---|---|---|
| Upload Service | Resumable uploads, chunked transfer | S3 multipart upload |
| Transcoding Pipeline | Convert to multiple resolutions/formats | FFmpeg, distributed workers |
| CDN | Deliver video to users globally | CloudFront, custom CDN |
| Streaming | Adaptive bitrate streaming | HLS/DASH protocols |
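On the client side, adaptive bitrate streaming comes down to picking the best rendition that fits the measured bandwidth. The bitrate ladder and the 0.8 safety factor below are made-up values for illustration, not figures from any particular platform.

```python
# Hypothetical bitrate ladder: (video height in pixels, required kbps), lowest first
RENDITIONS = [(240, 400), (480, 1000), (720, 2500), (1080, 5000)]


def pick_rendition(measured_kbps, safety=0.8):
    """Pick the highest rendition whose bitrate fits within a safety margin of bandwidth."""
    budget = measured_kbps * safety
    chosen = RENDITIONS[0]  # always fall back to the lowest quality
    for rendition in RENDITIONS:
        if rendition[1] <= budget:
            chosen = rendition
    return chosen
```

Players typically re-run a decision like this for every segment, which is what lets quality shift up or down mid-playback.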
13. Google Maps
Problem: Design a mapping service with navigation, directions, and location search.
- Map tiles: pre-rendered at multiple zoom levels, served via CDN
- Routing: graph-based (Dijkstra/A*), pre-computed with contraction hierarchies
- Real-time traffic: stream processing of GPS data from millions of devices
- Location search: geospatial index (R-tree, geohash, quadtree)
- ETA estimation: ML model combining distance, traffic, historical patterns
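The baseline routing algorithm named above, Dijkstra, can be written compactly with a binary heap. The adjacency-list graph format is an assumption for this sketch; contraction hierarchies and live traffic weights would be layered on top.

```python
import heapq


def shortest_path(graph, source, target):
    """graph: {node: [(neighbor, weight), ...]}. Returns (cost, path), or (inf, []) if unreachable."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == target:
            break
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry; a shorter route was already found
        for neighbor, weight in graph.get(node, []):
            nd = d + weight
            if nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                prev[neighbor] = node
                heapq.heappush(heap, (nd, neighbor))
    if target not in dist:
        return float("inf"), []
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return dist[target], path[::-1]
```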
14. Distributed Cache (like Redis/Memcached)
Problem: Design a distributed caching system that provides low-latency data access.
- Partitioning: consistent hashing with virtual nodes
- Eviction: LRU, LFU, or TTL-based policies
- Replication: primary-replica for read scaling and fault tolerance
- Write and invalidation strategy: write-through, write-behind, or cache-aside, combined with TTLs or explicit invalidation on update
- Hot key handling: local caching, key replication across multiple shards
- Concepts: consistent hashing, caching strategies, CAP theorem
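LRU eviction on a single cache node can be sketched with an `OrderedDict`. The class below is illustrative only and leaves out TTLs, replication, and thread safety.

```python
from collections import OrderedDict


class LRUCache:
    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._data = OrderedDict()  # least recently used entry first

    def get(self, key, default=None):
        if key not in self._data:
            return default
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key]

    def put(self, key, value):
        if key in self._data:
            self._data.move_to_end(key)
        self._data[key] = value
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # evict the least recently used entry
```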
15. Search Engine
Problem: Design a text search engine that indexes billions of documents and returns relevant results.
- Indexing: inverted index mapping terms to document IDs
- Crawling: distributed web crawler to discover and fetch pages
- Ranking: TF-IDF, PageRank, and ML-based relevance scoring
- Query processing: tokenization, stemming, query expansion
- Sharding: partition index by document ID or term ranges
- Key challenge: freshness (how quickly new content appears in results)
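An inverted index with a simple TF-IDF score can be sketched as below. Whitespace tokenization and the `log(1 + N / df)` weighting are simplifying assumptions; production engines add stemming, positional data, and far richer ranking signals.

```python
import math
from collections import Counter, defaultdict


class InvertedIndex:
    def __init__(self):
        self.postings = defaultdict(dict)  # term -> {doc_id: term frequency}
        self.doc_count = 0

    def add(self, doc_id, text):
        self.doc_count += 1
        for term, tf in Counter(text.lower().split()).items():
            self.postings[term][doc_id] = tf

    def search(self, query):
        """Return doc IDs ranked by summed tf * idf over the query terms."""
        scores = defaultdict(float)
        for term in query.lower().split():
            docs = self.postings.get(term, {})
            if not docs:
                continue
            idf = math.log(1 + self.doc_count / len(docs))  # rarer terms weigh more
            for doc_id, tf in docs.items():
                scores[doc_id] += tf * idf
        return sorted(scores, key=lambda d: (-scores[d], d))
```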
16. Uber / Ride-Sharing Platform
Problem: Design a ride-sharing platform that matches drivers with riders in real-time.
- Location tracking: drivers send GPS coordinates every 3-5 seconds
- Matching: geospatial index to find nearby drivers, optimize ETA
- Dispatch: real-time matching algorithm considering location, rating, ride type
- Pricing: surge pricing based on supply-demand ratio per geo-zone
- Key components: location service, matching service, pricing service, trip service, payment service
- Concepts: geospatial indexing, real-time data processing, microservices
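Finding nearby drivers with a geospatial index can be approximated with a fixed grid, which is a simpler relative of geohash or quadtree cells. The cell size, class name, and method names are assumptions made for this sketch.

```python
import math
from collections import defaultdict

CELL_DEG = 0.01  # hypothetical cell size, roughly 1.1 km of latitude


def cell_of(lat, lon):
    return (math.floor(lat / CELL_DEG), math.floor(lon / CELL_DEG))


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


class DriverIndex:
    def __init__(self):
        self.cells = defaultdict(set)  # grid cell -> driver IDs currently in it
        self.positions = {}            # driver ID -> (lat, lon)

    def update(self, driver_id, lat, lon):
        old = self.positions.get(driver_id)
        if old is not None:
            self.cells[cell_of(*old)].discard(driver_id)
        self.positions[driver_id] = (lat, lon)
        self.cells[cell_of(lat, lon)].add(driver_id)

    def nearby(self, lat, lon, k=3):
        """Return up to k closest drivers from the rider's cell and its 8 neighbors."""
        cx, cy = cell_of(lat, lon)
        candidates = set()
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                candidates |= self.cells.get((cx + dx, cy + dy), set())
        return sorted(candidates,
                      key=lambda d: haversine_km((lat, lon), self.positions[d]))[:k]
```

Because each location update only touches two cells, the index keeps up with frequent GPS pings far better than scanning every driver on each request.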
17. Distributed Message Queue (like Kafka)
Problem: Design a distributed messaging system supporting publish-subscribe and at-least-once delivery.
- Topics and partitions: messages partitioned for parallel processing
- Consumer groups: each group gets every message, consumers within a group share load
- Persistence: append-only log on disk for durability
- Ordering: guaranteed within a partition, not across partitions
- Replication: ISR (In-Sync Replicas) for fault tolerance
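The partitioning, ordering, and consumer-group ideas above can be modeled in memory. This `Topic` class is a toy built on stated assumptions (CRC32 key hashing, Python lists as logs); real brokers persist the log to disk and replicate it.

```python
import zlib
from collections import defaultdict


class Topic:
    def __init__(self, num_partitions):
        self.partitions = [[] for _ in range(num_partitions)]  # append-only logs
        # consumer group -> next offset to read, one entry per partition
        self.offsets = defaultdict(lambda: [0] * num_partitions)

    def publish(self, key, value):
        """Append to the partition chosen by key hash, so one key keeps its order."""
        p = zlib.crc32(key.encode()) % len(self.partitions)
        self.partitions[p].append(value)
        return p, len(self.partitions[p]) - 1

    def poll(self, group, partition, max_records=10):
        start = self.offsets[group][partition]
        return self.partitions[partition][start:start + max_records]

    def commit(self, group, partition, count):
        """Advance the group's offset; call this only after processing succeeds."""
        self.offsets[group][partition] += count
```

Committing after processing gives at-least-once delivery: if a consumer crashes before committing, the same records are returned by the next poll.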
18. E-Commerce Platform (like Amazon)
Problem: Design an e-commerce platform with product catalog, shopping cart, checkout, and order management.
- Services: product catalog, inventory, cart, order, payment, shipping
- Inventory: distributed locking to prevent overselling
- Checkout: saga pattern for distributed transactions across services
- API security: OAuth 2.0, payment tokenization, PCI compliance
- Key challenge: handling flash sales (sudden 100x traffic spike)
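The saga pattern runs each checkout step with a matching compensation and undoes completed steps in reverse order when a later step fails. The `run_saga` helper and the step names in the usage note are hypothetical.

```python
def run_saga(steps):
    """steps: list of (name, action, compensation). Returns (succeeded, log)."""
    completed = []
    log = []
    for name, action, compensate in steps:
        try:
            action()
        except Exception:
            log.append(f"failed:{name}")
            # Undo already-completed steps, most recent first.
            for done_name, undo in reversed(completed):
                undo()
                log.append(f"undo:{done_name}")
            return False, log
        log.append(f"done:{name}")
        completed.append((name, compensate))
    return True, log
```

For example, a checkout saga might pair "reserve inventory" with "release inventory" and "charge payment" with "refund payment". In a real system each step would be a call to another service, and the compensations themselves would need retries.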
19. Metrics and Monitoring System (like Datadog)
Problem: Design a system that collects, stores, and visualizes millions of metrics per second.
- Data model: time-series data (metric name, tags, timestamp, value)
- Storage: time-series database (InfluxDB, TimescaleDB) with downsampling
- Ingestion: high-throughput write path with buffering
- Alerting: rules engine evaluating metrics against thresholds
- Concepts: database selection, stream processing, data retention policies
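Downsampling replaces raw points with per-bucket aggregates so that older data takes less space. The sketch assumes integer timestamps in seconds and keeps the average and maximum per bucket; which aggregates to keep is a design choice.

```python
from collections import defaultdict


def downsample(points, bucket_seconds):
    """points: iterable of (timestamp, value). Returns sorted [(bucket_start, avg, max)]."""
    buckets = defaultdict(list)
    for ts, value in points:
        buckets[ts - ts % bucket_seconds].append(value)  # align to the bucket start
    return [(start, sum(vals) / len(vals), max(vals))
            for start, vals in sorted(buckets.items())]
```

Keeping the maximum alongside the average preserves short spikes that averaging alone would hide.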
20. Collaborative Document Editor (like Google Docs)
Problem: Design a real-time collaborative editing system where multiple users edit simultaneously.
- Conflict resolution: Operational Transformation (OT) or CRDTs
- Real-time sync: WebSocket connections for low-latency updates
- Versioning: document revision history with undo/redo
- Presence: show cursors and selections of other users
- Key challenge: merging concurrent edits without data loss while preserving intent
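The core idea of Operational Transformation can be shown for the simplest case, two concurrent inserts. This sketch handles inserts only and breaks position ties with a hypothetical `site` ID; real editors also transform deletes and handle longer histories.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insert:
    pos: int   # index in the document where text is inserted
    text: str
    site: int  # ID of the originating client, used to break ties


def apply(doc, op):
    return doc[:op.pos] + op.text + doc[op.pos:]


def transform(op, against):
    """Adjust op so it can be applied after `against` has already been applied."""
    if against.pos < op.pos or (against.pos == op.pos and against.site < op.site):
        return Insert(op.pos + len(against.text), op.text, op.site)
    return op
```

If two clients start from the same document and each applies its own insert followed by the transformed remote insert, both end up with the same text, which is the convergence property the key challenge above refers to.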
Difficulty and Topic Mapping
| Difficulty | Questions | Key Concepts Tested |
|---|---|---|
| Easy | URL Shortener, Pastebin, Rate Limiter, KV Store | Hashing, caching, databases, basic scaling |
| Medium | Twitter, Instagram, WhatsApp, Notifications, Crawler, Autocomplete, Task Scheduler | Fan-out strategies, async processing, WebSockets, distributed systems |
| Hard | YouTube, Google Maps, Distributed Cache, Search Engine, Uber, Kafka, Amazon, Datadog, Google Docs | Streaming, geospatial, consensus, CRDTs, time-series, distributed transactions |
Strengthen your preparation with our Scalability Cheatsheet and Database Cheatsheet, and explore our interactive tools for hands-on practice with security and API concepts.
Frequently Asked Questions
How many of these should I practice before an interview?
Aim to thoroughly practice at least 8-10 problems across all difficulty levels. Focus on understanding the patterns rather than memorizing solutions. If you are short on time, prioritize: URL Shortener, Twitter Feed, WhatsApp, and one hard problem like YouTube or Uber. These cover the most commonly tested patterns.
Which questions are most commonly asked at FAANG companies?
URL Shortener (warm-up), Twitter/Instagram Feed (most popular), Chat/Messaging System, Notification System, and Distributed Cache appear most frequently. Hard questions like Search Engine and Google Maps are common for senior and staff-level interviews. Every company has its favorites, so research your target company on Glassdoor and Blind.
Should I design for a specific tech stack or keep it abstract?
Start abstract (describe components and their roles), then get specific when it adds value. Saying "a NoSQL database" is fine initially, but in the deep dive, saying "Cassandra for its write performance and eventual consistency model, which fits our use case" shows depth. Name specific technologies only when you can justify the choice.
How do I handle follow-up questions I was not prepared for?
Take a breath and reason from first principles. State what you know, identify the constraints, and work through it logically. It is perfectly acceptable to say: "I have not designed this specific component before, but I would approach it by..." Interviewers are evaluating your problem-solving process, not testing whether you have memorized every answer.