Design a URL Shortener (like TinyURL / Bitly)
A URL shortener converts long URLs into compact, shareable links. This is one of the most commonly asked system design interview questions because it touches on hashing, database design, caching, and scalability. In this guide, we will design a production-grade URL shortening service from scratch.
1. Requirements
Functional Requirements
- Given a long URL, generate a unique short URL.
- When a user accesses the short URL, redirect them to the original long URL.
- Users can optionally set a custom alias for their short URL.
- Short URLs expire after a configurable time-to-live (TTL); default is 5 years.
- Track analytics: click count, referrer, geolocation, device type.
Non-Functional Requirements
- High availability: The redirect service must have 99.99% uptime.
- Low latency: URL redirection should complete in under 50ms.
- Scalability: Handle billions of URLs and high read throughput.
- Non-guessable: Short URLs should not be easily predictable.
- The system is read-heavy (100:1 read-to-write ratio).
2. Capacity Estimation
Let us estimate the scale of the system with concrete numbers.
| Metric | Estimate |
|---|---|
| New URLs per month | 100 million |
| URL redirections per month | 10 billion (100:1 ratio) |
| Write QPS | 100M / (30 × 24 × 3600) ≈ 40 URLs/sec |
| Read QPS | 40 × 100 = 4,000 redirections/sec |
| Peak read QPS (5x) | 20,000 redirections/sec |
| Storage per URL | ~500 bytes (short URL + long URL + metadata) |
| Storage per year | 100M × 12 × 500B = 600 GB/year |
| Storage over 5 years | ~3 TB |
| Total URLs over 5 years | 6 billion |
Bandwidth: For reads, assuming an average redirect response of 500 bytes: 4,000 × 500B = 2 MB/s outbound. This is modest and easily handled.
Cache memory: Caching 20% of daily reads using the 80/20 rule: 4,000 × 86,400 × 0.2 × 500B ≈ 34 GB. A few Redis instances handle this comfortably. Learn more about caching strategies.
3. High-Level Design
The system consists of the following major components:
| Component | Responsibility |
|---|---|
| API Gateway | Rate limiting, authentication, request routing |
| URL Shortening Service | Generates short URLs, validates input, stores mappings |
| Redirect Service | Looks up short URL and returns HTTP redirect |
| Cache Layer (Redis) | Caches hot short-to-long URL mappings |
| Database | Persistent storage for URL mappings |
| Analytics Service | Tracks clicks, referrers, geolocation asynchronously |
| Load Balancer | Distributes traffic across service instances |
Write flow: Client → API Gateway → URL Shortening Service → Generate short key → Store in DB → Return short URL.
Read flow: Client → API Gateway → Redirect Service → Check Cache → (Cache miss? Query DB) → Return 301/302 redirect → Log analytics event asynchronously.
4. Detailed Component Design
4.1 Short URL Key Generation
We need to generate a unique 7-character key for each URL. With Base62 encoding (a-z, A-Z, 0-9), 7 characters give us 62^7 = 3.5 trillion possible combinations, which far exceeds our 6 billion URLs over 5 years.
Approach 1: Counter-Based with Base62 Encoding
Use an auto-incrementing counter and convert the number to Base62. This guarantees uniqueness with zero collisions.
CHARSET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
def base62_encode(num):
if num == 0:
return CHARSET[0]
result = []
while num > 0:
result.append(CHARSET[num % 62])
num //= 62
return ''.join(reversed(result))
def base62_decode(short_str):
num = 0
for char in short_str:
num = num * 62 + CHARSET.index(char)
return num
# Example: base62_encode(123456789) = "8M0kX"
To avoid sequential/predictable URLs, use a distributed ID generator like Twitter Snowflake or a dedicated Key Generation Service (KGS) that pre-generates keys in batches.
Approach 2: Hashing with Collision Handling
Hash the long URL using MD5 or SHA-256, then take the first 7 characters of the Base62-encoded hash. Handle collisions by appending a counter and re-hashing.
import hashlib
def generate_short_key(long_url, attempt=0):
url_to_hash = long_url if attempt == 0 else f"{long_url}_{attempt}"
hash_hex = hashlib.md5(url_to_hash.encode()).hexdigest()
hash_int = int(hash_hex, 16)
short_key = base62_encode(hash_int)[:7]
return short_key
# If collision detected in DB, retry with attempt+1
Approach 3: Pre-Generated Key Service (KGS)
A dedicated service pre-generates millions of unique 7-character keys and stores them in a database. When the shortening service needs a key, it fetches one from the unused pool and moves it to the used pool. This eliminates runtime collision checks entirely.
| Approach | Pros | Cons |
|---|---|---|
| Counter + Base62 | No collisions, simple | Sequential keys (predictable), single counter is a bottleneck |
| Hashing | Stateless, same URL = same key | Collisions require retries, extra DB lookups |
| Pre-Generated KGS | No collisions, non-sequential, fast | Extra infrastructure, key pool management |
Recommended: Use a Key Generation Service (KGS) for production. It decouples key generation from the shortening logic, avoids collisions, and scales independently.
4.2 Redirection: 301 vs 302
This is a critical design decision:
- 301 (Moved Permanently): Browser caches the redirect. Reduces server load but you lose analytics on subsequent visits because the browser never hits your server again.
- 302 (Found / Temporary Redirect): Browser always hits your server first. Better for analytics tracking but increases server load.
Recommendation: Use 302 if analytics matter (most commercial URL shorteners do this). Use 301 if reducing server load is the priority.
4.3 Analytics Tracking
For each click, capture analytics data asynchronously to avoid adding latency to the redirect:
// Redirect handler pseudocode
async function handleRedirect(shortKey) {
const longUrl = await cache.get(shortKey) || await db.get(shortKey);
if (!longUrl) return 404;
// Fire-and-forget analytics event
analyticsQueue.publish({
shortKey,
timestamp: Date.now(),
ip: request.ip,
userAgent: request.headers['user-agent'],
referrer: request.headers['referer']
});
return redirect(302, longUrl);
}
Use a message queue (Kafka or SQS) to decouple analytics processing from the redirect path.
5. Database Schema
For the primary URL mapping, we need a simple but efficient schema. Given the read-heavy workload and simple key-value access pattern, both SQL and NoSQL work. See SQL vs NoSQL for a deeper comparison.
CREATE TABLE urls (
short_key VARCHAR(7) PRIMARY KEY,
long_url VARCHAR(2048) NOT NULL,
user_id BIGINT,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
expires_at TIMESTAMP,
click_count BIGINT DEFAULT 0
);
CREATE INDEX idx_urls_user_id ON urls(user_id);
CREATE INDEX idx_urls_expires_at ON urls(expires_at);
CREATE TABLE analytics_events (
id BIGINT AUTO_INCREMENT PRIMARY KEY,
short_key VARCHAR(7) NOT NULL,
clicked_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
ip_address VARCHAR(45),
user_agent TEXT,
referrer VARCHAR(2048),
country VARCHAR(2),
device_type VARCHAR(20)
);
CREATE INDEX idx_analytics_short_key ON analytics_events(short_key);
CREATE INDEX idx_analytics_clicked_at ON analytics_events(clicked_at);
CREATE TABLE users (
id BIGINT AUTO_INCREMENT PRIMARY KEY,
email VARCHAR(255) UNIQUE NOT NULL,
api_key VARCHAR(64) UNIQUE NOT NULL,
tier ENUM('free', 'pro', 'enterprise') DEFAULT 'free',
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
For analytics at scale, consider using a columnar store like Apache Cassandra or ClickHouse instead of a relational database, as analytics events are append-heavy and queries are time-range based.
6. API Design
A clean API design is essential. Here are the core endpoints:
POST /api/v1/shorten
Headers: Authorization: Bearer <api_key>
Body: {
"long_url": "https://example.com/very/long/path",
"custom_alias": "my-link", // optional
"expires_in": 2592000 // optional, seconds
}
Response: {
"short_url": "https://short.ly/aB3x7Kp",
"long_url": "https://example.com/very/long/path",
"expires_at": "2025-02-01T00:00:00Z"
}
GET /{short_key}
Response: HTTP 302 redirect to long_url
GET /api/v1/analytics/{short_key}
Response: {
"total_clicks": 15234,
"clicks_by_day": [...],
"top_referrers": [...],
"top_countries": [...]
}
DELETE /api/v1/urls/{short_key}
Response: HTTP 204 No Content
7. Key Trade-offs
SQL vs NoSQL
| Factor | SQL (PostgreSQL/MySQL) | NoSQL (DynamoDB/Cassandra) |
|---|---|---|
| Access pattern | Works well for key-value lookup | Optimized for key-value lookup |
| Scalability | Requires manual sharding | Built-in horizontal scaling |
| Consistency | Strong consistency (ACID) | Eventual consistency (tunable) |
| Recommendation | Good starting choice | Better at extreme scale |
Verdict: Start with PostgreSQL for simplicity and ACID guarantees. Migrate to DynamoDB or Cassandra if you need to handle billions of URLs with minimal operational overhead. Understand the CAP theorem trade-offs when choosing.
Custom Alias Handling
When users request a custom alias, check for uniqueness against the database. Reserve certain words (admin, api, health, etc.) to avoid routing conflicts. Limit alias length to 3-16 characters and restrict to alphanumeric characters plus hyphens.
8. Scaling Considerations
8.1 Caching with Redis
Since the system is read-heavy (100:1 ratio), caching is critical. Use Redis as a read-through cache:
async function resolve(shortKey):
# Check cache first
longUrl = redis.get(shortKey)
if longUrl:
return longUrl
# Cache miss - query database
longUrl = db.query("SELECT long_url FROM urls WHERE short_key = ?", shortKey)
if longUrl:
redis.setex(shortKey, 86400, longUrl) # Cache for 24 hours
return longUrl
With the 80/20 rule, caching the top 20% of URLs serves 80% of traffic. At our scale, ~34 GB of cache memory handles this, split across a few Redis nodes.
8.2 Database Sharding
Shard the database by the short_key using consistent hashing or range-based sharding. Since our primary access pattern is a point lookup by short_key, hash-based sharding distributes load evenly.
shard_id = hash(short_key) % num_shards
8.3 Rate Limiting
Protect the write endpoint from abuse. Apply rate limits per API key:
- Free tier: 100 URLs/hour
- Pro tier: 10,000 URLs/hour
- Enterprise: Custom limits
Implement using a token bucket or sliding window algorithm in Redis.
8.4 Cleanup Service
Run a background job to delete expired URLs. Use a lazy deletion approach: check expiry on read (skip expired URLs) and run a periodic batch job to purge from the database and cache.
8.5 Global Distribution
Deploy redirect servers in multiple regions behind a global load balancer (e.g., AWS Global Accelerator or Cloudflare). Cache URL mappings at edge locations for sub-10ms redirects worldwide.
9. System Design Diagram Summary
The complete data flow can be described as follows:
| Step | Write Path | Read Path |
|---|---|---|
| 1 | Client sends POST with long URL | Client sends GET /{shortKey} |
| 2 | API Gateway validates and rate-limits | Load balancer routes to redirect server |
| 3 | Shortening service fetches key from KGS | Redirect service checks Redis cache |
| 4 | Store mapping in database | On cache miss, query database |
| 5 | Return short URL to client | Return 302 redirect, publish analytics event |
Use swehelper.com tools to practice estimating capacity and drawing system architecture diagrams.
10. Frequently Asked Questions
Q1: How do you handle hash collisions?
If using the hashing approach, after generating a short key, check the database for existence. If a collision is found, append a counter to the original URL and re-hash. With Base62 and 7 characters giving 3.5 trillion combinations, collisions are rare. The KGS approach eliminates collisions entirely by pre-generating unique keys.
Q2: Should we use 301 or 302 redirects?
Use 302 (temporary) if you need analytics tracking, since the browser will always hit your server. Use 301 (permanent) if you want to reduce server load and the browser should cache the redirect. Most commercial shorteners (Bitly, TinyURL) use 302 or 307 to preserve analytics.
Q3: How do you prevent abuse of the service?
Implement multiple layers: (1) Rate limiting per API key and per IP address. (2) Validate that the long URL is reachable and not on a blocklist. (3) Use CAPTCHAs for anonymous users. (4) Monitor for spam patterns and auto-block suspicious accounts. (5) Require email verification for API key issuance.
Q4: How does the Key Generation Service (KGS) avoid giving the same key to two servers?
The KGS pre-generates keys and loads batches into an in-memory pool for each application server. Once a batch is assigned, those keys are marked as used in the database. Each server gets an exclusive batch, so no two servers share keys. If a server dies before using all keys in its batch, those keys are simply lost (acceptable given 3.5 trillion possible keys).
Q5: What happens if the same long URL is shortened twice?
Two approaches: (1) Generate a new short URL each time, which is simpler and allows per-user analytics. (2) Check if the long URL already exists and return the existing short URL, which saves storage but requires an index on long_url. Most production systems choose option 1 for simplicity and per-link analytics isolation.