Skip to main content
🏢Case Studies

Design a URL Shortener (like TinyURL / Bitly)

A URL shortener converts long URLs into compact, shareable links. This is one of the most commonly asked system design interview questions because it touch...

📖 10 min read

Design a URL Shortener (like TinyURL / Bitly)

A URL shortener converts long URLs into compact, shareable links. This is one of the most commonly asked system design interview questions because it touches on hashing, database design, caching, and scalability. In this guide, we will design a production-grade URL shortening service from scratch.

1. Requirements

Functional Requirements

  • Given a long URL, generate a unique short URL.
  • When a user accesses the short URL, redirect them to the original long URL.
  • Users can optionally set a custom alias for their short URL.
  • Short URLs expire after a configurable time-to-live (TTL); default is 5 years.
  • Track analytics: click count, referrer, geolocation, device type.

Non-Functional Requirements

  • High availability: The redirect service must have 99.99% uptime.
  • Low latency: URL redirection should complete in under 50ms.
  • Scalability: Handle billions of URLs and high read throughput.
  • Non-guessable: Short URLs should not be easily predictable.
  • The system is read-heavy (100:1 read-to-write ratio).

2. Capacity Estimation

Let us estimate the scale of the system with concrete numbers.

Metric Estimate
New URLs per month 100 million
URL redirections per month 10 billion (100:1 ratio)
Write QPS 100M / (30 × 24 × 3600) ≈ 40 URLs/sec
Read QPS 40 × 100 = 4,000 redirections/sec
Peak read QPS (5x) 20,000 redirections/sec
Storage per URL ~500 bytes (short URL + long URL + metadata)
Storage per year 100M × 12 × 500B = 600 GB/year
Storage over 5 years ~3 TB
Total URLs over 5 years 6 billion

Bandwidth: For reads, assuming an average redirect response of 500 bytes: 4,000 × 500B = 2 MB/s outbound. This is modest and easily handled.

Cache memory: Caching 20% of daily reads using the 80/20 rule: 4,000 × 86,400 × 0.2 × 500B ≈ 34 GB. A few Redis instances handle this comfortably. Learn more about caching strategies.

3. High-Level Design

The system consists of the following major components:

Component Responsibility
API Gateway Rate limiting, authentication, request routing
URL Shortening Service Generates short URLs, validates input, stores mappings
Redirect Service Looks up short URL and returns HTTP redirect
Cache Layer (Redis) Caches hot short-to-long URL mappings
Database Persistent storage for URL mappings
Analytics Service Tracks clicks, referrers, geolocation asynchronously
Load Balancer Distributes traffic across service instances

Write flow: Client → API Gateway → URL Shortening Service → Generate short key → Store in DB → Return short URL.

Read flow: Client → API Gateway → Redirect Service → Check Cache → (Cache miss? Query DB) → Return 301/302 redirect → Log analytics event asynchronously.

4. Detailed Component Design

4.1 Short URL Key Generation

We need to generate a unique 7-character key for each URL. With Base62 encoding (a-z, A-Z, 0-9), 7 characters give us 62^7 = 3.5 trillion possible combinations, which far exceeds our 6 billion URLs over 5 years.

Approach 1: Counter-Based with Base62 Encoding

Use an auto-incrementing counter and convert the number to Base62. This guarantees uniqueness with zero collisions.

CHARSET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

def base62_encode(num):
    if num == 0:
        return CHARSET[0]
    result = []
    while num > 0:
        result.append(CHARSET[num % 62])
        num //= 62
    return ''.join(reversed(result))

def base62_decode(short_str):
    num = 0
    for char in short_str:
        num = num * 62 + CHARSET.index(char)
    return num

# Example: base62_encode(123456789) = "8M0kX"

To avoid sequential/predictable URLs, use a distributed ID generator like Twitter Snowflake or a dedicated Key Generation Service (KGS) that pre-generates keys in batches.

Approach 2: Hashing with Collision Handling

Hash the long URL using MD5 or SHA-256, then take the first 7 characters of the Base62-encoded hash. Handle collisions by appending a counter and re-hashing.

import hashlib

def generate_short_key(long_url, attempt=0):
    url_to_hash = long_url if attempt == 0 else f"{long_url}_{attempt}"
    hash_hex = hashlib.md5(url_to_hash.encode()).hexdigest()
    hash_int = int(hash_hex, 16)
    short_key = base62_encode(hash_int)[:7]
    return short_key

# If collision detected in DB, retry with attempt+1

Approach 3: Pre-Generated Key Service (KGS)

A dedicated service pre-generates millions of unique 7-character keys and stores them in a database. When the shortening service needs a key, it fetches one from the unused pool and moves it to the used pool. This eliminates runtime collision checks entirely.

Approach Pros Cons
Counter + Base62 No collisions, simple Sequential keys (predictable), single counter is a bottleneck
Hashing Stateless, same URL = same key Collisions require retries, extra DB lookups
Pre-Generated KGS No collisions, non-sequential, fast Extra infrastructure, key pool management

Recommended: Use a Key Generation Service (KGS) for production. It decouples key generation from the shortening logic, avoids collisions, and scales independently.

4.2 Redirection: 301 vs 302

This is a critical design decision:

  • 301 (Moved Permanently): Browser caches the redirect. Reduces server load but you lose analytics on subsequent visits because the browser never hits your server again.
  • 302 (Found / Temporary Redirect): Browser always hits your server first. Better for analytics tracking but increases server load.

Recommendation: Use 302 if analytics matter (most commercial URL shorteners do this). Use 301 if reducing server load is the priority.

4.3 Analytics Tracking

For each click, capture analytics data asynchronously to avoid adding latency to the redirect:

// Redirect handler pseudocode
async function handleRedirect(shortKey) {
    const longUrl = await cache.get(shortKey) || await db.get(shortKey);
    if (!longUrl) return 404;

    // Fire-and-forget analytics event
    analyticsQueue.publish({
        shortKey,
        timestamp: Date.now(),
        ip: request.ip,
        userAgent: request.headers['user-agent'],
        referrer: request.headers['referer']
    });

    return redirect(302, longUrl);
}

Use a message queue (Kafka or SQS) to decouple analytics processing from the redirect path.

5. Database Schema

For the primary URL mapping, we need a simple but efficient schema. Given the read-heavy workload and simple key-value access pattern, both SQL and NoSQL work. See SQL vs NoSQL for a deeper comparison.

CREATE TABLE urls (
    short_key   VARCHAR(7) PRIMARY KEY,
    long_url    VARCHAR(2048) NOT NULL,
    user_id     BIGINT,
    created_at  TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    expires_at  TIMESTAMP,
    click_count BIGINT DEFAULT 0
);

CREATE INDEX idx_urls_user_id ON urls(user_id);
CREATE INDEX idx_urls_expires_at ON urls(expires_at);

CREATE TABLE analytics_events (
    id          BIGINT AUTO_INCREMENT PRIMARY KEY,
    short_key   VARCHAR(7) NOT NULL,
    clicked_at  TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    ip_address  VARCHAR(45),
    user_agent  TEXT,
    referrer    VARCHAR(2048),
    country     VARCHAR(2),
    device_type VARCHAR(20)
);

CREATE INDEX idx_analytics_short_key ON analytics_events(short_key);
CREATE INDEX idx_analytics_clicked_at ON analytics_events(clicked_at);

CREATE TABLE users (
    id          BIGINT AUTO_INCREMENT PRIMARY KEY,
    email       VARCHAR(255) UNIQUE NOT NULL,
    api_key     VARCHAR(64) UNIQUE NOT NULL,
    tier        ENUM('free', 'pro', 'enterprise') DEFAULT 'free',
    created_at  TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

For analytics at scale, consider using a columnar store like Apache Cassandra or ClickHouse instead of a relational database, as analytics events are append-heavy and queries are time-range based.

6. API Design

A clean API design is essential. Here are the core endpoints:

POST /api/v1/shorten
Headers: Authorization: Bearer <api_key>
Body: {
    "long_url": "https://example.com/very/long/path",
    "custom_alias": "my-link",     // optional
    "expires_in": 2592000          // optional, seconds
}
Response: {
    "short_url": "https://short.ly/aB3x7Kp",
    "long_url": "https://example.com/very/long/path",
    "expires_at": "2025-02-01T00:00:00Z"
}

GET /{short_key}
Response: HTTP 302 redirect to long_url

GET /api/v1/analytics/{short_key}
Response: {
    "total_clicks": 15234,
    "clicks_by_day": [...],
    "top_referrers": [...],
    "top_countries": [...]
}

DELETE /api/v1/urls/{short_key}
Response: HTTP 204 No Content

7. Key Trade-offs

SQL vs NoSQL

Factor SQL (PostgreSQL/MySQL) NoSQL (DynamoDB/Cassandra)
Access pattern Works well for key-value lookup Optimized for key-value lookup
Scalability Requires manual sharding Built-in horizontal scaling
Consistency Strong consistency (ACID) Eventual consistency (tunable)
Recommendation Good starting choice Better at extreme scale

Verdict: Start with PostgreSQL for simplicity and ACID guarantees. Migrate to DynamoDB or Cassandra if you need to handle billions of URLs with minimal operational overhead. Understand the CAP theorem trade-offs when choosing.

Custom Alias Handling

When users request a custom alias, check for uniqueness against the database. Reserve certain words (admin, api, health, etc.) to avoid routing conflicts. Limit alias length to 3-16 characters and restrict to alphanumeric characters plus hyphens.

8. Scaling Considerations

8.1 Caching with Redis

Since the system is read-heavy (100:1 ratio), caching is critical. Use Redis as a read-through cache:

async function resolve(shortKey):
    # Check cache first
    longUrl = redis.get(shortKey)
    if longUrl:
        return longUrl

    # Cache miss - query database
    longUrl = db.query("SELECT long_url FROM urls WHERE short_key = ?", shortKey)
    if longUrl:
        redis.setex(shortKey, 86400, longUrl)  # Cache for 24 hours
    return longUrl

With the 80/20 rule, caching the top 20% of URLs serves 80% of traffic. At our scale, ~34 GB of cache memory handles this, split across a few Redis nodes.

8.2 Database Sharding

Shard the database by the short_key using consistent hashing or range-based sharding. Since our primary access pattern is a point lookup by short_key, hash-based sharding distributes load evenly.

shard_id = hash(short_key) % num_shards

8.3 Rate Limiting

Protect the write endpoint from abuse. Apply rate limits per API key:

  • Free tier: 100 URLs/hour
  • Pro tier: 10,000 URLs/hour
  • Enterprise: Custom limits

Implement using a token bucket or sliding window algorithm in Redis.

8.4 Cleanup Service

Run a background job to delete expired URLs. Use a lazy deletion approach: check expiry on read (skip expired URLs) and run a periodic batch job to purge from the database and cache.

8.5 Global Distribution

Deploy redirect servers in multiple regions behind a global load balancer (e.g., AWS Global Accelerator or Cloudflare). Cache URL mappings at edge locations for sub-10ms redirects worldwide.

9. System Design Diagram Summary

The complete data flow can be described as follows:

Step Write Path Read Path
1 Client sends POST with long URL Client sends GET /{shortKey}
2 API Gateway validates and rate-limits Load balancer routes to redirect server
3 Shortening service fetches key from KGS Redirect service checks Redis cache
4 Store mapping in database On cache miss, query database
5 Return short URL to client Return 302 redirect, publish analytics event

Use swehelper.com tools to practice estimating capacity and drawing system architecture diagrams.

10. Frequently Asked Questions

Q1: How do you handle hash collisions?

If using the hashing approach, after generating a short key, check the database for existence. If a collision is found, append a counter to the original URL and re-hash. With Base62 and 7 characters giving 3.5 trillion combinations, collisions are rare. The KGS approach eliminates collisions entirely by pre-generating unique keys.

Q2: Should we use 301 or 302 redirects?

Use 302 (temporary) if you need analytics tracking, since the browser will always hit your server. Use 301 (permanent) if you want to reduce server load and the browser should cache the redirect. Most commercial shorteners (Bitly, TinyURL) use 302 or 307 to preserve analytics.

Q3: How do you prevent abuse of the service?

Implement multiple layers: (1) Rate limiting per API key and per IP address. (2) Validate that the long URL is reachable and not on a blocklist. (3) Use CAPTCHAs for anonymous users. (4) Monitor for spam patterns and auto-block suspicious accounts. (5) Require email verification for API key issuance.

Q4: How does the Key Generation Service (KGS) avoid giving the same key to two servers?

The KGS pre-generates keys and loads batches into an in-memory pool for each application server. Once a batch is assigned, those keys are marked as used in the database. Each server gets an exclusive batch, so no two servers share keys. If a server dies before using all keys in its batch, those keys are simply lost (acceptable given 3.5 trillion possible keys).

Q5: What happens if the same long URL is shortened twice?

Two approaches: (1) Generate a new short URL each time, which is simpler and allows per-user analytics. (2) Check if the long URL already exists and return the existing short URL, which saves storage but requires an index on long_url. Most production systems choose option 1 for simplicity and per-link analytics isolation.

Related Articles