Design Instagram: A Photo-Sharing Social Platform

Instagram serves over 2 billion monthly active users, handling photo uploads, news feed generation, stories, and social interactions at massive scale. This system design explores how to build the core features of Instagram, covering storage, feed generation, CDN integration, and the follower graph. It is a classic interview question that tests your understanding of media storage, fan-out strategies, and social graph management.

1. Requirements

Functional Requirements

  • Users can upload photos and short videos with captions and hashtags.
  • Users can follow other users (asymmetric relationship).
  • Generate a personalized news feed showing posts from followed users, ranked by relevance.
  • Like and comment on posts.
  • Explore/discover page with trending and recommended content.
  • Search for users, hashtags, and locations.
  • User profiles displaying their posts, follower/following counts.

Non-Functional Requirements

  • High availability: 99.99% uptime. Feed generation and media serving must be resilient.
  • Low latency: News feed loads in under 500ms. Photo uploads complete in under 5 seconds.
  • Scalability: Support 500M+ DAU with 100M+ photo uploads per day.
  • Durability: Uploaded photos must never be lost.
  • Eventual consistency: Feed updates can tolerate seconds of delay.
  • Read-heavy system: feed reads vastly outnumber writes (posts).

2. Capacity Estimation

| Metric | Estimate |
|---|---|
| Daily Active Users | 500 million |
| Photo uploads per day | 100 million |
| Average photo size (original) | 2 MB |
| Resized versions per photo | 4 sizes (thumbnail, small, medium, original), budgeted at ≈ 4 MB total (the per-version averages in Section 4.2 sum to about 2.2 MB, so this leaves headroom) |
| Storage per day (photos) | 100M × 4 MB = 400 TB/day |
| Storage per year (photos) | 400 TB × 365 ≈ 146 PB/year |
| Feed reads per day | 500M users × 10 feed opens = 5 billion |
| Feed read QPS | 5B / 86,400 ≈ 58,000/sec |
| Write QPS (posts) | 100M / 86,400 ≈ 1,160/sec |
| Upload bandwidth | 1,160 × 2 MB ≈ 2.3 GB/sec inbound |

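CDN bandwidth: Feed reads outnumber uploads by roughly 50 to 1 (about 58,000 vs. 1,160 per second), and each feed read loads at least one image, so outbound image traffic is expected to far exceed the 2.3 GB/sec of inbound uploads. A CDN is essential to serve that traffic from edge locations.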
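
The arithmetic behind these estimates can be reproduced with a short script. This is a minimal sketch that uses only the inputs from the table; the variable names are illustrative, and decimal units (1 MB = 10^6 bytes) are assumed.

```javascript
// Back-of-envelope capacity math using the inputs from the estimation table.
const SECONDS_PER_DAY = 86400;
const MB = 1e6, GB = 1e9, TB = 1e12, PB = 1e15; // decimal units (assumption)

const uploadsPerDay = 100e6;        // 100M photo uploads per day
const storedBytesPerPhoto = 4 * MB; // all resized versions combined
const originalBytes = 2 * MB;       // average original size
const dailyActiveUsers = 500e6;     // 500M DAU
const feedOpensPerUser = 10;

const storagePerDayTB = (uploadsPerDay * storedBytesPerPhoto) / TB;
const storagePerYearPB = (uploadsPerDay * storedBytesPerPhoto * 365) / PB;
const feedReadQps = (dailyActiveUsers * feedOpensPerUser) / SECONDS_PER_DAY;
const writeQps = uploadsPerDay / SECONDS_PER_DAY;
const uploadGBPerSec = (writeQps * originalBytes) / GB;
const readWriteRatio = feedReadQps / writeQps;

console.log({
    storagePerDayTB,
    storagePerYearPB,
    feedReadQps: Math.round(feedReadQps),
    writeQps: Math.round(writeQps),
    uploadGBPerSec: uploadGBPerSec.toFixed(1),
    readWriteRatio: Math.round(readWriteRatio),
});
```

Keeping the inputs as named constants makes it easy to rerun the estimate when an interviewer changes an assumption, such as the number of feed opens per user.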

3. High-Level Design

| Component | Responsibility |
|---|---|
| API Gateway | Authentication, rate limiting, request routing |
| Post Service | Handles photo upload, metadata creation |
| Media Processing Pipeline | Resize, compress, generate thumbnails |
| Object Storage (S3) | Stores photo files durably |
| CDN | Serves photos from edge locations globally |
| Feed Service | Generates and serves personalized news feeds |
| Social Graph Service | Manages follow/unfollow relationships |
| Search Service | Indexes users, hashtags, locations for search |
| Notification Service | Sends push notifications for likes, comments, follows |
| Cache Layer | Redis caches for feed, user profiles, hot posts |

4. Detailed Component Design

4.1 Photo Upload Pipeline

The upload flow is a multi-step asynchronous pipeline:

  1. Client uploads photo to a pre-signed S3 URL (direct-to-storage upload bypasses application servers).
  2. Client sends metadata (caption, tags, location) to the Post Service via REST API.
  3. Post Service creates a post record in the database with status "processing."
  4. Media Processing Pipeline (triggered via message queue) resizes the photo into multiple dimensions, applies compression, strips EXIF data, and stores all versions in S3.
  5. Post status updated to "published" once processing completes.
  6. Fan-out Service is triggered to distribute the post to followers' feeds.

// Create-post API (metadata only; the photo itself was already uploaded to S3 in step 1)
POST /api/v1/posts
Headers: Authorization: Bearer <token>
         Content-Type: application/json
Body (JSON): {
    "caption": "Beautiful sunset",
    "hashtags": ["sunset", "nature"],
    "location": {"lat": 34.05, "lng": -118.24, "name": "Los Angeles"},
    "media_key": "s3://uploads/user123/photo_abc.jpg"
}
Response: {
    "post_id": "post_789",
    "status": "processing",
    "created_at": "2025-01-25T10:00:00Z"
}
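
Steps 2 through 6 of the upload flow can be sketched as a small state machine. This is a minimal in-memory illustration, not a production design: the `posts` map and `mediaQueue` array are hypothetical stand-ins for the database and the message queue, and the function names are assumptions.

```javascript
// In-memory stand-ins for the database and the message queue (hypothetical).
const posts = new Map();
const mediaQueue = [];
let nextId = 1;

// Steps 2-3: store metadata with status "processing" and enqueue media work.
function createPost(userId, metadata) {
    const post = {
        post_id: `post_${nextId++}`,
        user_id: userId,
        ...metadata,
        status: "processing",
        created_at: new Date().toISOString(),
    };
    posts.set(post.post_id, post);
    mediaQueue.push({ post_id: post.post_id, media_key: metadata.media_key });
    return post;
}

// Steps 4-6: a worker drains the queue, marks each post as published,
// then hands it to the fan-out callback.
function runMediaWorker(onPublished) {
    while (mediaQueue.length > 0) {
        const job = mediaQueue.shift();
        const post = posts.get(job.post_id);
        // Real processing (resize, compress, strip EXIF) would happen here.
        post.variants = ["thumbnail", "small", "medium", "original"];
        post.status = "published";
        onPublished(post);
    }
}
```

Because the API returns as soon as the record is created, the client sees `"processing"` immediately and the slow media work happens off the request path.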

4.2 Media Storage Strategy

Each uploaded photo is stored in multiple resolutions in object storage:

| Version | Dimensions | Avg Size | Use Case |
|---|---|---|---|
| Thumbnail | 150x150 | 15 KB | Grid view, notifications |
| Small | 320x320 | 50 KB | Low-bandwidth feed |
| Medium | 640x640 | 150 KB | Standard feed view |
| Original | Up to 1080x1080 | 2 MB | Full-screen view |

The CDN URL structure encodes the size variant: cdn.instagram.com/photos/{post_id}/{size}.jpg. The client requests the appropriate size based on screen resolution and network conditions.
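
A client-side size selection might look like the following. This is a sketch under stated assumptions: the widths come from the table above, while the `https://` scheme, the function names, and the rule of stepping down one size on a slow network are illustrative choices rather than documented behavior.

```javascript
// Size variants from the media storage table, smallest first.
const VARIANTS = [
    { name: "thumbnail", width: 150 },
    { name: "small", width: 320 },
    { name: "medium", width: 640 },
    { name: "original", width: 1080 },
];

// Pick the smallest variant that covers the display width. On a slow
// connection, step down one size to save bandwidth (assumed policy).
function pickVariant(displayWidthPx, slowNetwork = false) {
    let index = VARIANTS.findIndex((v) => v.width >= displayWidthPx);
    if (index === -1) index = VARIANTS.length - 1; // wider than the largest: use original
    if (slowNetwork && index > 0) index -= 1;
    return VARIANTS[index].name;
}

// Build the CDN URL following the {post_id}/{size}.jpg pattern.
function photoUrl(postId, displayWidthPx, slowNetwork = false) {
    const size = pickVariant(displayWidthPx, slowNetwork);
    return `https://cdn.instagram.com/photos/${postId}/${size}.jpg`;
}
```

Encoding the size in the path keeps each variant a distinct, immutable cache key at the CDN.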

4.3 News Feed Generation

This is the most complex part of the system. Two fundamental approaches exist:

Fan-out on Write (Push Model)

When a user publishes a post, immediately push the post ID into every follower's pre-computed feed cache.

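async function fanOutOnWrite(post) {
    const followers = await socialGraph.getFollowers(post.user_id);
    // Sorted-set scores must be numeric, so convert the ISO timestamp to epoch milliseconds
    const score = Date.parse(post.created_at);

    // Batch the commands in a pipeline (an ioredis-style client is assumed)
    // to avoid one network round trip per follower
    const pipeline = redis.pipeline();
    for (const followerId of followers) {
        // Push post_id to each follower's feed in a Redis sorted set
        // Score = timestamp for chronological ordering
        pipeline.zadd(`feed:${followerId}`, score, post.post_id);
        // Trim feed to the newest 1000 posts
        pipeline.zremrangebyrank(`feed:${followerId}`, 0, -1001);
    }
    await pipeline.exec();
}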

Pros: Fast feed reads (pre-computed). Cons: Slow writes for users with millions of followers (celebrity problem). High memory usage for pre-computed feeds.

Fan-out on Read (Pull Model)

When a user requests their feed, fetch recent posts from every user they follow and merge them at read time.

Pros: No write amplification. Cons: Slow feed reads; requires querying many users' post lists and merging.
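
The pull model can be sketched in memory as follows. This is an illustrative version only: `postsByUser` is a hypothetical stand-in for per-user post lists (assumed sorted newest first, with numeric `created_at` timestamps), and a real system would query a sharded store instead.

```javascript
// postsByUser: Map of user_id -> array of { post_id, created_at },
// each array sorted newest first (assumption).
function pullFeed(followedUserIds, postsByUser, limit = 20) {
    const candidates = [];
    for (const userId of followedUserIds) {
        const recent = postsByUser.get(userId) ?? [];
        // Only the newest `limit` posts per author can reach the final page.
        candidates.push(...recent.slice(0, limit));
    }
    // Merge all candidates chronologically, newest first.
    candidates.sort((a, b) => b.created_at - a.created_at);
    return candidates.slice(0, limit).map((p) => p.post_id);
}
```

The cost is visible in the loop: one lookup per followed user on every feed request, which is why this model reads slowly for users who follow many accounts.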

Large platforms such as Instagram and Twitter are commonly described as using a hybrid: fan-out on write for normal users (for example, those with fewer than 10K followers) and fan-out on read for celebrities (who have millions of followers). When generating a feed, merge the pre-computed feed with fresh posts from followed celebrities.

const PAGE_SIZE = 20;

// lastFeedRefresh: timestamp of the user's previous feed load, supplied by the caller
async function getFeed(userId, page, lastFeedRefresh) {
    // 1. Get pre-computed feed (from push for normal users), newest first
    const start = page * PAGE_SIZE;
    const feedPostIds = await redis.zrevrange(`feed:${userId}`, start, start + PAGE_SIZE - 1);

    // 2. Get followed celebrities
    const celebrities = await socialGraph.getFollowedCelebrities(userId);

    // 3. Fetch recent posts from celebrities (pull)
    const celebrityPosts = await postService.getRecentPosts(celebrities, { since: lastFeedRefresh });

    // 4. Merge, rank, and return
    const mergedFeed = rankAndMerge(feedPostIds, celebrityPosts);
    return hydratePosts(mergedFeed);  // Fetch full post data
}

4.4 Follower System (Social Graph)

The follow relationship is asymmetric (A follows B does not mean B follows A). Store in a graph-like structure:

CREATE TABLE follows (
    follower_id BIGINT NOT NULL,
    followee_id BIGINT NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (follower_id, followee_id)
);

CREATE INDEX idx_follows_followee ON follows(followee_id, follower_id);

The primary key on (follower_id, followee_id) and the secondary index on (followee_id, follower_id) allow efficient queries in both directions: "who does user X follow?" and "who follows user X?" For users with millions of followers, keep the follower count in a separate counter cache (Redis) to avoid expensive COUNT queries. Shard the table by user ID for scalability.
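
The two lookup directions and the counter cache can be illustrated with an in-memory model. This is a sketch, not the storage layer itself: the `FollowGraph` class and its method names are hypothetical, with two maps playing the role of the two indexes and a third standing in for the Redis counter.

```javascript
class FollowGraph {
    constructor() {
        this.following = new Map();     // follower_id -> Set of followee_ids
        this.followers = new Map();     // followee_id -> Set of follower_ids
        this.followerCount = new Map(); // followee_id -> cached count
    }

    // Returns false if the edge already exists (mirrors the primary key).
    follow(followerId, followeeId) {
        const out = this._setFor(this.following, followerId);
        if (out.has(followeeId)) return false;
        out.add(followeeId);
        this._setFor(this.followers, followeeId).add(followerId);
        this.followerCount.set(followeeId, this.countFollowers(followeeId) + 1);
        return true;
    }

    unfollow(followerId, followeeId) {
        const out = this.following.get(followerId);
        if (!out || !out.delete(followeeId)) return false;
        this.followers.get(followeeId).delete(followerId);
        this.followerCount.set(followeeId, this.countFollowers(followeeId) - 1);
        return true;
    }

    getFollowing(userId) { return [...(this.following.get(userId) ?? [])]; }
    getFollowers(userId) { return [...(this.followers.get(userId) ?? [])]; }
    // Reads the cached counter instead of counting the set.
    countFollowers(userId) { return this.followerCount.get(userId) ?? 0; }

    _setFor(map, key) {
        if (!map.has(key)) map.set(key, new Set());
        return map.get(key);
    }
}
```

Maintaining both maps on every write is the in-memory analogue of paying for a second index: writes cost more so that both read directions stay cheap.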

4.5 Explore and Search

The Explore page shows trending and personalized content from users you do not follow. It relies on:

  • Engagement signals: Posts with high like velocity, comment count, and share rate.
  • Content-based filtering: Analyze hashtags, captions, and image features to match user interests.
  • Collaborative filtering: "Users similar to you liked these posts."
  • Search: Elasticsearch indexes usernames, hashtags, captions, and location names for full-text search.

5. Database Schema

CREATE TABLE users (
    id BIGINT PRIMARY KEY,
    username VARCHAR(30) UNIQUE NOT NULL,
    email VARCHAR(255) UNIQUE NOT NULL,
    display_name VARCHAR(100),
    bio TEXT,
    profile_photo_url TEXT,
    follower_count INT DEFAULT 0,
    following_count INT DEFAULT 0,
    post_count INT DEFAULT 0,
    is_verified BOOLEAN DEFAULT FALSE,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

CREATE TABLE posts (
    id BIGINT PRIMARY KEY,
    user_id BIGINT NOT NULL REFERENCES users(id),
    caption TEXT,
    location_name VARCHAR(255),
    location_lat DECIMAL(9,6),
    location_lng DECIMAL(9,6),
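    -- CHECK constraints replace inline ENUM columns, which PostgreSQL does not support
    media_type VARCHAR(16) CHECK (media_type IN ('photo', 'video', 'carousel')),
    like_count INT DEFAULT 0,
    comment_count INT DEFAULT 0,
    status VARCHAR(16) DEFAULT 'processing' CHECK (status IN ('processing', 'published', 'deleted')),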
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

CREATE INDEX idx_posts_user_id ON posts(user_id, created_at DESC);

CREATE TABLE post_media (
    id BIGINT PRIMARY KEY,
    post_id BIGINT NOT NULL REFERENCES posts(id),
    media_url_template TEXT NOT NULL,
    width INT,
    height INT,
    duration_seconds INT,
    sort_order INT DEFAULT 0
);

CREATE TABLE likes (
    user_id BIGINT NOT NULL,
    post_id BIGINT NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (user_id, post_id)
);

CREATE INDEX idx_likes_post ON likes(post_id, created_at DESC);

CREATE TABLE comments (
    id BIGINT PRIMARY KEY,
    post_id BIGINT NOT NULL,
    user_id BIGINT NOT NULL,
    parent_comment_id BIGINT,
    content TEXT NOT NULL,
    like_count INT DEFAULT 0,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

CREATE INDEX idx_comments_post ON comments(post_id, created_at);

6. Key Trade-offs

| Decision | Option A | Option B | Instagram's Choice |
|---|---|---|---|
| Feed generation | Fan-out on write | Fan-out on read | Hybrid (push for normal, pull for celebrities) |
| Feed ranking | Chronological | ML-based relevance | ML-ranked with chronological option |
| Photo storage | Own infrastructure | Cloud object storage | S3 (originally) → custom (at scale) |
| Database | PostgreSQL | Cassandra | PostgreSQL (sharded) + Cassandra for feeds |
| Like counts | Real-time COUNT query | Denormalized counter | Denormalized with async counter update |
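
The "denormalized with async counter update" option for like counts can be sketched as a write buffer that is flushed periodically. This is one possible shape under stated assumptions: the `LikeCounter` class is hypothetical, and its `stored` map stands in for the `like_count` column on the posts table.

```javascript
class LikeCounter {
    constructor() {
        this.pending = new Map(); // post_id -> buffered delta since last flush
        this.stored = new Map();  // stand-in for posts.like_count
    }

    recordLike(postId) { this._bump(postId, 1); }
    recordUnlike(postId) { this._bump(postId, -1); }

    _bump(postId, delta) {
        this.pending.set(postId, (this.pending.get(postId) ?? 0) + delta);
    }

    // Called on a timer: applies one write per post instead of one per like.
    // Returns the number of counter writes performed.
    flush() {
        let writes = 0;
        for (const [postId, delta] of this.pending) {
            if (delta !== 0) {
                this.stored.set(postId, this.getCount(postId) + delta);
                writes += 1;
            }
        }
        this.pending.clear();
        return writes;
    }

    getCount(postId) { return this.stored.get(postId) ?? 0; }
}
```

The displayed count lags by at most one flush interval, which fits the eventual-consistency requirement while sparing a hot post from thousands of row updates per second.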

7. Scaling Considerations

7.1 CDN for Media Delivery

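A CDN is non-negotiable. Photos are cached at CDN edge locations worldwide, and because photos are immutable, Cache-Control headers with long TTLs keep origin fetches to a minimum. The estimate of 5 billion feed reads per day in Section 2 implies at least several billion image requests per day, nearly all of which should be served from the edge.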
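
As an illustration of long-TTL caching for immutable photos, the origin could attach headers like the ones below. This is a sketch: the one-year `max-age` is an assumed value and the function name is hypothetical, while `public`, `max-age`, and `immutable` are standard Cache-Control directives.

```javascript
// Response headers for an immutable photo variant served by the origin.
function photoResponseHeaders() {
    const oneYearSeconds = 365 * 24 * 60 * 60; // assumed TTL
    return {
        "Content-Type": "image/jpeg",
        // "immutable" tells caches the body will never change during max-age.
        "Cache-Control": `public, max-age=${oneYearSeconds}, immutable`,
    };
}
```

This only works because each URL maps to exactly one version of the bytes; an edited photo would need a new URL rather than an overwrite.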

7.2 Database Sharding Strategy

Shard the users and posts tables by user_id using consistent hashing. This co-locates a user's posts with their profile for efficient profile page loads. Cross-shard queries (like feed generation) are avoided on the read path because the feed service reads from the pre-computed Redis cache rather than querying every shard.
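
Consistent hashing for shard selection can be sketched with a hash ring. The following is a minimal illustration, not a tuned implementation: the `HashRing` class, the shard names, the choice of FNV-1a as the hash, and 100 virtual nodes per shard are all assumptions.

```javascript
// 32-bit FNV-1a hash; any well-distributed hash function would do.
function fnv1a(str) {
    let hash = 0x811c9dc5;
    for (let i = 0; i < str.length; i++) {
        hash ^= str.charCodeAt(i);
        hash = Math.imul(hash, 0x01000193);
    }
    return hash >>> 0; // force unsigned 32-bit
}

class HashRing {
    constructor(shards, virtualNodes = 100) {
        this.ring = []; // sorted array of { point, shard }
        for (const shard of shards) {
            // Virtual nodes spread each shard around the ring for balance.
            for (let v = 0; v < virtualNodes; v++) {
                this.ring.push({ point: fnv1a(`${shard}#${v}`), shard });
            }
        }
        this.ring.sort((a, b) => a.point - b.point);
    }

    // Binary search for the first ring point >= hash(userId),
    // wrapping around to the start of the ring.
    shardFor(userId) {
        const h = fnv1a(String(userId));
        let lo = 0;
        let hi = this.ring.length;
        while (lo < hi) {
            const mid = (lo + hi) >>> 1;
            if (this.ring[mid].point < h) lo = mid + 1;
            else hi = mid;
        }
        return this.ring[lo % this.ring.length].shard;
    }
}
```

The property that matters for resharding is that adding a shard only moves keys onto the new shard; keys never shuffle between existing shards, unlike with `hash(user_id) % N`.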

7.3 Caching Layers

Multiple caching layers are essential:

  • Feed cache (Redis): Pre-computed list of post IDs per user.
  • Post cache (Memcached): Full post objects for hot posts.
  • User profile cache: Follower counts, profile data.
  • Session cache: Authentication tokens.

7.4 Handling the Celebrity Problem

A user with 100M followers would require 100M Redis writes on each post under fan-out on write. Instead, mark users whose follower count exceeds a threshold (e.g., 10K) as celebrities. Their posts are not fanned out; they are pulled at read time when followers request their feed.
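
The threshold rule can be sketched end to end in memory. This is an illustrative model only: `createState`, `publishPost`, and `readFeed` are hypothetical names, the maps stand in for the Redis feed cache and the posts store, and the read path simply concatenates the two sources where a real system would rank them.

```javascript
// In-memory stand-ins for the feed cache and the celebrity post store.
function createState(threshold = 10000) {
    return {
        threshold,
        feeds: new Map(),          // follower_id -> pushed post IDs
        celebrityPosts: new Map(), // celebrity user_id -> post IDs to pull
    };
}

function publishPost(state, author, postId) {
    if (author.followerIds.length > state.threshold) {
        // Celebrity: store once and skip fan-out. Followers pull at read time.
        const list = state.celebrityPosts.get(author.id) ?? [];
        list.push(postId);
        state.celebrityPosts.set(author.id, list);
        return { strategy: "pull", feedWrites: 0 };
    }
    // Normal user: one feed-cache write per follower.
    for (const followerId of author.followerIds) {
        const feed = state.feeds.get(followerId) ?? [];
        feed.push(postId);
        state.feeds.set(followerId, feed);
    }
    return { strategy: "push", feedWrites: author.followerIds.length };
}

// Read path: pre-computed feed plus posts pulled from followed celebrities.
function readFeed(state, userId, followedCelebrityIds) {
    const pushed = state.feeds.get(userId) ?? [];
    const pulled = followedCelebrityIds.flatMap((id) => state.celebrityPosts.get(id) ?? []);
    return [...pushed, ...pulled];
}
```

The `feedWrites` value returned by `publishPost` makes the trade-off concrete: a celebrity post costs zero cache writes at publish time and a small extra lookup on each follower's read.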

8. Frequently Asked Questions

Q1: How does Instagram handle the celebrity problem in feed generation?

Instagram uses a hybrid fan-out approach. For regular users with fewer than ~10K followers, posts are fanned out on write (pushed to each follower's feed cache). For celebrities with millions of followers, posts are fetched on read (pulled when a follower opens their feed). The feed service merges both sources, ranks them, and returns the combined feed.

Q2: How do you store and serve billions of photos efficiently?

Photos are stored in object storage (like S3) in multiple resolutions. A CDN serves photos from edge locations closest to users. Photos are immutable (never modified), which makes CDN caching highly effective with long TTLs. The client requests the appropriate resolution based on screen size and network speed, reducing bandwidth usage.

Q3: How would you handle real-time notifications for likes and comments?

When a user likes or comments on a post, an event is published to a message queue. The Notification Service consumes these events and sends push notifications to the post owner. To prevent notification spam, aggregate notifications (e.g., "User A and 15 others liked your post"). Use a brief delay (30-60 seconds) before sending to allow aggregation.
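
The aggregation step can be sketched as a buffer that holds like events for a short window before emitting one message per post. This is a minimal illustration: the class and function names are hypothetical, the 30-second default comes from the lower bound of the delay range mentioned in the answer, and time is passed in explicitly so the logic is easy to test.

```javascript
// Build a single summary line from the buffered likers of one post.
function summarizeLikes(likerNames) {
    const unique = [...new Set(likerNames)];
    if (unique.length === 0) return null;
    const [first, ...rest] = unique;
    if (rest.length === 0) return `${first} liked your post`;
    const noun = rest.length === 1 ? "other" : "others";
    return `${first} and ${rest.length} ${noun} liked your post`;
}

class LikeNotificationBuffer {
    constructor(delayMs = 30000) {
        this.delayMs = delayMs;
        this.pending = new Map(); // post_id -> { firstSeenMs, likers }
    }

    add(postId, likerName, nowMs) {
        if (!this.pending.has(postId)) {
            this.pending.set(postId, { firstSeenMs: nowMs, likers: [] });
        }
        this.pending.get(postId).likers.push(likerName);
    }

    // Emit one message per post whose delay window has elapsed.
    flushDue(nowMs) {
        const messages = [];
        for (const [postId, entry] of this.pending) {
            if (nowMs - entry.firstSeenMs >= this.delayMs) {
                messages.push({ postId, text: summarizeLikes(entry.likers) });
                this.pending.delete(postId);
            }
        }
        return messages;
    }
}
```

In a real deployment the buffer would live in the Notification Service and be keyed per recipient as well as per post, so one slow window never delays unrelated notifications.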

Q4: How is the Explore page generated?

The Explore page uses a multi-stage pipeline: (1) A candidate generation phase selects thousands of potentially interesting posts using collaborative filtering and content signals. (2) A ranking model scores each candidate based on predicted engagement. (3) A diversity filter ensures variety in topics and content types. (4) Results are cached per-user with a TTL of a few minutes for freshness.
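
The first three stages of that pipeline can be sketched as below. This is a deliberately simplified illustration: a plain filter stands in for collaborative filtering, the `score` field stands in for a ranking model's output, and the function name, option names, and `topic` field are assumptions. Per-user caching (stage 4) is omitted.

```javascript
function buildExplorePage(posts, followedUserIds, options = {}) {
    const { candidateLimit = 1000, pageSize = 10, maxPerTopic = 2 } = options;
    const followed = new Set(followedUserIds);

    // Stage 1: candidate generation. Exclude followed users; a real system
    // would use collaborative filtering and content signals here.
    const candidates = posts
        .filter((p) => !followed.has(p.user_id))
        .slice(0, candidateLimit);

    // Stage 2: ranking by predicted engagement (precomputed score here).
    const ranked = [...candidates].sort((a, b) => b.score - a.score);

    // Stage 3: diversity filter, capping how many posts share a topic.
    const perTopic = new Map();
    const page = [];
    for (const post of ranked) {
        const used = perTopic.get(post.topic) ?? 0;
        if (used >= maxPerTopic) continue;
        perTopic.set(post.topic, used + 1);
        page.push(post.post_id);
        if (page.length === pageSize) break;
    }
    return page;
}
```

Running the diversity cap after ranking means a lower-scored post from a fresh topic can displace a higher-scored post from a saturated one, which is the intended effect.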
