Design a Notification System

A notification system is a critical component of virtually every modern application. It delivers timely information to users through multiple channels — push notifications, email, SMS, and in-app alerts. At scale, this system must handle billions of notifications daily with proper prioritization, rate limiting, template management, and delivery tracking. This guide covers the complete architecture for a production-grade notification platform.

1. Requirements

Functional Requirements

  • Send notifications through multiple channels: push (iOS/Android), email, SMS, in-app.
  • Support various notification types: transactional (order confirmation), marketing (promotions), social (likes, follows), system (security alerts).
  • Template-based notification content with dynamic variables.
  • User notification preferences: opt-in/opt-out per channel and per type.
  • Rate limiting: prevent notification fatigue by limiting frequency.
  • Priority levels: critical notifications bypass rate limits.
  • Delivery tracking: track sent, delivered, opened, clicked status.
  • Scheduled notifications: send at a specific time or in the user's timezone.

Non-Functional Requirements

  • High availability: 99.99% uptime. Critical notifications (security, payments) must always deliver.
  • Scalability: Handle 10 billion+ notifications per day.
  • Low latency: Real-time notifications delivered within 1 second.
  • Reliability: At-least-once delivery with deduplication.
  • Extensibility: Easy to add new channels (WhatsApp, Slack, etc.).

2. Capacity Estimation

| Metric | Estimate |
|---|---|
| Total notifications per day | 10 billion |
| Average QPS | 10B / 86,400 ≈ 115,000/sec |
| Peak QPS (3x) | ~350,000/sec |
| Push notifications per day | 5 billion (50%) |
| Emails per day | 3 billion (30%) |
| SMS per day | 1 billion (10%) |
| In-app per day | 1 billion (10%) |
| Average notification size | ~500 bytes |
| Daily data volume | 10B × 500 bytes = 5 TB/day |
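
The arithmetic behind these estimates can be reproduced with a short script. This is only a sketch: the totals, the 3x peak factor, the channel shares, and the 500-byte average come from the table above, while the variable names are illustrative.

```python
SECONDS_PER_DAY = 86_400

total_per_day = 10_000_000_000   # 10 billion notifications per day
peak_factor = 3                  # peak-to-average ratio from the table
avg_size_bytes = 500             # average notification size

# Channel mix in whole percent, so the per-channel split stays in integer math.
channel_percent = {"push": 50, "email": 30, "sms": 10, "in_app": 10}

avg_qps = total_per_day / SECONDS_PER_DAY
peak_qps = avg_qps * peak_factor
daily_bytes = total_per_day * avg_size_bytes
per_channel = {name: total_per_day * pct // 100 for name, pct in channel_percent.items()}

print(f"average QPS:  {avg_qps:,.0f}")
print(f"peak QPS:     {peak_qps:,.0f}")
print(f"daily volume: {daily_bytes / 1e12:.1f} TB")
for name, count in per_channel.items():
    print(f"{name}: {count / 1e9:.0f} billion/day")
```

The exact average is closer to 115,741/sec and the 3x peak to 347,000/sec; the table rounds both.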

3. High-Level Design

| Component | Responsibility |
|---|---|
| Notification API | Accept notification requests from internal services |
| Validation and Enrichment | Validate input, resolve templates, check preferences |
| Priority Queue | Separate queues by priority (critical, high, medium, low) |
| Rate Limiter | Enforce per-user and per-type notification limits |
| Channel Router | Route to appropriate delivery channel based on preferences |
| Push Worker | Delivers via APNs (iOS) and FCM (Android) |
| Email Worker | Delivers via email providers (SES, SendGrid) |
| SMS Worker | Delivers via SMS providers (Twilio, SNS) |
| In-App Worker | Delivers via WebSocket/SSE for in-app notifications |
| Template Service | Manages notification templates per type and locale |
| Delivery Tracker | Tracks delivery status, failures, retries |
| Analytics Service | Metrics on delivery rate, open rate, click rate |

4. Detailed Component Design

4.1 Notification Flow

  1. An internal service sends a notification request to the Notification API.
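  2. The Validation and Enrichment component checks the request, resolves the template, and looks up user preferences.
  3. If the user has opted out of this notification type or channel, the notification is dropped and recorded as opted_out.
  4. The Rate Limiter checks whether the user has exceeded their notification limit. If so, the notification is dropped and recorded as rate_limited; critical notifications skip this check.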
  5. The notification is enqueued into the appropriate priority queue.
  6. The Channel Router determines which channels to use (push, email, SMS, in-app).
  7. Channel-specific workers dequeue and deliver through third-party providers.
  8. Delivery status is recorded in the Delivery Tracker.
  9. On failure, the notification is retried with exponential backoff.

// Notification API request
POST /api/v1/notifications
{
    "type": "order_shipped",
    "user_id": "user_12345",
    "priority": "high",
    "channels": ["push", "email"],
    "template_data": {
        "order_id": "ORD-789",
        "tracking_number": "1Z999AA10123456784",
        "estimated_delivery": "2025-01-28"
    },
    "idempotency_key": "order_shipped_ORD-789"
}
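
The first five steps of the flow can be sketched in a few lines of Python. This is a minimal illustration, not the platform's actual code: the in-memory `opted_out`, `queues`, and `seen_keys` structures and the `submit` function are hypothetical stand-ins for the preference store, the priority queues, and the idempotency check, and the rate limiter is passed in as a plain callable.

```python
from collections import defaultdict, deque

REQUIRED_FIELDS = {"type", "user_id", "priority", "channels", "idempotency_key"}

# Hypothetical in-memory stand-ins for real storage and queues.
opted_out = {("user_12345", "order_shipped", "email")}  # (user, type, channel)
queues = defaultdict(deque)                              # priority -> queued jobs
seen_keys = set()                                        # idempotency keys already accepted


def submit(request, allow=lambda user_id, ntype, priority: True):
    """Validate, deduplicate, check preferences and limits, then enqueue per channel."""
    missing = REQUIRED_FIELDS - request.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    if request["idempotency_key"] in seen_keys:
        return {"duplicate": True}
    seen_keys.add(request["idempotency_key"])

    result = {}
    for channel in request["channels"]:
        if (request["user_id"], request["type"], channel) in opted_out:
            result[channel] = "opted_out"
        elif not allow(request["user_id"], request["type"], request["priority"]):
            result[channel] = "rate_limited"
        else:
            queues[request["priority"]].append({**request, "channel": channel})
            result[channel] = "queued"
    return result


request = {
    "type": "order_shipped",
    "user_id": "user_12345",
    "priority": "high",
    "channels": ["push", "email"],
    "idempotency_key": "order_shipped_ORD-789",
}
print(submit(request))  # {'push': 'queued', 'email': 'opted_out'}
print(submit(request))  # {'duplicate': True}
```

In a real deployment each of these checks would call out to a separate service or store; the sketch only shows the order of decisions and the status each branch produces.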

4.2 Priority Queue Architecture

Different notification types have different urgency levels. Use separate message queues per priority:

| Priority | Examples | SLA | Rate Limit |
|---|---|---|---|
| Critical | Security alerts, 2FA codes, payment failures | <5 sec | No limit |
| High | Order updates, direct messages | <30 sec | 50/hour |
| Medium | Social interactions (likes, comments) | <5 min | 20/hour |
| Low | Marketing, recommendations, newsletters | <1 hour | 5/day |

Workers consume from high-priority queues first. Critical queue workers have dedicated capacity that is never shared with lower priorities.
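
One way to picture the consumption order is the sketch below. It assumes strict priority among the shared queues and models the dedicated critical capacity as a worker that reads only the critical queue; the `next_job` helper and the worker lists are hypothetical names, and the in-memory deques stand in for real message queues.

```python
from collections import deque

PRIORITIES = ["critical", "high", "medium", "low"]  # highest first
queues = {p: deque() for p in PRIORITIES}


def next_job(allowed=PRIORITIES):
    """Return the next job from the highest-priority non-empty queue this worker may read."""
    for priority in PRIORITIES:
        if priority in allowed and queues[priority]:
            return queues[priority].popleft()
    return None


# Assumption: critical workers read only the critical queue, and shared
# workers never touch it, so critical capacity is not consumed by other traffic.
CRITICAL_WORKER = ["critical"]
SHARED_WORKER = ["high", "medium", "low"]

queues["low"].append("newsletter")
queues["high"].append("order_update")
queues["critical"].append("2fa_code")

print(next_job(SHARED_WORKER))    # order_update
print(next_job(CRITICAL_WORKER))  # 2fa_code
print(next_job(SHARED_WORKER))    # newsletter
print(next_job(SHARED_WORKER))    # None
```

Strict priority can starve the low queue during sustained load, so a production system might reserve a small share of worker capacity for each lower priority as well.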

4.3 Rate Limiting

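class NotificationRateLimiter:
    """Fixed-window daily limits backed by Redis counters."""

    WINDOW_SECONDS = 86400
    GLOBAL_DAILY_LIMIT = 100
    # Illustrative per-type limits; unknown types fall back to the default.
    TYPE_LIMITS = {"marketing": 5}
    DEFAULT_TYPE_LIMIT = 20

    def __init__(self, redis_client):
        self.redis = redis_client

    def get_type_limit(self, notification_type):
        return self.TYPE_LIMITS.get(notification_type, self.DEFAULT_TYPE_LIMIT)

    def _increment(self, key):
        # Read the TTL together with the increment so a counter that lost its
        # expiry (e.g., after a crash between INCR and EXPIRE) is repaired.
        pipe = self.redis.pipeline()
        pipe.incr(key)
        pipe.ttl(key)
        count, ttl = pipe.execute()
        if ttl == -1:  # key exists but has no expiry yet
            self.redis.expire(key, self.WINDOW_SECONDS)
        return count

    def check_rate_limit(self, user_id, notification_type, priority):
        # Critical priority bypasses rate limits
        if priority == "critical":
            return True

        # Per-type limit (e.g., max 5 marketing notifications per day)
        type_key = f"ratelimit:{user_id}:{notification_type}:daily"
        # Global per-user limit (e.g., max 100 notifications per day)
        global_key = f"ratelimit:{user_id}:global:daily"

        type_count = self._increment(type_key)
        global_count = self._increment(global_key)

        allowed = (type_count <= self.get_type_limit(notification_type)
                   and global_count <= self.GLOBAL_DAILY_LIMIT)
        if not allowed:
            # Roll back so a rejected notification does not consume quota.
            self.redis.decr(type_key)
            self.redis.decr(global_key)
        return allowed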

4.4 Template System

Templates separate content from delivery logic, enabling non-engineers to update notification copy:

// Template example (stored in database)
{
    "template_id": "order_shipped",
    "locale": "en-US",
    "channels": {
        "push": {
            "title": "Your order is on its way!",
            "body": "Order {{order_id}} shipped. Tracking: {{tracking_number}}. ETA: {{estimated_delivery}}."
        },
        "email": {
            "subject": "Order {{order_id}} has shipped",
            "html_body": "<h1>Great news!</h1><p>Your order {{order_id}} is on its way...</p>"
        },
        "sms": {
            "body": "Your order {{order_id}} shipped. Track: {{tracking_url}}"
        }
    }
}

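// Template rendering
function renderTemplate(template, data) {
    // Replace {{key}} with data[key]. Keys missing from data keep their
    // placeholder; falsy values such as 0 or "" are still substituted.
    return template.replace(/\{\{(\w+)\}\}/g, (match, key) =>
        Object.prototype.hasOwnProperty.call(data, key) ? String(data[key]) : match);
}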

4.5 Delivery Tracking

Track the lifecycle of each notification through status transitions:

| Status | Meaning |
|---|---|
| created | Notification request received |
| queued | Enqueued to priority queue |
| sent | Handed off to third-party provider (APNs, SES, Twilio) |
| delivered | Provider confirmed delivery to device/inbox |
| opened | User opened/read the notification |
| clicked | User clicked a link in the notification |
| failed | Delivery failed (invalid token, bounced email) |
| rate_limited | Dropped due to rate limiting |
| opted_out | Dropped because user opted out |
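
The lifecycle can be enforced as a small state machine. The transition map below is an assumption inferred from the status meanings above (the article does not list the allowed transitions explicitly), and the `transition` helper is a hypothetical name.

```python
# Assumed transition map: each status maps to the statuses it may move to next.
ALLOWED = {
    "created": {"queued", "rate_limited", "opted_out"},
    "queued": {"sent", "failed"},
    "sent": {"delivered", "failed"},
    "delivered": {"opened", "clicked"},
    "opened": {"clicked"},
    "failed": {"queued"},        # a retry re-enqueues the notification
    "clicked": set(),            # terminal
    "rate_limited": set(),       # terminal
    "opted_out": set(),          # terminal
}


def transition(current, new):
    """Return the new status, or raise ValueError if the move is not allowed."""
    if new not in ALLOWED[current]:
        raise ValueError(f"illegal transition: {current} -> {new}")
    return new


status = "created"
for step in ("queued", "sent", "delivered", "opened"):
    status = transition(status, step)
print(status)  # opened
```

Rejecting illegal transitions guards against out-of-order provider callbacks, for example a late "sent" event arriving after "delivered" has already been recorded.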

5. Database Schema

CREATE TABLE notification_templates (
    id VARCHAR(50) PRIMARY KEY,
    type VARCHAR(50) NOT NULL,
    locale VARCHAR(10) DEFAULT 'en-US',
    channel VARCHAR(20) NOT NULL,
    title_template TEXT,
    body_template TEXT NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    UNIQUE (type, locale, channel)
);

CREATE TABLE user_preferences (
    user_id BIGINT NOT NULL,
    notification_type VARCHAR(50) NOT NULL,
    channel VARCHAR(20) NOT NULL,
    enabled BOOLEAN DEFAULT TRUE,
    PRIMARY KEY (user_id, notification_type, channel)
);

CREATE TABLE user_devices (
    id BIGINT PRIMARY KEY,
    user_id BIGINT NOT NULL,
    platform ENUM('ios','android','web') NOT NULL,
    device_token TEXT NOT NULL,
    is_active BOOLEAN DEFAULT TRUE,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

CREATE INDEX idx_devices_user ON user_devices(user_id, is_active);

CREATE TABLE notifications (
    id BIGINT PRIMARY KEY,
    user_id BIGINT NOT NULL,
    type VARCHAR(50) NOT NULL,
    channel VARCHAR(20) NOT NULL,
    priority ENUM('critical','high','medium','low') NOT NULL,
    status ENUM('created','queued','sent','delivered','opened','clicked',
                'failed','rate_limited','opted_out') DEFAULT 'created',
    title TEXT,
    body TEXT,
    idempotency_key VARCHAR(128),
    retry_count INT DEFAULT 0,
    scheduled_at TIMESTAMP,
    sent_at TIMESTAMP,
    delivered_at TIMESTAMP,
    opened_at TIMESTAMP,
    clicked_at TIMESTAMP,
    failed_reason TEXT,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    -- One row per channel, so a request that fans out to push and email
    -- reuses its key; uniqueness is per (key, channel).
    UNIQUE (idempotency_key, channel)
);

CREATE INDEX idx_notif_user ON notifications(user_id, created_at DESC);
CREATE INDEX idx_notif_status ON notifications(status, created_at);
-- The ENUM columns above are MySQL syntax, and MySQL has no partial indexes,
-- so a composite index serves the scheduler's status + scheduled_at poll.
CREATE INDEX idx_notif_scheduled ON notifications(status, scheduled_at);

6. Key Trade-offs

| Decision | Trade-off |
|---|---|
| At-most-once vs at-least-once delivery | At-most-once risks missing notifications. At-least-once with idempotency keys prevents duplicates while ensuring delivery. Use idempotency_key to dedup retries. |
| Single queue vs priority queues | A single queue with priority tags is simpler but can starve critical notifications during surges. Separate queues per priority with dedicated workers ensure critical notifications are never delayed. |
| Sync vs async delivery | Synchronous delivery gives immediate feedback but blocks the caller. Asynchronous (queue-based) decouples producers from delivery, handles spikes via buffering, and enables retry logic. Always use async for notifications. |
| Build vs buy for channel delivery | Building direct SMTP/SMS delivery is complex and requires managing sender reputation. Using providers (SES, Twilio) is easier but adds cost and dependency. Most companies use providers for email/SMS and direct integration for push (APNs/FCM). |

7. Scaling Considerations

7.1 Queue Scaling

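At a peak of roughly 350K notifications/sec, Kafka is a good fit. Use a separate topic per channel and priority, and key messages by user_id so that all notifications for a user land in the same partition; this preserves per-user ordering and simplifies deduplication. Each channel's workers form their own Kafka consumer group, so every channel scales horizontally and independently of the others.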
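
The routing idea can be sketched without a Kafka client. Everything here is an assumption for illustration: the topic naming scheme, the partition count, and the CRC32-based hash (Kafka's own default partitioner uses a different hash function). The point is only that a stable hash of user_id always maps the same user to the same partition.

```python
import zlib

NUM_PARTITIONS = 64  # hypothetical partition count per topic


def topic_for(channel, priority):
    """Hypothetical naming scheme: one topic per channel and priority."""
    return f"notifications.{channel}.{priority}"


def partition_for(user_id, num_partitions=NUM_PARTITIONS):
    """Map a user to a partition with a stable hash, so per-user order is preserved."""
    return zlib.crc32(user_id.encode("utf-8")) % num_partitions


print(topic_for("push", "high"), partition_for("user_12345"))
```

With a real client, passing user_id as the message key achieves the same effect, because the client's partitioner hashes the key for you.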

7.2 Third-Party Provider Rate Limits

APNs, FCM, SES, and Twilio all have rate limits. Use connection pooling and token management. Maintain multiple provider accounts and distribute load. Monitor provider health and failover between providers (e.g., SES to SendGrid for email).
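
Failover between providers can be sketched as an ordered list of send functions tried in turn. The `ProviderError` class, the `send_with_failover` helper, and the two stub senders are hypothetical; real provider SDKs raise their own exception types, which would need to be mapped onto something like this.

```python
class ProviderError(Exception):
    """Raised by a provider client when a send attempt fails."""


def send_with_failover(providers, message):
    """Try each (name, send_fn) pair in order; return (name, result) from the first success."""
    errors = []
    for name, send in providers:
        try:
            return name, send(message)
        except ProviderError as exc:
            errors.append(f"{name}: {exc}")
    raise ProviderError("all providers failed: " + "; ".join(errors))


# Stub clients standing in for real provider SDK calls.
def ses_stub(message):
    raise ProviderError("throttled")


def sendgrid_stub(message):
    return f"accepted: {message['subject']}"


email = {"to": "user@example.com", "subject": "Order ORD-789 has shipped"}
print(send_with_failover([("ses", ses_stub), ("sendgrid", sendgrid_stub)], email))
# ('sendgrid', 'accepted: Order ORD-789 has shipped')
```

A production version would typically add health tracking so that a provider that keeps failing is skipped for a cooling-off period instead of being retried on every message.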

7.3 Database Scaling

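The notifications table grows by roughly 10 billion rows per day. Use time-partitioned tables; at this volume a monthly partition would hold about 300 billion rows, so daily partitions are likely more practical. Archive notifications older than 90 days to cold storage. Shard by user_id so that real-time queries, such as loading one user's recent notifications, hit a single shard. Run analytics queries on a separate read replica or data warehouse.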

7.4 Handling Spikes

Mass notifications (e.g., app-wide announcements) can generate billions of requests simultaneously. Pre-compute the recipient list, enqueue gradually (rate-controlled fan-out), and load balance across workers. Cache rendered templates to avoid re-rendering for each recipient.
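
Rate-controlled fan-out can be sketched as a generator that assigns each batch a send offset instead of enqueuing everything at once. The `paced_batches` helper and its parameters are hypothetical; a real implementation would stream recipients from storage and hand each batch to the queue at its offset.

```python
from itertools import islice


def paced_batches(recipients, batch_size, batches_per_second):
    """Yield (offset_seconds, batch) pairs so the enqueue rate stays bounded."""
    iterator = iter(recipients)
    index = 0
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield index / batches_per_second, batch
        index += 1


# The announcement body is rendered once and reused for every recipient.
body = "Scheduled maintenance tonight at 22:00 UTC."

schedule = list(paced_batches(range(10), batch_size=4, batches_per_second=2))
for offset, batch in schedule:
    print(f"t+{offset:.1f}s: enqueue {len(batch)} recipients")
```

The effective enqueue rate is batch_size × batches_per_second, which can be tuned to stay under the capacity of the downstream queues and providers.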

8. Frequently Asked Questions

Q1: How do you prevent notification fatigue for users?

Use multiple layers: (1) Per-type rate limits (e.g., max 5 marketing notifications per day). (2) Global per-user daily caps (e.g., max 100 notifications total per day). (3) Intelligent batching: aggregate similar notifications ("User A and 5 others liked your post"). (4) User-controlled preferences per channel and type. (5) ML-based send-time optimization: send when the user is most likely to engage.
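
The batching layer can be sketched as a function that collapses like events for the same post into one summary. The event shape and the `aggregate_likes` helper are hypothetical; a real system would also choose a time window over which to collect events before sending.

```python
from collections import defaultdict


def aggregate_likes(events):
    """Collapse like events per (recipient, post) into a single summary line each."""
    actors = defaultdict(list)
    for event in events:
        key = (event["recipient"], event["post_id"])
        if event["actor"] not in actors[key]:   # ignore repeated likes by one actor
            actors[key].append(event["actor"])

    summaries = {}
    for key, names in actors.items():
        others = len(names) - 1
        if others == 0:
            summaries[key] = f"{names[0]} liked your post"
        elif others == 1:
            summaries[key] = f"{names[0]} and 1 other liked your post"
        else:
            summaries[key] = f"{names[0]} and {others} others liked your post"
    return summaries


events = [
    {"recipient": "u1", "post_id": "p1", "actor": name}
    for name in ["User A", "User B", "User C", "User D", "User E", "User F"]
]
print(aggregate_likes(events))
# {('u1', 'p1'): 'User A and 5 others liked your post'}
```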

Q2: How do you handle notification delivery failures?

Use exponential backoff with jitter for retries (base delays of 1s, 2s, 4s, 8s, 16s, for at most 5 retries); once retries are exhausted, mark the notification as failed. For push notifications, if a device token is invalid (APNs returns 410 Gone), mark the device as inactive and stop sending. For email bounces, categorize as hard bounce (permanent, remove address) or soft bounce (temporary, retry). Track failure rates and alert on anomalies.
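
A retry schedule of this kind can be sketched as follows. This uses the "full jitter" variant, where each delay is drawn uniformly between zero and the exponential ceiling; the function name, the 60-second cap, and the injectable `rng` parameter are assumptions made for illustration and testability.

```python
import random


def backoff_delays(max_retries=5, base=1.0, cap=60.0, rng=random.random):
    """Return one jittered delay per retry; the ceiling doubles each attempt up to cap."""
    delays = []
    for attempt in range(max_retries):
        ceiling = min(cap, base * 2 ** attempt)   # 1, 2, 4, 8, 16, ...
        delays.append(rng() * ceiling)            # uniform in [0, ceiling)
    return delays


print([round(d, 2) for d in backoff_delays()])
```

Jitter spreads retries out so that many notifications failing at the same moment, for example during a provider outage, do not all retry in lockstep.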

Q3: How does idempotency work for notifications?

Each notification request includes an idempotency_key (e.g., "order_shipped_ORD-789"). Before processing, the system checks if a notification with this key was already created. If yes, it returns the existing result without sending a duplicate. This is stored as a unique constraint in the database. The caller is responsible for generating meaningful, unique idempotency keys.
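
The check can lean on the unique constraint itself, which avoids a race between "check" and "insert". The sketch below uses an in-memory SQLite database purely for illustration, with a simplified one-row-per-request table; the `create_once` helper is a hypothetical name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE notifications ("
    " id INTEGER PRIMARY KEY,"
    " idempotency_key TEXT UNIQUE,"
    " status TEXT)"
)


def create_once(key):
    """Insert a notification; return (id, created), where created is False for a duplicate."""
    try:
        with conn:  # commits on success, rolls back on error
            cursor = conn.execute(
                "INSERT INTO notifications (idempotency_key, status) VALUES (?, 'created')",
                (key,),
            )
        return cursor.lastrowid, True
    except sqlite3.IntegrityError:
        row = conn.execute(
            "SELECT id FROM notifications WHERE idempotency_key = ?", (key,)
        ).fetchone()
        return row[0], False


first = create_once("order_shipped_ORD-789")
second = create_once("order_shipped_ORD-789")
print(first, second)  # (1, True) (1, False)
```

Because the database rejects the second insert, two concurrent requests with the same key cannot both create a notification, even if they arrive at the same instant.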

Q4: How do you handle timezone-aware scheduled notifications?

Store the scheduled time in UTC. When scheduling, convert the user's desired local time to UTC using their timezone. A scheduler service polls for notifications where scheduled_at is in the past and status is "created," then enqueues them. For "send at 9 AM local time" across all users, pre-compute each user's UTC send time and store individually.
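
The local-to-UTC conversion can be sketched with the standard library's zoneinfo module (Python 3.9+, which relies on time zone data being available on the system). The `local_to_utc` helper is a hypothetical name; the zone names are standard IANA identifiers.

```python
from datetime import date, datetime, time, timezone
from zoneinfo import ZoneInfo


def local_to_utc(day, local_time, tz_name):
    """Convert a wall-clock time in the user's time zone to a UTC datetime."""
    local = datetime.combine(day, local_time, tzinfo=ZoneInfo(tz_name))
    return local.astimezone(timezone.utc)


# "Send at 9 AM local time" for users in three time zones on the same day.
for tz_name in ("America/New_York", "Europe/Berlin", "Asia/Tokyo"):
    send_at = local_to_utc(date(2025, 1, 28), time(9, 0), tz_name)
    print(tz_name, send_at.isoformat())
```

Converting per user and per date, instead of storing a fixed offset, keeps the send time correct across daylight saving changes.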
