Design a Payment System
A payment system is one of the most critical components of any e-commerce or fintech platform. It handles money, which makes correctness, security, and reliability non-negotiable. This guide covers the architecture of a payment system, including payment flows, idempotency, double-spend prevention, reconciliation, and PCI compliance. Every design decision here has financial consequences, so we must be exceptionally careful with consistency guarantees.
1. Requirements
Functional Requirements
- Process payments: charge a customer's payment method (credit card, debit card, wallet).
- Support multiple payment methods: cards, bank transfers, digital wallets (PayPal, Apple Pay).
- Refund processing: full and partial refunds.
- Payment status tracking: pending, authorized, captured, settled, failed, cancelled, refunded, partially refunded.
- Retry failed payments with idempotency.
- Reconciliation: verify all transactions match between internal records and external processors.
- Webhook notifications for payment events.
- Multi-currency support.
Non-Functional Requirements
- Correctness: Money must never be lost, duplicated, or miscounted. This is the top priority.
- Reliability: 99.99% availability, which allows roughly 52 minutes of downtime per year.
- Consistency: Strong consistency for financial transactions (ACID).
- Security: PCI DSS compliance. No plaintext card numbers stored.
- Idempotency: Retrying a payment request must not result in double charging.
- Auditability: Complete audit trail of every transaction and state change.
2. Capacity Estimation
| Metric | Estimate |
|---|---|
| Transactions per day | 50 million |
| Average TPS | 50M / 86,400 ≈ 580/sec |
| Peak TPS (10x during sales) | ~6,000/sec |
| Average transaction size | $50 |
| Daily volume | $2.5 billion |
| Transaction record size | ~2 KB (with full audit trail) |
| Storage per day | 50M × 2 KB = 100 GB/day |
| Storage per year | ~36 TB/year (must retain 7+ years for compliance) |
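The arithmetic behind the table can be reproduced with a short script. This is only a back-of-the-envelope sketch: the variable names are illustrative, decimal units (1 GB = 1,000,000 KB) are assumed, and the table rounds the TPS figures up to 580 and 6,000.

```javascript
// Back-of-the-envelope capacity math using the estimates from the table.
const SECONDS_PER_DAY = 86_400;
const transactionsPerDay = 50_000_000;
const peakMultiplier = 10;      // sales-event spike assumed in the table
const avgTransactionUsd = 50;
const recordSizeKb = 2;         // includes the audit trail
const retentionYears = 7;       // minimum retention from the table

const averageTps = transactionsPerDay / SECONDS_PER_DAY;            // about 579
const peakTps = averageTps * peakMultiplier;                        // about 5,787
const dailyVolumeUsd = transactionsPerDay * avgTransactionUsd;      // 2.5 billion
const storagePerDayGb = (transactionsPerDay * recordSizeKb) / 1e6;  // assumes 1 GB = 1e6 KB
const storagePerYearTb = (storagePerDayGb * 365) / 1000;
const storageRetainedTb = storagePerYearTb * retentionYears;

console.log({
  averageTps: Math.round(averageTps),
  peakTps: Math.round(peakTps),
  dailyVolumeUsd,
  storagePerDayGb,
  storagePerYearTb,
  storageRetainedTb,
});
```

Seven years of retention at this rate comes to roughly 255 TB before replication and indexes, which is worth keeping in mind when choosing an archival strategy.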
3. High-Level Design
| Component | Responsibility |
|---|---|
| Payment API | Accept payment requests, validate, return status |
| Payment Service | Core orchestration: manage payment lifecycle |
| Payment Processor Gateway | Adapter layer for external processors (Stripe, Adyen, etc.) |
| Idempotency Store | Ensures duplicate requests are not processed twice |
| Ledger Service | Double-entry bookkeeping for all money movements |
| Wallet Service | Manages internal wallet balances (if applicable) |
| Reconciliation Service | Compares internal records with external processor records |
| Webhook Service | Notifies merchants/services of payment events |
| Fraud Detection | Real-time fraud scoring before processing |
| Audit Log | Immutable record of every action and state change |
4. Detailed Component Design
4.1 Payment Flow
A typical card payment goes through these steps:
- Authorization: Verify the card and reserve funds (amount is "held" but not yet transferred).
- Capture: Actually charge the reserved funds (happens when order ships).
- Settlement: Money moves from cardholder's bank to merchant's bank (T+1 to T+3 days).
// Payment lifecycle
POST /api/v1/payments
{
  "idempotency_key": "order_12345_payment",
  "amount": 4999,
  "currency": "USD",
  "payment_method_id": "pm_card_visa_1234",
  "capture": false,
  "metadata": { "order_id": "order_12345" }
}
Response: {
  "payment_id": "pay_abc123",
  "status": "authorized",
  "amount": 4999,
  "currency": "USD"
}
// Later, when order ships:
POST /api/v1/payments/pay_abc123/capture
{
  "amount": 4999
}
Response: {
  "payment_id": "pay_abc123",
  "status": "captured"
}
// If order is cancelled:
POST /api/v1/payments/pay_abc123/cancel
Response: {
  "payment_id": "pay_abc123",
  "status": "cancelled"
}
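The authorize, capture, and cancel calls above imply that only certain status changes are legal. One way to enforce that is a small transition table. The sketch below is an assumption about a reasonable, simplified lifecycle; the names `TRANSITIONS`, `canTransition`, and `transition` are illustrative and not part of the API shown above.

```javascript
// Allowed status transitions (an assumed, simplified lifecycle).
const TRANSITIONS = new Map([
  ["pending", ["authorized", "failed"]],
  ["authorized", ["captured", "cancelled"]],
  ["captured", ["settled", "partially_refunded", "refunded"]],
  ["settled", ["partially_refunded", "refunded"]],
  ["partially_refunded", ["partially_refunded", "refunded"]],
  ["failed", []],
  ["cancelled", []],
  ["refunded", []],
]);

// True if a payment may move from one status to another.
function canTransition(from, to) {
  return (TRANSITIONS.get(from) || []).includes(to);
}

// Returns a copy of the payment in the new status, or throws on an illegal move.
function transition(payment, to) {
  if (!canTransition(payment.status, to)) {
    throw new Error(`Illegal transition: ${payment.status} -> ${to}`);
  }
  return { ...payment, status: to };
}

const authorized = { payment_id: "pay_abc123", status: "authorized" };
const captured = transition(authorized, "captured");
console.log(captured.status);
console.log(canTransition("cancelled", "captured"));
```

Rejecting illegal moves in one place means a late or duplicated capture request cannot revive a cancelled payment.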
4.2 Idempotency
Idempotency is the most critical pattern in payment systems. Network failures, timeouts, and retries can cause duplicate requests. Without idempotency, a customer could be charged twice.
async function processPayment(request) {
  const { idempotency_key } = request;
  // Step 1: Check the idempotency store for an already completed request
  const existing = await idempotencyStore.get(idempotency_key);
  if (existing) {
    // Return the same response as the original request
    return existing.response;
  }
  // Step 2: Lock the idempotency key (prevent concurrent duplicates).
  // The TTL of 30 is assumed to be in seconds.
  const lock = await idempotencyStore.acquireLock(idempotency_key, { ttl: 30 });
  if (!lock) {
    return { status: 409, message: "Request already in progress" };
  }
  try {
    // Step 3: Re-check under the lock, in case another request finished
    // between Step 1 and Step 2
    const completed = await idempotencyStore.get(idempotency_key);
    if (completed) {
      return completed.response;
    }
    // Step 4: Process payment
    const result = await paymentProcessor.charge(request);
    // Step 5: Store result with idempotency key
    await idempotencyStore.set(idempotency_key, {
      response: result,
      created_at: Date.now(),
      expires_at: Date.now() + 86400000 // 24 hour TTL
    });
    return result;
  } finally {
    await idempotencyStore.releaseLock(idempotency_key);
  }
}
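The `idempotencyStore` used above is not defined in this guide. As a rough illustration of the contract it has to satisfy, here is a hypothetical in-memory version, suitable only for tests. The class name and the choice to pass the TTL in seconds as an options object are assumptions; a production store would need an atomic set-if-absent operation in a shared database or cache.

```javascript
// Hypothetical in-memory idempotency store (single process, for tests only).
class InMemoryIdempotencyStore {
  constructor() {
    this.results = new Map(); // key -> { response, expires_at }
    this.locks = new Map();   // key -> lock expiry (ms since epoch)
  }

  // Returns the stored record, or null if absent or expired.
  get(key) {
    const record = this.results.get(key);
    if (!record) return null;
    if (record.expires_at <= Date.now()) {
      this.results.delete(key);
      return null;
    }
    return record;
  }

  set(key, record) {
    this.results.set(key, record);
  }

  // Returns true if the lock was taken, false if another holder still has it.
  acquireLock(key, { ttl }) {
    const now = Date.now();
    const expiry = this.locks.get(key);
    if (expiry !== undefined && expiry > now) return false;
    this.locks.set(key, now + ttl * 1000);
    return true;
  }

  releaseLock(key) {
    this.locks.delete(key);
  }
}

const store = new InMemoryIdempotencyStore();
const key = "order_12345_payment";
const firstLock = store.acquireLock(key, { ttl: 30 });
const secondLock = store.acquireLock(key, { ttl: 30 }); // a concurrent duplicate
store.set(key, {
  response: { payment_id: "pay_abc123", status: "authorized" },
  expires_at: Date.now() + 86_400_000,
});
store.releaseLock(key);
const replay = store.get(key).response; // what a retry would receive
console.log(firstLock, secondLock, replay.status);
```

The lock TTL matters: if a worker crashes while holding the lock, the key becomes usable again once the TTL expires instead of staying blocked forever.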
4.3 Double-Spend Prevention
Prevent the same funds from being used in two transactions simultaneously:
-- Use a database transaction with a row-level lock (pessimistic).
-- The version check is the optimistic alternative; with FOR UPDATE held
-- it acts only as an extra safety net.
BEGIN TRANSACTION;
-- Read the current balance and version while taking a row-level lock
SELECT balance, version FROM wallets
WHERE user_id = 'user_123' FOR UPDATE;
-- Verify sufficient funds in the application
-- If balance >= amount:
UPDATE wallets
SET balance = balance - 4999,
    version = version + 1
WHERE user_id = 'user_123' AND version = 5; -- 5 is the version read above
-- If 0 rows updated, the version changed (concurrent modification): roll back
INSERT INTO transactions (id, user_id, amount, type, status)
VALUES ('txn_abc', 'user_123', -4999, 'payment', 'completed');
COMMIT;
For card payments, the external processor handles double-spend prevention through the authorization hold. For wallet-based payments, use database-level pessimistic locking (SELECT FOR UPDATE) or optimistic locking (version column).
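To make the optimistic variant concrete, the following sketch simulates the version check with an in-memory map standing in for the wallets table. The function name `debitIfVersionMatches` and the starting balance are assumptions for illustration; in a real system the comparison and the update happen inside a single UPDATE statement.

```javascript
// In-memory stand-in for the wallets table: user_id -> { balance, version }.
const wallets = new Map([["user_123", { balance: 10_000, version: 5 }]]);

// Mimics: UPDATE wallets SET balance = balance - ?, version = version + 1
//         WHERE user_id = ? AND version = ? AND balance >= ?
// Returns the number of "rows" updated (0 or 1).
function debitIfVersionMatches(userId, amountCents, expectedVersion) {
  const row = wallets.get(userId);
  if (!row || row.version !== expectedVersion || row.balance < amountCents) {
    return 0;
  }
  wallets.set(userId, {
    balance: row.balance - amountCents,
    version: row.version + 1,
  });
  return 1;
}

// Two requests both read version 5, then race to spend the same funds.
const readA = { ...wallets.get("user_123") };
const readB = { ...wallets.get("user_123") };
const updatedA = debitIfVersionMatches("user_123", 4999, readA.version);
const updatedB = debitIfVersionMatches("user_123", 8000, readB.version); // stale version

console.log(updatedA, updatedB, wallets.get("user_123"));
```

The losing request sees zero updated rows and must re-read the wallet before deciding whether to retry, at which point the insufficient balance becomes visible.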
4.4 Ledger Service (Double-Entry Bookkeeping)
Every money movement is recorded as a debit and a credit, ensuring the books always balance:
-- For a $49.99 payment from customer to merchant:
-- Debit: Customer's account -$49.99
-- Credit: Merchant's account +$47.49 (after the 5% platform fee, rounded to the cent)
-- Credit: Platform revenue +$2.50
INSERT INTO ledger_entries (transaction_id, account_id, amount_cents, entry_type, created_at) VALUES
('txn_abc', 'customer_wallet', -4999, 'debit', NOW()),
('txn_abc', 'merchant_wallet', 4749, 'credit', NOW()),
('txn_abc', 'platform_revenue', 250, 'credit', NOW());
-- Invariant: SUM(amount_cents) for any transaction_id = 0
-- This MUST always hold. If it doesn't, something is wrong.
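The fee split and the zero-sum invariant can also be checked in application code before the rows are written. The sketch below is one possible approach, not a prescribed implementation: `buildLedgerEntries` and `isBalanced` are hypothetical helpers, the fee is expressed in basis points (500 = 5%), and the merchant amount is derived by subtraction so that rounding can never unbalance the entries.

```javascript
// Builds the three ledger entries for a payment, in integer cents.
function buildLedgerEntries(transactionId, amountCents, feeBasisPoints) {
  // Round the fee once, then derive the merchant share by subtraction.
  const feeCents = Math.round((amountCents * feeBasisPoints) / 10_000);
  const merchantCents = amountCents - feeCents;
  return [
    { transaction_id: transactionId, account_id: "customer_wallet", amount_cents: -amountCents, entry_type: "debit" },
    { transaction_id: transactionId, account_id: "merchant_wallet", amount_cents: merchantCents, entry_type: "credit" },
    { transaction_id: transactionId, account_id: "platform_revenue", amount_cents: feeCents, entry_type: "credit" },
  ];
}

// The double-entry invariant: all entries of a transaction sum to zero.
function isBalanced(entries) {
  return entries.reduce((sum, entry) => sum + entry.amount_cents, 0) === 0;
}

const entries = buildLedgerEntries("txn_abc", 4999, 500);
console.log(entries.map((e) => e.amount_cents), isBalanced(entries));
```

Working in integer cents avoids floating-point drift, and computing one side as a remainder guarantees the sum is exactly zero whatever the fee rate.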
4.5 Reconciliation
Reconciliation verifies that internal records match external processor records. It runs daily:
- Download settlement reports from each payment processor (Stripe, Adyen, etc.).
- Compare each external transaction with the internal ledger.
- Flag discrepancies: missing transactions, amount mismatches, status mismatches.
- Generate reconciliation reports for the finance team.
- Auto-resolve simple discrepancies; escalate complex ones.
async function reconcile(date) {
  const internalTxns = await ledger.getTransactions(date);
  const externalTxns = await processor.getSettlementReport(date);
  const results = { matched: 0, mismatched: 0, missing_internal: 0, missing_external: 0 };
  const externalMap = new Map(externalTxns.map(t => [t.id, t]));
  for (const internal of internalTxns) {
    const external = externalMap.get(internal.processor_txn_id);
    if (!external) {
      results.missing_external++;
      await flagDiscrepancy(internal, "missing_in_processor");
      continue;
    }
    // Present on both sides: remove it so a mismatch is not also
    // reported below as "missing_in_internal"
    externalMap.delete(internal.processor_txn_id);
    if (internal.amount !== external.amount) {
      results.mismatched++;
      await flagDiscrepancy(internal, "amount_mismatch", external);
    } else {
      results.matched++;
    }
  }
  // Remaining external transactions not in our system
  results.missing_internal = externalMap.size;
  for (const [id, ext] of externalMap) {
    await flagDiscrepancy(null, "missing_in_internal", ext);
  }
  return results;
}
4.6 PCI Compliance
Payment Card Industry Data Security Standard (PCI DSS) governs how card data is handled:
| Strategy | Description |
|---|---|
| Tokenization | Replace card numbers with tokens. Only the payment processor stores actual card data. Your system only stores tokens (pm_card_visa_1234). |
| Client-side collection | Use Stripe.js or Adyen's SDK. Card data goes directly from the browser to the processor, never touching your servers. |
| Encryption at rest | All payment data encrypted in the database (AES-256). |
| Network segmentation | Payment services run in an isolated network segment with strict access controls. |
| Audit logging | Every access to payment data is logged immutably. |
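To show what tokenization means mechanically, here is a toy token vault. It is purely illustrative: in the design described above the vault lives at the payment processor, never in your own services, and the `TokenVault` class and `tok_` prefix are made up for this sketch. The card number used is a widely published test number.

```javascript
const crypto = require("crypto");

// Toy vault: maps random tokens to card numbers. In the architecture above,
// this mapping is held by the payment processor, not by the merchant system.
class TokenVault {
  constructor() {
    this.byToken = new Map();
  }

  // Returns a random token plus the last four digits, which are safe to display.
  tokenize(cardNumber) {
    const token = "tok_" + crypto.randomBytes(16).toString("hex");
    this.byToken.set(token, cardNumber);
    return { token, last4: cardNumber.slice(-4) };
  }

  // Only the vault can map a token back to the card number.
  detokenize(token) {
    return this.byToken.get(token);
  }
}

const vault = new TokenVault();
const stored = vault.tokenize("4242424242424242");
console.log(stored.last4, stored.token.length);
```

Because the token is random rather than derived from the card number, a leak of the merchant database reveals nothing that can be used to charge the card elsewhere.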
5. Database Schema
CREATE TABLE payments (
  id VARCHAR(36) PRIMARY KEY,
  idempotency_key VARCHAR(128) UNIQUE NOT NULL,
  user_id BIGINT NOT NULL,
  amount_cents BIGINT NOT NULL,
  currency VARCHAR(3) NOT NULL DEFAULT 'USD',
  status ENUM('pending','authorized','captured','settled',
    'failed','cancelled','refunded','partially_refunded') NOT NULL,
  payment_method_id VARCHAR(64),
  processor VARCHAR(20) NOT NULL,
  processor_txn_id VARCHAR(128),
  capture_amount_cents BIGINT,
  refund_amount_cents BIGINT DEFAULT 0,
  failure_reason TEXT,
  metadata JSON,
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
  updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
);
CREATE INDEX idx_payments_user ON payments(user_id, created_at DESC);
CREATE INDEX idx_payments_processor_txn ON payments(processor_txn_id);
CREATE TABLE payment_events (
  id BIGINT PRIMARY KEY AUTO_INCREMENT,
  payment_id VARCHAR(36) NOT NULL REFERENCES payments(id),
  event_type VARCHAR(50) NOT NULL,
  from_status VARCHAR(30),
  to_status VARCHAR(30),
  amount_cents BIGINT,
  processor_response JSON,
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
CREATE INDEX idx_events_payment ON payment_events(payment_id, created_at);
CREATE TABLE ledger_entries (
  id BIGINT PRIMARY KEY AUTO_INCREMENT,
  transaction_id VARCHAR(36) NOT NULL,
  account_id VARCHAR(64) NOT NULL,
  amount_cents BIGINT NOT NULL,
  entry_type ENUM('debit','credit') NOT NULL,
  description TEXT,
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
CREATE INDEX idx_ledger_transaction ON ledger_entries(transaction_id);
CREATE INDEX idx_ledger_account ON ledger_entries(account_id, created_at);
CREATE TABLE refunds (
  id VARCHAR(36) PRIMARY KEY,
  payment_id VARCHAR(36) NOT NULL REFERENCES payments(id),
  amount_cents BIGINT NOT NULL,
  reason TEXT,
  status ENUM('pending','processed','failed') DEFAULT 'pending',
  processor_refund_id VARCHAR(128),
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
6. Key Trade-offs
| Decision | Trade-off |
|---|---|
| Sync vs async payment processing | Synchronous gives immediate response but blocks on processor latency (~1-3 sec). Async improves throughput but requires polling/webhook for status. Use sync for authorization (user is waiting), async for settlement. |
| Single vs multiple processors | Single processor is simpler. Multiple processors provide redundancy (failover), better rates by region, and negotiating leverage. At scale, always use multiple processors with an abstraction layer. |
| Strong consistency everywhere vs eventual consistency | Payment state transitions MUST be strongly consistent (ACID). Analytics and reporting can be eventually consistent. Use a relational database for the core payment data. See CAP theorem. |
7. Scaling Considerations
7.1 Database Scaling
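At a peak of ~6,000 TPS, a single well-provisioned PostgreSQL instance with connection pooling can typically cope, but validate this with a load test, because each payment writes several rows (payment, events, ledger entries). For horizontal scaling, shard by user_id. Keep the ledger_entries table on a dedicated database with strong consistency guarantees. Use read replicas for reporting queries.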
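Sharding by user_id can be as simple as a modulo over the shard count. The sketch below assumes 16 shards and a hypothetical `shardFor` helper; it uses BigInt because user_id is a BIGINT in the schema and may exceed JavaScript's safe integer range.

```javascript
const SHARD_COUNT = 16; // assumed number of shards

// Maps a user_id (number, string, or bigint) to a shard index in [0, SHARD_COUNT).
function shardFor(userId) {
  return Number(BigInt(userId) % BigInt(SHARD_COUNT));
}

console.log(shardFor(123));
console.log(shardFor("9007199254740993")); // beyond Number.MAX_SAFE_INTEGER
```

Plain modulo routing keeps all of a user's payments on one shard, but changing the shard count later moves most users, so schemes such as consistent hashing or a lookup table are common alternatives.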
7.2 Handling Processor Outages
If Stripe goes down, fail over to Adyen. Maintain processor health checks and a routing table. During brief outages, buffer payments in a message queue and retry them when the processor recovers.
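A routing table with health tracking might look like the following sketch. The `createProcessorRouter` function, the consecutive-failure threshold of 3, and the rule of preferring processors in list order are all assumptions for illustration; real routers usually also weigh cost, region, and card type.

```javascript
// Tracks consecutive failures per processor and picks the first healthy one.
function createProcessorRouter(names, failureThreshold) {
  const health = new Map(names.map((name) => [name, { consecutiveFailures: 0 }]));
  return {
    // Call after every processor request or health-check probe.
    recordResult(name, ok) {
      const entry = health.get(name);
      if (!entry) throw new Error(`Unknown processor: ${name}`);
      entry.consecutiveFailures = ok ? 0 : entry.consecutiveFailures + 1;
    },
    // Returns the preferred healthy processor, or null if none is available
    // (the caller would then queue the payment for a later retry).
    pick() {
      for (const name of names) {
        if (health.get(name).consecutiveFailures < failureThreshold) return name;
      }
      return null;
    },
  };
}

const router = createProcessorRouter(["stripe", "adyen"], 3);
const beforeOutage = router.pick();
for (let i = 0; i < 3; i++) router.recordResult("stripe", false);
const duringOutage = router.pick();
router.recordResult("stripe", true); // e.g. a health-check probe succeeds
const afterRecovery = router.pick();
console.log(beforeOutage, duringOutage, afterRecovery);
```

A `null` result is the signal to buffer the payment in the queue rather than fail it outright.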
7.3 Fraud Prevention
Run real-time fraud scoring before authorization. Features include: transaction velocity, location mismatch, device fingerprint, transaction amount patterns. Use a cache for recent user transaction history to enable fast lookups.
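Transaction velocity is the simplest of these features to sketch. The example below keeps recent timestamps per user in memory, standing in for the cache mentioned above; the `createVelocityTracker` name, the 60-second window, and the limit of 3 transactions are illustrative assumptions, not recommended thresholds.

```javascript
// Counts a user's transactions inside a sliding time window.
function createVelocityTracker(windowMs, maxTransactions) {
  const history = new Map(); // userId -> timestamps (ms) within the window
  return function record(userId, nowMs) {
    // Drop timestamps that have fallen out of the window, then add this one.
    const recent = (history.get(userId) || []).filter((t) => nowMs - t < windowMs);
    recent.push(nowMs);
    history.set(userId, recent);
    return { count: recent.length, suspicious: recent.length > maxTransactions };
  };
}

const recordTransaction = createVelocityTracker(60_000, 3);
recordTransaction("user_123", 0);
recordTransaction("user_123", 1_000);
const third = recordTransaction("user_123", 2_000);
const fourth = recordTransaction("user_123", 3_000);  // exceeds the limit
const later = recordTransaction("user_123", 70_000);  // earlier entries have expired
console.log(third, fourth, later);
```

In practice the velocity signal would be one input to a fraud score alongside location, device, and amount patterns, rather than a hard block on its own.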
8. Frequently Asked Questions
Q1: How does idempotency prevent double charging?
Every payment request includes a unique idempotency_key (e.g., "order_12345_payment"). Before processing, the system checks whether this key has already been processed. If so, it returns the original response without charging again. The key should be stored atomically with the payment result, for example in the same database transaction, so the two can never disagree. Even if the client retries 10 times because of timeouts, the customer is charged only once.
Q2: What is the difference between authorization and capture?
Authorization verifies the card is valid and reserves (holds) the funds without transferring them. Capture actually moves the funds. This two-step process is essential for e-commerce: authorize when the order is placed, capture when the order ships. If the order is cancelled between authorization and capture, the hold is released without any money moving.
Q3: Why is double-entry bookkeeping important?
Double-entry bookkeeping ensures every debit has a corresponding credit. The sum of all entries for any transaction must equal zero, so money cannot appear or disappear without leaving an unbalanced transaction behind. This simplifies reconciliation, auditing, and debugging: whenever the sum for a transaction is not zero, a check on the invariant shows that something is wrong.
Q4: How do you handle partial failures in a payment flow?
Use a state machine for payment status. If the processor call succeeds but the database write fails, the reconciliation service will detect the discrepancy (processor shows captured, internal shows pending) and auto-correct. If the processor call times out, check the processor's status API before retrying (to avoid double-charging). The idempotency key on the processor side also prevents duplicates.
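The timeout case can be reduced to a small decision function. In this sketch the state names ("succeeded", "failed", "not_found") are hypothetical normalized values that an adapter would derive from the processor's status API, and the returned action names are likewise made up for illustration.

```javascript
// Decides what to do after a processor call timed out, based on what the
// processor's status lookup reports for that payment.
function nextActionAfterTimeout(processorState) {
  switch (processorState) {
    case "succeeded":
      return "record_success";      // the charge went through: update our records, do not retry
    case "failed":
      return "record_failure";      // the processor rejected it: mark the payment as failed
    case "not_found":
      return "retry_with_same_key"; // the request never arrived: safe to retry with the same idempotency key
    default:
      return "poll_again";          // still pending or unknown: wait and check again
  }
}

console.log(nextActionAfterTimeout("succeeded"));
console.log(nextActionAfterTimeout("not_found"));
```

Retrying blindly is absent from every branch: a retry happens only once the processor confirms it never saw the request, and even then it reuses the same idempotency key.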