System Design: WhatsApp (Chat App)

How to design a massive-scale chat application, with a focus on WebSocket architecture, end-to-end encryption, and offline message delivery.

Designing a Real-Time Chat System (WhatsApp/Telegram)

WhatsApp handles billions of messages per day. The core challenges are strict low latency, high delivery reliability (messages must not be lost), and privacy.

1. Requirements

Functional Requirements

  • 1:1 Chat: Real-time message delivery between two users.
  • Group Chat: Fan-out of messages to group members (WhatsApp's cap was 256 for years and has since been raised to 1,024).
  • Status: Online/Offline/Last Seen.
  • Delivery Receipts: Sent (one tick), Delivered (two ticks), Read (two blue ticks).
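
The three receipt states only ever move forward, which a small state helper can enforce. The following is a minimal sketch, assuming acknowledgements may arrive late or out of order; the names `Receipt` and `advance` are illustrative and not taken from any real client.

```python
from enum import IntEnum


class Receipt(IntEnum):
    """Delivery states, ordered the way a message passes through them."""
    SENT = 1       # server accepted the message (one tick)
    DELIVERED = 2  # recipient's device acknowledged it (two ticks)
    READ = 3       # recipient opened the chat (blue ticks)


def advance(current: Receipt, incoming: Receipt) -> Receipt:
    """Return the new state, ignoring stale or duplicate acknowledgements."""
    return max(current, incoming)


state = Receipt.SENT
state = advance(state, Receipt.READ)       # READ happens to arrive first
state = advance(state, Receipt.DELIVERED)  # a late DELIVERED must not downgrade it
print(state.name)  # READ
```

Taking the maximum keeps the update idempotent, so retried acknowledgements are harmless.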

Non-Functional Requirements

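  • Low Latency: p99 delivery latency under 100 ms when both users are online.
  • Availability: High; losing a single server should cost its users no more than a reconnect.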
  • Durability: Messages must persist until delivered.

2. High-Level Architecture

Plain HTTP polling is a poor fit: a message waits until the next poll, and the constant requests drain mobile batteries. Instead, each client keeps a persistent, bidirectional WebSocket connection open to the server.

Components

  1. Chat Service (Gateway): The stateful servers that together hold millions of open WebSocket connections.
  2. Presence Service: Tracks whether a user is Online or Offline using heartbeats.
  3. Group Service: Manages group memberships.
  4. Message Queue: Kafka/RabbitMQ to decouple message ingestion from delivery.
  5. Database: NoSQL (Cassandra/HBase) or NewSQL (CockroachDB).
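
To make the Presence Service's heartbeat idea concrete, here is a minimal in-memory sketch. It assumes a user counts as online while heartbeats keep arriving within a timeout; the class name `PresenceTracker`, the 30-second default, and the injectable clock are all illustrative choices, not details of any real deployment.

```python
import time


class PresenceTracker:
    """Treats a user as online while heartbeats arrive within `timeout` seconds."""

    def __init__(self, timeout: float = 30.0, clock=time.monotonic):
        self.timeout = timeout
        self.clock = clock   # injectable so the logic can be exercised without sleeping
        self.last_seen = {}  # user_id -> time of the most recent heartbeat

    def heartbeat(self, user_id: str) -> None:
        self.last_seen[user_id] = self.clock()

    def is_online(self, user_id: str) -> bool:
        seen = self.last_seen.get(user_id)
        return seen is not None and self.clock() - seen <= self.timeout


# Usage with a fake clock so the example is deterministic.
now = [0.0]
tracker = PresenceTracker(timeout=30.0, clock=lambda: now[0])
tracker.heartbeat("bob")
now[0] = 10.0
print(tracker.is_online("bob"))  # True
now[0] = 45.0
print(tracker.is_online("bob"))  # False
```

A production service would more likely keep this state in a shared store with per-key expiry instead of a local dictionary.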

3. Communication Protocol

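Why not raw WebSocket?

A raw WebSocket only carries opaque text or binary frames; it says nothing about message types, acknowledgements, or retries. We need a sub-protocol on top for structure.

  • Use XMPP?: Good for presence, but XML is verbose on the wire.
  • Use MQTT?: Lightweight, binary, and easy on mobile battery life. In practice, large messengers run compact binary protocols over long-lived TCP connections: Facebook Messenger has publicly described using MQTT, and WhatsApp is widely reported to use a binary-encoded variant of XMPP.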
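
To show what a compact binary sub-protocol looks like next to verbose XML, here is a small framing sketch. The header layout (1-byte type, 8-byte message ID, 2-byte payload length) is a made-up example, not the wire format of MQTT, XMPP, or WhatsApp.

```python
import struct

# Hypothetical header: 1-byte message type, 8-byte message ID, 2-byte payload length,
# all in network byte order.
HEADER = struct.Struct("!BQH")
TYPE_TEXT = 1


def encode(msg_type: int, msg_id: int, payload: bytes) -> bytes:
    """Prefix the payload with the fixed-size binary header."""
    return HEADER.pack(msg_type, msg_id, len(payload)) + payload


def decode(frame: bytes) -> tuple[int, int, bytes]:
    """Split a frame back into (type, id, payload), rejecting truncated input."""
    msg_type, msg_id, length = HEADER.unpack_from(frame)
    payload = frame[HEADER.size:HEADER.size + length]
    if len(payload) != length:
        raise ValueError("truncated frame")
    return msg_type, msg_id, payload


frame = encode(TYPE_TEXT, 42, "Hey, are you free for coffee?".encode("utf-8"))
print(HEADER.size)    # 11 bytes of overhead per message
print(decode(frame))
```

The fixed 11-byte header is the whole per-message overhead here, which is the kind of saving that makes binary protocols attractive on mobile networks.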

4. Message Flow

[Figure: Message Delivery Lifecycle. A message travels from Alice to the WhatsApp Server to Bob and passes through three states: 1. Sent to Server, 2. Delivered to Device, 3. Read by User.]

Scenario 1: User A sends to User B (Both Online)

  1. User A sends message to Chat Server 1 (via persistent WebSocket).
  2. Chat Server 1 acknowledges receipt to A ("Sent" tick).
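  3. Chat Server 1 asks the Discovery Service (a registry that maps each online user to the chat server holding their connection): "Which server holds User B's connection?"
  4. The Discovery Service replies: "User B is on Chat Server 5."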
  5. Chat Server 1 forwards message to Chat Server 5 (via RPC or Redis Pub/Sub).
  6. Chat Server 5 pushes message to User B (via WebSocket).
  7. User B sends ACK.
  8. Chat Server 5 forwards the ACK to Chat Server 1, which relays it to A ("Delivered" tick).
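
The lookup-and-forward steps above can be sketched with an in-memory registry. This is a simplified illustration: `SessionRegistry` stands in for the Discovery Service, `ChatServer.inbox` stands in for real WebSocket pushes, and all names are hypothetical.

```python
class ChatServer:
    """Stand-in for a gateway node; `inbox` replaces real WebSocket pushes."""

    def __init__(self, name: str):
        self.name = name
        self.inbox = []  # (recipient, text) pairs pushed to connected clients

    def push(self, recipient: str, text: str) -> None:
        self.inbox.append((recipient, text))


class SessionRegistry:
    """Plays the Discovery Service role: user ID -> server holding the connection."""

    def __init__(self):
        self._location = {}

    def connect(self, user_id: str, server: ChatServer) -> None:
        self._location[user_id] = server

    def disconnect(self, user_id: str) -> None:
        self._location.pop(user_id, None)

    def lookup(self, user_id: str):
        return self._location.get(user_id)


def route(registry: SessionRegistry, recipient: str, text: str) -> bool:
    """Forward to the recipient's server; return False if they are offline."""
    server = registry.lookup(recipient)
    if server is None:
        return False
    server.push(recipient, text)
    return True


registry = SessionRegistry()
server5 = ChatServer("chat-5")
registry.connect("bob", server5)
print(route(registry, "bob", "Hey"))           # True
registry.disconnect("bob")
print(route(registry, "bob", "Still there?"))  # False
```

A `False` result is the signal to fall back to the offline path described in Scenario 2.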

Scenario 2: User B is Offline

  1. Chat Server 1 fails to find an active connection for B.
  2. Chat Server 1 stores the message in the Unread Database (or Queue).
  3. The Push Notification Service (APNs/FCM) is triggered to wake up B's phone.
  4. When B comes online next time, they sync with the Unread Database.
  5. After sync, messages are deleted from server (WhatsApp Architecture) or kept (Telegram Architecture).
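
The offline path can be sketched as a per-recipient queue that is cleared only after the device confirms receipt. This is a minimal in-memory sketch: `OfflineStore` stands in for the Unread Database, and the `notify` callback is a hypothetical hook standing in for APNs/FCM.

```python
from collections import defaultdict, deque


class OfflineStore:
    """In-memory stand-in for the Unread Database, keyed by recipient."""

    def __init__(self, notify):
        self._pending = defaultdict(deque)
        self._notify = notify  # hypothetical hook standing in for APNs/FCM

    def store(self, recipient, message):
        """Queue the message and trigger a push notification."""
        self._pending[recipient].append(message)
        self._notify(recipient)

    def sync(self, recipient):
        """Return pending messages in arrival order without deleting them yet."""
        return list(self._pending[recipient])

    def ack(self, recipient, count):
        """Delete the first `count` messages once the device confirms receipt."""
        queue = self._pending[recipient]
        for _ in range(min(count, len(queue))):
            queue.popleft()


pushes = []
store = OfflineStore(notify=pushes.append)
store.store("bob", "Hey, are you free for coffee?")
store.store("bob", "Ping me when you're back")
batch = store.sync("bob")     # B reconnects and pulls the backlog
store.ack("bob", len(batch))  # WhatsApp-style: delete after confirmed delivery
print(len(batch), len(store.sync("bob")))  # 2 0
```

Separating `sync` from `ack` keeps messages durable if the connection drops mid-sync. A Telegram-style design would simply skip the `ack` deletion.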

5. End-to-End Encryption (E2E)

WhatsApp servers cannot read your messages. How?

The Signal Protocol

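The Signal Protocol combines X3DH key agreement (to set up a session) with the Double Ratchet algorithm (to rotate keys as messages flow).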

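  1. Public Keys: When User A registers, their device generates key pairs locally and uploads only the public halves to the server: a long-term "Identity Key" and a bundle of "Pre-Keys". Private keys never leave the device.
  2. Session Setup: When A wants to text B for the first time:
    • A downloads B's pre-key bundle from the server.
    • A derives a shared secret Session Key locally on their device (X3DH Key Agreement). B derives the same secret from A's first message, so the server never sees it.
  3. Message Encryption: A encrypts message M with a key derived from the Session Key.
  4. Forward Secrecy: The Double Ratchet derives a fresh key for every single message and discards old ones. If a hacker steals your current keys tomorrow, they cannot decrypt messages from yesterday.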
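
The "key changes for every message" property comes from a one-way key chain. The sketch below shows only that symmetric chain step, using HMAC-SHA256 from the standard library. It is an illustration under simplifying assumptions: the starting secret is a placeholder for what X3DH would produce, the Diffie-Hellman half of the Double Ratchet is omitted, and no message is actually encrypted.

```python
import hashlib
import hmac


def ratchet_step(chain_key: bytes) -> tuple[bytes, bytes]:
    """Derive (next_chain_key, message_key) from the current chain key.

    HMAC is one-way, so holding a later chain key does not reveal earlier
    chain keys or the message keys derived from them.
    """
    message_key = hmac.new(chain_key, b"\x01", hashlib.sha256).digest()
    next_chain_key = hmac.new(chain_key, b"\x02", hashlib.sha256).digest()
    return next_chain_key, message_key


# Placeholder for the shared secret that X3DH would produce; not a real key.
chain_key = hashlib.sha256(b"demo shared secret").digest()

message_keys = []
for _ in range(3):
    chain_key, message_key = ratchet_step(chain_key)  # old chain key is overwritten
    message_keys.append(message_key)

print(len(set(message_keys)))  # 3 distinct keys, one per message
```

Because each side overwrites the old chain key after every step, a device compromised later no longer holds the material needed to recompute earlier message keys.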

6. Group Chat Optimization

Consider sending one message to a group of 500 people.

Naive: Client-Side Fan-out

User A sends 500 individual messages.

  • Bad: Kills User A's bandwidth/battery.

Improved: Server-Side Fan-out

User A sends 1 message to the Server. The Server looks up the Group Members (500 IDs), then loops over them and sends 500 messages.

  • Issue: A single serial loop is slow, so the last member receives the message later than the first.

Advanced: Hybrid / Multicast

Large groups are sharded. The server publishes the message once to a "Group Channel" in a Pub/Sub system (Kafka). Consumer workers each pick up one shard of the group's members and push to that shard in parallel, retrying failed deliveries.
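
The sharded fan-out can be sketched with a thread pool standing in for the Kafka consumer workers. This is a toy model: `shard_members`, `fan_out`, and the `send` callback are hypothetical names, and a real system would partition by topic and handle retries.

```python
from concurrent.futures import ThreadPoolExecutor


def shard_members(member_ids, shard_count):
    """Split group members into `shard_count` roughly equal shards (round-robin)."""
    shards = [[] for _ in range(shard_count)]
    for index, member in enumerate(member_ids):
        shards[index % shard_count].append(member)
    return shards


def fan_out(message, member_ids, shard_count, send):
    """Deliver `message` to every member, one worker per shard, in parallel."""
    shards = shard_members(member_ids, shard_count)

    def worker(shard):
        for member in shard:
            send(member, message)  # stand-in for a push over the member's connection
        return len(shard)

    with ThreadPoolExecutor(max_workers=shard_count) as pool:
        return sum(pool.map(worker, shards))


members = [f"user{i}" for i in range(500)]
delivered = []
total = fan_out("hello group", members, 8, lambda member, msg: delivered.append(member))
print(total)  # 500
```

With 8 shards, each worker handles about 63 members instead of one loop walking all 500.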

Summary

| Feature | Technology |
| --- | --- |
| Protocol | MQTT / WebSocket |
| Database | Cassandra / HBase (Write Heavy) |
| Offline | Delayed Queue + Push Notifications |
| Encryption | Signal Protocol (Double Ratchet) |

Educational Disclaimer: The architectural patterns and system designs discussed in this article are based on common industry practices, technical whitepapers, and public engineering blogs. Actual implementations in enterprise environments may vary significantly based on specific product requirements, legacy constraints, and evolving technologies.