
#1 Cloud Architectures Uncovered - Chat System: Overview

Imagine you’re building WhatsApp from scratch. Millions of users. Real-time chat. Instant delivery. How do you architect that in the cloud?


In this post, we’ll break down a battle-tested AWS architecture (from the ‘This is My Architecture’ series) and see how it solves the challenges of building a modern chat platform.

Some of the Core Requirements of a Modern Chat System:

  • Scaling to millions of users

  • Real-time bidirectional messaging

  • Data persistence

  • Reliability and availability

  • Security & compliance

The AWS Stack

  • Amazon EKS (REST & WebSocket) — Hosts microservices that handle API and real-time communication.

  • Amazon EKS (Chat & Notifications) — Microservice containing all the business logic for the chat service

  • Amazon MSK — Streams chat events and messages asynchronously to consumers that each handle a separate task, such as spam detection, profanity filtering, or fraud detection.

  • Amazon DynamoDB — Primarily stores user chat messages, since it offers high write throughput and scales horizontally without manual sharding.

  • Amazon Aurora — Stores relational data, such as the chat metadata needed to display the inbox. The workload is mostly reads.

  • Amazon ElastiCache for Redis — Provides low-latency caching for frequently accessed data (e.g., users’ inboxes)

  • Amazon Pinpoint — Sends push notifications or messages to users in case they are offline

🧩 How It All Comes Together


Imagine a user named Anant (1) opens the chat application and navigates to a conversation with Shikhar (9).

At this point, a WebSocket connection is established for Anant.
When he types and sends a message, the data travels over this WebSocket connection to a WebSocket Gateway microservice hosted in EKS (2).
This service performs lightweight tasks like authentication, authorisation, and connection management.
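
To make the gateway's job concrete, here is a minimal in-memory sketch of authentication plus connection management. Everything in it is an assumption for illustration: `VALID_TOKENS`, `verify_token`, and `ConnectionRegistry` are hypothetical names, and a real gateway would verify a signed token and hold actual WebSocket objects rather than plain callables.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional

# Hypothetical token table; a real gateway would verify a signed token instead.
VALID_TOKENS = {"token-anant": "anant", "token-shikhar": "shikhar"}


def verify_token(token: str) -> Optional[str]:
    """Return the user id for a valid token, or None if authentication fails."""
    return VALID_TOKENS.get(token)


@dataclass
class ConnectionRegistry:
    """Tracks which users currently have an open connection to this gateway."""

    # user id -> callable that writes a payload to that user's socket
    _senders: Dict[str, Callable[[str], None]] = field(default_factory=dict)

    def connect(self, token: str, send: Callable[[str], None]) -> Optional[str]:
        user_id = verify_token(token)
        if user_id is None:
            return None  # reject unauthenticated connections
        self._senders[user_id] = send
        return user_id

    def disconnect(self, user_id: str) -> None:
        self._senders.pop(user_id, None)

    def is_online(self, user_id: str) -> bool:
        return user_id in self._senders

    def send_to(self, user_id: str, payload: str) -> bool:
        """Push a payload to the user's socket; False means no open connection."""
        send = self._senders.get(user_id)
        if send is None:
            return False
        send(payload)
        return True
```

Keeping the registry this thin matches the idea that the gateway stays lightweight: it only knows who is connected, and leaves routing and persistence to the service behind it.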

Once validated, the message is handed off to a dedicated Chat Microservice (3). This service is responsible for:

  • Routing the message to the correct recipient,

  • Persisting the message,

  • And delivering it in real time if the recipient is online.

If Shikhar already has an active WebSocket connection, the Chat Service pushes the message directly to him over that connection.
If he’s offline (i.e., no open WebSocket), the Chat Service:

  • Stores the message in DynamoDB, marking it as unread.

  • Triggers an Amazon Pinpoint (4) call to send a push notification alerting Shikhar of the new message.

When Shikhar later comes online and his WebSocket is established, the service immediately pushes the pending messages.
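
The online/offline branching above can be sketched in a few lines. This is an in-memory illustration under stated assumptions, not the real AWS integration: `ChatService`, `send_message`, `on_connect`, and `on_disconnect` are hypothetical names, the `unread` dictionary stands in for DynamoDB, and the `notify` callable stands in for a push-notification call. The sketch only models the pending queue, not the full message history.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class ChatService:
    """In-memory stand-in for the delivery logic; not the real AWS integration."""

    def __init__(self, notify: Callable[[str, str], None]) -> None:
        # user -> callable that writes to that user's open WebSocket
        self.online: Dict[str, Callable[[dict], None]] = {}
        # user -> pending messages (stand-in for unread items in DynamoDB)
        self.unread: Dict[str, List[dict]] = defaultdict(list)
        # stand-in for a push-notification call
        self.notify = notify

    def send_message(self, sender: str, recipient: str, text: str) -> str:
        message = {"from": sender, "to": recipient, "text": text}
        push = self.online.get(recipient)
        if push is not None:
            push(message)  # recipient has an open WebSocket
            return "delivered"
        self.unread[recipient].append(message)  # keep until the user returns
        self.notify(recipient, f"New message from {sender}")
        return "queued"

    def on_connect(self, user: str, push: Callable[[dict], None]) -> None:
        self.online[user] = push
        # Flush pending messages in the order they arrived.
        for message in self.unread.pop(user, []):
            push(message)

    def on_disconnect(self, user: str) -> None:
        self.online.pop(user, None)
```

For example, a message sent to a disconnected user returns `"queued"` and fires one notification; once that user calls `on_connect`, the queued message is pushed before any new traffic.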

💾 How data is stored: CQRS pattern in action


This architecture follows the CQRS (Command Query Responsibility Segregation) pattern:

  • Writes (Commands): All message data (text, timestamp, sender, recipient, status) is stored in Amazon DynamoDB (5), which is optimised for high-throughput write operations and low-latency access.

  • Reads (Queries): Metadata like conversation summaries, unread counts, and inbox views are stored in Amazon Aurora (6). Aurora’s SQL capabilities make it ideal for structured, complex queries and analytics.
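
The split between the two bullets above can be shown with a small sketch. It is only an illustration under assumptions: a dictionary stands in for DynamoDB, an in-memory SQLite database stands in for Aurora, and the `inbox` table, `handle_send`, and `query_inbox` are hypothetical names. In a real system the projection into the read model would more likely happen asynchronously than inside the write call.

```python
import sqlite3
from collections import defaultdict

# Write side: append-only message log keyed by conversation (stand-in for DynamoDB).
message_log = defaultdict(list)

# Read side: a relational inbox summary (in-memory SQLite standing in for Aurora).
read_db = sqlite3.connect(":memory:")
read_db.execute(
    """CREATE TABLE inbox (
           user_id TEXT, conversation_id TEXT, last_text TEXT,
           unread_count INTEGER, updated_at INTEGER,
           PRIMARY KEY (user_id, conversation_id))"""
)


def handle_send(conversation_id, sender, recipient, text, ts):
    """Command: append the message, then project it into the read model."""
    message_log[conversation_id].append({"sender": sender, "text": text, "ts": ts})
    cursor = read_db.execute(
        "UPDATE inbox SET last_text = ?, unread_count = unread_count + 1, "
        "updated_at = ? WHERE user_id = ? AND conversation_id = ?",
        (text, ts, recipient, conversation_id),
    )
    if cursor.rowcount == 0:  # first message in this conversation for the recipient
        read_db.execute(
            "INSERT INTO inbox VALUES (?, ?, ?, 1, ?)",
            (recipient, conversation_id, text, ts),
        )


def query_inbox(user_id):
    """Query: read the inbox view without touching the message log."""
    rows = read_db.execute(
        "SELECT conversation_id, last_text, unread_count FROM inbox "
        "WHERE user_id = ? ORDER BY updated_at DESC",
        (user_id,),
    )
    return rows.fetchall()
```

The point of the pattern is visible in `query_inbox`: rendering the inbox never scans raw messages, so the write path and the read path can be scaled and tuned separately.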

So when Anant opens his inbox, a REST API call fetches data from Aurora, often cached in Redis (7) for even faster retrieval.
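
One common way to combine a cache with a database for this kind of read is the cache-aside pattern. The sketch below assumes that pattern; the post does not say which caching strategy the architecture uses. `TTLCache` is a hypothetical in-process stand-in for Redis, and `get_inbox`, the `inbox:<user>` key format, and the 30-second TTL are illustrative choices.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Tiny in-process cache with expiry; a stand-in for a Redis-style cache."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}  # key -> (expires_at, value)

    def get(self, key: str) -> Any:
        entry = self._entries.get(key)
        if entry is None or entry[0] <= self._clock():
            self._entries.pop(key, None)  # drop expired entries lazily
            return None
        return entry[1]

    def set(self, key: str, value: Any, ttl_seconds: float) -> None:
        self._entries[key] = (self._clock() + ttl_seconds, value)

    def delete(self, key: str) -> None:
        self._entries.pop(key, None)


def get_inbox(user_id, cache, load_from_db, ttl_seconds=30.0):
    """Cache-aside read: try the cache, fall back to the database, then populate."""
    key = f"inbox:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached
    inbox = load_from_db(user_id)  # e.g. the SQL query against the read model
    cache.set(key, inbox, ttl_seconds)
    return inbox
```

A short TTL keeps the inbox reasonably fresh without explicit invalidation; calling `cache.delete` when a new message arrives would be the stricter alternative.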

When he opens a specific chat with Shikhar, recent messages are pulled from DynamoDB, while older messages may be fetched from Aurora, creating a seamless experience.
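
Fetching "recent messages" efficiently depends on how the messages are keyed. The post does not describe the table design, so the sketch below assumes a common layout: the conversation id as the partition key and the message timestamp as the sort key. `MessageStore` is a hypothetical in-memory stand-in, not DynamoDB itself.

```python
import bisect
from collections import defaultdict
from typing import Dict, List, Optional, Tuple


class MessageStore:
    """Stand-in for a table partitioned by conversation and sorted by timestamp."""

    def __init__(self) -> None:
        # conversation_id (partition key) -> list of (ts, text), kept sorted by ts
        self._partitions: Dict[str, List[Tuple[int, str]]] = defaultdict(list)

    def put(self, conversation_id: str, ts: int, text: str) -> None:
        bisect.insort(self._partitions[conversation_id], (ts, text))

    def recent(
        self, conversation_id: str, limit: int, before_ts: Optional[int] = None
    ) -> List[Tuple[int, str]]:
        """Return up to `limit` messages, newest first, strictly older than before_ts."""
        items = self._partitions.get(conversation_id, [])
        if before_ts is None:
            end = len(items)
        else:
            # (before_ts,) sorts ahead of any (before_ts, text), so this finds
            # the first message whose timestamp is >= before_ts.
            end = bisect.bisect_left(items, (before_ts,))
        return list(reversed(items[max(0, end - limit):end]))
```

Passing the oldest timestamp already shown as `before_ts` gives simple backwards pagination as the user scrolls up through a conversation.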

Real-time stream processing

Every time a message is processed, the Chat Service also publishes an event to Amazon MSK (Amazon Managed Streaming for Apache Kafka) (8). These streaming events can power multiple downstream systems, like:

  • Spam detection

  • Profanity filtering

  • Analytics and audit logging

Consumers can independently subscribe to these topics and act on them without affecting the core chat flow.
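
Why consumers don't interfere with each other is easier to see in code. The sketch below is a toy model loosely based on how Kafka consumer groups track their own offsets; it is not the Kafka client API. `Topic`, `poll`, the group names, and the `BLOCKED_WORDS` list are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


class Topic:
    """Append-only log with per-group offsets, loosely modelled on Kafka."""

    def __init__(self) -> None:
        self._log: List[dict] = []
        self._offsets: Dict[str, int] = defaultdict(int)  # group -> next index to read

    def publish(self, event: dict) -> None:
        self._log.append(event)  # the producer never waits for consumers

    def poll(self, group: str) -> List[dict]:
        """Return events this group has not seen yet and advance its offset."""
        start = self._offsets[group]
        events = self._log[start:]
        self._offsets[group] = len(self._log)
        return events


BLOCKED_WORDS = {"darn"}  # hypothetical word list for illustration


def profanity_consumer(topic: Topic) -> List[int]:
    """Return ids of new messages containing a blocked word."""
    return [
        event["id"]
        for event in topic.poll("profanity")
        if BLOCKED_WORDS & set(event["text"].lower().split())
    ]


def audit_consumer(topic: Topic) -> List[int]:
    """Return ids of every new message, as an audit log would record them."""
    return [event["id"] for event in topic.poll("audit")]
```

Because each group keeps its own offset, a slow or failing profanity filter falls behind on its own without delaying the audit consumer or the chat flow that published the events.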

What’s coming next:

In the next article of this series, we’ll dive deep into each AWS service that powers this chat system architecture — understanding why it’s used, how it’s configured, and what makes it all click.

So stay tuned, and don’t forget to subscribe to our page to follow the full journey!