HLD Advanced Concepts
Communication Protocols & APIs
The communication protocol you choose shapes the latency, payload size, and coupling of a distributed system:
- REST: Stateless, cacheable HTTP requests. Uses standard verbs (GET, POST). Best for public CRUD APIs.
- GraphQL: Single endpoint API where the client specifies exactly what data it needs. Solves the over-fetching and under-fetching problems of REST.
- gRPC: Built on HTTP/2. Uses Protocol Buffers (Protobuf) for compact binary serialization, which typically yields smaller payloads and lower latency than JSON over HTTP/1.1. Best for internal microservice-to-microservice communication.
- WebSockets & Server-Sent Events (SSE): WebSockets provide full-duplex streams over a single long-lived connection, so both sides can send at any time (e.g. chat apps, live gaming). SSE provides one-way server-to-client streams over ordinary HTTP (e.g. stock tickers).
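To make the SSE bullet concrete, here is a minimal sketch of how a server might frame messages in the text/event-stream format that SSE uses. The helper names (format_sse, ticker_stream) and the "tick" event name are illustrative assumptions, not part of any particular framework.

```python
from typing import Dict, Iterator, Optional


def format_sse(data: str, event: Optional[str] = None,
               event_id: Optional[str] = None) -> str:
    """Frame one message in the text/event-stream wire format."""
    lines = []
    if event_id is not None:
        lines.append(f"id: {event_id}")
    if event is not None:
        lines.append(f"event: {event}")
    # Each line of a multi-line payload gets its own "data:" field.
    for part in data.split("\n"):
        lines.append(f"data: {part}")
    # A blank line marks the end of the event.
    return "\n".join(lines) + "\n\n"


def ticker_stream(prices: Dict[str, float]) -> Iterator[str]:
    """Hypothetical stock ticker: yields one framed event per symbol."""
    for n, (symbol, price) in enumerate(prices.items(), start=1):
        yield format_sse(f"{symbol} {price}", event="tick", event_id=str(n))
```

A web framework would write each yielded string to a response served with the Content-Type text/event-stream. The client only reads; anything it wants to send goes over a separate request, which is the practical difference from a WebSocket.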
Reliability & Fault Tolerance
Distributed systems must assume failure is inevitable. Implementing fault tolerance prevents cascading system outages:
- Retry with Exponential Backoff: Retrying failed requests immediately can overload a recovering service. Exponential backoff (e.g. wait 1s, 2s, 4s, 8s) gives the service time to recover. Adding jitter (a random variation in each delay) keeps many clients from retrying in lockstep and hitting the service in synchronized waves.
- Circuit Breaker: If a downstream service fails repeatedly, the circuit breaker 'trips' (opens) and immediately returns an error without calling the broken service. After a timeout, it 'half-opens' and lets a trial request through: a success closes the circuit again, while a failure re-opens it.
- Bulkhead Pattern: Isolates resources (such as connection pools or thread pools) per downstream service so that a failure in one service cannot drain the resources available to the others.
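The first two patterns above can be sketched in a few lines of Python. This is a minimal illustration, not a production library: the names (backoff_delays, retry, CircuitBreaker), the thresholds, and the "full jitter" strategy (a uniform random delay between zero and the exponential cap) are assumptions chosen for the example.

```python
import random
import time


def backoff_delays(base=1.0, factor=2.0, max_delay=30.0, attempts=4,
                   rng=random.random):
    """Yield jittered delays: uniform in [0, min(max_delay, base * factor**n))."""
    for n in range(attempts):
        cap = min(max_delay, base * factor ** n)
        yield rng() * cap


def retry(call, attempts=4, sleep=time.sleep, **backoff):
    """Call `call` up to `attempts` times, sleeping between failures."""
    for delay in backoff_delays(attempts=attempts - 1, **backoff):
        try:
            return call()
        except Exception:  # real code should retry only transient errors
            sleep(delay)
    return call()  # final attempt: any error propagates to the caller


class CircuitOpenError(RuntimeError):
    """Raised when the breaker fails fast instead of calling downstream."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    @property
    def state(self):
        if self.opened_at is None:
            return "closed"
        if self.clock() - self.opened_at >= self.reset_timeout:
            return "half-open"
        return "open"

    def call(self, fn):
        state = self.state
        if state == "open":
            raise CircuitOpenError("failing fast; downstream not called")
        try:
            result = fn()
        except Exception:
            self.failures += 1
            # A failed trial request, or too many failures while closed,
            # opens the circuit and restarts the timeout.
            if state == "half-open" or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        # Any success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

Passing the clock, the sleep function, and the random source as parameters is a design choice that lets the behaviour be checked without real waiting. This sketch is not thread-safe, and a real service would usually wrap retries inside the breaker so that retries stop once the circuit opens.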
Data Processing: Batch vs Stream
Massive analytics workloads require specialized data pipelines:
- Batch Processing (Hadoop, Spark): Processes large, bounded datasets at scheduled intervals (e.g. nightly reports). High latency, high throughput.
- Stream Processing (Kafka Streams, Flink): Processes continuous, unbounded data in real-time (e.g. fraud detection during a swipe). Low latency.
- Lambda Architecture: Maintains two parallel pipelines: a Batch layer for accurate historical views, and a Speed (Stream) layer for real-time views. The serving layer merges them.
- Kappa Architecture: Simplifies Lambda by removing the Batch layer entirely. Treats everything as an event stream, replaying historical logs when re-computation is needed.
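The difference between the two processing models, and the replay idea behind Kappa, can be illustrated with a toy aggregation in plain Python. The event shape (card_id, amount) and the function names are assumptions made for this sketch. Real systems such as Spark or Flink distribute this work across many machines.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

Event = Tuple[str, float]  # (card_id, amount): hypothetical event shape


def batch_totals(events: List[Event]) -> Dict[str, float]:
    """Batch: the whole bounded dataset exists before processing starts."""
    totals: Dict[str, float] = defaultdict(float)
    for card, amount in events:
        totals[card] += amount
    return dict(totals)


def stream_totals(events: Iterable[Event]) -> Iterator[Tuple[str, float]]:
    """Stream: emit an updated result after every event, keeping only running state."""
    totals: Dict[str, float] = defaultdict(float)
    for card, amount in events:
        totals[card] += amount
        yield card, totals[card]


def replay(log: List[Event]) -> Dict[str, float]:
    """Kappa-style recomputation: rerun the same stream logic over the stored log."""
    state: Dict[str, float] = {}
    for card, total in stream_totals(log):
        state[card] = total
    return state
```

The batch function gives one answer when the run finishes. The stream function gives an answer after each event, which is what makes mid-transaction checks possible. Replaying the log through the stream logic reproduces the batch result, which is why Kappa can drop the separate batch layer as long as the log is retained.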
Security: Authentication & Rate Limiting
Security at scale cannot rely on simple session cookies shared across hundreds of microservices. Modern HLD security includes:
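- OAuth 2.0 & OIDC: OAuth 2.0 is the standard framework for delegated authorization (e.g. letting a third-party app read your calendar without ever seeing your password). OpenID Connect (OIDC) adds an identity layer on top of OAuth 2.0, which is what powers sign-in flows such as 'Login with Google'.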
- Mutual TLS (mTLS): Requires both the client and the server to authenticate each other using certificates. Critical for Zero Trust internal microservice networks.
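As a rough sketch of what mTLS means in configuration terms, the following uses Python's standard ssl module. The function names and the idea of a single internal CA file are assumptions for illustration. In practice a service mesh or sidecar proxy often handles this instead of application code.

```python
import ssl


def mtls_server_context() -> ssl.SSLContext:
    """Server-side context that rejects clients without a valid certificate."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    # CERT_REQUIRED is what makes the TLS handshake mutual: the server
    # demands a client certificate and aborts the handshake without one.
    ctx.verify_mode = ssl.CERT_REQUIRED
    return ctx


def load_identity(ctx: ssl.SSLContext, cert_file: str, key_file: str,
                  ca_file: str) -> ssl.SSLContext:
    """Attach this service's own certificate and the CA used to verify peers."""
    ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    ctx.load_verify_locations(cafile=ca_file)
    return ctx


def mtls_client_context(cert_file: str, key_file: str,
                        ca_file: str) -> ssl.SSLContext:
    """Client-side context: verifies the server and presents its own certificate."""
    ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=ca_file)
    ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return ctx
```

A server would call load_identity(mtls_server_context(), ...) with its own certificate paths and then wrap its listening socket with the resulting context. The file paths are deployment-specific and deliberately left to the caller.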