Streaming Interview Guide

Data streaming with Kinesis and MSK, the messaging decision framework, real-time analytics pipelines, and video streaming architectures.

6 Topics
Intermediate

SQS vs SNS vs Kinesis vs EventBridge

The single most-asked messaging question in architect interviews is knowing when to use which service:

| Aspect | SQS | SNS | Kinesis Data Streams | EventBridge |
| --- | --- | --- | --- | --- |
| Pattern | Queue (point-to-point) | Pub/sub (fan-out) | Ordered streaming | Event routing |
| Ordering | FIFO queues only (~3,000 msg/s with batching) | No | Yes (per shard) | No |
| Retention | Up to 14 days | None (fire-and-forget) | 1–365 days | Archive (retention configurable, up to indefinite) |
| Replay | No | No | Yes (from any position) | Yes (archive replay) |
| Consumers | Single consumer (or fan-out via one SQS queue per consumer) | Multiple subscribers | Multiple consumers (KCL, Lambda, Flink) | Multiple targets per rule |
| Throughput | Nearly unlimited (standard queues) | Millions/sec | 1 MB/s write, 2 MB/s read per shard | 2,400 events/sec default (soft limit, varies by region) |
| Best For | Work queues, decoupling, buffering | Simple fan-out, notifications | Real-time analytics, clickstream, IoT, log aggregation | Microservice events, SaaS integration, AWS service events |

Decision Framework

  • "I need to decouple two services" β†’ SQS
  • "One event, multiple consumers" β†’ SNS (simple) or EventBridge (with content filtering)
  • "I need ordering + replay + high throughput" β†’ Kinesis Data Streams
  • "I need smart event routing with filtering" β†’ EventBridge
  • "I need the Kafka ecosystem" β†’ Amazon MSK

🎯 Key Takeaway

Interview tip: "I choose the messaging service based on the pattern: SQS for work queues and decoupling, SNS for simple fan-out, Kinesis for ordered high-throughput streaming with replay, and EventBridge for smart event routing between microservices. The key differentiator is whether you need ordering (Kinesis), content-based filtering (EventBridge), or simple queue semantics (SQS)."

Intermediate

Kinesis Data Streams & Firehose

Kinesis is the core data streaming service on AWS. Understand the difference between Streams and Firehose:

Kinesis Data Streams vs Firehose

| Aspect | Kinesis Data Streams | Kinesis Data Firehose |
| --- | --- | --- |
| Purpose | Real-time data streaming with custom consumers | Managed data delivery to destinations |
| Latency | ~200 ms (real-time) | 60-second minimum buffer (near-real-time) |
| Consumers | Custom (KCL, Lambda, Flink); you write the processing logic | Auto-delivers to S3, Redshift, OpenSearch, Splunk |
| Scaling | Manual shard management (or on-demand mode) | Fully automatic |
| Replay | Yes (1–365 day retention, seek to any position) | No replay capability |
| Transform | You build consumers | Built-in Lambda transform, format conversion (JSON→Parquet), compression |
| Pricing | Per shard-hour ($0.015/hr) plus PUT payload units | Per GB ingested ($0.029/GB) |
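The "Transform" row is worth being able to sketch. A Firehose transformation Lambda receives base64-encoded records and must return each recordId with a result of Ok, Dropped, or ProcessingFailed. A minimal, locally runnable sketch; the log-level filter is an invented example:

```python
import base64
import json

def lambda_handler(event, context):
    """Firehose transformation Lambda: decode each record, keep only
    ERROR-level log lines, and append the newline delimiter that
    Firehose does not add between JSON records."""
    output = []
    for record in event["records"]:
        payload = json.loads(base64.b64decode(record["data"]))
        if payload.get("level") == "ERROR":
            data = (json.dumps(payload) + "\n").encode()
            output.append({"recordId": record["recordId"],
                           "result": "Ok",
                           "data": base64.b64encode(data).decode()})
        else:
            # Dropped records must still be acknowledged by recordId.
            output.append({"recordId": record["recordId"],
                           "result": "Dropped",
                           "data": record["data"]})
    return {"records": output}

# Local smoke test with a hand-built event shape (no AWS needed).
fake_event = {"records": [
    {"recordId": "1", "data": base64.b64encode(
        json.dumps({"level": "ERROR", "msg": "boom"}).encode()).decode()},
    {"recordId": "2", "data": base64.b64encode(
        json.dumps({"level": "INFO", "msg": "ok"}).encode()).decode()},
]}
result = lambda_handler(fake_event, None)
print([r["result"] for r in result["records"]])  # ['Ok', 'Dropped']
```

Returning "Dropped" (rather than omitting the record) is what tells Firehose the record was filtered intentionally instead of lost.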

Shard Architecture

  • Each shard: 1 MB/s write; 2 MB/s read, capped at 5 read transactions/sec (shared across all consumers)
  • Enhanced Fan-Out: 2 MB/s per consumer per shard (dedicated throughput). Use when you have 3+ consumers.
  • Partition Key: Determines which shard gets the record. Use high-cardinality keys (user_id) to avoid hot shards.
  • On-Demand Mode (2021): Auto-scales shards, no capacity planning. Pay per GB. Best for unpredictable traffic.
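Kinesis routes a record by taking the MD5 hash of its partition key as a 128-bit integer and finding the shard whose hash-key range contains it. A sketch of that mapping, assuming the shards evenly split the hash space:

```python
import hashlib

NUM_HASH_KEYS = 2 ** 128  # Kinesis hashes partition keys into a 128-bit space

def shard_for_key(partition_key: str, shard_count: int) -> int:
    """Map a partition key to a shard index, assuming the shards evenly
    split the 128-bit hash-key space (as after a uniform initial split)."""
    h = int.from_bytes(hashlib.md5(partition_key.encode("utf-8")).digest(), "big")
    return h * shard_count // NUM_HASH_KEYS

# High-cardinality keys (user_id) spread load evenly; a constant key
# would send every record to one shard (a hot shard).
counts = [0] * 4
for user_id in range(10_000):
    counts[shard_for_key(f"user-{user_id}", 4)] += 1
print(counts)  # roughly 2,500 per shard
```

This also explains why a low-cardinality key creates a hot shard: every record with the same key lands on the same shard, no matter how many shards exist.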

🎯 Key Takeaway

Interview tip: "Kinesis Data Streams and Firehose solve different problems. Streams is for real-time processing where I need custom consumers, ordering, and replay β€” like processing clickstream data with Flink. Firehose is for managed delivery β€” like landing log data in S3 as compressed Parquet files with automatic format conversion. I often use both: Streams for real-time processing, feeding into Firehose for the batch/archive path."

Advanced

Amazon MSK (Managed Kafka)

Amazon MSK is managed Apache Kafka on AWS. Choose it when you need the Kafka ecosystem:

MSK vs Kinesis: Decision Framework

| Factor | Choose MSK When | Choose Kinesis When |
| --- | --- | --- |
| Existing expertise | Team already knows Kafka | Team prefers AWS-native, simpler APIs |
| Ecosystem | Need Kafka Connect, Kafka Streams, ksqlDB, Schema Registry | Need tight Lambda/Firehose integration |
| Retention | Unlimited (tiered storage to S3) | 1–365 days |
| Operations | More operational control (broker config, partition tuning) | Serverless, minimal operations |
| Multi-cloud | Kafka API is portable to any cloud | AWS-only (lock-in) |
| Consumer model | Consumer groups with offset management | KCL checkpointing or Lambda event source mappings |

MSK Serverless vs Provisioned

  • MSK Serverless: Auto-scales, no broker management, pay per data. Best for variable workloads and teams that don't want to manage Kafka infrastructure.
  • MSK Provisioned: Choose broker instances and config. Better price at sustained high throughput. Needed for advanced Kafka features.
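The consumer-group offset model in the table above is worth being able to explain concretely. A toy in-memory model (not the Kafka client API) showing why committing offsets only after processing yields at-least-once delivery; KCL checkpointing follows the same logic:

```python
# Simplified model of per-partition offset management. This is an
# illustration of the semantics, not real Kafka client code.
class PartitionConsumer:
    def __init__(self, log):
        self.log = log       # the partition's ordered record list
        self.committed = 0   # last committed offset

    def poll_and_process(self, process, crash_before_commit=False):
        """Process records from the committed offset onward."""
        for offset in range(self.committed, len(self.log)):
            process(self.log[offset])
            if crash_before_commit:
                return                     # crash: offset never committed
            self.committed = offset + 1    # commit after successful processing

seen = []
consumer = PartitionConsumer(["a", "b", "c"])
consumer.poll_and_process(seen.append, crash_before_commit=True)  # crashes after "a"
consumer.poll_and_process(seen.append)  # restart resumes from committed offset 0
print(seen)  # ['a', 'a', 'b', 'c']: "a" is reprocessed (at-least-once)
```

Committing before processing would flip the failure mode to at-most-once (the crashed record is skipped on restart), which is why consumers are expected to be idempotent.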

🎯 Key Takeaway

Interview tip: "I choose MSK over Kinesis when the team has Kafka expertise and needs the ecosystem β€” Kafka Connect for CDC from databases, Kafka Streams for stateful stream processing, or Schema Registry for event contracts. For teams without Kafka experience who want a fully managed serverless experience, Kinesis Data Streams with Lambda consumers is simpler and cheaper to operate."

Advanced

Stream Processing Patterns

Architecture patterns for processing streaming data at scale:

Lambda Architecture vs Kappa Architecture

| Aspect | Lambda Architecture | Kappa Architecture |
| --- | --- | --- |
| Paths | Two: speed layer (real-time) + batch layer (historical) | One: stream processing only |
| Speed layer | Kinesis → Flink → Redis | Kinesis → Flink → store |
| Batch layer | S3 → Glue/EMR → data warehouse | Replay the stream to reprocess |
| Complexity | Higher (two codepaths to maintain) | Lower (single processing path) |
| Best for | Batch and real-time views need different computation | The same processing logic works for both real-time and historical |

Processing Options on AWS

| Service | When to Use | Throughput |
| --- | --- | --- |
| Lambda (event source mapping) | Simple per-record transforms, enrichment, filtering | Up to 10 concurrent batches per shard |
| Managed Apache Flink | Windowed aggregations, complex event processing, stream joins | Parallel processing with managed scaling |
| ECS/EKS + KCL | Custom consumer logic, container-oriented teams, long-running stateful processors | One KCL worker lease per shard, custom scaling |
| Firehose + Lambda | Simple transform before delivery to S3/Redshift/OpenSearch | Auto-scaling, 60–900 s buffer |
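For the Lambda row, the event source mapping delivers batches of base64-encoded Kinesis records. A minimal handler, runnable locally against a hand-built event; the field names inside the payload are invented for the example:

```python
import base64
import json

def lambda_handler(event, context):
    """Lambda consumer behind a Kinesis event source mapping: decode
    each record and apply a toy per-record transform."""
    out = []
    for record in event["Records"]:
        payload = json.loads(base64.b64decode(record["kinesis"]["data"]))
        payload["page"] = payload["page"].upper()  # toy enrichment step
        out.append(payload)
    # Within one shard, records arrive in order; returning without an
    # exception advances the checkpoint past the whole batch.
    return out

fake_event = {"Records": [
    {"kinesis": {"partitionKey": "user-1",
                 "data": base64.b64encode(
                     json.dumps({"page": "/home"}).encode()).decode()}},
]}
print(lambda_handler(fake_event, None))  # [{'page': '/HOME'}]
```

An unhandled exception makes the event source mapping retry the batch, which is why these handlers should be idempotent or use bisect-on-error/partial batch responses.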

Windowing Strategies (For Flink)

  • Tumbling Window: Fixed size, non-overlapping (e.g., "count per 5-minute block"). Simplest.
  • Sliding Window: Overlapping (e.g., "trending in last 15 min, updated every 1 min"). Most common for real-time dashboards.
  • Session Window: Activity-based (e.g., "user session ends after 30 min of inactivity"). Used for user behavior analytics.
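Flink implements these windows for you, but the semantics are easy to demonstrate in plain Python. A sketch of tumbling vs sliding counts over (timestamp, key) events:

```python
from collections import defaultdict

def tumbling_counts(events, window_sec):
    """Count (timestamp_sec, key) events per fixed, non-overlapping window."""
    counts = defaultdict(int)
    for ts, key in events:
        counts[(ts // window_sec * window_sec, key)] += 1
    return dict(counts)

def sliding_counts(events, size_sec, slide_sec, horizon_sec):
    """Count events per overlapping window of size_sec, advancing every
    slide_sec. Written for clarity, not efficiency."""
    counts = defaultdict(int)
    for start in range(0, horizon_sec, slide_sec):
        for ts, key in events:
            if start <= ts < start + size_sec:
                counts[(start, key)] += 1  # event falls in this window
    return dict(counts)

clicks = [(1, "home"), (4, "home"), (6, "cart"), (11, "home")]
print(tumbling_counts(clicks, 5))
# {(0, 'home'): 2, (5, 'cart'): 1, (10, 'home'): 1}
print(sliding_counts(clicks, size_sec=10, slide_sec=5, horizon_sec=15))
```

Note that each event lands in exactly one tumbling window but in multiple sliding windows, which is why sliding windows cost more state and compute.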

🎯 Key Takeaway

Interview tip: "For stream processing, I match the tool to the complexity. Lambda for simple per-event transforms (filter, enrich, format). Managed Apache Flink for stateful processing β€” windowed aggregations, stream joins, complex event processing. For delivery to S3 or data warehouses, Kinesis Firehose with a Lambda transform handles batching, compression, and format conversion automatically."

Intermediate

Live Streaming

AWS architecture for live video streaming at scale; understanding this shows you can design media-heavy systems:

Architecture Flow (5 Steps)

  1. Ingest: A live video source (camera, encoder) sends an RTMP/SRT stream to AWS Elemental MediaLive.
  2. Transcode: MediaLive transcodes the stream into multiple bitrates and resolutions (ABR, adaptive bitrate), e.g., 1080p, 720p, 480p, 240p for different devices and network conditions.
  3. Package: AWS Elemental MediaPackage packages the transcoded stream into HLS/DASH formats and provides origin endpoints with DVR/time-shift/catch-up TV capabilities.
  4. Deliver: Amazon CloudFront (CDN) distributes the live stream globally to millions of viewers with low latency via 450+ edge locations.
  5. Play: Viewers watch on web, mobile, or smart TV using a video player that switches quality dynamically based on bandwidth (ABR).
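The ABR ladder from step 2 surfaces in the HLS master playlist that the packager serves; the player picks the highest rendition its measured bandwidth can sustain. An illustrative generator, where the bitrates, resolutions, and URIs are example values rather than actual MediaPackage output:

```python
# Build an HLS master playlist advertising an ABR ladder.
def master_playlist(renditions):
    lines = ["#EXTM3U", "#EXT-X-VERSION:3"]
    for bandwidth, resolution, uri in renditions:
        lines.append(f"#EXT-X-STREAM-INF:BANDWIDTH={bandwidth},RESOLUTION={resolution}")
        lines.append(uri)  # each variant points at its own media playlist
    return "\n".join(lines)

ladder = [
    (5_000_000, "1920x1080", "1080p/index.m3u8"),
    (3_000_000, "1280x720", "720p/index.m3u8"),
    (1_200_000, "854x480", "480p/index.m3u8"),
    (400_000, "426x240", "240p/index.m3u8"),
]
print(master_playlist(ladder))
```

BANDWIDTH is what the player compares against its throughput estimate when switching renditions mid-stream.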

MediaLive vs IVS: When to Use Which

| Aspect | MediaLive + MediaPackage + CloudFront | Amazon IVS |
| --- | --- | --- |
| Complexity | Full control, multiple services to configure | Fully managed, single API call |
| Latency | ~10–30 seconds (standard HLS) | ~2–5 seconds (low-latency) |
| Customization | Full control over transcoding, packaging, DRM | Limited, opinionated defaults |
| Best for | Broadcast TV, large-scale OTT (Netflix-like) | Interactive streams (Twitch-like), quick prototypes |
| DRM support | Yes (Widevine, FairPlay, PlayReady via MediaPackage) | No native DRM |
| Cost | Higher: pay per channel-hour plus egress | Lower: pay per input-hour plus viewer-hours |

Streaming Protocols (Quick Reference)

  • HLS (HTTP Live Streaming): Apple's standard. Works everywhere. 10–30 s latency. Use for broad compatibility.
  • DASH (Dynamic Adaptive Streaming over HTTP): Open standard, similar to HLS. Used by YouTube and Netflix.
  • LL-HLS (Low-Latency HLS): Apple's low-latency extension. 2–4 seconds. Use for near-real-time.
  • WebRTC: Sub-second latency. Use for video conferencing, not broadcast.

🎯 Key Takeaway

Interview tip: Know the full pipeline and when to simplify. Say: "For a broadcast-quality OTT platform, I'd use MediaLive → MediaPackage → CloudFront with ABR transcoding. For a quick interactive streaming feature (like in-app live video), Amazon IVS gives you 2-5 second latency with a single API call and no infrastructure to manage."

Advanced

Live Streaming with Ads

Server-side ad insertion (SSAI) is the enterprise approach to monetizing live streams while defeating ad blockers:

SSAI Architecture Flow

  1. Live stream + ad markers: MediaLive inserts SCTE-35 markers (cue points) into the video stream at designated ad-break positions.
  2. Ad decision server (ADS): When a viewer requests the stream, AWS Elemental MediaTailor detects the SCTE-35 markers and calls the ad decision server to fetch personalized ads for that specific viewer.
  3. Server-side stitching: MediaTailor transcodes the ad to match the stream's bitrate/format and stitches it directly into the video stream server-side. The viewer receives a single, seamless stream.
  4. Content delivery: CloudFront delivers the personalized stream (with stitched ads) to each viewer; each viewer may see different ads.

SSAI vs Client-Side Ads (CSAI)

| Aspect | SSAI (Server-Side) | CSAI (Client-Side) |
| --- | --- | --- |
| Ad blockers | ✅ Immune: ads are part of the video stream | ❌ Easily blocked |
| User experience | Seamless, no buffering between content and ads | May see loading and quality changes |
| Personalization | Per-viewer personalized ads | Per-viewer personalized ads |
| Measurement | Server-side tracking (more accurate) | Client-side pixels (can be blocked) |
| Complexity | Higher (MediaTailor + ADS integration) | Lower (JS SDK in the player) |
| Use case | Premium OTT, broadcast, sports | Web-only, short-form content |

SCTE-35 Markers: What They Are

SCTE-35 is the industry standard for signaling ad breaks in video streams. MediaLive can insert them automatically on a timer, or your automation system can trigger them via the MediaLive API at specific moments (e.g., between game quarters).
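At the manifest level, SCTE-35 signals commonly surface as CUE-OUT/CUE-IN decorations, and the stitcher swaps the content between them for ad segments. A toy sketch of that idea; real SSAI (MediaTailor) additionally handles per-viewer ad decisions and transcoding, and the marker strings here are simplified:

```python
# Replace the segments between a CUE-OUT and CUE-IN marker with ad
# segments, the core move of server-side ad stitching.
def stitch_ads(manifest_lines, ad_segments):
    out, in_break = [], False
    for line in manifest_lines:
        if line.startswith("#EXT-X-CUE-OUT"):
            in_break = True
            out.extend(ad_segments)  # splice ads in place of slate content
        elif line.startswith("#EXT-X-CUE-IN"):
            in_break = False         # ad break over, resume content
        elif not in_break:
            out.append(line)         # normal content passes through
    return out

manifest = ["seg1.ts", "#EXT-X-CUE-OUT:DURATION=30", "slate1.ts",
            "slate2.ts", "#EXT-X-CUE-IN", "seg2.ts"]
print(stitch_ads(manifest, ["ad1.ts", "ad2.ts"]))
# ['seg1.ts', 'ad1.ts', 'ad2.ts', 'seg2.ts']
```

Because the swap happens in the manifest the player fetches, the ad segments are indistinguishable from content segments, which is what defeats ad blockers.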

🎯 Key Takeaway

Interview tip: Say: "For ad monetization of live streams, I'd use SSAI with MediaTailor because ads are stitched server-side, making them ad-blocker resistant. MediaLive inserts SCTE-35 markers, MediaTailor detects them and fetches personalized ads from the ADS, then stitches them into the stream before CloudFront delivers it. Each viewer gets a unique, seamless experience."

Advanced

Interview Questions: Streaming & Analytics

Streaming questions cover both data streaming (Kinesis, MSK) and video streaming architecture. These test real-time data pipeline design.

  1. Answer Guide
    Kinesis Data Streams: ordered within a shard (by partition key), handles massive throughput, retention configurable from 24 hours (default) up to 365 days. SQS has no ordering (unless FIFO, which caps at ~3,000 msg/s with batching). SNS is pub/sub (no retention). EventBridge is for event routing, not high-throughput data streaming.
  2. Answer Guide
    MSK if: existing Kafka expertise, need Kafka ecosystem (Kafka Connect, Kafka Streams, ksqlDB), complex consumer group patterns, longer retention (unlimited). Kinesis if: fully serverless, tight AWS integration (Lambda, Firehose), simpler operations. Migration path: MSK preserves Kafka APIs, no code changes.
  3. Answer Guide
    Clickstream → Kinesis Data Streams (partitioned by user_id) → Managed Apache Flink (sliding-window aggregation over 15 min) → DynamoDB/ElastiCache for trending products → API Gateway + Lambda for the dashboard API. Alternative: Kinesis → Firehose → S3 for batch analytics with Athena.
  4. Answer Guide
    Use a composite partition key (product_id + random_suffix) to distribute across shards. Or use a hash-based partition key. Trade-off: you lose strict ordering for that product_id. Alternative: re-shard (split the hot shard), but this is temporary if the key distribution doesn't change.
  5. Answer Guide
    Kinesis provides at-least-once delivery. For exactly-once: use KCL (Kinesis Client Library) with checkpointing + DynamoDB for deduplication (idempotency key per record). Use enhanced fan-out for dedicated throughput per consumer. Discuss: Kafka with transactions can provide exactly-once semantics natively, a trade-off worth raising.
  6. Answer Guide
    IVS for simplicity (managed, sub-3s latency out-of-the-box, WebRTC-based). MediaLive stack for control (custom encoding profiles, DVR, DRM, SSAI ads, multi-CDN). At 100K viewers, IVS is simpler but less customizable. If you need ad insertion or premium features, go MediaLive stack.
  7. Answer Guide
    CloudWatch Agent on instances → CloudWatch Logs → subscription filter → Kinesis Data Firehose → OpenSearch (near-real-time indexing, ~30 s). Alternative: Fluent Bit → Kinesis Data Streams → Lambda/Flink → OpenSearch. Discuss: Firehose buffers (60–900 seconds), so tune the buffer interval down to 60 s for near-real-time.
  8. Answer Guide
    They solve different problems. Streams: real-time processing, custom consumers, replay, ordering. Firehose: fully managed delivery to S3/Redshift/OpenSearch with built-in transformation (Lambda), batching, compression, and format conversion (Parquet). Use Streams for real-time processing, Firehose for data delivery/ETL.
  9. Answer Guide
    Kinesis retains data from 24 hours (default) up to 365 days. Use TRIM_HORIZON or AT_TIMESTAMP to replay from before the outage. Idempotent consumers (dedup using DynamoDB conditional writes) ensure no duplicates. For Firehose: check S3 for already-delivered files. For Kafka/MSK: reset the consumer group offset to the outage timestamp.
  10. Answer Guide
    SSAI (MediaTailor) stitches ads into the video stream server-side: ad blockers can't detect them, quality matches seamlessly, and insertion is frame-accurate. CSAI relies on the client player to fetch ads, which ad blockers stop and which can buffer during transitions. SSAI uses SCTE-35 markers in the manifest to identify ad-break positions.
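The hot-partition fix from answer 4 can be demonstrated: salt the partition key with a bounded random suffix so one hot product_id spreads across several shards. Readers must merge the salt buckets back together, and strict per-key ordering is lost. The shard mapping below assumes shards evenly split the 128-bit hash space:

```python
import hashlib
import random

NUM_HASH_KEYS = 2 ** 128

def shard_for(key: str, shard_count: int) -> int:
    """Shard index for a partition key, assuming evenly split shards."""
    h = int.from_bytes(hashlib.md5(key.encode()).digest(), "big")
    return h * shard_count // NUM_HASH_KEYS

def salted_key(product_id: str, salt_buckets: int = 8) -> str:
    """Composite key: spread one hot key over up to salt_buckets shards."""
    return f"{product_id}#{random.randrange(salt_buckets)}"

random.seed(42)  # deterministic for the demo
hot = "product-123"
plain_shards = {shard_for(hot, 8) for _ in range(1000)}
salted_shards = {shard_for(salted_key(hot), 8) for _ in range(1000)}
print(len(plain_shards), len(salted_shards))  # 1 shard vs several shards
```

The trade-off is exactly the one in the answer guide: writes fan out, but any consumer that needs all events for product-123 now has to read from every salt bucket.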

Preparation Strategy

Streaming questions test real-time data architecture. Know the decision framework: SQS (decoupling) vs SNS (fan-out) vs Kinesis (ordered streaming) vs EventBridge (event routing). For each, know throughput limits, retention periods, and ordering guarantees; interviewers will probe these specifics.