Serverless Interview Guide

Lambda internals, Step Functions orchestration, EventBridge event routing, cold starts, concurrency, and serverless design patterns.

Beginner

Is Lambda Cheaper Than EC2?

Incorrect Answer

Yes, AWS Lambda is cheaper than Amazon EC2.

Correct Answer

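It depends on the workload, because the two services are billed differently: Lambda charges per request and per GB-second of execution, while EC2 charges for instance time whether or not the instance is busy. Lambda can actually be more expensive than EC2 for high-throughput, long-running workloads. The key metric is TCO (Total Cost of Ownership): Lambda eliminates patching, AMI management, and capacity planning overhead, and that saved engineering time belongs in the comparison.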

Lambda vs EC2 vs Fargate — Cost Comparison

Aspect	Lambda	EC2	Fargate
Pricing Model	Per request + duration × memory	Per hour (On-Demand/Reserved/Spot)	Per vCPU-hour + GB-hour
Idle Cost	$0 — pay only when invoked	Pays even when idle	Pays while task runs
Cheaper When	<1M requests/month, bursty/sporadic	>70% utilization, steady traffic	Containerized, variable workloads
Hidden Costs	NAT Gateway for VPC, CloudWatch Logs	EBS, ALB, AMI storage, patching time	ALB, CloudWatch, ECR
Max Duration	15 minutes	Unlimited	Unlimited
Ops Overhead	Zero (fully managed)	High (OS patching, scaling config)	Low (no EC2 to manage)
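The "Cheaper When" row can be made concrete with a small break-even calculation. The sketch below is illustrative only: the Lambda rates are the commonly published x86 on-demand prices and may differ by region or change over time, the $0.10/hour EC2 rate is a hypothetical placeholder, and the free tier, data transfer, and the hidden costs from the table are ignored.

```python
# Illustrative prices (assumptions; check the current AWS price list for your region).
LAMBDA_PRICE_PER_REQUEST = 0.20 / 1_000_000      # USD per request
LAMBDA_PRICE_PER_GB_SECOND = 0.0000166667        # USD per GB-second (x86)
HOURS_PER_MONTH = 730


def lambda_monthly_cost(requests: int, avg_duration_s: float, memory_mb: int) -> float:
    """Monthly Lambda compute cost, ignoring the free tier and data transfer."""
    gb_seconds = requests * avg_duration_s * (memory_mb / 1024)
    return requests * LAMBDA_PRICE_PER_REQUEST + gb_seconds * LAMBDA_PRICE_PER_GB_SECOND


def ec2_monthly_cost(hourly_rate: float, instances: int = 1) -> float:
    """Monthly cost of always-on instances at a given hourly rate."""
    return hourly_rate * HOURS_PER_MONTH * instances


def break_even_requests(hourly_rate: float, avg_duration_s: float, memory_mb: int) -> int:
    """Requests per month at which Lambda costs as much as one always-on instance."""
    per_request = (LAMBDA_PRICE_PER_REQUEST
                   + avg_duration_s * (memory_mb / 1024) * LAMBDA_PRICE_PER_GB_SECOND)
    return int(ec2_monthly_cost(hourly_rate) / per_request)


if __name__ == "__main__":
    for monthly_requests in (1_000_000, 50_000_000, 500_000_000):
        cost = lambda_monthly_cost(monthly_requests, avg_duration_s=0.2, memory_mb=512)
        print(f"{monthly_requests:>12,} req/month -> Lambda ${cost:,.2f}")
    # Hypothetical instance rate, used only to show where the lines cross.
    print("EC2 at $0.10/hr:", f"${ec2_monthly_cost(0.10):,.2f}")
    print("Break-even requests/month:", f"{break_even_requests(0.10, 0.2, 512):,}")
```

The break-even point moves a lot with duration and memory, which is why the answer is "it depends" rather than a fixed request count.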

Lambda Limits (Know These for Interviews)

Memory: 128 MB – 10,240 MB (10 GB)
Timeout: Max 15 minutes
Payload: 6 MB sync, 256 KB async
Concurrency: 1,000 default (can request increase to 10,000+)
Deployment: 50 MB zipped, 250 MB unzipped (use container images for up to 10 GB)
Graviton: arm64 (Graviton2) is billed about 20% less per GB-second than x86; AWS cites up to 34% better price-performance

🎯 Key Takeaway

Interview tip: Never say "Lambda is always cheaper." Say: "Lambda wins on TCO for sporadic, event-driven workloads under 15 minutes. EC2 wins for sustained high-throughput workloads. I'd run a cost analysis using the AWS Pricing Calculator comparing the specific request volume, duration, and memory before choosing."

Beginner

Serverless Web Application

A fully serverless web application on AWS — the #1 whiteboard question for Solutions Architect interviews:

Architecture (Layer by Layer)

Layer	Service	Purpose	Key Detail
DNS	Route 53	Domain routing + health checks	Alias record to CloudFront (works at the zone apex; alias queries to AWS resources are not charged)
CDN	CloudFront	Global content delivery, caching	450+ edge PoPs, HTTPS termination, WebSocket support
Frontend	S3	Static hosting (React/Vue/Angular)	OAC (Origin Access Control) restricts direct S3 access
Auth	Cognito User Pools	User signup, login, MFA, social login	Returns JWT tokens; API Gateway validates them natively
API	API Gateway (HTTP API)	Request routing, throttling, CORS	HTTP API is 70% cheaper than REST API for most use cases
Logic	Lambda	Business logic per endpoint	Use Powertools for structured logging, tracing, metrics
Database	DynamoDB	NoSQL data storage	On-demand mode → no capacity planning; <10ms latency
Storage	S3	File uploads via pre-signed URLs	Lambda generates pre-signed URLs; client uploads directly
Notifications	SNS + SES	Push notifications, email	SNS for fan-out, SES for transactional email
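The pre-signed URL flow in the Storage row is worth being able to sketch. The example below is a minimal illustration, not a complete upload service: `build_upload_response`, the `uploads/<user>/<uuid>/` key layout, and the 300-second expiry are assumptions made for this example. The `s3_client` argument is expected to be a boto3 S3 client (for example `boto3.client("s3")` created once in global scope); `generate_presigned_url` is the boto3 call used.

```python
import json
import uuid


def build_upload_response(s3_client, bucket: str, user_id: str, filename: str,
                          expires_in: int = 300) -> dict:
    """Return an API Gateway-style response containing a pre-signed PUT URL."""
    # Namespacing the key by user and a random ID avoids overwriting other uploads.
    key = f"uploads/{user_id}/{uuid.uuid4()}/{filename}"
    url = s3_client.generate_presigned_url(
        "put_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires_in,  # seconds the URL stays valid
    )
    # The client then PUTs the file straight to S3; the bytes never pass through Lambda.
    return {"statusCode": 200, "body": json.dumps({"uploadUrl": url, "key": key})}
```

Passing the client in as a parameter keeps the function easy to test with a stub and lets the real client be reused across warm invocations.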

HTTP API vs REST API — Which to Choose

Feature	HTTP API	REST API
Cost	$1.00 per million	$3.50 per million
Auth	JWT (Cognito, OIDC natively)	IAM, Cognito, Lambda Authorizer
Features	Basic routing, CORS, JWT	Caching, WAF, usage plans, API keys, request validation
Use When	Most new serverless apps	Need caching, WAF, API keys, or request transforms

Cost model: ~$5-20/month for 1M requests. Zero traffic = ~$0. EC2 equivalent: $50-100+/month minimum.

🎯 Key Takeaway

Interview tip: This is the most-asked design question. Draw it layer by layer: "Route 53 → CloudFront → S3 for static frontend. For the API path: CloudFront → HTTP API → Lambda → DynamoDB. Cognito handles auth with JWT tokens validated at the API Gateway level. For file uploads, Lambda generates pre-signed S3 URLs so clients upload directly — never send files through Lambda."

Intermediate

Lambda Transform

AWS Lambda as an inline transformation step in data pipelines — zero infrastructure, pay-per-execution:

Transform Patterns (Common Interview Scenarios)

Pattern	Trigger	Use Case	Key Details
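Firehose Transform	Kinesis Data Firehose	CSV→JSON, PII redaction, data enrichment	Firehose invokes Lambda synchronously with each micro-batch (invocation time of up to 5 minutes). Every record must be returned with its original `recordId` and a result of `Ok`, `Dropped`, or `ProcessingFailed`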
S3 Event Transform	S3 PUT/POST notification	Image resize, video transcode, CSV parse	Use S3 Event Notifications or EventBridge for trigger. Write to a different bucket to avoid loops
DynamoDB Streams	Table item change (CDC)	Aggregate, replicate, or fan-out changes	Process in order per partition key. Set batch size and max batching window
API Gateway Transform	HTTP request	Request/response mapping, enrichment	Can also use VTL mapping templates for simple transforms (no Lambda needed)
CloudFormation Macro	Stack deploy	Template generation, custom conditionals	Lambda transforms the CloudFormation template at deploy time

Firehose + Lambda Architecture (Know This Flow)

Producers (IoT, apps, Kinesis Agent) → Kinesis Data Firehose → Lambda Transform (enrich/filter/convert) → S3 / Redshift / OpenSearch. Buffer: 60-900 seconds or 1-128 MB (whichever hits first). Failed records go to a dead-letter S3 prefix.
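The Lambda transform step in this flow follows a fixed contract: each incoming record carries a `recordId` and base64-encoded `data`, and the function returns one entry per record with the same `recordId` and a `result`. The sketch below shows that shape; the `redact` helper, the `email` field, and the `heartbeat` filter are hypothetical choices for illustration.

```python
import base64
import json


def redact(record: dict) -> dict:
    """Hypothetical PII step: drop an 'email' field if present."""
    record.pop("email", None)
    return record


def handler(event, context):
    """Firehose data-transformation handler: one output record per input record."""
    output = []
    for rec in event["records"]:
        try:
            payload = json.loads(base64.b64decode(rec["data"]))
            if not isinstance(payload, dict):
                raise ValueError("expected a JSON object")
        except (ValueError, KeyError):
            # Unparseable input: hand the original bytes back marked as failed.
            output.append({"recordId": rec["recordId"],
                           "result": "ProcessingFailed",
                           "data": rec.get("data", "")})
            continue
        if payload.get("type") == "heartbeat":
            # Filtered out on purpose; this record is not delivered downstream.
            output.append({"recordId": rec["recordId"], "result": "Dropped",
                           "data": rec["data"]})
            continue
        # Newline-delimited JSON keeps the delivered S3 objects easy to query.
        transformed = json.dumps(redact(payload)) + "\n"
        output.append({"recordId": rec["recordId"], "result": "Ok",
                       "data": base64.b64encode(transformed.encode()).decode()})
    return {"records": output}
```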

🎯 Key Takeaway

Interview tip: Emphasize zero-infrastructure ETL. Say: "Lambda transforms eliminate always-on ETL servers. For streaming data, I'd use Firehose with a Lambda transform to convert format and redact PII before landing in S3. For file-based, I'd trigger Lambda from S3 events. Both scale automatically with data volume."

Intermediate

Step Functions Orchestration

AWS Step Functions is the orchestration service for serverless workflows — essential for any multi-step process:

Standard vs Express Workflows

Aspect	Standard	Express
Duration	Up to 1 year	Up to 5 minutes
Pricing	$0.025 per 1,000 state transitions ($0.000025 each)	$1.00 per million requests plus duration, billed per GB-second of memory
Execution	Exactly-once	At-least-once
History	Full visual execution history	CloudWatch Logs only
Best For	Order processing, ETL, long-running workflows	IoT data processing, high-volume transforms
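Because the two workflow types are billed on different units, a quick estimate helps when comparing them. The sketch below assumes commonly published us-east-1 list prices (Standard: $0.025 per 1,000 transitions; Express: $1.00 per million requests plus a first-tier rate of $0.00001667 per GB-second) and a 64 MB Express memory footprint; treat all of these as assumptions to verify against the current pricing page.

```python
# Assumed list prices; verify against the current Step Functions pricing page.
STANDARD_PRICE_PER_TRANSITION = 0.025 / 1_000     # USD per state transition
EXPRESS_PRICE_PER_REQUEST = 1.00 / 1_000_000      # USD per execution
EXPRESS_PRICE_PER_GB_SECOND = 0.00001667          # USD, first pricing tier


def standard_cost(executions: int, transitions_per_execution: int) -> float:
    """Standard workflows: pay per state transition, regardless of duration."""
    return executions * transitions_per_execution * STANDARD_PRICE_PER_TRANSITION


def express_cost(executions: int, duration_s: float, memory_mb: int = 64) -> float:
    """Express workflows: pay per request plus memory-weighted duration."""
    gb_seconds = executions * duration_s * (memory_mb / 1024)
    return executions * EXPRESS_PRICE_PER_REQUEST + gb_seconds * EXPRESS_PRICE_PER_GB_SECOND


if __name__ == "__main__":
    n = 1_000_000  # executions per month
    print(f"Standard, 5 transitions each: ${standard_cost(n, 5):,.2f}")
    print(f"Express, 2 s each at 64 MB:   ${express_cost(n, 2.0):,.2f}")
```

The gap narrows as executions get longer, since Express pays for duration and Standard does not, so the ratio between the two is workload-specific rather than a fixed multiple.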

Common Patterns

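Saga Pattern: Orchestrate distributed transactions with compensating actions. Step Functions does not roll back on its own; you attach a Catch to each step that routes to the undo states, so if step 3 fails the workflow runs the compensating actions for steps 2 and 1.
Fan-out/Fan-in: Map state processes items in parallel, then collects results. Inline Map is limited to about 40 concurrent iterations; use Distributed Map for larger fan-outs (e.g., processing 1,000 images concurrently).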
Human Approval: Pause workflow with a task token, resume when a human approves via callback (e.g., expense approvals, content moderation).
Error Handling: Built-in Retry (with exponential backoff) and Catch (route errors to fallback states) at every step.
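The Saga and error-handling patterns above come down to `Retry` and `Catch` fields in the state machine definition. The sketch below builds a small Amazon States Language definition as a Python dict; the state names, the Lambda ARNs (including the account ID), and the retry settings are hypothetical, and the `dangling_targets` helper is just a local sanity check, not part of any AWS API.

```python
import json

# Hypothetical Lambda ARNs; replace with real function ARNs.
ARN = "arn:aws:lambda:us-east-1:123456789012:function:{}"

# Retry with exponential backoff: waits of 2 s, then 4 s, before giving up.
RETRY = [{
    "ErrorEquals": ["States.TaskFailed"],
    "IntervalSeconds": 2,
    "MaxAttempts": 3,
    "BackoffRate": 2.0,
}]

saga = {
    "Comment": "Order saga: reserve inventory, charge payment, confirm",
    "StartAt": "ReserveInventory",
    "States": {
        "ReserveInventory": {
            "Type": "Task", "Resource": ARN.format("reserve-inventory"),
            "Retry": RETRY,
            "Catch": [{"ErrorEquals": ["States.ALL"], "Next": "OrderFailed"}],
            "Next": "ChargePayment",
        },
        "ChargePayment": {
            "Type": "Task", "Resource": ARN.format("charge-payment"),
            "Retry": RETRY,
            # Compensation: a payment failure routes to the undo step for inventory.
            "Catch": [{"ErrorEquals": ["States.ALL"], "ResultPath": "$.error",
                       "Next": "ReleaseInventory"}],
            "Next": "SendConfirmation",
        },
        "SendConfirmation": {
            "Type": "Task", "Resource": ARN.format("send-confirmation"), "End": True,
        },
        "ReleaseInventory": {
            "Type": "Task", "Resource": ARN.format("release-inventory"),
            "Next": "OrderFailed",
        },
        "OrderFailed": {"Type": "Fail", "Error": "OrderFailed",
                        "Cause": "Order was rolled back"},
    },
}


def dangling_targets(definition: dict) -> set:
    """Return names referenced by StartAt/Next/Catch that are not defined states."""
    states = definition["States"]
    referenced = {definition["StartAt"]}
    for state in states.values():
        if "Next" in state:
            referenced.add(state["Next"])
        for catcher in state.get("Catch", []):
            referenced.add(catcher["Next"])
    return referenced - set(states)


if __name__ == "__main__":
    assert not dangling_targets(saga)
    print(json.dumps(saga, indent=2))
```

Retried tasks such as the payment step should be idempotent, since a retry after a timeout can otherwise charge twice.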

🎯 Key Takeaway

Interview tip: "Step Functions replaces the need for custom orchestration code. I use Standard workflows for order processing with the Saga pattern: each step has a compensating action, and failures are caught and routed to those rollback states. For high-volume, short-lived event processing (10K+ events/min), I use Express workflows, which are billed per request and duration instead of per state transition and are usually far cheaper at that volume. The visual execution history of Standard workflows is invaluable for troubleshooting multi-step failures in production."

Intermediate

EventBridge Event Routing

Amazon EventBridge is the central nervous system of modern serverless architectures — a serverless event bus for building event-driven systems:

EventBridge vs SNS vs SQS — Decision Framework

Aspect	EventBridge	SNS	SQS
Pattern	Event routing with content-based filtering	Pub/sub fan-out	Point-to-point queue
Filtering	Rich content-based rules (match on any JSON field)	Message attribute filtering (limited)	No filtering (consumer processes all)
Targets	20+ AWS services directly	Lambda, SQS, HTTP, Email, SMS	Single consumer (or with Lambda trigger)
Replay	Archive & Replay (configurable retention, including indefinite)	No replay	No replay (DLQ captures failed messages)
Schema	Schema Registry with discovery	No schema	No schema
Cross-Account	Native (resource policies)	Native (topic policies)	Native (queue policies)
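Throughput	Region-dependent PutEvents quota (soft limit; from hundreds to thousands of events/sec)	Millions/sec	Nearly unlimited (standard); 300 msg/sec per API action, or 3,000/sec with batching (FIFO; higher in high-throughput mode)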
Best For	Event-driven microservices, SaaS integrations, AWS service events	Simple fan-out, mobile push	Decoupling, work queues, buffering

Key Features for Architects

Archive & Replay: Replay past events for debugging, testing, or backfilling — unique to EventBridge.
Schema Registry: Auto-discovers event schemas from your bus. Generates code bindings (TypeScript, Python, Java).
Pipes: Point-to-point integrations between sources (SQS, Kinesis, DynamoDB Streams) and targets with optional enrichment Lambda.
Scheduler: Cron and rate-based scheduling with one-time or recurring events. Replaces CloudWatch Events for scheduled tasks.
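Content-based filtering is easiest to explain with an actual event pattern. The sketch below shows a rule pattern in EventBridge's JSON pattern syntax (exact-value lists and a `numeric` comparison) next to a sample event. The source name and field names are hypothetical, and the `matches` function is a deliberately tiny local approximation covering only these two operators, written for illustration; it is not how EventBridge itself evaluates rules.

```python
import json

# Event pattern for a rule: every listed field must match; list entries are OR-ed.
high_value_order_rule = {
    "source": ["com.example.orders"],          # hypothetical source name
    "detail-type": ["order.placed"],
    "detail": {
        "country": ["US", "CA"],
        "total": [{"numeric": [">=", 100]}],
    },
}

sample_event = {
    "source": "com.example.orders",
    "detail-type": "order.placed",
    "detail": {"orderId": "o-123", "country": "CA", "total": 250},
}


def matches(pattern: dict, event: dict) -> bool:
    """Tiny local matcher: exact values and a single numeric '>=' only."""
    for field, expected in pattern.items():
        if field not in event:
            return False
        actual = event[field]
        if isinstance(expected, dict):
            # Nested pattern (e.g. "detail"): recurse into the nested event object.
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
            continue
        ok = False
        for candidate in expected:
            if isinstance(candidate, dict) and "numeric" in candidate:
                op, bound = candidate["numeric"]
                ok = ok or (op == ">=" and isinstance(actual, (int, float))
                            and actual >= bound)
            else:
                ok = ok or actual == candidate
        if not ok:
            return False
    return True


if __name__ == "__main__":
    print(json.dumps(high_value_order_rule, indent=2))
    print("matches:", matches(high_value_order_rule, sample_event))
```

Fields the pattern does not mention (such as `orderId`) are ignored, which is what lets new consumers add rules without the producer changing anything.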

🎯 Key Takeaway

Interview tip: "I use EventBridge as the default event router for new serverless architectures. Its content-based filtering lets me route 'order.placed' events to 5 different consumers using rules that match on event fields — without the consumers knowing about each other. The archive and replay feature is a game-changer for debugging production issues and testing new consumers against historical events."

Advanced

Lambda Cold Starts

Cold starts are the most-asked Lambda question in architect interviews. Know every mitigation strategy:

What Causes Cold Starts

When Lambda has no warm execution environment available, it must: 1) Download your code, 2) Start the runtime, 3) Run your initialization code (SDK clients, DB connections). This adds 100ms–10s depending on runtime, package size, and VPC configuration.

Cold Start Duration by Runtime

Runtime	Typical Cold Start	Notes
Python	100–300ms	Lightweight, fastest cold starts
Node.js	100–300ms	Similar to Python, fast V8 startup
Go / Rust	50–100ms	Compiled, smallest cold start possible
Java	3–10 seconds	JVM startup + class loading. Use SnapStart to reduce to ~200ms
.NET	1–3 seconds	CLR initialization. Use Native AOT to reduce significantly

Mitigation Strategies (Most to Least Impactful)

Provisioned Concurrency: Pre-warms N execution environments. Requests served by those environments see no cold start; traffic beyond N falls back to on-demand environments and can still cold start. Costs money (~$0.015/GB-hour). Use for latency-sensitive APIs.
SnapStart (Java): Snapshots the initialized execution environment after init and restores from the snapshot on cold start (~200ms vs 6s). No extra charge for Java. Requires Java 11 or later.
Smaller packages: Remove unused dependencies and use tree-shaking (webpack for Node.js). Smaller package = faster download. Lambda Layers help share code across functions, but layer contents still count toward the unzipped size limit.
Lazy initialization: Create SDK clients and DB connections that every request needs outside the handler (global scope) so they persist across warm invocations. Defer clients that only some code paths use until they are first needed, then cache them, so they do not slow down every cold start.
Choose lighter runtimes: Python/Node.js over Java unless you need the JVM. Consider Go/Rust for performance-critical functions.
Avoid VPC unless necessary: VPC Lambda used to add 10+ seconds of cold start. Since Hyperplane (2019), the penalty is minimal (~1s), but only put Lambda in a VPC if it accesses VPC resources.
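The initialization advice in this list is easier to discuss with code in hand. The sketch below contrasts a client created at module load (global scope) with one created lazily and cached. `make_client` is a stand-in for an expensive constructor such as `boto3.client(...)`, and the `send_email` flag and service names are hypothetical.

```python
import functools
import time


def make_client(name: str) -> dict:
    """Stand-in for an expensive constructor such as boto3.client(name)."""
    time.sleep(0.01)  # simulate connection/credential setup cost
    return {"service": name, "created_at": time.time()}


# Needed on every request: create once at module load (runs during the init phase)
# so warm invocations reuse it.
TABLE_CLIENT = make_client("dynamodb")


@functools.lru_cache(maxsize=None)
def get_client(name: str) -> dict:
    """Lazily create and cache clients that only some code paths need."""
    return make_client(name)


def handler(event, context):
    result = {"table": TABLE_CLIENT["service"]}
    if event.get("send_email"):
        # Created on first use, then reused by later warm invocations.
        result["mailer"] = get_client("ses")["service"]
    return result
```

The trade-off is where the setup cost lands: global-scope work lengthens every cold start, while lazy work adds latency to the first request that needs it.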

🎯 Key Takeaway

Interview tip: "Cold starts are a solvable problem, not a blocker. For latency-sensitive APIs, I use Provisioned Concurrency with Application Auto Scaling to pre-warm based on traffic patterns. For Java, SnapStart reduces cold starts from 6 seconds to 200ms at zero additional cost. For everything else, I keep packages small, use Python/Node.js, and initialize SDK clients in the global scope."

Advanced

Lambda Concurrency

Understanding Lambda concurrency is critical for production architectures — it's the #1 cause of unexpected throttling:

Concurrency Types

Type	What It Does	Cost	Use Case
Unreserved (default)	Shared pool across all functions. Account default: 1,000.	Free	Most functions — fine for low-traffic workloads
Reserved	Guarantees N concurrent executions for one function. Also CAPS that function at N.	Free	Critical functions that must not be throttled, or to limit a function's blast radius
Provisioned	Pre-initializes N execution environments (no cold starts).	~$0.015/GB-hour	Latency-sensitive APIs, consistent sub-100ms response needed

Concurrency Formula

Concurrent executions = Invocations/second × Average duration (seconds)

Example: 100 requests/sec × 0.5 sec duration = 50 concurrent executions. If your function takes 2 seconds, the same 100 req/sec needs 200 concurrent executions.
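The formula is simple enough to keep as a helper when sizing limits. A minimal sketch, with function names chosen for this example:

```python
import math


def required_concurrency(requests_per_second: float, avg_duration_s: float) -> int:
    """Concurrent executions needed to sustain a given request rate."""
    # round() guards against float noise such as 0.1 * 30 == 3.0000000000000004.
    return math.ceil(round(requests_per_second * avg_duration_s, 9))


def max_sustainable_rps(concurrency_limit: int, avg_duration_s: float) -> float:
    """Highest steady request rate a concurrency limit can absorb."""
    return concurrency_limit / avg_duration_s


if __name__ == "__main__":
    print(required_concurrency(100, 0.5))   # the 0.5-second case from the text
    print(required_concurrency(100, 2.0))   # the 2-second case from the text
    print(max_sustainable_rps(1000, 2.0))   # default account limit, 2 s functions
```

Read the other way around, the default 1,000 limit supports only 500 requests/sec once the average duration reaches 2 seconds, which is why shortening duration is often the cheapest fix for throttling.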

Throttling Behavior

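Synchronous (API Gateway): Lambda rejects the invocation immediately with a 429 (TooManyRequestsException); nothing is queued, so the client must retry.
Asynchronous (S3, SNS): The event stays in Lambda's internal queue and throttled invocations are retried with backoff for up to 6 hours. (The 2 automatic retries apply to function errors.) Events that still fail go to the DLQ/destination.
Stream (Kinesis, DynamoDB): Blocks the shard until capacity is available. Records accumulate and are not lost as long as they are processed within the stream's retention period.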
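For the synchronous case, "the client must retry" usually means exponential backoff with jitter. The sketch below is one way to do it; `ThrottledError` is a hypothetical stand-in for however your client surfaces a 429, and the delay parameters are arbitrary defaults.

```python
import random
import time


class ThrottledError(Exception):
    """Hypothetical stand-in for an HTTP 429 / TooManyRequestsException."""


def call_with_backoff(func, max_attempts=5, base_delay=0.1, max_delay=5.0,
                      sleep=time.sleep):
    """Call func(), retrying on throttling with exponential backoff and full jitter."""
    for attempt in range(max_attempts):
        try:
            return func()
        except ThrottledError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the throttle to the caller
            # Full jitter: wait a random time up to the exponential cap so that
            # many throttled clients do not all retry at the same instant.
            cap = min(max_delay, base_delay * (2 ** attempt))
            sleep(random.uniform(0, cap))
```

The `sleep` parameter exists only so the function can be exercised in tests without real waiting.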

🎯 Key Takeaway

Interview tip: "I always set Reserved Concurrency on critical functions to guarantee they can't be starved by other functions in the account. For a rogue function that might consume all 1,000 concurrent executions, I'd cap it with Reserved Concurrency at 200. The key insight: Reserved Concurrency both guarantees AND limits — it's a floor and a ceiling."

Intermediate

Lambda in VPC

When to put Lambda in a VPC and what to watch out for — a frequently tested topic:

When You NEED Lambda in VPC

Accessing RDS / Aurora databases in private subnets
Accessing ElastiCache (Redis / Memcached)
Accessing OpenSearch clusters in VPC
Connecting to on-premises resources via VPN / Direct Connect
Compliance requirements mandating private network traffic only

When You DON'T Need Lambda in VPC

Calling AWS APIs (S3, DynamoDB, SQS) — use VPC Endpoints or public endpoints
Calling external APIs over the internet
Simple event processing without VPC resource access

VPC Lambda — Before and After Hyperplane (2019)

Aspect	Before 2019	After 2019 (Hyperplane)
Cold Start	10–30 seconds (creating an ENI when a new execution environment started)	~1 second (shared ENIs via Hyperplane)
ENI Management	1 ENI per concurrent execution (IP exhaustion risk)	Shared ENIs managed by AWS, minimal IP consumption
Internet Access	Required NAT Gateway ($0.045/GB)	Still requires NAT Gateway for internet access

Cost Trap: NAT Gateway

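VPC Lambda needs a NAT Gateway for internet access and for calling AWS services without VPC Endpoints. At $0.045/hr (about $33/month per gateway before any traffic) plus $0.045 per GB processed, costs add up quickly: roughly 10 TB/month through one gateway comes to about $500. Mitigation: add VPC Endpoints for S3 (Gateway, free) and DynamoDB (Gateway, free); most other services need Interface endpoints, which carry their own hourly and per-GB charges.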
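The arithmetic behind the NAT Gateway cost trap can be checked with a few lines. The rates below are the ones quoted in this section; the 730-hour month and the function names are assumptions for the example, and multi-AZ setups multiply the fixed charge by the number of gateways.

```python
NAT_HOURLY = 0.045          # USD per gateway-hour (rate quoted in the text)
NAT_PER_GB = 0.045          # USD per GB processed (rate quoted in the text)
HOURS_PER_MONTH = 730


def nat_monthly_cost(gb_processed: float, gateways: int = 1) -> float:
    """Fixed hourly charge per gateway plus the per-GB data processing charge."""
    return gateways * NAT_HOURLY * HOURS_PER_MONTH + gb_processed * NAT_PER_GB


def gb_for_budget(budget: float, gateways: int = 1) -> float:
    """How many GB per month fit in a budget after the fixed gateway charge."""
    fixed = gateways * NAT_HOURLY * HOURS_PER_MONTH
    return max(0.0, (budget - fixed) / NAT_PER_GB)


if __name__ == "__main__":
    for gb in (0, 1_000, 10_000):
        print(f"{gb:>6,} GB/month -> ${nat_monthly_cost(gb):,.2f}")
    print(f"GB that fit in $500/month: {gb_for_budget(500):,.0f}")
```

Traffic to S3 and DynamoDB that is moved onto Gateway endpoints drops out of `gb_processed` entirely, which is where most of the savings come from.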

🎯 Key Takeaway

Interview tip: "Only put Lambda in a VPC if it needs to access VPC resources like RDS or ElastiCache. The 2019 Hyperplane update eliminated the 10-second cold start penalty, but you still pay for NAT Gateway if Lambda needs internet access. My standard practice: VPC Lambda + VPC Endpoints for S3/DynamoDB (free) + NAT Gateway only if truly needed for external APIs."

Advanced

Interview Questions — Serverless

Serverless is a top interview domain. These questions test your understanding of Lambda internals, event-driven patterns, and when serverless is NOT the right choice.

A Lambda function has cold starts averaging 3 seconds. Users complain about slow API responses. Walk me through every option to reduce cold start latency.

Answer Guide

Provisioned Concurrency (pre-warm instances, costs money), SnapStart (Java: snapshots initialized state), smaller deployment packages, avoid VPC unless necessary, use lighter runtimes (Python/Node over Java), lazy initialization of SDK clients, and Lambda Layers for shared dependencies.

Design an order processing workflow: validate order → charge payment → reserve inventory → send confirmation email. If payment fails after inventory is reserved, how do you handle rollback? Which AWS service orchestrates this?

Answer Guide

Step Functions (Standard Workflow) with compensation pattern. Each step has a corresponding undo step. If ChargePayment fails → call ReleaseInventory. Use Step Functions' built-in error handling (Catch/Retry). Discuss orchestration (Step Functions) vs choreography (EventBridge + SQS).

Explain Lambda concurrency. Your account has a limit of 1,000 concurrent executions shared by 10 Lambda functions. One rogue function consumes 900 concurrency, throttling all others. How do you prevent this?

Answer Guide

Reserved Concurrency: allocate a guaranteed pool per function (e.g., 100 for critical functions). The rogue function gets a cap. Remaining unreserved pool is shared. Discuss: reserved concurrency also acts as a throttle (requests beyond the limit get throttled, not queued).

When would you choose EventBridge over SNS + SQS? Design an event-driven architecture for a system where 15 different consumers need to react to "order placed" events with different filtering criteria.

Answer Guide

EventBridge: content-based filtering (rules match on event body fields), schema registry, archive/replay, cross-account delivery. SNS: simple fan-out, filter on message attributes (limited). For 15 consumers with different filters, EventBridge rules are cleaner than 15 SNS filter policies.

A Lambda function processes SQS messages. Under high load, some messages are processed twice, causing duplicate database entries. How do you make the system idempotent?

Answer Guide

Idempotency key in DynamoDB (conditional write: PutItem with the condition attribute_not_exists on the key). Powertools for AWS Lambda provides a built-in idempotency decorator. An SQS FIFO queue with deduplication suppresses duplicate sends within a 5-minute window, but a message can still be redelivered if the function fails or times out after writing, so keep the conditional write as the last line of defense.

Compare Step Functions Standard vs Express workflows. An image processing pipeline processes 10,000 images per minute, each taking 30 seconds. Which type and why?

Answer Guide

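Express Workflows: built for high-volume, short-duration executions (up to 5 minutes, so a 30-second image job fits). Standard is billed per state transition ($0.025 per 1,000), while Express is billed per request ($1.00 per million) plus duration, so at 10K executions/min Express is substantially cheaper; quantify the difference from the number of transitions and the run time rather than quoting a fixed multiple. Trade-off: asynchronous Express executions are at-least-once (synchronous ones are at-most-once), so steps must be idempotent, and execution history goes to CloudWatch Logs instead of the console's full visual history. Standard for long-running, Express for high-throughput.

Your serverless API (API Gateway + Lambda) works perfectly at 100 requests/second but returns 429 errors at 1,000 requests/second. Walk through all the potential throttling points.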

Answer Guide

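API Gateway default: 10,000 req/s account limit (per region), plus per-stage and per-method throttling and usage-plan limits. Lambda: 1,000 concurrent executions by default, reserved concurrency caps, and the rate at which concurrency can scale up during a sudden burst. Apply the concurrency formula: at 1,000 req/s, any average duration above 1 second needs more than 1,000 concurrent executions. DynamoDB: provisioned capacity throttling. Check CloudWatch metrics for each service's throttle count, then request limit increases or raise the relevant caps. Provisioned Concurrency removes cold starts but does not raise the account concurrency limit.

When would you NOT use serverless? Give three concrete scenarios where EC2 or containers are a better choice than Lambda.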

Answer Guide

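1) Long-running processes (>15 min): batch ETL, video transcoding. 2) Consistently high, steady traffic: once instances can be kept busy around the clock, per-request pricing usually costs more than EC2 or containers; find the break-even point from request volume, duration, and memory. 3) Applications that must hold long-lived connections or in-memory state in the compute layer: gRPC streaming, stateful sessions, self-managed WebSocket servers (API Gateway WebSocket APIs can front Lambda, but the function itself cannot keep a connection open). Also: GPU workloads, compliance requiring specific OS hardening.

Design a serverless data pipeline: ingest data from 50 IoT devices (1 event/second each), transform the data, and store it for analytics. Which services would you use?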

Answer Guide

IoT Core → Kinesis Data Firehose (batching + buffering) → Lambda transform (enrich/filter) → S3 (Parquet format for Athena). For real-time: IoT Core → Lambda → DynamoDB for live dashboards. Discuss: Firehose handles batching and delivery guarantees, reducing Lambda invocations vs processing each event individually.

A Lambda function in a VPC takes 10+ seconds for cold starts (vs 1 second without VPC). Explain why this happens and how AWS has improved it. Is VPC still slow for Lambda in 2025?

Answer Guide

Before 2019: Lambda created an ENI per cold start (10-30s). After Hyperplane (2019): ENIs are pre-provisioned and shared — cold starts reduced to ~1s even in VPC. In 2025, there's minimal penalty. But: still need NAT Gateway for internet access from VPC Lambda, which adds cost. Only put Lambda in VPC if it needs to access VPC resources (RDS, ElastiCache).

Preparation Strategy

Serverless interviews test two things: event-driven thinking and knowing the limits. For every Lambda question, mention the 15-minute timeout, 10 GB memory limit, 1,000 concurrent executions default, and 6 MB synchronous payload limit. Knowing limits shows you've built real systems.