Databases

Database Interview Guide

Database selection, RDS vs Aurora vs DynamoDB, caching, and migration β€” asked in every SA interview.

5 Topics · 5 Diagrams
Diagram: AWS Database Services — Decision Framework
Advanced

Database Selection Framework

Diagram: data requirement → service. SQL/ACID with joins → Aurora PostgreSQL/MySQL; NoSQL at scale → DynamoDB (serverless key-value); caching with sub-ms latency → ElastiCache (Redis/Memcached); data warehouse → Redshift (petabyte analytics).

The #1 SA interview question: "Which database would you use?" β€” here's the decision matrix:

| Requirement | Best Choice | Why |
| --- | --- | --- |
| Relational + complex queries | Aurora (PostgreSQL/MySQL) | 5x throughput of standard MySQL, auto-scaling storage, multi-AZ |
| Key-value, <10ms, any scale | DynamoDB | Serverless, single-digit ms latency, unlimited scale, on-demand pricing |
| Sub-millisecond caching | ElastiCache (Redis) | In-memory, supports complex data structures, pub/sub |
| Data warehouse / analytics | Redshift | Columnar storage, MPP, optimized for OLAP queries on petabytes |
| Document / flexible schema | DocumentDB | MongoDB-compatible, managed, scales to millions of reads/writes |
| Graph relationships | Neptune | Social networks, fraud detection, knowledge graphs |
| Time-series / IoT | Timestream | Purpose-built for time-series with built-in analytics |
| Ledger / immutable | Aurora PostgreSQL + pgcrypto | Verifiable transaction log (QLDB deprecated July 2025 — migrate to Aurora with audit tables) |
| In-memory + durability | MemoryDB for Redis | Redis-compatible with multi-AZ durable storage |

The Decision Tree (Interview Framework)

  1. Need SQL joins/transactions? β†’ Yes: Aurora. No: continue.
  2. Simple key-value access patterns? β†’ Yes: DynamoDB. No: continue.
  3. Need caching layer? β†’ Yes: ElastiCache Redis. No: continue.
  4. Analytics on large datasets? β†’ Yes: Redshift or Athena (serverless). No: continue.
  5. Graph relationships? β†’ Yes: Neptune. No: check remaining options.
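The five steps above can be sketched as a first-match-wins function. This is an illustrative study aid, not an AWS API; the parameter names are made up for the sketch:

```python
# Hypothetical sketch of the decision tree above; first matching need wins.
def pick_database(needs_sql_joins=False, simple_key_value=False,
                  needs_cache=False, large_scale_analytics=False,
                  graph_relationships=False):
    """Walk the interview decision tree in order and return a service name."""
    if needs_sql_joins:
        return "Aurora"
    if simple_key_value:
        return "DynamoDB"
    if needs_cache:
        return "ElastiCache Redis"
    if large_scale_analytics:
        return "Redshift or Athena"
    if graph_relationships:
        return "Neptune"
    return "check remaining purpose-built options"

print(pick_database(needs_sql_joins=True))   # Aurora
print(pick_database(simple_key_value=True))  # DynamoDB
```

The ordering matters: a workload that needs SQL joins short-circuits before the key-value question is ever asked, which mirrors how you would narrate the tree in an interview.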

🎯 Key Takeaway

Interview tip: "I start with the access pattern: if I need complex SQL joins and ACID transactions, Aurora. If simple get/put at massive scale, DynamoDB. For analytics, Redshift. I always add ElastiCache in front for hot data; even a simple cache can absorb the large majority of reads, often 80-90% of read traffic for read-heavy workloads, before they ever reach RDS."

Intermediate

RDS vs Aurora Deep Dive

Diagram: Aurora shared storage architecture — one primary (read/write) and up to three readers (<10ms lag) all attached to the Aurora distributed storage layer (auto-scaling, self-healing), which keeps 6 copies across 3 Availability Zones.
| Feature | RDS | Aurora | Aurora Serverless v2 |
| --- | --- | --- | --- |
| Engines | MySQL, PostgreSQL, MariaDB, Oracle, SQL Server | MySQL, PostgreSQL only | MySQL, PostgreSQL only |
| Storage | EBS (gp3/io1), manual scaling | Auto-scales 10GB → 128TB, 6 copies across 3 AZs | Same as Aurora |
| Read Replicas | Up to 15 (async) | Up to 15 (shared storage, ~10ms lag) | Same as Aurora |
| Failover | ~60-120 seconds | ~30 seconds (faster with a reader) | ~30 seconds |
| Scaling | Manual instance resize (downtime) | Add readers, instance resize | Auto-scales 0.5 → 128 ACUs with load |
| Cost | Lower starting point | ~20% more than RDS | Pay per ACU-hour (can scale to zero) |
| Global | Cross-region read replicas (minutes of lag) | Global Database (<1 second lag, up to 5 secondary regions) | Supported (since 2024) |
| Best For | Small/medium workloads, Oracle/SQL Server | Production MySQL/PostgreSQL | Variable/unpredictable workloads |

Aurora Storage Architecture (Know This)

Aurora stores data in a shared, distributed storage layer: 6 copies across 3 AZs. It can lose 2 copies and still write, lose 3 and still read. Storage auto-heals β€” bad segments are automatically repaired. This is why Aurora has sub-30-second failover.
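The "lose 2 and still write, lose 3 and still read" property falls out of Aurora's documented quorums: writes need 4 of 6 copies to acknowledge, reads need 3 of 6 (and 4 + 3 > 6 guarantees every read quorum overlaps every write quorum). A minimal sketch of that arithmetic:

```python
# Aurora's quorum model: 6 storage copies, write quorum 4/6, read quorum 3/6.
TOTAL_COPIES = 6
WRITE_QUORUM = 4
READ_QUORUM = 3

def can_write(lost_copies):
    """Writes succeed while at least WRITE_QUORUM copies survive."""
    return TOTAL_COPIES - lost_copies >= WRITE_QUORUM

def can_read(lost_copies):
    """Reads succeed while at least READ_QUORUM copies survive."""
    return TOTAL_COPIES - lost_copies >= READ_QUORUM

print(can_write(2))  # True  -> lose 2 copies (a whole AZ), still writable
print(can_write(3))  # False -> writes pause until storage self-heals
print(can_read(3))   # True  -> lose 3 copies, still readable
```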

Multi-AZ vs Read Replicas

  • Multi-AZ: Standby instance for HA (automatic failover). Cannot serve reads. Purpose: availability.
  • Read Replica: Separate instance that serves read queries. Can be promoted to primary. Purpose: read scaling.
  • Aurora: Every reader is also a failover target. Multi-AZ + read scaling in one.

🎯 Key Takeaway

Interview tip: "I always recommend Aurora over standard RDS for production MySQL/PostgreSQL. Aurora's storage layer (6 copies, 3 AZs, auto-healing) gives better durability and 30-second failover. For unpredictable workloads, Aurora Serverless v2 auto-scales ACUs. For multi-region, Aurora Global Database provides <1 second replication."

Advanced

DynamoDB Architect Patterns

Diagram: DynamoDB event-driven CDC pipeline — Application (PutItem/UpdateItem) → DynamoDB Orders table → DynamoDB Streams (time-ordered) → Lambda enrichment → EventBridge fan-out to targets. Use case: when an order is created in DynamoDB, instantly notify Billing, Shipping, and Analytics, decoupled from the main database transaction.

DynamoDB is THE go-to for serverless and high-scale architectures:

Key Concepts

| Concept | What | Why It Matters |
| --- | --- | --- |
| Partition Key (PK) | Required. Determines which partition stores the item. | Bad PK = hot partition = throttling. Use high-cardinality keys. |
| Sort Key (SK) | Optional. Enables range queries within a partition. | Design SK to support your query patterns (e.g., ORDER#timestamp). |
| GSI | Global Secondary Index — different PK/SK | Enables queries on non-key attributes. Has its own throughput. |
| LSI | Local Secondary Index — same PK, different SK | Must be created at table creation. Shares table throughput. |
| Single-Table Design | Store multiple entity types in one table | Reduces costs, enables transactional operations across entities. |
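A minimal sketch of what single-table key design looks like in practice, assuming a hypothetical customers-and-orders table (the `CUSTOMER#`/`ORDER#` prefixes and attribute names are illustrative conventions, not an AWS requirement):

```python
# Hypothetical single-table design: customers and their orders share a table,
# with the entity type encoded in key prefixes.
def customer_keys(customer_id):
    return {"PK": f"CUSTOMER#{customer_id}", "SK": "PROFILE"}

def order_keys(customer_id, order_date, order_id):
    # Sort keys compare lexicographically, so ISO-8601 dates in the SK
    # give chronological ordering and cheap date-range queries.
    return {"PK": f"CUSTOMER#{customer_id}",
            "SK": f"ORDER#{order_date}#{order_id}"}

print(customer_keys("123"))
print(order_keys("123", "2025-01-15", "A9"))
```

With these keys, a single Query on `PK = "CUSTOMER#123"` with a sort-key condition of `begins_with("ORDER#2025-01")` would return all of that customer's January orders in one request, which is the "access-pattern-first" payoff.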

Capacity Modes

| Mode | Best For | Cost | Scaling |
| --- | --- | --- | --- |
| On-Demand | Unpredictable, new apps, spiky traffic | ~6x provisioned price per request | Instant, automatic |
| Provisioned | Predictable, steady workloads | Lowest (with auto-scaling) | Auto-scaling with target utilization |
| Reserved Capacity | Committed steady workload | Up to 77% savings | Fixed capacity |
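A back-of-envelope comparison of the two modes at steady traffic. The prices below are illustrative placeholders, not current AWS list prices (always check the DynamoDB pricing page), but the shape of the result holds: steady load favors provisioned, spiky load favors on-demand:

```python
# Illustrative unit prices -- placeholders, NOT current AWS pricing.
ON_DEMAND_PRICE_PER_MILLION_WRITES = 1.25   # assumed $/1M write requests
PROVISIONED_WCU_PRICE_PER_HOUR = 0.00065    # assumed $/WCU-hour

def on_demand_monthly(writes_per_second):
    """Cost if every write is billed per-request."""
    writes = writes_per_second * 3600 * 24 * 30
    return writes / 1_000_000 * ON_DEMAND_PRICE_PER_MILLION_WRITES

def provisioned_monthly(writes_per_second, utilization=0.7):
    """Cost if capacity is provisioned with headroom above the average."""
    wcu = writes_per_second / utilization
    return wcu * PROVISIONED_WCU_PRICE_PER_HOUR * 24 * 30

steady = 100  # writes/sec, perfectly steady
print(f"on-demand:   ${on_demand_monthly(steady):.2f}/mo")
print(f"provisioned: ${provisioned_monthly(steady):.2f}/mo")
```

This is why the interview answer is "start On-Demand, switch to Provisioned once traffic is predictable": the crossover depends entirely on how steady the load is.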

DynamoDB Streams + EventBridge

Pattern: DynamoDB Streams captures every item change (INSERT, MODIFY, DELETE) β†’ triggers Lambda β†’ fan-out via EventBridge to downstream services. This is the CDC (Change Data Capture) pattern for real-time event-driven architectures.
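A minimal sketch of the Lambda side of this pattern. The record shape follows DynamoDB's documented stream event format; the EventBridge publish is stubbed out (a real handler would call `put_events` via boto3) so the example runs locally:

```python
import json

def publish_to_eventbridge(detail_type, detail):
    # Stub: a real handler would call the EventBridge put_events API here.
    print(f"would publish {detail_type}: {json.dumps(detail)}")

def handler(event, context=None):
    """Process DynamoDB Streams records and fan out INSERTs as events."""
    published = 0
    for record in event["Records"]:
        if record["eventName"] == "INSERT":
            # NewImage holds the item in DynamoDB's typed-JSON format,
            # e.g. {"order_id": {"S": "ORD-1001"}}
            new_image = record["dynamodb"]["NewImage"]
            order_id = new_image["order_id"]["S"]
            publish_to_eventbridge("OrderCreated", {"order_id": order_id})
            published += 1
    return {"published": published}

# Synthetic stream event for local testing
sample_event = {"Records": [{
    "eventName": "INSERT",
    "dynamodb": {"NewImage": {"order_id": {"S": "ORD-1001"}}},
}]}
print(handler(sample_event))  # {'published': 1}
```

Filtering on `eventName` is the key move: the same stream also delivers MODIFY and DELETE records, so each consumer picks out only the changes it cares about.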

🎯 Key Takeaway

Interview tip: "I design DynamoDB tables access-pattern-first: identify all queries before choosing PK/SK. I use single-table design to colocate related entities, On-Demand mode for new apps, then switch to Provisioned with auto-scaling once traffic is predictable. DynamoDB Streams + Lambda powers real-time event processing."

Intermediate

Caching β€” ElastiCache & DAX

Caching Strategies Diagram
| Service | Engine | Latency | Best For |
| --- | --- | --- | --- |
| ElastiCache Redis | Redis (in-memory) | Sub-millisecond | Session store, leaderboards, pub/sub, complex data types |
| ElastiCache Memcached | Memcached (in-memory) | Sub-millisecond | Simple key-value caching, multi-threaded |
| DAX | DynamoDB-native cache | Microseconds | DynamoDB read-heavy workloads (drop-in, no code change) |
| CloudFront | CDN edge cache | Edge-level | Static assets, API responses at the edge |

Caching Strategies

| Strategy | How | Use When | Trade-off |
| --- | --- | --- | --- |
| Lazy Loading (Cache-Aside) | App checks cache → miss → read DB → write to cache | Most cases, default strategy | First read is always a miss (cold start) |
| Write-Through | Write to cache AND DB on every write | Data must always be in cache | Higher write latency, cache may hold unused data |
| Write-Behind | Write to cache → async write to DB later | High write throughput | Risk of data loss if cache crashes |
| TTL | Set expiration time on cached items | Always — prevents stale data | Must balance freshness vs hit rate |
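The default strategy above, lazy loading with TTL, can be sketched in a few lines. This is an in-process stand-in (a dict plays the role of ElastiCache Redis, and the loader function plays the database) just to make the read path concrete:

```python
import time

class CacheAside:
    """Lazy loading (cache-aside) with per-item TTL expiry."""

    def __init__(self, loader, ttl_seconds):
        self.loader = loader      # function that reads from the database
        self.ttl = ttl_seconds
        self._store = {}          # key -> (value, expires_at)

    def get(self, key):
        hit = self._store.get(key)
        if hit and hit[1] > time.monotonic():
            return hit[0]                  # fresh cache hit
        value = self.loader(key)           # miss or expired: hit the DB
        self._store[key] = (value, time.monotonic() + self.ttl)
        return value

db_reads = []
def load_from_db(key):
    db_reads.append(key)          # count trips to the "database"
    return f"row-for-{key}"

cache = CacheAside(load_from_db, ttl_seconds=60)
cache.get("user:1")    # miss -> one DB read
cache.get("user:1")    # hit  -> served from cache
print(len(db_reads))   # 1
```

Note the trade-off from the table playing out: the first `get` is always a miss, and the TTL bounds how stale a cached row can be.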

🎯 Key Takeaway

Interview tip: "I use lazy loading with TTL as my default caching strategy β€” it handles cache misses gracefully and TTL prevents stale data. For DynamoDB reads, DAX gives microsecond latency with zero code changes. Redis is my choice over Memcached because it supports pub/sub, sorted sets, and persistence."

Intermediate

Database Migration (DMS & SCT)

Database Migration Diagram
| Tool | Purpose | Migration Type |
| --- | --- | --- |
| DMS (Database Migration Service) | Migrate data between databases, including while the source stays live | Homogeneous (MySQL → MySQL) or heterogeneous (Oracle → Aurora) |
| SCT (Schema Conversion Tool) | Convert schema + stored procedures between engines | Heterogeneous only (e.g., Oracle → PostgreSQL) |

DMS Migration Modes

  • Full Load β€” One-time migration of all existing data
  • CDC (Change Data Capture) β€” Continuous replication of changes after full load
  • Full Load + CDC β€” Most common: migrate everything, then keep syncing until cutover

Migration Strategy

  1. Assessment: Run SCT to analyze schema compatibility and identify conversion issues
  2. Schema conversion: SCT converts schema, stored procedures, triggers
  3. Full load: DMS migrates all existing data to the target
  4. CDC replication: DMS continuously replicates changes while source is live
  5. Cutover: Stop writes to source β†’ verify target β†’ switch application connection string
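The cutover gate in step 5 can be sketched as a polling loop: do not switch the application until replication lag reaches zero. The lag readings below are synthetic; with real DMS you would poll the CDC latency metrics that DMS publishes to CloudWatch rather than a hard-coded list:

```python
def wait_for_zero_lag(lag_samples):
    """Return the number of polls until lag hits 0, or None if it never does.

    lag_samples stands in for successive replication-lag readings (seconds);
    a real implementation would fetch these from DMS task metrics.
    """
    for polls, lag_seconds in enumerate(lag_samples, start=1):
        if lag_seconds == 0:
            return polls   # safe to stop source writes and cut over
    return None            # lag never converged: do NOT cut over

lag_readings = [45, 12, 3, 0]       # synthetic: lag draining after full load
print(wait_for_zero_lag(lag_readings))  # 4
```

The `None` branch encodes the rollback discipline from the interview tip: if lag never reaches zero, the source keeps serving traffic and nothing is lost.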

🎯 Key Takeaway

Interview tip: "For database migration, I use DMS with Full Load + CDC for near-zero downtime. For heterogeneous (Oracle β†’ Aurora), I run SCT first to convert the schema, then DMS for data migration. During CDC, the source stays live β€” we cut over only when the replication lag is zero."

Advanced

Interview Questions β€” Databases

Database selection and design is a core architect competency. These questions test your ability to choose the right database and handle data at scale.

  1. "Why would you recommend Aurora over standard RDS PostgreSQL for a production workload?"
    Answer Guide: Aurora — 5x throughput over standard PostgreSQL, 6 copies across 3 AZs, auto-scaling storage to 128TB, up to 15 read replicas. The storage architecture (shared distributed volume) keeps replica lag minimal, typically under 10ms.
  2. "Design a DynamoDB table for customer orders, queryable by customer and by order ID."
    Answer Guide: PK = customer_id, SK = order_date#order_id (enables range queries). GSI1: PK = order_id (single lookup). Consider single-table design vs multiple tables. Discuss hot partition risk if one customer has millions of orders.
  3. "Your DynamoDB table is throttling because of a hot partition. How do you fix it?"
    Answer Guide: 1) Write sharding — add a random suffix to the partition key (e.g., customer_123#1, customer_123#2). 2) Composite keys — add a time-based prefix. 3) Caching — DAX in front of DynamoDB for read-heavy patterns. Also: re-evaluate the data model itself.
  4. "When would you use DAX versus ElastiCache?"
    Answer Guide: DAX is specifically for DynamoDB — microsecond read latency, API-compatible (drop-in). ElastiCache Redis is general-purpose — supports complex data structures (sorted sets, pub/sub, Lua scripting), works with any backend. Use DAX for DynamoDB-only caching, Redis for everything else.
  5. "How would you migrate an on-premises Oracle database to Aurora PostgreSQL with minimal downtime?"
    Answer Guide: 1) SCT to convert schema. 2) Fix incompatibilities manually (Oracle-specific PL/SQL, sequences, synonyms). 3) DMS Full Load to seed data. 4) DMS CDC for ongoing replication. 5) Validate data consistency. 6) Cut over when replication lag = 0. Test rollback plan before cutover.
  6. "Design the data layer for a real-time gaming leaderboard."
    Answer Guide: Write-behind (write to cache immediately, async persist to DB) for real-time updates. Redis Sorted Sets are perfect for leaderboards (ZADD, ZRANGE). Lazy loading risks stale data. Write-through adds latency on every write. Discuss TTL as a safety net.
  7. "Design a database layer for an application that is active in three regions."
    Answer Guide: DynamoDB Global Tables — active-active in all 3 regions with eventual consistency (typically sub-second replication). For SQL: Aurora Global Database gives low-latency reads in secondary regions, but writes go to one primary region. Trade-off: DynamoDB has eventual consistency, Aurora has a single writer.
  8. "Your Lambda functions are exhausting the database connections on RDS. What do you do?"
    Answer Guide: RDS Proxy — connection pooling and multiplexing. Lambda creates a new connection per invocation; RDS Proxy pools them (e.g., 1,000 Lambda connections → 100 DB connections). Also supports IAM authentication and automatic failover. Without RDS Proxy, you'll exhaust max_connections.
  9. "Your DynamoDB On-Demand bill is growing fast. How do you reduce costs?"
    Answer Guide: Switch to Provisioned capacity with auto-scaling (20-40% cheaper for predictable traffic). Add DAX for read-heavy patterns (reduces consumed RCUs dramatically). Enable DynamoDB Reserved Capacity for the baseline. Consider the DynamoDB Infrequent Access table class for cold data.
  10. "Explain DynamoDB's eventually consistent versus strongly consistent reads. When do you need strong consistency?"
    Answer Guide: Eventually consistent (default) — may return stale data (replication takes ~ms). Strongly consistent — always returns the latest write, but at 2x the cost, and reads only go to the leader node. Use strong consistency for: financial transactions, inventory counts, anything where reading stale data causes business errors.
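The write-sharding fix from the hot-partition answer can be sketched concretely. The shard count and key format below are illustrative choices, not AWS requirements; the point is the trade: writes spread across N physical keys, reads must fan out across all N and merge:

```python
import random

SHARDS = 10  # illustrative shard count; tune to the observed write rate

def sharded_pk(customer_id):
    """Writes pick a random shard, e.g. 'customer_123#7'."""
    return f"customer_{customer_id}#{random.randrange(SHARDS)}"

def all_shard_pks(customer_id):
    """Reads must query every shard and merge the results client-side."""
    return [f"customer_{customer_id}#{i}" for i in range(SHARDS)]

print(sharded_pk("123"))
print(len(all_shard_pks("123")))  # 10
```

This is a classic interview trade-off to name out loud: sharding multiplies write headroom by N but turns one Query into N Queries on the read path, so it suits write-hot, read-light keys.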

Preparation Strategy

Database questions test your decision framework: SQL vs NoSQL, consistency vs availability, cost vs performance. For every answer, explain the trade-off: "I chose Aurora over DynamoDB because [requirement] outweighs [trade-off]." Interviewers want to see you think through options, not jump to a favorite.