
Storage Interview Guide

S3 storage classes, block vs file vs object storage, data transfer, and backup — critical for cost optimization questions.

Intermediate

S3 Storage Classes

Knowing S3 storage classes = cost optimization mastery. Interviewers ask about this table constantly:

[Diagram: lifecycle policies automatically transition objects to lower-cost storage tiers over time — S3 Standard (day 0-30) → Standard-IA (day 30-90) → Glacier (day 90-365) → Deep Archive (day 365+)]
| Class | Durability | Availability | Min Duration | Retrieval | Cost (per GB/mo) | Use Case |
| --- | --- | --- | --- | --- | --- | --- |
| S3 Standard | 11 9s | 99.99% | None | Instant | $0.023 | Frequently accessed data |
| S3 Intelligent-Tiering | 11 9s | 99.9% | None | Instant | $0.023 + monitoring | Unknown access patterns |
| S3 Standard-IA | 11 9s | 99.9% | 30 days | Instant | $0.0125 | Backups, infrequent access |
| S3 One Zone-IA | 11 9s | 99.5% | 30 days | Instant | $0.01 | Reproducible data, secondary backups |
| S3 Glacier Instant | 11 9s | 99.9% | 90 days | Milliseconds | $0.004 | Rarely accessed, but need instant access |
| S3 Glacier Flexible | 11 9s | 99.99% | 90 days | 1-12 hours | $0.0036 | Archives, compliance data |
| S3 Glacier Deep Archive | 11 9s | 99.99% | 180 days | 12-48 hours | $0.00099 | Regulatory retention (7+ years) |

Lifecycle Policies (Auto-Transition)

Pattern: S3 Standard (0-30 days) → S3 Standard-IA (30-90 days) → Glacier Instant (90-365 days) → Glacier Deep Archive (365+ days, delete after 7 years). This reduces storage costs by 80%+ over keeping everything in Standard.
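The pattern above maps onto a single lifecycle rule. A minimal sketch of the payload shape that boto3's `put_bucket_lifecycle_configuration` expects (the bucket name and rule ID are hypothetical placeholders):

```python
import json

# Lifecycle rule implementing Standard -> Standard-IA -> Glacier Instant -> Deep Archive,
# then deletion after ~7 years. The rule ID is a made-up placeholder.
lifecycle_config = {
    "Rules": [
        {
            "ID": "tiered-archival",
            "Status": "Enabled",
            "Filter": {"Prefix": ""},  # empty prefix = apply to every object
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 90, "StorageClass": "GLACIER_IR"},   # Glacier Instant Retrieval
                {"Days": 365, "StorageClass": "DEEP_ARCHIVE"},
            ],
            "Expiration": {"Days": 2555},  # ~7 years, then delete
        }
    ]
}

# With credentials configured, this would be applied via:
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="my-bucket", LifecycleConfiguration=lifecycle_config)
config_json = json.dumps(lifecycle_config, indent=2)
```

The same JSON also works with `aws s3api put-bucket-lifecycle-configuration --lifecycle-configuration file://rules.json`.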

🎯 Key Takeaway

Interview tip: "I set lifecycle policies on every S3 bucket: transition to IA after 30 days, Glacier after 90, Deep Archive after 365. For unknown patterns, Intelligent-Tiering automates this. All classes have 11 9s durability — availability differs. One Zone-IA saves 20% but is single-AZ — only for reproducible data."

Advanced

S3 Advanced Features

[Diagram: users reach an S3 primary bucket in the primary region (us-east-1) through CloudFront with OAC; the bucket has Object Lock & versioning and triggers Lambda on events, and CRR asynchronously replicates to a replica bucket in a different AWS account in the DR region (us-west-2) — ransomware protected]
| Feature | What | Use Case |
| --- | --- | --- |
| Versioning | Keep multiple versions of every object | Accidental-delete protection, audit trail |
| Cross-Region Replication (CRR) | Asynchronously replicate objects to another region | DR, compliance (multi-region copies), low-latency access |
| Same-Region Replication (SRR) | Replicate within the same region | Log aggregation, live copy between accounts |
| S3 Object Lock | WORM (Write Once Read Many) | Regulatory compliance (SEC, HIPAA), ransomware protection |
| S3 Event Notifications | Trigger Lambda/SQS/SNS/EventBridge on PUT/DELETE | Image processing, ETL pipelines, real-time analytics |
| Pre-Signed URLs | Temporary access to private objects | File upload/download without exposing credentials |
| S3 Transfer Acceleration | Use CloudFront edges for faster uploads | Global users uploading to a single bucket |
| Multipart Upload | Upload large files in parallel parts | Required for files >5 GB, recommended for >100 MB |
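The multipart limits drive simple planning math: S3 caps an upload at 10,000 parts with a 5 MiB minimum part size, so very large files force larger parts. A quick sketch of that logic (the 100 MiB default part size is an arbitrary choice for illustration):

```python
MIN_PART = 5 * 1024**2   # S3 minimum part size: 5 MiB (except the last part)
MAX_PARTS = 10_000       # S3 maximum number of parts per upload

def plan_parts(file_size: int, part_size: int = 100 * 1024**2):
    """Return (part_size, part_count), growing the part size when the
    requested size would blow past the 10,000-part limit."""
    part_size = max(part_size, MIN_PART)
    if file_size / part_size > MAX_PARTS:
        part_size = -(-file_size // MAX_PARTS)  # ceil division
    part_count = -(-file_size // part_size)
    return part_size, part_count

# A 5 GiB file splits into 52 parts of 100 MiB; a 5 TiB file (the S3 object
# size ceiling) forces ~524 MiB parts to stay under 10,000.
size_5g, count_5g = plan_parts(5 * 1024**3)
size_5t, count_5t = plan_parts(5 * 1024**4)
```

Managed transfer helpers (e.g. boto3's `upload_file`) do this sizing automatically; the sketch just shows why the limits matter.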

S3 Security Layers

  • Bucket Policy — Allow/deny access to the bucket and objects (resource-based)
  • Block Public Access — Overrides ACLs and bucket policies that would grant public access. Always enable at the account level; AWS recommends ON for all accounts.
  • OAC (Origin Access Control) — Only let CloudFront access S3. Replaces the legacy OAI.
  • S3 Access Points — Named network endpoints with unique policies per use case. Simplify shared bucket management.
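The OAC layer boils down to a bucket policy that trusts the CloudFront service principal, scoped to one specific distribution. A sketch of that policy (the bucket name, account ID, and distribution ID are hypothetical placeholders):

```python
import json

# Bucket policy allowing ONLY CloudFront (via OAC) to read objects.
# "my-bucket", the account ID, and the distribution ID are made up.
oac_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowCloudFrontServicePrincipal",
            "Effect": "Allow",
            "Principal": {"Service": "cloudfront.amazonaws.com"},
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::my-bucket/*",
            "Condition": {
                "StringEquals": {
                    # Scope the grant to a single distribution, not all of CloudFront
                    "AWS:SourceArn": "arn:aws:cloudfront::123456789012:distribution/EDFDVBD6EXAMPLE"
                }
            },
        }
    ],
}

policy_json = json.dumps(oac_policy)
```

Combined with Block Public Access, this leaves CloudFront as the only read path into the bucket.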

🎯 Key Takeaway

Interview tip: "I enable versioning + CRR for critical data, Block Public Access at the account level, Object Lock for compliance. For serving S3 via CloudFront, I use OAC (not the legacy OAI). Pre-signed URLs handle secure file uploads — never pass files through Lambda."

Intermediate

EBS vs EFS vs FSx

[Diagram: Amazon EBS as block storage (single AZ) with a 1:1 attachment between an EC2 instance and a gp3 volume; Amazon EFS as a multi-AZ NFS share serving thousands of connections from EC2 and ECS/Fargate; Amazon FSx for high-performance/specialized needs — FSx for Lustre for ML clusters, FSx for Windows for Windows EC2]
| Feature | EBS | EFS | FSx for Lustre | FSx for Windows |
| --- | --- | --- | --- | --- |
| Type | Block storage | NFS file system | High-performance parallel FS | SMB/Windows file server |
| Attach | Single EC2 (Multi-Attach on io1/io2) | 1000s of EC2/ECS/Lambda | 1000s of instances | Windows instances |
| AZ Scope | Single AZ | Multi-AZ (regional) | Single AZ or linked to S3 | Single or Multi-AZ |
| Performance | Up to 256K IOPS (io2) | Elastic throughput | Hundreds of GB/s | Up to 3 GB/s |
| Cost | Lowest (gp3: $0.08/GB) | $0.30/GB (Standard) | $0.14/GB | $0.13/GB |
| Use Case | Boot volumes, databases, single-instance | Shared content, web servers, ML training data | HPC, ML training, genomics | Active Directory, .NET apps |

EBS Volume Types (Know These)

| Type | IOPS | Throughput | Use Case |
| --- | --- | --- | --- |
| gp3 | 3,000 baseline, 16,000 max | 125 MB/s baseline, 1,000 MB/s max | Default for most workloads (best price/performance) |
| io2 Block Express | Up to 256,000 | 4,000 MB/s | Critical databases (Oracle, SAP HANA) |
| st1 | N/A (throughput-optimized) | Up to 500 MB/s | Big data, Hadoop, data warehouses |
| sc1 | N/A (cold storage) | Up to 250 MB/s | Infrequent access, lowest cost |
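The gp2-to-gp3 saving is easy to show with napkin math. A sketch using us-east-1 list prices at the time of writing (prices vary by region and change over time, so treat the constants as assumptions):

```python
# Assumed us-east-1 list prices: gp2 $0.10/GB-mo, gp3 $0.08/GB-mo.
# gp3 also includes 3,000 IOPS / 125 MB/s baseline at no extra charge,
# while gp2 ties IOPS to volume size.
GP2_PER_GB = 0.10
GP3_PER_GB = 0.08

def monthly_cost(gib: int, price_per_gb: float) -> float:
    """Storage-only monthly cost for a volume of `gib` GB."""
    return gib * price_per_gb

gp2 = monthly_cost(1000, GP2_PER_GB)    # a 1 TB volume on gp2
gp3 = monthly_cost(1000, GP3_PER_GB)    # the same volume on gp3
savings_pct = (gp2 - gp3) / gp2 * 100   # the "20% cheaper" claim
```

Migrating gp2 to gp3 is an online `modify-volume` operation, so the saving requires no downtime.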

🎯 Key Takeaway

Interview tip: "EBS for single-instance block storage (gp3 is my default), EFS for shared NFS across multiple instances (containers, Lambda), FSx for Lustre for HPC/ML workloads, FSx for Windows for Active Directory environments. Always use gp3 over gp2 — same performance at 20% lower cost."

Intermediate

Data Transfer Strategies

How do you move terabytes/petabytes to AWS? This is a common migration question:

[Decision flowchart: calculate data size / available bandwidth. If transfer time is under 1 week, use network transfer (online): AWS DataSync for ongoing sync, Direct Connect for a dedicated line, S3 Transfer Acceleration for global uploads. If over 1 week, use physical devices (offline): Snowcone (8 TB) for small/edge, Snowball Edge (80 TB) for large data]
| Service | Data Volume | Speed | Use Case |
| --- | --- | --- | --- |
| S3 Transfer Acceleration | Any size | Network speed via CloudFront edges | Global uploads to a single S3 bucket |
| AWS DataSync | GB to TB | 10 Gbps via agent | On-prem NFS/SMB → S3/EFS/FSx (recurring sync) |
| Snow Family (Snowcone) | 8-14 TB | Physical device | Edge computing, small data migration |
| Snow Family (Snowball Edge) | 80-210 TB | Physical device, ~1 week | Large migration when the network is too slow |
| AWS Snowmobile (discontinued) | Up to 100 PB | Shipping-container truck | Datacenter-scale migration (no longer offered) |
| Direct Connect | Continuous | 1/10/100 Gbps dedicated | Ongoing hybrid workloads |

Decision Rule of Thumb

If it takes more than 1 week to transfer over your network → use the Snow Family. Example: 100 TB over a 1 Gbps connection is ~9 days at perfect line rate (800,000 Gb / 1 Gbps ≈ 800,000 seconds), closer to 12 days with protocol overhead. Use Snowball Edge instead.
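The rule of thumb is pure arithmetic. A sketch, assuming decimal terabytes and a tunable utilization factor (real links rarely sustain full line rate):

```python
def transfer_days(terabytes: float, gbps: float, utilization: float = 1.0) -> float:
    """Days needed to move `terabytes` over a `gbps` link at the given utilization."""
    bits = terabytes * 1e12 * 8                   # decimal TB -> bits
    seconds = bits / (gbps * 1e9 * utilization)   # effective line rate
    return seconds / 86_400

raw = transfer_days(100, 1)        # ~9.3 days at perfect line rate
real = transfer_days(100, 1, 0.8)  # ~11.6 days at a more realistic 80% utilization
```

Run this for the interviewer's numbers first; if the answer exceeds ~7 days, pivot to the Snow Family.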

🎯 Key Takeaway

Interview tip: "I calculate transfer time first: if it's over a week on the available bandwidth, I use Snowball Edge for offline transfer. For ongoing sync, DataSync with a local agent. For global users uploading files, S3 Transfer Acceleration via CloudFront edges. The key insight is: network transfer doesn't scale linearly — physical device wins for large volumes."

Intermediate

Backup & Recovery Architecture

[Diagram: a production account (EBS, RDS, DynamoDB) backed up by AWS Backup under a central backup policy, with copies sent to a central vault in an isolated backup account protected by Vault Lock (WORM) — ransomware safe]
| Service | What It Backs Up | Key Feature |
| --- | --- | --- |
| AWS Backup | EC2, EBS, RDS, DynamoDB, EFS, FSx, S3, Aurora | Central policy-based backup across all services. Cross-region, cross-account vault. |
| EBS Snapshots | EBS volumes | Incremental, stored in S3. Can copy cross-region for DR. |
| RDS Automated Backups | RDS/Aurora databases | Point-in-time recovery (PITR) within the retention window (up to 35 days). |
| DynamoDB PITR | DynamoDB tables | Continuous backups, restore to any second in the last 35 days. |
| S3 Versioning + Replication | S3 objects | Version history + cross-region copy for DR. |

Backup Architecture Best Practices

  • Centralize with AWS Backup — One policy for all services. Define backup frequency, retention, and cross-region vault copy.
  • Cross-Account Vault — Copy backups to a separate "backup account" for ransomware protection. The backup account has restricted access — even if the production account is compromised, backups are safe.
  • Immutable Backups — Use AWS Backup Vault Lock (WORM) to prevent anyone from deleting backups before retention expires.
  • Test Restores — Regularly test restoration to verify RTO. A backup you can't restore is not a backup.
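Those practices translate into a single backup plan. A sketch shaped like the `BackupPlan` input to boto3's `create_backup_plan` (the plan name, vault names, account ID, and retention values are placeholders):

```python
# Sketch of an AWS Backup plan: daily backups, 30-day retention, copied to a
# cross-account vault for ransomware isolation. All names/ARNs are made up.
backup_plan = {
    "BackupPlanName": "prod-daily",
    "Rules": [
        {
            "RuleName": "daily-0500-utc",
            "TargetBackupVaultName": "prod-vault",
            "ScheduleExpression": "cron(0 5 * * ? *)",  # 05:00 UTC every day
            "Lifecycle": {"DeleteAfterDays": 30},
            "CopyActions": [
                {
                    # Copy to a vault in the isolated backup account
                    "DestinationBackupVaultArn":
                        "arn:aws:backup:us-east-1:210987654321:backup-vault:central-vault",
                    "Lifecycle": {"DeleteAfterDays": 30},
                }
            ],
        }
    ],
}

# With credentials configured, this would be applied via:
#   boto3.client("backup").create_backup_plan(BackupPlan=backup_plan)
```

Resources are attached to the plan separately (by tag or ARN) via backup selections, which keeps one policy covering every service.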

🎯 Key Takeaway

Interview tip: "I use AWS Backup as the central policy for all services — daily backups with 30-day retention, copied to a cross-account vault for ransomware protection, with Vault Lock for immutability. For RDS/Aurora, I enable PITR so I can restore to any second. For critical S3 data, versioning + CRR to a secondary region."

Advanced

Interview Questions β€” Storage

Storage decisions impact cost, performance, and durability. These questions test real-world storage architecture thinking.

  1. Answer Guide
    S3 Storage Lens + S3 Analytics for access patterns. Implement lifecycle policies (Standard → IA after 30 days → Glacier after 90 days). Enable Intelligent-Tiering for unpredictable access. Clean up incomplete multipart uploads and expired delete markers.
  2. Answer Guide
    FSx for Lustre — purpose-built for HPC/ML workloads, sub-millisecond latency, integrates with S3. EFS works but has higher latency. EBS can't be shared across instances (except io2 Multi-Attach, which is limited). Consider FSx for Lustre with an S3 data repository for lazy loading.
  3. Answer Guide
    80 TB over 1 Gbps = ~7.4 days (80,000 GB × 8 bits / 1 Gbps / 86,400 sec). With protocol overhead, ~10 days. Snowball Edge (80 TB capacity) takes 2-3 days shipping + 1 day load. For one-time transfers over 10 TB, Snowball is usually faster and cheaper.
  4. Answer Guide
    Governance mode — users with special permissions (s3:BypassGovernanceRetention) can delete/overwrite. Compliance mode — nobody can delete, not even the root account, until retention expires. For ransomware: Compliance mode + versioning + cross-account replication.
  5. Answer Guide
    S3 supports 5,500 GET and 3,500 PUT per second per prefix. Use a prefix distribution strategy — add date-based or hash-based prefixes to spread load. S3 automatically partitions, but sudden spikes hit limits before auto-scaling kicks in.
  6. Answer Guide
    AWS Backup with hourly backup schedules, cross-account vault copy, Vault Lock for immutability. For RDS: PITR (5-minute RPO). S3: versioning + CRR. Test restore procedures regularly — untested backups are not backups.
  7. Answer Guide
    CloudFront + Lambda@Edge (on origin-request). Check whether a resized version exists in S3 → if not, invoke Lambda to resize, store in S3, and return it. Alternative: S3 Object Lambda for on-the-fly transformation. Discuss cache strategy to avoid redundant Lambda invocations.
  8. Answer Guide
    Glacier has retrieval costs ($0.01-$0.03/GB) and minimum 90/180-day storage charges. If you access data within the minimum period, it's MORE expensive. Glacier retrieval can take minutes to hours. Use S3 Intelligent-Tiering for unpredictable access patterns instead.
  9. Answer Guide
    Initial: AWS Snowball Edge devices (multiple in parallel). Ongoing: AWS DataSync for daily delta sync over Direct Connect. Target S3 layout: use Hive-compatible partitioning (year/month/day) for query performance with Athena/Glue.
  10. Answer Guide
    Since December 2020, S3 delivers strong read-after-write consistency for all operations (PUT, DELETE, LIST). This is a common gotcha — many interview prep materials are outdated. You will always read the latest version immediately after writing.
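The prefix-distribution strategy in answer 5 can be sketched in a few lines: prepend a short hash of each key so that sequential keys fan out across many prefixes, each with its own request-rate budget (the two-character fanout is an arbitrary choice for illustration):

```python
import hashlib

def spread_key(key: str, fanout_chars: int = 2) -> str:
    """Prefix a key with the first hex chars of its MD5 digest so that
    sequential keys land in different S3 prefixes (16**fanout_chars buckets)."""
    prefix = hashlib.md5(key.encode()).hexdigest()[:fanout_chars]
    return f"{prefix}/{key}"

# Sequential log keys scatter across up to 256 prefixes:
keys = [spread_key(f"logs/2024-01-01/event-{i}.json") for i in range(3)]
```

The trade-off: hashed prefixes break lexicographic listing by date, so this pattern fits write-heavy workloads where reads go through an index rather than `ListObjects`.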

Preparation Strategy

Storage questions often involve cost calculations and data volume math. Practice converting TB to transfer times, calculating monthly storage costs across tiers, and explaining lifecycle policy economics. Interviewers love candidates who can do quick napkin math.