Highly Available AWS Deployment with Auto Scaling

Designed and deployed a fault-tolerant AWS environment across two Availability Zones with an Application Load Balancer (ALB) and Auto Scaling. Validated zero-downtime High Availability by terminating EC2 instances under live traffic and confirming automatic replacement.

Overview

High Availability is a design claim until it is tested under failure. This project built a fault-tolerant AWS environment and then deliberately introduced failures to validate that the HA guarantees held under live traffic, without manual intervention.

Architecture

Custom VPC — isolated network environment with public and private subnets
Two Availability Zones — all resources distributed across AZs to eliminate single-AZ as a single point of failure
Application Load Balancer (ALB) — distributes traffic across healthy instances; health checks detect and route around failures
Auto Scaling Group with Launch Templates — maintains desired capacity; automatically launches replacement instances when capacity drops
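
As a rough illustration of how the components above fit together, the Auto Scaling Group can be described as the keyword arguments passed to boto3's create_auto_scaling_group call. This is a minimal sketch, not the project's actual configuration: the group name, launch template name, subnet IDs, target group ARN, sizes, and grace period are all hypothetical placeholders, and using the ELB health check type is an assumption about how the group was tied to the ALB.

```python
def build_asg_request(name, launch_template_name, subnet_ids, target_group_arn):
    """Build keyword arguments for boto3's create_auto_scaling_group call.

    All sizes and timings below are illustrative assumptions.
    """
    if len(subnet_ids) < 2:
        # The caller is expected to pass one subnet per Availability Zone.
        raise ValueError("need at least two subnets, each in a different AZ")
    return {
        "AutoScalingGroupName": name,
        "LaunchTemplate": {
            "LaunchTemplateName": launch_template_name,
            "Version": "$Latest",
        },
        "MinSize": 2,          # assumed: never fewer than one instance per AZ
        "MaxSize": 4,          # assumed ceiling
        "DesiredCapacity": 2,  # assumed steady-state capacity
        # Comma-separated subnet IDs spread the group across AZs.
        "VPCZoneIdentifier": ",".join(subnet_ids),
        # Registers launched instances with the ALB target group.
        "TargetGroupARNs": [target_group_arn],
        # Assumed: let load balancer health checks drive replacement.
        "HealthCheckType": "ELB",
        "HealthCheckGracePeriod": 120,  # assumed seconds before checks count
    }


# Hypothetical identifiers, for illustration only.
request = build_asg_request(
    name="ha-demo-asg",
    launch_template_name="ha-demo-template",
    subnet_ids=["subnet-aaa", "subnet-bbb"],
    target_group_arn="arn:aws:elasticloadbalancing:region:account:targetgroup/ha-demo/123",
)

# To apply it (requires the boto3 package and AWS credentials):
#   import boto3
#   boto3.client("autoscaling").create_auto_scaling_group(**request)
```

Keeping the request as plain data makes it easy to check the multi-AZ and health-check settings before anything is created.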

The Validation Test

The critical part of this project was not the deployment — it was the failure test.

With the application serving live traffic, EC2 instances were terminated. The ALB health checks detected the unhealthy targets and stopped routing traffic to them, and the Auto Scaling Group launched replacement instances automatically. The application remained available throughout; zero downtime was confirmed by monitoring the response stream across the termination and recovery cycle.
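
One way to monitor a response stream like the one described above is a small polling probe that counts every response other than HTTP 200. The sketch below is an assumption about how such a check could be done, not the tool used in this project; the function names and the canned response sequence are hypothetical.

```python
import time
import urllib.error
import urllib.request


def fetch_status(url, timeout=2.0):
    """Return the HTTP status code for url, or None if no response arrives."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status (for example 502 or 503).
        return err.code
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, or timeout.
        return None


def probe(fetch, attempts, interval_seconds=0.0):
    """Call fetch() repeatedly and count results that are not HTTP 200."""
    failures = 0
    for _ in range(attempts):
        if fetch() != 200:
            failures += 1
        time.sleep(interval_seconds)
    return {"attempts": attempts, "failures": failures}


# Offline demonstration with a canned response stream (no network needed).
canned = iter([200, 200, None, 503, 200])
summary = probe(lambda: next(canned), attempts=5)
print(summary)
```

Against a real deployment, fetch would be something like lambda: fetch_status(url) with url set to the load balancer's address, started before the termination and left running until the replacement instances are in service. A run that reports zero failures across that window is the evidence behind a zero-downtime claim.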

Technologies Used

AWS EC2 — compute instances in Auto Scaling Group
Application Load Balancer — health-check-aware traffic distribution
Auto Scaling Group + Launch Templates — automated capacity management
VPC, Public/Private Subnets — network isolation across two AZs
CloudWatch — monitoring during failure and recovery cycle

Results

Zero downtime was confirmed during EC2 instance termination under live traffic. Automatic instance replacement completed within the Auto Scaling Group's configured warm-up period. ALB health checks identified and excluded unhealthy targets before replacement instances were ready.

Key Learnings

A system is only as highly available as its least-tested failure path. Designing for HA and validating HA are two different activities — this project made that concrete. The combination of ALB health checks and Auto Scaling creates a self-healing loop: detect failure, stop routing to it, replace the failed capacity. Understanding how each component contributes to that loop is essential for building systems that hold up under real operational conditions.