This repository is a hands-on DevOps and Linux Administration practice project, focused on building a High Availability (HA) infrastructure on cloud platforms.
It uses Terraform for infrastructure provisioning, Ansible for configuration management, HAProxy + Keepalived for redundancy, and ELK Stack for observability.
The project is structured into milestones that reflect real-world DevOps workflows and practices.
- Cloud VPC/VNet with public/private subnets
- 4 VM Instances:
  - load_balance_01, load_balance_02 → HAProxy + Keepalived (VIP failover)
  - web_server_01, web_server_02 → Nginx web servers with data sync
- Terraform → Infrastructure as Code (IaC)
- Ansible → Automated configuration and deployment
- ELK Stack → Centralized logging and observability
- Optional → Docker Swarm for container orchestration
This project is broken into six milestones (0 through 5), each simulating real-world DevOps practices:
- Milestone 0: Foundations
  - Goal: Prepare the environment by setting up cloud networking, provisioning base VM instances, configuring SSH access, and enabling Ansible automation.
  - Issues:
    - Issue 0.1: Configure Terraform Provider & Variables
    - Issue 0.2: Create VPC/VNet, Subnets, Route Table, and Security Groups
    - Issue 0.3: Provision 4 VM Instances with Static Private IPs
    - Issue 0.4: Configure Ansible Inventory and Test Connectivity
- Milestone 1: Core HA Architecture
  - Goal: Deploy and configure HAProxy on two redundant load balancers with Keepalived to provide a Virtual IP (VIP) and ensure high availability of backend web servers.
  - Issues:
    - Issue 1.1: Deploy HAProxy on Load Balancers
    - Issue 1.2: Configure HAProxy Frontend and Backend Pools
    - Issue 1.3: Enable HAProxy Stats Page for Monitoring
    - Issue 1.4: Install and Configure Keepalived for VIP
    - Issue 1.5: Test VIP Failover Between Load Balancers
- Milestone 2: Backend & Data Sync
  - Goal: Deploy backend web servers, configure them with Nginx, and ensure data consistency across nodes using file synchronization (rsync or DRBD).
  - Issues:
    - Issue 2.1: Deploy Nginx on Web Servers
    - Issue 2.2: Configure HAProxy to Balance Web Servers
    - Issue 2.3: Implement File Synchronization Between Web Servers
    - Issue 2.4: Test Data Consistency and Failover
- Milestone 3: Observability & Logging
  - Goal: Deploy a centralized logging and monitoring stack using ELK (Elasticsearch, Logstash, Kibana) and configure log forwarding from all servers.
  - Issues:
    - Issue 3.1: Deploy ELK Stack with Docker
    - Issue 3.2: Configure Log Forwarding with Filebeat
    - Issue 3.3: Validate Logs in Kibana and Build a Dashboard
- Milestone 4: Container Orchestration
  - Goal: Introduce container orchestration using Docker Swarm to simulate distributed app deployment.
  - Issues:
    - Issue 4.1: Initialize Docker Swarm Cluster
    - Issue 4.2: Deploy a Sample Web App as Docker Service
    - Issue 4.3: Configure Health Checks and Validate Service Recovery
- Milestone 5: Disaster Recovery & Cleanup
  - Goal: Simulate node failures, validate VIP and data recovery, and ensure proper cleanup of resources.
  - Issues:
    - Issue 5.1: Simulate Load Balancer Failure and Validate VIP Failover
    - Issue 5.2: Test Web Server Data Recovery
    - Issue 5.3: Implement Terraform Destroy and Infrastructure Cleanup
    - Issue 5.4: Write Final Documentation and Lessons Learned
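Issues 1.5 and 5.1 both come down to one question: after a load balancer stops, does exactly one node still hold the VIP? The sketch below is one possible way to check that from a workstation. It is not taken from this repository. The VIP `10.0.1.100` is a hypothetical placeholder (this README does not state the real address), and the function names `has_vip` and `classify_vip_state` are illustrative only.

```shell
#!/bin/sh
# Hypothetical VIP: replace with the address configured in Keepalived.
VIP="${VIP:-10.0.1.100}"

# has_vip ADDRESS
# Reads `ip -o -4 addr show` output on stdin and succeeds if ADDRESS
# is assigned to any interface. The leading space and trailing slash
# stop 10.0.1.10 from matching 10.0.1.100.
has_vip() {
    grep -qF " $1/"
}

# classify_vip_state COUNT
# Maps the number of nodes currently holding the VIP to a verdict.
classify_vip_state() {
    case "$1" in
        0) echo "no-holder" ;;    # VIP is down: failover did not happen
        1) echo "ok" ;;           # exactly one active load balancer
        *) echo "split-brain" ;;  # more than one node claims the VIP
    esac
}

# Example (assumes SSH access to the load balancers listed in the inventory below):
#   holders=0
#   for host in 10.0.1.21 10.0.1.22; do
#       ssh ec2-user@"$host" ip -o -4 addr show | has_vip "$VIP" && holders=$((holders + 1))
#   done
#   classify_vip_state "$holders"
```

Running the example before and after stopping Keepalived on the active node should report `ok` both times if failover works. A `split-brain` result usually means the two nodes cannot see each other's VRRP advertisements.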
git clone https://github.com/AhmadMWaddah/linux-high-availability-Prjct_03.git
cd linux-high-availability-Prjct_03/
cd terraform
terraform init
terraform apply
[load_balancers]
load_balance_01 ansible_host=10.0.1.21
load_balance_02 ansible_host=10.0.1.22
[web_servers]
web_server_01 ansible_host=10.0.2.11
web_server_02 ansible_host=10.0.2.12
[all:vars]
ansible_user=ec2-user
ansible_ssh_private_key_file=~/.ssh/ha-cloud.pem
ansible all -m ping -i inventories/hosts.ini
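If the ping fails, a useful first step is confirming which address Ansible will use for each host. The sketch below parses an INI inventory like the one above with `awk` and prints name/address pairs. The function name `inventory_hosts` is hypothetical, and the parser only handles the simple `name ansible_host=addr` layout shown here, not the full Ansible inventory syntax (ranges, child groups, and so on).

```shell
#!/bin/sh
# inventory_hosts
# Reads an INI-style Ansible inventory on stdin and prints one
# "name address" line per host. Lines inside [group:vars] sections,
# blank lines, and comments are skipped. If a host has no
# ansible_host= setting, its name is printed as the address.
inventory_hosts() {
    awk '
        /^\[/ { in_vars = ($0 ~ /:vars\][ \t]*$/); next }   # section header
        in_vars || NF == 0 || /^[#;]/ { next }              # not a host line
        {
            addr = $1
            for (i = 2; i <= NF; i++)
                if ($i ~ /^ansible_host=/) {
                    addr = $i
                    sub(/^ansible_host=/, "", addr)
                }
            print $1, addr
        }'
}

# Example:
#   inventory_hosts < inventories/hosts.ini
```

For anything beyond this layout, `ansible-inventory -i inventories/hosts.ini --list` shows the inventory as Ansible itself parses it.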
- Milestone 0: Foundations – In Progress
- Milestone 1: Core HA Architecture – Pending
- Milestone 2: Backend & Data Sync – Pending
- Milestone 3: Observability & Logging – Pending
- Milestone 4: Container Orchestration – Pending
- Milestone 5: Disaster Recovery & Cleanup – Pending
- Terraform – Infrastructure as Code
- Ansible – Configuration Management
- HAProxy + Keepalived – High Availability
- Nginx – Web Servers
- ELK Stack – Observability & Logging
- Docker Swarm (Optional) – Container Orchestration
- GitHub Projects & Issues – Task management
- Rsync – File synchronization
- High availability requires both redundancy and observability. Systems need to be designed with failure in mind from the start.
- Automating with Terraform + Ansible ensures consistency and reduces human error during deployments.
- Testing failover is as important as building HA. Regular testing confirms that redundancy mechanisms work as expected.
- Documentation makes practice projects portfolio-ready and helps with operational knowledge transfer.
- Infrastructure as Code enables reproducible environments that can be easily recreated.
- Centralized logging and monitoring are critical for understanding system behavior and troubleshooting issues.
- Network security and access controls must be considered early in the design process.
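The point about testing redundancy applies to data sync as well as to the VIP. As a minimal sketch, the functions below compare two directory trees by checksum, which is one way to approach Issue 2.4 once each web server's document root has been copied or mounted locally. The function names are hypothetical, and nothing here reflects how the repository itself performs the check.

```shell
#!/bin/sh
# dir_manifest DIR
# Prints one "checksum size path" line per regular file under DIR,
# sorted so that two identical trees produce identical output.
dir_manifest() {
    (cd "$1" && find . -type f -exec cksum {} + | LC_ALL=C sort)
}

# dirs_in_sync DIR_A DIR_B
# Succeeds only if both directories exist and contain the same files
# with the same content.
dirs_in_sync() {
    [ -d "$1" ] && [ -d "$2" ] &&
        [ "$(dir_manifest "$1")" = "$(dir_manifest "$2")" ]
}

# Example (paths are placeholders for local copies of each document root):
#   dirs_in_sync ./web_server_01_root ./web_server_02_root && echo "in sync"
```

With rsync available, a comparable check is `rsync -a --dry-run --itemize-changes --delete SRC/ DEST/`, which lists what would change without copying anything.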
This project is primarily a practice lab. Feel free to fork it, suggest improvements, or extend milestones.
MIT License – free to use and modify for learning purposes.