Autonomous RL Load Balancer

An intelligent, distributed microservice architecture that uses a Reinforcement Learning (PPO) agent to dynamically route tasks across a cluster of Java Spring Boot worker nodes based on real-time hardware capacity.

Architecture Overview

This project simulates a distributed system with three primary components:

Master Node (API Gateway & Router): A Java Spring Boot application that acts as the single point of entry. It maintains a strictly ordered, thread-safe registry of all active workers and their hardware states.
Worker Nodes (The Cluster): Isolated Java Spring Boot instances. Upon boot, they dynamically read their CPU and RAM constraints from the Docker environment and register themselves with the Master Node via gRPC/HTTP payloads.
AI Inference Service: A Python FastAPI microservice hosting a trained Proximal Policy Optimization (PPO) model (stable-baselines3). It receives the deterministic cluster state from the Master Node, uses the trained policy to decide where the incoming task should run, and returns the target worker index.
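
A minimal sketch of why the registry ordering matters: a trained policy expects each worker to occupy the same position in its input every time. The example below flattens a registry into a fixed-length vector sorted by worker ID. The `WorkerState` fields, the `build_observation` helper, and the fleet size of four are assumptions for illustration, not the project's actual schema.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class WorkerState:
    """Hypothetical snapshot of one worker's capacity."""
    worker_id: str
    cpu: float
    ram: float


def build_observation(registry: Dict[str, WorkerState], max_workers: int = 4) -> List[float]:
    """Flatten the registry into a fixed-length vector, ordered by worker ID.

    Sorting makes the result independent of registration order, so the same
    cluster state always yields the same vector.
    """
    ordered = sorted(registry.values(), key=lambda w: w.worker_id)
    if len(ordered) > max_workers:
        raise ValueError(f"expected at most {max_workers} workers, got {len(ordered)}")

    observation: List[float] = []
    for worker in ordered:
        observation.extend([worker.cpu, worker.ram])

    # Pad the slots of workers that have not registered yet with zeros.
    observation.extend([0.0] * (2 * (max_workers - len(ordered))))
    return observation
```

Note that IDs are sorted as plain strings here, so `worker-10` would sort before `worker-2`; a larger fleet would need a numeric sort key.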

Key Features

Reinforcement Learning Routing: Replaces static algorithms (Round Robin, Least Connections) with an AI model trained to prevent node starvation and optimize global cluster throughput.
Active Load Shedding: Protects the cluster under DDoS-level load by proactively dropping tasks (returning HTTP 429, 503, or 500) when the RL agent detects that the cluster is at maximum capacity.
Fully Containerized: An isolated Docker bridge network lets the internal microservices communicate with each other without publishing their internal ports to the host machine.
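
The routing and load-shedding decision can be sketched as a single mapping from the agent's action to an outcome. This README does not say how the agent signals "drop this task", so the sketch assumes the action space has one extra index after the last worker that means "shed". The `Decision` type and `route_or_shed` function are hypothetical names, and 503 is used only as one of the status codes listed above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Decision:
    http_status: int          # status the gateway would return to the caller
    worker_id: Optional[str]  # None when the task is shed


def route_or_shed(action: int, worker_ids: List[str]) -> Decision:
    """Translate a discrete action into a routing decision.

    Assumption: actions 0..N-1 select a worker and action N means "shed".
    """
    if 0 <= action < len(worker_ids):
        return Decision(http_status=200, worker_id=worker_ids[action])
    if action == len(worker_ids):
        # The extra action: reject the task instead of overloading a worker.
        return Decision(http_status=503, worker_id=None)
    raise ValueError(f"action {action} is outside the action space")
```

For a four-worker fleet, `route_or_shed(1, workers)` selects the second worker and `route_or_shed(4, workers)` sheds the task.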

Prerequisites

Docker & Docker Compose (WSL2 integration recommended for Windows)
Java 17+ (For local development/compilation)
Python 3.10+ (For local training/stress-testing)
Maven

Boot Sequence

Because workers register with the Master Node at startup, the cluster must be booted in a specific sequence: the Master Node has to be ready to accept registrations before the workers start.

1. Boot the Master Node & AI Service

docker-compose up --build -d master-node ai-service

2. Boot the Worker Fleet

docker-compose up --build -d worker-1 worker-2 worker-3 worker-4
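
Between the two steps it can help to wait until the Master Node actually answers requests instead of guessing a delay. The following is a sketch of such a readiness poll using only the standard library. The function names are illustrative, and the health URL in the usage note is an assumption: this README does not state which port or health endpoint the Master Node exposes.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_probe(url: str) -> Callable[[], bool]:
    """Build a probe that reports True when `url` answers with a 2xx status."""
    def probe() -> bool:
        try:
            with urllib.request.urlopen(url, timeout=2) as response:
                return 200 <= response.status < 300
        except (urllib.error.URLError, OSError):
            return False  # not listening yet, or the request failed
    return probe


def wait_until_ready(
    probe: Callable[[], bool],
    timeout_s: float = 30.0,
    interval_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call `probe` until it succeeds or `timeout_s` elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        if probe():
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)
```

A possible use, with a placeholder URL: `wait_until_ready(http_probe("http://localhost:8080/actuator/health"))` before running the worker command, and abort the boot if it returns `False`.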

Configuration (Docker Compose)

You can test the AI's adaptability by altering the hardware constraints of the workers in your docker-compose.yml.

  worker-2:
    environment:
      - WORKER_ID=worker-2
      - WORKER_RAM=8192.0
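
The workers themselves are Java, but the parsing these variables imply is easy to show in a few lines. The sketch below, written in Python purely for illustration, reads `WORKER_ID` and `WORKER_RAM` and rejects missing or non-numeric values before a worker would register. The `read_worker_config` name and the validation rules are assumptions, and the unit of `WORKER_RAM` is not stated in this README, so the value is treated as an opaque positive number.

```python
import math
import os
from typing import Mapping, Tuple


def read_worker_config(env: Mapping[str, str] = os.environ) -> Tuple[str, float]:
    """Return (worker_id, ram) parsed from the environment.

    Raises ValueError when a variable is missing or malformed, so a
    misconfigured container fails fast instead of registering bad data.
    """
    worker_id = env.get("WORKER_ID", "").strip()
    if not worker_id:
        raise ValueError("WORKER_ID must be set")

    raw_ram = env.get("WORKER_RAM", "")
    try:
        ram = float(raw_ram)
    except ValueError:
        raise ValueError(f"WORKER_RAM must be numeric, got {raw_ram!r}") from None
    if not math.isfinite(ram) or ram <= 0:
        raise ValueError(f"WORKER_RAM must be a positive finite number, got {raw_ram!r}")

    return worker_id, ram
```

Passing the mapping explicitly, as in `read_worker_config({"WORKER_ID": "worker-2", "WORKER_RAM": "8192.0"})`, makes the function easy to check without touching the real environment.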
