Skip to content

youzhaozhao/LLM-Heuristic-Graph-Coloring

Repository files navigation

Evolving DSatur with LLMs

LLM-Guided Design of Second-Order Heuristics for Graph Coloring

Jupyter Notebook License Graph Coloring LLM Assisted

🧬 Human-in-the-Loop Evolution of Graph Coloring Heuristics using Large Language Models
From DSatur baseline to SAT-CR: incorporating second-order neighborhood information through iterative prompt engineering


πŸ“– Overview

This project explores how Large Language Models (LLMs) can assist in designing and refining heuristic algorithms for the classic Graph Coloring Problemβ€”an NP-complete combinatorial optimization challenge.

Building upon the seminal DSatur algorithm (BrΓ©laz, 1979), we employ a human-in-the-loop "hypothesize-verify-correct" framework to iteratively evolve heuristics through prompt engineering. The culmination is SAT-CR (Saturation-Aware Crisis-Responsive), a novel heuristic that fuses lookahead strategies with crisis awareness mechanisms.

Key Highlights

  • 5-Generation Evolution: From failed feature stacking to optimized hybrid strategy
  • Second-Order Awareness: Incorporating neighbors-of-neighbors information into greedy decisions
  • Comprehensive Benchmark: 2,650 graphs across 6 categories (planar, dense/sparse random, scale-free, Mycielskian, bipartite)
  • Interpretable Design: All heuristics derived from explicit LLM prompts, fully documented

Core Insight

SAT-CR achieves 5.4% improvement over DSatur on planar graphs (4.672 β†’ 4.418, p=0.003) by adaptively combining:

  • V3's squared amplification (SatΒ²) for dense graph signal enhancement
  • V4's crisis ratio (1/(Resid+1)) for planar graph dead-end detection

🧱 Project Structure

LLM-Heuristic-Graph-Coloring/
β”‚
β”œβ”€β”€ README.md                                   # This file
β”œβ”€β”€ LICENSE                                     # MIT License
β”œβ”€β”€ requirements.txt                            # Python dependencies
β”‚
β”œβ”€β”€ Graph_Coloring_Heuristics_Experiment.ipynb  # ⭐ All algorithms & experiments
β”œβ”€β”€ Evolving_DSatur_LLM_Graph_Coloring.pdf      # GraphColoring_Paper
β”œβ”€β”€ Graph_Coloring_LLM_Presentation.pdf         # Presentation slides
β”‚
β”œβ”€β”€ docs/
β”‚   β”œβ”€β”€ Evolving_DSatur_LLM_Graph_Coloring.pdf  
β”‚   └── Graph_Coloring_LLM_Presentation.pdf     
β”‚
β”œβ”€β”€ prompts/                                   # LLM prompts for each generation
β”‚   β”œβ”€β”€ gen1_feature_stacking.md               # Gen 1: Feature stacking (failed)
β”‚   β”œβ”€β”€ gen2_hierarchical.md                   # Gen 2: Tiered logic (stable)
β”‚   β”œβ”€β”€ gen3_lookahead.md                      # Gen 3: Lookahead strategy
β”‚   β”œβ”€β”€ gen4_crisis_awareness.md               # Gen 4: Crisis awareness
β”‚   └── gen5_satcr_fusion.md                   # Gen 5: SAT-CR fusion (best)
β”‚
└── assets/                                     # Figures & visualizations
    β”œβ”€β”€ results/
    β”‚   β”œβ”€β”€ performance_planar_sparse_dense.png
    β”‚   β”œβ”€β”€ performance_scalefree_myciel_bipartite.png
    β”‚   └── execution_time_comparison.png
    └── visualizations/
        └── satcr_planar_graph_60nodes.png

✨ Algorithm Evolution

We document the complete iterative design process, from initial failure to final optimization:

Generation Strategy Core Formula Key Insight Outcome
Gen 1 Feature Stacking 3Β·Sat + 1.2Β·U + 0.7Β·(1-CC)Β·Deg + 0.9Β·ln(S2) Multi-feature linear combination Failed
Gen 2 Hierarchical Logic Sat Γ— 10⁸ + TieBreaker Saturation absolute dominance Stable
Gen 3 Lookahead Sat Γ— 10⁸ + Ξ£ Sat(v)Β² Future cost via squared neighbor saturation Improved
Gen 4 Crisis Awareness Sat Γ— 10⁹ + Ξ£ Sat(v)/(Resid(v)+1) Risk ratio for dead-end detection Specialized
Gen 5 SAT-CR (Hybrid) Sat Γ— 10⁹ + Ξ£ Sat(v)Β²/(Resid(v)+1) Adaptive fusion of V3+V4 Best

πŸ“„ Full prompts available in prompts/ directory


πŸ“Š Main Results

Benchmark on 2,650 graphs (60 vertices each, 500 instances per main category):

Performance Comparison (Average Chromatic Number)

Graph Type DSatur (1979) SAT-CR (Ours) Improvement Statistical Significance
Planar 4.672 4.418 -5.4% t=4.32, p=0.003
Dense Random (p=0.5) 12.526 12.490 -0.3% β€”
Sparse Random (p=0.1) 4.190 4.222 +0.8% β€”
Scale-Free (BarabΓ‘si) 4.033 4.033 Optimal β€”
Mycielskian (Trap) 5.000 5.000 Optimal β€”
Bipartite (Sanity) 2.000 2.000 Optimal β€”

Lower is better. SAT-CR excels on structured graphs (planar) and high-conflict environments (dense), with robust correctness on special graph classes.

Execution Time Trade-off

Algorithm Relative Time Complexity
DSatur 1.0Γ— O(nΒ²)
Gen 2 (Tiered) 5.1Γ— O(nΒ²)
Gen 5 (SAT-CR) 16.7Γ— O(n·Δ²)
Gen 3 (Lookahead) 23.1Γ— O(n·Δ²)
Gen 1 (Original) 102.5Γ— O(nΒ³)

SAT-CR maintains reasonable efficiency while achieving superior solution quality. The overhead comes from single-pass second-order neighborhood traversal.


🎯 Key Contributions

1. 🧬 Iterative LLM-Guided Design Framework

Unlike traditional "black-box" neural approaches, we employ transparent, human-in-the-loop iteration:

Analyze Failures β†’ Craft LLM Prompt β†’ Generate Heuristic β†’ 
Implement & Validate β†’ Evaluate on 6 Graph Types β†’ Iterate

Each generation's complete prompt is preserved in prompts/, enabling reproducibility and pedagogical value.


2. πŸ”¬ SAT-CR: Adaptive Hybrid Heuristic

Mathematical Structure:

Score(u) = Sat(u) Γ— 10⁹  +  Ξ£_{v ∈ N(u)} [ Sat(v)Β² / (Resid(v) + 1) ]
           └─ Primary β”€β”€β”˜  └────────── Tie-Breaker β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                              β”œβ”€ V3: Squared amplification (dense graphs)
                              └─ V4: Inverse residual penalty (planar graphs)

Adaptive Behavior:

  • Dense graphs: SatΒ² dominates numerator, counteracting denominator dilution (recovers V3 advantage)
  • Planar graphs: Small Resid makes denominator significant, preserving V4's dead-end sensitivity

3. πŸ“ˆ Comprehensive Empirical Validation

Aspect Coverage
Graph Types 6 categories covering real-world (scale-free), synthetic (random), theoretical (Mycielskian), and geometric (planar) structures
Scale 2,650 total instances, 500 per main category
Baselines Random, Welsh-Powell, DSatur, plus 5 evolved generations
Metrics Chromatic number, execution time, statistical significance (t-test)

πŸš€ Quick Start

Requirements

  • Python 3.8+
  • Jupyter Notebook
  • NetworkX, Matplotlib, NumPy, Pandas, SciPy, tqdm

Install dependencies:

pip install -r requirements.txt

Run Experiments

Launch the complete notebook:

jupyter notebook Graph_Coloring_Heuristics_Experiment.ipynb

The notebook includes:

  1. Algorithm implementations – All 5 generations + 3 baselines
  2. Dataset generation – 2,650 graphs via Delaunay triangulation & random models
  3. Benchmark execution – Automated evaluation with progress tracking
  4. Visualization – Box plots, runtime analysis, and coloring demonstrations

Reproduce Paper Results

All experiments are contained in a single notebook for easy reproduction:

Section Content
Cell 1-2 Imports & core algorithm framework
Cell 3 Baseline heuristics (Random, Welsh-Powell, DSatur)
Cell 4 All 5 generations (Gen 1-5) with detailed docstrings
Cell 5 Dataset generation (planar, random, scale-free, Mycielskian, bipartite)
Cell 6 Main benchmark – runs all algorithms on all graphs
Cell 7 Analysis & visualization – tables, box plots, timing, case studies

πŸ“ˆ Detailed Analysis

Why SAT-CR Works: Topology-Specific Adaptation

Planar Graphs (Structured, Low Degree)

  • Euler's formula constrains average degree β†’ frequent local dead-ends
  • V4's 1/(Resid+1) spikes when Resid β‰ˆ 0, prioritizing critical nodes
  • SAT-CR preserves this sensitivity

Dense Random Graphs (Homogeneous, High Conflict)

  • Uniform degree distribution β†’ many saturation ties
  • V3's SatΒ² provides nonlinear signal amplification for tie-breaking
  • SAT-CR maintains this discrimination

Scale-Free Networks (Heavy-Tailed)

  • Hub nodes create high-saturation clusters
  • Squared term identifies "super-critical" neighbors
  • SAT-CR achieves optimal 4-coloring

⚠️ Limitations & Future Work

Limitation Details Future Direction
Graph Size Experiments on 60-vertex graphs Scale to 10³–10⁢ vertices via approximation
Time Complexity SAT-CR is ~17Γ— slower than DSatur Parallel neighbor evaluation, GPU acceleration
Automation Human analysis required between iterations Fully automated prompt optimization (e.g., OPRO)
Theoretical Guarantees No approximation ratio proofs Analyze competitive ratio on specific graph classes
Generalization Focused on graph coloring Extend to other NP-hard problems (TSP, Max-Cut)

πŸ“– Citation

If you find this work useful, please cite:

@article{yuan2025evolving,
  title={Evolving DSatur: LLM-Guided Design of Second-Order Heuristics for Graph Coloring},
  author={Yuan, Zhouyan},
  year={2025},
  note={Undergraduate research project}
}

πŸ“„ License

This project is licensed under the MIT License – see the LICENSE file for details.


For full technical details, see the paper and implementation notebook.

About

Exploring LLM-assisted design of graph coloring heuristics through human-in-the-loop iteration. Evolves from DSatur baseline to SAT-CR (Saturation-Aware Crisis-Responsive), incorporating second-order neighborhood information, lookahead, and crisis awareness. Implementation + benchmark on 2,650 graphs (planar, random, scale-free).

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors