Optimizing Traffic Signal Control Using LLM-Driven Reward Weight Adjustment in Reinforcement Learning

Fig. 1. D3QN-LLM Framework · Fig. 2. Flow of the Proposed Methodology

πŸ“ Research Overview

  • This study uses a Large Language Model (LLM) to address a long-standing difficulty in reinforcement learning: choosing the weights of a multi-objective reward function. By adjusting the reward weights dynamically at run time, the LLM enables the agent to learn a policy that maximizes reward (a minimal sketch of such a weighted reward follows this list).
  • The experimental environment consists of a single intersection, with the goal of minimizing Average Travel Time through traffic signal control.
  • Dueling Double DQN (D3QN) is used as the reinforcement learning algorithm, while GPT-4o-mini is employed as the LLM for adjusting reward function weights.
  • The proposed model is referred to as D3QN-LLM, and its framework is illustrated in Fig. 1. The detailed process flow of the proposed method is shown in Fig. 2.
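
The README does not spell out the individual reward terms, so the sketch below is an assumption: a weighted sum of negated traffic measures (queue length and waiting time are illustrative choices), with weights the LLM can rewrite between decision intervals. None of these names come from the repository.

```python
# Hypothetical weighted multi-objective reward; the actual terms and
# update cadence in D3QN-LLM may differ.
DEFAULT_WEIGHTS = {"queue_length": 0.5, "waiting_time": 0.5}

def reward(metrics: dict, weights: dict = DEFAULT_WEIGHTS) -> float:
    """Smaller queues and waiting times yield a larger (less negative) reward."""
    return -sum(w * metrics[k] for k, w in weights.items())

# The LLM periodically proposes new weights from a traffic summary;
# the agent then continues training against the re-weighted reward.
llm_weights = {"queue_length": 0.7, "waiting_time": 0.3}  # e.g. parsed from LLM output
r = reward({"queue_length": 12.0, "waiting_time": 45.0}, llm_weights)
```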

❗️ Contributions of the Research

  • It proposes a new research direction: using an LLM to resolve, at run time, the weight-setting problem in reward function design, one of the key difficulties in applying reinforcement learning to multi-objective tasks.

🔧 Technology Stack and Environment

🚀 Key Technologies

  • Python, PyTorch (GPU acceleration) - Deep learning and reinforcement learning implementation
  • SUMO (Simulation of Urban MObility) - Traffic simulation and vehicle data generation
  • GPT-4o-mini (LLM API) - Lightweight language model for traffic situation analysis
  • LangChain Framework - LLM prompt engineering and response control (see the prompt sketch after this list)
  • Docker - Containerized environment for deployment and reproducibility
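
As a hedged illustration of how LangChain and GPT-4o-mini could be wired together for the weight-adjustment step: the repository's actual prompts live under llm/ and are almost certainly different, and the message wording, variable names, and JSON schema below are assumptions. Running it requires OPENAI_API_KEY in the environment.

```python
# Hypothetical prompt flow for LLM-driven reward weight adjustment.
import json
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

prompt = ChatPromptTemplate.from_messages([
    ("system",
     "You tune reward weights for a traffic-signal RL agent. "
     'Reply with JSON only, e.g. {{"queue_length": 0.6, "waiting_time": 0.4}}.'),
    ("human",
     "Recent stats: mean queue = {queue} vehicles, mean wait = {wait} s. "
     "Propose weights that sum to 1."),
])

# Prompt -> model -> plain string, then parse the JSON weight dictionary.
chain = prompt | llm | StrOutputParser()
weights = json.loads(chain.invoke({"queue": 12.0, "wait": 45.0}))
```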

🛠 Development Environment

  • OS: Ubuntu 22.04.5 / Windows (WSL supported)
  • Docker: Latest version recommended (tested with Docker 27.5.1)
  • CUDA Support: PyTorch acceleration in an NVIDIA GPU environment (if available)
  • SUMO Version: 1.20.0 or later recommended
  • Python Environment: Python 3.10.12 or later (venv or Docker recommended); a short sanity check for this setup is sketched below
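
A quick way to confirm the pieces above are wired up, sketched under assumptions: the SUMO config path is a placeholder (the actual .sumocfg lives somewhere under sumo/), and traci ships with SUMO, so SUMO_HOME must point at the install.

```python
# Hypothetical environment sanity check; the config path is illustrative,
# not the repository's actual file name.
import torch
import traci  # SUMO's Python API; requires SUMO_HOME (and its tools/ on PYTHONPATH)

print("CUDA available:", torch.cuda.is_available())

# Launch a headless SUMO instance and list its traffic lights.
traci.start(["sumo", "-c", "sumo/single_intersection.sumocfg"])
print("Traffic lights:", traci.trafficlight.getIDList())
traci.close()
```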

📂 Project Structure

📦D3QN-LLM
 ┣ 📂asset
 ┣ 📂d3qn_imgs
 ┣ 📂d3qn_models
 ┣ 📂eval_d3qn_imgs
 ┣ 📂eval_d3qn_txts
 ┣ 📂llm
 ┣ 📂logs
 ┣ 📂sumo
 ┃ ┣ 📂add
 ┃ ┣ 📂detectors
 ┃ ┣ 📂net
 ┃ ┣ 📂rou
 ┃ ┗ 📂trip
 ┣ 📜d3qn_agent.py
 ┣ 📜d3qn_tsc_main.py
 ┗ 📜tsc_env.py
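
For orientation, here is a minimal sketch of the dueling architecture that d3qn_agent.py presumably builds on; the class name and layer sizes are illustrative, not the repository's code. The "double" part of D3QN pairs this with a target network: the online network picks the argmax action at the next state, and the target network evaluates it.

```python
import torch
import torch.nn as nn

class DuelingQNet(nn.Module):
    """Illustrative dueling head: Q(s,a) = V(s) + A(s,a) - mean_a A(s,a)."""

    def __init__(self, state_dim: int, n_actions: int, hidden: int = 128):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU())
        self.value = nn.Linear(hidden, 1)              # state value V(s)
        self.advantage = nn.Linear(hidden, n_actions)  # advantages A(s, .)

    def forward(self, s: torch.Tensor) -> torch.Tensor:
        h = self.trunk(s)
        a = self.advantage(h)
        return self.value(h) + a - a.mean(dim=1, keepdim=True)

q = DuelingQNet(state_dim=12, n_actions=4)
print(q(torch.randn(1, 12)).shape)  # torch.Size([1, 4])

# Double DQN target, given online/target copies of the network:
#   a_star = Q_online(s_next).argmax(dim=1)
#   y      = r + gamma * Q_target(s_next).gather(1, a_star.unsqueeze(1))
```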

📄 Citation

The following research papers and presentations are related to this project.
If this research has been helpful, please cite the papers below.

📖 Scopus Journal

  • S. Choi and Y. Lim, "Optimizing Traffic Signal Control Using LLM-Driven Reward Weight Adjustment in Reinforcement Learning," Journal of Information Processing Systems, vol. 21, no. 1, pp. 43-51, 2025.

🎤 Domestic Conference

  • [Oral Presentation]
    Sujeong Choi and Yujin Lim, "Reinforcement Learning-Based Traffic Signal Control Using Large Language Models" (in Korean), Annual Conference of KIPS, vol. 31, no. 2, pp. 672-675, 2024.

📚 References

  1. B. P. Gokulan and D. Srinivasan, "Distributed geometric fuzzy multiagent urban traffic signal control," IEEE Transactions on Intelligent Transportation Systems, vol. 11, no. 3, pp. 714-727, Sep. 2010.
  2. H. Ceylan and M. G. H. Bell, "Traffic signal timing optimisation based on genetic algorithm approach including drivers' routing," Transportation Research Part B: Methodological, vol. 38, no. 4, pp. 329-342, 2004.
  3. Sujeong Choi and Yujin Lim, "Reinforcement Learning-Based Traffic Signal Control Using Large Language Models," Annual Conference of KIPS, vol. 31, no. 2, pp. 672-675, 2024.
  4. G. Zheng, X. Zang, N. Xu, H. Wei, Z. Yu, V. Gayah, et al., "Diagnosing reinforcement learning for traffic signal control," arXiv preprint, 2019.
  5. H. Lee, Y. Han, Y. Kim, and Y. H. Kim, "Effects analysis of reward functions on reinforcement learning for traffic signal control," PLoS ONE, vol. 17, no. 11, 2022.
  6. S. Lai, Z. Xu, W. Zhang, H. Liu, and H. Xiong, "Large language models as traffic signal control agents: Capacity and opportunity," arXiv preprint, 2023.
  7. A. Pang, M. Wang, M. O. Pun, C. S. Chen, and X. Xiong, "iLLM-TSC: Integration reinforcement learning and large language model for traffic signal control policy improvement," arXiv preprint, 2024.
  8. P. A. Lopez, M. Behrisch, L. Bieker-Walz, J. Erdmann, Y.-P. Flötteröd, R. Hilbrich, L. Lücken, J. Rummel, P. Wagner, and E. Wießner, "Microscopic traffic simulation using SUMO," Proceedings of the 21st International Conference on Intelligent Transportation Systems (ITSC), pp. 2575-2582, 2018.
  9. L. Da, M. Gao, H. Mei, and H. Wei, "Prompt to transfer: Sim-to-real transfer for traffic signal control with prompt learning," Proceedings of the AAAI Conference on Artificial Intelligence, vol. 38, pp. 82-90, 2024.

📜 Usage and Copyright Notice

This code was developed for personal research and experimentation and is published in a public repository.
It is not open source: it may not be used commercially, copied, or redistributed without permission.

Copyright (c) 2025 suzzang
