VISTA: VLM-based Social Robot Navigation

Overview

VISTA (Visual-Informed Socially Aware Trajectory Agent) is a social robot navigation framework that improves robot decision-making in dynamic, human-centered environments. It combines a Vision-Language Model (VLM) with Deep Reinforcement Learning (DRL) so that a robot can adapt its navigation strategy to the human activities it observes and to the type of environment it is in.

This repository contains the code for the VISTA framework, including implementation details, model training, and simulation results. The method is validated using simulation experiments in PyBullet and real-world experiments with a TurtleBot3 platform and a RealSense D405 camera.

Key Features:

VLM Integration: Uses GPT-4o-mini to semantically interpret human activities and scene types.
DRL with PPO: The navigation policy is learned with deep reinforcement learning, trained using Proximal Policy Optimization (PPO).
Dynamic Social Navigation: Robots dynamically adjust their navigation strategies based on the priority of human activities (e.g., walking, carrying, talking) and environmental semantics.
Simulation and Real-World Testing: Evaluated in PyBullet simulation and in real-world experiments on a TurtleBot3 with a RealSense D405 camera.
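To make the idea of activity-dependent navigation concrete, here is a minimal sketch of how an activity label could be turned into a comfort radius and a social penalty. It is an illustration only, not the VISTA implementation: the priority values, the radii, and the names `ACTIVITY_PRIORITY`, `comfort_radius`, and `social_penalty` are all hypothetical.

```python
import math
from dataclasses import dataclass

# Hypothetical priorities and radii (metres); these values are NOT from VISTA.
ACTIVITY_PRIORITY = {"walking": 1, "talking": 2, "carrying": 3}
BASE_RADIUS_M = 0.5
RADIUS_PER_PRIORITY_M = 0.25


@dataclass
class Human:
    x: float
    y: float
    activity: str  # label assumed to come from the VLM


def comfort_radius(activity: str) -> float:
    """Higher-priority activities get a larger radius; unknown labels use priority 1."""
    priority = ACTIVITY_PRIORITY.get(activity, 1)
    return BASE_RADIUS_M + RADIUS_PER_PRIORITY_M * priority


def social_penalty(robot_xy, humans) -> float:
    """Sum of how far the robot intrudes into each human's comfort radius."""
    rx, ry = robot_xy
    penalty = 0.0
    for human in humans:
        distance = math.hypot(human.x - rx, human.y - ry)
        penalty += max(0.0, comfort_radius(human.activity) - distance)
    return penalty


if __name__ == "__main__":
    people = [Human(1.0, 0.0, "walking"), Human(0.0, 1.0, "carrying")]
    print(social_penalty((0.0, 0.0), people))
```

In a DRL setting, a term like this would typically be subtracted from the reward so that the policy learns to keep more distance from people engaged in higher-priority activities.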

Requirements

Python 3.7+
PyBullet for simulation environments
TensorFlow or PyTorch for model training
RealSense SDK (for real-world experiments)
YOLO and DeepSORT (for real-time human detection and tracking)

Setup

Create a conda environment or virtual environment with Python 3.7 or 3.8 (Python 3.7+ is required, as listed under Requirements), then install PyTorch 1.12.1 and torchvision.

Install OpenAI Baselines:

git clone https://github.com/openai/baselines.git
cd baselines
pip install -e .

Install the Python-RVO2 library:

pip install python-rvo2
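After the setup steps, a quick import check can confirm that the environment is complete before training. The following is a minimal sketch; the module names in `REQUIRED_MODULES` and `OPTIONAL_MODULES` are the usual import names for the packages mentioned in this README and are an assumption, so adjust them to match your installation.

```python
import importlib.util
import sys

# Import names assumed from the packages named in this README.
REQUIRED_MODULES = ["torch", "torchvision", "pybullet", "baselines", "rvo2"]
# Only needed for real-world experiments with the RealSense camera.
OPTIONAL_MODULES = ["pyrealsense2"]


def find_missing(module_names):
    """Return the top-level module names that cannot be located for import."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]


def main():
    """Print a short report and return the list of missing required modules."""
    print("Python", sys.version.split()[0])
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing required modules:", ", ".join(missing))
    else:
        print("All required modules were found.")
    optional_missing = find_missing(OPTIONAL_MODULES)
    if optional_missing:
        print("Missing optional modules:", ", ".join(optional_missing))
    return missing


if __name__ == "__main__":
    main()
```

The check only locates each module without importing it, so it runs quickly and does not initialize any GPU or simulator state.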

Installation

To get started with VISTA, clone the repository:

git clone https://github.com/XXINCH-code/VLM_robot_navi.git
cd VLM_robot_navi

Acknowledgement

This code is partly based on HEIGHT. I thank the authors for releasing their code.
