This is the official implementation of our paper MixPrompt: Efficient Mixed Prompting for Multimodal Semantic Segmentation.
Authors: Zhiwei Hao, Zhongyu Xiao, Jianyuan Guo, Li Shen, Yong Luo, Han Hu, Dan Zeng
We present a prompt-based method for multimodal semantic segmentation built on top of a pretrained single-modality RGB model.
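As background for readers new to visual prompting, the sketch below shows one generic way to prompt a frozen RGB transformer block with tokens conditioned on an auxiliary modality such as depth/HHA. It is only a conceptual illustration with hypothetical names (`PromptedBlock`, `aux_proj`); the actual mixed-prompting design follows the paper and the code in `semseg/models`, not this snippet.

```python
# Conceptual sketch only (NOT the MixPrompt architecture): a frozen RGB block
# receives extra prompt tokens computed from the auxiliary modality.
import torch
import torch.nn as nn

class PromptedBlock(nn.Module):
    """Wrap a frozen transformer block and prepend prompts conditioned on the auxiliary modality."""
    def __init__(self, block: nn.Module, embed_dim: int, num_prompts: int = 8):
        super().__init__()
        self.block = block
        for p in self.block.parameters():        # keep the pretrained RGB block frozen
            p.requires_grad_(False)
        self.prompts = nn.Parameter(torch.zeros(1, num_prompts, embed_dim))
        self.aux_proj = nn.Linear(embed_dim, embed_dim)   # project auxiliary-modality features

    def forward(self, rgb_tokens: torch.Tensor, aux_tokens: torch.Tensor) -> torch.Tensor:
        b = rgb_tokens.size(0)
        # condition the learnable prompts on pooled auxiliary (e.g. depth/HHA) tokens
        prompts = self.prompts.expand(b, -1, -1) + self.aux_proj(aux_tokens.mean(dim=1, keepdim=True))
        x = torch.cat([prompts, rgb_tokens], dim=1)       # prepend prompts to the RGB token sequence
        x = self.block(x)
        return x[:, prompts.size(1):]                     # drop the prompt tokens afterwards

# Example with a plain TransformerEncoderLayer as a stand-in for a backbone block.
layer = nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True)
wrapped = PromptedBlock(layer, embed_dim=64)
out = wrapped(torch.randn(2, 196, 64), torch.randn(2, 196, 64))   # -> (2, 196, 64)
```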
You can download the official NYU Depth V2 dataset here. After downloading, reorganize the data according to the directory structure shown below.
You can download the dataset from the official SUNRGBD website and preprocess it according to the requirements specified on the website.
For RGB-Depth semantic segmentation, HHA maps can be generated from depth maps following https://github.com/charlesCXK/Depth2HHA-python.
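A rough usage sketch of that repository is shown below; the `getHHA` import path, its argument order, the depth scaling, and the example intrinsics (commonly used NYU Depth V2 values) are assumptions on our part, so please check them against the upstream demo script.

```python
# Rough Depth2HHA-python usage sketch; verify the API against the upstream repo.
import cv2
import numpy as np
from getHHA import getHHA                      # provided by the Depth2HHA-python repository

# Commonly used NYU Depth V2 RGB intrinsics; substitute your dataset's camera matrix.
camera_matrix = np.array([[518.8579, 0.0, 325.5824],
                          [0.0, 519.4696, 253.7362],
                          [0.0, 0.0, 1.0]])

# Scale factor depends on how the depth PNGs were exported; here we assume millimeters.
depth = cv2.imread('depth.png', cv2.IMREAD_ANYDEPTH).astype(np.float64) / 1000.0
hha = getHHA(camera_matrix, depth, depth)      # (intrinsics, processed depth, raw depth)
cv2.imwrite('hha.png', hha)
```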
You can download the dataset from the official MFNet website.
You can download the dataset from the official DELIVER website.
- Clone this repo.
$ git clone https://github.com/xiaoshideta/MixPrompt.git
$ cd MixPrompt
- Install all dependencies.
$ conda create -n mixprompt python=3.8.11
$ conda activate mixprompt
$ pip install -r requirements.txt
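After installing, a quick sanity check that PyTorch can see your GPUs avoids surprises before launching distributed training:
$ python -c "import torch; print(torch.__version__, torch.cuda.is_available(), torch.cuda.device_count())"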
Your directory tree should look like this:
|-- <configs>
|-- <semseg>
    |-- <datasets>
    |-- <models>
    |-- <utils>
|-- <pretrained>
    |-- <pre>
    |-- <segformer>
|-- <dataset>
    |-- <NYUDepthv2>
        |-- <RGBFolder>
        |-- <HHAFolder>
        |-- <LabelFolder>
        |-- train.txt
        |-- test.txt
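If you want to verify the layout before training, a small check like the one below (a hypothetical helper, assuming file stems match across folders) will flag RGB images without a corresponding HHA map or label:

```python
# Hypothetical sanity check: every RGB image should have a matching HHA map and label.
import os

root = 'dataset/NYUDepthv2'

def stems(sub):
    return {os.path.splitext(f)[0] for f in os.listdir(os.path.join(root, sub))}

rgb, hha, lab = stems('RGBFolder'), stems('HHAFolder'), stems('LabelFolder')
print('RGB images missing HHA:', sorted(rgb - hha)[:5])
print('RGB images missing labels:', sorted(rgb - lab)[:5])
```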
Download the pretrained segformer here.
- Train the model.
$ bash train.sh
$ CUDA_VISIBLE_DEVICES=0,1,2,3 python -m torch.distributed.launch --master_port=25035 --nproc_per_node=4 --use_env train_mm.py --cfg configs/nyu.yaml --wandb 0
- Evaluate the model.
$ bash val.sh
$ CUDA_VISIBLE_DEVICES=5 python val_mm.py --cfg configs/nyu.yaml
Part of our code is based on CMNeXt and PrimKD, thanks for their excellent work!
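A note on the launcher: torch.distributed.launch is deprecated in recent PyTorch releases. If your installation warns about it, the equivalent torchrun invocation should work with the same arguments, e.g. for NYU Depth V2:
$ CUDA_VISIBLE_DEVICES=0,1,2,3 torchrun --master_port=25035 --nproc_per_node=4 train_mm.py --cfg configs/nyu.yaml --wandb 0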







