IceSegNet: A Stage-Aware Dynamic Kernel Network for River Ice Segmentation in Remote Sensing Imagery
English | 中文
Kaijun Wu, Dingju Zhou*, Juanjuan Du, Yuelian Wu, Lidong Zhang
Applied Soft Computing, Vol. 186, 2026 | Paper (DOI: 10.1016/j.asoc.2025.114120)
- Stage-Aware Kernel Update Module: three structurally distinct stages (detail preservation → transition stabilization → semantic purification) with progressively reduced feedforward widths (2048 → 1024 → 512), cutting FFN parameters by 41.7% while improving mIoU by 0.96%.
- UPerSCA-MTL Decode Head: Unified Perceptual Parsing enhanced with Spatial Cross-Attention (SCA) and Multi-Task Learning (MTL) for joint semantic segmentation and edge detection.
- State-of-the-Art Results: 93.81% mIoU on NWPU_YRCC2 (+1.19 percentage points over K-Net, +1.31 over Mask2Former); 93.56% mIoU on NWPU_YRCC_EX (+0.43 percentage points over K-Net).
Accurate segmentation of river ice in remote sensing imagery is critical for quantifying ice coverage, a key variable in early warning and risk assessment of ice-jam disasters. IceSegNet addresses the challenges of large-scale variation, spectral similarity between ice and water, and ambiguous boundaries through two core innovations:
- A stage-aware kernel update module that refines features through three structurally distinct stages with progressively reduced hidden widths.
- UPerSCA-MTL, a multi-task decoding head that fuses spatial cross-attention and edge detection to enhance boundary accuracy.
Evaluated on two Yellow River ice datasets, NWPU_YRCC2 (1525 images, 4 classes) and NWPU_YRCC_EX (887 images, 3 classes), IceSegNet achieves state-of-the-art performance among 18 competing segmentation models.
Stage-aware kernel update module:

| Stage | Role | FFN Hidden Width |
|---|---|---|
| I | Detail Preservation: retains fine texture and edge cues | 2048 |
| II | Transition Stabilization: bridges low-level detail to semantics | 1024 |
| III | Semantic Purification: compact class-discriminative embeddings | 512 |
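
The 41.7% FFN saving quoted earlier can be reproduced with a little arithmetic. The sketch below assumes a standard two-layer FFN (`embed_dim` → hidden → `embed_dim`, with biases) and a baseline that keeps a hidden width of 2048 in all three stages. Both are assumptions, and `embed_dim=256` is an illustrative value rather than one taken from the repository.

```python
def ffn_params(embed_dim: int, hidden: int) -> int:
    """Parameters of a two-layer FFN: embed_dim -> hidden -> embed_dim, with biases."""
    return (embed_dim * hidden + hidden) + (hidden * embed_dim + embed_dim)


def ffn_reduction(stage_widths, baseline_width=2048, embed_dim=256):
    """Fractional FFN parameter saving versus a uniform-width baseline (assumed)."""
    staged = sum(ffn_params(embed_dim, h) for h in stage_widths)
    uniform = len(stage_widths) * ffn_params(embed_dim, baseline_width)
    return 1.0 - staged / uniform


if __name__ == "__main__":
    saving = ffn_reduction([2048, 1024, 512])
    print(f"FFN parameter reduction: {saving:.1%}")
```

The weight matrices dominate the count, so the ratio is close to 1 - (2048 + 1024 + 512) / (3 × 2048) ≈ 41.7% and barely changes with the embedding width.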
UPerSCA-MTL decode head components:

- PPM: Multi-scale context aggregation via pyramid pooling
- SCA (Spatial Cross-Attention): Horizontal + vertical global pooling to capture axis-specific directional dependencies, enhancing ambiguous ice-water boundary delineation
- Depthwise Separable Convolutions: Reduce computational complexity without sacrificing feature quality
- Multi-Task Head: Parallel segmentation + edge detection branches; edge supervision derived automatically from GT mask gradients (no extra annotation needed)
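
The edge supervision mentioned in the last bullet needs only the segmentation mask. As a rough illustration, the sketch below marks a pixel as an edge when its class label differs from its right or lower neighbour, that is, where the forward difference of the mask is non-zero. This is just one simple reading of "GT mask gradients": the operator actually used in `uper_att_plus_head.py` may differ (a Sobel filter or a morphological gradient would be equally plausible), and `edge_from_mask` is a hypothetical helper name.

```python
from typing import List

Mask = List[List[int]]


def edge_from_mask(mask: Mask) -> Mask:
    """Binary edge map: 1 where the label changes towards the right or below."""
    height, width = len(mask), len(mask[0])
    edges = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            label = mask[y][x]
            # Forward differences against the right and lower neighbours.
            right_differs = x + 1 < width and mask[y][x + 1] != label
            below_differs = y + 1 < height and mask[y + 1][x] != label
            if right_differs or below_differs:
                edges[y][x] = 1
    return edges


if __name__ == "__main__":
    # Toy mask with two classes; the label values are arbitrary.
    toy_mask = [
        [1, 1, 2, 2],
        [1, 1, 2, 2],
        [1, 1, 1, 1],
    ]
    for row in edge_from_mask(toy_mask):
        print(row)
```

In a real pipeline the same idea would normally be written with array operations and applied on the fly to each ground-truth mask, so no separate edge annotations have to be stored.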
Results on NWPU_YRCC2:

| Method | Venue | mIoU (%) | PA (%) | mFscore (%) | FPS | Params (M) |
|---|---|---|---|---|---|---|
| U-Net | MICCAI 2015 | 61.62 | 80.59 | 75.55 | 3.05 | 29 |
| PSPNet | CVPR 2017 | 90.70 | 94.28 | 95.07 | 9.41 | 47 |
| DeepLabv3+ | ECCV 2018 | 90.37 | 94.74 | 94.92 | 5.42 | 60 |
| SegFormer | NeurIPS 2021 | 86.99 | 92.61 | 92.98 | 4.18 | 82 |
| Mask2Former | CVPR 2022 | 92.50 | 96.12 | 96.09 | 3.75 | 216 |
| DINOv2+Rein | CVPR 2025 | 89.86 | 94.01 | 94.63 | 2.21 | 317 |
| K-Net (baseline) | NeurIPS 2021 | 92.62 | 95.78 | 96.15 | 3.58 | 245 |
| IceSegNet (Ours) | ASOC 2025 | 93.81 | 96.31 | 96.79 | 3.43 | 247 |
Results on NWPU_YRCC_EX:

| Method | mIoU (%) | PA (%) | mFscore (%) |
|---|---|---|---|
| PSPNet | 85.42 | 92.54 | 92.13 |
| DeepLabv3+ | 88.04 | 93.93 | 93.63 |
| Mask2Former | 93.26 | 96.56 | 96.50 |
| K-Net (baseline) | 93.13 | 96.49 | 96.43 |
| IceSegNet (Ours) | 93.56 | 96.73 | 96.67 |
- Python ≥ 3.8
- PyTorch ≥ 1.12 with CUDA
- MMEngine, MMCV ≥ 2.0, MMSegmentation
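
Once the packages are installed, the version requirements listed above can be checked with a small standard-library script. This is a minimal sketch, not part of the repository: the distribution names (`torch`, `mmcv`, `mmengine`, `mmsegmentation`) are assumed to match what pip installed, pre-release and local suffixes such as `rc4` or `+cu116` are ignored, and CUDA availability is not tested.

```python
import re
import sys
from importlib import metadata

# Minimum versions from the requirements list; names are assumed pip distribution names.
MIN_VERSIONS = {"torch": "1.12", "mmcv": "2.0.0"}
UNPINNED = ("mmengine", "mmsegmentation")


def parse_version(text: str) -> tuple:
    """Leading numeric components, e.g. '1.12.1+cu116' -> (1, 12, 1)."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    return tuple(int(part) for part in match.group().split(".")) if match else ()


def at_least(installed: tuple, required: tuple) -> bool:
    """Compare versions after zero-padding them to the same length."""
    width = max(len(installed), len(required))
    pad = lambda v: v + (0,) * (width - len(v))
    return pad(installed) >= pad(required)


def check_environment() -> dict:
    """Map each requirement to 'ok', 'too old' or 'missing'."""
    report = {"python": "ok" if sys.version_info >= (3, 8) else "too old"}
    for name in (*MIN_VERSIONS, *UNPINNED):
        try:
            installed = parse_version(metadata.version(name))
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        required = parse_version(MIN_VERSIONS.get(name, "0"))
        report[name] = "ok" if at_least(installed, required) else "too old"
    return report


if __name__ == "__main__":
    for name, status in check_environment().items():
        print(f"{name:16s} {status}")
```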
1. Create and activate conda environment
conda create -n icesegnet python=3.8 -y
conda activate icesegnet

2. Install PyTorch (example: CUDA 11.6)
pip install torch==1.12.1+cu116 torchvision==0.13.1+cu116 \
    --extra-index-url https://download.pytorch.org/whl/cu116

3. Install MMEngine and MMCV
pip install -U openmim
mim install mmengine
mim install "mmcv>=2.0.0"

4. Install MMSegmentation
git clone https://github.com/open-mmlab/mmsegmentation.git
cd mmsegmentation
pip install -v -e .
cd ..

5. Clone this repository
git clone https://github.com/fox-4869/IceSegNet.git
cd IceSegNet
pip install -r requirements.txt

Download the datasets:
- NWPU_YRCC2: https://github.com/nwpulab113/NWPUYRCC2
- NWPU_YRCC_EX: https://github.com/nwpulab113/NWPUYRCCEX
Organize the directory as follows:
data/
├── NWPU_YRCC2_JPG1/
│   ├── train/          # Training images (.jpg)
│   ├── train_labels/   # Training annotations (.png)
│   ├── val/            # Validation images
│   └── val_labels/     # Validation annotations
└── NWPU_YRCC_EX/
    ├── train/
    ├── train_labels/
    ├── val/
    └── val_labels/
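
Before training, it can help to confirm that every image in the layout above has a matching annotation. The helper below is a hypothetical sketch: it assumes that an image and its label share the same file stem, and that the `.jpg`/`.png` suffixes noted for NWPU_YRCC2_JPG1 also apply to the validation split and to NWPU_YRCC_EX.

```python
from pathlib import Path


def unmatched_samples(root, split, img_suffix=".jpg", label_suffix=".png"):
    """Stems of images in <root>/<split>/ with no label in <root>/<split>_labels/."""
    root = Path(root)
    images = {p.stem for p in (root / split).glob(f"*{img_suffix}")}
    labels = {p.stem for p in (root / f"{split}_labels").glob(f"*{label_suffix}")}
    return sorted(images - labels)


if __name__ == "__main__":
    for dataset in ("NWPU_YRCC2_JPG1", "NWPU_YRCC_EX"):
        for split in ("train", "val"):
            missing = unmatched_samples(Path("data") / dataset, split)
            print(f"{dataset}/{split}: {len(missing)} image(s) without a label")
```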
1. Copy dataset definitions
cp datasets/NWPU_YRCC2_JPG1.py mmsegmentation/mmseg/datasets/
cp datasets/NWPU_YRCC.py mmsegmentation/mmseg/datasets/

Add to mmseg/datasets/__init__.py:
from .NWPU_YRCC2_JPG1 import NWPU_YRCC2_JPG1
from .NWPU_YRCC import NWPU_YRCC

2. Copy model components
cp models/sefpn.py mmsegmentation/mmseg/models/necks/
cp models/uper_att_plus_head.py mmsegmentation/mmseg/models/decode_heads/

Register in the corresponding __init__.py files under mmseg/models/necks/ and mmseg/models/decode_heads/.
Single GPU
python tools/train.py configs/icesegnet-config.py

Multi-GPU (recommended; the paper uses 2× RTX 3090)
bash tools/dist_train.sh configs/icesegnet-config.py 2

Key training settings:

| Hyperparameter | Value |
|---|---|
| Optimizer | AdamW (β₁=0.9, β₂=0.999) |
| Learning Rate | 6×10⁻⁵ |
| Weight Decay | 5×10⁻⁴ |
| Batch Size | 4/GPU × 2 GPUs = 8 total |
| Max Iterations | 60,000 |
| LR Schedule | Linear warmup (500 iters) + CosineAnnealing |
| Crop Size | 512 × 512 |
| Backbone Init | ImageNet-22K pretrained Swin-L |
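
For orientation, the settings in the table map onto an MMEngine-style config roughly as follows. This fragment is a sketch rather than a copy of `configs/icesegnet-config.py`, which remains the authoritative source. Values that do not appear in the table (`start_factor`, `eta_min`) are illustrative assumptions, and the dataloader entry omits the dataset and sampler fields a real config needs.

```python
# Hypothetical config fragment mirroring the training-settings table.
max_iters = 60_000
warmup_iters = 500
crop_size = (512, 512)

optim_wrapper = dict(
    type="OptimWrapper",
    optimizer=dict(type="AdamW", lr=6e-5, betas=(0.9, 0.999), weight_decay=5e-4),
)

param_scheduler = [
    # Linear warmup over the first 500 iterations (start_factor is assumed).
    dict(type="LinearLR", start_factor=1e-6, by_epoch=False,
         begin=0, end=warmup_iters),
    # Cosine annealing over the remaining iterations (eta_min is assumed).
    dict(type="CosineAnnealingLR", T_max=max_iters - warmup_iters, eta_min=0.0,
         by_epoch=False, begin=warmup_iters, end=max_iters),
]

train_cfg = dict(type="IterBasedTrainLoop", max_iters=max_iters)
train_dataloader = dict(batch_size=4)  # per GPU; 2 GPUs give a total batch of 8
```

When training on a different number of GPUs, the per-GPU batch size or the learning rate may need adjusting to keep the effective batch size of 8 used in the paper.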
Standard evaluation
python tools/test.py configs/icesegnet-config.py /path/to/checkpoint.pth

With Test-Time Augmentation (multi-scale + flip)
python tools/test.py configs/icesegnet-config.py /path/to/checkpoint.pth --tta

Reported metrics: mIoU, mDice, mFscore, PA, BFscore
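
As a reminder of what the region-based metrics measure, the sketch below computes PA, mIoU and mDice from a confusion matrix using their standard definitions. It is an illustration, not the evaluation code used by the repository: PA is taken here to mean overall pixel accuracy, classes that never occur are skipped when averaging, and the boundary metric BFscore is not covered. With β = 1 the per-class F-score coincides with Dice.

```python
def segmentation_metrics(confusion):
    """PA, mIoU and mDice from a square confusion matrix.

    confusion[i][j] counts pixels of ground-truth class i predicted as class j.
    """
    num_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[c][c] for c in range(num_classes))
    ious, dices = [], []
    for c in range(num_classes):
        tp = confusion[c][c]
        gt = sum(confusion[c])                    # pixels labelled as class c
        pred = sum(row[c] for row in confusion)   # pixels predicted as class c
        union = gt + pred - tp
        if union == 0:
            continue  # class absent from both prediction and ground truth
        ious.append(tp / union)
        dices.append(2 * tp / (gt + pred))
    return {
        "PA": correct / total,
        "mIoU": sum(ious) / len(ious),
        "mDice": sum(dices) / len(dices),
    }


if __name__ == "__main__":
    # Toy two-class confusion matrix with 100 pixels.
    toy = [[50, 10],
           [5, 35]]
    for name, value in segmentation_metrics(toy).items():
        print(f"{name}: {value:.4f}")
```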
IceSegNet/
├── configs/
│   └── icesegnet-config.py      # Full training & evaluation config
├── datasets/
│   ├── NWPU_YRCC2_JPG1.py       # 4-class dataset (Land/Water/Shore Ice/Drift Ice)
│   └── NWPU_YRCC.py             # 3-class dataset (Others/Water/Shore Ice)
├── models/
│   ├── sefpn.py                 # SEFPN neck with BN+ReLU normalization
│   └── uper_att_plus_head.py    # UPerSCA-MTL decode head
├── tools/                       # Training & testing scripts (MMSeg)
├── README.md                    # English README
└── README_CN.md                 # Chinese README
If IceSegNet is helpful for your research, please cite:
@article{wu2026icesegnet,
title = {IceSegNet: A stage-aware dynamic kernel network for river ice
segmentation in remote sensing imagery},
author = {Wu, Kaijun and Zhou, Dingju and Du, Juanjuan and
Wu, Yuelian and Zhang, Lidong},
journal = {Applied Soft Computing},
volume = {186},
pages = {114120},
year = {2026},
publisher = {Elsevier},
doi = {10.1016/j.asoc.2025.114120}
}

This work was supported by the Natural Science Foundation Key Project of Gansu Province (23JRRA860), the Inner Mongolia Key R&D and Achievement Transformation Project (2023YFSH0043, 2023YFDZ0043, 2023YFDZ0054), the Key Research and Development Project of Lanzhou Jiaotong University (ZDYF2304), and the Excellent Graduate Student "Innovation Star" Project of Gansu Province (2025CXZX-682).
This codebase is built on MMSegmentation. We also thank the authors of K-Net for the foundational dynamic kernel framework.
Corresponding Author: Dingju Zhou (dingjuzhou@163.com)
Lanzhou Jiaotong University, Lanzhou 730070, China
