PHyDiff-OAM

Physics-aligned Diffusion for OAM Radar Imaging

A physics-constrained diffusion model for Orbital Angular Momentum (OAM) radar imaging, optimized for the NVIDIA H800 NVL GPU.

📋 Overview

PHyDiff-OAM is a radar imaging project that combines physical constraints with deep learning. It injects OAM radar features directly into the input of the Stable Diffusion UNet in order to detect and reconstruct aircraft targets from radar echoes.

Key Innovations

Hard-Concatenation Strategy: Concatenates 8 channels of OAM radar physical features with the 4-channel noisy latent representation to form a 12-channel UNet input
Physics-AI Hybrid: Combines an OAM radar simulation, based on an approximation of the Stratton-Chu integral, with the Stable Diffusion model
H800 Optimization: Uses the bfloat16 data type, which the H800 GPU supports natively

🔧 Hardware Requirements

GPU: NVIDIA H800 NVL
Recommended VRAM: ≥ 40GB
Data Type: bfloat16 (natively supported by H800)

📦 Installation

pip install -r requirements.txt

Main Dependencies

PyTorch >= 2.4.0
Diffusers >= 0.30.0
Transformers >= 4.44.0
NumPy >= 1.26.0
scikit-image >= 0.24.0

🚀 Quick Start

1. Training

python train.py

Training Configuration:

Batch Size: 8
Training Steps: 5000
Learning Rate: 1e-4
Optimizer: AdamW
Data Type: bfloat16

Training Output Example:

🚀 Starting PHyDiff-OAM Training on NVIDIA H800 NVL...
📦 Loading Stable Diffusion v1.5 components...
🔧 Applying Hard-Concatenation surgery to UNet...
🔥 Start Training (5000 steps)...
   Step 0100/5000 | Loss: 0.0027
   Step 0200/5000 | Loss: 0.0101
   ...
   Step 1200/5000 | Loss: 0.0092
✅ Saving model to checkpoints/radar_unet.pth...

After training, model weights will be saved to checkpoints/radar_unet.pth.

2. Inference and Evaluation

python inference.py

Evaluation Metrics:

PSNR (Peak Signal-to-Noise Ratio)
SSIM (Structural Similarity Index)

Inference Output Example:

🚀 Starting Evaluation...
📦 Loading Model Weights...
✅ Loaded trained weights.
📊 Calculating Metrics...

🏆 Final Results (Avg over 50 samples):
✅ PSNR: 9.3698 dB
✅ SSIM: 0.0694
🖼️ Comparison image saved to results/final_comparison.png

Inference comparison images will be saved to results/final_comparison.png.

📁 Project Structure

PHyDiff-OAM/
├── data_engine.py              # Synthetic radar data generation engine
├── train.py                    # H800 GPU training script
├── inference.py                # Inference and evaluation script
├── requirements.txt            # Dependency configuration
├── models/
│   ├── __init__.py
│   └── physics_adapter.py      # Physics adapter (radar signal preprocessing + UNet surgery)
├── checkpoints/                # Model weights directory
│   └── radar_unet.pth
└── results/                    # Inference results directory
    └── final_comparison.png

🔬 Technical Details

Data Generation Engine (`data_engine.py`)

The AircraftRadarDataset class generates synthetic aircraft targets and the corresponding OAM radar echoes on the fly:

Image Resolution: 512×512 (ground truth)
Simulation Grid: 64×64 (physical simulation)
Number of OAM Modes: 8 (mode range: l = -3 to +4)
Radar Frequency: 10 GHz
Physical Model: Green's function kernel based on an approximation of the Stratton-Chu integral

OAM Sensing Matrix:

K(l, r) = exp(-j2kρ) * exp(jlφ)

Where:

k = 2π/λ is the wavenumber (λ is the wavelength)
r = (ρ, φ) is a position on the simulation grid in polar coordinates
ρ is the radial distance
φ is the azimuth angle
l is the OAM mode index
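
As a rough illustration of the formula above, the following NumPy sketch builds K on a 64×64 grid for the eight modes l = -3 ... +4 at 10 GHz. The function name `oam_sensing_matrix`, the `extent` parameter (scene half-width in metres), and the grid layout are assumptions made for this example; the actual code in data_engine.py may differ.

```python
import numpy as np

C = 299_792_458.0  # speed of light in m/s


def oam_sensing_matrix(freq=10e9, sim_size=64, num_modes=8, extent=1.0):
    """Return (K, modes), where K has shape (num_modes, sim_size * sim_size).

    `extent` and the Cartesian grid layout are assumptions of this sketch.
    """
    wavelength = C / freq
    k = 2.0 * np.pi / wavelength  # wavenumber

    # Mode indices l = -3 ... +4 when num_modes = 8
    modes = np.arange(num_modes) - (num_modes // 2 - 1)

    # Cartesian grid converted to polar coordinates (rho, phi)
    axis = np.linspace(-extent, extent, sim_size)
    x, y = np.meshgrid(axis, axis, indexing="xy")
    rho = np.hypot(x, y).ravel()
    phi = np.arctan2(y, x).ravel()

    # K(l, r) = exp(-j 2 k rho) * exp(j l phi)
    radial = np.exp(-2j * k * rho)[None, :]
    azimuthal = np.exp(1j * modes[:, None] * phi[None, :])
    return radial * azimuthal, modes


K, modes = oam_sensing_matrix()
print(K.shape, modes.tolist())
```

At 10 GHz the wavelength is about 3 cm, so k ≈ 209.6 rad/m. Every entry of K has unit magnitude, because the kernel as written carries phase information only.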

Physics Adapter (`models/physics_adapter.py`)

1. Radar Signal Preprocessing

preprocess_radar_signal(S_radar, target_dtype)

Converts complex radar echoes to real-valued feature maps
Computes magnitude: |S| = √(real² + imag² + ε)
Batch-wise Min-Max normalization
Output: [B, 8, 64, 64] bfloat16 tensor
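
The steps listed above could look roughly like the PyTorch sketch below. It is not the repository's implementation: the function name `preprocess_radar_signal_sketch`, the `eps` default, and the choice to take the minimum and maximum per sample (one reading of "batch-wise") are assumptions.

```python
import torch


def preprocess_radar_signal_sketch(S_radar, target_dtype=torch.bfloat16, eps=1e-8):
    """Map complex echoes [B, 8, 64, 64] to real features in [0, 1]."""
    # Magnitude with a small epsilon for numerical stability: sqrt(re^2 + im^2 + eps)
    magnitude = torch.sqrt(S_radar.real ** 2 + S_radar.imag ** 2 + eps)

    # Min-max normalization, computed per sample (assumption of this sketch)
    flat = magnitude.flatten(start_dim=1)
    lo = flat.min(dim=1).values.view(-1, 1, 1, 1)
    hi = flat.max(dim=1).values.view(-1, 1, 1, 1)
    normalized = (magnitude - lo) / (hi - lo + eps)

    return normalized.to(target_dtype)


echoes = torch.complex(torch.randn(2, 8, 64, 64), torch.randn(2, 8, 64, 64))
features = preprocess_radar_signal_sketch(echoes)
print(features.shape, features.dtype)
```

Normalizing in float32 and casting to bfloat16 only at the end keeps the reduced precision of bfloat16 out of the min-max arithmetic.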

2. UNet Input Layer Surgery

modify_unet_input_layer(unet, new_channels=12)

Original Input: 4 channels (noisy latent representation)
Extended Input: 12 channels (4-channel noisy latent + 8-channel physical features)
Weight Initialization Strategy:
- First 4 channels: Copy pretrained weights (preserve knowledge)
- Last 8 channels: Zero initialization (avoid disrupting pretrained distribution)
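
The weight initialization strategy above can be sketched on a plain `nn.Conv2d`. The helper name `expand_conv_in` is hypothetical, and the sketch assumes a convolution with default dilation and groups; it shows the idea rather than the exact body of modify_unet_input_layer.

```python
import torch
import torch.nn as nn


def expand_conv_in(conv, new_channels=12):
    """Return a copy of `conv` that accepts `new_channels` input channels."""
    old_channels = conv.in_channels
    new_conv = nn.Conv2d(
        new_channels,
        conv.out_channels,
        kernel_size=conv.kernel_size,
        stride=conv.stride,
        padding=conv.padding,
        bias=conv.bias is not None,
    ).to(device=conv.weight.device, dtype=conv.weight.dtype)

    with torch.no_grad():
        # Zero everything, then copy the pretrained weights into the first channels
        new_conv.weight.zero_()
        new_conv.weight[:, :old_channels].copy_(conv.weight)
        if conv.bias is not None:
            new_conv.bias.copy_(conv.bias)
    return new_conv


# Same shape as the SD v1.5 input convolution: 4 -> 320 channels, 3x3 kernel
pretrained = nn.Conv2d(4, 320, kernel_size=3, padding=1)
expanded = expand_conv_in(pretrained)

x = torch.randn(2, 12, 16, 16)
print(torch.allclose(expanded(x), pretrained(x[:, :4]), atol=1e-5))
```

Because the 8 new channels start at zero, the expanded layer initially produces the same output as the pretrained one regardless of the physical features, and training then learns how much weight to give them. In Diffusers the input convolution of `UNet2DConditionModel` is exposed as `unet.conv_in`, so the surgery would amount to replacing that attribute; whether the repository does exactly this is not confirmed here.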

Training Pipeline (`train.py`)

Load Stable Diffusion v1.5 components (VAE, UNet, text encoder)
Perform "surgery" on UNet input layer, expanding to 12 channels
Cast the models to the bfloat16 data type (H800 optimization)
Enable gradient checkpointing to save VRAM
Freeze VAE and text encoder, train only UNet
Training loop:
- VAE encodes ground truth to latent representation
- Randomly sample timesteps and add the corresponding amount of noise to the latents
- Concatenate noisy latent representation with physical features
- UNet predicts noise, compute MSE loss
- Gradient clipping (max_norm=1.0)
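
One iteration of the training loop above can be sketched with toy stand-ins. Everything here is illustrative: `toy_unet` is a single convolution instead of the real UNet (which also takes a timestep and text embeddings), the latents are random tensors rather than VAE outputs, and the linear beta schedule is a generic choice, not necessarily the one train.py uses. The sketch runs in float32 on CPU for portability.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical stand-in for the 12-channel UNet
toy_unet = nn.Conv2d(12, 4, kernel_size=3, padding=1)
optimizer = torch.optim.AdamW(toy_unet.parameters(), lr=1e-4)

# Generic DDPM-style noise schedule (assumption of this sketch)
NUM_TRAIN_TIMESTEPS = 1000
betas = torch.linspace(1e-4, 2e-2, NUM_TRAIN_TIMESTEPS)
alphas_cumprod = torch.cumprod(1.0 - betas, dim=0)


def training_step(latents, physics_features):
    # Sample one timestep per example and add the matching amount of noise
    noise = torch.randn_like(latents)
    t = torch.randint(0, NUM_TRAIN_TIMESTEPS, (latents.shape[0],))
    a = alphas_cumprod[t].view(-1, 1, 1, 1)
    noisy_latents = a.sqrt() * latents + (1.0 - a).sqrt() * noise

    # Hard concatenation: 4 latent channels + 8 physics channels = 12
    model_input = torch.cat([noisy_latents, physics_features], dim=1)

    # Predict the noise and compare with the noise that was added
    loss = F.mse_loss(toy_unet(model_input), noise)

    optimizer.zero_grad()
    loss.backward()
    torch.nn.utils.clip_grad_norm_(toy_unet.parameters(), max_norm=1.0)
    optimizer.step()
    return loss.item()


latents = torch.randn(8, 4, 64, 64)   # stands in for VAE-encoded ground truth
physics = torch.rand(8, 8, 64, 64)    # stands in for preprocessed radar features
loss_value = training_step(latents, physics)
print(loss_value)
```

The Stable Diffusion v1.5 VAE downsamples by a factor of 8, so a 512×512 ground-truth image becomes a 64×64 latent. That matches the 64×64 simulation grid, which is what allows the two tensors to be concatenated along the channel dimension without resizing.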

Inference Pipeline (`inference.py`)

Load trained UNet weights
Perform inference on 50 test samples
Compare with traditional Back-Projection (BP) algorithm
Calculate PSNR and SSIM metrics
Generate visualization comparison images
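
For reference, PSNR can be computed in a few lines of NumPy. The helper below is a sketch that assumes images scaled to [0, 1]; the name `psnr` and the `data_range` default are choices made for this example. scikit-image, which is listed in the dependencies, provides `peak_signal_noise_ratio` and `structural_similarity` in `skimage.metrics`, and inference.py may well use those instead.

```python
import numpy as np


def psnr(reference, estimate, data_range=1.0):
    """Peak signal-to-noise ratio in dB for images with the given value range."""
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(data_range ** 2 / mse)


ground_truth = np.zeros((64, 64))
reconstruction = np.full((64, 64), 0.1)
print(psnr(ground_truth, reconstruction))
```

Under the same [0, 1] assumption, a PSNR of 9.37 dB corresponds to a mean squared error of about 0.116, or an RMS error of roughly 0.34.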

📊 Performance Metrics

Average results based on 50 test samples:

| Metric | Value |
| --- | --- |
| PSNR | 9.37 dB |
| SSIM | 0.069 |

Note: These are preliminary results on synthetic data, and both values are low in absolute terms (an SSIM of 0.069 indicates little structural agreement with the ground truth). Longer training, hyperparameter tuning, or a larger dataset may improve them.

⚙️ Configuration

Training Parameters (`train.py`)

BATCH_SIZE = 8          # Batch size
NUM_STEPS = 5000        # Training steps
LEARNING_RATE = 1e-4    # Learning rate
DEVICE = "cuda"         # Device

Data Generation Parameters (`data_engine.py`)

num_samples = 2000      # Samples per epoch
img_size = 512          # Ground truth image resolution
sim_size = 64           # Radar simulation grid resolution
num_modes = 8           # Number of OAM modes
freq = 10e9             # Radar center frequency (10 GHz)

Inference Parameters (`inference.py`)

NUM_SAMPLES = 50        # Number of evaluation samples
NUM_INFERENCE_STEPS = 50  # DDIM inference steps
GUIDANCE_SCALE = 7.5    # Classifier-free guidance strength

🌐 China Mirror Support

The project is configured to use a Hugging Face mirror in China (hf-mirror.com) to accelerate model downloads:

os.environ["HF_ENDPOINT"] = "https://hf-mirror.com"
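
A minimal sketch of where this line needs to go: `huggingface_hub` reads `HF_ENDPOINT` when it is first imported, so the variable has to be set before `diffusers`, `transformers`, or `huggingface_hub` is imported. Using `setdefault` rather than a plain assignment is a choice made for this example, so that a value already exported in the shell takes precedence.

```python
import os

# Set the mirror before any Hugging Face library is imported.
# setdefault keeps an endpoint that was already exported in the shell.
os.environ.setdefault("HF_ENDPOINT", "https://hf-mirror.com")

# Imports of diffusers / transformers would follow here.
print(os.environ["HF_ENDPOINT"])
```

Users outside China can export `HF_ENDPOINT` themselves, or remove the line, to download from the default endpoint.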

🛠️ Troubleshooting

1. Out of Memory

Reduce batch size (BATCH_SIZE)
Enable gradient checkpointing (enabled by default)
Use gradient accumulation
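
Gradient accumulation is not described elsewhere in this README, so the following is only a generic PyTorch sketch of the technique with a hypothetical helper `accumulate_and_step` and a toy linear model. Several small micro-batches are processed one after another, their scaled gradients add up, and the optimizer steps once, giving the same gradient as one large batch at a fraction of the peak memory.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def accumulate_and_step(model, optimizer, micro_batches):
    """Accumulate gradients over equally sized micro-batches, then step once.

    Returns the accumulated gradients (before clipping) for inspection.
    """
    optimizer.zero_grad()
    for x, y in micro_batches:
        # Divide so that the summed gradients equal the full-batch mean gradient
        loss = F.mse_loss(model(x), y) / len(micro_batches)
        loss.backward()
    grads = [p.grad.clone() for p in model.parameters()]
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
    optimizer.step()
    return grads


torch.manual_seed(0)
model = nn.Linear(16, 1)  # toy stand-in for the UNet
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

inputs, targets = torch.randn(8, 16), torch.randn(8, 1)
# Four micro-batches of 2 emulate an effective batch size of 8
micro_batches = [(inputs[i:i + 2], targets[i:i + 2]) for i in range(0, 8, 2)]
accumulated = accumulate_and_step(model, optimizer, micro_batches)
print(len(accumulated))
```

In train.py this would correspond to lowering BATCH_SIZE and stepping the optimizer only every few iterations.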

2. Training Instability

Ensure bfloat16 data type is used
Check if gradient clipping is enabled
Lower learning rate

3. Slow Inference

Reduce inference steps (NUM_INFERENCE_STEPS)
Use faster schedulers (e.g., DPM-Solver++)

📝 Citation

If this project helps your research, please consider citing:

@software{phydiff_oam_2025,
  title={PHyDiff-OAM: Physics-aligned Diffusion for OAM Radar Imaging},
  author={Dryoung},
  year={2025},
  url={https://github.com/cjy20050905/PHyDiff-OAM}
}

📄 License

This project is licensed under the Apache License 2.0. See the LICENSE file for details.

🙏 Acknowledgments

Stable Diffusion - Base diffusion model
Hugging Face Diffusers - Diffusion model library
NVIDIA H800 NVL - Powerful computing support

📧 Contact

For questions or suggestions, please contact:

Email: 3241347200@qq.com
GitHub Issues: open an issue on the repository (https://github.com/cjy20050905/PHyDiff-OAM)

Last Updated: 2025-02-23
