```python
researcher = {
    "name"       : "Youcef Abdelhalim",
    "degree"     : "MSc Artificial Intelligence (M2) – Mohamed Khider University, Biskra",
    "focus"      : ["Medical Imaging", "Semantic & Instance Segmentation",
                    "Explainable AI (XAI)", "Multimodal Perception"],
    "publication": "ICECET 2026 – Post-Hoc Explainability for Generative Volumetric VLMs",
    "club"       : "Debug Scientific Club",
    "location"   : "Biskra, Algeria 🇩🇿",
}
```

I build interpretable, high-performance vision systems – from low-level CUDA kernels to multimodal RAG pipelines.
My research sits at the intersection of deep learning transparency and clinical relevance, with a focus on making model decisions trustworthy in medical contexts.
| 🥇 | National Winner – Huawei ICT Competition, Cloud Track (2025) |
|---|---|
| 🥉 | 3rd Place – National AI Hackathon, University of El Oued (2025) |
| 📄 | First-Author Paper accepted @ ICECET 2026, Rome, Italy |
| 🛰️ | Team Lead – NASA Space Apps Challenge (2025) |
| 🥉 | 3rd Place – CSBIS AI Competition, University of Mohamed Khider (2023) |
### CT & MRI Multimodal Explainability for Generative Volumetric Models
Tied to the ICECET 2026 accepted paper.
- Adapted post-hoc XAI techniques to generative volumetric multimodal models
- Developed modality-specific attribution alignment and volumetric saliency aggregation across slices
- Implemented gradient-based & perturbation-based attributions tailored to generative outputs
- Consistency regularization across neighboring slices + anatomical overlay visualizations
- Reproducible evaluation: quantitative faithfulness/localization metrics + expert qualitative assessment
- Stack: PyTorch · Captum · NiBabel · SimpleITK
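The per-slice aggregation and neighbor-consistency step described above can be sketched in plain NumPy (the function name and windowed-average scheme are illustrative, not taken from the paper's codebase):

```python
import numpy as np

def aggregate_volume_saliency(slice_maps, window=1):
    """Aggregate per-slice 2D attribution maps into a volumetric saliency
    map, smoothing each slice with its neighbors to encourage consistency
    across adjacent slices (a simple stand-in for the regularizer).

    slice_maps: array of shape (num_slices, H, W) of raw attributions.
    window:     number of neighboring slices averaged on each side.
    """
    slice_maps = np.asarray(slice_maps, dtype=np.float64)
    n = slice_maps.shape[0]
    smoothed = np.empty_like(slice_maps)
    for i in range(n):
        lo, hi = max(0, i - window), min(n, i + window + 1)
        smoothed[i] = slice_maps[lo:hi].mean(axis=0)  # neighbor average
    # Normalize to [0, 1] so the map can be overlaid on anatomy.
    span = smoothed.max() - smoothed.min()
    if span > 0:
        smoothed = (smoothed - smoothed.min()) / span
    return smoothed

# Usage: 8 slices of 64x64 attributions
vol = aggregate_volume_saliency(np.random.rand(8, 64, 64))
print(vol.shape)  # → (8, 64, 64)
```

In practice the raw `slice_maps` would come from Captum attributions (e.g. Integrated Gradients) computed slice by slice.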
### Prompt-Guided Image Segmentation (PromptSeg)
- Lightweight multimodal framework generating pixel-accurate masks from free-text prompts
- Fused frozen DINOv2 visual features with CLIP text embeddings via trainable SAM-based decoder (~9.3M params)
- Trained on RefCOCO with multi-scale FPN fusion → best validation IoU: 0.42
- Documented failure modes; proposed LoRA fine-tuning + token-level text fusion improvements
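The core idea of scoring frozen visual patch features against a text prompt embedding can be sketched as follows (a toy NumPy mask head; the real decoder is a trainable SAM-based module, and the names here are hypothetical):

```python
import numpy as np

def prompt_to_mask(patch_feats, text_emb, temperature=0.07):
    """Toy prompt-guided mask head: score each visual patch against the
    text embedding and reshape the scores into a coarse soft mask.

    patch_feats: (H*W, D) frozen visual patch features (DINOv2-style).
    text_emb:    (D,) text prompt embedding (CLIP-style).
    Returns an (H, W) soft mask with values in (0, 1).
    """
    # Cosine similarity between each patch and the prompt.
    patch_feats = patch_feats / np.linalg.norm(patch_feats, axis=1, keepdims=True)
    text_emb = text_emb / np.linalg.norm(text_emb)
    scores = patch_feats @ text_emb / temperature
    soft = 1.0 / (1.0 + np.exp(-scores))  # sigmoid → soft mask
    side = int(np.sqrt(len(soft)))
    return soft.reshape(side, side)

# Usage: 16x16 patch grid, 64-dim features
mask = prompt_to_mask(np.random.randn(16 * 16, 64), np.random.randn(64))
print(mask.shape)  # → (16, 16)
```

A trained decoder replaces this fixed similarity with learned cross-attention, then upsamples the coarse grid to pixel resolution.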
### CuVision Engine – Native C++/CUDA CV Framework
- Low-latency CV framework targeting cuDNN & cuBLAS primitives directly – zero framework overhead
- 20 custom CUDA kernels: classification, RetinaNet-based detection (FPN), Attention U-Net (ASPP)
- Designed for edge deployment on NVIDIA Jetson hardware
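Before a hand-written CUDA kernel is tuned, it needs a numerical contract to validate against; a reference for the simplest such primitive, direct 2D convolution, can be written in a few lines of NumPy (a sketch, not CuVision code):

```python
import numpy as np

def conv2d_direct(x, w):
    """Reference direct 2D convolution (valid padding, stride 1).
    A custom CUDA kernel must reproduce these outputs bit-for-bit
    (up to float tolerance) before being benchmarked against cuDNN.

    x: (H, W) input feature map; w: (kh, kw) filter.
    """
    H, W = x.shape
    kh, kw = w.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Each output element is the filter dotted with a patch.
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * w)
    return out

print(conv2d_direct(np.ones((4, 4)), np.ones((2, 2))))  # 3x3 grid of 4.0
```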
### Multimodal RAG Platform
- Modular RAG pipeline over text + images + audio
- Integrated Florence-2 (captioning) · Whisper (transcription) · DocLayout-YOLO (parsing)
- E5-small-v2 embeddings + FAISS retrieval → 94% faithfulness, 250 ms avg latency
- Deployed via Docker with MLflow tracking + LLM-as-a-judge evaluation suite
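The retrieval step above reduces to ranking corpus embeddings by similarity to the query embedding; a minimal NumPy stand-in for a flat inner-product index (what FAISS accelerates at scale) looks like this:

```python
import numpy as np

def retrieve(query_vec, index_vecs, k=3):
    """Toy dense retrieval over a flat index: rank stored embeddings by
    cosine similarity with the query and return the top-k indices.

    query_vec:  (D,) query embedding (e.g. from E5-small-v2).
    index_vecs: (N, D) corpus embeddings.
    """
    # Normalize so the inner product equals cosine similarity.
    q = query_vec / np.linalg.norm(query_vec)
    m = index_vecs / np.linalg.norm(index_vecs, axis=1, keepdims=True)
    sims = m @ q
    return np.argsort(-sims)[:k]  # indices of the k most similar docs

# Usage: with one-hot corpus vectors, the query's largest axis wins.
top = retrieve(np.array([0.1, 0.9, 0.1, 0.1]), np.eye(4), k=1)
print(top)  # → [1]
```

Swapping this for `faiss.IndexFlatIP` (or an approximate index) changes the speed, not the ranking semantics.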
Computer Vision  ━━━━━━━━━━━━━━━━━━━━  Semantic / Instance Segmentation · Object Detection · Classification
Medical Imaging  ━━━━━━━━━━━━━━━━━━━━  3D Volumetric Analysis
Explainable AI   ━━━━━━━━━━━━━━━━━━━━  Grad-CAM · Integrated Gradients · Guided Backprop
Multimodal AI    ━━━━━━━━━━━━━━━━━━━━  Vision-Language Models · Vision Large Language Models
Edge Systems     ━━━━━━━━━━━━━━━━━━━━  CUDA · cuDNN · Jetson Deployment

