# OpenVINO NPU Inference Benchmark Suite

Unlock the full potential of your AI PC. A professional-grade benchmarking framework designed to validate, test, and optimize AI inference performance on Intel NPU (Neural Processing Unit), CPU, and GPU.
## Table of Contents

- Overview
- Key Features
- Demo Results
- Supported Models
- Installation
- User Guide
- CLI Reference
- Hardware Support
- Troubleshooting
- Contributing
- License
## Overview

The OpenVINO NPU Inference Benchmark Suite is a comprehensive tool for developers, researchers, and hardware enthusiasts to measure the AI inference capability of their systems. It specifically targets Intel Core Ultra processors with integrated NPUs, providing deep insights into latency, throughput, and efficiency gains compared to traditional CPU inference.
Modern AI workloads (Generative AI, Computer Vision, LLMs) require specialized hardware. The NPU is a dedicated accelerator for these tasks, but measuring its real-world performance can be complex. This suite simplifies the process, offering:
- Direct Comparisons: CPU vs. GPU vs. NPU side-by-side.
- Optimized Pipelines: Automated conversion of PyTorch/ONNX models to NPU-friendly OpenVINO IR formats.
- Visual Analytics: Interactive web dashboards and professional HTML reports.
## Key Features

- **Multi-Device Support**: Seamlessly benchmark across CPU, Intel Arc/Integrated GPU, and Intel AI Boost NPU.
- **Interactive Dashboard**: A glassmorphic web UI to run tests and visualize real-time performance.
- **Professional Reports**: Generate detailed HTML reports with hardware specs and speedup metrics.
- **Model Zoo**: Curated collection of industry-standard models (ResNet, YOLO, BERT) pre-configured for the NPU.
- **Advanced Optimization**:
  - **INT8 Quantization**: Compress models for faster NPU inference with negligible accuracy loss.
  - **Static Shape Enforcement**: Automatically handles NPU-specific input requirements.
  - **Batch Sweeps**: Find the optimal batch size for maximum throughput.
- **Python API & CLI**: Flexible usage for both researchers (Python) and quick testers (CLI).
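The batch-sweep feature boils down to a simple optimization: throughput is batch size divided by per-batch latency, and the sweep picks the batch size that maximizes it. A minimal, self-contained sketch of that logic (the latency numbers below are made up for illustration, not real measurements):

```python
# Hypothetical sketch of batch-sweep selection logic: pick the batch size
# whose measured throughput (items per second) is highest.

def best_batch_size(latency_ms_by_batch: dict[int, float]) -> tuple[int, float]:
    """Return (batch_size, throughput_items_per_sec) with the highest throughput."""
    throughputs = {
        batch: batch / (latency_ms / 1000.0)  # items per second
        for batch, latency_ms in latency_ms_by_batch.items()
    }
    best = max(throughputs, key=throughputs.get)
    return best, throughputs[best]

# Illustrative per-batch latencies in milliseconds for a hypothetical model:
measured = {1: 2.0, 2: 3.0, 4: 5.5, 8: 12.0}
batch, throughput = best_batch_size(measured)
```

Note that larger batches often raise throughput at the cost of per-request latency, which is why the suite exposes both latency and throughput modes.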
## Demo Results

Real benchmark results from an Intel Core Ultra 7 255H with Intel AI Boost NPU:
| Model | CPU (ms) | NPU (ms) | Speedup |
|---|---|---|---|
| ResNet-50 | 73.1 | 8.5 | 8.6x |
| EfficientNet-B0 | 17.8 | 3.5 | 5.1x |
| MobileNetV3-Small | 3.6 | 1.2 | 3.0x |
Key Metrics:
- **8.9x** maximum NPU speedup
- **5.3x** average NPU speedup
- **1.1 ms** fastest NPU latency
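The speedup column is simply the ratio of mean CPU latency to mean NPU latency. A minimal sketch of how such metrics can be derived from raw timing samples (the sample values below are illustrative, not the suite's actual data):

```python
import statistics

def latency_summary(samples_ms: list[float]) -> dict[str, float]:
    """Summarize raw latency samples (milliseconds)."""
    return {
        "mean_ms": statistics.fmean(samples_ms),
        "min_ms": min(samples_ms),
        "p50_ms": statistics.median(samples_ms),
    }

def speedup(baseline_mean_ms: float, accelerated_mean_ms: float) -> float:
    """Speedup of the accelerated device relative to the baseline."""
    return baseline_mean_ms / accelerated_mean_ms

# Illustrative samples only:
cpu = latency_summary([72.0, 73.0, 74.0])
npu = latency_summary([8.4, 8.5, 8.6])
ratio = speedup(cpu["mean_ms"], npu["mean_ms"])  # mean CPU ms / mean NPU ms
```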
## Supported Models

The suite includes a diverse set of state-of-the-art models covering major AI domains.

### Image Classification
| Model | Description | Use Case |
|---|---|---|
| ResNet-50 | Deep residual network with 50 layers. | Image classification standard benchmark. |
| MobileNetV3 | Optimized for mobile/edge devices. | Low-latency mobile apps. |
| EfficientNet-B0 | Balanced accuracy and efficiency. | General purpose vision tasks. |
### Object Detection

| Model | Description | Use Case |
|---|---|---|
| YOLOv8 (Nano/Small) | "You Only Look Once" - Real-time object detection. | Security, autonomous systems, robotics. |
| YOLOv11 | Latest iteration of YOLO architecture. | Cutting-edge detection performance. |
### Transformers (NLP & Vision)

| Model | Description | Use Case |
|---|---|---|
| BERT Base | Bidirectional Encoder Representations from Transformers. | Text classification, QA, sentiment analysis. |
| DistilBERT | Smaller, faster version of BERT. | Efficient text processing. |
| ViT (Vision Transformer) | Transformer architecture applied to images. | Advanced image recognition. |
All models are automatically downloaded, converted to OpenVINO IR (Intermediate Representation), and optimized for the NPU.
## Installation

### Prerequisites

- Windows 10/11 or Linux
- Python 3.10 or higher
- Intel Core Ultra processor (Series 1 "Meteor Lake" or Series 2 "Lunar Lake") for NPU benchmarks
```bash
# Clone the repository
git clone https://github.com/singhraghvendra2104/OpenVINO-NPU-Inference-Benchmark-Suite.git
cd OpenVINO-NPU-Inference-Benchmark-Suite

# Install in editable mode
pip install -e .

# For INT8 quantization support
pip install -e ".[quantization]"

# For HuggingFace transformers support
pip install -e ".[transformers]"

# Install all optional dependencies
pip install -e ".[all]"
```

## User Guide

After installation, the `npu-benchmark` command becomes available. Here's how to get started:
```bash
# Step 1: Verify your system has NPU support
npu-benchmark verify

# Step 2: View available models
npu-benchmark models

# Step 3: Launch the web dashboard (easiest way)
npu-benchmark web

# Or run a quick benchmark from the CLI
npu-benchmark run resnet50 --iterations 100
```

### Web Dashboard

The easiest and most visual way to use the benchmark suite. Launches a local web server with a glassmorphic UI:

```bash
npu-benchmark web
```

Features:
- Open `http://127.0.0.1:5000` in your browser
- Select models from the Model Zoo
- Choose devices (CPU, NPU, GPU)
- View real-time benchmark progress
- Interactive performance charts
- One-click HTML report download
Options:

```bash
npu-benchmark web --host 0.0.0.0 --port 8080  # Custom host/port
npu-benchmark web --no-browser                # Don't auto-open browser
```

### Command-Line Interface

For automation, scripting, and headless environments.
```bash
# Basic benchmark
npu-benchmark run resnet50

# With options
npu-benchmark run yolov8n --iterations 200 --warmup 20 --device CPU --device NPU

# Specify batch size
npu-benchmark run efficientnet_b0 --batch-size 4

# Choose mode (latency or throughput)
npu-benchmark run mobilenet_v3_small --mode throughput
```

Compare models:

```bash
# Compare all classification models
npu-benchmark compare --category classification

# Compare specific models
npu-benchmark compare --models resnet50 --models yolov8n --models efficientnet_b0
```

Find the optimal batch size:

```bash
npu-benchmark batch-sweep yolov8n --batch-sizes 1,2,4,8,16
```

Quantize a model to INT8:

```bash
npu-benchmark quantize models/resnet50.xml --samples 100 --output resnet50_int8
```

Generate an HTML report:

```bash
npu-benchmark report --input ./benchmarks --output ./reports --theme dark
```

### Python API

Integrate benchmarking into your own applications:
```python
from npu_benchmark import BenchmarkRunner, BenchmarkConfig, DeviceType

# Configure benchmark
config = BenchmarkConfig(
    devices=[DeviceType.CPU, DeviceType.NPU],
    num_iterations=100,
    warmup_iterations=10,
    batch_size=1,
    save_results=True,
)

# Run benchmark
runner = BenchmarkRunner()
results = runner.run_benchmark("yolov8n", config)

# Access results
for device, metrics in results.results.items():
    print(f"{device}: {metrics.latency.mean_ms:.2f}ms")

# Print speedup
speedup = results.get_speedup("CPU", "NPU")
print(f"NPU Speedup: {speedup:.2f}x")
```

Work with the Model Zoo:

```python
from npu_benchmark.models import ModelZoo, ModelCategory

# List available models
zoo = ModelZoo()
models = zoo.list_models(category=ModelCategory.CLASSIFICATION)
for model in models:
    print(f"{model.name}: {model.description}")

# Get a specific model
resnet = zoo.get_model("resnet50")
print(f"Input shape: {resnet.input_shape}")
```

## CLI Reference

| Command | Description | Example |
|---|---|---|
| `info` | Show system and device information | `npu-benchmark info` |
| `verify` | Verify NPU availability and run a quick test | `npu-benchmark verify` |
| `models` | List available benchmark models | `npu-benchmark models --category detection` |
| `run` | Run a benchmark on a model | `npu-benchmark run resnet50 --iterations 100` |
| `compare` | Compare multiple models | `npu-benchmark compare --category classification` |
| `batch-sweep` | Find the optimal batch size | `npu-benchmark batch-sweep yolov8n` |
| `quantize` | Quantize a model to INT8 | `npu-benchmark quantize model.xml` |
| `report` | Generate an HTML report | `npu-benchmark report --theme dark` |
| `dashboard` | Launch the terminal dashboard | `npu-benchmark dashboard` |
| `web` | Launch the web dashboard | `npu-benchmark web` |
```bash
npu-benchmark --verbose <command>  # Enable verbose/debug output
npu-benchmark --help               # Show help
```

## Hardware Support

This suite is optimized for the Intel® Core™ Ultra processor family.
| Component | Description |
|---|---|
| NPU (Series 1 - Meteor Lake) | ~10 TOPS, ideal for sustained background AI workloads |
| NPU (Series 2 - Lunar Lake) | ~45+ TOPS, capable of heavy generative AI tasks |
| iGPU (Intel Arc Graphics) | High-throughput parallel processing |
| CPU | Fallback and baseline comparison standard |
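The CPU row above reflects a common selection pattern: prefer the NPU when present, fall back to the GPU, and use the CPU as the baseline of last resort. A hypothetical helper illustrating that order (device names follow OpenVINO's conventions; this is not the suite's actual code):

```python
def pick_device(available: list[str],
                preference: tuple[str, ...] = ("NPU", "GPU", "CPU")) -> str:
    """Pick the most preferred available device; CPU is the usual baseline."""
    for device in preference:
        if device in available:
            return device
    raise RuntimeError("No supported device found")

# Example inputs of the kind openvino's Core().available_devices returns:
assert pick_device(["CPU", "GPU", "NPU"]) == "NPU"
assert pick_device(["CPU", "GPU"]) == "GPU"
assert pick_device(["CPU"]) == "CPU"
```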
## Troubleshooting

### NPU Not Detected

- Windows: Download and install the latest Intel NPU driver.
- Linux: Install the `intel-npu-driver` package.
- Run `npu-benchmark verify` to check status.
### Model Fails to Load or Runs Slowly

- Ensure you have the latest OpenVINO version: `pip install openvino --upgrade`
- Check model compatibility in the Model Zoo.
- For transformer models (BERT), ensure `transformers` and `optimum[openvino]` are installed.
- Use INT8 quantization for better NPU performance: `npu-benchmark quantize`
## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

- Fork the repository.
- Create your feature branch (`git checkout -b feature/AmazingFeature`).
- Commit your changes (`git commit -m 'Add some AmazingFeature'`).
- Push to the branch (`git push origin feature/AmazingFeature`).
- Open a Pull Request.
## License

Distributed under the MIT License. See LICENSE for more information.
Built with ❤️ for the AI Community
