
OpenVINO NPU Inference Benchmark Suite 🚀


Unlock the full potential of your AI PC. A professional-grade benchmarking framework designed to validate, test, and optimize AI inference performance on Intel NPU (Neural Processing Unit), CPU, and GPU.





🌟 Overview

The OpenVINO NPU Inference Benchmark Suite is a comprehensive tool for developers, researchers, and hardware enthusiasts to measure the AI capability of their systems. It specifically targets Intel Core Ultra processors with integrated NPUs, providing deep insight into latency, throughput, and power-efficiency gains compared with traditional CPU inference.

Why Benchmarking Matters

Modern AI workloads (Generative AI, Computer Vision, LLMs) require specialized hardware. The NPU is a dedicated accelerator for these tasks, but measuring its real-world performance can be complex. This suite simplifies the process, offering:

  • Direct Comparisons: CPU vs. GPU vs. NPU side-by-side.
  • Optimized Pipelines: Automated conversion of PyTorch/ONNX models to NPU-friendly OpenVINO IR formats.
  • Visual Analytics: Interactive web dashboards and professional HTML reports.

✨ Key Features

  • 🚀 Multi-Device Support: Seamlessly benchmark across CPU, Intel Arc/Integrated GPU, and Intel AI Boost NPU.
  • 📊 Interactive Dashboard: A stunning, glassmorphic web UI to run tests and visualize real-time performance.
  • 📑 Professional Reports: Generate detailed HTML reports with hardware specs and speedup metrics.
  • 🔄 Model Zoo: Curated collection of industry-standard models (ResNet, YOLO, BERT) pre-configured for NPU.
  • 🧠 Advanced Optimization:
    • INT8 Quantization: Compress models for faster NPU inference with negligible accuracy loss.
    • Static Shape Enforcement: Automatically handles NPU-specific input requirements.
    • Batch Sweeps: Find the optimal batch size for maximum throughput.
  • 🐍 Python API & CLI: Flexible usage for both researchers (Python) and quick testers (CLI).

📊 Demo Results

Real benchmark results from Intel Core Ultra 7 255H with Intel AI Boost NPU:

| Model             | CPU (ms) | NPU (ms) | Speedup |
|-------------------|----------|----------|---------|
| ResNet-50         | 73.1     | 8.5      | 8.6x    |
| EfficientNet-B0   | 17.8     | 3.5      | 5.1x    |
| MobileNetV3-Small | 3.6      | 1.2      | 3.0x    |

Key Metrics:

  • ๐Ÿ† 8.9x Maximum NPU Speedup
  • ๐Ÿ“Š 5.3x Average NPU Speedup
  • โšก 1.1ms Fastest NPU Latency

๐ŸŒ View Live Demo Results โ†’


🤖 Supported Models (Model Zoo)

The suite includes a diverse set of state-of-the-art models covering major AI domains:

๐Ÿ‘๏ธ Computer Vision (Classification)

| Model           | Description                           | Use Case                                 |
|-----------------|---------------------------------------|------------------------------------------|
| ResNet-50       | Deep residual network with 50 layers. | Image classification standard benchmark. |
| MobileNetV3     | Optimized for mobile/edge devices.    | Low-latency mobile apps.                 |
| EfficientNet-B0 | Balanced accuracy and efficiency.     | General purpose vision tasks.            |

🎯 Object Detection

| Model              | Description                                      | Use Case                                |
|--------------------|--------------------------------------------------|-----------------------------------------|
| YOLOv8 (Nano/Small)| "You Only Look Once" real-time object detection. | Security, autonomous systems, robotics. |
| YOLOv11            | Latest iteration of the YOLO architecture.       | Cutting-edge detection performance.     |

๐Ÿ“ Natural Language Processing (NLP)

| Model                    | Description                                              | Use Case                                    |
|--------------------------|----------------------------------------------------------|---------------------------------------------|
| BERT Base                | Bidirectional Encoder Representations from Transformers. | Text classification, QA, sentiment analysis.|
| DistilBERT               | Smaller, faster version of BERT.                         | Efficient text processing.                  |
| ViT (Vision Transformer) | Transformer architecture applied to images.              | Advanced image recognition.                 |

All models are automatically downloaded, converted to OpenVINO IR (Intermediate Representation), and optimized for the NPU.


🛠️ Installation

Prerequisites

  • Windows 10/11 or Linux
  • Python 3.10 or higher
  • Intel Core Ultra Processor (Series 1 "Meteor Lake" or Series 2 "Lunar Lake")

Install from Source

# Clone the repository
git clone https://github.com/singhraghvendra2104/OpenVINO-NPU-Inference-Benchmark-Suite.git
cd OpenVINO-NPU-Inference-Benchmark-Suite

# Install in editable mode
pip install -e .

Optional Dependencies

# For INT8 quantization support
pip install -e ".[quantization]"

# For HuggingFace transformers support
pip install -e ".[transformers]"

# Install all optional dependencies
pip install -e ".[all]"

🚀 User Guide

Quick Start

After installation, the npu-benchmark command becomes available. Here's how to get started:

# Step 1: Verify your system has NPU support
npu-benchmark verify

# Step 2: View available models
npu-benchmark models

# Step 3: Launch the web dashboard (easiest way)
npu-benchmark web

# Or run a quick benchmark from CLI
npu-benchmark run resnet50 --iterations 100

1. Interactive Web Dashboard

The easiest and most visual way to use the benchmark suite. Launches a local web server with a beautiful glassmorphic UI.

npu-benchmark web

Once the server starts, open http://127.0.0.1:5000 in your browser.

Features:

  • Select models from the Model Zoo
  • Choose devices (CPU, NPU, GPU)
  • View real-time benchmark progress
  • Interactive performance charts
  • One-click HTML report download

Options:

npu-benchmark web --host 0.0.0.0 --port 8080  # Custom host/port
npu-benchmark web --no-browser                 # Don't auto-open browser

2. Command Line Interface (CLI)

For automation, scripting, and headless environments.

Run a Benchmark

# Basic benchmark
npu-benchmark run resnet50

# With options
npu-benchmark run yolov8n --iterations 200 --warmup 20 --device CPU --device NPU

# Specify batch size
npu-benchmark run efficientnet_b0 --batch-size 4

# Choose mode (latency or throughput)
npu-benchmark run mobilenet_v3_small --mode throughput

Compare Multiple Models

# Compare all classification models
npu-benchmark compare --category classification

# Compare specific models
npu-benchmark compare --models resnet50 --models yolov8n --models efficientnet_b0

Find Optimal Batch Size

npu-benchmark batch-sweep yolov8n --batch-sizes 1,2,4,8,16

Quantize to INT8

npu-benchmark quantize models/resnet50.xml --samples 100 --output resnet50_int8

Generate HTML Report

npu-benchmark report --input ./benchmarks --output ./reports --theme dark

3. Python API

Integrate benchmarking into your own applications:

from npu_benchmark import BenchmarkRunner, BenchmarkConfig, DeviceType

# Configure benchmark
config = BenchmarkConfig(
    devices=[DeviceType.CPU, DeviceType.NPU],
    num_iterations=100,
    warmup_iterations=10,
    batch_size=1,
    save_results=True
)

# Run benchmark
runner = BenchmarkRunner()
results = runner.run_benchmark("yolov8n", config)

# Access results
for device, metrics in results.results.items():
    print(f"{device}: {metrics.latency.mean_ms:.2f}ms")

# Print speedup
speedup = results.get_speedup("CPU", "NPU")
print(f"NPU Speedup: {speedup:.2f}x")

Using the Model Zoo

from npu_benchmark.models import ModelZoo, ModelCategory

# List available models
zoo = ModelZoo()
models = zoo.list_models(category=ModelCategory.CLASSIFICATION)

for model in models:
    print(f"{model.name}: {model.description}")

# Get specific model
resnet = zoo.get_model("resnet50")
print(f"Input shape: {resnet.input_shape}")

📋 CLI Command Reference

| Command     | Description                                | Example                                        |
|-------------|--------------------------------------------|------------------------------------------------|
| info        | Show system and device information         | npu-benchmark info                             |
| verify      | Verify NPU availability and run quick test | npu-benchmark verify                           |
| models      | List available benchmark models            | npu-benchmark models --category detection      |
| run         | Run benchmark on a model                   | npu-benchmark run resnet50 --iterations 100    |
| compare     | Compare multiple models                    | npu-benchmark compare --category classification|
| batch-sweep | Find optimal batch size                    | npu-benchmark batch-sweep yolov8n              |
| quantize    | Quantize model to INT8                     | npu-benchmark quantize model.xml               |
| report      | Generate HTML report                       | npu-benchmark report --theme dark              |
| dashboard   | Launch terminal dashboard                  | npu-benchmark dashboard                        |
| web         | Launch web dashboard                       | npu-benchmark web                              |

Global Options

npu-benchmark --verbose <command>  # Enable verbose/debug output
npu-benchmark --help               # Show help

💻 Hardware Support Details

This suite is optimized for the Intel® Core™ Ultra processor family.

| Component                    | Description                                            |
|------------------------------|--------------------------------------------------------|
| NPU (Series 1, Meteor Lake)  | ~10 TOPS, ideal for sustained background AI workloads  |
| NPU (Series 2, Lunar Lake)   | ~45+ TOPS, capable of heavy generative AI tasks        |
| iGPU (Intel Arc Graphics)    | High-throughput parallel processing                    |
| CPU                          | Fallback and baseline comparison standard              |

🔧 Troubleshooting

NPU Not Detected

  1. Windows: Download and install the Intel NPU Driver
  2. Linux: Install intel-npu-driver package
  3. Run npu-benchmark verify to check status

Model Conversion Errors

  • Ensure you have the latest OpenVINO version: pip install openvino --upgrade
  • Check model compatibility in the Model Zoo

Performance Issues

  • For transformer models (BERT), ensure transformers and optimum[openvino] are installed
  • Use INT8 quantization for better NPU performance: npu-benchmark quantize

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

  1. Fork the repository.
  2. Create your feature branch (git checkout -b feature/AmazingFeature).
  3. Commit your changes (git commit -m 'Add some AmazingFeature').
  4. Push to the branch (git push origin feature/AmazingFeature).
  5. Open a Pull Request.

📄 License

Distributed under the MIT License. See LICENSE for more information.


Built with ❤️ for the AI Community
