Add Execution Provider conformance test suite #28968
Parameterized GoogleTest suite (EpContract/EpConformanceTest) encoding universal IExecutionProvider invariants every EP must satisfy: stable Type(), valid preferred layout, CPU mem types map to CPU-accessible devices, non-null/repeatable preferred allocators, usable allocations, CPU data-transfer round-trip integrity, and consistent metadata queries. EPs are registered via factories (no static-init construction); unavailable EPs skip rather than fail. Adding a new EP is a single guarded line in GetEpConformanceParams().
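The factory-based registration and skip-on-unavailable behavior described above can be sketched without GoogleTest or ONNX Runtime. Everything here (MockEp, EpParam, GetMockParams, CheckStableType, Outcome) is a hypothetical stand-in invented for illustration; the real suite uses IExecutionProvider, GetEpConformanceParams(), and GTEST_SKIP.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for IExecutionProvider; not the real ORT interface.
struct MockEp {
  explicit MockEp(std::string type) : type_(std::move(type)) {}
  const std::string& Type() const { return type_; }

 private:
  std::string type_;
};

// One registry entry: a display name plus a factory. The factory may return
// nullptr when the EP is compiled in but unavailable at runtime.
struct EpParam {
  std::string name;
  std::function<std::unique_ptr<MockEp>()> factory;
};

enum class Outcome { kPassed, kFailed, kSkipped };

// Checks the "Type() is non-empty and stable" invariant for one entry.
inline Outcome CheckStableType(const EpParam& param) {
  std::unique_ptr<MockEp> first = param.factory();
  if (!first) return Outcome::kSkipped;  // unavailable EP: skip, do not fail
  std::unique_ptr<MockEp> second = param.factory();
  if (!second) return Outcome::kSkipped;

  const std::string call_a = first->Type();  // repeated calls on one instance
  const std::string call_b = first->Type();
  const bool ok = !call_a.empty() && call_a == call_b &&
                  call_a == second->Type();  // independent instances agree
  return ok ? Outcome::kPassed : Outcome::kFailed;
}

// Hypothetical analogue of GetEpConformanceParams(): it stores factories only,
// so no EP is constructed during static initialization.
inline std::vector<EpParam> GetMockParams() {
  std::vector<EpParam> params;
  params.push_back({"Cpu", [] { return std::make_unique<MockEp>("MockCpuEp"); }});
  params.push_back({"Unavailable", [] { return std::unique_ptr<MockEp>(); }});
  return params;
}
```

In this sketch, adding another EP is one more push_back line, which mirrors the single guarded line the commit message describes.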
Pull request overview
Adds a parameterized GoogleTest suite (EpContract/EpConformanceTest) under onnxruntime/test/framework/ to codify and enforce baseline invariants expected of all IExecutionProvider implementations (e.g., stable Type(), valid preferred layout, CPU memtype mapping, allocator behavior, optional data transfer correctness, and basic metadata query consistency).
Changes:
- Introduces a new parameterized conformance test fixture for IExecutionProvider contract invariants.
- Registers CPU EP instances (arena + non-arena) and conditionally registers additional EPs behind USE_* guards.
- Validates allocator usability and (when applicable) CPU↔CPU IDataTransfer::CopyTensor round-trip correctness.
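The round-trip check in the last bullet can be illustrated with a minimal sketch. MockDataTransfer, CanCopyCpuToCpu, Copy, and CheckCpuRoundTrip are hypothetical names, and plain byte vectors stand in for tensors; the real IDataTransfer::CopyTensor operates on Tensor objects and reports errors through a status value.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <numeric>
#include <vector>

// Hypothetical stand-in for a data-transfer object (not the ORT interface).
struct MockDataTransfer {
  bool can_copy_cpu_to_cpu = true;

  bool CanCopyCpuToCpu() const { return can_copy_cpu_to_cpu; }

  // Copies src into dst; fails if the sizes differ.
  bool Copy(const std::vector<std::uint8_t>& src,
            std::vector<std::uint8_t>& dst) const {
    if (src.size() != dst.size()) return false;
    if (!src.empty()) std::memcpy(dst.data(), src.data(), src.size());
    return true;
  }
};

enum class RoundTrip { kPreserved, kCorrupted, kNotApplicable };

// Copies a known byte pattern to a staging buffer and back, then compares.
// The check is reported as not applicable when the copy is not advertised.
inline RoundTrip CheckCpuRoundTrip(const MockDataTransfer& dt,
                                   std::size_t num_bytes) {
  if (!dt.CanCopyCpuToCpu()) return RoundTrip::kNotApplicable;

  std::vector<std::uint8_t> original(num_bytes);
  std::iota(original.begin(), original.end(), std::uint8_t{1});
  std::vector<std::uint8_t> staging(num_bytes, 0);
  std::vector<std::uint8_t> restored(num_bytes, 0);

  if (!dt.Copy(original, staging) || !dt.Copy(staging, restored)) {
    return RoundTrip::kCorrupted;
  }
  return restored == original ? RoundTrip::kPreserved : RoundTrip::kCorrupted;
}
```

Comparing the restored buffer against the original byte for byte is what makes the check an exact-preservation test rather than a "copy did not fail" smoke test.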
PreferredAllocatorsAllocateUsableMemory previously called Alloc()/Free() on every preferred allocator, only gating the host memory read/write behind UsesCpuMemory(). On WebGPU (arm64 Debug CI) the GpuBufferAllocator creates buffers mapped at creation; Free() routes through the buffer manager which throws EnforceBufferUnmapped ('Buffer is still mapped'), failing the test.
Restrict the standalone Alloc/Free exercise to CPU-accessible allocators, whose lifecycle is backend-agnostic, and GTEST_SKIP when an EP exposes no such allocator. Device allocators remain covered by PreferredAllocatorsAreNonNullAndRepeatable. Fixes the webgpu build-and-test (arm64, Debug) leg on PR #28968.
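A minimal sketch of the restricted check, assuming a hypothetical MockAllocator with a cpu_accessible flag (the real test decides this with UsesCpuMemory() and skips with GTEST_SKIP). The point it illustrates is that device allocators are never passed through Alloc()/Free() here, and that an EP with no CPU-accessible allocator yields a skip rather than a pass or a failure.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <vector>

// Hypothetical allocator stand-in; alloc_calls records whether it was used.
struct MockAllocator {
  bool cpu_accessible;
  int alloc_calls = 0;

  void* Alloc(std::size_t n) {
    ++alloc_calls;
    return std::malloc(n);
  }
  void Free(void* p) { std::free(p); }
};

enum class AllocCheck { kPassed, kFailed, kSkipped };

inline AllocCheck CheckUsableMemory(std::vector<MockAllocator>& allocators) {
  bool exercised_any = false;
  for (MockAllocator& alloc : allocators) {
    if (!alloc.cpu_accessible) continue;  // device allocators are left alone
    exercised_any = true;

    constexpr std::size_t kBytes = 64;
    void* p = alloc.Alloc(kBytes);
    if (p == nullptr) return AllocCheck::kFailed;

    // Host write followed by a host read; safe only for CPU-accessible memory.
    std::memset(p, 0xAB, kBytes);
    const bool ok = static_cast<unsigned char*>(p)[kBytes - 1] == 0xAB;
    alloc.Free(p);
    if (!ok) return AllocCheck::kFailed;
  }
  // No CPU-accessible allocator at all: report a skip, not a pass.
  return exercised_any ? AllocCheck::kPassed : AllocCheck::kSkipped;
}
```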
Address PR review: in ORT_USE_EP_API_ADAPTERS builds, DefaultWebGpuExecutionProvider() ORT_ENFORCEs (aborting the entire unit-test run) when the dynamic plugin EP is initialized to a different EP name, instead of cleanly returning nullptr. Mirror the guard used by base_tester.cc and default_providers.cc by listing the built-in WebGPU EP only under 'defined(USE_WEBGPU) && !defined(ORT_USE_EP_API_ADAPTERS)'.
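The guard quoted above can be illustrated as follows. Only the preprocessor condition comes from the commit message; RegisteredBuiltInEps and the string names are hypothetical and stand in for the real parameter list.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: returns the names of the built-in EPs that would be
// listed for conformance testing in this build configuration.
inline std::vector<std::string> RegisteredBuiltInEps() {
  std::vector<std::string> names{"Cpu"};
#if defined(USE_WEBGPU) && !defined(ORT_USE_EP_API_ADAPTERS)
  // Listed only when WebGPU is built in and the EP API adapters are not in
  // use, so the adapter build never reaches the enforcing factory.
  names.push_back("WebGpu");
#endif
  return names;
}
```

Compiled with neither macro defined, the function returns only the CPU entry; excluding the entry at compile time avoids calling a factory that would abort the whole test run.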
it's a good idea to have more tests that EP authors can use for verification.
I like the idea of defining invariants clearly
Thanks @edgchen1 — agreed, the plugin A few thoughts on getting there:
So my proposal: land this PR as the |
I suppose it wouldn't hurt to test both the internal interface and the plugin EP interface, but I think the latter would be more useful for EP development going forward. Some of the ORT-owned EPs (like CUDA and WebGPU) have already been converted to plugin EPs.
Description
Adds a parameterized GoogleTest suite (EpContract/EpConformanceTest) that encodes invariants every IExecutionProvider implementation must satisfy, independent of the hardware backend.
New file: onnxruntime/test/framework/execution_provider_conformance_test.cc (auto-discovered by the existing test/framework/*.cc CMake glob, so no build-file changes are needed).
The suite currently runs against the CPU EP in both arena and non-arena allocator configurations (14 instances = 7 invariants × 2 configs, all passing). CUDA/DML/WebGPU/XNNPACK are registered behind USE_* guards; a factory that returns nullptr (EP compiled but unavailable at runtime) skips rather than fails.
Invariants enforced
- Type() is non-empty and stable across repeated calls and across independent instances.
- GetPreferredLayout() returns a defined DataLayout (NCHW/NHWC).
- CPU memory types (OrtMemTypeCPUInput/OrtMemTypeCPUOutput) map to CPU-accessible devices.
- CreatePreferredAllocators() returns no null entries and is repeatable (documented as stateless).
- Preferred allocators return usable allocations.
- IDataTransfer CPU↔CPU round-trip preserves data exactly (when the EP provides a data transfer that advertises the copy).
- Metadata queries are consistent (GetDeviceId() == GetDevice().Id()).
Why this change matters
ONNX Runtime depends on 20+ execution providers that are all expected to be substitutable behind the IExecutionProvider interface, which is a Liskov-substitution contract. Today those expectations are implicit: scattered across header prose, framework code that silently assumes them, and tribal knowledge. There is no single place that states "every EP must do X," and nothing mechanically prevents a new or modified EP from violating them.
That gap has real cost:
OrtMemTypeCPUInput, and the breakage only surfaces deep inside the framework (input/output staging copies, layout transforms, kernel assignment) as a confusing crash or wrong result, far from the root cause.
This suite converts those implicit assumptions into enforced, executable contracts. It is deliberately conservative: it asserts only documented, backend-agnostic guarantees and never dereferences non-CPU memory from the test thread, so it is safe to run for EPs whose device memory cannot be inspected on the host. Adoption cost is intentionally minimal: adding an EP to the coverage is a single guarded line in GetEpConformanceParams().
Motivation and Context
Testing
Built onnxruntime_test_all (Windows, RelWithDebInfo) and ran the suite:
Relationship to EP capability segregation (#29087)
This suite currently asserts only invariants that apply to every EP, because those are the methods every IExecutionProvider exposes. It cannot yet enforce invariants for optional capabilities (graph capture/replay, TunableOp tuning, subgraph compilation, and data-layout preference) because there is no way to tell which EPs actually support them. That gap shows up directly in MetadataQueriesAreCallable, where GetTuningContext() can only be exercised as a (void)ep->GetTuningContext(); "doesn't crash" smoke check rather than a real assertion.
The capability-segregation work in #29087 (which adds Get*Capability() query hooks returning a narrow capability mix-in, or nullptr when unsupported) unlocks capability-scoped conformance tests that run only on EPs that advertise a given capability:
Once #29087 lands, this suite can (a) upgrade the GetTuningContext() smoke check into a real, scoped assertion and (b) extend coverage to graph-capture / compile / data-layout invariants, running each check only where it applies instead of calling no-op defaults on every EP.
The two PRs are complementary and independent: this suite enforces the IExecutionProvider contract, while #29087 makes the optional parts of that contract individually queryable. They touch different files (execution_provider_conformance_test.cc vs execution_provider_capabilities_test.cc), and #29087's execution_provider.h change is purely additive, so there is no ordering dependency or conflict between them.
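As a rough illustration of what a capability-scoped check could look like, the sketch below uses a query hook that returns a capability mix-in or nullptr. All names (TuningCapability, MockEpBase, GetTuningCapability, PlainEp, TunableEp, CheckTuningToggle) are hypothetical and are not taken from #29087; only the "return nullptr when unsupported" shape comes from the description above.

```cpp
#include <cassert>

// Hypothetical capability mix-in (illustrative; not the API from #29087).
struct TuningCapability {
  virtual ~TuningCapability() = default;
  virtual bool IsTuningEnabled() const = 0;
  virtual void EnableTuning() = 0;
};

// Hypothetical EP base: the query hook returns nullptr when the capability
// is not supported.
struct MockEpBase {
  virtual ~MockEpBase() = default;
  virtual TuningCapability* GetTuningCapability() { return nullptr; }
};

// An EP that does not advertise tuning.
struct PlainEp : MockEpBase {};

// An EP that advertises tuning by implementing the mix-in.
struct TunableEp : MockEpBase, TuningCapability {
  bool enabled = false;
  bool IsTuningEnabled() const override { return enabled; }
  void EnableTuning() override { enabled = true; }
  TuningCapability* GetTuningCapability() override { return this; }
};

enum class ScopedResult { kPassed, kFailed, kSkipped };

// Runs a real assertion only where the capability is advertised; other EPs
// are skipped instead of exercising a no-op default.
inline ScopedResult CheckTuningToggle(MockEpBase& ep) {
  TuningCapability* cap = ep.GetTuningCapability();
  if (cap == nullptr) return ScopedResult::kSkipped;
  cap->EnableTuning();
  return cap->IsTuningEnabled() ? ScopedResult::kPassed : ScopedResult::kFailed;
}
```

With a hook of this shape, the check can state an actual postcondition (enabling tuning is observable) where today only a "doesn't crash" call is possible.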