Segregate IExecutionProvider optional capabilities into mix-in interfaces #29087
GopalakrishnanN wants to merge 3 commits into
…aces Introduce IGraphCaptureCapability, ITuningCapability, and ICompileCapability mix-in interfaces that cluster the optional graph-capture, tuning, and compilation methods currently spread across IExecutionProvider as defaulted virtuals. Add non-breaking capability-query hooks (GetGraphCaptureCapability/GetTuningCapability/GetCompileCapability) that default to nullptr so callers can discover support by capability instead of relying on the large defaulted-virtual surface. Legacy virtuals are retained for compatibility; this is an additive first step toward migrating EPs off them. Add a CPU-only unit test proving unsupported capabilities return null, an exposed capability is queryable and usable, and capabilities are independent.
…erface Add IDataLayoutCapability grouping the coupled GetPreferredLayout() and ShouldConvertDataLayoutForOp() virtuals, with a GetDataLayoutCapability() query hook on IExecutionProvider (defaults to nullptr). Additive and non-breaking; only NHWC-preferring EPs implement it. Includes fake-EP unit tests.
Pull request overview
This PR introduces a new set of optional-capability mix-in interfaces for IExecutionProvider and adds typed query hooks on IExecutionProvider so callers can explicitly detect and use supported capabilities (graph capture, tuning, compilation, data-layout) without depending on the full base interface.
Changes:
- Add `execution_provider_capabilities.h` defining `IGraphCaptureCapability`, `ITuningCapability`, `IDataLayoutCapability`, and `ICompileCapability` (build-guarded).
- Add `Get*Capability()` query hooks to `IExecutionProvider` (defaulting to `nullptr`) to expose supported mix-ins.
- Add unit tests using lightweight fake EPs to validate the query mechanism and capability independence.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| onnxruntime/test/framework/execution_provider_capabilities_test.cc | Adds unit tests exercising the new capability query hooks with fake EP implementations. |
| include/onnxruntime/core/framework/execution_provider.h | Adds forward declarations and the new Get*Capability() query hooks on IExecutionProvider. |
| include/onnxruntime/core/framework/execution_provider_capabilities.h | Introduces the segregated optional-capability mix-in interface definitions. |
| /** Return this EP's TunableOp tuning capability, or nullptr if unsupported. */ | ||
| virtual ITuningCapability* GetTuningCapability() const noexcept { return nullptr; } | ||
| /** Return this EP's data-layout preference capability, or nullptr if unsupported. */ | ||
| virtual IDataLayoutCapability* GetDataLayoutCapability() const noexcept { return nullptr; } |
| ITuningCapability* GetTuningCapability() const noexcept override { | ||
| return const_cast<TuningEp*>(this); | ||
| } |
| IDataLayoutCapability* GetDataLayoutCapability() const noexcept override { | ||
| return const_cast<DataLayoutEp*>(this); | ||
| } |
| IDataLayoutCapability* dl = base.GetDataLayoutCapability(); | ||
| ASSERT_NE(dl, nullptr) << "EP advertising a data-layout preference must return a non-null capability."; |
…dispatch Add positive 'queryable and usable' tests for ITuningCapability (call-counted routing) and ICompileCapability (Compile + ValidateCompiledModelCompatibilityInfo), a multi-capability EP test verifying correct mix-in dispatch under multiple inheritance, and compile-null independence checks. 7 tests pass.
| // An EP that supports graph capture, implemented via the segregated mix-in and | ||
| // surfaced through the GetGraphCaptureCapability() query hook. | ||
| class GraphCaptureEp : public IExecutionProvider, public IGraphCaptureCapability { |
class GraphCaptureEp : public IExecutionProvider, public IGraphCaptureCapability
The goal of this PR is not clearly articulated. What does it gain: memory, performance?
It does make the object's binary layout more complicated, since the object now has more than one virtual table and any cast between its bases has to be done very carefully.
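To make the layout concern concrete, here is a minimal sketch of what multiple inheritance from two polymorphic bases implies for pointer values. All type and function names below (`ProviderBaseSketch`, `CaptureMixinSketch`, `CaptureProviderSketch`, `BaseSubobjectsDiffer`, `AsMixin`) are hypothetical stand-ins, not the ONNX Runtime declarations.

```cpp
// Hypothetical stand-ins for IExecutionProvider and IGraphCaptureCapability.
struct ProviderBaseSketch {
  virtual ~ProviderBaseSketch() = default;
  virtual int Id() const { return 1; }
};

struct CaptureMixinSketch {
  virtual ~CaptureMixinSketch() = default;
  virtual bool IsGraphCaptureEnabled() const = 0;
};

// Two polymorphic bases: the object holds two distinct base subobjects,
// each with its own virtual-table pointer on typical ABIs.
struct CaptureProviderSketch : ProviderBaseSketch, CaptureMixinSketch {
  bool IsGraphCaptureEnabled() const override { return true; }
};

// True when the two base subobjects sit at different addresses, which means
// converting to one of them needs a compiler-generated pointer adjustment.
inline bool BaseSubobjectsDiffer(CaptureProviderSketch& ep) {
  ProviderBaseSketch* as_base = &ep;   // implicit upcast
  CaptureMixinSketch* as_mixin = &ep;  // implicit upcast, address adjusted
  return static_cast<void*>(as_base) != static_cast<void*>(as_mixin);
}

// The safe conversion: an implicit (or static_cast) upcast from the concrete
// type, which lets the compiler apply the adjustment.
inline CaptureMixinSketch* AsMixin(CaptureProviderSketch& ep) { return &ep; }

// Unsafe by contrast: reinterpret_cast<CaptureMixinSketch*>(base_ptr), or a
// round trip through void*, skips the adjustment and yields a pointer that
// must not be used as a CaptureMixinSketch.
```

A virtual hook that returns `this` from the concrete class relies on the implicit upcast shown in `AsMixin`, so the adjustment is applied correctly; the risk sits with any code that converts between the two bases by other means.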
| virtual IGraphCaptureCapability* GetGraphCaptureCapability() noexcept { return nullptr; } | ||
| /** Return this EP's TunableOp tuning capability, or nullptr if unsupported. */ | ||
| virtual ITuningCapability* GetTuningCapability() const noexcept { return nullptr; } |
Now, instead of those defaulted methods, we have to deal with hook methods that cast to the correct interface, dispatching to the correct virtual table (the object now has more than one). However, this will not compile cleanly if the method is const.
If you return this from a const method, it carries a const qualifier that has to be cast away. We do not want that, yet the const_cast is now dictated by the interface.
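The const issue raised here can be reproduced in a few lines. The sketch below uses hypothetical stand-in types (`ITuningCapabilitySketch`, `ProviderSketch`, `TuningEpSketch`) with a simplified `GetTuningContext()` that returns an `int` rather than the real context pointer. The second hook, `GetTuningCapabilityConst()`, is not part of the PR; it is shown only as one possible alternative, under the assumption that every method on the capability is itself `const`.

```cpp
// Hypothetical, simplified stand-in for ITuningCapability.
struct ITuningCapabilitySketch {
  virtual ~ITuningCapabilitySketch() = default;
  virtual int GetTuningContext() const = 0;  // int stands in for ITuningContext*
};

// Hypothetical stand-in for IExecutionProvider.
struct ProviderSketch {
  virtual ~ProviderSketch() = default;

  // Shape used in the PR: const hook returning a pointer to non-const.
  virtual ITuningCapabilitySketch* GetTuningCapability() const noexcept {
    return nullptr;
  }

  // Assumed alternative (not in the PR): const hook returning pointer-to-const.
  virtual const ITuningCapabilitySketch* GetTuningCapabilityConst() const noexcept {
    return nullptr;
  }
};

struct TuningEpSketch : ProviderSketch, ITuningCapabilitySketch {
  int GetTuningContext() const override { return 7; }

  ITuningCapabilitySketch* GetTuningCapability() const noexcept override {
    // Inside a const member function `this` is `const TuningEpSketch*`, so a
    // plain `return this;` does not compile; the signature forces the cast.
    return const_cast<TuningEpSketch*>(this);
  }

  const ITuningCapabilitySketch* GetTuningCapabilityConst() const noexcept override {
    return this;  // const pointer converts implicitly, no cast needed
  }
};
```

Whether the pointer-to-const variant is acceptable depends on the capability having no mutating methods; a non-const hook (as the PR uses for graph capture) is the other way to avoid the cast.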
| // ITuningCapability. Returns nullptr (no real tuning state); a call counter | ||
| // proves the call is routed to the concrete implementation through the mix-in. | ||
| ITuningContext* GetTuningContext() const override { |
| int tuning_query_count() const { return tuning_query_count_; } | ||
| private: | ||
| mutable int tuning_query_count_ = 0; |
Description
This PR begins segregating the large `IExecutionProvider` interface into small, focused optional-capability mix-in interfaces, following the Interface Segregation Principle (ISP).

`IExecutionProvider` has accumulated a large number of defaulted virtual methods. Many of them represent optional capabilities that only a subset of execution providers implement: CUDA-graph capture/replay, TunableOp tuning, ahead-of-time/JIT compilation of fused subgraphs, and non-default data-layout preference. Bundling them all on the base class couples every EP and every caller to the union of all capabilities.

This change introduces narrow mix-in interfaces in a new header, `core/framework/execution_provider_capabilities.h`, and a matching set of `Get*Capability()` query hooks on `IExecutionProvider` that return the capability pointer when supported and `nullptr` otherwise.

Capability clusters included in this PR:

| Capability interface | Methods |
|---|---|
| `IGraphCaptureCapability` | `IsGraphCaptureEnabled`, `IsGraphCaptured`, `ReplayGraph`, `ReleaseCapturedGraph`, `GetGraphCaptureNodeAssignmentPolicy` |
| `ITuningCapability` | `GetTuningContext` |
| `ICompileCapability` (guarded by non-minimal/extended-minimal build) | `Compile`, `GetCompiledModelCompatibilityInfo`, `ValidateCompiledModelCompatibilityInfo` |
| `IDataLayoutCapability` | `GetPreferredLayout`, `ShouldConvertDataLayoutForOp` |

Query hooks added to `IExecutionProvider` (all default to `nullptr`): `GetGraphCaptureCapability()`, `GetTuningCapability()`, `GetCompileCapability()`, `GetDataLayoutCapability()`.

Motivation and Context
`IExecutionProvider` is the single most widely implemented and widely called interface in ORT. Its breadth means:

Segregating optional capabilities into mix-ins addresses these problems directly while remaining fully backward compatible.
Advantages of this segregation
- Interface Segregation Principle: callers depend only on what they use. A caller that needs graph capture can take an `IGraphCaptureCapability*` and depend on five cohesive methods instead of the entire `IExecutionProvider` surface. This narrows compile-time coupling and makes dependencies explicit.
- Explicit, typed capability negotiation. `if (auto* gc = ep.GetGraphCaptureCapability())` is a single, type-safe question with a clear yes/no answer. It replaces the current pattern where a caller invokes a defaulted virtual (e.g. `IsGraphCaptureEnabled()`) on every EP and relies on a default no-op to mean "unsupported." Absence of a capability becomes a first-class, checkable state rather than an implicit default.
- Cohesion: related methods travel together. Graph capture's five methods (enable/captured/replay/release/policy) only make sense as a set. Grouping them in one interface documents that contract and prevents partial/inconsistent overrides scattered across the base class.
- Safer defaults. The hooks default to `nullptr`, so callers must explicitly handle "capability absent." This is safer than today's defaulted virtuals, where a silent no-op default can mask a missing implementation and hide bugs.
- Reduced conceptual surface of the base class. New EP authors see a small core `IExecutionProvider` plus a menu of opt-in capability interfaces, instead of a wall of defaulted virtuals where it is unclear which are core and which are optional.
- Better testability. Each capability can be implemented and exercised by a tiny fake EP in isolation, with no need to subclass the full `IExecutionProvider` or bring up a real GPU/NPU backend. The included tests demonstrate this: `DataLayoutEp`, `GraphCaptureEp`, and `TuningEp` are a few lines each and run entirely on CPU.
- Incremental, non-breaking migration path. The change is purely additive: the legacy per-capability virtuals on `IExecutionProvider` remain in place, so existing EPs and callers are unaffected. EPs can implement a mix-in and return it from the hook at their own pace; callers can migrate to the narrow interface independently. Once all implementers and callers of a cluster have moved over, that cluster's legacy virtuals can be removed from the base, giving a concrete path to actually shrink `IExecutionProvider` later.
- Clear ownership and discoverability. Capabilities are named, self-documenting types (`ICompileCapability`, `IDataLayoutCapability`) rather than loosely related virtuals. This improves grep-ability and makes the optional surface easy to enumerate.
- Const-correctness aligned with intent. Each hook mirrors the const-ness of its underlying operations: graph capture is non-const (replay mutates device state), while tuning and data-layout queries are `const`. This bakes the read/write contract into the interface.
- Foundation for future capability work. The same pattern extends naturally to other optional clusters (e.g. EPContext node generation, data-transfer/external-data loading, profiling, stream-handler registration), so subsequent segregation lands consistently.
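The capability negotiation described in the list above can be sketched from the caller's side as follows. This is an illustration under assumptions, not the PR's code: the types (`IGraphCaptureCapabilitySketch`, `ExecutionProviderSketch`, `CpuLikeEpSketch`, `GraphCaptureEpSketch`), the helper `RunOnce`, and the simplified `bool` return of `ReplayGraph` are all hypothetical, and only three of the five graph-capture methods are modeled.

```cpp
#include <string>

// Hypothetical, reduced stand-in for IGraphCaptureCapability.
struct IGraphCaptureCapabilitySketch {
  virtual ~IGraphCaptureCapabilitySketch() = default;
  virtual bool IsGraphCaptureEnabled() const = 0;
  virtual bool IsGraphCaptured(int annotation_id) const = 0;
  virtual bool ReplayGraph(int annotation_id) = 0;  // bool stands in for a status
};

// Hypothetical stand-in for IExecutionProvider with one query hook.
struct ExecutionProviderSketch {
  virtual ~ExecutionProviderSketch() = default;
  virtual IGraphCaptureCapabilitySketch* GetGraphCaptureCapability() noexcept {
    return nullptr;  // default: capability absent
  }
};

// An EP without graph capture overrides nothing.
struct CpuLikeEpSketch : ExecutionProviderSketch {};

// An EP with graph capture implements the mix-in and returns it from the hook.
struct GraphCaptureEpSketch : ExecutionProviderSketch, IGraphCaptureCapabilitySketch {
  bool IsGraphCaptureEnabled() const override { return true; }
  bool IsGraphCaptured(int) const override { return captured; }
  bool ReplayGraph(int) override { ++replay_count; return true; }
  IGraphCaptureCapabilitySketch* GetGraphCaptureCapability() noexcept override {
    return this;  // non-const hook, so no const_cast is needed
  }
  bool captured = true;
  int replay_count = 0;
};

// Caller: asks once for the capability and handles its absence explicitly.
inline std::string RunOnce(ExecutionProviderSketch& ep, int annotation_id) {
  if (auto* gc = ep.GetGraphCaptureCapability()) {
    if (gc->IsGraphCaptureEnabled() && gc->IsGraphCaptured(annotation_id)) {
      return gc->ReplayGraph(annotation_id) ? "replayed" : "replay-failed";
    }
    return "capture-pending";
  }
  return "no-graph-capture";
}
```

In this sketch the caller never touches the wider provider surface after the query: everything it needs for replay is reachable through the narrow pointer, and the "unsupported" case is a distinct branch rather than a no-op default.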
Scope and design notes
- `ICompileCapability` is guarded by `#if !defined(ORT_MINIMAL_BUILD) || defined(ORT_EXTENDED_MINIMAL_BUILD)`, mirroring the build availability of the legacy compilation virtuals.

Validation
- `onnxruntime_test_all` (RelWithDebInfo).
- The tests in `onnxruntime/test/framework/execution_provider_capabilities_test.cc` use lightweight fake EPs and run on CPU.

The tests verify that a capability-less EP returns `nullptr` from every hook, that an EP advertising a capability is reachable and usable through the narrow mix-in, and that capabilities are independent (implementing one does not imply another).
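The three properties checked by the tests (null by default, usable when advertised, independent of one another) can be illustrated with a framework-free sketch. Everything here is hypothetical: the `*Sketch` types, the `HasTuningContext()` method, the `LayoutSketch` enum, and the `SupportedCapabilities` helper are stand-ins chosen for illustration, and the PR's actual tests are written against the real headers.

```cpp
#include <string>

enum class LayoutSketch { NCHW, NHWC };  // hypothetical layout enum

struct IGraphCaptureCapSketch {
  virtual ~IGraphCaptureCapSketch() = default;
  virtual bool IsGraphCaptureEnabled() const = 0;
};
struct ITuningCapSketch {
  virtual ~ITuningCapSketch() = default;
  virtual bool HasTuningContext() const = 0;  // hypothetical method
};
struct IDataLayoutCapSketch {
  virtual ~IDataLayoutCapSketch() = default;
  virtual LayoutSketch GetPreferredLayout() const = 0;
};

// Hypothetical provider base: every hook defaults to nullptr.
struct EpSketch {
  virtual ~EpSketch() = default;
  virtual IGraphCaptureCapSketch* GetGraphCaptureCapability() noexcept { return nullptr; }
  virtual ITuningCapSketch* GetTuningCapability() const noexcept { return nullptr; }
  virtual IDataLayoutCapSketch* GetDataLayoutCapability() const noexcept { return nullptr; }
};

struct PlainEpSketch : EpSketch {};  // advertises nothing

struct TuningOnlyEpSketch : EpSketch, ITuningCapSketch {
  bool HasTuningContext() const override { return false; }
  ITuningCapSketch* GetTuningCapability() const noexcept override {
    return const_cast<TuningOnlyEpSketch*>(this);  // const hook forces the cast
  }
};

// Two capabilities on one EP: each hook must resolve to its own mix-in.
struct MultiEpSketch : EpSketch, IGraphCaptureCapSketch, IDataLayoutCapSketch {
  bool IsGraphCaptureEnabled() const override { return true; }
  LayoutSketch GetPreferredLayout() const override { return LayoutSketch::NHWC; }
  IGraphCaptureCapSketch* GetGraphCaptureCapability() noexcept override { return this; }
  IDataLayoutCapSketch* GetDataLayoutCapability() const noexcept override {
    return const_cast<MultiEpSketch*>(this);
  }
};

// Lists the capabilities an EP advertises, in a fixed order.
inline std::string SupportedCapabilities(EpSketch& ep) {
  std::string out;
  auto add = [&out](const char* name) {
    if (!out.empty()) out += ",";
    out += name;
  };
  if (ep.GetGraphCaptureCapability() != nullptr) add("graph-capture");
  if (ep.GetTuningCapability() != nullptr) add("tuning");
  if (ep.GetDataLayoutCapability() != nullptr) add("data-layout");
  return out;
}
```

Because each fake EP is only a handful of lines and needs no device, checks of this kind can run in any CPU-only test job.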