amdxdna: per-context DPM scaling, PMF counter integration, and DPM shim tests #1249
Merged
Port the Platform Management Framework (PMF) integration from the staging driver to the OOT driver. When CONFIG_AMD_PMF is available (kernel 7.0+), the driver queries live NPU clock frequencies, power draw, and per-column utilization from the PMF subsystem instead of relying on stale cached values.

Changes:
- Add HAVE_7_0_amd_pmf_get_npu_data compile test to configure_kernel.sh
- Add update_counters callback to aie_hw_ops and wire npu4_update_counters
- Implement npu4_update_counters() in aie2_smu.c using amd_pmf_get_npu_data
- Update aie2_query_clock_metadata() and aie2_query_resource_info() to refresh counters via PMF before reporting
- Rewrite aie2_query_sensors() to return real power and per-column utilization from PMF, adding AMDXDNA_SENSOR_TYPE_COLUMN_UTILIZATION
- Add MODULE_IMPORT_NS(AMD_PMF) with kernel version compat
- Unify config_kernel.h generation for both PCI and OF Makefile builds

Signed-off-by: Nishad Saraf <nishads@amd.com>
Add three DPM tests for NPU4-series devices:
- TEST_dpm_noop_no_qos: verify that a context without fps/latency QoS defaults to the maximum DPM level
- TEST_dpm_refcount_scaling: incrementally create contexts with increasing GOPs demand, verify DPM level scales up to match, then destroy them in reverse order and verify DPM scales back down
- TEST_dpm_power_modes: cycle through LOW, MEDIUM, HIGH, and TURBO power modes and verify H-clock matches the expected DPM table entry

Supporting changes:
- Add QoS-parameterized hw_ctx constructor to hwctx.h
- Add submit_noop() helper to force FW context creation on OOT driver, where contexts are virtual until first command submission
- Add verify_hclk() with polling to account for clock ramp-up latency
- Use 2% H-clock margin for hardware tolerance

Signed-off-by: Nishad Saraf <nishads@amd.com>
maxzhen approved these changes on Apr 13, 2026
Summary
Per-context DPM level reference counting (staging driver): Introduce
per-DPM-level refcounts so each hardware context requests the minimum DPM
level satisfying its QoS parameters (GOPs, FPS, latency) at creation and
releases it at destruction. The driver programs the hardware to the highest
level with active references, skipping redundant set_dpm calls. Separate
aie2_pm_init() into init and resume paths, add POWER_MODE_LOW/MEDIUM, set
sys_eff_factor to 2, and add per-platform col_opc for DPM calculation.
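
The refcounting scheme described above can be sketched roughly as follows. This is a minimal, user-space illustration and not the driver's code: the names `dpm_state`, `dpm_get`, `dpm_put`, and `DPM_LEVEL_COUNT`, the number of levels, and the fallback to level 0 when no references remain are all assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

#define DPM_LEVEL_COUNT 8 /* hypothetical number of DPM levels */

struct dpm_state {
    uint32_t refcnt[DPM_LEVEL_COUNT]; /* contexts currently requesting each level */
    int      cur_level;               /* level last programmed, -1 if never set */
    uint32_t set_dpm_calls;           /* counts simulated hardware writes */
};

/* Stand-in for the hardware set_dpm call. */
static void dpm_program(struct dpm_state *s, int level)
{
    s->cur_level = level;
    s->set_dpm_calls++;
}

/* Highest level with at least one reference, or `fallback` if none. */
static int dpm_highest_active(const struct dpm_state *s, int fallback)
{
    for (int i = DPM_LEVEL_COUNT - 1; i >= 0; i--)
        if (s->refcnt[i])
            return i;
    return fallback;
}

/* Reprogram only when the target differs from the current level. */
static void dpm_update(struct dpm_state *s)
{
    int target = dpm_highest_active(s, 0); /* assumption: idle means level 0 */

    if (target != s->cur_level)
        dpm_program(s, target);
}

/* Called at context creation with the minimum level that meets its QoS. */
void dpm_get(struct dpm_state *s, int level)
{
    assert(level >= 0 && level < DPM_LEVEL_COUNT);
    s->refcnt[level]++;
    dpm_update(s);
}

/* Called at context destruction with the same level passed to dpm_get(). */
void dpm_put(struct dpm_state *s, int level)
{
    assert(level >= 0 && level < DPM_LEVEL_COUNT);
    assert(s->refcnt[level] > 0);
    s->refcnt[level]--;
    dpm_update(s);
}
```

With this shape, a second context requesting a level at or below the current one only bumps a counter, and the simulated hardware is touched only when the highest referenced level actually changes.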
PMF counter integration (OOT driver): Port Platform Management Framework
support from the staging driver. When CONFIG_AMD_PMF is available (kernel
7.0+), query live NPU clock frequencies, power draw, and per-column
utilization via amd_pmf_get_npu_data() instead of returning stale cached
values. Add AMDXDNA_SENSOR_TYPE_COLUMN_UTILIZATION to the UAPI. Unify
config_kernel.h generation for both PCI and OF Makefile builds so --nocmake
builds also detect kernel features.
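
The optional-callback pattern implied here (refresh counters when a PMF-backed hook is wired, otherwise report what is cached) might look like the sketch below. It is a user-space illustration only: `npu_dev`, `hw_ops`, `npu_counters`, `query_counters`, and the two `fake_pmf_*` functions are hypothetical stand-ins, the numeric values are made up, and keeping the cached values when the refresh fails is an assumption rather than something the PR states.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct npu_counters {
    uint32_t hclk_mhz; /* hypothetical field names and units */
    uint32_t power_mw;
};

struct npu_dev;

struct hw_ops {
    /* Optional: NULL when no live counter source is available. */
    int (*update_counters)(struct npu_dev *dev);
};

struct npu_dev {
    const struct hw_ops *ops;
    struct npu_counters  cached;
};

/* Stand-in for a PMF-backed refresh that succeeds (values are made up). */
int fake_pmf_update(struct npu_dev *dev)
{
    dev->cached.hclk_mhz = 1800;
    dev->cached.power_mw = 2500;
    return 0;
}

/* Stand-in for a refresh that fails and leaves the cache untouched. */
int fake_pmf_fail(struct npu_dev *dev)
{
    (void)dev;
    return -1;
}

/* Refresh via the callback if present, then report the (possibly updated) cache. */
struct npu_counters query_counters(struct npu_dev *dev)
{
    assert(dev != NULL && dev->ops != NULL);
    if (dev->ops->update_counters)
        (void)dev->ops->update_counters(dev); /* assumption: on error, keep cached values */
    return dev->cached;
}
```

A query path built this way behaves identically on kernels without PMF support, because the callback is simply left unset.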
DPM shim tests (NPU4-series only): Add three tests exercised via
dpm_test_bo_set (inherits elf_io_test_bo_set):
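- TEST_dpm_noop_no_qos: a context without fps/latency QoS defaults to the maximum DPM level
- TEST_dpm_refcount_scaling: create contexts with increasing GOPs demand, verify DPM scales up, destroy in reverse, verify scale-down
- TEST_dpm_power_modes: cycle through LOW, MEDIUM, HIGH, and TURBO power modes and verify H-clock matches the expected DPM table entry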
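
The H-clock check with polling and a 2% margin, mentioned in the test commit, could be sketched as below. This is an illustration under stated assumptions, not the shim test's code (which is C++): the symmetric tolerance, the callback-based clock reader, the function names `hclk_within_margin` and `verify_hclk_poll`, and the simulated `ramp_clock` are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t (*read_hclk_fn)(void *ctx);

/* True if actual is within margin_pct percent of expected (symmetric, an assumption). */
bool hclk_within_margin(uint32_t actual_mhz, uint32_t expected_mhz, uint32_t margin_pct)
{
    uint64_t diff = actual_mhz > expected_mhz ? actual_mhz - expected_mhz
                                              : expected_mhz - actual_mhz;

    /* Integer form of diff / expected <= margin_pct / 100. */
    return diff * 100 <= (uint64_t)expected_mhz * margin_pct;
}

/* Poll up to max_polls times so a clock that is still ramping can settle. */
bool verify_hclk_poll(read_hclk_fn read_hclk, void *ctx, uint32_t expected_mhz,
                      uint32_t margin_pct, unsigned max_polls)
{
    assert(read_hclk != NULL);
    for (unsigned i = 0; i < max_polls; i++) {
        if (hclk_within_margin(read_hclk(ctx), expected_mhz, margin_pct))
            return true;
        /* A real test would sleep between polls here. */
    }
    return false;
}

/* Simulated clock that ramps toward a target by a fixed step on each read. */
struct ramp_clock {
    uint32_t cur_mhz;
    uint32_t step_mhz;
    uint32_t target_mhz;
};

uint32_t ramp_read(void *ctx)
{
    struct ramp_clock *c = ctx;
    uint32_t now = c->cur_mhz;

    c->cur_mhz += c->step_mhz;
    if (c->cur_mhz > c->target_mhz)
        c->cur_mhz = c->target_mhz;
    return now;
}
```

Polling rather than reading once avoids false failures when the clock has not finished ramping, while the bounded poll count still lets a genuinely wrong level fail the test.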