Add always-on ring-buffer logging for all Qcom kernel drivers (Phase 1) #29

Open
hangzqcom wants to merge 1 commit into qualcomm:main from hangzqcom:feat/always-on-logging

Summary

Implements Phase 1 of docs/always-on-logging-plan.md: every Qcom kernel driver gets an always-on, bounded ring-buffer trace session plus a one-command "Save Log" action. No driver source code changes — all risk is contained in user-mode scripts and config.

Problem: today WPP (Windows) and printk / QC_LOG_* (Linux) are opt-in. By the time a customer reports an issue, the useful history is already gone, and each driver has its own enable workflow.

Solution: provision an OS-native circular ring buffer at boot (ETW AutoLogger on Windows, ftrace instance on Linux) and expose a single SaveSnapshot() action that stops, rotates, and restarts the session. The targets are a save that completes in under a second and a logging gap of < 200 ms on Windows and < 50 ms on Linux.

What's in this PR

Windows (tools/logging/windows/)

| File | Purpose |
| --- | --- |
| QcomTraceProviders.psd1 | Single source of truth for WPP GUIDs and AutoLogger parameters (64 MB circular ETL, INFO level, keyword mask with chatty data flags off). |
| Install-QcomLogging.ps1 | Provisions the HKLM AutoLogger tree and starts a live logman session immediately (no reboot needed for first install). |
| Uninstall-QcomLogging.ps1 | Reverses the above. |
| Save-QcomDriverLog.ps1 | The user-facing "Save Log". Flush → stop → rotate .etl → restart; emits manifest.json; optional -Bundle zips setupapi.dev.log + pnputil output. |
| QcomDrivers.wprp | WPR/xperf profile mirroring the providers for on-demand captures. |

Linux (tools/logging/linux/)

| File | Purpose |
| --- | --- |
| qcomlog | Bash CLI: install \| save \| status \| uninstall. Uses a dedicated /sys/kernel/tracing/instances/qcom_drivers instance in overwrite mode with buffer_size_kb=4096 per CPU (~32 MB total on an 8-CPU system). |
| qcomlogd.service | systemd oneshot that provisions the instance at boot. |
| qcom-diag.rules | udev rule granting the qcom-diag group snapshot access. |
| qcomlog.conf | Tunables (buffer size, retention, event globs, outdir). |
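
The provisioning step described for qcomlog can be sketched roughly as follows. This is a minimal illustration and not the script in this PR: the function name `provision_instance` and its argument order are assumptions. The tracefs control files it writes (`buffer_size_kb`, `options/overwrite`, `tracing_on`) are the standard per-instance files. The tracefs root is a parameter so the function can be dry-run against a scratch directory.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: provision a bounded, overwrite-mode ftrace instance.
# provision_instance [tracefs_root] [instance_name] [per_cpu_size_kb]
provision_instance() {
    local tracefs="${1:-/sys/kernel/tracing}"
    local name="${2:-qcom_drivers}"
    local size_kb="${3:-4096}"
    local inst="$tracefs/instances/$name"

    # On real tracefs, mkdir creates the instance and the kernel
    # populates its control files (including the options/ directory).
    mkdir -p "$inst" || return 1

    echo "$size_kb" > "$inst/buffer_size_kb" || return 1  # ring size per CPU, in KiB
    echo 1 > "$inst/options/overwrite" || return 1        # drop oldest events when full
    echo 1 > "$inst/tracing_on" || return 1               # start recording
}
```

Against a real system this would be run as root, for example `provision_instance /sys/kernel/tracing qcom_drivers 4096`. Which events get enabled in the instance is left out here because the event globs live in qcomlog.conf.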

Docs

  • docs/always-on-logging-plan.md — full design, phasing, performance targets, risks.
  • tools/logging/README.md — quick start for each platform.
  • Top-level README.md — pointer added.

Phase coverage

| Item | Status |
| --- | --- |
| Shared provider catalogue | included in this PR |
| Windows AutoLogger (circular 64 MB .etl) | included in this PR |
| Windows "Save ETL" workflow | included in this PR |
| Windows WPR profile | included in this PR |
| Linux bounded ftrace instance | included in this PR |
| Linux "Save Log" workflow | included in this PR |
| Driver INF AddReg integration | 🟡 follow-up PR (needs WDK rebuild; snippets are in the plan doc) |
| Linux trace_qcom_log tracepoint | 🟡 follow-up PR (needs kernel rebuild of each module) |
| Native QcomLogSvc C++ service | 🟡 follow-up PR (MSBuild project) |

Build / verification

  • No driver source code modified → existing Windows/Linux driver builds are unaffected.
  • All added scripts/configs verified in CI-equivalent local checks:
    • Install-QcomLogging.ps1, Uninstall-QcomLogging.ps1, Save-QcomDriverLog.ps1: PowerShell tokenizer OK
    • QcomDrivers.wprp: XML well-formed OK
    • QcomTraceProviders.psd1: Import-PowerShellDataFile OK
    • qcomlog: bash -n OK
  • .gitattributes enforces LF line endings on Linux artifacts.

Quick start

Windows (as Administrator):

cd tools\logging\windows
.\Install-QcomLogging.ps1
# ...reproduce issue...
.\Save-QcomDriverLog.ps1 -OutDir C:\QcomLogs

Linux:

cd tools/logging/linux
sudo ./qcomlog install
# ...reproduce issue...
qcomlog save -o ~/qcomlogs

Open questions / risks (discussed in plan doc §8)

  1. S-mode / HVCI AutoLogger restrictions (fallback: service creates session via API).
  2. Linux LOCKDOWN_TRACEFS (fallback: dmesg + kfifo degraded mode).
  3. Storage on small embedded targets (buffer size is configurable).
  4. PII in QC_LOG_DATA / WPP_DRV_MASK_RDATA payloads — gated behind non-default keyword.
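
For the LOCKDOWN_TRACEFS risk in item 2, one way a script could decide between the ftrace path and the degraded mode is to read the active kernel lockdown level. The sketch below is illustrative only: `lockdown_mode` and `pick_backend` are hypothetical names, and it assumes that tracefs access is blocked only at the `confidentiality` level. The securityfs file marks the active mode with square brackets, for example `none integrity [confidentiality]`.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: pick a capture backend from the kernel lockdown level.

# Print the active lockdown mode (the bracketed token in the securityfs file).
lockdown_mode() {
    local file="${1:-/sys/kernel/security/lockdown}"
    # A missing file means the lockdown LSM is absent or securityfs is not mounted.
    [ -r "$file" ] || { echo none; return 0; }
    sed -n 's/.*\[\([a-z]*\)\].*/\1/p' "$file"
}

# Print "dmesg" when tracefs is assumed to be locked down, else "ftrace".
# "dmesg" stands in for the dmesg + kfifo degraded mode named in the plan.
pick_backend() {
    case "$(lockdown_mode "$1")" in
        confidentiality) echo dmesg ;;
        *)               echo ftrace ;;
    esac
}
```

A caller would branch on the result, for example `[ "$(pick_backend)" = ftrace ] || echo "degraded mode"`.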

Implements Phase 1 of docs/always-on-logging-plan.md: every Qcom kernel driver is paired with an always-on, bounded ring-buffer trace session plus a one-command Save Log action. No driver source code changes; all risk is contained in user-mode scripts and config.

Windows (tools/logging/windows/):

- QcomTraceProviders.psd1: single source of truth for WPP GUIDs and AutoLogger session parameters (64 MB circular ETL, INFO level, keyword mask with chatty data flags off by default).

- Install-QcomLogging.ps1 / Uninstall-QcomLogging.ps1: provision or tear down the HKLM AutoLogger registry tree and the equivalent live ETW session via logman, so logging begins on first install without a reboot.

- Save-QcomDriverLog.ps1: flush/stop/rotate/restart the circular ETL into C:\Users\Public\Documents\QcomLogs\QcomDrivers-<timestamp>.etl; emits manifest.json and can optionally zip setupapi.dev.log + pnputil output for RMA.

- QcomDrivers.wprp: Windows Performance Recorder profile mirroring the same providers for on-demand WPR/xperf captures.

Linux (tools/logging/linux/):

- qcomlog: bash CLI (install | save | status | uninstall) that provisions /sys/kernel/tracing/instances/qcom_drivers with a bounded overwrite-mode ring and per-CPU buffer_size_kb=4096, then snapshots into timestamped tarballs.

- qcomlogd.service: systemd oneshot unit that provisions the instance at boot.

- qcom-diag.rules: udev rule granting the qcom-diag group snapshot access.

- qcomlog.conf: tunables (buffer size, retention, event globs, outdir).
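
The snapshot step on Linux could look roughly like the sketch below. It is an assumption-laden illustration, not the code in qcomlog: the function name `save_snapshot`, the tarball naming, and the choice to clear the ring after copying are all hypothetical. It relies only on standard tracefs behavior (writing 0 or 1 to `tracing_on`, reading `trace`, and writing to `trace` to clear it).

```shell
#!/usr/bin/env bash
# Hypothetical sketch: pause, copy, clear, resume, then archive.
# save_snapshot <instance_dir> <outdir>; prints the tarball path.
save_snapshot() {
    local inst="$1" outdir="$2"
    local stamp work
    stamp="$(date -u +%Y%m%dT%H%M%SZ)"
    work="$(mktemp -d)" || return 1

    echo 0 > "$inst/tracing_on"            # pause so the copy is consistent
    cat "$inst/trace" > "$work/trace.txt"  # human-readable dump of the ring
    echo > "$inst/trace"                   # writing to trace clears the ring
    echo 1 > "$inst/tracing_on"            # resume; the logging gap ends here

    # Compression happens after tracing has resumed, so it does not widen the gap.
    mkdir -p "$outdir" || return 1
    tar -czf "$outdir/qcomlog-$stamp.tar.gz" -C "$work" trace.txt || return 1
    rm -rf "$work"
    echo "$outdir/qcomlog-$stamp.tar.gz"
}
```

Keeping the archive step outside the paused window is the design point: only the copy and clear count toward the logging gap.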

Docs: docs/always-on-logging-plan.md + tools/logging/README.md cover design, phasing, performance targets (<= 0.5% CPU, <= 32 MB RAM, <= 1 s save latency, <= 200 ms Windows / <= 50 ms Linux logging gap), and open risks (S-mode AutoLogger, LOCKDOWN_TRACEFS, etc.).
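
Because buffer_size_kb is a per-CPU setting, the RAM target depends on the CPU count. A small check along these lines makes the arithmetic explicit. The helper names `ring_total_kb` and `within_budget` are hypothetical, and the calculation is the simple product of per-CPU size and CPU count, ignoring any kernel-side overhead.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: compare total ring-buffer size against a RAM budget.

# ring_total_kb <per_cpu_kb> [ncpus]; prints the total in KiB.
ring_total_kb() {
    local per_cpu_kb="$1" ncpus="${2:-$(nproc)}"
    echo $(( per_cpu_kb * ncpus ))
}

# within_budget <per_cpu_kb> <ncpus> <budget_mb>; succeeds if the total fits.
within_budget() {
    local total_kb
    total_kb="$(ring_total_kb "$1" "$2")"
    [ "$total_kb" -le $(( $3 * 1024 )) ]
}

# Usage:
#   ring_total_kb 4096 8        # prints 32768 (32 MB)
#   within_budget 4096 16 32    # fails: 16 CPUs need 64 MB
```

On machines with more than 8 CPUs, the 4096 KB default exceeds the 32 MB target, so the per-CPU size would need to be lowered in qcomlog.conf.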

Verified: PowerShell parses clean, WPRP XML well-formed, psd1 loads, qcomlog passes bash -n.