enhancement(agent-data-plane): add metric tag filterlist #1293
Pull request overview
This pull request adds a metric tag filterlist feature to the agent-data-plane component. It enables filtering (removing or retaining) specific tags from distribution metrics based on configurable per-metric rules. The implementation includes support for runtime updates via Remote Config, comprehensive telemetry, and both include (allowlist) and exclude (denylist) filtering modes.
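To make the two filtering modes concrete, here is a minimal sketch of per-metric include/exclude filtering. It is not the component's actual code: `FilterMode`, `TagRule`, `filter_tags`, and the choice to match on the key of a `key:value` tag are all assumptions made for illustration, and the sketch ignores metric type (the real feature targets distribution metrics).

```rust
use std::collections::{HashMap, HashSet};

/// Whether the listed tag keys are kept or dropped (hypothetical type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum FilterMode {
    /// Allowlist: keep only tags whose key is listed.
    Include,
    /// Denylist: drop tags whose key is listed.
    Exclude,
}

/// A per-metric rule: a mode plus the tag keys it applies to (hypothetical type).
pub struct TagRule {
    pub mode: FilterMode,
    pub keys: HashSet<String>,
}

/// Returns the key of a `key:value` tag, or the whole tag if it has no colon.
/// Matching on the key is an assumption of this sketch.
fn tag_key(tag: &str) -> &str {
    tag.split_once(':').map_or(tag, |(key, _)| key)
}

/// Applies the rule registered for `metric`, if any, to `tags` in place.
/// Returns the number of tags removed.
pub fn filter_tags(
    rules: &HashMap<String, TagRule>,
    metric: &str,
    tags: &mut Vec<String>,
) -> usize {
    let Some(rule) = rules.get(metric) else {
        return 0; // No rule for this metric: leave its tags untouched.
    };
    let before = tags.len();
    tags.retain(|tag| {
        let listed = rule.keys.contains(tag_key(tag));
        match rule.mode {
            FilterMode::Include => listed,
            FilterMode::Exclude => !listed,
        }
    });
    before - tags.len()
}
```

With an `Exclude` rule listing `pod` for a metric, the tags `env:prod` and `pod:abc` reduce to `env:prod`; with an `Include` rule listing `env`, every tag other than `env:*` is dropped. Metrics without a rule pass through unchanged.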
Changes:
- Adds three new methods to the `Context` API (`with_tags`, `with_origin_tags`, `with_tag_sets_mut`) to support tag manipulation with copy-on-write semantics
- Implements a new tag filterlist synchronous transform component with dynamic configuration support
- Simplifies the host_tags component to use the new Context API methods instead of context resolvers
- Fixes a bug in `Context::from_static_parts` where an incorrect variable was passed to `hash_context`
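The copy-on-write semantics mentioned above can be illustrated with the standard library's `Arc::make_mut`. This is a sketch under stated assumptions, not the `Context` API from this PR: `SketchContext`, `replace_tags`, and `edit_tags` are hypothetical names, and the real context type stores tags differently.

```rust
use std::sync::Arc;

/// A stand-in for a context whose name and tags are shared behind `Arc`s
/// (hypothetical type, not the real `Context`).
#[derive(Clone, Debug)]
pub struct SketchContext {
    name: Arc<str>,
    tags: Arc<Vec<String>>,
}

impl SketchContext {
    pub fn new(name: &str, tags: Vec<String>) -> Self {
        Self { name: Arc::from(name), tags: Arc::new(tags) }
    }

    /// Returns a new context that shares the name but carries different tags.
    /// The original context is left untouched.
    pub fn replace_tags(&self, tags: Vec<String>) -> Self {
        Self { name: Arc::clone(&self.name), tags: Arc::new(tags) }
    }

    /// Edits the tags in place. `Arc::make_mut` clones the underlying vector
    /// only if another context still shares it (copy-on-write).
    pub fn edit_tags<F: FnOnce(&mut Vec<String>)>(&mut self, edit: F) {
        edit(Arc::make_mut(&mut self.tags));
    }

    pub fn name(&self) -> &str {
        &self.name
    }

    pub fn tags(&self) -> &[String] {
        &self.tags
    }
}
```

Cloning a `SketchContext` is cheap because only reference counts are bumped. Editing the clone's tags then copies the tag vector once, so the original keeps its full tag set while the clone sees the filtered one.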
Reviewed changes
Copilot reviewed 3 out of 9 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| lib/saluki-context/src/context.rs | Adds three new public methods for tag manipulation and a private helper, fixes bug in from_static_parts |
| lib/saluki-context/src/tags/tagset/owned.rs | Reorganizes test modules and updates imports for consistency |
| bin/agent-data-plane/src/components/tag_filterlist/mod.rs | New tag filterlist component with comprehensive test coverage |
| bin/agent-data-plane/src/components/tag_filterlist/telemetry.rs | Telemetry module for tag filterlist metrics |
| lib/saluki-components/src/transforms/host_tags/mod.rs | Simplifies implementation to use new Context API methods |
| bin/agent-data-plane/src/cli/run.rs | Integrates tag filterlist into DogStatsD pipeline |
| bin/agent-data-plane/src/components/mod.rs | Exports new tag_filterlist module |
| bin/agent-data-plane/Cargo.toml | Adds dependencies for foldhash, hashbrown, metrics, and saluki-metrics |
| fn overlay_matches_reference( | ||
| fn property_test_overlay_matches_reference( | ||
| base_groups in arb_base_groups(), | ||
| ops in prop::collection::vec(arb_op(), 0..20), |
The code uses `proptest::collection::vec` at lines 910 and 915 but `prop::collection::vec` at line 979. Although `prop::` is available through the prelude import, the diff intentionally switches lines 910 and 915 to the explicit `proptest::` path, so line 979 should use `proptest::collection::vec` as well.
Suggested change: replace `ops in prop::collection::vec(arb_op(), 0..20),` with `ops in proptest::collection::vec(arb_op(), 0..20),`.
Binary Size Analysis (Agent Data Plane)
Target: e66e11a (baseline) vs 35ed30b (comparison)

| Module | File Size | Symbols |
|---|---|---|
| agent_data_plane::components::tag_filterlist | +47.47 KiB | 18 |
| core | +25.53 KiB | 3127 |
| [sections] | +6.81 KiB | 8 |
| serde_core | +6.42 KiB | 92 |
| hashbrown | +6.16 KiB | 77 |
| [Unmapped] | +4.35 KiB | 1 |
| agent_data_plane::cli::run | +4.12 KiB | 82 |
| tokio | +3.45 KiB | 834 |
| smallvec | +1.82 KiB | 36 |
| saluki_core::topology::blueprint | +1.22 KiB | 27 |
| saluki_context::tags::Tag | +420 B | 1 |
| saluki_config::dynamic::watcher | -387 B | 1 |
| serde_json | +351 B | 71 |
| saluki_app::metrics::RuntimeMetrics | -331 B | 1 |
| agent_data_plane::cli::debug | -305 B | 92 |
| agent_data_plane::cli::dogstatsd | +301 B | 34 |
| agent_data_plane::components::apm_onboarding | -283 B | 34 |
| agent_data_plane::components::ottl_filter_processor | +268 B | 34 |
| flate2 | -152 B | 1 |
| saluki_core::topology::interconnect | -128 B | 6 |

Detailed Symbol Changes
FILE SIZE VM SIZE
-------------- --------------
[NEW] +1.79Mi [NEW] +1.79Mi std::thread::local::LocalKey<T>::with::hd6ac1a4937019760
[NEW] +119Ki [NEW] +119Ki agent_data_plane::cli::run::create_topology::_{{closure}}::h333a884522fd5022
[NEW] +62.0Ki [NEW] +61.9Ki agent_data_plane::cli::run::handle_run_command::_{{closure}}::hf2c856d69b8722fe
[NEW] +58.7Ki [NEW] +58.4Ki _<agent_data_plane::internal::control_plane::PrivilegedApiWorker as saluki_core::runtime::supervisor::Supervisable>::initialize::_{{closure}}::h57e8c387c7958785
[NEW] +49.5Ki [NEW] +49.4Ki saluki_app::bootstrap::AppBootstrapper::bootstrap::_{{closure}}::h90c91d7f3478da24
[NEW] +44.1Ki [NEW] +43.9Ki saluki_env::workload::providers::remote_agent::RemoteAgentWorkloadProvider::from_configuration::_{{closure}}::h3d079697c6e3dbd2
[NEW] +36.9Ki [NEW] +36.7Ki saluki_env::helpers::remote_agent::client::RemoteAgentClient::from_configuration::_{{closure}}::_{{closure}}::_{{closure}}::hfd96acf768c9665e
[NEW] +35.7Ki [NEW] +35.5Ki _<agent_data_plane::components::tag_filterlist::TagFilterlist as saluki_core::components::transforms::Transform>::run::_{{closure}}::h468c85120b3a918a
[NEW] +35.5Ki [NEW] +35.4Ki agent_data_plane::run_inner::_{{closure}}::h19bb9e80abf264c0
[NEW] +35.4Ki [NEW] +35.2Ki _<saluki_config::secrets::resolver::external::ExternalProcessResolver as saluki_config::secrets::resolver::Resolver>::resolve::_{{closure}}::hd7fb1656f4921783
+0.7% +34.0Ki +0.5% +18.8Ki [7235 Others]
[NEW] +33.7Ki [NEW] +33.5Ki agent_data_plane::cli::dogstatsd::handle_dogstatsd_stats::_{{closure}}::h28e674695d0df1de
[DEL] -35.5Ki [DEL] -35.4Ki agent_data_plane::run_inner::_{{closure}}::h7343601604a97e59
[DEL] -35.6Ki [DEL] -35.3Ki _<saluki_config::secrets::resolver::external::ExternalProcessResolver as saluki_config::secrets::resolver::Resolver>::resolve::_{{closure}}::h4feb71abe4ef2497
[DEL] -36.9Ki [DEL] -36.7Ki saluki_env::helpers::remote_agent::client::RemoteAgentClient::from_configuration::_{{closure}}::_{{closure}}::_{{closure}}::h5d6ae18d97ce7918
[DEL] -44.1Ki [DEL] -43.9Ki saluki_env::workload::providers::remote_agent::RemoteAgentWorkloadProvider::from_configuration::_{{closure}}::h1f72b47b9e07a2ef
[DEL] -49.5Ki [DEL] -49.4Ki saluki_app::bootstrap::AppBootstrapper::bootstrap::_{{closure}}::h797951dd66db32b5
[DEL] -58.8Ki [DEL] -58.6Ki _<agent_data_plane::internal::control_plane::PrivilegedApiWorker as saluki_core::runtime::supervisor::Supervisable>::initialize::_{{closure}}::hfdca77ab0ba5e8fd
[DEL] -64.0Ki [DEL] -63.9Ki agent_data_plane::cli::run::handle_run_command::_{{closure}}::h74e3039632156528
[DEL] -113Ki [DEL] -113Ki agent_data_plane::cli::run::create_topology::_{{closure}}::he5badade4587aa9a
[DEL] -1.79Mi [DEL] -1.79Mi std::thread::local::LocalKey<T>::with::h04db06f53370339f
+0.4% +107Ki +0.4% +91.8Ki TOTAL
Regression Detector (Agent Data Plane)
Regression Detector Results
Run ID: 2fb04923-a34a-48bc-88cb-165ba2cdf80c
Baseline: e66e11a
Optimization Goals: ✅ No significant changes detected

| perf | experiment | goal | Δ mean % | Δ mean % CI | trials | links |
|---|---|---|---|---|---|---|
| ➖ | otlp_ingest_logs_5mb_memory | memory utilization | +4.46 | [+4.18, +4.74] | 1 | (metrics) (profiles) (logs) |
| ➖ | otlp_ingest_logs_5mb_throughput | ingress throughput | +0.01 | [-0.11, +0.13] | 1 | (metrics) (profiles) (logs) |
| ➖ | otlp_ingest_logs_5mb_cpu | % cpu utilization | -0.23 | [-5.01, +4.55] | 1 | (metrics) (profiles) (logs) |
Fine details of change detection per experiment
| perf | experiment | goal | Δ mean % | Δ mean % CI | trials | links |
|---|---|---|---|---|---|---|
| ➖ | dsd_uds_1mb_3k_contexts_cpu | % cpu utilization | +17.18 | [-40.68, +75.04] | 1 | (metrics) (profiles) (logs) |
| ➖ | otlp_ingest_logs_5mb_memory | memory utilization | +4.46 | [+4.18, +4.74] | 1 | (metrics) (profiles) (logs) |
| ➖ | otlp_ingest_traces_5mb_cpu | % cpu utilization | +4.05 | [+1.80, +6.31] | 1 | (metrics) (profiles) (logs) |
| ➖ | otlp_ingest_metrics_5mb_cpu | % cpu utilization | +2.26 | [-6.31, +10.83] | 1 | (metrics) (profiles) (logs) |
| ➖ | dsd_uds_500mb_3k_contexts_throughput | ingress throughput | +1.83 | [+1.69, +1.96] | 1 | (metrics) (profiles) (logs) |
| ➖ | dsd_uds_100mb_3k_contexts_cpu | % cpu utilization | +1.73 | [-4.34, +7.80] | 1 | (metrics) (profiles) (logs) |
| ➖ | quality_gates_rss_idle | memory utilization | +1.50 | [+1.47, +1.52] | 1 | (metrics) (profiles) (logs) |
| ➖ | dsd_uds_1mb_3k_contexts_memory | memory utilization | +1.31 | [+1.13, +1.49] | 1 | (metrics) (profiles) (logs) |
| ➖ | dsd_uds_500mb_3k_contexts_memory | memory utilization | +1.26 | [+1.09, +1.43] | 1 | (metrics) (profiles) (logs) |
| ➖ | dsd_uds_512kb_3k_contexts_memory | memory utilization | +1.17 | [+1.00, +1.35] | 1 | (metrics) (profiles) (logs) |
| ➖ | quality_gates_rss_dsd_low | memory utilization | +1.11 | [+0.92, +1.31] | 1 | (metrics) (profiles) (logs) |
| ➖ | dsd_uds_10mb_3k_contexts_memory | memory utilization | +0.94 | [+0.75, +1.12] | 1 | (metrics) (profiles) (logs) |
| ➖ | dsd_uds_100mb_3k_contexts_memory | memory utilization | +0.68 | [+0.50, +0.86] | 1 | (metrics) (profiles) (logs) |
| ➖ | quality_gates_rss_dsd_medium | memory utilization | +0.60 | [+0.41, +0.80] | 1 | (metrics) (profiles) (logs) |
| ➖ | otlp_ingest_metrics_5mb_memory | memory utilization | +0.56 | [+0.33, +0.79] | 1 | (metrics) (profiles) (logs) |
| ➖ | otlp_ingest_traces_ottl_filtering_5mb_memory | memory utilization | +0.52 | [+0.18, +0.86] | 1 | (metrics) (profiles) (logs) |
| ➖ | dsd_uds_500mb_3k_contexts_cpu | % cpu utilization | +0.43 | [-1.02, +1.88] | 1 | (metrics) (profiles) (logs) |
| ➖ | quality_gates_rss_dsd_ultraheavy | memory utilization | +0.05 | [-0.08, +0.18] | 1 | (metrics) (profiles) (logs) |
| ➖ | otlp_ingest_logs_5mb_throughput | ingress throughput | +0.01 | [-0.11, +0.13] | 1 | (metrics) (profiles) (logs) |
| ➖ | dsd_uds_512kb_3k_contexts_throughput | ingress throughput | +0.00 | [-0.05, +0.05] | 1 | (metrics) (profiles) (logs) |
| ➖ | dsd_uds_10mb_3k_contexts_throughput | ingress throughput | +0.00 | [-0.13, +0.13] | 1 | (metrics) (profiles) (logs) |
| ➖ | otlp_ingest_traces_5mb_throughput | ingress throughput | +0.00 | [-0.02, +0.02] | 1 | (metrics) (profiles) (logs) |
| ➖ | otlp_ingest_traces_ottl_transform_5mb_throughput | ingress throughput | +0.00 | [-0.02, +0.02] | 1 | (metrics) (profiles) (logs) |
| ➖ | otlp_ingest_traces_ottl_filtering_5mb_throughput | ingress throughput | +0.00 | [-0.02, +0.02] | 1 | (metrics) (profiles) (logs) |
| ➖ | dsd_uds_1mb_3k_contexts_throughput | ingress throughput | -0.00 | [-0.06, +0.05] | 1 | (metrics) (profiles) (logs) |
| ➖ | quality_gates_rss_dsd_heavy | memory utilization | -0.01 | [-0.15, +0.14] | 1 | (metrics) (profiles) (logs) |
| ➖ | dsd_uds_100mb_3k_contexts_throughput | ingress throughput | -0.01 | [-0.04, +0.01] | 1 | (metrics) (profiles) (logs) |
| ➖ | otlp_ingest_metrics_5mb_throughput | ingress throughput | -0.02 | [-0.14, +0.11] | 1 | (metrics) (profiles) (logs) |
| ➖ | otlp_ingest_logs_5mb_cpu | % cpu utilization | -0.23 | [-5.01, +4.55] | 1 | (metrics) (profiles) (logs) |
| ➖ | otlp_ingest_traces_ottl_transform_5mb_memory | memory utilization | -0.37 | [-0.62, -0.12] | 1 | (metrics) (profiles) (logs) |
| ➖ | dsd_uds_512kb_3k_contexts_cpu | % cpu utilization | -0.43 | [-57.01, +56.14] | 1 | (metrics) (profiles) (logs) |
| ➖ | otlp_ingest_traces_ottl_filtering_5mb_cpu | % cpu utilization | -0.46 | [-2.84, +1.92] | 1 | (metrics) (profiles) (logs) |
| ➖ | otlp_ingest_traces_5mb_memory | memory utilization | -0.50 | [-0.75, -0.25] | 1 | (metrics) (profiles) (logs) |
| ➖ | otlp_ingest_traces_ottl_transform_5mb_cpu | % cpu utilization | -2.38 | [-4.57, -0.20] | 1 | (metrics) (profiles) (logs) |
| ➖ | dsd_uds_10mb_3k_contexts_cpu | % cpu utilization | -3.38 | [-33.74, +26.97] | 1 | (metrics) (profiles) (logs) |
Bounds Checks: ✅ Passed
| perf | experiment | bounds_check_name | replicates_passed | observed_value | links |
|---|---|---|---|---|---|
| ✅ | quality_gates_rss_dsd_heavy | memory_usage | 10/10 | 115.24MiB ≤ 140MiB | (metrics) (profiles) (logs) |
| ✅ | quality_gates_rss_dsd_low | memory_usage | 10/10 | 33.98MiB ≤ 50MiB | (metrics) (profiles) (logs) |
| ✅ | quality_gates_rss_dsd_medium | memory_usage | 10/10 | 54.21MiB ≤ 75MiB | (metrics) (profiles) (logs) |
| ✅ | quality_gates_rss_dsd_ultraheavy | memory_usage | 10/10 | 170.02MiB ≤ 200MiB | (metrics) (profiles) (logs) |
| ✅ | quality_gates_rss_idle | memory_usage | 10/10 | 21.38MiB ≤ 40MiB | (metrics) (profiles) (logs) |
Explanation
Confidence level: 90.00%
Effect size tolerance: |Δ mean %| ≥ 5.00%
Performance changes are noted in the perf column of each table:
- ✅ = significantly better comparison variant performance
- ❌ = significantly worse comparison variant performance
- ➖ = no significant change in performance
A regression test is an A/B test of target performance in a repeatable rig, where "performance" is measured as "comparison variant minus baseline variant" for an optimization goal (e.g., ingress throughput). Due to intrinsic variability in measuring that goal, we can only estimate its mean value for each experiment; we report uncertainty in that value as a 90.00% confidence interval denoted "Δ mean % CI".
For each experiment, we flag a change in performance as a "regression" (a change worth investigating further) if all of the following criteria are true:
- Its estimated |Δ mean %| ≥ 5.00%, indicating the change is big enough to merit a closer look.
- Its 90.00% confidence interval "Δ mean % CI" does not contain zero, indicating that if our statistical model is accurate, there is at least a 90.00% chance there is a difference in performance between baseline and comparison variants.
- Its configuration does not mark it "erratic".
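The three criteria above can be sketched as a single predicate. This is an illustrative reading of the explanation, not the regression detector's own code; the `Experiment` struct, its field names, and `is_regression` are hypothetical.

```rust
/// One row of the regression detector table (hypothetical type).
pub struct Experiment {
    /// Estimated "Δ mean %".
    pub delta_mean_pct: f64,
    /// Lower bound of the 90% confidence interval "Δ mean % CI".
    pub ci_low: f64,
    /// Upper bound of the 90% confidence interval "Δ mean % CI".
    pub ci_high: f64,
    /// Whether the experiment's configuration marks it "erratic".
    pub erratic: bool,
}

/// Effect size tolerance from the report: |Δ mean %| ≥ 5.00%.
pub const EFFECT_SIZE_TOLERANCE_PCT: f64 = 5.0;

/// Returns true only if all three criteria hold: the change is big enough,
/// the confidence interval excludes zero, and the experiment is not erratic.
pub fn is_regression(e: &Experiment) -> bool {
    let big_enough = e.delta_mean_pct.abs() >= EFFECT_SIZE_TOLERANCE_PCT;
    let ci_excludes_zero = e.ci_low > 0.0 || e.ci_high < 0.0;
    big_enough && ci_excludes_zero && !e.erratic
}
```

Applied to the table above, `dsd_uds_1mb_3k_contexts_cpu` (+17.18, CI [-40.68, +75.04]) is not flagged because its interval contains zero, and `otlp_ingest_logs_5mb_memory` (+4.46, CI [+4.18, +4.74]) is not flagged because the change is below the 5.00% tolerance.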
Summary
This PR contains the initial tag filterlist implementation from #1247 along with minor optimizations. It is branched off of #1270 in order to make use of mutable tagsets.
The correctness test and SMP benchmarks PR will be based on this branch to allow for clean PRs and merging.
Change Type
- [ ] Bug fix
- [x] New feature
- [ ] Non-functional (chore, refactoring, docs)
- [ ] Performance
How did you test this PR?
CI, plus the correctness test PR stacked on top of this one.