Skip to content

Rayz/tagfilter mutable tagset#1288

Closed
rayz wants to merge 7 commits intorayz/tagfilter-cachefrom
rayz/tagfilter-mutable-tagset
Closed

Rayz/tagfilter mutable tagset#1288
rayz wants to merge 7 commits intorayz/tagfilter-cachefrom
rayz/tagfilter-mutable-tagset

Conversation

@rayz
Copy link
Copy Markdown
Contributor

@rayz rayz commented Mar 30, 2026

Summary

PR to benchmark the tagfilter implementation + mutable tagset changes

Change Type

  • Bug fix
  • New feature
  • Non-functional (chore, refactoring, docs)
  • Performance

How did you test this PR?

References

Copilot AI review requested due to automatic review settings March 30, 2026 20:12
@dd-octo-sts dd-octo-sts bot added area/core Core functionality, event model, etc. area/components Sources, transforms, and destinations. destination/dogstatsd-stats DogStatsD Statistics destination. transform/host-tags Host Tags synchronous transform. encoder/datadog-traces Datadog Traces encoder. source/otlp OTLP source. transform/apm-stats APM Stats transform. labels Mar 30, 2026
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request implements a major refactoring of the tag management system in the Saluki codebase. The core change is converting TagSet from a simple Vec<Tag> wrapper to a sophisticated mutable data structure with copy-on-write semantics and an overlay-based mutation model. This enables efficient in-place mutations of tags while preserving immutable sharing through SharedTagSet.

Changes:

  • Refactored TagSet to use a base SharedTagSet with an optional mutation overlay (additions and removals bitset) for efficient mutations
  • Introduced new FrozenTagSet type as immutable storage backing for SharedTagSet
  • Updated all data model types (Trace, Span, Log, EventD, etc.) to use TagSet instead of SharedTagSet
  • Simplified transforms like tag_filterlist and host_tags to use direct mutation instead of context resolvers
  • Added new mutation methods to Context (with_tags_mut, with_origin_tags_mut, with_tag_sets_mut) with copy-on-write semantics
  • Added comprehensive tests and benchmarks for the new TagSet implementation

Reviewed changes

Copilot reviewed 27 out of 31 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
lib/saluki-context/src/tags/tagset/owned.rs Complete rewrite of TagSet with overlay-based mutation model
lib/saluki-context/src/tags/tagset/shared.rs Updated to use FrozenTagSet internally
lib/saluki-context/src/tags/tagset/frozen.rs New immutable storage type
lib/saluki-context/src/context.rs Updated Context API with in-place mutation methods
lib/saluki-core/src/data_model/event/*.rs Updated to use TagSet instead of SharedTagSet
lib/saluki-components/src/transforms/tag_filterlist/mod.rs Simplified to use direct mutation
lib/saluki-components/src/transforms/host_tags/mod.rs Simplified to use direct mutation
lib/saluki-common/src/collections/bitset.rs New dense contiguous bitset implementation

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

let origin_tags = SharedTagSet::default();
let origin_tags = TagSet::default();

let (key, _) = hash_context(name, tags, &origin_tags);
Copy link

Copilot AI Mar 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The from_static_parts method constructs a TagSet by iterating over the tags array and inserting each tag via insert_tag(), which skips duplicates. However, the context key is computed by passing the original tags array to hash_context() on line 46. If the array contains duplicate tags, the hash will include those duplicates, but the TagSet (which is stored in the context) will not contain them (due to deduplication in insert_tag()). This creates a mismatch: the context key is not consistent with the actual tag set stored in the context. The hash should be computed from the actual TagSet that will be stored, not from the original array. Change line 46 to: let (key, _) = hash_context(name, &tag_set, &origin_tags);

Suggested change
let (key, _) = hash_context(name, tags, &origin_tags);
let (key, _) = hash_context(name, &tag_set, &origin_tags);

Copilot uses AI. Check for mistakes.
@pr-commenter
Copy link
Copy Markdown

pr-commenter bot commented Mar 30, 2026

Binary Size Analysis (Agent Data Plane)

Target: 1d50af4 (baseline) vs bfca7a2 (comparison) diff
Analysis Type: Stripped binaries (debug symbols excluded)
Baseline Size: 26.19 MiB
Comparison Size: 26.32 MiB
Size Change: +131.63 KiB (+0.49%)
Pass/Fail Threshold: +5%
Result: PASSED ✅

Changes by Module

Module File Size Symbols
core +39.62 KiB 8740
saluki_components::transforms::tag_filterlist +35.61 KiB 17
saluki_context::tags::tagset +11.94 KiB 21
serde_core +8.71 KiB 338
[sections] +8.14 KiB 7
smallvec +8.05 KiB 100
hashbrown +6.38 KiB 354
agent_data_plane::cli::run +5.40 KiB 70
saluki_components::transforms::host_tags -4.73 KiB 8
[Unmapped] +4.20 KiB 1
figment -3.03 KiB 94
saluki_context::hash::hash_context +2.87 KiB 1
axum +2.79 KiB 212
saluki_context::resolver::TagsResolver -2.35 KiB 18
saluki_config::dynamic::watcher -2.34 KiB 2
saluki_components::destinations::prometheus +2.06 KiB 64
saluki_context::hash::hash_context_with_seen -1.99 KiB 1
saluki_components::destinations::dsd_stats +1.99 KiB 10
agent_data_plane::state::metrics +1.86 KiB 6
tokio +1.77 KiB 2276

Detailed Symbol Changes

    FILE SIZE        VM SIZE    
 --------------  -------------- 
  [NEW] +1.79Mi  [NEW] +1.79Mi    std::thread::local::LocalKey<T>::with::h7c684d6f9933cd6b
  +1.7%  +212Ki  +1.8%  +192Ki    [22021 Others]
  [NEW]  +119Ki  [NEW]  +119Ki    agent_data_plane::cli::run::create_topology::_{{closure}}::ha521051891c6ebe9
  [NEW] +84.6Ki  [NEW] +84.5Ki    agent_data_plane::internal::control_plane::spawn_control_plane::_{{closure}}::h730296c1b304c20b
  [NEW] +64.2Ki  [NEW] +64.0Ki    saluki_components::common::datadog::io::run_endpoint_io_loop::_{{closure}}::he1e85622a066a7d7
  [NEW] +57.8Ki  [NEW] +57.7Ki    agent_data_plane::cli::run::handle_run_command::_{{closure}}::hed7e33f4806adb0a
  [NEW] +49.5Ki  [NEW] +49.3Ki    saluki_app::bootstrap::AppBootstrapper::bootstrap::_{{closure}}::h91289bd13f483a0d
  [NEW] +48.0Ki  [NEW] +47.8Ki    _<saluki_components::destinations::prometheus::Prometheus as saluki_core::components::destinations::Destination>::run::_{{closure}}::h708c1b0be89825a9
  [NEW] +46.4Ki  [NEW] +46.3Ki    h2::proto::connection::Connection<T,P,B>::poll::h038035d5ccad579f
  [NEW] +46.1Ki  [NEW] +45.9Ki    _<saluki_components::forwarders::otlp::OtlpForwarder as saluki_core::components::forwarders::Forwarder>::run::_{{closure}}::h290133f32882b5e1
  [DEL] -44.1Ki  [DEL] -43.9Ki    saluki_env::workload::providers::remote_agent::RemoteAgentWorkloadProvider::from_configuration::_{{closure}}::hbdb816f02e21b1cf
  [DEL] -44.4Ki  [DEL] -44.2Ki    _<saluki_components::transforms::aggregate::Aggregate as saluki_core::components::transforms::Transform>::run::_{{closure}}::hb4d1ad0684099c4c
  [DEL] -46.0Ki  [DEL] -45.8Ki    _<saluki_components::destinations::prometheus::Prometheus as saluki_core::components::destinations::Destination>::run::_{{closure}}::hea5618e9afd08b63
  [DEL] -46.1Ki  [DEL] -45.9Ki    _<saluki_components::forwarders::otlp::OtlpForwarder as saluki_core::components::forwarders::Forwarder>::run::_{{closure}}::hb0c627893c271e2c
  [DEL] -46.4Ki  [DEL] -46.3Ki    h2::proto::connection::Connection<T,P,B>::poll::h100071ec98c95c21
  [DEL] -49.5Ki  [DEL] -49.4Ki    saluki_app::bootstrap::AppBootstrapper::bootstrap::_{{closure}}::h15f926ef5659e580
  [DEL] -57.8Ki  [DEL] -57.7Ki    agent_data_plane::cli::run::handle_run_command::_{{closure}}::hebabd4c0d26708e2
  [DEL] -64.2Ki  [DEL] -64.0Ki    saluki_components::common::datadog::io::run_endpoint_io_loop::_{{closure}}::h1c7964dcd8ecbead
  [DEL] -84.6Ki  [DEL] -84.5Ki    agent_data_plane::internal::control_plane::spawn_control_plane::_{{closure}}::h78c38079d35f4f77
  [DEL]  -114Ki  [DEL]  -114Ki    agent_data_plane::cli::run::create_topology::_{{closure}}::h4a1b0b37b7e2d03e
  [DEL] -1.79Mi  [DEL] -1.79Mi    std::thread::local::LocalKey<T>::with::he517c09e5477efe1
  +0.5%  +131Ki  +0.5%  +111Ki    TOTAL

@pr-commenter
Copy link
Copy Markdown

pr-commenter bot commented Mar 30, 2026

Regression Detector (Agent Data Plane)

This comment was omitted because it was over 65,536 characters.Please check the Gitlab Job logs to see its output.

Copilot AI review requested due to automatic review settings March 31, 2026 18:28
## Summary

Work in progress.

## Change Type
- [ ] Bug fix
- [ ] New feature
- [ ] Non-functional (chore, refactoring, docs)
- [ ] Performance


## How did you test this PR?
<!-- Please how you tested these changes here -->

## References

<!-- Please list any issues closed by this PR. -->

<!--
- Closes: <issue link>
-->

<!-- Any other issues or PRs relevant to this PR? Feel free to list them
here. -->
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 27 out of 31 changed files in this pull request and generated no new comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@rayz
Copy link
Copy Markdown
Contributor Author

rayz commented Mar 31, 2026

Closing in favor of #1293

@rayz rayz closed this Mar 31, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/components Sources, transforms, and destinations. area/core Core functionality, event model, etc. destination/dogstatsd-stats DogStatsD Statistics destination. encoder/datadog-traces Datadog Traces encoder. source/otlp OTLP source. transform/apm-stats APM Stats transform. transform/host-tags Host Tags synchronous transform.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants