
chore: consolidate SourceSet variants into PopulationRange #826

Open
cdc-as81 wants to merge 6 commits into main from cdc-as81-add-population-range-source-set

Collaborator

cdc-as81 commented Mar 26, 2026

Summary

Add contiguous-range support to SourceSet/EntitySet while preserving the existing whole-population snapshot path.

This PR introduces range-backed helpers for empty sets, singleton sets, and arbitrary contiguous EntityId intervals. It keeps FullPopulation and PopulationIterator as the dedicated representation for "the entire population at iterator creation time", and adds a separate iterator path for non-population ranges.
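
As a rough illustration of the separate iterator path for non-population ranges, the sketch below wraps a half-open `Range<usize>`. The names `EntityId(usize)` and `EntityIdRangeIter` are simplified stand-ins invented for this example, not the crate's actual types (the real `EntityId<E>` is generic), and the half-open convention is an assumption.

```rust
use std::ops::Range;

/// Hypothetical, non-generic stand-in for the crate's `EntityId<E>`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct EntityId(pub usize);

/// Illustrative iterator over a contiguous, half-open interval of ids.
#[derive(Debug, Clone)]
pub struct EntityIdRangeIter {
    inner: Range<usize>,
}

impl EntityIdRangeIter {
    /// Arbitrary contiguous interval `start..end`.
    pub fn new(range: Range<usize>) -> Self {
        Self { inner: range }
    }

    /// The empty set, represented as a zero-length range.
    pub fn empty() -> Self {
        Self::new(0..0)
    }

    /// A one-element set. Assumes `id.0 < usize::MAX` so `id.0 + 1` does not overflow.
    pub fn singleton(id: EntityId) -> Self {
        Self::new(id.0..id.0 + 1)
    }

    /// Membership among the ids that have not been yielded yet.
    pub fn contains(&self, id: EntityId) -> bool {
        self.inner.contains(&id.0)
    }
}

impl Iterator for EntityIdRangeIter {
    type Item = EntityId;

    fn next(&mut self) -> Option<EntityId> {
        self.inner.next().map(EntityId)
    }

    /// Delegates to `Range<usize>`, whose size hint is exact.
    fn size_hint(&self) -> (usize, Option<usize>) {
        self.inner.size_hint()
    }
}

impl ExactSizeIterator for EntityIdRangeIter {}
```

Because the underlying range advances its start as items are yielded, `contains` here reports membership in the remaining ids only; whether the PR's iterators define membership the same way after partial consumption is not stated in the description.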

What Changed

  • Added population-backed helper constructors:
    • SourceSet::population_range(...)
    • SourceSet::empty_range()
    • SourceSet::singleton(...)
    • SourceSet::full_population(...)
  • Added EntityIdRangeIterator and SourceIterator::PopulationRange for arbitrary contiguous intervals.
  • Kept PopulationIterator as the whole-population iterator instead of broadening it to arbitrary ranges.
  • Extended EntitySet simplification logic with interval-aware cases for:
    • union of overlapping or adjacent ranges
    • intersection of overlapping ranges
    • difference when the result is still a single contiguous range
  • Added conservative subset checks to enable safe range-based simplifications.
  • Updated whole-population query call sites to use SourceSet::full_population(...).
  • Expanded tests around:
    • range helper behavior
    • iterator membership after partial consumption
    • interval-aware simplification
    • clone preservation for property-backed and composite sets
    • exact size hints after simplification
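
The interval-aware cases listed above can be sketched on plain half-open ranges. This is a minimal illustration, not the PR's code: the `Ids` alias and the free functions are hypothetical, and the subset check is exact here only because both operands are ranges (the PR's "conservative" checks presumably answer "not a subset" whenever they cannot tell).

```rust
use std::ops::Range;

/// Hypothetical alias: a half-open interval of raw entity ids.
pub type Ids = Range<usize>;

fn is_empty(r: &Ids) -> bool {
    r.start >= r.end
}

/// Union of two ranges, or `None` when a gap keeps the result from being contiguous.
pub fn union(a: &Ids, b: &Ids) -> Option<Ids> {
    if is_empty(a) {
        return Some(b.clone());
    }
    if is_empty(b) {
        return Some(a.clone());
    }
    // Overlapping or adjacent: neither range starts past the other's end.
    if a.start <= b.end && b.start <= a.end {
        Some(a.start.min(b.start)..a.end.max(b.end))
    } else {
        None
    }
}

/// Intersection of two ranges; disjoint inputs give the canonical empty range `0..0`.
pub fn intersection(a: &Ids, b: &Ids) -> Ids {
    let start = a.start.max(b.start);
    let end = a.end.min(b.end);
    if start < end { start..end } else { 0..0 }
}

/// True when every id in `a` is also in `b`.
pub fn is_subset(a: &Ids, b: &Ids) -> bool {
    is_empty(a) || (b.start <= a.start && a.end <= b.end)
}

/// `a` minus `b`, or `None` when `b` cuts `a` into two pieces.
/// The returned range may be empty (for example when `b` covers all of `a`).
pub fn difference(a: &Ids, b: &Ids) -> Option<Ids> {
    let overlap = intersection(a, b);
    if is_empty(&overlap) {
        return Some(a.clone()); // nothing to remove
    }
    if overlap.start == a.start {
        return Some(overlap.end..a.end); // `b` removes a prefix of `a`
    }
    if overlap.end == a.end {
        return Some(a.start..overlap.start); // `b` removes a suffix of `a`
    }
    None // `b` sits strictly inside `a`
}
```

Returning `None` from `union` and `difference` mirrors the "only when the result is still a single contiguous range" condition: the caller would keep the original composite set in that case rather than simplify it.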

@cdc-as81 cdc-as81 linked an issue Mar 26, 2026 that may be closed by this pull request
github-actions bot added a commit that referenced this pull request Mar 26, 2026
@github-actions

Benchmark Results

Hyperfine

Command Mean [ms] Min [ms] Max [ms] Relative
large_sir::baseline 2.9 ± 0.1 2.8 3.1 1.00
large_sir::entities 12.2 ± 0.3 11.8 13.2 4.24 ± 0.13

Criterion

Regressions (slower)
Group Bench Param Change CI Lower CI Upper
sampling sampling_single_known_length_entities 10.652% 9.732% 11.601%
large_dataset bench_query_population_derived_property_entities 6.763% 5.464% 8.046%
algorithm_benches algorithm_sampling_multiple_known_length 2.959% 2.149% 3.773%
sample_entity sample_entity_single_property_unindexed 1000 2.386% 1.824% 3.169%
indexing with_query_results_multiple_individually_indexed_properties_enti 1.980% 1.483% 2.340%
Improvements (faster)
Group Bench Param Change CI Lower CI Upper
sample_entity sample_entity_whole_population 100000 -44.748% -45.347% -44.115%
sample_entity sample_entity_whole_population 1000 -43.516% -44.106% -42.955%
sample_entity sample_entity_whole_population 10000 -40.330% -41.150% -39.597%
counts multi_property_indexed_entities -21.941% -22.696% -21.111%
sample_entity sample_entity_single_property_unindexed 100000 -19.072% -19.675% -18.414%
indexing query_people_indexed_multi-property_entities -18.250% -18.501% -17.967%
counts multi_property_unindexed_entities -16.366% -17.516% -15.126%
indexing query_people_count_single_indexed_property_entities -12.639% -12.943% -12.240%
indexing with_query_results_single_indexed_property_entities -12.174% -12.743% -11.679%
sampling sampling_single_l_reservoir_entities -11.128% -12.303% -10.119%
sampling sampling_multiple_l_reservoir_entities -10.799% -11.691% -9.976%
indexing query_people_multiple_individually_indexed_properties_entities -8.460% -8.623% -8.275%
indexing query_people_count_multiple_individually_indexed_properties_enti -7.648% -7.775% -7.539%
sampling count_and_sampling_single_unindexed_concrete_plus_derived_entiti -5.750% -5.901% -5.595%
indexing query_people_count_indexed_multi-property_entities -4.700% -5.182% -4.203%
sample_entity sample_entity_multi_property_indexed 100000 -4.329% -4.580% -3.989%
sample_entity sample_entity_multi_property_indexed 1000 -4.186% -4.694% -3.682%
large_dataset bench_match_entity -4.093% -5.196% -2.601%
sample_entity sample_entity_multi_property_indexed 10000 -3.911% -4.282% -3.438%
sampling sampling_single_unindexed_concrete_plus_derived_entities -3.546% -3.914% -3.224%
sample_entity sample_entity_single_property_indexed 1000 -3.537% -3.945% -3.055%
indexing with_query_results_indexed_multi-property_entities -3.399% -4.429% -2.606%
counts single_property_indexed_entities -3.065% -3.897% -1.829%
sampling sampling_single_unindexed_entities -2.295% -2.687% -1.906%
sample_entity sample_entity_single_property_indexed 10000 -2.212% -2.674% -1.670%
sampling sampling_multiple_known_length_entities -2.168% -3.014% -1.422%
sample_entity sample_entity_single_property_indexed 100000 -2.018% -2.445% -1.506%
algorithm_benches algorithm_sampling_multiple_l_reservoir -1.817% -2.331% -1.411%
examples example-basic-infection -1.789% -2.225% -1.412%
counts concrete_plus_derived_unindexed_entities -1.605% -2.349% -1.059%
Unchanged / inconclusive (CI crosses 0%)
Group Bench Param Change CI Lower CI Upper
counts single_property_unindexed_entities -1.982% -3.251% -0.964%
large_dataset bench_query_population_multi_unindexed_entities -1.447% -2.866% -0.258%
large_dataset bench_filter_indexed_entity -1.194% -13.841% 12.609%
counts index_after_adding_entities -1.190% -1.374% -0.996%
large_dataset bench_query_population_indexed_property_entities -0.824% -1.384% -0.361%
examples example-births-deaths -0.684% -0.973% -0.417%
algorithm_benches algorithm_sampling_single_l_reservoir 0.660% 0.402% 1.003%
large_dataset bench_filter_unindexed_entity 0.611% -3.929% 5.534%
large_dataset bench_query_population_property_entities 0.451% -0.204% 1.255%
large_dataset bench_query_population_multi_indexed_entities 0.375% -0.032% 0.887%
algorithm_benches algorithm_sampling_single_known_length 0.359% -0.112% 1.046%
counts reindex_after_adding_more_entities 0.357% 0.127% 0.563%
sampling sampling_multiple_unindexed_entities 0.228% 0.123% 0.376%
algorithm_benches algorithm_sampling_single_rand_reservoir 0.198% -0.055% 0.525%
indexing query_people_single_indexed_property_entities 0.106% 0.034% 0.184%
sampling count_and_sampling_single_known_length_entities 0.038% -1.172% 1.175%
sample_entity sample_entity_single_property_unindexed 10000 0.010% -0.735% 0.882%

Comment thread src/entity/entity_set/entity_set.rs
)
}
/// Returns the contained entity id if this set is a singleton leaf.
fn as_singleton(&self) -> Option<EntityId<E>> {
Collaborator


I suspect once you start thinking about the simplification cases for intervals, this private helper method will disappear or be replaced with an interval analog.
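
One way to read that suggestion, sketched with made-up types: a general `as_range` helper on range leaves, with the singleton case derived from it as a range of length one. `SetExpr` and both method bodies are hypothetical and only illustrate the shape such an interval analog might take.

```rust
use std::ops::Range;

/// Hypothetical, heavily simplified set expression over raw ids.
pub enum SetExpr {
    /// A contiguous, half-open interval of ids.
    Range(Range<usize>),
    /// A composite node that has not been simplified.
    Union(Box<SetExpr>, Box<SetExpr>),
}

impl SetExpr {
    /// Returns the interval if this set is a single range leaf.
    pub fn as_range(&self) -> Option<Range<usize>> {
        match self {
            SetExpr::Range(r) => Some(r.clone()),
            _ => None,
        }
    }

    /// The singleton case falls out of `as_range`: a range of length exactly one.
    pub fn as_singleton(&self) -> Option<usize> {
        self.as_range()
            .filter(|r| r.end.checked_sub(r.start) == Some(1))
            .map(|r| r.start)
    }
}
```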

Comment thread src/entity/entity_set/entity_set.rs Outdated
Comment thread src/entity/entity.rs
Comment thread src/entity/entity.rs Outdated
@github-actions

github-actions bot commented Apr 1, 2026

Benchmark Results

Hyperfine

Command Mean [ms] Min [ms] Max [ms] Relative
large_sir::baseline 2.8 ± 0.0 2.8 2.9 1.00
large_sir::entities 12.8 ± 0.1 12.6 13.2 4.54 ± 0.07

Criterion

Regressions (slower)
Group Bench Param Change CI Lower CI Upper
sample_entity sample_entity_single_property_unindexed 10000 10.556% 9.544% 11.479%
sampling sampling_single_l_reservoir_entities 8.454% 7.965% 8.871%
examples example-births-deaths 5.802% 4.818% 7.137%
sample_entity sample_entity_multi_property_indexed 10000 5.232% 4.623% 5.813%
counts reindex_after_adding_more_entities 4.752% 4.511% 4.990%
sample_entity sample_entity_multi_property_indexed 1000 3.906% 3.317% 4.459%
sampling sampling_multiple_unindexed_entities 2.807% 2.646% 3.102%
sample_entity sample_entity_multi_property_indexed 100000 2.655% 2.149% 3.135%
indexing with_query_results_indexed_multi-property_entities 2.454% 2.042% 2.843%
algorithm_benches algorithm_sampling_multiple_l_reservoir 2.219% 1.714% 2.800%
indexing with_query_results_multiple_individually_indexed_properties_enti 1.834% 1.529% 2.111%
Improvements (faster)
Group Bench Param Change CI Lower CI Upper
large_dataset bench_filter_indexed_entity -10.372% -20.098% -1.495%
indexing query_people_count_multiple_individually_indexed_properties_enti -8.501% -9.045% -7.997%
sample_entity sample_entity_single_property_unindexed 1000 -7.080% -7.773% -6.359%
large_dataset bench_match_entity -7.037% -7.244% -6.832%
counts single_property_indexed_entities -6.557% -7.145% -5.693%
indexing query_people_single_indexed_property_entities -6.418% -8.303% -4.515%
algorithm_benches algorithm_sampling_single_known_length -5.007% -5.562% -4.241%
sample_entity sample_entity_single_property_indexed 10000 -3.508% -3.922% -3.224%
sampling count_and_sampling_single_unindexed_concrete_plus_derived_entiti -3.426% -3.752% -3.057%
sample_entity sample_entity_single_property_unindexed 100000 -3.036% -3.277% -2.807%
sampling sampling_multiple_l_reservoir_entities -2.872% -3.099% -2.642%
indexing query_people_indexed_multi-property_entities -2.863% -3.114% -2.585%
examples example-basic-infection -2.231% -2.650% -1.846%
sampling sampling_single_unindexed_concrete_plus_derived_entities -2.205% -2.560% -1.857%
sample_entity sample_entity_single_property_indexed 1000 -1.999% -2.461% -1.645%
algorithm_benches algorithm_sampling_multiple_known_length -1.992% -2.580% -1.623%
sampling sampling_multiple_known_length_entities -1.968% -2.663% -1.285%
counts multi_property_unindexed_entities -1.915% -2.207% -1.416%
sample_entity sample_entity_single_property_indexed 100000 -1.834% -2.239% -1.296%
sampling sampling_single_known_length_entities -1.819% -2.363% -1.282%
Unchanged / inconclusive (CI crosses 0%)
Group Bench Param Change CI Lower CI Upper
large_dataset bench_filter_unindexed_entity -4.547% -8.200% -0.658%
indexing query_people_count_indexed_multi-property_entities 1.450% 0.652% 2.244%
indexing query_people_count_single_indexed_property_entities 1.431% 0.855% 1.999%
large_dataset bench_query_population_multi_indexed_entities -1.094% -1.389% -0.839%
counts multi_property_indexed_entities -1.084% -1.742% -0.595%
sample_entity sample_entity_whole_population 1000 -1.058% -2.248% -0.127%
indexing query_people_multiple_individually_indexed_properties_entities 1.047% 0.646% 1.382%
large_dataset bench_query_population_indexed_property_entities 0.894% 0.571% 1.330%
indexing with_query_results_single_indexed_property_entities 0.842% 0.384% 1.485%
large_dataset bench_query_population_multi_unindexed_entities 0.581% 0.224% 1.071%
sample_entity sample_entity_whole_population 100000 -0.571% -0.969% -0.274%
sample_entity sample_entity_whole_population 10000 -0.448% -0.830% -0.126%
algorithm_benches algorithm_sampling_single_rand_reservoir 0.444% 0.278% 0.592%
counts index_after_adding_entities -0.416% -0.544% -0.299%
large_dataset bench_query_population_derived_property_entities 0.389% -0.066% 0.898%
algorithm_benches algorithm_sampling_single_l_reservoir -0.360% -0.528% -0.163%
counts single_property_unindexed_entities 0.271% -0.273% 0.850%
sampling sampling_single_unindexed_entities -0.197% -0.372% -0.043%
sampling count_and_sampling_single_known_length_entities -0.141% -0.487% 0.337%
large_dataset bench_query_population_property_entities -0.090% -0.530% 0.232%
counts concrete_plus_derived_unindexed_entities -0.086% -0.345% 0.147%

@github-actions

Benchmark Results

Hyperfine

Command Mean [ms] Min [ms] Max [ms] Relative
large_sir::baseline 2.9 ± 0.0 2.8 3.0 1.00
large_sir::entities 11.1 ± 0.1 10.8 11.6 3.90 ± 0.07

Criterion

Regressions (slower)
Group Bench Param Change CI Lower CI Upper
examples example-births-deaths 18.634% 17.152% 19.972%
indexing query_people_count_single_indexed_property_entities 15.178% 13.688% 17.012%
indexing with_query_results_single_indexed_property_entities 13.903% 13.167% 14.686%
sampling sampling_single_l_reservoir_entities 11.094% 10.924% 11.271%
sample_entity sample_entity_single_property_unindexed 10000 10.770% 8.320% 13.397%
large_dataset bench_query_population_multi_unindexed_entities 8.930% 7.180% 10.688%
sample_entity sample_entity_multi_property_indexed 10000 8.684% 8.190% 9.136%
examples example-basic-infection 7.637% 7.059% 8.166%
sample_entity sample_entity_multi_property_indexed 100000 6.750% 6.166% 7.231%
large_dataset bench_filter_unindexed_entity 6.072% 2.796% 9.607%
large_dataset bench_match_entity 5.740% 5.416% 6.151%
sampling sampling_single_unindexed_entities 5.444% 5.332% 5.547%
sampling sampling_multiple_unindexed_entities 4.998% 4.900% 5.080%
indexing query_people_indexed_multi-property_entities 4.488% 4.304% 4.688%
large_dataset bench_query_population_derived_property_entities 4.284% 2.983% 5.550%
indexing query_people_multiple_individually_indexed_properties_entities 4.090% 3.762% 4.418%
sample_entity sample_entity_multi_property_indexed 1000 4.005% 2.989% 4.851%
sampling sampling_single_unindexed_concrete_plus_derived_entities 2.220% 1.967% 2.509%
sample_entity sample_entity_whole_population 1000 1.555% 1.305% 1.776%
counts concrete_plus_derived_unindexed_entities 1.485% 1.228% 1.889%
Improvements (faster)
Group Bench Param Change CI Lower CI Upper
large_dataset bench_filter_indexed_entity -17.877% -25.738% -9.842%
indexing query_people_count_multiple_individually_indexed_properties_enti -5.604% -5.968% -5.319%
sampling sampling_single_known_length_entities -4.066% -4.393% -3.633%
indexing with_query_results_indexed_multi-property_entities -2.166% -2.747% -1.653%
sample_entity sample_entity_single_property_unindexed 1000 -1.673% -2.311% -1.070%
sampling sampling_multiple_l_reservoir_entities -1.571% -1.638% -1.486%
Unchanged / inconclusive (CI crosses 0%)
Group Bench Param Change CI Lower CI Upper
sampling sampling_multiple_known_length_entities 1.700% -0.557% 3.409%
sample_entity sample_entity_whole_population 10000 1.433% 0.794% 2.321%
counts multi_property_unindexed_entities -1.134% -1.371% -0.911%
indexing query_people_count_indexed_multi-property_entities -1.115% -1.597% -0.416%
sampling count_and_sampling_single_unindexed_concrete_plus_derived_entiti -1.049% -1.257% -0.798%
sampling count_and_sampling_single_known_length_entities -0.950% -1.403% -0.235%
algorithm_benches algorithm_sampling_multiple_l_reservoir -0.937% -1.422% -0.543%
sample_entity sample_entity_whole_population 100000 0.930% 0.460% 1.325%
indexing query_people_single_indexed_property_entities -0.919% -2.348% 0.626%
large_dataset bench_query_population_multi_indexed_entities 0.753% 0.069% 1.428%
counts reindex_after_adding_more_entities -0.716% -0.999% -0.429%
sample_entity sample_entity_single_property_indexed 10000 -0.626% -1.134% -0.167%
sample_entity sample_entity_single_property_indexed 100000 -0.588% -0.866% -0.306%
sample_entity sample_entity_single_property_indexed 1000 -0.560% -0.897% -0.178%
sample_entity sample_entity_single_property_unindexed 100000 0.365% 0.098% 0.668%
algorithm_benches algorithm_sampling_single_known_length -0.359% -1.193% 0.280%
large_dataset bench_query_population_indexed_property_entities 0.341% 0.108% 0.552%
indexing with_query_results_multiple_individually_indexed_properties_enti 0.271% 0.026% 0.663%
counts index_after_adding_entities 0.250% 0.052% 0.478%
large_dataset bench_query_population_property_entities 0.240% -0.525% 1.099%
algorithm_benches algorithm_sampling_single_rand_reservoir -0.231% -0.641% 0.060%
counts multi_property_indexed_entities 0.208% -0.199% 0.498%
counts single_property_indexed_entities 0.206% -0.362% 0.802%
counts single_property_unindexed_entities -0.143% -0.374% 0.071%
algorithm_benches algorithm_sampling_single_l_reservoir 0.077% -0.137% 0.333%
algorithm_benches algorithm_sampling_multiple_known_length -0.019% -0.594% 0.457%

github-actions bot added a commit that referenced this pull request Apr 10, 2026

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add a SourceSet variant for a range of EntityIds

3 participants