Seeder - Adding density distributions by theMickster · Pull Request #7191 · bitwarden/server

theMickster · 2026-03-10T12:17:46Z

🎟️ Tracking

PM-31023 - Relational Density Modeling
PM-32777 - Baked-In Density Preset Profiles

📔 Objective

Add 5 density distributions to the Seeder's preset density block and fix the Hamilton apportionment bug in Distribution<T>. This is PR 2 of 3 splitting #7157 — PR 1 (#7176, preset reorg) is merged, PR 3 (scale presets + docs) follows.

New distributions: collections-per-user, cipher type selection, multi-collection cipher assignment, personal cipher count per user, folder count per user.

Bug fix: Distribution<T> remainder items now use largest-fraction apportionment instead of dumping on the last bucket. Zero-weight buckets get exactly zero.

🧪 Testing

Expand for detailed instructions

Step 1: Verify backward compatibility

Seed the no-density validation preset:

dotnet run -- seed --preset validation.density-modeling-no-density-test --mangle

Verify: 0 CollectionGroup records, uniform round-robin group membership, every cipher assigned to at least one collection. This confirms the null-density path is unchanged.

Step 2: Verify density distributions

Seed a density validation preset:

dotnet run -- seed --preset validation.density-modeling-power-law-test --mangle

Verify: group membership follows power-law decay, permission percentages match configured weights, orphan ciphers present at configured rate.

github-actions · 2026-03-10T12:29:18Z

Checkmarx One – Scan Summary & Details – 4fb5cdf4-ca4e-4ac6-ad41-d6a146158a57

Great job! No new security vulnerabilities introduced in this pull request

codecov · 2026-03-10T12:41:33Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 57.38%. Comparing base (bf9bc84) to head (56c7dc7).
⚠️ Report is 6 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #7191      +/-   ##
==========================================
- Coverage   57.38%   57.38%   -0.01%     
==========================================
  Files        2031     2031              
  Lines       89283    89283              
  Branches     7936     7936              
==========================================
- Hits        51237    51236       -1     
- Misses      36205    36206       +1     
  Partials     1841     1841

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

sonarqubecloud · 2026-03-10T13:22:20Z

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

theMickster · 2026-03-10T13:32:14Z

util/Seeder/Steps/GenerateFoldersStep.cs

            var digest = userDigests[index];
            var range = distribution.Select(index, userDigests.Count);
-            var count = range.Min + (index % Math.Max(range.Max - range.Min, 1));
+            var count = range.Min + (index % Math.Max(range.Max - range.Min + 1, 1));


This was a 🐛 fix... range.Max - range.Min excluded the max value from the reachable range. With (4, 8) the old formula produced 4-7, never 8. The + 1 makes max inclusive. Same fix applied in GeneratePersonalCiphersStep.RangeCalculationTests validates this.

theMickster · 2026-03-10T13:33:19Z

util/Seeder/Steps/CreateCollectionsStep.cs

    }

-    internal static List<CollectionUser> BuildCollectionUsers(
+    internal List<CollectionUser> BuildCollectionUsers(


Changed from static to instance because the new ComputeCollectionsPerUser() method needs _density to read UserCollectionShape and UserCollectionSkew. Three existing tests updated accordingly... Same assertions, just constructing a step instance first.

theMickster · 2026-03-10T13:34:27Z

util/Seeder/Steps/GenerateCiphersStep.cs

            {
                var orphanCount = (int)(count * _density.OrphanCipherRate);
                var nonOrphanCount = count - orphanCount;
+                var primaryIndices = new int[nonOrphanCount];


📝 The multi-collection loop runs after the primary assignment loop completes. We need to know which collection each cipher was originally assigned to, and the collection index isn't stored on the CollectionCipher entity. Capturing it during the first pass gives O(1) lookup instead of searching back through the list.

theMickster · 2026-03-10T13:36:25Z

util/Seeder/Pipeline/PresetLoader.cs

+        }
+        else if (density?.PersonalCipherDistribution is not null)
+        {
+            builder.AddPersonalCiphers(0, density: density);


📝 The countPerUser: 0 looks a little strange and that it would skip personal cipher generation entirely, but it doesn't.

When PersonalCipherDistribution is set on the density profile, the distribution overrides the flat count — each user gets a variable number. The step's early-return guard checks both: if (countPerUser == 0 && density?.PersonalCipherDistribution is null).

theMickster · 2026-03-10T13:38:42Z

util/Seeder/Steps/GeneratePersonalCiphersStep.cs

        context.Registry.CipherIds.AddRange(cipherIds);
    }
+
+    private static int EstimateTotal(int userCount, Distribution<(int Min, int Max)> dist)


This is needed because typeDistribution.Select(globalIndex, total) needs the exact total cipher count upfront to place each cipher in the correct type bucket. When using a distribution, each user gets a different count, so we can't just multiply userDigests.Count * countPerUser. Without pre-computing, the total is wrong and later ciphers overflow into the last bucket (SSHKey) 🤢 🤮.

This was a real bug we caught in development.

claude · 2026-03-10T13:40:43Z

Claude finished @theMickster's task in 3m 28s —— View job

Code Review: Seeder - Adding density distributions

Reviewed commit: 56c7dc79 (Remove unnecessary and confusing guard)
PR: #7191 - Seeder - Adding density distributions

Summary

This PR adds 5 density distributions (collections-per-user, cipher type selection, multi-collection cipher assignment, personal cipher count per user, folder count per user) and fixes a significant Hamilton apportionment bug in Distribution<T>.GetCounts(). This is PR 2 of 3 splitting a larger change.

Review Result: Approved -- No Issues Found

Security: No concerns. This is a seeder/test utility; no production code paths, no sensitive data handling, and no changes to authentication or encryption logic.

Correctness: Verified the following are correct:

Hamilton apportionment fix (Distribution.cs): Remainder items are distributed to buckets with the largest fractional parts instead of being dumped on the last bucket. Zero-weight buckets correctly receive zero items. The -1.0 sentinel after awarding a remainder item prevents double-counting.
Range formula fix (GenerateFoldersStep.cs, GeneratePersonalCiphersStep.cs): The +1 in Math.Max(range.Max - range.Min + 1, 1) correctly makes the max value inclusive. The old formula excluded range.Max from the reachable set.
Multi-collection assignment (GenerateCiphersStep.cs): The secondary index formula (primaryIndices[i] + 1 + j) % collectionIds.Count guarantees no secondary equals primary. The extraCount clamp to collectionIds.Count - 1 prevents overflow. The primaryIndices array is correctly bounded by nonOrphanCount, and multiCount <= nonOrphanCount ensures no out-of-bounds access.
EstimateTotal pre-computation (GeneratePersonalCiphersStep.cs): Uses the same formula as the generation loop, so typeDistribution.Select(globalIndex, expectedTotal) receives an accurate total.
Backward compatibility contract maintained: All steps guard if (_density == null) to preserve the original round-robin path.

Breaking Changes: None. BuildCollectionUsers changed from static to instance, but it is internal and tests were updated accordingly. New parameters on AddFolders() and AddPersonalCiphers() have default values of null.

Test Coverage: Thorough. New tests cover Hamilton apportionment, zero-weight buckets, range formula inclusivity, multi-collection duplicate prevention, and all three ComputeCollectionsPerUser shapes (Uniform, PowerLaw, FrontLoaded). Codecov confirms all modified lines are covered.

CI Status: All checks pass -- build, tests, lint, Checkmarx security scan, SonarQube quality gate, and code coverage.

Add 5 density distributions and fix Hamilton apportionment bug.

9f6eb75

theMickster requested a review from nthompson-bitwarden March 10, 2026 12:19

theMickster added 2 commits March 10, 2026 13:59

Fix silent fallback failure in PresetLoader and fix poor comment

a2fcc1a

Remove unnecessary and confusing guard

56c7dc7

theMickster commented Mar 10, 2026

View reviewed changes

theMickster added the ai-review Request a Claude code review label Mar 10, 2026

theMickster marked this pull request as ready for review March 10, 2026 13:39

nthompson-bitwarden approved these changes Mar 10, 2026

View reviewed changes

theMickster merged commit bf40668 into main Mar 10, 2026
80 checks passed

theMickster deleted the PM-31023/density-distributions-and-fixes branch March 10, 2026 15:22

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Seeder - Adding density distributions #7191

Seeder - Adding density distributions #7191
theMickster merged 3 commits intomainfrom
PM-31023/density-distributions-and-fixes

theMickster commented Mar 10, 2026

Uh oh!

github-actions bot commented Mar 10, 2026 •

edited

Loading

Uh oh!

codecov bot commented Mar 10, 2026 •

edited

Loading

Uh oh!

sonarqubecloud bot commented Mar 10, 2026

Uh oh!

theMickster Mar 10, 2026

Uh oh!

theMickster Mar 10, 2026 •

edited

Loading

Uh oh!

theMickster Mar 10, 2026

Uh oh!

theMickster Mar 10, 2026

Uh oh!

theMickster Mar 10, 2026

Uh oh!

claude bot commented Mar 10, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

theMickster commented Mar 10, 2026

🎟️ Tracking

📔 Objective

🧪 Testing

Step 1: Verify backward compatibility

Step 2: Verify density distributions

Uh oh!

github-actions bot commented Mar 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Great job! No new security vulnerabilities introduced in this pull request

Uh oh!

codecov bot commented Mar 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

sonarqubecloud bot commented Mar 10, 2026

Quality Gate passed

Uh oh!

theMickster Mar 10, 2026

Choose a reason for hiding this comment

Uh oh!

theMickster Mar 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

theMickster Mar 10, 2026

Choose a reason for hiding this comment

Uh oh!

theMickster Mar 10, 2026

Choose a reason for hiding this comment

Uh oh!

theMickster Mar 10, 2026

Choose a reason for hiding this comment

Uh oh!

claude bot commented Mar 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Code Review: Seeder - Adding density distributions

Summary

Review Result: Approved -- No Issues Found

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

github-actions bot commented Mar 10, 2026 •

edited

Loading

codecov bot commented Mar 10, 2026 •

edited

Loading

theMickster Mar 10, 2026 •

edited

Loading

claude bot commented Mar 10, 2026 •

edited

Loading