
fix: use all changelog timestamps to estimate package age #140

Draft
HastD wants to merge 1 commit into coreos:main from HastD:stability-calc

HastD commented Jun 5, 2026

Contributor

In the stability calculation, span_days should cover the full lookback period if the oldest timestamp is earlier than the beginning of the lookback period. For example, if a package was only updated two years ago and again one month ago, the package is over a year old, so we're looking at one update in the past year, not just one update in the past month (as we would for a package with no changelog entries earlier than a month ago).

The previous calculation underestimated stability of packages that had no updates for a long time, followed by a recent update.

Also optimize stability computation by avoiding unnecessary allocations.
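
To make the two-years-ago example concrete, here is a minimal sketch of the span selection described above. It is not the code in this PR: the helper name `changes_and_span`, its signature, and the use of Unix-second timestamps are assumptions made for illustration. It also counts in-window entries with iterator adapters rather than collecting them into a `Vec`.

```rust
const SECS_PER_DAY: u64 = 86_400;

/// Hypothetical helper (not chunkah's actual API): returns the number of
/// changelog entries inside the lookback window and the span, in days,
/// that those entries are averaged over.
fn changes_and_span(
    changelog_times: &[u64],
    buildtime: u64,
    now: u64,
    lookback_days: u64,
) -> (usize, f64) {
    let window_start = now.saturating_sub(lookback_days * SECS_PER_DAY);

    // Oldest known timestamp; fall back to buildtime if there is no changelog.
    let oldest = changelog_times.iter().copied().min().unwrap_or(buildtime);

    // Count in-window entries without allocating an intermediate Vec.
    let count = changelog_times
        .iter()
        .filter(|&&t| t >= window_start && t <= now)
        .count();

    let span_days = if oldest < window_start {
        // The package predates the window, so the whole window was observed.
        lookback_days as f64
    } else {
        // Younger package: only the time since its oldest entry was observed.
        // Clamping to one day is an assumption to avoid dividing by zero later.
        (now.saturating_sub(oldest) as f64 / SECS_PER_DAY as f64).max(1.0)
    };

    (count, span_days)
}
```

With a 365-day lookback, a package with entries 730 and 30 days ago yields one change over 365 days, while a package whose only entry is 30 days old yields one change over 30 days.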

gemini-code-assist Bot left a comment

Code Review

This pull request updates the calculate_stability function in src/utils.rs to determine the oldest timestamp from changelog_times (falling back to buildtime) and adjusts the calculation of span_days depending on whether the oldest timestamp falls within the lookback period. The reviewer suggested a performance optimization to avoid unnecessary heap allocations by using iterator adapters to count relevant changes instead of allocating a vector.

Comment thread src/utils.rs
@HastD HastD changed the title from "fix: use all changelog timestamps to estimate package age" to "feat: improve stability calculation" Jun 7, 2026
jlebon commented Jun 8, 2026

Member

> For example, if a package was only updated two years
> ago and again one month ago, the package is over a year old, so we're
> looking at one update in the past year, not just one update in the
> past month (as we would for a package with no changelog entries
> earlier than a month ago).

Hmm, I'm not sure about this. Or at least it's not obvious to me. The reason we purposely ignore changelog items older than a year is that those old entries are no longer indicative of how frequently the package changes when we have newer information on hand. If, e.g., a package wasn't touched for two years and then had two changes in the last month, that should weigh differently than those three events being evenly spread across the lookback period.

I could imagine special-casing the scenario you describe where there's only n=1 recent event in the lookback period to still bias towards stability.

jlebon commented Jun 8, 2026

Member

One thing we could do which I think would probably be more appropriate is to switch the lambda calculation from being an average to actually weighing the events based on their age e.g. via exponential decay.

But as always with all this, it needs to be data-driven so that we actually measure a noticeable improvement in packing performance and aren't just shooting in the dark.
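
As a rough illustration of the age-weighting idea, the sketch below estimates a per-day change rate in which each event is discounted by an exponential decay and the total is divided by the equally discounted observation time. The function names, the half-life parameter, and the `exp(-rate * period)` stability formula are assumptions for illustration, not chunkah's implementation.

```rust
/// Hypothetical decay-weighted event rate, in events per day.
/// An event that is `age` days old gets weight 2^(-age / half_life_days).
/// Assumes `span_days > 0` and `half_life_days > 0`.
fn decayed_rate(event_ages_days: &[f64], span_days: f64, half_life_days: f64) -> f64 {
    let decay = std::f64::consts::LN_2 / half_life_days;

    // Sum of the weights of all events inside the observed span.
    let weighted_events: f64 = event_ages_days
        .iter()
        .filter(|&&age| age >= 0.0 && age <= span_days)
        .map(|&age| (-decay * age).exp())
        .sum();

    // Integral of the same weight over [0, span_days]: the "effective"
    // number of observed days once older days are discounted.
    let effective_days = (1.0 - (-decay * span_days).exp()) / decay;

    weighted_events / effective_days
}

/// Poisson probability of seeing no change in the next `period_days`
/// (assumed form of the stability score).
fn stability(rate_per_day: f64, period_days: f64) -> f64 {
    (-rate_per_day * period_days).exp()
}
```

Dividing by the discounted span keeps the estimate on the same scale as a plain average when events are evenly spread, while two changes last week produce a higher rate than two changes ten months ago. The half-life is a tuning knob that would have to be chosen from measurements.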

HastD commented Jun 8, 2026

Contributor Author

Should I split this off into two separate PRs? I see your point about disregarding older changelog entries when calculating the lookback period, but I think the change to bin changelog entries by day should be a more unambiguous improvement.

In any case, I'll run some tests to see what difference these changes make.
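
One plausible reading of binning changelog entries by day is that several entries on the same day count as a single change. The sketch below shows that interpretation only. The helper name is hypothetical, timestamps are assumed to be Unix seconds, and the actual change may differ.

```rust
use std::collections::BTreeSet;

const SECS_PER_DAY: u64 = 86_400;

/// Hypothetical helper: collapse changelog timestamps into the set of
/// distinct UTC day indices on which at least one change happened.
fn distinct_change_days(changelog_times: &[u64]) -> BTreeSet<u64> {
    changelog_times.iter().map(|&t| t / SECS_PER_DAY).collect()
}
```

The number of changes is then the size of the returned set rather than the number of raw entries.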

jlebon commented Jun 8, 2026

Member

> Should I split this off into two separate PRs? I see your point about disregarding older changelog entries when calculating the lookback period, but I think the change to bin changelog entries by day should be a more unambiguous improvement.

Yes, please open a separate PR!

HastD commented Jun 8, 2026

Contributor Author

Okay, I opened a separate PR for the changelog binning: #143

Marking this PR as draft for now since it needs further testing.

@HastD HastD marked this pull request as draft June 8, 2026 15:54
@HastD HastD changed the title from "feat: improve stability calculation" to "fix: use all changelog timestamps to estimate package age" Jun 8, 2026
@HastD HastD force-pushed the stability-calc branch from 385d4f8 to 7dc156a June 8, 2026 19:03
HastD commented Jun 8, 2026

Contributor Author

So I just tested out this change (not including the changelog binning that I moved to a separate PR) with the same dataset I looked at before, using the secureblue kinoite-main-hardened images for each day in May 2026 and comparing their manifests: https://github.com/HastD/scratchwork/actions/runs/27159280005

The difference in update sizes compared to the results with the current version of chunkah wasn't all that big, but it did decrease average update sizes by about 2% for both daily updates and every-3-days updates. Hard to tell whether this is statistically significant or just noise, but it at least suggests that this is probably a positive-or-neutral change.

@HastD HastD force-pushed the stability-calc branch from 7dc156a to 1802c95 June 8, 2026 19:33
HastD commented Jun 8, 2026

Contributor Author

Okay, after the changelog binning commit was merged, I redid the above test to compare the effect of this change on top of changelog binning, and this time update sizes increased by 1-2%... so probably it's just noise. 😔

I'm not sure whether it would be better to use some sort of exponential decay weighting, so that more recent data is more influential with a smooth drop-off as the data ages, rather than the current approach of weighting uniformly over a range with a sharp cutoff. It might give accurate estimates sooner for packages whose update activity has genuinely changed, but on the other hand, weighting recent changelog entries too heavily could make the stability score itself less stable over time, increasing the likelihood of packages jumping between stability tiers.

jlebon commented Jun 9, 2026

Member

> Okay, after the changelog binning commit was merged, I redid the above test to compare the effect of this change on top of changelog binning, and this time update sizes increased by 1-2%... so probably it's just noise. 😔

Nice, thanks for testing.

> but on the other hand, weighting recent changelog entries too heavily could make the stability score itself less stable over time, increasing the likelihood of packages jumping between stability tiers

Yeah, that's a valid point.

I'm open to changing the approach here more radically (even e.g. swap out Poisson for something else), assuming it yields better results. Two concerns there are (1) complexity and (2) overfitting to whatever distribution we're benchmarking against, e.g. Fedora (so ideally we cross-check against other distributions/ecosystems).

Were you dissatisfied with some of the packing results BTW or just interested in optimizing things?

HastD commented Jun 9, 2026

Contributor Author

I think the current packing results are satisfactory in the sense that they're comparable to (for daily updates) or moderately better than (for less frequent updates) the results from build-chunked-oci. However, intuitively I feel like there's probably a fair bit of room for improvement.

One idea that I've been thinking about is that, when I look at package data, there seem to be several natural groupings of packages that usually update together; for example, on Kinoite, there's the qt6-* packages, the kf6-* packages, and the KDE Plasma packages (you can see these updating as a group in Bodhi). But Chunkah doesn't do anything special to group these together (and these are higher-level groupings than SRPMs), so they tend to end up scattered throughout their stability tier, which is inefficient.

If there was some way to either automatically detect such groupings (some sort of correlation/clustering analysis on the changelog timestamps, perhaps?) or allow them to be specified via configuration, I suspect this could result in substantial improvements in layer reuse. However, any method of automatically detecting groupings would need to either be quite stable (to minimize how often the groupings change over time) or would probably need to be accompanied by something along the lines of #39 to prevent drift in the layer plan by reading a previous manifest.
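
As a starting point for discussion, the sketch below shows one simple way such groupings could be detected: measure how much two packages' sets of update days overlap (Jaccard similarity) and greedily group packages whose overlap with a group's first member exceeds a threshold. Everything here is hypothetical, including the function names and the threshold value. It is not an existing chunkah feature, and it does not address the stability-over-time concern raised above.

```rust
use std::collections::BTreeSet;

/// Jaccard similarity of two sets of update days (0.0 when both are empty).
fn jaccard(a: &BTreeSet<u64>, b: &BTreeSet<u64>) -> f64 {
    let shared = a.intersection(b).count();
    let total = a.len() + b.len() - shared;
    if total == 0 {
        0.0
    } else {
        shared as f64 / total as f64
    }
}

/// Greedy single-pass grouping: a package joins the first group whose
/// representative (its first member) is at least `threshold` similar to it;
/// otherwise it starts a new group.
fn group_packages<'a>(
    packages: &'a [(&'a str, BTreeSet<u64>)],
    threshold: f64,
) -> Vec<Vec<&'a str>> {
    // Each group stores the index of its representative and its member names.
    let mut groups: Vec<(usize, Vec<&'a str>)> = Vec::new();

    for (idx, (name, days)) in packages.iter().enumerate() {
        let pos = groups
            .iter()
            .position(|(rep, _)| jaccard(&packages[*rep].1, days) >= threshold);
        match pos {
            Some(i) => groups[i].1.push(*name),
            None => groups.push((idx, vec![*name])),
        }
    }

    groups.into_iter().map(|(_, members)| members).collect()
}
```

Because the result depends on input order and on the threshold, a real implementation would likely need a more stable clustering method or a way to carry groupings forward from a previous manifest.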
