SDSTOR-22783: priority scheduling with dual watermarks and cross-scan quota #436

Merged
xiaoxichen merged 1 commit into eBay:stable/v4.x from xiaoxichen:gc-sort on Jun 24, 2026

@xiaoxichen
Collaborator
  • Sort eligible chunks by garbage ratio (desc) before submission so the most garbage-heavy chunks are always GC'd first
  • Add gc_garbage_rate_threshold_low (default 30%) as a low watermark; chunks between the two watermarks consume at most half the quota
  • Track m_pending_normal_gc_task_count in pdev_gc_actor to reflect tasks queued or running in m_gc_executor across scan cycles; previous code only capped submissions per scan, allowing unbounded queue growth
  • scan_chunks_for_gc now skips a pdev entirely when already at quota, and derives low_tier_cap proportionally from remaining_capacity
  • Add ADR docs/adr/gc-priority-scheduling.md
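
The selection rules in the description above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from gc_manager.cpp: the `ChunkGCInfo` fields and the `select_for_gc` helper are hypothetical, and the sketch assumes "half the quota" is computed with integer division on `remaining_capacity`. Only the 30% low-watermark default and the "at most half" rule come from the description.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical per-chunk record; the field names are assumptions.
struct ChunkGCInfo {
    std::uint32_t chunk_id;
    std::uint32_t ratio_pct; // garbage ratio in percent, 0..100
};

// Picks the chunks to submit in one scan of one pdev.
// Chunks at or above high_pct may use the whole remaining quota.
// Chunks in [low_pct, high_pct) may use at most half of it.
std::vector<ChunkGCInfo> select_for_gc(std::vector<ChunkGCInfo> eligible, std::uint32_t low_pct,
                                       std::uint32_t high_pct, std::size_t remaining_capacity) {
    assert(low_pct <= high_pct);

    // Most garbage-heavy first; ties broken by chunk_id so the order is deterministic.
    std::sort(eligible.begin(), eligible.end(), [](ChunkGCInfo const& a, ChunkGCInfo const& b) {
        if (a.ratio_pct != b.ratio_pct) return a.ratio_pct > b.ratio_pct;
        return a.chunk_id < b.chunk_id;
    });

    std::size_t const low_tier_cap = remaining_capacity / 2; // assumption: integer half
    std::size_t low_tier_used = 0;
    std::vector<ChunkGCInfo> picked;

    for (auto const& c : eligible) {
        if (picked.size() >= remaining_capacity) break; // quota exhausted
        if (c.ratio_pct < low_pct) break;               // sorted, so nothing later qualifies
        if (c.ratio_pct < high_pct) {
            if (low_tier_used >= low_tier_cap) break;   // low tier has used its half
            ++low_tier_used;
        }
        picked.push_back(c);
    }
    return picked;
}
```

With an illustrative high watermark of 80% (an arbitrary value chosen for this example) and a remaining capacity of 4, two chunks above 80% are taken first, and at most two chunks between 30% and 80% fill the rest.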

@codecov-commenter

codecov-commenter commented Jun 17, 2026


Codecov Report

❌ Patch coverage is 66.00000% with 17 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (stable/v4.x@5a3b409). Learn more about missing BASE report.

Files with missing lines | Patch % | Lines
src/lib/homestore_backend/gc_manager.cpp | 64.58% | 13 Missing and 4 partials ⚠️
Additional details and impacted files
@@              Coverage Diff               @@
##             stable/v4.x     #436   +/-   ##
==============================================
  Coverage               ?   54.41%           
==============================================
  Files                  ?       36           
  Lines                  ?     5419           
  Branches               ?      685           
==============================================
  Hits                   ?     2949           
  Misses                 ?     2164           
  Partials               ?      306           

Comment thread src/lib/homestore_backend/gc_manager.cpp Outdated
Comment thread src/lib/homestore_backend/gc_manager.cpp
eligible.push_back({chunk_id, ratio_pct});
}

// Sort eligible chunks by garbage ratio descending so the most garbage-heavy chunks are GC'd first.

Collaborator

This is a top-K problem.
I suggest using a heap (priority_queue) with a capacity of remaining_capacity, so that we hold at most remaining_capacity ChunkGCInfo entries in memory. For each new ChunkGCInfo, we only need to compare it with the top of the heap.

Collaborator Author

IMO it isn't worth those extra lines of code, considering the maximum number of chunks is 32K.

Collaborator

It is better to have it, since it is not complicated and will also make the code simpler. For example, the loop over the entire chunk collection will no longer be needed.
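
The bounded heap suggested in this thread could look roughly like the sketch below. It is an illustration of the general top-K technique with `std::priority_queue`, not the merged implementation: the `ChunkGCInfo` fields, the `MinRatioOnTop` comparator, and the `TopKChunks` class are hypothetical names. The heap keeps the smallest retained ratio on top, so each new candidate needs only one comparison against `top()`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <queue>
#include <vector>

// Hypothetical per-chunk record; the field names are assumptions.
struct ChunkGCInfo {
    std::uint32_t chunk_id;
    std::uint32_t ratio_pct; // garbage ratio in percent
};

// Comparator that turns std::priority_queue into a min-heap on ratio_pct.
struct MinRatioOnTop {
    bool operator()(ChunkGCInfo const& a, ChunkGCInfo const& b) const { return a.ratio_pct > b.ratio_pct; }
};

// Keeps at most k chunks with the highest garbage ratio seen so far.
class TopKChunks {
public:
    explicit TopKChunks(std::size_t k) : m_k(k) {}

    void offer(ChunkGCInfo const& c) {
        if (m_k == 0) return;
        if (m_heap.size() < m_k) {
            m_heap.push(c);
        } else if (c.ratio_pct > m_heap.top().ratio_pct) {
            m_heap.pop(); // evict the least garbage-heavy retained chunk
            m_heap.push(c);
        }
        assert(m_heap.size() <= m_k); // memory stays bounded by k
    }

    // Empties the heap and returns the retained chunks, highest ratio first.
    std::vector<ChunkGCInfo> drain_descending() {
        std::vector<ChunkGCInfo> out;
        out.reserve(m_heap.size());
        while (!m_heap.empty()) {
            out.push_back(m_heap.top());
            m_heap.pop();
        }
        std::reverse(out.begin(), out.end());
        return out;
    }

private:
    std::size_t m_k;
    std::priority_queue<ChunkGCInfo, std::vector<ChunkGCInfo>, MinRatioOnTop> m_heap;
};
```

A scan would construct `TopKChunks` with `remaining_capacity`, call `offer` once per eligible chunk, and submit the result of `drain_descending()`. Memory is bounded by K entries, and each `offer` costs at most O(log K).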

Comment thread src/lib/homestore_backend/gc_manager.cpp
Comment thread src/lib/homestore_backend/gc_manager.cpp
Comment thread src/lib/homestore_backend/gc_manager.cpp Outdated
@xiaoxichen

Collaborator Author

@JacksonYao287 addressed

JacksonYao287 previously approved these changes Jun 24, 2026
Sort eligible chunks by garbage ratio using a bounded top-K min-heap
(O(K) per pdev) so the most garbage-heavy chunks GC first. Add
gc_garbage_rate_threshold_low (default 30%) as a low watermark; chunks
between the two watermarks consume at most half the quota.

Track m_pending_normal_gc_task_count across scan cycles to cap
queued+running tasks (not just per-scan submissions); skip pdevs
already at quota. Fix counter underflow by incrementing before
m_gc_executor->add(), and fix quota leak by decrementing on the
early-return path in process_gc_task.

Add ADR docs/adr/gc-priority-scheduling.md.

Signed-off-by: Xiaoxi Chen <xiaoxchen@ebay.com>
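
The counter handling described in the commit message can be illustrated with a small sketch. Everything here is a simplified stand-in: `FakeExecutor`, `PdevGcActorSketch`, `submit`, and `release_slot` are hypothetical names, and `m_pending` plays the role of `m_pending_normal_gc_task_count`. The sketch also assumes a single scanning thread per pdev, so the check-then-increment in `submit` is not raced by another submitter.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical executor stand-in: queues tasks and runs them on demand.
struct FakeExecutor {
    std::vector<std::function<void()>> queue;
    void add(std::function<void()> task) { queue.push_back(std::move(task)); }
    void run_all() {
        auto tasks = std::move(queue);
        queue.clear();
        for (auto& t : tasks) t();
    }
};

class PdevGcActorSketch {
public:
    PdevGcActorSketch(FakeExecutor& executor, std::uint64_t quota) : m_executor(executor), m_quota(quota) {}

    std::uint64_t pending() const { return m_pending.load(); }

    // Quota left for this pdev, counting tasks that are queued or still running.
    std::uint64_t remaining_capacity() const {
        auto const p = m_pending.load();
        return p >= m_quota ? 0 : m_quota - p;
    }

    // Returns false when the pdev is already at quota, so the scan can skip it.
    bool submit(bool chunk_still_needs_gc) {
        if (remaining_capacity() == 0) return false;
        // Increment BEFORE add(): a real executor may run the task, and its
        // decrement, before add() returns, which would underflow the counter.
        m_pending.fetch_add(1);
        m_executor.add([this, chunk_still_needs_gc] { process_gc_task(chunk_still_needs_gc); });
        return true;
    }

private:
    void process_gc_task(bool chunk_still_needs_gc) {
        if (!chunk_still_needs_gc) {
            release_slot(); // the early-return path must also give the slot back
            return;
        }
        // ... the actual GC work would happen here ...
        release_slot();
    }

    void release_slot() {
        auto const prev = m_pending.fetch_sub(1);
        assert(prev > 0); // would fire on an underflow
        (void)prev;
    }

    FakeExecutor& m_executor;
    std::uint64_t m_quota;
    std::atomic<std::uint64_t> m_pending{0};
};
```

With a quota of 2, a third `submit` is rejected until the queued tasks have run. Because the early-return path also releases its slot, the full quota becomes available again afterwards.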
@xiaoxichen xiaoxichen changed the title gc: priority scheduling with dual watermarks and cross-scan quota SDSTOR-22783: priority scheduling with dual watermarks and cross-scan quota Jun 24, 2026
@xiaoxichen xiaoxichen merged commit 6722a38 into eBay:stable/v4.x Jun 24, 2026
26 checks passed
3 participants