Skip to content

Run fetch_pending_data on a dedicated reactor#895

Open
xiaoxichen wants to merge 1 commit into
eBay:stable/v7.xfrom
xiaoxichen:fetcher
Open

Run fetch_pending_data on a dedicated reactor#895
xiaoxichen wants to merge 1 commit into
eBay:stable/v7.xfrom
xiaoxichen:fetcher

Conversation

@xiaoxichen

Copy link
Copy Markdown
Collaborator

Previously fetch_pending_data shared the raft_repl_svc_timer fiber with flush_durable_commit_lsn, gc_repl_, and monitor_replace_member_. When flush_durable_commit_lsn blocked on synchronous metablk I/O (under m_rd_map_mtx), queued fetch batches sat past consensus.data_receive_timeout_ms (10s), tripping the 'Data fetch timeout' assertion / TIMEOUT path.

Spawn a dedicated raft_repl_fetcher reactor (1 fiber) that owns only the fetch timer. Queue and locking stay global; only the consumer thread changes.

Previously fetch_pending_data shared the raft_repl_svc_timer fiber with
flush_durable_commit_lsn, gc_repl_*, and monitor_replace_member_*. When
flush_durable_commit_lsn blocked on synchronous metablk I/O (under
m_rd_map_mtx), queued fetch batches sat past consensus.data_receive_timeout_ms
(10s), tripping the 'Data fetch timeout' assertion / TIMEOUT path.

Spawn a dedicated raft_repl_fetcher reactor (1 fiber) that owns only the
fetch timer. Queue and locking stay global; only the consumer thread changes.

Signed-off-by: Xiaoxi Chen <xiaoxchen@ebay.com>
@codecov-commenter

Copy link
Copy Markdown

⚠️ Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

❌ Patch coverage is 61.11111% with 7 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (stable/v7.x@ef9ab9d). Learn more about missing BASE report.

Files with missing lines Patch % Lines
src/lib/replication/service/raft_repl_service.cpp 61.11% 0 Missing and 7 partials ⚠️
❗ Your organization needs to install the Codecov GitHub app to enable full functionality.
Additional details and impacted files
@@              Coverage Diff               @@
##             stable/v7.x     #895   +/-   ##
==============================================
  Coverage               ?   48.21%           
==============================================
  Files                  ?      110           
  Lines                  ?    12964           
  Branches               ?     6229           
==============================================
  Hits                   ?     6251           
  Misses                 ?     2574           
  Partials               ?     4139           

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@JacksonYao287 JacksonYao287 left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

break;
}
auto const next_batch = rreqs;
auto rdev = d;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NIT: we`d better add check to see the state of rdev here. only for the repl_dev with a state of repl_dev_stage_t::ACTIVE, we do check and fetch data.

for pending fetch of repl_dev with other state, we can probably drop it.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah I dont want to change logic as of now. But it is a good point.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

feel free to merge it , or you can do it if you want in this PR .

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants