Skip to content

Fix Q-Block re-entrancy in rate-limiting pacing#1973

Open
yavuzhaliloglu wants to merge 3 commits into
obgm:developfrom
yavuzhaliloglu:feature/q-block-response-fix
Open

Fix Q-Block re-entrancy in rate-limiting pacing#1973
yavuzhaliloglu wants to merge 3 commits into
obgm:developfrom
yavuzhaliloglu:feature/q-block-response-fix

Conversation

@yavuzhaliloglu
Copy link
Copy Markdown

@yavuzhaliloglu yavuzhaliloglu commented Apr 30, 2026

Problem

When a 4.08 (Request Entity Incomplete) or 2.31 (Continue) response triggers retransmission of Q-Block packets, the response handler calls coap_send_internal() re-entrantly from inside an outer pacing wait loop. The recursive entry was bypassing pacing entirely because session->is_rate_limiting was already set by the outer frame, causing retransmitted blocks to burst out without delay.

Fix

Two-part fix:

  1. Add a new session->is_doing_q_block flag, set when processing a Q-Block response handler or timer-driven Q-Block restart. The pacing gate in coap_send_internal() now applies pacing when either is_rate_limiting is unset OR is_doing_q_block is set, so re-entrant Q-Block sends still go through the wait loop.

  2. Save and restore the previous value of is_rate_limiting around the inner pacing loop, so nested invocations don't clobber the outer frame's state when they exit.

Testing

Tested against Contiki-NG / 6TiSCH constrained nodes running RFC 9177 Q-Block subsystem. Retransmitted blocks now respect the configured pacing interval under -I rate_limit_ppm.

Copilot AI review requested due to automatic review settings April 30, 2026 08:56
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR addresses a re-entrancy corner case where Q-Block retransmissions (triggered by 4.08 / 2.31 handling) can occur while NON pacing is already active, causing retransmitted blocks to bypass pacing and/or overlap other Q-Block send flows.

Changes:

  • Adds a per-session is_doing_q_block flag to mark “Q-Block send/handler in progress” conditions.
  • Updates NON pacing gating in coap_send_internal() to allow pacing to run even during rate-limited re-entrant paths when Q-Block processing is active.
  • Prevents overlapping timer-driven Q-Block “next MAX_PAYLOADS set” pushes and ensures nested Q-Block retransmit handlers save/restore the flag correctly.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File Description
src/coap_net.c Alters the NON pacing gate in coap_send_internal() to treat Q-Block re-entrancy specially.
src/coap_block.c Adds is_doing_q_block guards/set-reset around timer-driven Q-Block sends and 4.08 retransmit handling to avoid overlaps and preserve state in nested paths.
include/coap3/coap_session_internal.h Introduces the new coap_session_t::is_doing_q_block state flag and documents its intent.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/coap_net.c
Comment thread include/coap3/coap_session_internal.h Outdated
Comment thread src/coap_net.c
yavuzhaliloglu and others added 2 commits April 30, 2026 13:05
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
…cy issue

When coap_send_internal() is called recursively from a Q-Block response
handler (4.08, 2.31), the inner pacing loop would clear is_rate_limiting
to 0 on exit, clobbering the outer frame's state. Save the previous
value at entry and restore it at exit so nested invocations don't
corrupt the outer wait loop's guard.

Addresses Copilot review feedback on PR obgm#1973.
@mrdeep1
Copy link
Copy Markdown
Collaborator

mrdeep1 commented May 1, 2026

I'm doing code testing under different failure scenarios which has highlighted a couple of things not affected by this PR that could be handled better - in particular the effects of different timer timeouts when rate limiting.

@yavuzhaliloglu
Copy link
Copy Markdown
Author

Thanks for the heads-up — let me know if anything you find ends up overlapping with this PR or if you'd like me to look into a specific timer interaction.

@mrdeep1
Copy link
Copy Markdown
Collaborator

mrdeep1 commented May 4, 2026

I have come up with an alternative solution in #1976 which gives better recovery and only transmitting packets once unless the packet is reported as missing. It should solve the recursive issues and all transmitted packets are sent at the appropriate rate.

Essentially it sets up a list of blocks to be sent, with remaining blocks updated whenever there is a successful transmission. If a packet is reporting as missing, the remaining blocks gets updated unless the block already exists in the list. The next packet to send is always the first in the remaining blocks list.

Please let me know if #1976 work for you.

I have done a reasonable amount of testing, but it is possible that there are still some corner cases that are not handled.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants