-
Notifications
You must be signed in to change notification settings - Fork 1.4k
sched: add a reference count to the TCB to prevent it from being deleted. #17468
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
f192057 to
cae0ffd
Compare
|
Is it possible to make this PR optional? Perhaps it should affect only the bigger SMP-capable chips and not the smaller single core ones? |
|
@anchao @acassis Not Just SMP: Limitations of Critical Sections: In this scenario, even with enter_critical_section(), there can still be a race if syslog() internally leads to context switches that affect the TCB’s lifetime. On Mainstream RTOS Practice: On Performance and Size Impact: If there is a better alternative to solve this problem, then this patch is unnecessary. Patches are welcome. |
@GUIDINGLI thank you for bring more information to this discussion.
Seems like osperf is not reliable for this kind of test, maybe we need to use Segger JLink RTT or OBRTrace Orbuculum to get more precise/reliable data. |
@GUIDINGLI @hujun260 This explanation is extremely helpful for giving context and understanding this change. I recommend to please add this explanation to the PR description and also to the commit log, so that it can be easily understood in the future. Thank you! |
hartmannathan
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add the explanation of @GUIDINGLI to the commit log and PR description because it helps to understand the changes.
|
@acassis @hartmannathan |
You can try running this code. The nxsched_get_tcb() function incurs a 2.5~4% performance overhead, and this function is used throughout the entire nuttx: Additionally, I conducted the validation on the sim host PC; running this test on an MCU may result in even greater performance degradation: Before this change: After this change: |
Nice work @anchao !!! Based on this test alone, I can see the overhead was about 3.26% => ((0.303873630 / 0.296554913) - 1) * 100 So, we need to decide whether make sense to accept this overhead to get the benefit from this PR or not. Alternatively as @hartmannathan pointed this feature could be configurable (although I think NuttX already have enough control switches, we need to remove some instead of adding it) |
|
@anchao |
I think we are aligned on the same objective. NuttX is a real-time operating system, and our modifications should be more deterministic and faster, and should not impose any additional burden on developers familiar with NuttX. |
During the discussion about the impact of adding counter to TCB Mr anchao created a pthread mutex performance example: apache/nuttx#17468 (comment) Because this example can exercise (not ideally) the pthread mutex I decided to add it to apps/testing/sched, the program name will be called "pmutexp" (because "pmp" is not a good name). Signed-off-by: Alan C. Assis <acassis@gmail.com>
Test Objective nucleo-g431rb:nsh after this patch before this patch flash size increase 732 byte |
As I said before: |
Could we adopt @acassis and @hartmannathan suggestion and implement this feature as a configurable option? |
I've added the nxsched_get_tcb_noref interface—its performance is the same as before—and I've also reimplemented nxsched_verify_pid. Furthermore, I retested the scenario you mentioned, and there's no performance degradation at all. |
39057a1 to
a4c127c
Compare
It seems that not all implementations have replaced nxsched_get_tcb_noref, which means the system performance will still degrade. My test case is only intended to make you aware of the performance degradation caused by nxsched_get_tcb. The pthread_mutex case is just one of them—if I provide other test cases, your patch will have many more performance bottlenecks requiring optimization. could we address this issue from the perspective of architectural implementation? |
GUIDINGLI
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
…deleted. To replace the large lock with smaller ones and reduce the large locks related to the TCB, in many scenarios, we only need to ensure that the TCB won't be released instead of locking, thus reducing the possibility of lock recursion. Signed-off-by: hujun5 <hujun5@xiaomi.com>

Summary
add a reference count to the TCB to prevent it from being deleted.
To replace the large lock with smaller ones and reduce the large locks related to the TCB, in many scenarios, we only need to ensure that the TCB won't be released instead of locking, thus reducing the possibility of lock recursion.
should merge with apache/nuttx-apps#3246
Impact
tcb release
Test Objective nucleo-g431rb:nsh
after this patch
size nuttx
text data bss dec hex filename
127964 904 8644 137512 21928 nuttx
before this patch
size nuttx
text data bss dec hex filename
127232 904 8612 136748 2162c nuttx
flash size increase 732 byte
Testing
test in hardware
esp32s3-devkit:nsh
user_main: scheduler lock test
sched_lock: Starting lowpri_thread at 97
sched_lock: Set lowpri_thread priority to 97
sched_lock: Starting highpri_thread at 98
sched_lock: Set highpri_thread priority to 98
sched_lock: Waiting...
sched_lock: PASSED No pre-emption occurred while scheduler was locked.
sched_lock: Starting lowpri_thread at 97
sched_lock: Set lowpri_thread priority to 97
sched_lock: Starting highpri_thread at 98
sched_lock: Set highpri_thread priority to 98
sched_lock: Waiting...
sched_lock: PASSED No pre-emption occurred while scheduler was locked.
sched_lock: Finished
End of test memory usage:
VARIABLE BEFORE AFTER
======== ======== ========
arena 5d8bc 5d8bc
ordblks 7 6
mxordblk 548a0 548a0
uordblks 5014 5014
fordblks 588a8 588a8
Final memory usage:
VARIABLE BEFORE AFTER
======== ======== ========
arena 5d8bc 5d8bc
ordblks 1 6
mxordblk 59238 548a0
uordblks 4684 5014
fordblks 59238 588a8
user_main: Exiting
ostest_main: Exiting with status 0
nsh> u
nsh: u: command not found
nsh>
nsh>
nsh>
nsh> uname -a
NuttX 12.11.0 ef91333e3ac-dirty Dec 10 2025 16:11:04 xtensa esp32s3-devkit
nsh>