Skip to content

Conversation

@ktock
Copy link
Contributor

@ktock ktock commented Oct 8, 2025

This PR fixes an issue that the timeout argument for the select syscall wasn't applied to pipes and it returned immedeately instead of waiting for events.

Although the select syscall passes the timeout value to the poll method of the stream implementations, this can't handle multiple fds because a fd can't notify readiness while another is blocking in poll. As a result, using the select syscall with a combination of pipes and other streams (e.g. PTY) can be problematic.

To address this, this PR implements a callback-based event notification. Each stream implementation's poll invocation receives a callback, allowing it asynchronously notify the select syscall when it becomes ready. The select syscall blocks until one of these callbacks is triggered or the timeout expires. This behviour is enabled only when PROXY_TO_PTHREAD is enabled to avoid blocking the main worker.

To maintain compatibility with non-pipefs streams, the select syscall allows stream implementations to ignore the callback and synchronously return the event status instead. In such cases, the select syscall still updates the flags accordingly.

@ktock ktock marked this pull request as draft October 8, 2025 03:53
@ktock ktock force-pushed the fixselect branch 3 times, most recently from 89b85b8 to 05bd5fd Compare October 8, 2025 18:24
@ktock ktock marked this pull request as ready for review October 8, 2025 23:23
@sbc100
Copy link
Collaborator

sbc100 commented Oct 16, 2025

Thanks for the PR @ktock. This looks like a fairly large and complex PR, and I'm out sick right now I'm afraid so I'm not sure when I will have time to go through it.

CC'ing @atrosinenko who wrote the original pipefs back in #4935.

Also @kripken and @tlively who might be interesting the multithreaded event stuff.

@ktock ktock force-pushed the fixselect branch 3 times, most recently from 5fce398 to 3938766 Compare October 24, 2025 02:48
@ktock
Copy link
Contributor Author

ktock commented Oct 24, 2025

Rebased. Could we move this forward?

@kripken
Copy link
Member

kripken commented Oct 30, 2025

I am unfortunately not familiar with this code (or with native pipes either). It looks like the PIPEFS code began in #4378 / #4935 by @cynecx and @atrosinenko - would one of you be interested to review this perhaps?

@ktock ktock force-pushed the fixselect branch 3 times, most recently from d7f75dc to 16b7865 Compare December 11, 2025 02:42
@ktock
Copy link
Contributor Author

ktock commented Dec 11, 2025

@sbc100 @kripken

This looks like a fairly large and complex PR

I've just simplified the patch by removing changes in libpthread.js. Instead, this patch uses emscripten_proxy_sync_with_ctx to synchronize the thread worker and the main worker. This patch allows pipefs and setTimeout to unblock the worker by calling emscripten_proxy_finish when an event occurs. I've added some wrapper functions for the proxying-related APIs so that the JS implementation of newselect can use them.

@ktock ktock force-pushed the fixselect branch 2 times, most recently from 5cef212 to ac6c045 Compare December 12, 2025 08:40
@ktock ktock force-pushed the fixselect branch 5 times, most recently from aaa6575 to 6079e85 Compare December 15, 2025 07:45
@ktock
Copy link
Contributor Author

ktock commented Dec 15, 2025

The failure in ci/circleci: test-mac-arm64 looks like unrelated to this PR:

This job was rejected because the resource class is unavailable

@sbc100
Copy link
Collaborator

sbc100 commented Dec 15, 2025

Circle CI should be fixed in #25955

@sbc100
Copy link
Collaborator

sbc100 commented Dec 16, 2025

I'm adding some basic testing of select (with no timeout) to the existing pipe test here: #25965

@sbc100
Copy link
Collaborator

sbc100 commented Dec 16, 2025

I'm adding some basic testing of select (with no timeout) to the existing pipe test here: #25965

While working on #25965 I ended up implementing select() for WasmFS and did it in terms of poll().. which makes me wonder if we should be doing that for the normal FS too?

When this change would actually be a change to poll rather than select? Does that sound reasonable?

@sbc100
Copy link
Collaborator

sbc100 commented Dec 16, 2025

If you think that is too much to change at this point, we could land this change first and then consider a refactor later.

var tv_sec = ({{{ makeGetValue('timeout', 0, 'i32') }}}),
tv_usec = ({{{ makeGetValue('timeout', POINTER_SIZE, 'i32') }}});
return (tv_sec + tv_usec / 1000000) * 1000;
},
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function can be completely removed once #25968 lands.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on the PR. Removed this function.

@ktock ktock force-pushed the fixselect branch 3 times, most recently from c92cf24 to b991bf3 Compare December 17, 2025 09:21
@ktock
Copy link
Contributor Author

ktock commented Dec 17, 2025

Can you run with --rebaseline and upload the results? (Remember to use emsdk install tot to make sure you have the latest llvm and binaryen installed first).

Updated the JSON file using --rebaseline.

@ktock
Copy link
Contributor Author

ktock commented Dec 17, 2025

While working on #25965 I ended up implementing select() for WasmFS and did it in terms of poll().. which makes me wonder if we should be doing that for the normal FS too?

When this change would actually be a change to poll rather than select? Does that sound reasonable?

If you think that is too much to change at this point, we could land this change first and then consider a refactor later.

Thanks for working on the PR. Currently, __syscall_poll() doesn't implement the timeout support so we first need to fix this. I think this exceeds the scope of this PR and should be done in a separated refactoring PR.

Copy link
Collaborator

@sbc100 sbc100 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm once the tests pass

_newselect_js: (ctx, arg, nfds, readfds, writefds, exceptfds, timeoutInMillis) => {
return doNewselect(ctx, arg, nfds, readfds, writefds, exceptfds, timeoutInMillis);
},
$doNewselect__proxy: 'none',
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like its just a forwarding function. Can we merge doNewselect and _newselect_js? i.e. can we remove doNewselect and just call it _newselect_js?

// timeout is supported, although on SOCKFS and PIPEFS these are ignored and always treated as 0 - fully async
// timeout is supported, although on SOCKFS these are ignored and always treated as 0 - fully async
// and PIPEFS supports timeout only when the select is called from a worker.
#if ASSERTIONS
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we asssert that we are not running on a thread here? i.e. doNewselect must be called on the main thread only right?

return -1;
}
return t.result;
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So it seems like the whole point of this custom proxying is to allow the main thread to delay the calling of _emscripten_proxy_newselect_finish until later? i.e. after doNewselect returns.

I'm happy to see this change land but I wonder if we could do this without that need for custom proxying? I.e. I wonder if we could return a promise from doNewselect and have the automatic proxying system just work in that case? Allowing all async proxyied functions to work.

fd_set fdr;
fd_set fdw;
int res;
struct timeval tv;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can do struct timeval tv = {0} here. Maybe also add a comment "// Timeout of zero means don't block. Emscripten doesn't support blocking select in the general case"

self.do_core_test('test_syscall_intercept.c')

def test_select_blocking(self):
self.do_runf('core/test_select_blocking.c', cflags=['-pthread', '-sPROXY_TO_PTHREAD=1', '-sEXIT_RUNTIME=1', '-Wno-pthreads-mem-growth'])
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you should need -Wno-pthreads-mem-growth since memory growth is not enabled here, right?

@sbc100
Copy link
Collaborator

sbc100 commented Dec 18, 2025

BTW, can I ask what you use case is? Are you stuff with a PTY?

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
This commit enables the select syscall to handle timeout with multiple event
sources when it is called from a worker.

When a thread worker calls __syscall_newselect, it blocks using
emscripten_proxy_sync_with_ctx. When __syscall_newselect is called with zero
timeout, it is unblocked immediately by calling emscripten_proxy_finish
before returning. When it is called with non-zero timeout,
emscripten_proxy_finish is invoked either by the underlying stream
implementation (where an event occurs) or by a setTimeout callback when the
tiemout expires.

In proxying.c, several wrapper functions for proxying-related APIs are added
to allow the JS implementation of newselect to use them.

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
The poll method of PIPEFS receives a notification callback from the
caller. PIPEFS notifies the caller when the fd becomes readable using that
callback.

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
@ktock
Copy link
Contributor Author

ktock commented Dec 18, 2025

BTW, can I ask what you use case is? Are you stuff with a PTY?

I've been working on porting QEMU to Wasm. It relies on the select and pipes for managing events and timers.

https://wiki.qemu.org/ChangeLog/10.1

Experimental support for compiling to WASM using Emscripten.

Status: https://github.com/ktock/qemu-wasm?tab=readme-ov-file#status-of-upstreaming

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
@sbc100
Copy link
Collaborator

sbc100 commented Dec 18, 2025

BTW, can I ask what you use case is? Are you stuff with a PTY?

I've been working on porting QEMU to Wasm. It relies on the select and pipes for managing events and timers.

https://wiki.qemu.org/ChangeLog/10.1

Experimental support for compiling to WASM using Emscripten.

Status: https://github.com/ktock/qemu-wasm?tab=readme-ov-file#status-of-upstreaming

Thats awesome! If there are any other emscripten improvements that would help you please let us know.

I'm excited to set better support for these things. Once this lands I think I'd like to get poll working too, and convert select to be sit on top of poll. After that we should get this stuff working with WASMFS (that next generation FS).

@sbc100 sbc100 merged commit 8694832 into emscripten-core:main Dec 18, 2025
35 checks passed
@ktock ktock deleted the fixselect branch December 19, 2025 01:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants