Guidellm fails for max-req 60 with streaming #516

@toslali-ibm

Description

Describe the bug
When I use the following profile with the command python -m guidellm benchmark --scenario profile.yaml --output-path here.json --request-type text_completions, I keep getting the errors below. When I reduce max-requests to, say, 25, the errors go away.

profile

target: "url"
rate-type: sweep
max-requests: 60
rate: 5
random-seed: 42
data:
  prefix_tokens: 256
  prompt_tokens: 256
  prompt_tokens_stdev: 100
  prompt_tokens_min: 2
  prompt_tokens_max: 800
  output_tokens: 256
  output_tokens_stdev: 100
  output_tokens_min: 1
  output_tokens_max: 1024
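For reference, the prompt_tokens / output_tokens fields in the profile describe a normal distribution over token counts, clipped to the given min/max. A minimal sketch of that sampling, assuming a Gaussian draw clamped to the bounds (guidellm's actual sampler may differ; sample_tokens is a hypothetical helper, not a guidellm API):

```python
import random

def sample_tokens(mean, stdev, lo, hi, seed=42):
    """Draw a token count from a normal distribution and clamp it to [lo, hi].

    Hypothetical illustration of the profile's *_tokens settings; guidellm's
    real sampling logic may differ.
    """
    rng = random.Random(seed)
    value = int(rng.gauss(mean, stdev))
    return max(lo, min(hi, value))

# Values taken from the profile above (random-seed: 42).
prompt = sample_tokens(256, 100, 2, 800)     # prompt_tokens config
output = sample_tokens(256, 100, 1, 1024)    # output_tokens config
print(prompt, output)
```

With stdev 100 around a mean of 256, individual prompts can land anywhere from 2 to 800 tokens, so some requests in a 60-request run are much heavier than the average.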

err:

[run-workload]   File "/workspace/data/guidellm/httpx/_transports/default.py", line 101, in map_httpcore_exceptions
[run-workload]     yield
[run-workload]   File "/workspace/data/guidellm/httpx/_transports/default.py", line 394, in handle_async_request
[run-workload]     resp = await self._pool.handle_async_request(req)
[run-workload]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[run-workload]   File "/workspace/data/guidellm/httpcore/_async/connection_pool.py", line 256, in handle_async_request
[run-workload]     raise exc from None
[run-workload]   File "/workspace/data/guidellm/httpcore/_async/connection_pool.py", line 236, in handle_async_request
[run-workload]     response = await connection.handle_async_request(
[run-workload]                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[run-workload]   File "/workspace/data/guidellm/httpcore/_async/connection.py", line 101, in handle_async_request
[run-workload]     raise exc
[run-workload]   File "/workspace/data/guidellm/httpcore/_async/connection.py", line 78, in handle_async_request
[run-workload]     stream = await self._connect(request)
[run-workload]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[run-workload]   File "/workspace/data/guidellm/httpcore/_async/connection.py", line 124, in _connect
[run-workload]     stream = await self._network_backend.connect_tcp(**kwargs)
[run-workload]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[run-workload]   File "/workspace/data/guidellm/httpcore/_backends/auto.py", line 31, in connect_tcp
[run-workload]     return await self._backend.connect_tcp(
[run-workload]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[run-workload]   File "/workspace/data/guidellm/httpcore/_backends/anyio.py", line 113, in connect_tcp
[run-workload]     with map_exceptions(exc_map):
[run-workload]          ^^^^^^^^^^^^^^^^^^^^^^^
[run-workload]   File "/workspace/data/guidellm/httpx/_transports/default.py", line 101, in map_httpcore_exceptions
[run-workload]     yield
[run-workload]   File "/usr/local/lib/python3.12/contextlib.py", line 158, in __exit__
[run-workload]     self.gen.throw(value)
[run-workload]   File "/workspace/data/guidellm/httpcore/_exceptions.py", line 14, in map_exceptions
[run-workload]     raise to_exc(exc) from exc
[run-workload]   File "/workspace/data/guidellm/httpx/_transports/default.py", line 394, in handle_async_request
[run-workload]     resp = await self._pool.handle_async_request(req)
[run-workload]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[run-workload]   File "/workspace/data/guidellm/httpcore/_async/connection_pool.py", line 256, in handle_async_request
[run-workload]     raise exc from None
[run-workload]   File "/workspace/data/guidellm/httpcore/_async/connection_pool.py", line 236, in handle_async_request
[run-workload]     response = await connection.handle_async_request(
[run-workload]                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[run-workload]   File "/workspace/data/guidellm/httpcore/_async/connection.py", line 101, in handle_async_request
[run-workload]     raise exc
[run-workload]   File "/workspace/data/guidellm/httpx/_transports/default.py", line 101, in map_httpcore_exceptions
[run-workload]     yield
[run-workload]   File "/workspace/data/guidellm/httpcore/_async/connection.py", line 78, in handle_async_request
[run-workload]     stream = await self._connect(request)
[run-workload]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[run-workload]   File "/workspace/data/guidellm/httpx/_transports/default.py", line 101, in map_httpcore_exceptions
[run-workload]     yield
[run-workload]   File "/workspace/data/guidellm/httpx/_transports/default.py", line 101, in map_httpcore_exceptions
[run-workload]     yield
[run-workload]   File "/workspace/data/guidellm/httpcore/_async/connection.py", line 124, in _connect
[run-workload]     stream = await self._network_backend.connect_tcp(**kwargs)
[run-workload]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[run-workload] httpcore.ConnectError: All connection attempts failed
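httpcore.ConnectError: All connection attempts failed means the TCP connect itself is failing, and the fact that it appears at max-requests 60 but not 25 suggests the target stops accepting new connections above some concurrency level. A workaround sketch that caps in-flight connections with a semaphore, using a local stdlib echo server as a stand-in for the real target (this is an illustration of the general technique, not a guidellm option):

```python
import asyncio

async def main():
    # Local echo server standing in for the benchmark target
    # (assumption: the real target is the endpoint from profile.yaml).
    async def handle(reader, writer):
        data = await reader.read(100)
        writer.write(data)
        await writer.drain()
        writer.close()
        await writer.wait_closed()

    server = await asyncio.start_server(handle, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]

    # Cap in-flight connections at 25, mirroring the max-requests value
    # that worked in the report.
    sem = asyncio.Semaphore(25)

    async def request(i):
        async with sem:
            reader, writer = await asyncio.open_connection("127.0.0.1", port)
            writer.write(b"ping")
            await writer.drain()
            reply = await reader.read(100)
            writer.close()
            await writer.wait_closed()
            return reply

    # 60 logical requests, but never more than 25 open connections at once.
    replies = await asyncio.gather(*(request(i) for i in range(60)))
    server.close()
    await server.wait_closed()
    return replies

results = asyncio.run(main())
print(len(results))
```

If the real server behaves the same way, checking its connection backlog / max-concurrent-connection settings (or any proxy in between) would be the next diagnostic step.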

Environment
guidellm built from main.
