
Aim crashing during initialisation of runs, can't view any runs anymore #3390

@ADH-LukeBollam

🐛 Bug

I've recently started logging some figures each epoch and am getting intermittent errors where I need to restart Aim or refresh the page until it loads. Error messages in the browser say "Error: Response JSON already consumed", and I've attached one of the full logs below. I'll update this with the other issues as I hit them; there have been a few different types.

Today the latest error is preventing me from viewing any runs at all. It seems to hit an exception while loading run information, and every run shows as green and in progress.

It's crashing at the line

    def list_corrupted_runs(self) -> List[str]:
        from aim.storage.encoding import decode_path

        def get_run_hash_from_prefix(prefix: bytes):
            return decode_path(prefix)[-1]

        container = RocksUnionContainer(os.path.join(self.path, 'meta'), read_only=True)

        return list(map(get_run_hash_from_prefix, container.corrupted_dbs))

when
container is <aim.storage.union.RocksUnionContainer object at 0x730ab670d080>
and the prefix is b''.
decode_path returns [] for that prefix, so the [-1] index raises the IndexError shown in the log below.
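
To make the failure mode concrete, here is a minimal sketch of what appears to be happening, assuming an empty prefix decodes to an empty list. The function decode_path_stub is a hypothetical stand-in and does not reproduce Aim's real encoding; the guarded variant is only one possible workaround, not the maintainers' fix.

```python
from typing import Iterable, List


def decode_path_stub(prefix: bytes) -> List[str]:
    """Hypothetical stand-in for aim.storage.encoding.decode_path.

    It only mimics the behaviour reported in this issue (an empty prefix
    decodes to an empty list); the real encoding is not reproduced here.
    """
    return [part.decode() for part in prefix.split(b"/") if part]


def run_hashes_unguarded(prefixes: Iterable[bytes]) -> List[str]:
    # Same shape as the failing code: index [-1] with no emptiness check.
    return [decode_path_stub(p)[-1] for p in prefixes]


def run_hashes_guarded(prefixes: Iterable[bytes]) -> List[str]:
    # Skip prefixes that decode to nothing instead of indexing into [].
    hashes = []
    for p in prefixes:
        path = decode_path_stub(p)
        if path:
            hashes.append(path[-1])
    return hashes
```

With an empty prefix in the input, run_hashes_unguarded raises IndexError, while run_hashes_guarded drops that entry and returns the remaining hashes. Whether silently skipping an empty prefix is the right behaviour for Aim is an open question; it may hide a corrupted entry that should be reported instead.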

To reproduce

Log some Plotly charts each epoch. Here's how I'm adding mine:

    from aim import Figure

    aim_fig = Figure(fig)  # fig is a Plotly figure
    self.aim_logger.track(aim_fig, name=f'sample {i}', step=epoch)

Expected behavior

Environment

  • Aim version: 3.29.1
  • Python version: 3.9.21
  • OS: Ubuntu

Additional context

Log:
IndexError: list index out of range
ERROR: Exception in ASGI application
Traceback (most recent call last):
  File "/home/..../miniconda3/envs/v03/lib/python3.9/site-packages/uvicorn/protocols/http/h11_impl.py", line 403, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
  File "/home/..../miniconda3/envs/v03/lib/python3.9/site-packages/uvicorn/middleware/proxy_headers.py", line 60, in __call__
    return await self.app(scope, receive, send)
  File "/home/..../miniconda3/envs/v03/lib/python3.9/site-packages/fastapi/applications.py", line 1054, in __call__
    await super().__call__(scope, receive, send)
  File "/home/..../miniconda3/envs/v03/lib/python3.9/site-packages/starlette/applications.py", line 112, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/home/..../miniconda3/envs/v03/lib/python3.9/site-packages/starlette/middleware/errors.py", line 187, in __call__
    raise exc
  File "/home/..../miniconda3/envs/v03/lib/python3.9/site-packages/starlette/middleware/errors.py", line 165, in __call__
    await self.app(scope, receive, _send)
  File "/home/..../miniconda3/envs/v03/lib/python3.9/site-packages/starlette/middleware/cors.py", line 85, in __call__
    await self.app(scope, receive, send)
  File "/home/..../miniconda3/envs/v03/lib/python3.9/site-packages/starlette/middleware/exceptions.py", line 62, in __call__
    await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
  File "/home/..../miniconda3/envs/v03/lib/python3.9/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    raise exc
  File "/home/..../miniconda3/envs/v03/lib/python3.9/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
    await app(scope, receive, sender)
  File "/home/..../miniconda3/envs/v03/lib/python3.9/site-packages/starlette/routing.py", line 714, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/home/..../miniconda3/envs/v03/lib/python3.9/site-packages/starlette/routing.py", line 734, in app
    await route.handle(scope, receive, send)
  File "/home/..../miniconda3/envs/v03/lib/python3.9/site-packages/starlette/routing.py", line 460, in handle
    await self.app(scope, receive, send)
  File "/home/..../miniconda3/envs/v03/lib/python3.9/site-packages/fastapi/applications.py", line 1054, in __call__
    await super().__call__(scope, receive, send)
  File "/home/..../miniconda3/envs/v03/lib/python3.9/site-packages/starlette/applications.py", line 112, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/home/..../miniconda3/envs/v03/lib/python3.9/site-packages/starlette/middleware/errors.py", line 187, in __call__
    raise exc
  File "/home/..../miniconda3/envs/v03/lib/python3.9/site-packages/starlette/middleware/errors.py", line 165, in __call__
    await self.app(scope, receive, _send)
  File "/home/..../miniconda3/envs/v03/lib/python3.9/site-packages/aim/web/api/utils.py", line 66, in __call__
    await self.app(scope, receive, send)
  File "/home/..../miniconda3/envs/v03/lib/python3.9/site-packages/starlette/middleware/gzip.py", line 29, in __call__
    await responder(scope, receive, send)
  File "/home/..../miniconda3/envs/v03/lib/python3.9/site-packages/starlette/middleware/gzip.py", line 126, in __call__
    await super().__call__(scope, receive, send)
  File "/home/..../miniconda3/envs/v03/lib/python3.9/site-packages/starlette/middleware/gzip.py", line 46, in __call__
    await self.app(scope, receive, self.send_with_compression)
  File "/home/..../miniconda3/envs/v03/lib/python3.9/site-packages/starlette/middleware/exceptions.py", line 62, in __call__
    await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
  File "/home/..../miniconda3/envs/v03/lib/python3.9/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    raise exc
  File "/home/..../miniconda3/envs/v03/lib/python3.9/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
    await app(scope, receive, sender)
  File "/home/..../miniconda3/envs/v03/lib/python3.9/site-packages/starlette/routing.py", line 714, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/home/..../miniconda3/envs/v03/lib/python3.9/site-packages/starlette/routing.py", line 734, in app
    await route.handle(scope, receive, send)
  File "/home/..../miniconda3/envs/v03/lib/python3.9/site-packages/starlette/routing.py", line 288, in handle
    await self.app(scope, receive, send)
  File "/home/..../miniconda3/envs/v03/lib/python3.9/site-packages/starlette/routing.py", line 76, in app
    await wrap_app_handling_exceptions(app, request)(scope, receive, send)
  File "/home/..../miniconda3/envs/v03/lib/python3.9/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    raise exc
  File "/home/..../miniconda3/envs/v03/lib/python3.9/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
    await app(scope, receive, sender)
  File "/home/..../miniconda3/envs/v03/lib/python3.9/site-packages/starlette/routing.py", line 73, in app
    response = await f(request)
  File "/home/..../miniconda3/envs/v03/lib/python3.9/site-packages/fastapi/routing.py", line 301, in app
    raw_response = await run_endpoint_function(
  File "/home/..../miniconda3/envs/v03/lib/python3.9/site-packages/fastapi/routing.py", line 212, in run_endpoint_function
    return await dependant.call(**values)
  File "/home/..../miniconda3/envs/v03/lib/python3.9/site-packages/aim/web/api/projects/views.py", line 47, in project_api
    runs_corrupted = len(project.repo.list_corrupted_runs()) > 0
  File "/home/..../miniconda3/envs/v03/lib/python3.9/site-packages/aim/sdk/repo.py", line 403, in list_corrupted_runs
    return list(map(get_run_hash_from_prefix, container.corrupted_dbs))
  File "/home/..../miniconda3/envs/v03/lib/python3.9/site-packages/aim/sdk/repo.py", line 400, in get_run_hash_from_prefix
    return decode_path(prefix)[-1]
IndexError: list index out of range

Labels: help wanted (extra attention is needed), type / bug (issue type: something isn't working)
