This repository was archived by the owner on Oct 7, 2025. It is now read-only.

Description
From a cursory look at the code, it seems that all requests are kept in memory before being processed asynchronously by the workers.
Since no rate limiting is in place, and unless back-pressure is applied upstream, a burst of requests can flood the server and easily exhaust its memory.
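As a minimal sketch of the back-pressure idea (the names `MAX_PENDING` and `enqueue_request` are illustrative, not taken from this codebase): bounding the in-memory queue lets the server reject excess work explicitly instead of growing without limit.

```python
import queue

# Illustrative cap on pending work; tune to the server's memory budget.
MAX_PENDING = 1000
pending = queue.Queue(maxsize=MAX_PENDING)

def enqueue_request(req):
    """Accept a request for asynchronous processing.

    Returns True if the request was queued, False if the queue is full,
    in which case the caller should push back on the client
    (e.g. respond with HTTP 429 or 503) rather than buffer it.
    """
    try:
        pending.put_nowait(req)
        return True
    except queue.Full:
        return False
```

Rejecting at the edge like this turns a silent memory-exhaustion failure into an explicit, observable signal that clients can react to.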
Further, keeping requests in memory makes the application less resilient to transient failures: a crashing container takes all pending requests with it, causing them to time out on the client side.
Using a durable queue such as RabbitMQ, NATS, or Apache Kafka would provide stronger resilience and may even simplify the code in some places.
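To make the durability property concrete, here is a hypothetical sketch of a queue whose entries survive a process restart because they live on disk rather than in memory. SQLite stands in only so the example is self-contained; a real broker such as RabbitMQ or Kafka provides the same guarantee plus acknowledgements and redelivery, and none of these names come from this repository.

```python
import sqlite3

class DurableQueue:
    """Toy on-disk FIFO queue: entries survive a crash/restart."""

    def __init__(self, path):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS q (id INTEGER PRIMARY KEY, body TEXT)"
        )
        self.db.commit()

    def put(self, body):
        self.db.execute("INSERT INTO q (body) VALUES (?)", (body,))
        self.db.commit()  # persisted: the request outlives the process

    def get(self):
        row = self.db.execute(
            "SELECT id, body FROM q ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        self.db.execute("DELETE FROM q WHERE id = ?", (row[0],))
        self.db.commit()
        return row[1]
```

With this shape, a worker that crashes mid-backlog loses nothing: a new instance reopens the same store and continues from the oldest pending entry, which is exactly the behavior the in-memory design cannot offer.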