Conversation

@userlaurin userlaurin commented Jan 5, 2026

  • Node.js worker benchmark wrapper
  • Python worker benchmark wrapper
  • Node.js container benchmark wrapper
  • Python container benchmark wrapper
  • Cloudflare SeBS implementation
  • Dockerfiles for Cloudflare containers

Summary by CodeRabbit

  • New Features

    • Cloudflare Workers support (native & container): packaging, deployment, invocation, metrics, and CLI/container tooling.
    • Cloudflare storage: R2 object storage and Durable Objects adapters, storage abstractions, triggers, and worker wrappers.
    • Benchmark utilities: uploader, compression, and example worker handlers for Node.js/Python.
  • Documentation

    • Added Cloudflare platform and storage guides.
  • Chores

    • Added ARM64 support and related system/config updates.
  • Bug Fixes

    • Removed Node.js from one microbenchmark language list.


laurin and others added 30 commits October 31, 2025 10:37
… sebs/cloudflare folder for comprehensive cloudflare doc and next steps
…ge is needed and the wrappers have to work
…, at this point the other benchmarks may be translated to nodejs
…and 000 to nodejs. fixed support for nosql in wrapper and sebs. fixed durable objects implementation for sebs
When running the handler as a thread and then awaiting said thread
(asyncio.to_thread), we can run asyncio.run() in the subsequent call
stack.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 7

🤖 Fix all issues with AI agents
In @benchmarks/100.webapps/130.crud-api/nodejs/function.js:
- Around line 23-47: The total and average calculations in queryProducts are
incorrect: priceSum currently adds product.price without accounting for
product.quantity and avgPrice divides sum(prices) by sum(quantity), which is
meaningless; change the aggregation so priceSum accumulates product.price *
product.quantity (i.e., priceSum += product.price * product.quantity) and
compute avgPrice as quantitySum > 0 ? priceSum / quantitySum : 0.0 (average unit
price per item), keeping product names appended to products and returning
total_cost: priceSum and avg_price: avgPrice.

In @sebs/cloudflare/cloudflare.py:
- Around line 1253-1280: The requests.get call in the method that fetches the
workers.dev subdomain should include a timeout to avoid hanging; update the call
to requests.get(url, headers=headers, timeout=...) and choose a sensible timeout
(e.g., 5-10s), and adjust the surrounding exception handling to specifically
catch requests.Timeout and requests.RequestException where appropriate so the
method still logs via self.logging.warning and returns None on timeout or other
request errors; refer to the existing method using _get_auth_headers,
self._workers_dev_subdomain, and the current requests.get usage to locate and
modify the call.
- Around line 148-154: The requests.get call used to verify Cloudflare
credentials lacks a timeout and can hang; update the call to include a sensible
timeout (e.g., timeout=10) or use a configurable attribute like self._timeout,
i.e., change requests.get(f"{self._api_base_url}/user/tokens/verify",
headers=headers) to include timeout, and add handling for
requests.exceptions.Timeout/requests.exceptions.RequestException to raise a
clear RuntimeError if the request times out or fails; refer to the requests.get
invocation and the response/status_code check to locate where to add the timeout
and error handling.
- Around line 1209-1213: Replace the bare except that follows response.json()
with a specific exception handler for JSON parse failures: catch
json.JSONDecodeError and ValueError (and bind it to a variable, e.g., "e")
around the call to response.json() where error_data and error_msg are set, then
fall back to the existing log path (self.logging.info(f"Container
provisioning... ({elapsed}s elapsed)"))—optionally include the exception text in
the log; ensure json is imported if not already and update the except block that
currently surrounds response.json()/error_msg assignment.
- Around line 799-814: The nested function replacer closes over loop-scoped
variables control_flow and builtins causing stale bindings; fix by capturing
their current values into locals when defining replacer (e.g., assign
local_control_flow = control_flow and local_builtins = builtins just before
defining replacer and refer to those inside replacer, or bind them as default
args like def replacer(match, control_flow=control_flow, builtins=builtins):),
then keep using replacer in the re.sub call so the correct sets are used for
each iteration.
- Around line 73-83: The initialize method currently declares a mutable default
config and never uses it; change the signature to use a safe default by
replacing config: Dict[str, str] = {} with config: Optional[Dict[str, str]] =
None (or remove the parameter entirely if it’s not needed), and if you keep it,
ensure you normalize it at the top of initialize (e.g., config = config or {})
before use and pass relevant values into initialize_resources or other code
paths; keep references to the existing initialize, _verify_credentials, and
initialize_resources symbols when making the change.
- Around line 1066-1077: The bare except around response.json() in the function
that handles the worker check should be replaced with a specific exception
handler for JSON parse errors: catch json.JSONDecodeError (and optionally
ValueError for compatibility) instead of using a bare except, and handle/log the
exception (e.g., with self.logging.debug/warning) before returning None; ensure
json is imported if not already and update the try/except that wraps
response.json().get("result") to use the specific exception types and an
exception variable (e) for logging.
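
The loop-variable item above (sebs/cloudflare/cloudflare.py, around lines 799-814) only describes the fix in prose. Below is a minimal, self-contained sketch of the default-argument binding it recommends; the loop, the module sets, and the sample source string are illustrative stand-ins, not the project's actual code.

import re

modules = {
    "flow_mod": {"control_flow": {"if", "while"}, "builtins": {"len", "print"}},
    "data_mod": {"control_flow": {"for"}, "builtins": {"dict", "list"}},
}

source = "use if and len here"

for name, sets in modules.items():
    control_flow = sets["control_flow"]
    builtins = sets["builtins"]

    # Bind the loop variables as default arguments so each replacer captures
    # the values from *this* iteration instead of the last one (Ruff B023).
    def replacer(match, control_flow=control_flow, builtins=builtins):
        word = match.group(0)
        if word in control_flow:
            return f"<cf:{word}>"
        if word in builtins:
            return f"<builtin:{word}>"
        return word

    print(name, re.sub(r"\w+", replacer, source))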
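
The two bare-except items above (around lines 1209-1213 and 1066-1077) likewise have no code shown. A hedged sketch of the narrowed exception handling they describe; the response, logger, and elapsed objects and the error-message extraction are placeholders for whatever cloudflare.py actually uses.

import json

def log_provisioning_error(response, logger, elapsed):
    """Sketch only: narrow the bare except around response.json() to JSON
    parse failures and keep the existing progress log as the fallback."""
    try:
        error_data = response.json()
        # The exact error structure is an assumption; adapt to the real payload.
        error_msg = error_data.get("errors", [{}])[0].get("message", "unknown error")
        logger.warning(f"Container provisioning failed: {error_msg}")
    except (json.JSONDecodeError, ValueError) as e:
        logger.info(f"Container provisioning... ({elapsed}s elapsed): {e}")
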
🧹 Nitpick comments (8)
benchmarks/100.webapps/130.crud-api/nodejs/function.js (1)

49-78: Consider adding input validation.

The handler assumes all required fields are present in request.body and request.path. If upstream wrappers don't guarantee this, the function may throw unclear errors when accessing undefined properties.

Consider adding validation for:

  • Required body fields (cart, product_id, name, price, quantity for PUT)
  • Required path parameters (id for GET /cart/{id})
  • Data types (e.g., ensuring price and quantity are numbers)

Note: Line 67 uses body.cart for a GET request, which is unconventional (GET requests typically use path/query parameters), but may be intentional for this benchmark's design.

benchmarks/wrappers/cloudflare/nodejs/build.js (1)

108-155: Consider adding documentation for the post-processing logic.

The regex-based post-processing that converts dynamic __require calls into static imports is clever but complex. Adding comments explaining:

  • Why this transformation is necessary (Cloudflare Workers ESM requirements)
  • The pattern being matched (__require<digits>("node:..."))
  • The cache mechanism

would improve maintainability.

📝 Example documentation
       
-      // POST-PROCESS: Replace dynamic requires with static imports
+      // POST-PROCESS: Replace dynamic requires with static imports
+      // Cloudflare Workers require ESM with static imports for node: modules.
+      // esbuild generates __require\d*("node:...") calls for dynamic imports,
+      // so we transform them into static imports with a cache lookup.
       console.log('Post-processing to fix node: module imports...');
sebs/cloudflare/durable_objects.py (1)

183-183: Remove unnecessary f-string prefix.

The f-string on line 183 doesn't contain any placeholders, so the f prefix is unnecessary.

✨ Proposed fix
-        self.logging.info(f"Durable Objects data is managed within the Worker")
+        self.logging.info("Durable Objects data is managed within the Worker")
sebs/cloudflare/cloudflare.py (5)

385-391: Remove unused variable r2_bucket_configured.

The variable r2_bucket_configured is assigned on line 385 but never used afterward. This can be safely removed to clean up the code.

🧹 Proposed fix
         # Add R2 bucket binding for benchmarking files (for both native and container deployments)
-        r2_bucket_configured = False
         try:
             storage = self.system_resources.get_storage()
             bucket_name = storage.get_bucket(Resources.StorageBucketType.BENCHMARKS)
             if bucket_name:
                 toml_content += f"""# R2 bucket binding for benchmarking files
 # This bucket is used by fs and path polyfills to read benchmark data
 [[r2_buckets]]
 binding = "R2"
 bucket_name = "{bucket_name}"
 
 """
-                r2_bucket_configured = True
                 self.logging.info(f"R2 bucket '{bucket_name}' will be bound to worker as 'R2'")

654-657: Rename unused loop variable.

The loop variable dirnames on line 654 is not used within the loop body. Following Python conventions, it should be renamed to _dirnames to indicate it's intentionally unused.

✨ Proposed fix
         # Calculate total size of the package directory
         total_size = 0
-        for dirpath, dirnames, filenames in os.walk(directory):
+        for dirpath, _dirnames, filenames in os.walk(directory):
             for filename in filenames:
                 filepath = os.path.join(dirpath, filename)
                 total_size += os.path.getsize(filepath)

855-866: Remove unused variable result.

The variable result assigned on line 855 from the subprocess.run() call is never used. Since the success is verified by the lack of exception, this assignment can be removed.

🧹 Proposed fix
         # Install Node.js dependencies (needed for all containers for worker.js)
         self.logging.info(f"Installing @cloudflare/containers for worker.js orchestration in {directory}")
         try:
-            result = subprocess.run(
+            subprocess.run(
                 ["npm", "install", "--production"],
                 cwd=directory,
                 capture_output=True,
                 text=True,
                 check=True,
                 timeout=120
             )
             self.logging.info("npm install completed successfully")

924-927: Rename unused loop variable.

The loop variable dirnames on line 924 is not used within the loop body. Following Python conventions, it should be renamed to _dirnames.

✨ Proposed fix
         # Calculate package size (approximate, as it's a source directory)
         total_size = 0
-        for dirpath, dirnames, filenames in os.walk(directory):
+        for dirpath, _dirnames, filenames in os.walk(directory):
             for filename in filenames:
                 filepath = os.path.join(dirpath, filename)
                 total_size += os.path.getsize(filepath)

1495-1512: Rename unused loop variable.

The loop variable request_id on line 1495 is not used within the loop body. Following Python conventions, it should be renamed to _request_id to indicate it's intentionally unused.

✨ Proposed fix
-        for request_id, result in requests.items():
+        for _request_id, result in requests.items():
             # Count cold/warm starts
             if result.stats.cold_start:
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5049c42 and 8516207.

📒 Files selected for processing (5)
  • benchmarks/100.webapps/130.crud-api/nodejs/function.js
  • benchmarks/wrappers/cloudflare/nodejs/build.js
  • sebs/cloudflare/README.md
  • sebs/cloudflare/cloudflare.py
  • sebs/cloudflare/durable_objects.py
✅ Files skipped from review due to trivial changes (1)
  • sebs/cloudflare/README.md
🧰 Additional context used
🧬 Code graph analysis (2)
sebs/cloudflare/durable_objects.py (3)
sebs/cloudflare/resources.py (1)
  • config (37-38)
sebs/cloudflare/config.py (12)
  • CloudflareCredentials (9-127)
  • typename (32-33)
  • typename (140-141)
  • typename (210-211)
  • resources (218-219)
  • credentials (214-215)
  • api_token (36-37)
  • email (40-41)
  • api_key (44-45)
  • update_cache (118-121)
  • update_cache (167-173)
  • update_cache (248-251)
sebs/cache.py (3)
  • Cache (37-463)
  • get_nosql_config (177-178)
  • update_nosql (190-193)
sebs/cloudflare/cloudflare.py (2)
sebs/cloudflare/resources.py (4)
  • config (37-38)
  • CloudflareSystemResources (16-95)
  • get_storage (56-78)
  • _get_auth_headers (40-54)
sebs/cloudflare/triggers.py (8)
  • typename (19-20)
  • typename (72-73)
  • LibraryTrigger (7-57)
  • HTTPTrigger (60-154)
  • url (80-82)
  • url (85-86)
  • trigger_type (23-24)
  • trigger_type (76-77)
🪛 Ruff (0.14.10)
sebs/cloudflare/durable_objects.py

56-56: Avoid specifying long messages outside the exception class

(TRY003)


117-117: Unused method argument: primary_key

(ARG002)


117-117: Unused method argument: secondary_key

(ARG002)


147-147: Unused method argument: data

(ARG002)


148-148: Unused method argument: primary_key

(ARG002)


149-149: Unused method argument: secondary_key

(ARG002)


168-168: Avoid specifying long messages outside the exception class

(TRY003)


183-183: f-string without any placeholders

Remove extraneous f prefix

(F541)

sebs/cloudflare/cloudflare.py

73-73: Unused method argument: config

(ARG002)


73-73: Do not use mutable data structures for argument defaults

Replace with None; initialize within function

(B006)


117-117: Do not catch blind exception: Exception

(BLE001)


128-131: Avoid specifying long messages outside the exception class

(TRY003)


134-137: Avoid specifying long messages outside the exception class

(TRY003)


148-148: Probable use of requests call without timeout

(S113)


151-154: Avoid specifying long messages outside the exception class

(TRY003)


162-162: Starting a process with a partial executable path

(S607)


174-174: Starting a process with a partial executable path

(S607)


184-184: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


184-184: Avoid specifying long messages outside the exception class

(TRY003)


186-188: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


186-188: Avoid specifying long messages outside the exception class

(TRY003)


190-190: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


190-190: Avoid specifying long messages outside the exception class

(TRY003)


196-196: Starting a process with a partial executable path

(S607)


208-208: Starting a process with a partial executable path

(S607)


218-218: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


218-218: Avoid specifying long messages outside the exception class

(TRY003)


220-222: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


220-222: Avoid specifying long messages outside the exception class

(TRY003)


224-224: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


224-224: Avoid specifying long messages outside the exception class

(TRY003)


227-227: Unused method argument: container_uri

(ARG002)


385-385: Local variable r2_bucket_configured is assigned to but never used

Remove assignment to unused variable r2_bucket_configured

(F841)


387-387: Do not catch blind exception: Exception

(BLE001)


416-416: Avoid specifying long messages outside the exception class

(TRY003)


423-423: Unused method argument: architecture

(ARG002)


464-464: Unused method argument: is_cached

(ARG002)


482-482: Starting a process with a partial executable path

(S607)


496-496: Starting a process with a partial executable path

(S607)


507-507: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


508-508: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


508-508: Avoid specifying long messages outside the exception class

(TRY003)


510-510: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


511-511: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


511-511: Avoid specifying long messages outside the exception class

(TRY003)


513-515: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


513-515: Avoid specifying long messages outside the exception class

(TRY003)


525-525: Starting a process with a partial executable path

(S607)


533-533: Do not catch blind exception: Exception

(BLE001)


643-646: Avoid specifying long messages outside the exception class

(TRY003)


647-650: Avoid specifying long messages outside the exception class

(TRY003)


654-654: Loop control variable dirnames not used within loop body

Rename unused dirnames to _dirnames

(B007)


685-687: Avoid specifying long messages outside the exception class

(TRY003)


705-705: Avoid specifying long messages outside the exception class

(TRY003)


718-718: f-string without any placeholders

Remove extraneous f prefix

(F541)


804-804: Function definition does not bind loop variable control_flow

(B023)


804-804: Function definition does not bind loop variable builtins

(B023)


855-855: Local variable result is assigned to but never used

Remove assignment to unused variable result

(F841)


856-856: Starting a process with a partial executable path

(S607)


864-864: Do not catch blind exception: Exception

(BLE001)


865-865: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


866-866: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


866-866: Avoid specifying long messages outside the exception class

(TRY003)


924-924: Loop control variable dirnames not used within loop body

Rename unused dirnames to _dirnames

(B007)


956-956: subprocess call: check for execution of untrusted input

(S603)


957-957: Starting a process with a partial executable path

(S607)


969-969: Consider moving this statement to an else block

(TRY300)


975-975: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


976-976: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


978-978: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


978-978: Avoid specifying long messages outside the exception class

(TRY003)


1011-1011: Avoid specifying long messages outside the exception class

(TRY003)


1066-1066: Probable use of requests call without timeout

(S113)


1071-1071: Do not use bare except

(E722)


1132-1132: subprocess call: check for execution of untrusted input

(S603)


1167-1167: Consider moving this statement to an else block

(TRY300)


1170-1170: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


1170-1170: Avoid specifying long messages outside the exception class

(TRY003)


1175-1175: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


1176-1176: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


1178-1178: Unused method argument: package_dir

(ARG002)


1212-1212: Do not use bare except

(E722)


1256-1256: Probable use of requests call without timeout

(S113)


1278-1278: Do not catch blind exception: Exception

(BLE001)


1348-1348: Avoid specifying long messages outside the exception class

(TRY003)


1365-1365: Unused method argument: benchmark

(ARG002)


1391-1391: Unused method argument: resources

(ARG002)


1438-1438: Unused method argument: functions

(ARG002)


1438-1438: Unused method argument: code_package

(ARG002)


1458-1458: Unused method argument: start_time

(ARG002)


1459-1459: Unused method argument: end_time

(ARG002)


1495-1495: Loop control variable request_id not used within loop body

Rename unused request_id to _request_id

(B007)

🔇 Additional comments (11)
benchmarks/100.webapps/130.crud-api/nodejs/function.js (3)

6-13: Previous issue resolved.

The missing await on the async insert operation flagged in the previous review has been addressed. The function now properly awaits the insert operation, preventing fire-and-forget behavior.


15-21: LGTM!

The function correctly awaits the async get operation and returns the result. Implementation is clean and straightforward.


1-4: No issue found. Line 3's nosql.nosql.get_instance() pattern is correct based on the module structure. The nosql.js file exports a class named nosql, and the double property access (variable nosql containing the exports, accessing property nosql for the class) is the natural result of ES6 named exports. While the identical naming between the variable and class is unconventional, it is not a functional error.

benchmarks/wrappers/cloudflare/nodejs/build.js (5)

5-28: LGTM!

The recursive file enumeration logic correctly excludes build artifacts, test files, and configuration files while collecting source files for bundling.


30-36: LGTM!

The file copy utility correctly ensures the destination directory exists before copying.


173-173: LGTM!

The script execution is straightforward and correct.


54-60: No issues found. The request-polyfill.js file exists in the wrapper directory where the build script expects it.


88-106: No action needed. Cloudflare Workers fully supports ES2020 and all later JavaScript features through its V8 engine, which is kept in sync with Chrome stable. The es2020 target in this build configuration is appropriate and compatible.

sebs/cloudflare/durable_objects.py (3)

30-40: LGTM! Previous credential storage issue resolved.

The __init__ method now correctly stores the credentials parameter as self._credentials (line 38), fixing the previously reported AttributeError. The initialization logic is sound.


101-114: LGTM! Cache client property usage is now correct.

The update_cache method now correctly uses self.cache_client (line 107) instead of the non-existent self._cache_client, matching the pattern in retrieve_cache. This resolves the previously reported AttributeError.


186-213: LGTM! Dictionary mutation issue resolved.

The remove_table method now uses a safe two-step approach: first finding the benchmark and table_key (lines 198-206 with list() wrappers to avoid mutation), then performing the deletion after iteration (lines 209-210). This prevents the RuntimeError: dictionary changed size during iteration that was previously flagged.

Comment on lines +23 to +47
async function queryProducts(cartId) {
  const res = await nosqlClient.query(
    nosqlTableName,
    ["cart_id", cartId],
    "product_id"
  );

  const products = [];
  let priceSum = 0;
  let quantitySum = 0;

  for (const product of res) {
    products.push(product.name);
    priceSum += product.price;
    quantitySum += product.quantity;
  }

  const avgPrice = quantitySum > 0 ? priceSum / quantitySum : 0.0;

  return {
    products: products,
    total_cost: priceSum,
    avg_price: avgPrice
  };
}

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

cat -n benchmarks/100.webapps/130.crud-api/nodejs/function.js | head -80

Repository: spcl/serverless-benchmarks

Length of output: 2363


Fix the price aggregation logic.

Lines 36 and 40 contain incorrect calculations for a shopping cart total:

  • Line 36 adds product.price directly to priceSum, ignoring the quantity for that item
  • Line 40 calculates average price as sum(price) / sum(quantity), which is not a meaningful metric

Since product.price is a unit price (as shown in the addProduct function at line 11), total_cost should be sum(price × quantity), not sum(price). For example, 2 units at $10 + 3 units at $5 should total $35, not $15.

Proposed fix:

   for (const product of res) {
     products.push(product.name);
-    priceSum += product.price;
+    priceSum += product.price * product.quantity;
     quantitySum += product.quantity;
   }
 
   const avgPrice = quantitySum > 0 ? priceSum / quantitySum : 0.0;

This calculates the correct total cost and average price per item.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
  async function queryProducts(cartId) {
    const res = await nosqlClient.query(
      nosqlTableName,
      ["cart_id", cartId],
      "product_id"
    );

    const products = [];
    let priceSum = 0;
    let quantitySum = 0;

    for (const product of res) {
      products.push(product.name);
-     priceSum += product.price;
+     priceSum += product.price * product.quantity;
      quantitySum += product.quantity;
    }

    const avgPrice = quantitySum > 0 ? priceSum / quantitySum : 0.0;

    return {
      products: products,
      total_cost: priceSum,
      avg_price: avgPrice
    };
  }
🤖 Prompt for AI Agents
In @benchmarks/100.webapps/130.crud-api/nodejs/function.js around lines 23 - 47,
The total and average calculations in queryProducts are incorrect: priceSum
currently adds product.price without accounting for product.quantity and
avgPrice divides sum(prices) by sum(quantity), which is meaningless; change the
aggregation so priceSum accumulates product.price * product.quantity (i.e.,
priceSum += product.price * product.quantity) and compute avgPrice as
quantitySum > 0 ? priceSum / quantitySum : 0.0 (average unit price per item),
keeping product names appended to products and returning total_cost: priceSum
and avg_price: avgPrice.

Comment on lines +73 to +83
    def initialize(self, config: Dict[str, str] = {}, resource_prefix: Optional[str] = None):
        """
        Initialize the Cloudflare Workers platform.

        Args:
            config: Additional configuration parameters
            resource_prefix: Prefix for resource naming
        """
        # Verify credentials are valid
        self._verify_credentials()
        self.initialize_resources(select_prefix=resource_prefix)

⚠️ Potential issue | 🟡 Minor

Fix mutable default argument and unused parameter.

Line 73 has two issues:

  1. The config parameter is defined but never used in the method body.
  2. The mutable default config: Dict[str, str] = {} can cause unexpected behavior if the same dict instance is reused across calls.
🔧 Proposed fix
-    def initialize(self, config: Dict[str, str] = {}, resource_prefix: Optional[str] = None):
+    def initialize(self, config: Optional[Dict[str, str]] = None, resource_prefix: Optional[str] = None):
         """
         Initialize the Cloudflare Workers platform.
 
         Args:
             config: Additional configuration parameters
             resource_prefix: Prefix for resource naming
         """
+        if config is None:
+            config = {}
         # Verify credentials are valid
         self._verify_credentials()
🧰 Tools
🪛 Ruff (0.14.10)

73-73: Unused method argument: config

(ARG002)


73-73: Do not use mutable data structures for argument defaults

Replace with None; initialize within function

(B006)

🤖 Prompt for AI Agents
In @sebs/cloudflare/cloudflare.py around lines 73 - 83, The initialize method
currently declares a mutable default config and never uses it; change the
signature to use a safe default by replacing config: Dict[str, str] = {} with
config: Optional[Dict[str, str]] = None (or remove the parameter entirely if
it’s not needed), and if you keep it, ensure you normalize it at the top of
initialize (e.g., config = config or {}) before use and pass relevant values
into initialize_resources or other code paths; keep references to the existing
initialize, _verify_credentials, and initialize_resources symbols when making
the change.

Comment on lines +148 to +154
        response = requests.get(f"{self._api_base_url}/user/tokens/verify", headers=headers)

        if response.status_code != 200:
            raise RuntimeError(
                f"Failed to verify Cloudflare credentials: {response.status_code} - {response.text}\n"
                f"Please check that your CLOUDFLARE_API_TOKEN and CLOUDFLARE_ACCOUNT_ID are correct."
            )

⚠️ Potential issue | 🟠 Major

Add timeout to requests call for security.

The requests.get() call on line 148 lacks a timeout parameter, which can cause the application to hang indefinitely if the Cloudflare API is unresponsive. This is a security and reliability concern.

🔒 Proposed fix
-        response = requests.get(f"{self._api_base_url}/user/tokens/verify", headers=headers)
+        response = requests.get(
+            f"{self._api_base_url}/user/tokens/verify",
+            headers=headers,
+            timeout=30  # 30 second timeout for credential verification
+        )
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-        response = requests.get(f"{self._api_base_url}/user/tokens/verify", headers=headers)
+        response = requests.get(
+            f"{self._api_base_url}/user/tokens/verify",
+            headers=headers,
+            timeout=30  # 30 second timeout for credential verification
+        )

         if response.status_code != 200:
             raise RuntimeError(
                 f"Failed to verify Cloudflare credentials: {response.status_code} - {response.text}\n"
                 f"Please check that your CLOUDFLARE_API_TOKEN and CLOUDFLARE_ACCOUNT_ID are correct."
             )
🧰 Tools
🪛 Ruff (0.14.10)

148-148: Probable use of requests call without timeout

(S113)


151-154: Avoid specifying long messages outside the exception class

(TRY003)

🤖 Prompt for AI Agents
In @sebs/cloudflare/cloudflare.py around lines 148 - 154, The requests.get call
used to verify Cloudflare credentials lacks a timeout and can hang; update the
call to include a sensible timeout (e.g., timeout=10) or use a configurable
attribute like self._timeout, i.e., change
requests.get(f"{self._api_base_url}/user/tokens/verify", headers=headers) to
include timeout, and add handling for
requests.exceptions.Timeout/requests.exceptions.RequestException to raise a
clear RuntimeError if the request times out or fails; refer to the requests.get
invocation and the response/status_code check to locate where to add the timeout
and error handling.
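
The committable suggestion above only adds the timeout; the agent prompt also asks for explicit Timeout/RequestException handling. A standalone sketch of how both pieces could fit together, assuming a 10-second timeout and a plain function in place of the method (self._api_base_url and self.logging are replaced by parameters):

import requests

def verify_credentials(api_base_url: str, headers: dict, timeout: float = 10.0) -> None:
    """Sketch only: bounded request plus specific exception handling."""
    try:
        response = requests.get(
            f"{api_base_url}/user/tokens/verify",
            headers=headers,
            timeout=timeout,
        )
    except requests.exceptions.Timeout as err:
        raise RuntimeError(
            f"Cloudflare credential verification timed out after {timeout}s"
        ) from err
    except requests.exceptions.RequestException as err:
        raise RuntimeError(f"Cloudflare credential verification failed: {err}") from err

    if response.status_code != 200:
        raise RuntimeError(
            f"Failed to verify Cloudflare credentials: {response.status_code} - {response.text}\n"
            f"Please check that your CLOUDFLARE_API_TOKEN and CLOUDFLARE_ACCOUNT_ID are correct."
        )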

Comment on lines +1253 to +1280
        try:
            headers = self._get_auth_headers()
            url = f"{self._api_base_url}/accounts/{account_id}/workers/subdomain"
            resp = requests.get(url, headers=headers)
            if resp.status_code == 200:
                body = resp.json()
                sub = None
                # result may contain 'subdomain' or nested structure
                if isinstance(body, dict):
                    sub = body.get("result", {}).get("subdomain")

                if sub:
                    self._workers_dev_subdomain = sub
                    return sub
                else:
                    self.logging.warning(
                        "Could not find workers.dev subdomain in API response; "
                        "please enable the workers.dev subdomain in your Cloudflare dashboard."
                    )
                    return None
            else:
                self.logging.warning(
                    f"Failed to fetch workers.dev subdomain: {resp.status_code} - {resp.text}"
                )
                return None
        except Exception as e:
            self.logging.warning(f"Error fetching workers.dev subdomain: {e}")
            return None

⚠️ Potential issue | 🟠 Major

Add timeout to requests call for reliability.

The requests.get() call on line 1256 lacks a timeout parameter, which can cause the application to hang indefinitely if the Cloudflare API is unresponsive. This affects reliability when fetching the workers.dev subdomain.

🔒 Proposed fix
         try:
             headers = self._get_auth_headers()
             url = f"{self._api_base_url}/accounts/{account_id}/workers/subdomain"
-            resp = requests.get(url, headers=headers)
+            resp = requests.get(url, headers=headers, timeout=30)
             if resp.status_code == 200:
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
         try:
             headers = self._get_auth_headers()
             url = f"{self._api_base_url}/accounts/{account_id}/workers/subdomain"
-            resp = requests.get(url, headers=headers)
+            resp = requests.get(url, headers=headers, timeout=30)
             if resp.status_code == 200:
                 body = resp.json()
                 sub = None
                 # result may contain 'subdomain' or nested structure
                 if isinstance(body, dict):
                     sub = body.get("result", {}).get("subdomain")

                 if sub:
                     self._workers_dev_subdomain = sub
                     return sub
                 else:
                     self.logging.warning(
                         "Could not find workers.dev subdomain in API response; "
                         "please enable the workers.dev subdomain in your Cloudflare dashboard."
                     )
                     return None
             else:
                 self.logging.warning(
                     f"Failed to fetch workers.dev subdomain: {resp.status_code} - {resp.text}"
                 )
                 return None
         except Exception as e:
             self.logging.warning(f"Error fetching workers.dev subdomain: {e}")
             return None
🧰 Tools
🪛 Ruff (0.14.10)

1256-1256: Probable use of requests call without timeout

(S113)


1278-1278: Do not catch blind exception: Exception

(BLE001)

🤖 Prompt for AI Agents
In @sebs/cloudflare/cloudflare.py around lines 1253 - 1280, The requests.get
call in the method that fetches the workers.dev subdomain should include a
timeout to avoid hanging; update the call to requests.get(url, headers=headers,
timeout=...) and choose a sensible timeout (e.g., 5-10s), and adjust the
surrounding exception handling to specifically catch requests.Timeout and
requests.RequestException where appropriate so the method still logs via
self.logging.warning and returns None on timeout or other request errors; refer
to the existing method using _get_auth_headers, self._workers_dev_subdomain, and
the current requests.get usage to locate and modify the call.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 7

🤖 Fix all issues with AI agents
In @benchmarks/wrappers/cloudflare/python/nosql.py:
- Line 192: The code is brittle because get_res is being massaged with
get_res.replace("'", "\"") before JSON parsing; stop doing that and parse the
data in its true format: ensure the producer stores JSON strings (fix the writer
to use json.dumps) or, if the KV contains Python literal reprs, replace the
fragile replace() with a safe parser (use json.loads(get_res) when data is valid
JSON, or ast.literal_eval(get_res) when Python literals are expected) and add
validation/error handling; remove the replace call and update the code that
reads/writes the KV so the stored format is unambiguous and parsed with the
proper function (reference the get_res variable and the surrounding KV
read/write logic).
- Line 91: Replace the hard assertion "assert len(keys) <= 100" with explicit
validation that raises a descriptive exception when the limit is exceeded: check
if len(keys) > 100 and raise a ValueError (or a custom exception) that includes
the actual len(keys) and the maximum allowed (100) so callers get a clear error
message instead of an abrupt assertion failure; update any docstring/comments
for the function in nosql.py that handles "keys" to reflect this explicit error
behavior.
- Line 184: Replace the hard assertion "assert len(keys) <= 100" with proper
error handling: detect when len(keys) > 100 and raise a descriptive exception
(e.g., ValueError) that includes the actual count and context (which
operation/function and parameter causing it). Apply the same pattern used to fix
nosql_do.query(): validate the keys collection early in the function that
constructs/uses the "keys" variable and raise a clear, actionable error instead
of asserting so the caller can handle it.
- Line 117: The class-level annotation for the singleton "instance" is
referencing the wrong name "nosql"; update the type hint to reference the
correct class "nosql_kv" (i.e., change the annotation on the variable instance:
Optional["nosql"] to Optional["nosql_kv"]) so the forward reference matches the
actual class name used in this module (look for the class definition named
nosql_kv and the variable instance).
- Around line 120-122: The init_instance function incorrectly constructs and
assigns nosql.instance using the nosql class; change it to instantiate and
assign nosql_kv instead (i.e., use nosql_kv() for creation and assign to
nosql.instance.env from the entry.env), ensuring init_instance references the
nosql_kv class for both the new instance and the instance property.
- Around line 204-207: get_instance currently constructs the wrong class by
calling nosql() (the nosql_do alias) instead of the KV implementation; change
the instantiation so that when nosql.instance is None you assign nosql.instance
= nosql_kv() (or the correct nosql_kv class/constructor) and return
nosql.instance, ensuring you're referencing the nosql_kv symbol used for the
key-value implementation rather than nosql().
- Line 88: Remove the leftover debug print by deleting the "print(key)" call in
benchmarks/wrappers/cloudflare/python/nosql.py; if key inspection is still
needed use the module's logger (e.g., logger.debug(...)) instead of printing to
stdout to avoid leaking debug output in production.
🧹 Nitpick comments (5)
benchmarks/wrappers/cloudflare/python/nosql.py (5)

51-51: Remove unused variable assignment.

The put_res variable is assigned but never used. Consider removing it or using it for error handling.

♻️ Proposed fix
     send_data = self.data_pre(data)
     k=self.key_maker(primary_key, secondary_key)
-    put_res = run_sync(self.get_table(table_name).put(k, send_data))
+    run_sync(self.get_table(table_name).put(k, send_data))
     return

94-94: Consider implementing bulk operations.

The TODO comment indicates that bulk operations would be preferred over individual gets. This could improve performance for queries returning multiple keys.

Would you like me to help investigate or implement bulk get operations for this query method?


140-144: Remove unused variable assignment.

Similar to line 51, the put_res variable is assigned but never used.

♻️ Proposed fix
-    put_res = (
-        run_sync(self.get_table(table_name).put(
+    run_sync(self.get_table(table_name).put(
             self.key_maker(primary_key, secondary_key),
-            json.dumps(data))
-        ))
+            json.dumps(data)))
     return

154-158: Remove unused variable assignment.

The put_res variable is assigned but never used.

♻️ Proposed fix
-    put_res = run_sync(
-        self.get_table(table_name).put(
+    run_sync(
+        self.get_table(table_name).put(
             self.key_maker(primary_key, secondary_key),
-            json.dumps(data)
-        ))
+            json.dumps(data)))
     return

187-187: Consider implementing bulk operations.

Similar to the TODO in nosql_do.query(), bulk operations would improve performance.

Would you like me to help investigate or implement bulk get operations for this query method?
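
One possible shape for the bulk get mentioned in both TODOs above, assuming the underlying KV/Durable Object get calls are awaitable and can be gathered before a single run_sync; the batching helper itself is hypothetical.

import asyncio

async def bulk_get(table, keys):
    # Issue all gets concurrently instead of one run_sync round trip per key.
    return await asyncio.gather(*(table.get(k) for k in keys))

# Possible use inside query(), with the same run_sync/get_table helpers as above:
# results = run_sync(bulk_get(self.get_table(table_name), keys))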

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8516207 and 8b7003b.

📒 Files selected for processing (1)
  • benchmarks/wrappers/cloudflare/python/nosql.py
🧰 Additional context used
🧬 Code graph analysis (1)
benchmarks/wrappers/cloudflare/python/nosql.py (1)
benchmarks/wrappers/cloudflare/python/container/nosql.py (8)
  • init_instance (18-22)
  • insert (53-66)
  • update (68-81)
  • get (83-92)
  • query (94-103)
  • delete (105-111)
  • get_instance (114-117)
  • nosql (11-117)
🪛 Ruff (0.14.10)
benchmarks/wrappers/cloudflare/python/nosql.py

23-23: Use explicit conversion flag

Replace with conversion flag

(RUF010)


26-26: Use explicit conversion flag

Replace with conversion flag

(RUF010)


38-38: pickle and modules that wrap it can be unsafe when used to deserialize untrusted data, possible security issue

(S301)


40-40: pickle and modules that wrap it can be unsafe when used to deserialize untrusted data, possible security issue

(S301)


51-51: Local variable put_res is assigned to but never used

Remove assignment to unused variable put_res

(F841)


125-125: Use explicit conversion flag

Replace with conversion flag

(RUF010)


128-128: Use explicit conversion flag

Replace with conversion flag

(RUF010)


140-140: Local variable put_res is assigned to but never used

Remove assignment to unused variable put_res

(F841)


154-154: Local variable put_res is assigned to but never used

Remove assignment to unused variable put_res

(F841)

🔇 Additional comments (1)
benchmarks/wrappers/cloudflare/python/nosql.py (1)

9-15: Previously reported critical issues have been fixed.

The class name mismatches in nosql_do that were flagged in the previous review have been correctly addressed. The type annotation now uses "nosql_do" and all instance references use nosql_do.instance.

    keys = []
    for key in list_res:
        if key.startswith(prefix_key):
            print(key)

⚠️ Potential issue | 🟡 Minor

Remove debug print statement.

This print statement appears to be leftover debug code and should be removed for production use.

🔧 Proposed fix
     for key in list_res:
         if key.startswith(prefix_key):
-            print(key)
             keys.append(key)
🤖 Prompt for AI Agents
In @benchmarks/wrappers/cloudflare/python/nosql.py at line 88, Remove the
leftover debug print by deleting the "print(key)" call in
benchmarks/wrappers/cloudflare/python/nosql.py; if key inspection is still
needed use the module's logger (e.g., logger.debug(...)) instead of printing to
stdout to avoid leaking debug output in production.

            print(key)
            keys.append(key)
    ##print("keys", keys)
    assert len(keys) <= 100

⚠️ Potential issue | 🟠 Major

Replace hard assertion with proper error handling.

The hard assertion will crash the program if more than 100 keys are found. This should raise a descriptive exception instead to provide better error handling.

🛡️ Proposed fix
     ##print("keys", keys)
-    assert len(keys) <= 100
+    if len(keys) > 100:
+        raise ValueError(f"Query returned {len(keys)} keys, exceeding the 100 key limit")
🤖 Prompt for AI Agents
In @benchmarks/wrappers/cloudflare/python/nosql.py at line 91, Replace the hard
assertion "assert len(keys) <= 100" with explicit validation that raises a
descriptive exception when the limit is exceeded: check if len(keys) > 100 and
raise a ValueError (or a custom exception) that includes the actual len(keys)
and the maximum allowed (100) so callers get a clear error message instead of an
abrupt assertion failure; update any docstring/comments for the function in
nosql.py that handles "keys" to reflect this explicit error behavior.

    for key in list_res.keys:
        keys.append(key.name)
    ##print("keys", keys)
    assert len(keys) <= 100

⚠️ Potential issue | 🟠 Major

Replace hard assertion with proper error handling.

Same issue as in nosql_do.query(): the hard assertion will crash if more than 100 keys are found. Replace with a descriptive exception.

🛡️ Proposed fix
     ##print("keys", keys)
-    assert len(keys) <= 100
+    if len(keys) > 100:
+        raise ValueError(f"Query returned {len(keys)} keys, exceeding the 100 key limit")
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-    assert len(keys) <= 100
+    if len(keys) > 100:
+        raise ValueError(f"Query returned {len(keys)} keys, exceeding the 100 key limit")
🤖 Prompt for AI Agents
In @benchmarks/wrappers/cloudflare/python/nosql.py at line 184, Replace the hard
assertion "assert len(keys) <= 100" with proper error handling: detect when
len(keys) > 100 and raise a descriptive exception (e.g., ValueError) that
includes the actual count and context (which operation/function and parameter
causing it). Apply the same pattern used to fix nosql_do.query(): validate the
keys collection early in the function that constructs/uses the "keys" variable
and raise a clear, actionable error instead of asserting so the caller can
handle it.

    for key in keys:

        get_res = run_sync(self.get_table(table_name).get(key))
        get_res = get_res.replace("\'", "\"")

⚠️ Potential issue | 🟠 Major

Fragile string manipulation for JSON parsing.

Replacing single quotes with double quotes to fix JSON parsing suggests the data is being stored or retrieved in an incorrect format. This approach is brittle and could fail if the data contains escaped quotes or quotes within string values.

Investigation needed

Consider the following scenarios where this will break:

  • Data containing: {"name": "O'Brien"} → becomes {"name": "O"Brien"} (invalid JSON)
  • Data containing: {"quote": "She said 'hello'"} → mismatched quotes

Root cause options:

  1. If the KV store is incorrectly storing Python repr() instead of JSON, fix the storage format
  2. If this is intentional, document why and add validation
  3. Consider using ast.literal_eval() if Python literals are expected
🤖 Prompt for AI Agents
In @benchmarks/wrappers/cloudflare/python/nosql.py at line 192, The code is
brittle because get_res is being massaged with get_res.replace("'", "\"") before
JSON parsing; stop doing that and parse the data in its true format: ensure the
producer stores JSON strings (fix the writer to use json.dumps) or, if the KV
contains Python literal reprs, replace the fragile replace() with a safe parser
(use json.loads(get_res) when data is valid JSON, or ast.literal_eval(get_res)
when Python literals are expected) and add validation/error handling; remove the
replace call and update the code that reads/writes the KV so the stored format
is unambiguous and parsed with the proper function (reference the get_res
variable and the surrounding KV read/write logic).
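
A hedged sketch of option 3 above: try strict JSON first and fall back to ast.literal_eval only if the KV value really is a Python repr (an assumption about the current writer), instead of rewriting quotes.

import ast
import json

def parse_stored_value(raw: str):
    """Parse a value read back from KV without the fragile quote rewrite."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        try:
            # Assumption: some existing values were stored as Python reprs.
            return ast.literal_eval(raw)
        except (ValueError, SyntaxError) as err:
            raise ValueError(f"Stored value is neither JSON nor a Python literal: {raw!r}") from err

# In query(), the replace() call would then become:
# get_res = parse_stored_value(run_sync(self.get_table(table_name).get(key)))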

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In @benchmarks/wrappers/cloudflare/python/nosql.py:
- Around line 161-168: The get method currently returns the raw stored string,
violating its Optional[dict] contract; update the get function (involving get(),
key_maker(), get_table(), run_sync()) to parse the retrieved JSON string with
json.loads before returning, handle the case where the run_sync result is None
by returning None, and propagate or wrap JSON parsing errors appropriately so
callers receive a dict or None consistent with insert/update behavior.
🧹 Nitpick comments (1)
benchmarks/wrappers/cloudflare/python/nosql.py (1)

54-63: Clarify update vs. insert semantics.

The comment on line 54 questions whether update should have different behavior from insert. In typical NoSQL patterns, update might want to:

  • Merge with existing data rather than overwrite
  • Return an error if the key doesn't exist (vs. insert creating it)
  • Perform atomic updates or conditional writes

If this distinction matters for your use case, the methods should be differentiated; otherwise, consider removing the comment or explicitly documenting that they're intentionally identical.

Would you like me to suggest an implementation that differentiates insert and update operations?
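
If the merge behaviour is wanted, a minimal sketch of what a merge-style update could look like; the table binding, run_sync bridge, and JSON storage format follow the wrapper code quoted earlier, but the read-modify-write itself is only illustrative and is not atomic.

import json

def update_merge(table, key, changes, run_sync):
    """Merge-style update: read the existing JSON record, merge, write back.

    `table` is a KV/Durable Object binding with async get/put and `run_sync`
    is the wrapper's sync bridge; both are assumptions taken from the snippets
    above. KV has no conditional writes, so concurrent updates can still race.
    """
    existing = run_sync(table.get(key))
    if existing is None:
        raise KeyError(f"cannot update missing key {key!r}")
    merged = {**json.loads(existing), **changes}
    run_sync(table.put(key, json.dumps(merged)))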

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8b7003b and 5d9e36d.

📒 Files selected for processing (1)
  • benchmarks/wrappers/cloudflare/python/nosql.py
🧰 Additional context used
🪛 Ruff (0.14.10)
benchmarks/wrappers/cloudflare/python/nosql.py

23-23: Use explicit conversion flag

Replace with conversion flag

(RUF010)


26-26: Use explicit conversion flag

Replace with conversion flag

(RUF010)


38-38: pickle and modules that wrap it can be unsafe when used to deserialize untrusted data, possible security issue

(S301)


40-40: pickle and modules that wrap it can be unsafe when used to deserialize untrusted data, possible security issue

(S301)


51-51: Local variable put_res is assigned to but never used

Remove assignment to unused variable put_res

(F841)


125-125: Use explicit conversion flag

Replace with conversion flag

(RUF010)


128-128: Use explicit conversion flag

Replace with conversion flag

(RUF010)


140-140: Local variable put_res is assigned to but never used

Remove assignment to unused variable put_res

(F841)


154-154: Local variable put_res is assigned to but never used

Remove assignment to unused variable put_res

(F841)

🔇 Additional comments (1)
benchmarks/wrappers/cloudflare/python/nosql.py (1)

29-40: Verify pickle security implications.

Pickle deserialization can be unsafe with untrusted data. Ensure that the Durable Object storage is fully trusted and cannot be manipulated by untrusted sources. If there's any possibility of untrusted data entering this path, switch to JSON serialization instead.
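
If any untrusted data could ever reach this path, a JSON round trip is the usual replacement for pickle. A minimal sketch, assuming the stored values are JSON-representable dicts; data_post is a hypothetical counterpart to the wrapper's data_pre.

import json

def data_pre(data: dict) -> str:
    # Serialize with JSON instead of pickle; only JSON-representable values
    # (dicts, lists, strings, numbers, bools, None) survive the round trip.
    return json.dumps(data)

def data_post(raw: str) -> dict:
    # Hypothetical counterpart to data_pre; the real wrapper may name this differently.
    return json.loads(raw)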

Comment on lines +161 to +168
    def get(
        self, table_name: str, primary_key: Tuple[str, str], secondary_key: Tuple[str, str]
    ) -> Optional[dict]:
        get_res = run_sync(
            self.get_table(table_name).get(
                self.key_maker(primary_key, secondary_key)
            ))
        return get_res

⚠️ Potential issue | 🔴 Critical

Parse JSON in get method to match the return type contract.

The insert and update methods store data using json.dumps() (lines 143, 157), but this get method returns the raw string without parsing. This violates the return type Optional[dict] and creates inconsistent behavior with nosql_do.get().

🔧 Proposed fix
     def get(
         self, table_name: str, primary_key: Tuple[str, str], secondary_key: Tuple[str, str]
     ) -> Optional[dict]:
         get_res = run_sync(
             self.get_table(table_name).get(
                 self.key_maker(primary_key, secondary_key)
             ))
-        return get_res
+        if get_res is None:
+            return None
+        return json.loads(get_res)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
     def get(
         self, table_name: str, primary_key: Tuple[str, str], secondary_key: Tuple[str, str]
     ) -> Optional[dict]:
         get_res = run_sync(
             self.get_table(table_name).get(
                 self.key_maker(primary_key, secondary_key)
             ))
-        return get_res
+        if get_res is None:
+            return None
+        return json.loads(get_res)
🤖 Prompt for AI Agents
In @benchmarks/wrappers/cloudflare/python/nosql.py around lines 161 - 168, The
get method currently returns the raw stored string, violating its Optional[dict]
contract; update the get function (involving get(), key_maker(), get_table(),
run_sync()) to parse the retrieved JSON string with json.loads before returning,
handle the case where the run_sync result is None by returning None, and
propagate or wrap JSON parsing errors appropriately so callers receive a dict or
None consistent with insert/update behavior.

ldzgch and others added 2 commits January 7, 2026 16:37
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In @benchmarks/wrappers/cloudflare/python/storage.py:
- Around line 46-61: The bug is double-prefixing "/tmp" for already-absolute tmp
paths causing invalid paths; fix by (1) in download(bucket, key, filepath) add
the actual filesystem path (real_fp) to written_files instead of the requested
filepath (self.written_files.add(real_fp)) and (2) in upload(bucket, key,
filepath) when you detect filepath in written_files, only prepend "/tmp" if the
filepath does not already start with "/tmp" (e.g., if filepath in
self.written_files: filepath = filepath if filepath.startswith("/tmp") else
"/tmp" + os.path.abspath(filepath)). Ensure these changes use the existing
upload, download, and written_files symbols.
- Around line 101-109: In adownload_stream, get_res from
get_bucket(...).get(key) is an R2ObjectBody and does not have .bytes(); replace
the await get_res.bytes() call with await get_res.arrayBuffer() and convert the
returned ArrayBuffer to Python bytes by wrapping it in a JS Uint8Array (e.g.,
js.Uint8Array.new(array_buffer)) and then calling bytes(...) on that result so
the function returns proper binary data instead of breaking at runtime.
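
A hedged sketch of the adownload_stream fix as the prompt describes it, assuming the Pyodide-style js module available in the Python Workers runtime; the missing-key guard is an addition for the sketch only.

from js import Uint8Array  # provided by the Python Workers (Pyodide) runtime

async def adownload_stream(self, bucket, key) -> bytes:
    # R2ObjectBody exposes arrayBuffer(), not bytes(); convert the ArrayBuffer
    # to Python bytes via a JS Uint8Array view, as the prompt above describes.
    obj = await self.get_bucket(bucket).get(key)
    if obj is None:  # sketch-only guard; R2 get() returns null for missing keys
        raise KeyError(f"{key!r} not found in bucket")
    array_buffer = await obj.arrayBuffer()
    return bytes(Uint8Array.new(array_buffer))
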
🧹 Nitpick comments (3)
benchmarks/wrappers/cloudflare/python/storage.py (3)

35-38: Unused bucket parameter in get_bucket

bucket is never used here, which triggers ARG002 and can confuse readers. If you don’t plan to support multiple bindings, either drop the parameter or rename it to _bucket to make the intent explicit.


75-96: Unused put_res from bobj.put

put_res is assigned but never used, triggering F841 and slightly obscuring intent. Since you only care about the side effect of the PUT (and exceptions will propagate), you can drop the variable:

♻️ Simplify `aupload_stream`
-        bobj = self.get_bucket(bucket)
-        put_res = await bobj.put(unique_key, data_js)
-        return unique_key
+        bobj = self.get_bucket(bucket)
+        await bobj.put(unique_key, data_js)
+        return unique_key

111-116: Duplicate return storage.instance is dead code

The second return storage.instance is unreachable and can be removed:

✂️ Remove dead code
     def get_instance():
         if storage.instance is None:
             raise RuntimeError("must init storage singleton first")
-        return storage.instance
-        return storage.instance
+        return storage.instance

The current exception message is short and clear; TRY003 from Ruff is mostly stylistic here, so I’d keep it as-is unless you prefer a custom exception type.

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5d9e36d and 8d4cc0f.

📒 Files selected for processing (1)
  • benchmarks/wrappers/cloudflare/python/storage.py
🧰 Additional context used
🪛 Ruff (0.14.10)
benchmarks/wrappers/cloudflare/python/storage.py

35-35: Unused method argument: bucket

(ARG002)


48-48: Probable insecure usage of temporary file or directory: "/tmp"

(S108)


57-57: Probable insecure usage of temporary file or directory: "/tmp"

(S108)


58-58: Probable insecure usage of temporary file or directory: "/tmp"

(S108)


95-95: Local variable put_res is assigned to but never used

Remove assignment to unused variable put_res

(F841)


114-114: Avoid specifying long messages outside the exception class

(TRY003)

Comment on lines +46 to +61
def upload(self, bucket, key, filepath):
if filepath in self.written_files:
filepath = "/tmp" + os.path.abspath(filepath)
with open(filepath, "rb") as f:
unique_key = self.upload_stream(bucket, key, f.read())
return unique_key

def download(self, bucket, key, filepath):
data = self.download_stream(bucket, key)
# should only allow writes to tmp dir. so do have to edit the filepath here?
real_fp = filepath
if not filepath.startswith("/tmp"):
real_fp = "/tmp" + os.path.abspath(filepath)

self.written_files.add(filepath)
with open(real_fp, "wb") as f:

⚠️ Potential issue | 🟠 Major

/tmp prefixing can generate invalid paths for already-absolute tmp files

The interaction between upload, download, and written_files has a corner case:

  • download records the requested filepath in written_files.
  • upload rewrites any path found in written_files by prepending "/tmp" + os.path.abspath(filepath).

If a caller ever uses a path that already starts with /tmp (e.g., via download_directory(..., out_path="/tmp")), you’ll end up trying to open /tmp/tmp/..., which likely doesn’t exist.

You can retain the current behavior for non‑/tmp paths and avoid the double prefix with a small guard:

🔧 Suggested fix for `/tmp` double-prefix
     def upload(self, bucket, key, filepath):
-        if filepath in self.written_files:
-            filepath = "/tmp" + os.path.abspath(filepath)
+        if filepath in self.written_files and not filepath.startswith("/tmp"):
+            filepath = "/tmp" + os.path.abspath(filepath)

Note: Ruff’s S108 warning about /tmp is expected in Workers, since /tmp is the canonical writable directory; your os.path.abspath usage already mitigates simple traversal issues.

🧰 Tools
🪛 Ruff (0.14.10)

48-48: Probable insecure usage of temporary file or directory: "/tmp"

(S108)


57-57: Probable insecure usage of temporary file or directory: "/tmp"

(S108)


58-58: Probable insecure usage of temporary file or directory: "/tmp"

(S108)

🤖 Prompt for AI Agents
In @benchmarks/wrappers/cloudflare/python/storage.py around lines 46 - 61, The
bug is double-prefixing "/tmp" for already-absolute tmp paths causing invalid
paths; fix by (1) in download(bucket, key, filepath) add the actual filesystem
path (real_fp) to written_files instead of the requested filepath
(self.written_files.add(real_fp)) and (2) in upload(bucket, key, filepath) when
you detect filepath in written_files, only prepend "/tmp" if the filepath does
not already start with "/tmp" (e.g., if filepath in self.written_files: filepath
= filepath if filepath.startswith("/tmp") else "/tmp" +
os.path.abspath(filepath)). Ensure these changes use the existing upload,
download, and written_files symbols.

Comment on lines +101 to +109
async def adownload_stream(self, bucket, key):
bobj = self.get_bucket(bucket)
get_res = await bobj.get(key)
if get_res == jsnull:
print("key not stored in bucket")
return b''
# Always read as raw binary data (Blob/ArrayBuffer)
data = await get_res.bytes()
return bytes(data)

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🌐 Web query:

Cloudflare Workers R2 JavaScript API Bucket.get() return object methods bytes() arrayBuffer()

💡 Result:

Short answer: env.BUCKET.get(key) returns Promise<R2ObjectBody | R2Object | null>. When you get an R2ObjectBody it exposes a ReadableStream in .body plus convenience read helpers — arrayBuffer(), text(), json(), blob() and bodyUsed. There is no bytes() method on the R2ObjectBody; use arrayBuffer() (or blob()/text()) to read the bytes. [1][2]

Example:

  • Get object and read as ArrayBuffer:
    const obj = await env.MY_BUCKET.get("file");
    if (obj?.body) {
    const buf = await obj.arrayBuffer(); // ArrayBuffer of the object's bytes
    // buf.byteLength ...
    }

Sources: Cloudflare Workers R2 API reference (R2Object / R2ObjectBody) and Download objects docs. [1][2]


🏁 Script executed:

# First, understand the file structure and locate relevant files
git ls-files | grep -E "(cloudflare|R2|storage)" | head -20

Repository: spcl/serverless-benchmarks

Length of output: 1122


🏁 Script executed:

# Look at the storage.py file to understand the full context
cat -n benchmarks/wrappers/cloudflare/python/storage.py

Repository: spcl/serverless-benchmarks

Length of output: 4927


🏁 Script executed:

# Search for usage of 'bytes()' method or 'get_res' patterns in the codebase
rg "\.bytes\(\)" --type py -B 2 -A 2

Repository: spcl/serverless-benchmarks

Length of output: 915


🏁 Script executed:

# Search for R2 imports or references
rg -i "r2\|bucket\.get" --type py -B 2 -A 2

Repository: spcl/serverless-benchmarks

Length of output: 52


Fix: R2 get() returns R2ObjectBody which has no .bytes() method

The Cloudflare R2 API for Bucket.get() returns an R2ObjectBody that exposes arrayBuffer(), text(), json(), and blob(), but not bytes(). Line 108 will fail at runtime with "bytes is not a function".

Use arrayBuffer() instead:

Suggested fix
data = await get_res.arrayBuffer()
return bytes(js.Uint8Array.new(data))
🤖 Prompt for AI Agents
In @benchmarks/wrappers/cloudflare/python/storage.py around lines 101 - 109, In
adownload_stream, get_res from get_bucket(...).get(key) is an R2ObjectBody and
does not have .bytes(); replace the await get_res.bytes() call with await
get_res.arrayBuffer() and convert the returned ArrayBuffer to Python bytes by
wrapping it in a JS Uint8Array (e.g., js.Uint8Array.new(array_buffer)) and then
calling bytes(...) on that result so the function returns proper binary data
instead of breaking at runtime.

@@ -0,0 +1,474 @@
# Cloudflare Workers Implementation for SeBS

This directory contains the **complete implementation** of Cloudflare Workers support for the SeBS (Serverless Benchmarking Suite).
Collaborator

Can you please move it to docs?

Collaborator

I think we should merge with existing docs, possibly splitting the file - something to benchmark, something else to platforms, something else to storage, etc.

## Implementation Status

**Fully Implemented** - All features are production-ready:
- Multi-language support (JavaScript, Python, Java, Go, Rust) via containers
Collaborator

I don't think we have Go and Rust support :-)

from sebs.faas.function import Trigger, ExecutionResult


class LibraryTrigger(Trigger):
Collaborator

I think you don't need this class at all; just don't implement this trigger type.

Author

removed.

COPY . .

# Create ffmpeg directory and symlink for video-processing benchmark compatibility
RUN mkdir -p /app/ffmpeg && ln -s /usr/bin/ffmpeg /app/ffmpeg/ffmpeg
Collaborator

This does not look correct - all of this should be handled by the benchmark itself.

Author

Without this symlink, benchmark 220 looks for ffmpeg under /app and throws:

FileNotFoundError: [Errno 2] No such file or directory: '/app/ffmpeg/ffmpeg'\n'}

Author

Ah, I just noticed the other platforms simply run the init.sh file in the benchmark directory, which installs ffmpeg etc. at the right locations, so I updated our code to do the same.

# self.logging.info("Waiting 120 seconds for benchmark initialization...")
# time.sleep(400)

# For container deployments, wait for Durable Object infrastructure to initialize
Collaborator

Didn't we do it earlier?

Author

This is a safety net: provisioning of the container itself can sometimes take longer, even though the /health endpoint already responds.
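For reference, a bounded health poll of this kind could look like the sketch below (illustrative only, not the PR's actual _wait_for_durable_object_ready; the /health path and the timing values are assumptions):

import time
import requests

def wait_for_health(worker_url: str, max_wait_seconds: int = 300, poll_interval: int = 5) -> bool:
    """Poll the worker's /health endpoint until it answers 200 or the deadline passes."""
    deadline = time.time() + max_wait_seconds
    while time.time() < deadline:
        try:
            resp = requests.get(f"{worker_url}/health", timeout=10)
            if resp.status_code == 200:
                return True  # worker responded, so its Durable Object is provisioned
        except requests.exceptions.RequestException:
            pass  # not reachable yet, keep polling
        time.sleep(poll_interval)
    return False  # caller can decide whether to fall back to a fixed extra sleep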

raise RuntimeError(error_msg)

def _wait_for_durable_object_ready(self, worker_name: str, package_dir: str, env: dict):
"""Wait for container Durable Object to be fully provisioned and ready."""
Collaborator

I don't get this - it looks like we wait until at least one worker is ready and running but how is this related to Durable Object?

Author

Well, the workers instantiate the Durable Object, so we need to wait for the worker to be ready and, transitively, for the Durable Object.

def _build_workers_dev_url(self, worker_name: str, account_id: Optional[str]) -> str:
"""Build a best-effort public URL for a worker.
Prefer using the account's readable workers.dev subdomain when available
Collaborator

When does it fail?

Author

Only if the user's workers.dev subdomain has not been configured correctly.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 5

🤖 Fix all issues with AI agents
In `@benchmarks/wrappers/cloudflare/python/nosql.py`:
- Around line 107-111: The get_instance method on nosql_do currently constructs
a nosql_do() without ensuring the required binding attribute, which leads to
AttributeError if called before init_instance; update nosql_do.get_instance to
validate that an instance exists and that instance.binding is set (or otherwise
initialized) and if not, raise a clear RuntimeError instructing callers to call
nosql_do.init_instance(...) first (or provide the needed binding), or
alternatively call the existing init_instance flow if safe; reference the
nosql_do.get_instance and nosql_do.init_instance symbols and the
instance.binding attribute in your change.
- Around line 203-207: get_instance currently constructs a nosql_kv() without
ensuring the required env attribute is set; change get_instance (and mirror the
change made for nosql_do.get_instance) so it either calls init_instance to
create the singleton with env or validates and raises a clear error if env
hasn't been initialized; specifically update nosql_kv.get_instance to return the
existing instance if set, otherwise call nosql_kv.init_instance(...) (or require
an env parameter) to create the instance with env populated, or throw an
explicit exception telling callers to call init_instance first, so subsequent
methods on the singleton won't hit AttributeError due to a missing env.

In `@docs/platforms.md`:
- Around line 227-228: Update the workers.dev example URL in the HTTP Trigger
bullet: replace the placeholder https://{name}.{account}.workers.dev with
https://{name}.{subdomain}.workers.dev in the "- **HTTP Trigger**" line so the
docs use the correct {subdomain} token instead of {account}.
- Around line 236-244: Add blank lines before and after the markdown table that
begins with "| Instance Type | vCPU | Memory | Disk |" so the table is separated
from surrounding paragraphs (specifically the "Metrics Collection" paragraph
that follows); this satisfies markdownlint rule MD058—insert one empty line
above the table and one empty line below it in docs/platforms.md.

In `@sebs/cloudflare/cloudflare.py`:
- Around line 1451-1479: The digit-prefix rule must apply to all workers: in
format_function_name, after computing formatted (and stripping hyphens), check
if formatted is non-empty and starts with a digit; if so, prefix it
appropriately — use 'container-' when container_deployment is True and 'w-' (or
another agreed native-worker prefix) when container_deployment is False —
instead of only handling the container_deployment branch; update the conditional
that currently checks "if container_deployment and formatted and
formatted[0].isdigit()" to always enforce the digit-prefix rule based on
formatted and container_deployment.
♻️ Duplicate comments (9)
benchmarks/wrappers/cloudflare/python/nosql.py (3)

86-91: Remove debug print and replace assertion with proper error handling.

Line 88 contains a debug print statement that should be removed. Line 91 uses a hard assert that will crash with an unhelpful AssertionError if more than 100 keys are returned.

🔧 Proposed fix
         keys = []
         for key in list_res:
             if key.startswith(prefix_key):
-                print(key)
                 keys.append(key)
         ##print("keys", keys)
-        assert len(keys) <= 100
+        if len(keys) > 100:
+            raise ValueError(f"Query returned {len(keys)} keys, exceeding the 100 key limit")

161-168: get method returns raw string instead of parsed dict.

The method signature declares -> Optional[dict], but it returns the raw string from KV storage. Since insert and update store data with json.dumps(), this method should parse the result with json.loads() for consistency.

🔧 Proposed fix
     def get(
         self, table_name: str, primary_key: Tuple[str, str], secondary_key: Tuple[str, str]
     ) -> Optional[dict]:
         get_res = run_sync(
             self.get_table(table_name).get(
                 self.key_maker(primary_key, secondary_key)
             ))
-        return get_res
+        if get_res is None:
+            return None
+        return json.loads(get_res)

180-196: Replace assertion and fix fragile JSON parsing.

Line 184 has the same hard assertion issue. Line 192 uses brittle string replacement (replacing every ' with ") which will corrupt data containing apostrophes (e.g., {"name": "O'Brien"}).

🔧 Proposed fix

If the get method is fixed to return json.loads(get_res), this query method can be simplified:

         keys = []
         for key in list_res.keys:
             keys.append(key.name)
         ##print("keys", keys)
-        assert len(keys) <= 100
+        if len(keys) > 100:
+            raise ValueError(f"Query returned {len(keys)} keys, exceeding the 100 key limit")


         # todo: please use bulk sometime (it didn't work when i tried it)
         res = []
         for key in keys:
             
             get_res = run_sync(self.get_table(table_name).get(key))
-            get_res = get_res.replace("\'", "\"")
-            ##print("gr", get_res)
-        
-            res.append(json.loads(get_res))
+            if get_res is not None:
+                res.append(json.loads(get_res))
         return res

The root cause of needing quote replacement suggests data may be stored incorrectly elsewhere. Verify that all writers use json.dumps() consistently.
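As a quick illustration of why the quote replacement is unsafe (a hedged example, not code from the PR): writing str(value) instead of json.dumps(value) is what makes the ' -> " patch necessary in the first place, and the patch itself breaks legitimate apostrophes:

import json

value = {"name": "O'Brien", "qty": 2}

# Writing the Python repr instead of JSON yields single-quoted keys:
bad = str(value)                 # {'name': "O'Brien", 'qty': 2}
# The ' -> " patch then corrupts the apostrophe inside the data:
patched = bad.replace("'", '"')  # {"name": "O"Brien", "qty": 2}  -- no longer valid JSON

# Writing with json.dumps() keeps the read path a plain json.loads():
good = json.dumps(value)
assert json.loads(good) == value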

sebs/cloudflare/cloudflare.py (6)

73-83: Avoid mutable default & unused config parameter.
The default {} is unsafe and config is unused. Prefer None and normalize, or drop the parameter entirely.

🔧 Proposed fix
-    def initialize(self, config: Dict[str, str] = {}, resource_prefix: Optional[str] = None):
+    def initialize(self, config: Optional[Dict[str, str]] = None, resource_prefix: Optional[str] = None):
@@
-        # Verify credentials are valid
+        if config is None:
+            config = {}
+        # Verify credentials are valid
         self._verify_credentials()

148-154: Add timeout + request error handling for credential verification.
requests.get() without a timeout can hang indefinitely; handle request errors explicitly.

🔧 Proposed fix
-        response = requests.get(f"{self._api_base_url}/user/tokens/verify", headers=headers)
+        try:
+            response = requests.get(
+                f"{self._api_base_url}/user/tokens/verify",
+                headers=headers,
+                timeout=30,
+            )
+        except requests.exceptions.RequestException as e:
+            raise RuntimeError("Failed to verify Cloudflare credentials") from e

841-849: Bind loop variables in replacer to avoid stale closure.
The nested function captures control_flow/builtins from the loop scope; bind them explicitly.

🐛 Proposed fix
-                        def replacer(match):
+                        def replacer(match, control_flow=control_flow, builtins=builtins):

1088-1094: Don’t create LibraryTriggers when the platform doesn’t support them.
Docs state Library triggers aren’t supported, but they’re still created and returned. This is likely to surface as unsupported trigger usage at runtime.

🔧 Proposed fix
-        from sebs.cloudflare.triggers import LibraryTrigger, HTTPTrigger
+        from sebs.cloudflare.triggers import HTTPTrigger
@@
-        library_trigger = LibraryTrigger(func_name, self)
-        library_trigger.logging_handlers = self.logging_handlers
-        worker.add_trigger(library_trigger)
+        # Library triggers are not supported for Cloudflare Workers
@@
-        from sebs.cloudflare.triggers import LibraryTrigger, HTTPTrigger
+        from sebs.cloudflare.triggers import HTTPTrigger
@@
-        if trigger_type == Trigger.TriggerType.LIBRARY:
-            trigger = LibraryTrigger(worker.name, self)
-            trigger.logging_handlers = self.logging_handlers
-            return trigger
-        elif trigger_type == Trigger.TriggerType.HTTP:
+        if trigger_type == Trigger.TriggerType.HTTP:
             account_id = worker.account_id or self.config.credentials.account_id
             worker_url = self._build_workers_dev_url(worker.name, account_id)
             trigger = HTTPTrigger(worker.name, worker_url)
             trigger.logging_handlers = self.logging_handlers
             return trigger
+        elif trigger_type == Trigger.TriggerType.LIBRARY:
+            raise NotImplementedError("Library triggers are not supported for Cloudflare Workers")
Cloudflare Workers trigger types support library trigger

Also applies to: 1622-1635


1109-1115: Add timeout + specific JSON error handling in _get_worker.

🔧 Proposed fix
-        response = requests.get(url, headers=headers)
+        try:
+            response = requests.get(url, headers=headers, timeout=30)
+        except requests.exceptions.RequestException as e:
+            self.logging.warning(f"Failed to query worker {worker_name}: {e}")
+            return None
@@
-            try:
-                return response.json().get("result")
-            except:
-                return None
+            try:
+                return response.json().get("result")
+            except (json.JSONDecodeError, ValueError, KeyError):
+                return None

1296-1322: Add timeout + narrow exception handling in _get_workers_dev_subdomain.

🔧 Proposed fix
-            resp = requests.get(url, headers=headers)
+            resp = requests.get(url, headers=headers, timeout=30)
@@
-        except Exception as e:
+        except requests.exceptions.RequestException as e:
             self.logging.warning(f"Error fetching workers.dev subdomain: {e}")
             return None
🧹 Nitpick comments (2)
benchmarks/wrappers/cloudflare/python/nosql.py (1)

28-40: Consider security implications of pickle serialization.

Using pickle for serialization carries inherent risks (arbitrary code execution during deserialization). While this appears to be internal Worker-to-Durable-Object communication where data is trusted, consider using JSON serialization for defense-in-depth, similar to how nosql_kv handles data.

♻️ Suggested alternative using JSON
-## these data conversion funcs should not be necessary. i couldn't get pyodide to clone the data otherwise
-    def data_pre(self, data):
-        return pickle.dumps(data, 0).decode("ascii")
-
-    def data_post(self, data):
-        # Handle None (key not found in storage)
-        if data is None:
-            return None
-        # Handle both string and bytes data from Durable Object storage
-        if isinstance(data, str):
-            return pickle.loads(bytes(data, "ascii"))
-        else:
-            return pickle.loads(data)
+    def data_pre(self, data):
+        return json.dumps(data)
+
+    def data_post(self, data):
+        if data is None:
+            return None
+        return json.loads(data)

If pickle is required due to Pyodide cloning limitations as the comment suggests, document this constraint more explicitly and ensure data sources are always trusted.

sebs/cloudflare/cloudflare.py (1)

1189-1204: Avoid double-waiting after container deployment.
_wait_for_durable_object_ready() already waits; the extra 60s sleep is redundant unless readiness failed.

♻️ Proposed refactor
-            if container_deployment:
-                self.logging.info("Waiting for container Durable Object to initialize...")
-                self._wait_for_durable_object_ready(worker_name, package_dir, env)
+            if container_deployment:
+                self.logging.info("Waiting for container Durable Object to initialize...")
+                ready = self._wait_for_durable_object_ready(worker_name, package_dir, env)
@@
-            if container_deployment:
-                self.logging.info("Waiting 60 seconds for container Durable Object to initialize...")
-                time.sleep(60)
+            if container_deployment and not ready:
+                self.logging.info("Waiting 60 seconds for container Durable Object to initialize...")
+                time.sleep(60)

Comment on lines +107 to +111
@staticmethod
def get_instance():
if nosql_do.instance is None:
nosql_do.instance = nosql_do()
return nosql_do.instance

⚠️ Potential issue | 🟠 Major

get_instance creates instance without required binding attribute.

If get_instance() is called before init_instance(), the created instance will lack the binding attribute, causing AttributeError on subsequent operations. Consider raising an explicit error or initializing defensively.

🔧 Proposed fix
     `@staticmethod`
     def get_instance():
         if nosql_do.instance is None:
-            nosql_do.instance = nosql_do()
+            raise RuntimeError("nosql_do not initialized. Call init_instance() first.")
         return nosql_do.instance
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
@staticmethod
def get_instance():
if nosql_do.instance is None:
nosql_do.instance = nosql_do()
return nosql_do.instance
`@staticmethod`
def get_instance():
if nosql_do.instance is None:
raise RuntimeError("nosql_do not initialized. Call init_instance() first.")
return nosql_do.instance
🤖 Prompt for AI Agents
In `@benchmarks/wrappers/cloudflare/python/nosql.py` around lines 107 - 111, The
get_instance method on nosql_do currently constructs a nosql_do() without
ensuring the required binding attribute, which leads to AttributeError if called
before init_instance; update nosql_do.get_instance to validate that an instance
exists and that instance.binding is set (or otherwise initialized) and if not,
raise a clear RuntimeError instructing callers to call
nosql_do.init_instance(...) first (or provide the needed binding), or
alternatively call the existing init_instance flow if safe; reference the
nosql_do.get_instance and nosql_do.init_instance symbols and the
instance.binding attribute in your change.

Comment on lines +203 to +207
@staticmethod
def get_instance():
if nosql_kv.instance is None:
nosql_kv.instance = nosql_kv()
return nosql_kv.instance

⚠️ Potential issue | 🟠 Major

get_instance creates instance without required env attribute.

Same issue as nosql_do.get_instance(): if called before init_instance(), the instance will lack the env attribute, causing AttributeError on subsequent operations.

🔧 Proposed fix
     `@staticmethod`
     def get_instance():
         if nosql_kv.instance is None:
-            nosql_kv.instance = nosql_kv()
+            raise RuntimeError("nosql_kv not initialized. Call init_instance() first.")
         return nosql_kv.instance
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
@staticmethod
def get_instance():
if nosql_kv.instance is None:
nosql_kv.instance = nosql_kv()
return nosql_kv.instance
`@staticmethod`
def get_instance():
if nosql_kv.instance is None:
raise RuntimeError("nosql_kv not initialized. Call init_instance() first.")
return nosql_kv.instance
🤖 Prompt for AI Agents
In `@benchmarks/wrappers/cloudflare/python/nosql.py` around lines 203 - 207,
get_instance currently constructs a nosql_kv() without ensuring the required env
attribute is set; change get_instance (and mirror the change made for
nosql_do.get_instance) so it either calls init_instance to create the singleton
with env or validates and raises a clear error if env hasn't been initialized;
specifically update nosql_kv.get_instance to return the existing instance if
set, otherwise call nosql_kv.init_instance(...) (or require an env parameter) to
create the instance with env populated, or throw an explicit exception telling
callers to call init_instance first, so subsequent methods on the singleton
won't hit AttributeError due to a missing env.

Comment on lines +227 to +228
- **HTTP Trigger**: ✅ Fully supported - Workers are automatically accessible at `https://{name}.{account}.workers.dev`
- **Library Trigger**: ❌ Not currently supported

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🌐 Web query:

Cloudflare Workers workers.dev URL format documentation

💡 Result:

Format: <YOUR_WORKER_NAME>.<YOUR_ACCOUNT_SUBDOMAIN>.workers.dev. Workers.dev account subdomain itself is <YOUR_ACCOUNT_SUBDOMAIN>.workers.dev. Worker-name restrictions when using workers.dev: ≤63 characters, only letters/numbers/dashes, and cannot start or end with a dash. [1][2]

Preview/versioned preview URL pattern: <VERSION_PREFIX>-<WORKER_NAME>.<YOUR_ACCOUNT_SUBDOMAIN>.workers.dev. [3]

Sources: Cloudflare Workers docs. [1] [2] [3]


Replace {account} placeholder with {subdomain} in workers.dev URL.
The workers.dev URL format uses an account subdomain, not an account ID. The placeholder should reflect this to avoid confusion when users configure Workers.

✏️ Documentation fix
-- **HTTP Trigger**: ✅ Fully supported - Workers are automatically accessible at `https://{name}.{account}.workers.dev`
+- **HTTP Trigger**: ✅ Fully supported - Workers are automatically accessible at `https://{name}.{subdomain}.workers.dev`
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
- **HTTP Trigger**: ✅ Fully supported - Workers are automatically accessible at `https://{name}.{account}.workers.dev`
- **Library Trigger**: ❌ Not currently supported
- **HTTP Trigger**: ✅ Fully supported - Workers are automatically accessible at `https://{name}.{subdomain}.workers.dev`
- **Library Trigger**: ❌ Not currently supported
🤖 Prompt for AI Agents
In `@docs/platforms.md` around lines 227 - 228, Update the workers.dev example URL
in the HTTP Trigger bullet: replace the placeholder
https://{name}.{account}.workers.dev with https://{name}.{subdomain}.workers.dev
in the "- **HTTP Trigger**" line so the docs use the correct {subdomain} token
instead of {account}.

Comment on lines +236 to +244
| Instance Type | vCPU | Memory | Disk |
|---------------|------|--------|------|
| lite | 1/16 | 256 MiB | 2 GB |
| basic | 1/4 | 1 GiB | 4 GB |
| standard-1 | 1/2 | 4 GiB | 8 GB |
| standard-2 | 1 | 6 GiB | 12 GB |
| standard-3 | 2 | 8 GiB | 16 GB |
| standard-4 | 4 | 12 GiB | 20 GB |
- **Metrics Collection**: Uses response-based per-invocation metrics. During each function invocation, the worker handler measures performance metrics (CPU time, wall time, memory usage) and embeds them directly in the JSON response. SeBS extracts these metrics immediately from each response. When `download_metrics()` is called for postprocessing, it only aggregates the metrics that were already collected during invocations—no additional data is fetched from external services. This approach provides immediate per-invocation granularity without delays. Note that while Cloudflare does expose an Analytics Engine, it only provides aggregated metrics without individual request-level data, making it unsuitable for detailed benchmarking purposes.

⚠️ Potential issue | 🟡 Minor

Add blank lines around the table to satisfy markdownlint (MD058).

✏️ Proposed formatting fix
 - **Memory/Timeout Configuration (Containers)**: Managed by Cloudflare, available in different tiers:
 
+ 
   | Instance Type | vCPU | Memory | Disk |
   |---------------|------|--------|------|
   | lite | 1/16 | 256 MiB | 2 GB |
   | basic | 1/4 | 1 GiB | 4 GB |
   | standard-1 | 1/2 | 4 GiB | 8 GB |
   | standard-2 | 1 | 6 GiB | 12 GB |
   | standard-3 | 2 | 8 GiB | 16 GB |
  | standard-4 | 4 | 12 GiB | 20 GB |
+ 
 - **Metrics Collection**: Uses response-based per-invocation metrics. During each function invocation, the worker handler measures performance metrics (CPU time, wall time, memory usage) and embeds them directly in the JSON response. SeBS extracts these metrics immediately from each response. When `download_metrics()` is called for postprocessing, it only aggregates the metrics that were already collected during invocations—no additional data is fetched from external services. This approach provides immediate per-invocation granularity without delays. Note that while Cloudflare does expose an Analytics Engine, it only provides aggregated metrics without individual request-level data, making it unsuitable for detailed benchmarking purposes.
🧰 Tools
🪛 markdownlint-cli2 (0.18.1)

243-243: Tables should be surrounded by blank lines

(MD058, blanks-around-tables)

🤖 Prompt for AI Agents
In `@docs/platforms.md` around lines 236 - 244, Add blank lines before and after
the markdown table that begins with "| Instance Type | vCPU | Memory | Disk |"
so the table is separated from surrounding paragraphs (specifically the "Metrics
Collection" paragraph that follows); this satisfies markdownlint rule
MD058—insert one empty line above the table and one empty line below it in
docs/platforms.md.

Comment on lines +1451 to +1479
@staticmethod
def format_function_name(name: str, container_deployment: bool = False) -> str:
"""
Format a function name to comply with Cloudflare Worker naming rules.
Worker names must:
- Be lowercase
- Contain only alphanumeric characters and hyphens
- Not start or end with a hyphen
- Not start with a digit
Args:
name: The original name
container_deployment: Whether this is a container worker (adds 'w-' prefix if name starts with digit)
Returns:
Formatted name
"""
# Convert to lowercase and replace invalid characters
formatted = name.lower().replace('_', '-').replace('.', '-')
# Remove any characters that aren't alphanumeric or hyphen
formatted = ''.join(c for c in formatted if c.isalnum() or c == '-')
# Remove leading/trailing hyphens
formatted = formatted.strip('-')
# Ensure container worker names don't start with a digit (Cloudflare requirement)
# Only add prefix for container workers to differentiate from native workers
if container_deployment and formatted and formatted[0].isdigit():
formatted = 'container-' + formatted
return formatted

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

cd /tmp && git ls-files | head -20

Repository: spcl/serverless-benchmarks

Length of output: 138


🏁 Script executed:

cat sebs/cloudflare/cloudflare.py | wc -l

Repository: spcl/serverless-benchmarks

Length of output: 74


🏁 Script executed:

sed -n '1451,1479p' sebs/cloudflare/cloudflare.py

Repository: spcl/serverless-benchmarks

Length of output: 1348


🏁 Script executed:

rg "format_function_name" --type py -B 2 -A 3

Repository: spcl/serverless-benchmarks

Length of output: 2585


🌐 Web query:

Cloudflare Workers script naming rules requirements

💡 Result:

Summary — Worker script name rules

  • Allowed characters: letters (A–Z, a–z), digits (0–9) and dashes (-). Do not use underscores. [1][2]
  • Maximum length: up to 255 characters for a Worker name. [1]
  • If you use a workers.dev subdomain (workers.dev URL / DNS label): name must be ≤ 63 characters and cannot start or end with a dash. [2]
  • The name is required in wrangler config (name property / script_name used in API/URLs). [1][3]

Sources: Cloudflare Workers / Wrangler documentation. [1] Wrangler configuration (name rules). [2] workers.dev routing limits. [3] Cloudflare Workers Scripts API reference.


Apply the digit-prefix rule to all workers, not just container deployments.
The function's docstring documents that worker names must not start with a digit, but the current implementation only enforces this when container_deployment=True. Benchmark names like 311.compression will still produce invalid worker names for native deployments.

🔧 Suggested fix
-        if container_deployment and formatted and formatted[0].isdigit():
-            formatted = 'container-' + formatted
+        if formatted and formatted[0].isdigit():
+            prefix = "container-" if container_deployment else "w-"
+            formatted = prefix + formatted
🤖 Prompt for AI Agents
In `@sebs/cloudflare/cloudflare.py` around lines 1451 - 1479, The digit-prefix
rule must apply to all workers: in format_function_name, after computing
formatted (and stripping hyphens), check if formatted is non-empty and starts
with a digit; if so, prefix it appropriately — use 'container-' when
container_deployment is True and 'w-' (or another agreed native-worker prefix)
when container_deployment is False — instead of only handling the
container_deployment branch; update the conditional that currently checks "if
container_deployment and formatted and formatted[0].isdigit()" to always enforce
the digit-prefix rule based on formatted and container_deployment.
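For illustration, the mapping under the suggested fix would look roughly as follows (hedged example: it assumes the static method lives on the Cloudflare system class, and the "w-" prefix is only the reviewer's proposal, not an agreed convention):

from sebs.cloudflare.cloudflare import Cloudflare

# Native worker: digit-leading benchmark names receive a prefix so the name stays valid.
assert Cloudflare.format_function_name("311.compression") == "w-311-compression"
# Container worker: keeps the existing "container-" prefix.
assert Cloudflare.format_function_name("311.compression", container_deployment=True) == "container-311-compression"
# Names that already start with a letter are only lowercased and sanitized.
assert Cloudflare.format_function_name("My_Benchmark.Test") == "my-benchmark-test"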

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

🤖 Fix all issues with AI agents
In `@sebs/cloudflare/cli.py`:
- Around line 103-110: The readiness loop in cli.py (where
docker_instance.logs(stream=True, follow=True) is iterated via dkg and
next(dkg).decode(...)) can block forever; add a timeout (e.g., max_wait_seconds)
and track elapsed time (using time.time()) inside the while True to break and
raise/return after the timeout, and also sleep briefly on StopIteration to avoid
busy looping; ensure you update the block that assigns dkg and calls next(dkg)
so it stops with a clear error/exception when the timeout elapses instead of
hanging.

In `@sebs/cloudflare/cloudflare.py`:
- Around line 354-365: The requests.get call should include a timeout and the
bare except should be replaced with specific exception handling: call
requests.get(url, headers=headers, timeout=<reasonable_seconds>) and catch
JSONDecodeError (or ValueError) when calling response.json() instead of using a
bare except; also consider catching requests.exceptions.RequestException around
the requests.get call and log errors via self.logging.warning including
response.status_code or exception details to aid debugging (refer to
requests.get, response.json(), and self.logging.warning in the surrounding
method).

In `@sebs/cloudflare/containers.py`:
- Around line 356-362: Update the exception handling and tar extraction call in
the tarfile.open block: preserve the original exception by re-raising
RuntimeError using "from e" (i.e., raise RuntimeError(...) from e) so the
exception chain is kept, and when calling tar.extractall(directory) pass the
recommended filter parameter (e.g., filter='data' or an equivalent safe filter)
to harden extraction; locate the tarfile.open(...) block and the except
Exception as e handling that uses self.logging.error/f"npm install failed: {e}"
and change the extractall call and the raise accordingly.
♻️ Duplicate comments (6)
sebs/cloudflare/cloudflare.py (4)

80-90: Fix mutable default argument.

The config parameter uses a mutable default {} which can cause unexpected behavior if the same dict instance is reused across calls. Additionally, the parameter is unused in the method body.

♻️ Proposed fix
-    def initialize(self, config: Dict[str, str] = {}, resource_prefix: Optional[str] = None):
+    def initialize(self, config: Optional[Dict[str, str]] = None, resource_prefix: Optional[str] = None):
         """
         Initialize the Cloudflare Workers platform.
 
         Args:
             config: Additional configuration parameters
             resource_prefix: Prefix for resource naming
         """
+        if config is None:
+            config = {}
         # Verify credentials are valid
         self._verify_credentials()

155-161: Add timeout to requests call.

The requests.get() call lacks a timeout parameter, which can cause the application to hang indefinitely if the Cloudflare API is unresponsive.

🔒 Proposed fix
-        response = requests.get(f"{self._api_base_url}/user/tokens/verify", headers=headers)
+        response = requests.get(
+            f"{self._api_base_url}/user/tokens/verify",
+            headers=headers,
+            timeout=30
+        )

455-482: Add timeout to requests call.

The requests.get() call on line 458 lacks a timeout parameter, which can cause the application to hang indefinitely when fetching the workers.dev subdomain.

🔒 Proposed fix
-            resp = requests.get(url, headers=headers)
+            resp = requests.get(url, headers=headers, timeout=30)

610-638: Apply digit-prefix rule to all workers, not just containers.

The docstring correctly states worker names must not start with a digit, but the implementation only enforces this for container_deployment=True. Benchmark names like 311.compression will produce invalid worker names for native deployments.

🐛 Proposed fix
-        if container_deployment and formatted and formatted[0].isdigit():
-            formatted = 'container-' + formatted
+        if formatted and formatted[0].isdigit():
+            prefix = "container-" if container_deployment else "w-"
+            formatted = prefix + formatted
docs/platforms.md (2)

229-230: Replace {account} with {subdomain} in workers.dev URL.

The workers.dev URL format uses an account subdomain, not an account ID. The placeholder should be {subdomain} to match Cloudflare's actual URL structure.

✏️ Proposed fix
-- **HTTP Trigger**: ✅ Fully supported - Workers are automatically accessible at `https://{name}.{account}.workers.dev`
+- **HTTP Trigger**: ✅ Fully supported - Workers are automatically accessible at `https://{name}.{subdomain}.workers.dev`

236-246: Add blank lines around the table for markdown compliance.

The table should be surrounded by blank lines to satisfy markdownlint rule MD058.

✏️ Proposed fix
 - **Memory/Timeout Configuration (Containers)**: Managed by Cloudflare, available in different tiers:
 
+
   | Instance Type | vCPU | Memory | Disk |
   |---------------|------|--------|------|
   | lite | 1/16 | 256 MiB | 2 GB |
   | basic | 1/4 | 1 GiB | 4 GB |
   | standard-1 | 1/2 | 4 GiB | 8 GB |
   | standard-2 | 1 | 6 GiB | 12 GB |
   | standard-3 | 2 | 8 GiB | 16 GB |
   | standard-4 | 4 | 12 GiB | 20 GB |
+
 - **Metrics Collection**: ...
🧹 Nitpick comments (20)
sebs/cloudflare/durable_objects.py (2)

143-172: Consider logging a warning when data seeding is skipped.

The write_to_table method silently skips data seeding due to platform limitations. Per a past reviewer comment, this silent skip could create a false impression that data was written. Consider adding a warning log to inform users that the operation was not performed.

♻️ Proposed fix
         table_name = self._get_table_name(benchmark, table)
         
         if not table_name:
             raise ValueError(f"Table {table} not found for benchmark {benchmark}")
         
-        # Silently skip data seeding for Cloudflare Durable Objects
-        # This is a platform limitation
-        pass
+        # Data seeding is not supported for Cloudflare Durable Objects
+        # This is a platform limitation - Durable Objects can only be written to from within Workers
+        self.logging.warning(
+            f"Skipping data seeding for table '{table}' - Cloudflare Durable Objects "
+            f"cannot be pre-populated via external API. Use 'large' benchmark size "
+            f"which creates data at runtime."
+        )

183-183: Remove extraneous f prefix from static string.

This f-string has no placeholders, making the f prefix unnecessary.

♻️ Proposed fix
-        self.logging.warning(f"Durable Objects data is managed within the Worker")
+        self.logging.warning("Durable Objects data is managed within the Worker")
sebs/cloudflare/cloudflare.py (2)

438-441: Preserve exception chain with raise ... from e.

When re-raising exceptions, use raise ... from e to preserve the exception chain for better debugging.

♻️ Proposed fix
         except RuntimeError as e:
             error_msg = f"Wrangler deployment failed for worker {worker_name}: {str(e)}"
-            self.logging.error(error_msg)
-            raise RuntimeError(error_msg)
+            self.logging.exception(error_msg)
+            raise RuntimeError(error_msg) from e

431-434: Consider removing or making the hardcoded 60-second sleep configurable.

After waiting for the Durable Object to report ready via the health endpoint, there's an additional unconditional 60-second sleep. This significantly increases deployment time. If this is truly needed as a safety net, consider making it configurable or reducing it based on observed provisioning times.
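One possible shape for this, as a hedged sketch (the environment variable name is invented here and not part of SeBS):

import os
import time

def extra_provisioning_wait(logger, container_deployment: bool) -> None:
    """Optional safety-net delay, configurable instead of a hard-coded 60-second sleep."""
    extra_wait = int(os.environ.get("SEBS_CLOUDFLARE_EXTRA_WAIT_SECONDS", "0"))
    if container_deployment and extra_wait > 0:
        logger.info(f"Waiting {extra_wait}s as an extra provisioning safety net...")
        time.sleep(extra_wait)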

sebs/cloudflare/containers.py (4)

458-465: Preserve exception chain and use logging.exception.

When re-raising exceptions, use raise ... from e to preserve the exception chain. Also, logging.exception automatically includes traceback information.

♻️ Proposed fix
         except docker.errors.BuildError as e:
             error_msg = f"Docker build failed for {image_tag}: {e}"
-            self.logging.error(error_msg)
-            raise RuntimeError(error_msg)
+            self.logging.exception(error_msg)
+            raise RuntimeError(error_msg) from e
         except Exception as e:
             error_msg = f"Unexpected error building Docker image {image_tag}: {e}"
-            self.logging.error(error_msg)
-            raise RuntimeError(error_msg)
+            self.logging.exception(error_msg)
+            raise RuntimeError(error_msg) from e

489-490: Remove unused variables.

consecutive_failures and max_consecutive_failures are defined but never used. Remove them or implement the intended retry logic.

♻️ Proposed fix - remove unused variables
-        consecutive_failures = 0
-        max_consecutive_failures = 5
-        
         while time.time() - start_time < max_wait_seconds:

252-252: Remove extraneous f prefix from static string.

This f-string has no placeholders.

♻️ Proposed fix
-            self.logging.info(f"Copied worker.js orchestration file from nodejs/container")
+            self.logging.info("Copied worker.js orchestration file from nodejs/container")

351-355: Prefix unused variable with underscore.

The stat variable from get_archive is unused.

♻️ Proposed fix
-            bits, stat = cli.docker_instance.get_archive(f"{container_path}/node_modules")
+            bits, _stat = cli.docker_instance.get_archive(f"{container_path}/node_modules")
dockerfiles/cloudflare/Dockerfile.manage (1)

24-25: Consider pinning uv version for reproducible builds.

The uv installer script fetches the latest version by default, which could lead to build inconsistencies over time. Pinning to a specific version improves reproducibility.

♻️ Example with version pinning
-RUN curl -LsSf https://astral.sh/uv/install.sh | sh && \
+RUN curl -LsSf https://astral.sh/uv/0.9.26/install.sh | sh && \
     /root/.local/bin/uv tool install workers-py
sebs/cloudflare/cli.py (4)

51-79: Add exception chaining for better debugging.

When re-raising exceptions within except blocks, chain them to preserve the original traceback. This helps with debugging by showing the full error context.

Proposed fix
                 if not os.path.exists(dockerfile_path):
                     raise RuntimeError(
                         f"Dockerfile not found at {dockerfile_path}. "
                         "Cannot build Cloudflare CLI container."
-                    )
+                    ) from None
                 
                 # ... build logic ...
                 
                 except docker.errors.BuildError as build_error:
                     raise RuntimeError(
                         f"Failed to build Docker image {full_image_name}: {build_error}"
-                    )
+                    ) from build_error

28-40: Inconsistent logging usage.

The code mixes module-level logging (lines 30, 34, 39, 59, 75) with self.logging (line 101). Since the class inherits from LoggingBase, prefer using self.logging consistently for structured logging.

Also applies to: 101-101


116-116: Use explicit Optional type hint.

PEP 484 prohibits implicit Optional. Update the type hint to be explicit.

Proposed fix
-    def execute(self, cmd: str, env: dict = None):
+    def execute(self, cmd: str, env: dict | None = None):

Apply the same fix to wrangler_deploy (line 187) and pywrangler_deploy (line 202).


231-244: Consider caching for faster Docker builds.

The --no-cache flag ensures reproducible builds but significantly increases build time. Consider making this configurable or using --no-cache only for production deployments.
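For example, the cache behaviour could be driven by a flag (a sketch under assumptions: docker-py's images.build() with its nocache parameter; the function and variable names only loosely mirror the surrounding code):

import docker

def build_cli_image(build_dir: str, full_image_name: str, no_cache: bool = False):
    """Build the CLI image, bypassing the layer cache only when explicitly requested."""
    client = docker.from_env()
    image, _logs = client.images.build(
        path=build_dir,
        tag=full_image_name,
        nocache=no_cache,  # default False: reuse cached layers for faster local iteration
    )
    return image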

sebs/cloudflare/workers.py (7)

8-12: Unused import: json.

The json module is imported but not used in this file.

Proposed fix
 import os
 import shutil
-import json
 import io
 import tarfile

62-65: Unused parameter container_uri.

This parameter is declared but never used. If it's a placeholder for future functionality, consider documenting it or removing it until needed.

Options

Either remove the unused parameter:

     def generate_wrangler_toml(
         self,
         worker_name: str,
         package_dir: str,
         language: str,
         account_id: str,
         benchmark_name: Optional[str] = None,
         code_package: Optional[Benchmark] = None,
-        container_uri: str = "",
     ) -> str:

Or add a TODO comment explaining its intended future use.


142-149: Fragile TOML writing fallback.

Catching TypeError to distinguish between tomli_w.dump() (binary) and toml.dumps() (text) is brittle. A TypeError could occur for other reasons, masking real bugs.

Proposed fix using explicit check
         # Write wrangler.toml to package directory
         toml_path = os.path.join(package_dir, "wrangler.toml")
-        try:
-            # Try tomli_w (writes binary)
-            with open(toml_path, 'wb') as f:
-                tomli_w.dump(config, f)
-        except TypeError:
-            # Fallback to toml library (writes text)
-            with open(toml_path, 'w') as f:
-                f.write(tomli_w.dumps(config))
+        if hasattr(tomli_w, 'dump'):
+            # tomli_w writes binary
+            with open(toml_path, 'wb') as f:
+                tomli_w.dump(config, f)
+        else:
+            # toml library writes text via dumps()
+            with open(toml_path, 'w') as f:
+                f.write(tomli_w.dumps(config))

213-215: Improve exception handling with chaining and proper logging.

Use exception chaining (from e) and logging.exception() for better debugging context.

Proposed fix
                 except Exception as e:
-                    self.logging.error(f"npm install in container failed: {e}")
-                    raise RuntimeError(f"Failed to install Node.js dependencies: {e}")
+                    self.logging.exception(f"npm install in container failed: {e}")
+                    raise RuntimeError(f"Failed to install Node.js dependencies: {e}") from e

Apply the same pattern at lines 240-242:

                     except Exception as e:
-                        self.logging.error(f"Failed to install esbuild: {e}")
-                        raise RuntimeError(f"Failed to install esbuild: {e}")
+                        self.logging.exception(f"Failed to install esbuild: {e}")
+                        raise RuntimeError(f"Failed to install esbuild: {e}") from e

Also applies to: 240-242


208-209: tarfile.extractall() has potential path traversal risk.

While the tar archive comes from a controlled container, consider using the filter parameter (Python 3.12+) or manual member validation to mitigate path traversal attacks for defense in depth.

Proposed safer extraction (Python 3.12+)
                     with tarfile.open(fileobj=file_obj) as tar:
-                        tar.extractall(directory)
+                        tar.extractall(directory, filter='data')

For older Python versions, you can validate members before extraction.

Also applies to: 236-237
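A minimal member-validation sketch for pre-3.12 interpreters (an illustrative example, not code from this PR) could look like:

import os
import tarfile

def safe_extractall(tar: tarfile.TarFile, directory: str) -> None:
    """Reject absolute paths, parent-directory escapes, and link members before extracting."""
    base = os.path.realpath(directory)
    for member in tar.getmembers():
        target = os.path.realpath(os.path.join(directory, member.name))
        if target != base and not target.startswith(base + os.sep):
            raise RuntimeError(f"Blocked suspicious tar member: {member.name}")
        if member.issym() or member.islnk():
            raise RuntimeError(f"Blocked link member: {member.name}")
    tar.extractall(directory)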


154-161: Unused parameter is_cached.

The is_cached parameter is never used. It could be leveraged to skip dependency installation for cached packages, improving performance.
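A possible shape, as a hedged sketch (the helper name _install_node_dependencies is a placeholder, not an existing symbol in the PR):

import os

def install_dependencies(self, directory: str, is_cached: bool) -> None:
    """Skip the expensive install when a cached package already ships its node_modules."""
    if is_cached and os.path.isdir(os.path.join(directory, "node_modules")):
        self.logging.info("Package is cached; skipping dependency installation")
        return
    self._install_node_dependencies(directory)  # placeholder for the existing install path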


263-293: Consider externalizing the hardcoded Pyodide package list.

This list of ~200 packages will become outdated as Pyodide releases new versions. Pyodide publishes its supported packages via a versioned repodata.json at https://cdn.jsdelivr.net/pyodide/vX.Y.Z/full/repodata.json. Consider:

  1. Fetching the package list at build time from the appropriate Pyodide version's repodata.json
  2. Storing in a separate configuration file that can be updated alongside Pyodide version upgrades
  3. Adding a comment with a link to the Pyodide package index
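Regarding option 1 above, a build-time fetch could be as small as the following sketch (assumptions: the repodata.json layout exposes a top-level "packages" mapping, and the exact file name/URL should be checked against the Pyodide release actually in use, since newer releases renamed this file):

import requests

def fetch_pyodide_packages(version: str) -> set:
    """Return the set of package names bundled with a given Pyodide release."""
    url = f"https://cdn.jsdelivr.net/pyodide/v{version}/full/repodata.json"
    resp = requests.get(url, timeout=30)
    resp.raise_for_status()
    # Assumes a top-level "packages" mapping keyed by package name.
    return set(resp.json().get("packages", {}).keys())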

Comment on lines +103 to +110
# Wait for container to be ready
while True:
try:
dkg = self.docker_instance.logs(stream=True, follow=True)
next(dkg).decode("utf-8")
break
except StopIteration:
pass

⚠️ Potential issue | 🟠 Major

Readiness wait loop could block indefinitely.

The while True loop has no timeout mechanism. If the container doesn't produce logs (e.g., due to a startup failure or misconfiguration), this will hang forever.

Proposed fix with timeout
+        import time
+        
         # Wait for container to be ready
+        max_wait_seconds = 30
+        start_time = time.time()
         while True:
+            if time.time() - start_time > max_wait_seconds:
+                raise RuntimeError(
+                    f"Container {self.docker_instance.id} did not become ready within {max_wait_seconds}s"
+                )
             try:
                 dkg = self.docker_instance.logs(stream=True, follow=True)
                 next(dkg).decode("utf-8")
                 break
             except StopIteration:
-                pass
+                time.sleep(0.1)  # Brief sleep before retry
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
# Wait for container to be ready
while True:
try:
dkg = self.docker_instance.logs(stream=True, follow=True)
next(dkg).decode("utf-8")
break
except StopIteration:
pass
import time
# Wait for container to be ready
max_wait_seconds = 30
start_time = time.time()
while True:
if time.time() - start_time > max_wait_seconds:
raise RuntimeError(
f"Container {self.docker_instance.id} did not become ready within {max_wait_seconds}s"
)
try:
dkg = self.docker_instance.logs(stream=True, follow=True)
next(dkg).decode("utf-8")
break
except StopIteration:
time.sleep(0.1) # Brief sleep before retry
🤖 Prompt for AI Agents
In `@sebs/cloudflare/cli.py` around lines 103 - 110, The readiness loop in cli.py
(where docker_instance.logs(stream=True, follow=True) is iterated via dkg and
next(dkg).decode(...)) can block forever; add a timeout (e.g., max_wait_seconds)
and track elapsed time (using time.time()) inside the while True to break and
raise/return after the timeout, and also sleep briefly on StopIteration to avoid
busy looping; ensure you update the block that assigns dkg and calls next(dkg)
so it stops with a clear error/exception when the timeout elapses instead of
hanging.

Comment on lines +354 to +365
response = requests.get(url, headers=headers)

if response.status_code == 200:
try:
return response.json().get("result")
except:
return None
elif response.status_code == 404:
return None
else:
self.logging.warning(f"Unexpected response checking worker: {response.status_code}")
return None

⚠️ Potential issue | 🟠 Major

Add timeout and replace bare except with specific exception.

Two issues:

  1. The requests.get() call lacks a timeout parameter
  2. The bare except: clause catches all exceptions including system-exiting ones
🔒 Proposed fix
-        response = requests.get(url, headers=headers)
+        response = requests.get(url, headers=headers, timeout=30)
 
         if response.status_code == 200:
             try:
                 return response.json().get("result")
-            except:
+            except (json.JSONDecodeError, ValueError, KeyError):
                 return None
🧰 Tools
🪛 Ruff (0.14.13)

354-354: Probable use of requests call without timeout

(S113)


359-359: Do not use bare except

(E722)

🤖 Prompt for AI Agents
In `@sebs/cloudflare/cloudflare.py` around lines 354 - 365, The requests.get call
should include a timeout and the bare except should be replaced with specific
exception handling: call requests.get(url, headers=headers,
timeout=<reasonable_seconds>) and catch JSONDecodeError (or ValueError) when
calling response.json() instead of using a bare except; also consider catching
requests.exceptions.RequestException around the requests.get call and log errors
via self.logging.warning including response.status_code or exception details to
aid debugging (refer to requests.get, response.json(), and self.logging.warning
in the surrounding method).

Comment on lines +356 to +362
            with tarfile.open(fileobj=file_obj) as tar:
                tar.extractall(directory)

            self.logging.info(f"Downloaded node_modules to {directory} for wrangler deployment")
        except Exception as e:
            self.logging.error(f"npm install failed: {e}")
            raise RuntimeError(f"Failed to install Node.js dependencies: {e}")

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🌐 Web query:

Python tarfile extractall filter parameter security CVE-2007-4559

💡 Result:

Short answer:

  • CVE-2007-4559 is an old directory‑traversal issue in Python’s tarfile.extract()/extractall() where “..” paths in a TAR can overwrite files outside the target directory. [1]
  • Separately, recent (2024–2025) CVEs/alerts describe bypasses of the newer tarfile extraction "filter" feature (added in Python 3.12, defaulting to "data" in 3.14) that let specially crafted archives cause writes or metadata changes outside the extraction directory when using extract()/extractall(filter="data" or "tar"). See advisories for CVE-2024-12718 / CVE-2025-4517 and related GHSA entries. [2][3]

Mitigation summary:

  • Do not extract untrusted tarballs with tarfile.extract()/extractall() without explicit, safe checks.
  • Do not rely on filter="data" or filter="tar" as a security boundary for untrusted input until you have updated to a patched Python; the filter-bypass CVEs above defeat them on unpatched releases.
  • Apply vendor/CPython security patches or upgrade to patched Python releases when available.
  • As an extra defense, validate each TarInfo (sanitize path, forbid absolute paths and any .. components, reject symlinks/device nodes) and extract into a safe temporary directory, then move validated files into place.

Sources:
[1] NVD entry for CVE-2007-4559 (tarfile directory traversal).
[2] OSS‑security / Python security announcement summarizing multiple tarfile CVEs (June 2025).
[3] CVE writeups for the extraction‑filter bypass (CVE-2025-4517 / CVE-2024-12718) and related advisories.


🏁 Script executed:

# Locate and view the actual code in the repository
find . -name "containers.py" -path "*/cloudflare/*" | head -5

Repository: spcl/serverless-benchmarks

Length of output: 101


🏁 Script executed:

# View the actual code at lines 356-362
sed -n '350,365p' ./sebs/cloudflare/containers.py

Repository: spcl/serverless-benchmarks

Length of output: 845


🏁 Script executed:

# Check Python version requirements and context around the code
# Look for Python version specifications in the repo
find . -name "setup.py" -o -name "pyproject.toml" -o -name ".python-version" | head -10

Repository: spcl/serverless-benchmarks

Length of output: 159


🏁 Script executed:

# Check how the tar data is sourced and if it's trusted
sed -n '340,370p' ./sebs/cloudflare/containers.py | cat -n

Repository: spcl/serverless-benchmarks

Length of output: 1830


🏁 Script executed:

# Check Python version requirements in the project
cat pyproject.toml 2>/dev/null || cat setup.py 2>/dev/null | grep -A 5 "python" | head -15

Repository: spcl/serverless-benchmarks

Length of output: 156


🏁 Script executed:

# Look for other tarfile.extractall usages to see if there's a pattern
rg "tarfile.extractall|tar\.extractall" -B 2 -A 2

Repository: spcl/serverless-benchmarks

Length of output: 1686


Add exception chaining and consider filter parameter for tarfile.extractall().

Two improvements:

  1. Exception chaining: Add from e to preserve the exception chain for better debugging (line 362).
  2. Tarfile filter: The tar data comes from cli.docker_instance.get_archive() (internal Docker container), which is a trusted source; however, adding an explicit filter parameter like filter='data' follows defense-in-depth principles and is recommended for Python 3.12+.
Suggested changes
             with tarfile.open(fileobj=file_obj) as tar:
-                tar.extractall(directory)
+                tar.extractall(directory, filter='data')
             
             self.logging.info(f"Downloaded node_modules to {directory} for wrangler deployment")
         except Exception as e:
             self.logging.error(f"npm install failed: {e}")
-            raise RuntimeError(f"Failed to install Node.js dependencies: {e}")
+            raise RuntimeError(f"Failed to install Node.js dependencies: {e}") from e
🧰 Tools
🪛 Ruff (0.14.13)

357-357: Uses of tarfile.extractall()

(S202)


360-360: Do not catch blind exception: Exception

(BLE001)


361-361: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


362-362: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


362-362: Avoid specifying long messages outside the exception class

(TRY003)

🤖 Prompt for AI Agents
In `@sebs/cloudflare/containers.py` around lines 356 - 362, Update the exception
handling and tar extraction call in the tarfile.open block: preserve the
original exception by re-raising RuntimeError using "from e" (i.e., raise
RuntimeError(...) from e) so the exception chain is kept, and when calling
tar.extractall(directory) pass the recommended filter parameter (e.g.,
filter='data' or an equivalent safe filter) to harden extraction; locate the
tarfile.open(...) block and the except Exception as e handling that uses
self.logging.error/f"npm install failed: {e}" and change the extractall call and
the raise accordingly.
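
To illustrate the per-member validation suggested in the mitigation summary above (sanitize paths, reject symlinks and device nodes before extracting), a rough sketch; it assumes members are regular files or directories and simply rejects anything else rather than handling it specially.

import os
import tarfile

def safe_extractall(tar: tarfile.TarFile, directory: str) -> None:
    base = os.path.realpath(directory)
    for member in tar.getmembers():
        target = os.path.realpath(os.path.join(base, member.name))
        # Reject any member whose resolved path escapes the extraction directory
        if os.path.commonpath([base, target]) != base:
            raise RuntimeError(f"Blocked path traversal in tar member: {member.name}")
        # Reject symlinks, hard links, and device nodes outright
        if member.issym() or member.islnk() or member.isdev():
            raise RuntimeError(f"Blocked unsafe tar member type: {member.name}")
    tar.extractall(directory)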

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@sebs/cloudflare/containers.py`:
- Around line 283-294: The regex replacement for "from . import " can rewrite
occurrences inside strings/comments and corrupt code; instead parse each Python
file (use the ast module) and transform ast.ImportFrom nodes where node.level ==
1 and node.module is None or '' into equivalent ast.Import nodes (preserve alias
names), then unparse the AST back to source before writing to file_path; operate
only when language_name == "python" and skip files in the existing exclusion
list (handler.py, storage.py, nosql.py, worker.py) to replicate the current
behavior.
♻️ Duplicate comments (1)
sebs/cloudflare/containers.py (1)

356-362: Add tarfile filter and exception chaining (previously flagged).

This addresses security (CVE-2007-4559 and related) and improves debugging with proper exception chaining.

♻️ Suggested fix
             with tarfile.open(fileobj=file_obj) as tar:
-                tar.extractall(directory)
+                tar.extractall(directory, filter='data')
             
             self.logging.info(f"Downloaded node_modules to {directory} for wrangler deployment")
         except Exception as e:
-            self.logging.error(f"npm install failed: {e}")
-            raise RuntimeError(f"Failed to install Node.js dependencies: {e}")
+            self.logging.exception("npm install failed")
+            raise RuntimeError(f"Failed to install Node.js dependencies: {e}") from e
🧹 Nitpick comments (10)
sebs/cloudflare/containers.py (10)

61-70: Remove or document unused parameters.

The language (line 65) and container_uri (line 69) parameters are declared but never used in the method body. Either remove them or add a comment explaining they're reserved for future use.

♻️ Suggested fix
     def generate_wrangler_toml(
         self,
         worker_name: str,
         package_dir: str,
-        language: str,
         account_id: str,
         benchmark_name: Optional[str] = None,
         code_package: Optional[Benchmark] = None,
-        container_uri: str = "",
     ) -> str:

110-122: Accessing private _tables attribute and missing key safety checks.

  1. nosql_storage._tables accesses a private attribute that may change without notice.
  2. Lines 116 and 122 assume config['durable_objects']['bindings'] and config['migrations'][0]['new_sqlite_classes'] exist in the template. If the template structure changes, this will raise KeyError/IndexError.
♻️ Suggested improvements
-                nosql_tables = nosql_storage._tables.get(benchmark_name, {})
+                # Consider exposing a public method like get_tables(benchmark_name)
+                nosql_tables = nosql_storage._tables.get(benchmark_name, {})
                 
                 # Add durable object bindings for each nosql table
                 for table_name in nosql_tables.keys():
+                    if 'durable_objects' not in config:
+                        config['durable_objects'] = {'bindings': []}
+                    if 'bindings' not in config['durable_objects']:
+                        config['durable_objects']['bindings'] = []
                     config['durable_objects']['bindings'].append({
                         'name': table_name,
                         'class_name': 'KVApiObject'
                     })
                 
                 # Update migrations to include KVApiObject
+                if 'migrations' in config and config['migrations']:
+                    if 'new_sqlite_classes' not in config['migrations'][0]:
+                        config['migrations'][0]['new_sqlite_classes'] = []
                 config['migrations'][0]['new_sqlite_classes'].append('KVApiObject')

143-147: Broad exception handling obscures specific failures.

Catching bare Exception here makes debugging difficult. Consider catching more specific exceptions (e.g., KeyError, AttributeError) or at minimum, log the exception type.

♻️ Suggested fix
-        except Exception as e:
+        except (KeyError, AttributeError) as e:
             self.logging.warning(
                 f"R2 bucket binding not configured: {e}. "
                 f"Benchmarks requiring file access will not work properly."
             )

250-252: Remove extraneous f-string prefix.

Line 252 uses an f-string without any placeholders.

♻️ Suggested fix
-            self.logging.info(f"Copied worker.js orchestration file from nodejs/container")
+            self.logging.info("Copied worker.js orchestration file from nodejs/container")

296-303: Remove dead code block.

The Node.js transformation block contains only pass with a comment suggesting future work. This is dead code that should be removed or implemented.

♻️ Suggested fix
-        # For Node.js containers, transform benchmark code to be async-compatible
-        # The container wrapper uses async HTTP calls, but benchmarks expect sync
-        elif language_name == "nodejs":
-            for item in os.listdir(directory):
-                if item.endswith('.js') and item not in ['handler.js', 'storage.js', 'nosql.js', 'worker.js', 'build.js', 'request-polyfill.js']:
-                    file_path = os.path.join(directory, item)
-                    # Could add transformations here if needed
-                    pass

351-351: Rename unused variable to indicate intent.

The stat variable from get_archive() is never used.

♻️ Suggested fix
-            bits, stat = cli.docker_instance.get_archive(f"{container_path}/node_modules")
+            bits, _stat = cli.docker_instance.get_archive(f"{container_path}/node_modules")

406-411: Rename unused loop variable.

The dirnames variable is not used within the loop body.

♻️ Suggested fix
-        for dirpath, dirnames, filenames in os.walk(directory):
+        for dirpath, _dirnames, filenames in os.walk(directory):

458-465: Improve exception handling with proper chaining and logging.

Use raise ... from e to preserve exception chains and logging.exception to automatically include stack traces.

♻️ Suggested fix
         except docker.errors.BuildError as e:
             error_msg = f"Docker build failed for {image_tag}: {e}"
-            self.logging.error(error_msg)
-            raise RuntimeError(error_msg)
+            self.logging.exception(error_msg)
+            raise RuntimeError(error_msg) from e
         except Exception as e:
             error_msg = f"Unexpected error building Docker image {image_tag}: {e}"
-            self.logging.error(error_msg)
-            raise RuntimeError(error_msg)
+            self.logging.exception(error_msg)
+            raise RuntimeError(error_msg) from e

467-471: Unused parameter worker_name.

The worker_name parameter is declared but never used. Consider using it in log messages for better observability or remove it.

♻️ Suggested fix - use in logging
-        self.logging.info("Checking container Durable Object readiness via health endpoint...")
+        self.logging.info(f"Checking container Durable Object readiness for {worker_name} via health endpoint...")

Or remove the parameter if not needed.


489-490: Remove unused circuit breaker variables.

consecutive_failures and max_consecutive_failures are assigned but never used. Either implement the intended circuit breaker logic or remove these variables.

♻️ Option 1: Remove unused variables
-        consecutive_failures = 0
-        max_consecutive_failures = 5
♻️ Option 2: Implement circuit breaker pattern
         consecutive_failures = 0
         max_consecutive_failures = 5
         
         while time.time() - start_time < max_wait_seconds:
             try:
                 response = requests.get(
                     f"{worker_url}/health",
                     timeout=60
                 )
                 
                 if response.status_code == 200:
                     self.logging.info("Container Durable Object is ready!")
                     return True
                 elif response.status_code == 503:
+                    consecutive_failures = 0  # Reset on expected "not ready" response
                     # ... existing logging ...
                 else:
+                    consecutive_failures += 1
                     self.logging.warning(...)
+                    if consecutive_failures >= max_consecutive_failures:
+                        self.logging.error("Too many consecutive failures, aborting wait")
+                        return False
                     
             except requests.exceptions.Timeout:
+                consecutive_failures += 1
                 # ... existing logging ...
             except requests.exceptions.RequestException as e:
+                consecutive_failures += 1
                 # ... existing logging ...
+            
+            if consecutive_failures >= max_consecutive_failures:
+                self.logging.error("Too many consecutive failures, aborting wait")
+                return False

Comment on lines +283 to +294
        # For Python containers, fix relative imports in benchmark code
        # Containers use flat structure, so "from . import storage" must become "import storage"
        if language_name == "python":
            for item in os.listdir(directory):
                if item.endswith('.py') and item not in ['handler.py', 'storage.py', 'nosql.py', 'worker.py']:
                    file_path = os.path.join(directory, item)
                    with open(file_path, 'r') as f:
                        content = f.read()
                    # Fix relative imports
                    content = re.sub(r'from \. import ', 'import ', content)
                    with open(file_path, 'w') as f:
                        f.write(content)

⚠️ Potential issue | 🟡 Minor

Regex-based import rewriting may produce false positives.

The regex from \. import could match occurrences inside string literals or comments, potentially corrupting benchmark code. Consider using AST-based transformation or a more targeted approach.

♻️ Safer alternative using line-by-line check
-                    # Fix relative imports
-                    content = re.sub(r'from \. import ', 'import ', content)
+                    # Fix relative imports (only at line start, not in strings/comments)
+                    lines = content.split('\n')
+                    fixed_lines = []
+                    for line in lines:
+                        stripped = line.lstrip()
+                        if stripped.startswith('from . import '):
+                            line = line.replace('from . import ', 'import ', 1)
+                        fixed_lines.append(line)
+                    content = '\n'.join(fixed_lines)
🤖 Prompt for AI Agents
In `@sebs/cloudflare/containers.py` around lines 283 - 294, The regex replacement
for "from . import " can rewrite occurrences inside strings/comments and corrupt
code; instead parse each Python file (use the ast module) and transform
ast.ImportFrom nodes where node.level == 1 and node.module is None or '' into
equivalent ast.Import nodes (preserve alias names), then unparse the AST back to
source before writing to file_path; operate only when language_name == "python"
and skip files in the existing exclusion list (handler.py, storage.py, nosql.py,
worker.py) to replicate the current behavior.
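
A rough sketch of the AST-based rewrite described in the prompt above. Note that ast.unparse (Python 3.9+) drops comments and original formatting, so this is illustrative only; the line-based replacement shown earlier may be the safer choice for benchmark sources.

import ast

class RelativeImportRewriter(ast.NodeTransformer):
    """Rewrite "from . import x" into "import x"; leave all other imports untouched."""

    def visit_ImportFrom(self, node: ast.ImportFrom) -> ast.AST:
        if node.level == 1 and not node.module:
            return ast.copy_location(ast.Import(names=node.names), node)
        return node

def rewrite_relative_imports(source: str) -> str:
    tree = RelativeImportRewriter().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)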

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 7

🤖 Fix all issues with AI agents
In `@sebs/cloudflare/r2.py`:
- Around line 12-14: The typename() static method incorrectly returns
"Cloudlfare.R2"; update the return value in typename() (in sebs.cloudflare.r2:
typename) to the correct string "Cloudflare.R2" so the provider name is spelled
properly.
- Around line 347-354: The clean_bucket method currently logs a warning and
returns silently which can mislead callers (similar to download); update the
implementation to raise NotImplementedError instead of silently passing so
callers get an explicit failure. Modify the clean_bucket function (and any
similar stubs like download) to raise NotImplementedError with a clear message
such as "clean_bucket not implemented for R2 bucket {bucket_name}" and remove or
keep the warning log as supplementary; ensure the method signature and docstring
remain unchanged.
- Around line 368-380: The requests.delete call lacks a timeout; update the HTTP
call to include the same timeout used elsewhere in this module by adding the
timeout keyword argument to requests.delete(delete_bucket_uri,
headers=self._get_auth_headers(), timeout=<module timeout value>), where <module
timeout value> should be the existing timeout constant or attribute used by
other calls in this file (for example self._timeout or DEFAULT_TIMEOUT), so that
the delete request for delete_bucket_uri follows the same timeout behavior as
other requests.
- Around line 87-106: The requests.post call in the R2 bucket creation (inside
the method using create_bucket_uri and create_bucket_response) needs a timeout
to avoid hangs and the inner bare except should be replaced with specific
exception handling; add a timeout argument to requests.post (e.g. timeout=10 or
a configurable constant) and wrap the response.json() call in a try/except that
catches json.JSONDecodeError (or ValueError) only, logging
create_bucket_response.text when JSON parsing fails; also consider catching
requests.exceptions.RequestException around the POST to log and re-raise or
handle network errors instead of letting it hang.
- Around line 122-133: The download method currently only logs a warning and
returns silently, which can mislead callers; replace the silent pass with
raising a NotImplementedError (include a clear message referencing the
bucket_name and key) so callers receive an explicit failure, keep the method
signature download(self, bucket_name: str, key: str, filepath: str) unchanged
and remove the trailing pass; retain or refine the log message if desired but
ensure the NotImplementedError is raised.
- Around line 58-60: The _create_bucket function currently uses a mutable
default for buckets and includes an unused randomize_name parameter; change the
signature of _create_bucket(name: str, buckets: Optional[list[str]] = None) (or
remove the buckets default entirely) and inside the function initialize buckets
= [] if buckets is None to avoid shared mutable state, and either remove the
randomize_name parameter or keep it but mark it with a TODO and do not rely on
it; update any callers of _create_bucket to pass an explicit list if they relied
on the previous default.
- Around line 299-316: The requests.get call to list_buckets_uri needs a timeout
and the error handling must avoid a bare except and properly capture JSON parse
errors and include the exception text in the log; update the call in the method
that calls requests.get(list_buckets_uri, headers=self._get_auth_headers()) to
pass a sensible timeout (e.g. timeout=10), replace the bare except with a
specific exception handler (catch json.JSONDecodeError or ValueError as e) when
calling response.json(), and include the exception details in the
self.logging.error message (or use self.logging.exception) so the log contains
the actual parse error while still returning [] on 403.
♻️ Duplicate comments (1)
sebs/cloudflare/r2.py (1)

145-232: Significant code duplication in S3 client initialization.

The S3 client initialization logic (lines 163-170, 210-217, and 261-268) is repeated across upload, upload_bytes, and list_bucket methods. Extract this into a helper method for maintainability.

♻️ Proposed refactor

Add a helper method and refactor the upload methods:

+    def _get_s3_client(self):
+        """Get or create S3-compatible client for R2 operations."""
+        if not self._credentials.r2_access_key_id or not self._credentials.r2_secret_access_key:
+            return None
+        
+        import boto3
+        from botocore.config import Config
+        
+        account_id = self._credentials.account_id
+        return boto3.client(
+            's3',
+            endpoint_url=f'https://{account_id}.r2.cloudflarestorage.com',
+            aws_access_key_id=self._credentials.r2_access_key_id,
+            aws_secret_access_key=self._credentials.r2_secret_access_key,
+            config=Config(signature_version='s3v4'),
+            region_name='auto'
+        )
+
     def upload(self, bucket_name: str, filepath: str, key: str):
         ...
         try:
-            import boto3
-            from botocore.config import Config
-            
-            account_id = self._credentials.account_id
-            
-            # R2 uses S3-compatible API, but requires special configuration
-            # The endpoint is: https://<account_id>.r2.cloudflarestorage.com
-            # You need to create R2 API tokens in the Cloudflare dashboard
-            
-            # Check if we have S3-compatible credentials
-            if not self._credentials.r2_access_key_id or not self._credentials.r2_secret_access_key:
+            s3_client = self._get_s3_client()
+            if s3_client is None:
                 self.logging.warning(
                     "R2 upload requires S3-compatible API credentials (r2_access_key_id, r2_secret_access_key). "
                     "File upload skipped. Set CLOUDFLARE_R2_ACCESS_KEY_ID and CLOUDFLARE_R2_SECRET_ACCESS_KEY."
                 )
                 return
-            
-            s3_client = boto3.client(
-                's3',
-                endpoint_url=f'https://{account_id}.r2.cloudflarestorage.com',
-                aws_access_key_id=self._credentials.r2_access_key_id,
-                aws_secret_access_key=self._credentials.r2_secret_access_key,
-                config=Config(signature_version='s3v4'),
-                region_name='auto'
-            )

Based on past review comments, this duplication was previously flagged.

🧹 Nitpick comments (1)
sebs/cloudflare/r2.py (1)

20-26: Remove redundant property - already defined in parent class.

The replace_existing property is already defined in PersistentStorage (the parent class). This duplication is unnecessary and could lead to maintenance issues.

♻️ Proposed fix
-    `@property`
-    def replace_existing(self) -> bool:
-        return self._replace_existing
-
-    `@replace_existing.setter`
-    def replace_existing(self, val: bool):
-        self._replace_existing = val
-
     def __init__(

Comment on lines +12 to +14
    @staticmethod
    def typename() -> str:
        return "Cloudlfare.R2"

⚠️ Potential issue | 🟡 Minor

Typo in typename: "Cloudlfare" should be "Cloudflare".

🔧 Proposed fix
     `@staticmethod`
     def typename() -> str:
-        return "Cloudlfare.R2"
+        return "Cloudflare.R2"
🤖 Prompt for AI Agents
In `@sebs/cloudflare/r2.py` around lines 12 - 14, The typename() static method
incorrectly returns "Cloudlfare.R2"; update the return value in typename() (in
sebs.cloudflare.r2: typename) to the correct string "Cloudflare.R2" so the
provider name is spelled properly.

Comment on lines +58 to +60
    def _create_bucket(
        self, name: str, buckets: list[str] = [], randomize_name: bool = False
    ) -> str:

⚠️ Potential issue | 🟡 Minor

Fix mutable default argument and unused parameter.

Using a mutable default argument ([]) can cause unexpected behavior if the list is modified. The randomize_name parameter is also unused.

🔧 Proposed fix
     def _create_bucket(
-        self, name: str, buckets: list[str] = [], randomize_name: bool = False
+        self, name: str, buckets: Optional[List[str]] = None, randomize_name: bool = False
     ) -> str:
+        if buckets is None:
+            buckets = []
         for bucket_name in buckets:

If randomize_name is intended for future use, consider adding a TODO comment or removing it until needed.

🧰 Tools
🪛 Ruff (0.14.13)

59-59: Do not use mutable data structures for argument defaults

Replace with None; initialize within function

(B006)


59-59: Unused method argument: randomize_name

(ARG002)

🤖 Prompt for AI Agents
In `@sebs/cloudflare/r2.py` around lines 58 - 60, The _create_bucket function
currently uses a mutable default for buckets and includes an unused
randomize_name parameter; change the signature of _create_bucket(name: str,
buckets: Optional[list[str]] = None) (or remove the buckets default entirely)
and inside the function initialize buckets = [] if buckets is None to avoid
shared mutable state, and either remove the randomize_name parameter or keep it
but mark it with a TODO and do not rely on it; update any callers of
_create_bucket to pass an explicit list if they relied on the previous default.

Comment on lines +87 to +106
        try:
            create_bucket_response = requests.post(
                create_bucket_uri, json=params, headers=self._get_auth_headers()
            )

            # Log the response for debugging
            if create_bucket_response.status_code >= 400:
                try:
                    error_data = create_bucket_response.json()
                    self.logging.error(
                        f"R2 bucket creation failed. Status: {create_bucket_response.status_code}, "
                        f"Response: {error_data}"
                    )
                except:
                    self.logging.error(
                        f"R2 bucket creation failed. Status: {create_bucket_response.status_code}, "
                        f"Response: {create_bucket_response.text}"
                    )

            create_bucket_response.raise_for_status()

⚠️ Potential issue | 🟠 Major

Add timeout to HTTP request and avoid bare except.

The requests.post call lacks a timeout, which could cause indefinite hangs. Also, bare except: catches all exceptions including KeyboardInterrupt and SystemExit.

🔧 Proposed fix
         try:
             create_bucket_response = requests.post(
-                create_bucket_uri, json=params, headers=self._get_auth_headers()
+                create_bucket_uri, json=params, headers=self._get_auth_headers(), timeout=30
             )
             
             # Log the response for debugging
             if create_bucket_response.status_code >= 400:
                 try:
                     error_data = create_bucket_response.json()
                     self.logging.error(
                         f"R2 bucket creation failed. Status: {create_bucket_response.status_code}, "
                         f"Response: {error_data}"
                     )
-                except:
+                except json.JSONDecodeError:
                     self.logging.error(
                         f"R2 bucket creation failed. Status: {create_bucket_response.status_code}, "
                         f"Response: {create_bucket_response.text}"
                     )
🧰 Tools
🪛 Ruff (0.14.13)

88-88: Probable use of requests call without timeout

(S113)


100-100: Do not use bare except

(E722)


101-104: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

🤖 Prompt for AI Agents
In `@sebs/cloudflare/r2.py` around lines 87 - 106, The requests.post call in the
R2 bucket creation (inside the method using create_bucket_uri and
create_bucket_response) needs a timeout to avoid hangs and the inner bare except
should be replaced with specific exception handling; add a timeout argument to
requests.post (e.g. timeout=10 or a configurable constant) and wrap the
response.json() call in a try/except that catches json.JSONDecodeError (or
ValueError) only, logging create_bucket_response.text when JSON parsing fails;
also consider catching requests.exceptions.RequestException around the POST to
log and re-raise or handle network errors instead of letting it hang.

Comment on lines +122 to +133
    def download(self, bucket_name: str, key: str, filepath: str) -> None:
        """
        Download a file from a bucket.

        :param bucket_name:
        :param key: storage source filepath
        :param filepath: local destination filepath
        """
        # R2 requires S3-compatible access for object operations
        # For now, this is not fully implemented
        self.logging.warning(f"download not fully implemented for R2 bucket {bucket_name}")
        pass

⚠️ Potential issue | 🟠 Major

Incomplete implementation may silently fail.

The download method logs a warning but returns silently, which could cause callers to assume success. Consider raising NotImplementedError to make the incomplete state explicit.

🔧 Proposed fix
     def download(self, bucket_name: str, key: str, filepath: str) -> None:
         """
         Download a file from a bucket.

         :param bucket_name:
         :param key: storage source filepath
         :param filepath: local destination filepath
         """
-        # R2 requires S3-compatible access for object operations
-        # For now, this is not fully implemented
-        self.logging.warning(f"download not fully implemented for R2 bucket {bucket_name}")
-        pass
+        raise NotImplementedError(
+            "R2 download requires S3-compatible access. This is not yet implemented."
+        )
🧰 Tools
🪛 Ruff (0.14.13)

122-122: Unused method argument: key

(ARG002)


122-122: Unused method argument: filepath

(ARG002)

🤖 Prompt for AI Agents
In `@sebs/cloudflare/r2.py` around lines 122 - 133, The download method currently
only logs a warning and returns silently, which can mislead callers; replace the
silent pass with raising a NotImplementedError (include a clear message
referencing the bucket_name and key) so callers receive an explicit failure,
keep the method signature download(self, bucket_name: str, key: str, filepath:
str) unchanged and remove the trailing pass; retain or refine the log message if
desired but ensure the NotImplementedError is raised.

Comment on lines +299 to +316
        try:
            response = requests.get(list_buckets_uri, headers=self._get_auth_headers())

            # Log detailed error information
            if response.status_code == 403:
                try:
                    error_data = response.json()
                    self.logging.error(
                        f"403 Forbidden accessing R2 buckets. "
                        f"Response: {error_data}. "
                        f"Your API token may need 'R2 Read and Write' permissions."
                    )
                except:
                    self.logging.error(
                        f"403 Forbidden accessing R2 buckets. "
                        f"Your API token may need 'R2 Read and Write' permissions."
                    )
                return []

⚠️ Potential issue | 🟠 Major

Add timeout and fix error handling.

The requests.get call lacks a timeout. The bare except: and f-strings without placeholders should also be fixed.

🔧 Proposed fix
         try:
-            response = requests.get(list_buckets_uri, headers=self._get_auth_headers())
+            response = requests.get(list_buckets_uri, headers=self._get_auth_headers(), timeout=30)
             
             # Log detailed error information
             if response.status_code == 403:
                 try:
                     error_data = response.json()
                     self.logging.error(
                         f"403 Forbidden accessing R2 buckets. "
                         f"Response: {error_data}. "
-                        f"Your API token may need 'R2 Read and Write' permissions."
+                        "Your API token may need 'R2 Read and Write' permissions."
                     )
-                except:
+                except json.JSONDecodeError:
                     self.logging.error(
-                        f"403 Forbidden accessing R2 buckets. "
-                        f"Your API token may need 'R2 Read and Write' permissions."
+                        "403 Forbidden accessing R2 buckets. "
+                        "Your API token may need 'R2 Read and Write' permissions."
                     )
                 return []
🧰 Tools
🪛 Ruff (0.14.13)

300-300: Probable use of requests call without timeout

(S113)


311-311: Do not use bare except

(E722)


312-315: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


313-313: f-string without any placeholders

Remove extraneous f prefix

(F541)


314-314: f-string without any placeholders

Remove extraneous f prefix

(F541)

🤖 Prompt for AI Agents
In `@sebs/cloudflare/r2.py` around lines 299 - 316, The requests.get call to
list_buckets_uri needs a timeout and the error handling must avoid a bare except
and properly capture JSON parse errors and include the exception text in the
log; update the call in the method that calls requests.get(list_buckets_uri,
headers=self._get_auth_headers()) to pass a sensible timeout (e.g. timeout=10),
replace the bare except with a specific exception handler (catch
json.JSONDecodeError or ValueError as e) when calling response.json(), and
include the exception details in the self.logging.error message (or use
self.logging.exception) so the log contains the actual parse error while still
returning [] on 403.

Comment on lines +347 to +354
    def clean_bucket(self, bucket_name: str):
        """
        Remove all objects from a bucket.

        :param bucket_name:
        """
        self.logging.warning(f"clean_bucket not fully implemented for R2 bucket {bucket_name}")
        pass

⚠️ Potential issue | 🟠 Major

Incomplete implementation may silently fail.

Similar to download, this stub could cause callers to assume success. Consider raising NotImplementedError.

🔧 Proposed fix
     def clean_bucket(self, bucket_name: str):
         """
         Remove all objects from a bucket.
         
         :param bucket_name:
         """
-        self.logging.warning(f"clean_bucket not fully implemented for R2 bucket {bucket_name}")
-        pass
+        raise NotImplementedError(
+            f"clean_bucket not yet implemented for R2 bucket {bucket_name}"
+        )
🤖 Prompt for AI Agents
In `@sebs/cloudflare/r2.py` around lines 347 - 354, The clean_bucket method
currently logs a warning and returns silently which can mislead callers (similar
to download); update the implementation to raise NotImplementedError instead of
silently passing so callers get an explicit failure. Modify the clean_bucket
function (and any similar stubs like download) to raise NotImplementedError with
a clear message such as "clean_bucket not implemented for R2 bucket
{bucket_name}" and remove or keep the warning log as supplementary; ensure the
method signature and docstring remain unchanged.

Comment on lines +368 to +380
        try:
            response = requests.delete(delete_bucket_uri, headers=self._get_auth_headers())
            response.raise_for_status()

            data = response.json()

            if data.get("success"):
                self.logging.info(f"Successfully deleted R2 bucket {bucket}")
            else:
                self.logging.error(f"Failed to delete R2 bucket {bucket}: {data.get('errors')}")

        except requests.exceptions.RequestException as e:
            self.logging.error(f"Error deleting R2 bucket {bucket}: {e}")

⚠️ Potential issue | 🟠 Major

Add timeout to HTTP request.

The requests.delete call lacks a timeout, consistent with other HTTP calls in this file.

🔧 Proposed fix
         try:
-            response = requests.delete(delete_bucket_uri, headers=self._get_auth_headers())
+            response = requests.delete(delete_bucket_uri, headers=self._get_auth_headers(), timeout=30)
             response.raise_for_status()
🧰 Tools
🪛 Ruff (0.14.13)

369-369: Probable use of requests call without timeout

(S113)


380-380: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

🤖 Prompt for AI Agents
In `@sebs/cloudflare/r2.py` around lines 368 - 380, The requests.delete call lacks
a timeout; update the HTTP call to include the same timeout used elsewhere in
this module by adding the timeout keyword argument to
requests.delete(delete_bucket_uri, headers=self._get_auth_headers(),
timeout=<module timeout value>), where <module timeout value> should be the
existing timeout constant or attribute used by other calls in this file (for
example self._timeout or DEFAULT_TIMEOUT), so that the delete request for
delete_bucket_uri follows the same timeout behavior as other requests.

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@sebs/cloudflare/cloudflare.py`:
- Around line 484-507: In _build_workers_dev_url, the last logging.warning uses
a plain string with "{name}" so worker_name is not interpolated; change that
call to use an f-string and reference {worker_name} (i.e., update the
logging.warning message to f"No account ID available; using
https://{worker_name}.workers.dev which may not be reachable.") so the actual
worker_name appears in logs.
♻️ Duplicate comments (12)
sebs/cloudflare/r2.py (7)

11-18: Fix provider typename typo.
The provider identifier is misspelled and can break lookups.

🔧 Proposed fix
     `@staticmethod`
     def typename() -> str:
-        return "Cloudlfare.R2"
+        return "Cloudflare.R2"

99-101: Avoid mutable default for buckets.
A shared list across calls can leak state; also note randomize_name remains unused.

🔧 Proposed fix
     def _create_bucket(
-        self, name: str, buckets: list[str] = [], randomize_name: bool = False
+        self, name: str, buckets: Optional[List[str]] = None, randomize_name: bool = False
     ) -> str:
+        if buckets is None:
+            buckets = []

128-145: Add timeout and narrow JSON parse handling in bucket creation.
Without a timeout the call can hang; bare except hides non‑JSON errors.

🔧 Proposed fix
             create_bucket_response = requests.post(
-                create_bucket_uri, json=params, headers=self._get_auth_headers()
+                create_bucket_uri,
+                json=params,
+                headers=self._get_auth_headers(),
+                timeout=30,
             )
@@
-                except:
+                except json.JSONDecodeError:
                     self.logging.error(
                         f"R2 bucket creation failed. Status: {create_bucket_response.status_code}, "
                         f"Response: {create_bucket_response.text}"
                     )

163-175: Make download failure explicit.
A warning + pass can be mistaken for success.

🔧 Proposed fix
-        self.logging.warning(f"download not fully implemented for R2 bucket {bucket_name}")
-        pass
+        raise NotImplementedError(
+            f"R2 download requires S3-compatible access and is not implemented for bucket {bucket_name}"
+        )

266-315: Add timeout and avoid bare except in list_buckets.
Same reliability concerns as other HTTP calls.

🔧 Proposed fix
-            response = requests.get(list_buckets_uri, headers=self._get_auth_headers())
+            response = requests.get(
+                list_buckets_uri,
+                headers=self._get_auth_headers(),
+                timeout=30,
+            )
@@
-                except:
+                except json.JSONDecodeError:
                     self.logging.error(
                         f"403 Forbidden accessing R2 buckets. "
                         f"Your API token may need 'R2 Read and Write' permissions."
                     )

327-334: Make clean_bucket explicit about being unimplemented.

🔧 Proposed fix
-        self.logging.warning(f"clean_bucket not fully implemented for R2 bucket {bucket_name}")
-        pass
+        raise NotImplementedError(
+            f"clean_bucket not implemented for R2 bucket {bucket_name}"
+        )

336-361: Add timeout to delete bucket call.
Keeps delete requests from hanging indefinitely.

🔧 Proposed fix
-            response = requests.delete(delete_bucket_uri, headers=self._get_auth_headers())
+            response = requests.delete(
+                delete_bucket_uri,
+                headers=self._get_auth_headers(),
+                timeout=30,
+            )
sebs/cloudflare/cloudflare.py (5)

80-91: Avoid mutable default for config (still unused).
This keeps the signature safe even if config stays unused.

🔧 Proposed fix
-    def initialize(self, config: Dict[str, str] = {}, resource_prefix: Optional[str] = None):
+    def initialize(self, config: Optional[Dict[str, str]] = None, resource_prefix: Optional[str] = None):
@@
-        # Verify credentials are valid
+        if config is None:
+            config = {}
+        # Verify credentials are valid

131-164: Add timeout and handle request failures during credential verification.
Prevents hangs and gives clearer failure context.

🔧 Proposed fix
-        response = requests.get(f"{self._api_base_url}/user/tokens/verify", headers=headers)
+        try:
+            response = requests.get(
+                f"{self._api_base_url}/user/tokens/verify",
+                headers=headers,
+                timeout=30,
+            )
+        except requests.exceptions.RequestException as e:
+            raise RuntimeError("Failed to verify Cloudflare credentials") from e

349-366: Add timeout and avoid bare except in worker lookup.

🔧 Proposed fix
-        response = requests.get(url, headers=headers)
+        response = requests.get(url, headers=headers, timeout=30)
@@
-            except:
+            except ValueError:
                 return None

443-483: Add timeout and narrow exception handling for subdomain lookup.

🔧 Proposed fix
-            resp = requests.get(url, headers=headers)
+            resp = requests.get(url, headers=headers, timeout=30)
             if resp.status_code == 200:
-                body = resp.json()
+                try:
+                    body = resp.json()
+                except ValueError:
+                    self.logging.warning("Invalid JSON while fetching workers.dev subdomain")
+                    return None
@@
-        except Exception as e:
+        except requests.exceptions.RequestException as e:
             self.logging.warning(f"Error fetching workers.dev subdomain: {e}")
             return None

606-635: Enforce digit-prefix rule for all workers, not just containers.
Non-container workers starting with a digit will still produce invalid names.

🔧 Proposed fix
-        if container_deployment and formatted and formatted[0].isdigit():
-            formatted = 'container-' + formatted
+        if formatted and formatted[0].isdigit():
+            prefix = "container-" if container_deployment else "w-"
+            formatted = prefix + formatted
🧹 Nitpick comments (1)
sebs/cloudflare/cloudflare.py (1)

522-561: Prefer explicit container flag over name prefix.
Name-based detection only covers digit-prefixed names; an explicit flag is more reliable.

♻️ Suggested change
-        is_container = worker.name.startswith("container-")
+        is_container = container_deployment

Comment on lines +484 to +507
    def _build_workers_dev_url(self, worker_name: str, account_id: Optional[str]) -> str:
        """Build a best-effort public URL for a worker.

        Prefer using the account's readable workers.dev subdomain when available
        (e.g. <name>.<subdomain>.workers.dev). If we can't obtain that, fall
        back to using the account_id as a last resort and log a warning.
        """
        if account_id:
            sub = self._get_workers_dev_subdomain(account_id)
            if sub:
                return f"https://{worker_name}.{sub}.workers.dev"
            else:
                # fallback: some code historically used account_id in the host
                self.logging.warning(
                    "Using account ID in workers.dev URL as a fallback. "
                    "Enable the workers.dev subdomain in Cloudflare for proper URLs."
                )
                return f"https://{worker_name}.{account_id}.workers.dev"
        # Last fallback: plain workers.dev (may not resolve without a subdomain)
        self.logging.warning(
            "No account ID available; using https://{name}.workers.dev which may not be reachable."
        )
        return f"https://{worker_name}.workers.dev"


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

cat -n sebs/cloudflare/cloudflare.py | sed -n '484,507p'

Repository: spcl/serverless-benchmarks

Length of output: 1458


Fix warning string to use f-string and interpolate worker_name.

Line 504 has a plain string literal with {name} placeholder that won't be interpolated. It should be an f-string using {worker_name}:

🔧 Proposed fix
         self.logging.warning(
-            "No account ID available; using https://{name}.workers.dev which may not be reachable."
+            f"No account ID available; using https://{worker_name}.workers.dev which may not be reachable."
         )
🤖 Prompt for AI Agents
In `@sebs/cloudflare/cloudflare.py` around lines 484 - 507, In
_build_workers_dev_url, the last logging.warning uses a plain string with
"{name}" so worker_name is not interpolated; change that call to use an f-string
and reference {worker_name} (i.e., update the logging.warning message to f"No
account ID available; using https://{worker_name}.workers.dev which may not be
reachable.") so the actual worker_name appears in logs.

@userlaurin userlaurin requested a review from mcopik January 19, 2026 10:41