⚡️ Speed up function `_format_variables` by 21% #613

codeflash-ai · 2025-11-12T05:00:00Z

📄 21% (0.21x) speedup for `_format_variables` in `marimo/_server/ai/prompts.py`

⏱️ Runtime : 277 microseconds → 230 microseconds (best of 36 runs)
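The percentages in this report appear to be computed as time saved divided by the optimized time. That formula is an assumption inferred from the per-test annotations (for example, 395ns -> 378ns is reported as 4.50% faster), not something the report states. A minimal sketch of the arithmetic:

```python
def speedup(original: float, optimized: float) -> float:
    """Relative speedup: time saved divided by the optimized time (assumed formula)."""
    return (original - optimized) / optimized


# Headline runtime, using the rounded microsecond figures from the report.
print(f"{speedup(277, 230):.1%}")  # → 20.4%

# One per-test annotation (395ns -> 378ns).
print(f"{speedup(395, 378):.2%}")  # → 4.50%
```

With the rounded figures the headline works out to about 20.4%, which matches the "20%" in the explanation; the "21%" in the title presumably comes from unrounded timings.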

📝 Explanation and details

The optimized code achieves a 20% speedup by replacing inefficient string concatenation with a list-based approach and localizing method lookups.

**Key Optimizations:**

1. **List-based string building**: Instead of repeatedly concatenating strings with `variable_info += ...`, the optimized version collects the string parts in a list and joins them once at the end. This avoids the potentially quadratic cost of repeated string concatenation in Python, where a `+=` operation may allocate a new string object and copy the existing contents.
2. **Localized method lookup**: `append = lines.append` stores the bound method in a local variable, avoiding an attribute lookup on every loop iteration. This micro-optimization reduces overhead when the loop processes many variables.
3. **Removed unnecessary walrus operator assignments**: The original code used `_is_private_variable := variable.startswith("_")` but never read the assigned name afterwards, so the extra binding was pure overhead.
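The optimizations above can be sketched as follows. This is a hypothetical reconstruction, not the actual source in `marimo/_server/ai/prompts.py`: the function names, the `VariableContext` dataclass, and the `HEADER` constant are illustrative, and the per-entry text simply follows the expected strings shown in the generated tests. Any extra lines the real function may emit for `VariableContext` entries are omitted.

```python
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass
class VariableContext:
    # Hypothetical stand-in mirroring the mock class in the generated tests.
    name: str
    value_type: str
    preview_value: str


Variables = Optional[List[Union[VariableContext, str]]]

# Header text taken from the expected strings in the generated tests.
HEADER = "\n\n## Available variables from other cells:\n"


def _name_of(variable: Union[VariableContext, str]) -> str:
    return variable.name if isinstance(variable, VariableContext) else variable


def format_variables_concat(variables: Variables) -> str:
    """Baseline shape: grow one string with += inside the loop."""
    if not variables:
        return ""
    variable_info = HEADER
    for variable in variables:
        name = _name_of(variable)
        if name.startswith("_"):
            continue  # private names are skipped
        variable_info += f"- variable: {name}"
    return variable_info


def format_variables_join(variables: Variables) -> str:
    """Optimized shape: collect parts in a list and join once at the end."""
    if not variables:
        return ""
    lines = [HEADER]
    append = lines.append  # bind the method once, outside the loop
    for variable in variables:
        name = _name_of(variable)
        if name.startswith("_"):
            continue  # plain check, no walrus binding
        append(f"- variable: {name}")
    return "".join(lines)
```

Both shapes return the same string for the same input, which is the property the regression tests rely on.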

**Performance Impact by Workload:**

- **Small variable lists** (1-10 variables): No measured gain. The generated tests below cover only one or two variables, and they run roughly 12-22% slower, which is attributed to the overhead of list creation.
- **Large variable lists** (500-1000 variables): Significant gains of **21-29% faster**, where the cost of repeated string concatenation becomes dominant.
- **Empty/None inputs**: Mostly small improvements of **4-15% faster**, although one empty-list call measured 3.75% slower.
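To check the workload split above on another machine, a micro-benchmark along these lines could be used. It is a sketch under stated assumptions: `build_concat`, `build_join`, and `compare` are hypothetical helpers, not part of marimo or Codeflash, and absolute timings will vary by interpreter and hardware.

```python
import timeit

HEADER = "\n\n## Available variables from other cells:\n"


def build_concat(names):
    # Repeated += on a single string.
    out = HEADER
    for name in names:
        out += f"- variable: {name}"
    return out


def build_join(names):
    # Collect parts in a list, then join once.
    parts = [HEADER]
    append = parts.append
    for name in names:
        append(f"- variable: {name}")
    return "".join(parts)


def compare(n, repeat=5, number=500):
    """Return the best per-call time in seconds for (concat, join) with n names."""
    names = [f"var{i}" for i in range(n)]
    assert build_concat(names) == build_join(names)
    t_concat = min(timeit.repeat(lambda: build_concat(names), repeat=repeat, number=number)) / number
    t_join = min(timeit.repeat(lambda: build_join(names), repeat=repeat, number=number)) / number
    return t_concat, t_join


if __name__ == "__main__":
    for n in (1, 2, 500, 1000):
        t_concat, t_join = compare(n)
        print(f"n={n:4d}  concat={t_concat * 1e6:8.2f}us  join={t_join * 1e6:8.2f}us")
```

Taking the minimum over several repeats mirrors the "best of N runs" convention used in the runtime line of this report.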

The optimization is particularly effective for AI prompt generation scenarios with many available variables, which is likely the primary use case given the function's context in the AI prompts module.

✅ Correctness verification report:

| Test | Status |
|------|--------|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | ✅ 19 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | ✅ 2 Passed |
| 📊 Tests Coverage | 64.3% |

🌀 Generated Regression Tests and Runtime

```python
from typing import Optional, Union

# imports
import pytest
from marimo._server.ai.prompts import _format_variables


# Mock VariableContext for testing
class VariableContext:
    def __init__(self, name: str, value_type: str, preview_value: str):
        self.name = name
        self.value_type = value_type
        self.preview_value = preview_value


# unit tests

# 1. Basic Test Cases

def test_empty_list_returns_empty_string():
    # Test with empty list
    codeflash_output = _format_variables([])  # 395ns -> 378ns (4.50% faster)


def test_none_returns_empty_string():
    # Test with None
    codeflash_output = _format_variables(None)  # 376ns -> 328ns (14.6% faster)


def test_single_string_variable():
    # Test with a single string variable
    expected = (
        "\n\n## Available variables from other cells:\n"
        "- variable: bar"
    )
    codeflash_output = _format_variables(["bar"])  # 1.58μs -> 1.95μs (19.1% slower)


def test_multiple_strings():
    # Test with multiple string variables
    expected = (
        "\n\n## Available variables from other cells:\n"
        "- variable: foo- variable: bar"
    )
    codeflash_output = _format_variables(["foo", "bar"])  # 1.99μs -> 2.30μs (13.6% slower)


# 2. Edge Test Cases

def test_private_string_is_skipped():
    # String variable with private name should be skipped
    codeflash_output = _format_variables(["_private", "public"])  # 1.92μs -> 2.18μs (12.0% slower)


def test_string_empty_string_variable():
    # String variable with empty string name should not be skipped
    expected = (
        "\n\n## Available variables from other cells:\n"
        "- variable: ``"
    )
    codeflash_output = _format_variables([""])  # 1.57μs -> 1.98μs (20.7% slower)


def test_string_with_only_underscore():
    # String variable "_" should be skipped (private)
    codeflash_output = _format_variables(["_"])  # 1.25μs -> 1.59μs (21.5% slower)


def test_string_with_leading_and_trailing_spaces():
    # String variable with spaces in name should not be skipped
    expected = (
        "\n\n## Available variables from other cells:\n"
        "- variable: bar"
    )
    codeflash_output = _format_variables([" bar "])  # 1.57μs -> 1.96μs (19.9% slower)


def test_string_with_non_ascii_name():
    # String variable with non-ascii name
    expected = (
        "\n\n## Available variables from other cells:\n"
        "- variable: 变量"
    )
    codeflash_output = _format_variables(["变量"])  # 1.94μs -> 2.27μs (14.3% slower)


def test_string_with_newline_in_name():
    # String variable with newline in name
    expected = (
        "\n\n## Available variables from other cells:\n"
        "- variable: foo\nbar"
    )
    codeflash_output = _format_variables(["foo\nbar"])  # 1.64μs -> 1.99μs (18.0% slower)


def test_string_variable_with_leading_underscore_and_space():
    # String variable with leading underscore and space is private if it starts with _
    codeflash_output = _format_variables(["_ foo"])  # 1.23μs -> 1.53μs (19.5% slower)


def test_string_variable_with_trailing_underscore():
    # String variable with trailing underscore is not private
    expected = (
        "\n\n## Available variables from other cells:\n"
        "- variable: foo_"
    )
    codeflash_output = _format_variables(["foo_"])  # 1.55μs -> 1.94μs (20.2% slower)


def test_large_number_of_strings():
    # Test with a large number of string variables (no more than 1000)
    n = 500
    variables = [f"var{i}" for i in range(n)]
    codeflash_output = _format_variables(variables); result = codeflash_output  # 83.9μs -> 69.4μs (20.9% faster)
    for i in range(n):
        pass
```

#------------------------------------------------
```python
from typing import Optional, Union

# imports
import pytest  # used for our unit tests
from marimo._server.ai.prompts import _format_variables


# Minimal VariableContext class for testing purposes
class VariableContext:
    def __init__(self, name: str, value_type: str, preview_value: str):
        self.name = name
        self.value_type = value_type
        self.preview_value = preview_value


# unit tests

# ------------------ Basic Test Cases ------------------

def test_empty_list_returns_empty_string():
    # Test with None
    codeflash_output = _format_variables(None)  # 401ns -> 352ns (13.9% faster)
    # Test with empty list
    codeflash_output = _format_variables([])  # 231ns -> 240ns (3.75% slower)


def test_single_str_public():
    # Test with one public str variable
    expected = (
        "\n\n## Available variables from other cells:\n"
        "- variable: y"
    )
    codeflash_output = _format_variables(["y"])  # 1.54μs -> 1.90μs (18.9% slower)


def test_private_str_skipped():
    # str variable starting with underscore should be skipped
    codeflash_output = _format_variables(["_hidden"])  # 1.30μs -> 1.57μs (17.1% slower)


def test_str_variable_empty_string():
    # str variable is an empty string (not private)
    expected = (
        "\n\n## Available variables from other cells:\n"
        "- variable: ``"
    )
    codeflash_output = _format_variables([""])  # 1.50μs -> 1.88μs (20.4% slower)


def test_large_all_public_str():
    # 1000 public str variables
    variables = [f"var{i}" for i in range(1000)]
    codeflash_output = _format_variables(variables); result = codeflash_output  # 169μs -> 131μs (28.6% faster)
    # Should contain all variable names
    for i in range(1000):
        pass
    # Should not contain any underscores at start
    for i in range(1000):
        pass
```

codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

#------------------------------------------------
```python
from marimo._server.ai.prompts import _format_variables


def test__format_variables():
    _format_variables(['', ''])


def test__format_variables_2():
    _format_variables([])
```

🔎 Concolic Coverage Tests and Runtime

| Test File::Test Function | Original ⏱️ | Optimized ⏱️ | Speedup |
|--------------------------|-------------|--------------|---------|
| `codeflash_concolic_bps3n5s8/tmpos79go1o/test_concolic_coverage.py::test__format_variables` | 1.86μs | 2.14μs | -13.3%⚠️ |
| `codeflash_concolic_bps3n5s8/tmpos79go1o/test_concolic_coverage.py::test__format_variables_2` | 420ns | 369ns | 13.8%✅ |

To edit these changes, run `git checkout codeflash/optimize-_format_variables-mhvj6rm4` and push.


codeflash-ai bot requested a review from mashraf-222 November 12, 2025 05:00

codeflash-ai bot added ⚡️ codeflash Optimization PR opened by Codeflash AI 🎯 Quality: High Optimization Quality according to Codeflash labels Nov 12, 2025
