@codeflash-ai codeflash-ai bot commented Nov 13, 2025

📄 68% (0.68x) speedup for get_return_docstring in gradio/cli/commands/components/_docs_utils.py

⏱️ Runtime : 1.53 milliseconds → 906 microseconds (best of 39 runs)

📝 Explanation and details

The optimization precompiles the regular expression pattern into a reusable compiled regex object (_RETURN_PATTERN), so the pattern string no longer has to be resolved to a compiled regex on every function call.

Key changes:

  • Moved regex compilation outside the function using re.compile() with the same pattern and flags
  • Simplified the regex quantifier from {0,1} to ? for better readability
  • Used the precompiled pattern's .search() method instead of re.search()
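The three changes above can be sketched as a before/after pair. This is a minimal illustration, not the code from _docs_utils.py: the pattern text, the flags, and the two function names are assumptions made for the example. Only the _RETURN_PATTERN name and the {0,1} to ? change come from the description above.

```python
import re

# Assumed flags for illustration; the real ones live in _docs_utils.py.
_FLAGS = re.MULTILINE | re.IGNORECASE


def get_return_docstring_before(docstring: str):
    # Hypothetical "before" version: the pattern string is handed to
    # re.search() on every call, using the {0,1} quantifier.
    pattern = r"^Return(?:s){0,1}:[ \t]*(.*)$"
    match = re.search(pattern, docstring, _FLAGS)
    return match.group(1).strip() if match else None


# Compiled once at import time; "s?" is equivalent to "(?:s){0,1}".
_RETURN_PATTERN = re.compile(r"^Returns?:[ \t]*(.*)$", _FLAGS)


def get_return_docstring_after(docstring: str):
    # Hypothetical "after" version: reuse the precompiled pattern object.
    match = _RETURN_PATTERN.search(docstring)
    return match.group(1).strip() if match else None
```

Both versions should return the same value for any input, which is what the regression tests below check against the real function.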

Why this leads to speedup:
In Python, re.search() takes the pattern as a string, so every call has to look it up in the re module's internal cache of compiled patterns, and recompile it on a cache miss, before any matching happens. Compiling the pattern once at module load time and calling .search() on the compiled object skips that per-call overhead. The line profiler shows the regex search line (the one with re.search) dropping from 91.9% of execution time to 83.1%, with per-hit time falling from 22,435ns to 8,469ns, a 62% reduction on the most expensive operation.
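A quick way to check this kind of claim locally is a timeit micro-benchmark. The sketch below uses an assumed pattern and a sample docstring taken from the tests; it is not the profiling setup behind the numbers above. Note that CPython's re module caches compiled patterns, so the module-level call mostly pays for a cache lookup and an extra function call rather than a full recompilation. Absolute timings will vary by machine and Python version.

```python
import re
import timeit

# Assumed pattern and flags, for illustration only.
PATTERN_TEXT = r"^Returns?:[ \t]*(.*)$"
FLAGS = re.MULTILINE | re.IGNORECASE
COMPILED = re.compile(PATTERN_TEXT, FLAGS)
DOC = "Returns: The result as a string."


def via_module_function():
    # Resolves the pattern string to a compiled regex on every call.
    return re.search(PATTERN_TEXT, DOC, FLAGS)


def via_compiled_pattern():
    # Uses the already-compiled pattern object directly.
    return COMPILED.search(DOC)


def measure(number: int = 100_000):
    # Returns total seconds for `number` calls of each variant.
    slow = timeit.timeit(via_module_function, number=number)
    fast = timeit.timeit(via_compiled_pattern, number=number)
    return slow, fast


if __name__ == "__main__":
    n = 100_000
    slow, fast = measure(n)
    print(f"re.search(pattern, ...): {slow / n * 1e9:.0f} ns per call")
    print(f"compiled.search(...):    {fast / n * 1e9:.0f} ns per call")
```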

Performance impact based on function references:
The get_return_docstring function is called within extract_docstrings(), which processes multiple class members and their docstrings in a loop. Because documentation parsing typically handles many functions and methods in one batch, the saving accumulates across calls: about 14μs per call according to the line-profiler figures above.
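The figures quoted in this report can be cross-checked with a little arithmetic. The sketch below uses only numbers stated above; the total_saving_ms helper is hypothetical and simply scales the per-call saving. The per-hit times come from a line profiler, so they likely overstate what an unprofiled run would see: the individual test timings below show savings closer to 5μs per call.

```python
# Per-hit times for the regex search line, as quoted from the line profiler.
per_hit_before_ns = 22_435
per_hit_after_ns = 8_469

saved_ns = per_hit_before_ns - per_hit_after_ns      # 13,966 ns, i.e. ~14 µs
reduction_pct = saved_ns / per_hit_before_ns * 100   # ~62%

# Overall runtime from the header: 1.53 ms before, 906 µs after.
total_before_us = 1_530.0
total_after_us = 906.0
speedup_pct = (total_before_us / total_after_us - 1) * 100  # just under 69%


def total_saving_ms(calls: int) -> float:
    # Hypothetical helper: projected saving for a batch of `calls` calls,
    # assuming the profiled per-call saving applies to each one.
    return calls * saved_ns / 1_000_000


print(f"saved per call: {saved_ns} ns ({reduction_pct:.1f}% reduction)")
print(f"overall speedup from rounded runtimes: {speedup_pct:.1f}%")
print(f"projected saving over 1,000 calls: {total_saving_ms(1_000):.3f} ms")
```

The speedup computed from the rounded runtimes lands slightly above the 68% in the header, which is consistent with the 1.53 ms figure itself being rounded.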

Test case performance:
The small-input tests run roughly 100-200% faster, with the largest gains where little or no matching work is needed, such as empty strings (580-655% faster). Large-scale tests in which the regex has to scan hundreds of lines or a long description show smaller but still measurable improvements (about 3-50% faster), because the fixed per-call overhead is a smaller fraction of total time for very large inputs.

Correctness verification report:

| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | ✅ 119 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |
🌀 Generated Regression Tests and Runtime
from __future__ import annotations

import re

# imports
import pytest  # used for our unit tests
from gradio.cli.commands.components._docs_utils import get_return_docstring

# unit tests

# -------------------- Basic Test Cases --------------------

def test_returns_single_line():
    # Test a simple single-line Returns docstring
    docstring = "Returns: The result as a string."
    codeflash_output = get_return_docstring(docstring) # 9.23μs -> 3.98μs (132% faster)

def test_return_single_line():
    # Test a simple single-line Return docstring
    docstring = "Return: The sum of a and b."
    codeflash_output = get_return_docstring(docstring) # 8.49μs -> 3.93μs (116% faster)

def test_returns_with_leading_spaces():
    # Test with leading spaces after Returns:
    docstring = "Returns:    The answer."
    codeflash_output = get_return_docstring(docstring) # 8.71μs -> 3.82μs (128% faster)

def test_returns_case_insensitive():
    # Test with lowercase 'returns'
    docstring = "returns: lowercase test."
    codeflash_output = get_return_docstring(docstring) # 8.58μs -> 3.68μs (133% faster)

def test_return_case_insensitive():
    # Test with uppercase 'RETURN'
    docstring = "RETURN: uppercase test."
    codeflash_output = get_return_docstring(docstring) # 8.83μs -> 3.67μs (141% faster)

def test_returns_with_tabs_and_newlines():
    # Test with tabs and newlines after Returns:
    docstring = "Returns:\n\tThe value."
    codeflash_output = get_return_docstring(docstring) # 8.37μs -> 3.66μs (129% faster)

def test_returns_with_text_after():
    # Test with more text after the return docstring
    docstring = "Returns: Value\nOther: something else"
    codeflash_output = get_return_docstring(docstring) # 7.98μs -> 3.39μs (135% faster)

def test_returns_with_colon_in_text():
    # Test with colon in the return description
    docstring = "Returns: A dict: with keys and values."
    codeflash_output = get_return_docstring(docstring) # 8.60μs -> 3.85μs (124% faster)

def test_returns_with_multiline_description():
    # Test multiline return docstring
    docstring = "Returns: The result.\nThis line should not be included."
    codeflash_output = get_return_docstring(docstring) # 8.42μs -> 3.50μs (140% faster)

def test_returns_with_trailing_spaces():
    # Test trailing spaces in the return description
    docstring = "Returns:   Value with spaces   "
    codeflash_output = get_return_docstring(docstring) # 8.67μs -> 3.78μs (129% faster)

# -------------------- Edge Test Cases --------------------

def test_no_returns_section():
    # Test with no Returns or Return section
    docstring = "This function does something but does not have a returns section."
    codeflash_output = get_return_docstring(docstring) # 8.08μs -> 3.23μs (151% faster)

def test_returns_colon_but_no_description():
    # Test Returns: with no description
    docstring = "Returns:"
    codeflash_output = get_return_docstring(docstring) # 8.02μs -> 3.04μs (164% faster)

def test_returns_colon_only_spaces():
    # Test Returns: with only spaces after colon
    docstring = "Returns:    "
    codeflash_output = get_return_docstring(docstring) # 8.20μs -> 2.90μs (183% faster)

def test_returns_embedded_in_word():
    # Should not match 'SuperReturns:'
    docstring = "SuperReturns: Not a match.\nReturns: This should match."
    codeflash_output = get_return_docstring(docstring) # 9.47μs -> 4.59μs (106% faster)

def test_multiple_returns_sections():
    # Should match only the first Returns section
    docstring = "Returns: First.\nReturns: Second."
    codeflash_output = get_return_docstring(docstring) # 8.55μs -> 3.30μs (159% faster)

def test_returns_colon_in_middle_of_line():
    # Should not match 'This function Returns:'
    docstring = "This function Returns: does something."
    codeflash_output = get_return_docstring(docstring) # 8.70μs -> 4.00μs (117% faster)

def test_returns_with_extra_newlines():
    # Test Returns: followed by multiple newlines before the description
    docstring = "Returns:\n\n\nThe value."
    codeflash_output = get_return_docstring(docstring) # 8.57μs -> 3.70μs (132% faster)

def test_returns_with_tabs_and_spaces():
    # Test Returns: followed by tabs and spaces
    docstring = "Returns:\t   The value."
    codeflash_output = get_return_docstring(docstring) # 8.66μs -> 3.60μs (140% faster)

def test_returns_with_unicode_characters():
    # Test Returns: with unicode characters in description
    docstring = "Returns: Résultat avec caractères spéciaux éàè."
    codeflash_output = get_return_docstring(docstring) # 9.40μs -> 4.26μs (120% faster)

def test_returns_with_only_colon():
    # Test docstring that is just 'Returns:'
    docstring = "Returns:"
    codeflash_output = get_return_docstring(docstring) # 8.51μs -> 3.13μs (172% faster)

def test_returns_with_multiple_colons():
    # Test Returns: with multiple colons in the description
    docstring = "Returns: Value: with: colons."
    codeflash_output = get_return_docstring(docstring) # 8.89μs -> 3.84μs (132% faster)

def test_returns_with_no_colon():
    # Test 'Returns' with no colon should not match
    docstring = "Returns the value."
    codeflash_output = get_return_docstring(docstring) # 7.56μs -> 2.35μs (222% faster)

def test_return_colon_with_newline_description():
    # Test 'Return:' with description on next line (should not match, only matches same line)
    docstring = "Return:\nThis is on the next line."
    codeflash_output = get_return_docstring(docstring) # 8.97μs -> 3.95μs (127% faster)

def test_returns_with_trailing_newline():
    # Test Returns: with trailing newline at end of docstring
    docstring = "Returns: Value\n"
    codeflash_output = get_return_docstring(docstring) # 8.35μs -> 3.42μs (144% faster)

def test_returns_with_leading_newline():
    # Test Returns: at start of docstring with leading newline
    docstring = "\nReturns: Value"
    codeflash_output = get_return_docstring(docstring) # 8.59μs -> 3.44μs (149% faster)

def test_returns_with_only_newlines():
    # Test Returns: with only newlines after colon
    docstring = "Returns:\n\n"
    codeflash_output = get_return_docstring(docstring) # 7.42μs -> 2.97μs (150% faster)

def test_returns_with_non_ascii_colon():
    # Test Returns: with full-width colon (should not match)
    docstring = "Returns\uff1aNon-ASCII colon."
    codeflash_output = get_return_docstring(docstring) # 9.36μs -> 3.94μs (137% faster)

def test_returns_with_trailing_comment():
    # Test Returns: with inline comment after description
    docstring = "Returns: value  # comment"
    codeflash_output = get_return_docstring(docstring) # 8.40μs -> 3.83μs (120% faster)

def test_returns_with_indented_description():
    # Test Returns: with indented description (should strip)
    docstring = "Returns:    indented value"
    codeflash_output = get_return_docstring(docstring) # 8.57μs -> 3.62μs (136% faster)

def test_returns_with_empty_string():
    # Test empty docstring
    docstring = ""
    codeflash_output = get_return_docstring(docstring) # 5.46μs -> 803ns (580% faster)

def test_returns_with_none_input():
    # Test None as input (should raise TypeError)
    with pytest.raises(TypeError):
        get_return_docstring(None) # 7.00μs -> 1.92μs (265% faster)

# -------------------- Large Scale Test Cases --------------------

def test_large_docstring_with_returns_at_end():
    # Test a large docstring with Returns at the end
    docstring = "Line\n" * 900 + "Returns: The last value."
    codeflash_output = get_return_docstring(docstring) # 59.5μs -> 53.2μs (11.9% faster)

def test_large_docstring_with_returns_in_middle():
    # Test a large docstring with Returns in the middle
    docstring = "Start\n" + ("Middle\n" * 500) + "Returns: Middle value.\n" + ("End\n" * 400)
    codeflash_output = get_return_docstring(docstring) # 44.2μs -> 39.1μs (13.0% faster)

def test_large_docstring_no_returns():
    # Test a large docstring with no Returns section
    docstring = "Line\n" * 999
    codeflash_output = get_return_docstring(docstring) # 61.4μs -> 56.3μs (9.03% faster)

def test_large_docstring_multiple_returns():
    # Test a large docstring with multiple Returns sections
    docstring = ("Text\n" * 250) + "Returns: First value.\n" + ("Text\n" * 250) + "Returns: Second value."
    codeflash_output = get_return_docstring(docstring) # 23.3μs -> 17.9μs (30.1% faster)

def test_large_docstring_returns_with_long_description():
    # Test Returns: with a long description
    long_desc = "x" * 900
    docstring = f"Returns: {long_desc}"
    codeflash_output = get_return_docstring(docstring) # 23.4μs -> 18.3μs (28.2% faster)

def test_large_docstring_returns_with_multiline_description():
    # Should only match up to the first newline
    docstring = ("Line\n" * 500) + "Returns: This is a long description.\nBut this should not be included."
    codeflash_output = get_return_docstring(docstring) # 37.2μs -> 31.8μs (17.2% faster)

def test_large_docstring_returns_with_lots_of_whitespace():
    # Test Returns: with lots of whitespace before description
    docstring = ("Line\n" * 500) + "Returns:      \n\t   The value."
    codeflash_output = get_return_docstring(docstring) # 36.1μs -> 32.4μs (11.5% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
from __future__ import annotations

import re

# imports
import pytest  # used for our unit tests
from gradio.cli.commands.components._docs_utils import get_return_docstring

# unit tests

# ------------------- Basic Test Cases -------------------

def test_basic_single_line_return():
    # Basic single-line 'Returns:' docstring
    doc = "Returns: The result of the computation."
    codeflash_output = get_return_docstring(doc) # 9.79μs -> 3.92μs (150% faster)

def test_basic_single_line_return_lowercase():
    # Lowercase 'return:' should also match due to IGNORECASE
    doc = "return: The output value."
    codeflash_output = get_return_docstring(doc) # 9.54μs -> 3.90μs (145% faster)

def test_basic_multiline_docstring_with_returns():
    # Multiline docstring, 'Returns:' in the middle
    doc = """This function does something.
    Returns: A boolean indicating success.
    """
    codeflash_output = get_return_docstring(doc) # 10.5μs -> 4.93μs (112% faster)

def test_basic_returns_with_leading_and_trailing_whitespace():
    # Whitespace before and after the return docstring
    doc = "Returns:    The answer.   "
    codeflash_output = get_return_docstring(doc) # 9.63μs -> 3.96μs (143% faster)

def test_basic_returns_with_tab_and_newline():
    # Newline and spaces after colon
    doc = "Returns:\n    The processed value."
    codeflash_output = get_return_docstring(doc) # 9.29μs -> 3.89μs (139% faster)

def test_basic_returns_with_multiple_spaces_and_tabs():
    # Multiple spaces and tabs after colon
    doc = "Returns:   \t   A value."
    codeflash_output = get_return_docstring(doc) # 8.92μs -> 3.67μs (143% faster)

def test_basic_returns_with_colon_and_no_description():
    # Only 'Returns:' and nothing else
    doc = "Returns:"
    codeflash_output = get_return_docstring(doc) # 8.27μs -> 3.11μs (166% faster)

def test_basic_returns_with_colon_and_newline_only():
    # Only 'Returns:' followed by a newline
    doc = "Returns:\n"
    codeflash_output = get_return_docstring(doc) # 8.54μs -> 3.08μs (177% faster)

def test_basic_returns_with_colon_and_multiple_newlines():
    # Only 'Returns:' followed by multiple newlines
    doc = "Returns:\n\n"
    codeflash_output = get_return_docstring(doc) # 8.59μs -> 3.19μs (169% faster)

def test_basic_returns_with_colon_and_whitespace_only():
    # Only 'Returns:' followed by whitespace
    doc = "Returns:   \t  "
    codeflash_output = get_return_docstring(doc) # 8.57μs -> 3.30μs (159% faster)

def test_basic_returns_with_text_after_returns():
    # Text after 'Returns:' and some unrelated trailing text
    doc = "Returns: Some value.\nNote: This is important."
    codeflash_output = get_return_docstring(doc) # 9.29μs -> 3.67μs (153% faster)

def test_basic_returns_with_text_at_end_of_string():
    # 'Returns:' at the end of the docstring
    doc = "Does something. Returns: Final value."
    codeflash_output = get_return_docstring(doc) # 9.73μs -> 4.17μs (133% faster)

def test_basic_returns_with_text_at_start_of_string():
    # 'Returns:' at the start of the docstring
    doc = "Returns: Output value. Used for testing."
    codeflash_output = get_return_docstring(doc) # 9.62μs -> 4.08μs (136% faster)

def test_basic_returns_with_text_and_no_colon():
    # Should not match if no colon after 'Returns'
    doc = "Returns the output value."
    codeflash_output = get_return_docstring(doc) # 8.16μs -> 2.63μs (211% faster)

def test_basic_returns_with_text_and_colon_in_middle_of_word():
    # Should not match if colon is not directly after 'Returns'
    doc = "Returns something: the output value."
    codeflash_output = get_return_docstring(doc) # 8.27μs -> 2.80μs (196% faster)

# ------------------- Edge Test Cases -------------------

def test_edge_empty_docstring():
    # Empty string should return None
    doc = ""
    codeflash_output = get_return_docstring(doc) # 6.10μs -> 808ns (655% faster)

def test_edge_none_input():
    # None input should raise TypeError
    with pytest.raises(TypeError):
        get_return_docstring(None) # 7.47μs -> 1.97μs (279% faster)

def test_edge_no_returns_section():
    # Docstring with no 'Returns:' section
    doc = "This function does not have a return docstring."
    codeflash_output = get_return_docstring(doc) # 8.42μs -> 3.33μs (152% faster)

def test_edge_returns_in_middle_of_text():
    # 'Returns:' in the middle of a sentence, not at start of line
    doc = "This function Returns: something useful."
    codeflash_output = get_return_docstring(doc) # 9.90μs -> 4.52μs (119% faster)

def test_edge_returns_with_multiple_occurrences():
    # Multiple 'Returns:' sections, should match the first one
    doc = """Returns: First value.
    Returns: Second value."""
    codeflash_output = get_return_docstring(doc) # 8.87μs -> 3.65μs (143% faster)

def test_edge_returns_with_multiline_description():
    # Multiline description after 'Returns:', should only capture up to first newline
    doc = "Returns: A value.\nThis is a new line."
    codeflash_output = get_return_docstring(doc) # 8.95μs -> 3.62μs (147% faster)

def test_edge_returns_with_multiline_and_colon_in_description():
    # Description contains a colon, should not affect matching
    doc = "Returns: A value: the result."
    codeflash_output = get_return_docstring(doc) # 9.14μs -> 3.75μs (144% faster)

def test_edge_returns_with_non_ascii_characters():
    # Non-ASCII characters in the return description
    doc = "Returns: Résultat calculé."
    codeflash_output = get_return_docstring(doc) # 9.48μs -> 4.09μs (132% faster)

def test_edge_returns_with_unicode_whitespace():
    # Unicode whitespace after colon
    doc = "Returns:\u2003Output value."
    codeflash_output = get_return_docstring(doc) # 11.2μs -> 6.21μs (80.0% faster)

def test_edge_returns_with_tabs_and_newlines():
    # Tabs and newlines after colon
    doc = "Returns:\t\n\t  The result."
    codeflash_output = get_return_docstring(doc) # 9.21μs -> 3.85μs (139% faster)

def test_edge_returns_with_multiple_newlines_in_docstring():
    # Multiple newlines; 'Returns:' at the start of a line but not at the start of the string
    doc = "\n\nReturns: Final value.\n\n"
    codeflash_output = get_return_docstring(doc) # 9.10μs -> 3.72μs (144% faster)

def test_edge_returns_with_returns_in_word():
    # 'Returns:' as part of another word should not match
    doc = "SuperReturns: This is not a valid section."
    codeflash_output = get_return_docstring(doc) # 7.77μs -> 2.43μs (219% faster)

def test_edge_returns_with_return_colon_and_no_space():
    # 'Return:' with no space after colon
    doc = "Return:Value"
    codeflash_output = get_return_docstring(doc) # 9.28μs -> 3.66μs (154% faster)

def test_edge_returns_with_return_colon_and_newline():
    # 'Return:' followed by newline and description
    doc = "Return:\nThe output."
    codeflash_output = get_return_docstring(doc) # 9.27μs -> 3.71μs (150% faster)

def test_edge_returns_with_return_colon_and_tab():
    # 'Return:' followed by tab and description
    doc = "Return:\tThe output."
    codeflash_output = get_return_docstring(doc) # 8.96μs -> 3.77μs (138% faster)

def test_edge_returns_with_return_colon_and_multiple_whitespace():
    # 'Return:' followed by multiple whitespace characters
    doc = "Return:     The value."
    codeflash_output = get_return_docstring(doc) # 9.39μs -> 3.81μs (146% faster)

def test_edge_returns_with_return_colon_and_empty_description():
    # 'Return:' with no description
    doc = "Return:"
    codeflash_output = get_return_docstring(doc) # 8.26μs -> 3.15μs (163% faster)

def test_edge_returns_with_return_colon_and_newline_only():
    # 'Return:' followed by newline only
    doc = "Return:\n"
    codeflash_output = get_return_docstring(doc) # 8.50μs -> 3.23μs (163% faster)

def test_edge_returns_with_return_colon_and_whitespace_only():
    # 'Return:' followed by whitespace only
    doc = "Return:   \t  "
    codeflash_output = get_return_docstring(doc) # 8.63μs -> 3.25μs (165% faster)

def test_edge_returns_with_trailing_and_leading_newlines():
    # 'Returns:' surrounded by newlines
    doc = "\nReturns: Output value.\n"
    codeflash_output = get_return_docstring(doc) # 9.39μs -> 3.77μs (149% faster)

def test_edge_returns_with_multiple_spaces_before_colon():
    # Multiple spaces before colon should not match
    doc = "Returns   : Output value."
    codeflash_output = get_return_docstring(doc) # 8.07μs -> 2.79μs (189% faster)

def test_edge_returns_with_colon_in_description():
    # Colon in the description should be included
    doc = "Returns: The value: processed."
    codeflash_output = get_return_docstring(doc) # 9.47μs -> 4.02μs (136% faster)

def test_edge_returns_with_colon_and_no_space():
    # No space after colon, should still match
    doc = "Returns:Output value."
    codeflash_output = get_return_docstring(doc) # 9.06μs -> 3.75μs (141% faster)

def test_edge_returns_with_colon_and_newline_and_trailing_text():
    # Should only capture up to the first newline
    doc = "Returns: Output value.\nExtra info."
    codeflash_output = get_return_docstring(doc) # 9.12μs -> 3.67μs (148% faster)

def test_edge_returns_with_colon_and_multiline_description():
    # Multiline description, should only capture up to the first newline
    doc = "Returns: Output value.\nSecond line."
    codeflash_output = get_return_docstring(doc) # 9.12μs -> 3.67μs (148% faster)

def test_edge_returns_with_colon_and_multiple_newlines_in_description():
    # Multiple newlines after 'Returns:'
    doc = "Returns: Value.\n\nMore info."
    codeflash_output = get_return_docstring(doc) # 8.51μs -> 3.58μs (137% faster)

def test_edge_returns_with_colon_and_multiple_returns_sections():
    # Multiple 'Returns:' sections, should match the first
    doc = "Returns: First value.\nReturns: Second value."
    codeflash_output = get_return_docstring(doc) # 9.12μs -> 3.70μs (147% faster)

def test_edge_returns_with_colon_and_returns_at_end_of_docstring():
    # 'Returns:' at the very end of docstring
    doc = "Does something.\nReturns: Final value."
    codeflash_output = get_return_docstring(doc) # 9.72μs -> 4.38μs (122% faster)

def test_edge_returns_with_colon_and_returns_at_start_of_docstring():
    # 'Returns:' at the very start of docstring
    doc = "Returns: Output value."
    codeflash_output = get_return_docstring(doc) # 8.96μs -> 3.75μs (139% faster)

def test_edge_returns_with_colon_and_returns_in_middle_of_docstring():
    # 'Returns:' in the middle of docstring
    doc = "Does something.\nReturns: Output value.\nMore info."
    codeflash_output = get_return_docstring(doc) # 9.58μs -> 4.24μs (126% faster)

def test_edge_returns_with_colon_and_returns_with_trailing_period():
    # 'Returns:' with trailing period in the description
    doc = "Returns: Output value."
    codeflash_output = get_return_docstring(doc) # 9.54μs -> 3.77μs (153% faster)

def test_edge_returns_with_colon_and_returns_with_trailing_exclamation():
    # 'Returns:' with trailing exclamation mark in the description
    doc = "Returns: Output value!"
    codeflash_output = get_return_docstring(doc) # 9.41μs -> 3.82μs (146% faster)

def test_edge_returns_with_colon_and_returns_with_trailing_question():
    # 'Returns:' with trailing question mark in the description
    doc = "Returns: Output value?"
    codeflash_output = get_return_docstring(doc) # 9.44μs -> 3.87μs (144% faster)

def test_edge_returns_with_colon_and_returns_with_trailing_semicolon():
    # 'Returns:' with trailing semicolon in the description
    doc = "Returns: Output value;"
    codeflash_output = get_return_docstring(doc) # 9.45μs -> 3.91μs (142% faster)

def test_edge_returns_with_colon_and_returns_with_trailing_comma():
    # 'Returns:' with trailing comma in the description
    doc = "Returns: Output value,"
    codeflash_output = get_return_docstring(doc) # 8.84μs -> 3.67μs (141% faster)

def test_edge_returns_with_colon_and_returns_with_trailing_colon():
    # 'Returns:' with trailing colon in the description
    doc = "Returns: Output value:"
    codeflash_output = get_return_docstring(doc) # 9.51μs -> 3.75μs (154% faster)

def test_edge_returns_with_colon_and_returns_with_trailing_dash():
    # 'Returns:' with trailing dash in the description
    doc = "Returns: Output value-"
    codeflash_output = get_return_docstring(doc) # 9.29μs -> 3.80μs (145% faster)

def test_edge_returns_with_colon_and_returns_with_trailing_underscore():
    # 'Returns:' with trailing underscore in the description
    doc = "Returns: Output value_"
    codeflash_output = get_return_docstring(doc) # 9.29μs -> 3.83μs (143% faster)

def test_edge_returns_with_colon_and_returns_with_trailing_slash():
    # 'Returns:' with trailing slash in the description
    doc = "Returns: Output value/"
    codeflash_output = get_return_docstring(doc) # 9.23μs -> 3.76μs (146% faster)

def test_edge_returns_with_colon_and_returns_with_trailing_backslash():
    # 'Returns:' with trailing backslash in the description
    doc = "Returns: Output value\\"
    codeflash_output = get_return_docstring(doc) # 9.27μs -> 3.84μs (141% faster)

def test_edge_returns_with_colon_and_returns_with_trailing_pipe():
    # 'Returns:' with trailing pipe in the description
    doc = "Returns: Output value|"
    codeflash_output = get_return_docstring(doc) # 8.99μs -> 3.71μs (142% faster)

def test_edge_returns_with_colon_and_returns_with_trailing_ampersand():
    # 'Returns:' with trailing ampersand in the description
    doc = "Returns: Output value&"
    codeflash_output = get_return_docstring(doc) # 8.73μs -> 3.73μs (134% faster)

def test_edge_returns_with_colon_and_returns_with_trailing_hash():
    # 'Returns:' with trailing hash in the description
    doc = "Returns: Output value#"
    codeflash_output = get_return_docstring(doc) # 9.50μs -> 3.76μs (153% faster)

def test_edge_returns_with_colon_and_returns_with_trailing_percent():
    # 'Returns:' with trailing percent in the description
    doc = "Returns: Output value%"
    codeflash_output = get_return_docstring(doc) # 9.47μs -> 3.90μs (143% faster)

def test_edge_returns_with_colon_and_returns_with_trailing_dollar():
    # 'Returns:' with trailing dollar in the description
    doc = "Returns: Output value$"
    codeflash_output = get_return_docstring(doc) # 9.27μs -> 3.91μs (137% faster)

def test_edge_returns_with_colon_and_returns_with_trailing_at():
    # 'Returns:' with trailing at in the description
    doc = "Returns: Output value@"
    codeflash_output = get_return_docstring(doc) # 9.46μs -> 3.82μs (147% faster)

def test_edge_returns_with_colon_and_returns_with_trailing_star():
    # 'Returns:' with trailing star in the description
    doc = "Returns: Output value*"
    codeflash_output = get_return_docstring(doc) # 9.08μs -> 3.76μs (141% faster)

def test_edge_returns_with_colon_and_returns_with_trailing_plus():
    # 'Returns:' with trailing plus in the description
    doc = "Returns: Output value+"
    codeflash_output = get_return_docstring(doc) # 9.27μs -> 3.85μs (140% faster)

def test_edge_returns_with_colon_and_returns_with_trailing_equal():
    # 'Returns:' with trailing equal in the description
    doc = "Returns: Output value="
    codeflash_output = get_return_docstring(doc) # 8.77μs -> 3.88μs (126% faster)

def test_edge_returns_with_colon_and_returns_with_trailing_tilde():
    # 'Returns:' with trailing tilde in the description
    doc = "Returns: Output value~"
    codeflash_output = get_return_docstring(doc) # 8.81μs -> 3.77μs (134% faster)

def test_edge_returns_with_colon_and_returns_with_trailing_caret():
    # 'Returns:' with trailing caret in the description
    doc = "Returns: Output value^"
    codeflash_output = get_return_docstring(doc) # 9.14μs -> 3.78μs (142% faster)

def test_edge_returns_with_colon_and_returns_with_trailing_bracket():
    # 'Returns:' with trailing bracket in the description
    doc = "Returns: Output value]"
    codeflash_output = get_return_docstring(doc) # 9.39μs -> 4.03μs (133% faster)

def test_edge_returns_with_colon_and_returns_with_trailing_brace():
    # 'Returns:' with trailing brace in the description
    doc = "Returns: Output value}"
    codeflash_output = get_return_docstring(doc) # 9.25μs -> 3.84μs (141% faster)

def test_edge_returns_with_colon_and_returns_with_trailing_parenthesis():
    # 'Returns:' with trailing parenthesis in the description
    doc = "Returns: Output value)"
    codeflash_output = get_return_docstring(doc) # 9.14μs -> 3.78μs (142% faster)

def test_edge_returns_with_colon_and_returns_with_trailing_angle_bracket():
    # 'Returns:' with trailing angle bracket in the description
    doc = "Returns: Output value>"
    codeflash_output = get_return_docstring(doc) # 9.27μs -> 3.79μs (145% faster)

def test_edge_returns_with_colon_and_returns_with_trailing_less_than():
    # 'Returns:' with trailing less than in the description
    doc = "Returns: Output value<"
    codeflash_output = get_return_docstring(doc) # 9.14μs -> 3.85μs (137% faster)

# ------------------- Large Scale Test Cases -------------------

def test_large_scale_long_docstring_with_returns_at_end():
    # Large docstring, 'Returns:' at the end
    doc = "Line.\n" * 999 + "Returns: Final output."
    codeflash_output = get_return_docstring(doc) # 80.4μs -> 76.8μs (4.64% faster)

def test_large_scale_long_docstring_with_returns_at_start():
    # Large docstring, 'Returns:' at the start
    doc = "Returns: Output value.\n" + "Line.\n" * 999
    codeflash_output = get_return_docstring(doc) # 8.84μs -> 3.56μs (148% faster)

def test_large_scale_long_docstring_with_returns_in_middle():
    # Large docstring, 'Returns:' in the middle
    doc = "Line.\n" * 500 + "Returns: Middle output.\n" + "Line.\n" * 499
    codeflash_output = get_return_docstring(doc) # 45.5μs -> 41.5μs (9.78% faster)

def test_large_scale_long_docstring_with_no_returns():
    # Large docstring, no 'Returns:' section
    doc = "Line.\n" * 1000
    codeflash_output = get_return_docstring(doc) # 78.0μs -> 76.0μs (2.61% faster)

def test_large_scale_many_returns_sections():
    # Large docstring with many 'Returns:' sections, should match the first
    doc = "".join(f"Returns: Value{i}.\n" for i in range(1000))
    codeflash_output = get_return_docstring(doc) # 9.32μs -> 3.43μs (171% faster)

def test_large_scale_large_returns_description():
    # Large description after 'Returns:', should capture all up to first newline
    description = "A" * 999
    doc = f"Returns: {description}\nNext section."
    codeflash_output = get_return_docstring(doc) # 24.5μs -> 19.6μs (25.3% faster)

def test_large_scale_returns_with_multiline_description():
    # Multiline description, should only capture up to first newline
    description = "A" * 500
    doc = f"Returns: {description}\n{description}"
    codeflash_output = get_return_docstring(doc) # 17.2μs -> 11.6μs (48.7% faster)

def test_large_scale_returns_with_unicode():
    # Large docstring with unicode characters in the return description
    description = "✓" * 999
    doc = f"Returns: {description}\nEnd."
    codeflash_output = get_return_docstring(doc) # 30.5μs -> 25.5μs (19.7% faster)

def test_large_scale_returns_with_mixed_whitespace():
    # Large docstring with mixed whitespace after colon
    description = "A" * 999
    doc = f"Returns:\t  {description}\nEnd."
    codeflash_output = get_return_docstring(doc) # 24.1μs -> 19.1μs (26.0% faster)

def test_large_scale_returns_with_multiple_returns_and_large_description():
    # Multiple 'Returns:' sections, large description, should match the first
    doc = "Returns: " + "A" * 500 + "\nReturns: " + "B" * 500 + "\nEnd."
    codeflash_output = get_return_docstring(doc) # 17.1μs -> 11.6μs (47.7% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes, run `git checkout codeflash/optimize-get_return_docstring-mhwvp5kr` and push.

@codeflash-ai codeflash-ai bot requested a review from mashraf-222 November 13, 2025 03:37
@codeflash-ai codeflash-ai bot added the ⚡️ codeflash and 🎯 Quality: High labels Nov 13, 2025