Conversation

@KRRT7
Collaborator

@KRRT7 KRRT7 commented Dec 8, 2025

User description

Earlier we were doing it all in self.generate_and_instrument_tests, which was misleading.


PR Type

Enhancement


Description

  • Split test and optimization generation

  • Run tests and optimizations in parallel

  • Adjust return types and error flow

  • Add dedicated optimization generator


Diagram Walkthrough

flowchart LR
  A["perform_optimization.sync"] -- "submit" --> B["generate_and_instrument_tests"]
  A -- "submit" --> C["generate_optimizations"]
  B -- "tests + concolic" --> D["baseline/setup"]
  C -- "OptimizationSet + references" --> D

File Walkthrough

Relevant files
Enhancement
perform_optimization.py
LSP: parallelize test and optimization execution                 

codeflash/lsp/features/perform_optimization.py

  • Submit test and optimization tasks in parallel
  • Wait and merge results with error checks
  • Remove optimization data from test result unpacking
  • Pass code context pieces to optimization generator
+22/-4   
function_optimizer.py
Decouple and parallelize test and optimization generation

codeflash/optimization/function_optimizer.py

  • Split test generation into new generate_tests
  • Add generate_optimizations returning OptimizationSet, references
  • Update generate_and_instrument_tests signature and returns
  • Parallelize in optimize_function; adjust flows and logging
+102/-98

@github-actions

github-actions bot commented Dec 8, 2025

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 3 🔵🔵🔵⚪⚪
🧪 No relevant tests
🔒 No security concerns identified
⚡ Recommended focus areas for review

Error Propagation

The futures are waited on and then .result() is called without guarding for exceptions; if either background task raises, the exception will propagate and bypass the structured Result error handling. Validate exception handling around the future results, or have the tasks return wrapped Results.

future_tests = function_optimizer.executor.submit(
    function_optimizer.generate_and_instrument_tests, code_context
)
future_optimizations = function_optimizer.executor.submit(
    function_optimizer.generate_optimizations,
    read_writable_code=code_context.read_writable_code,
    read_only_context_code=code_context.read_only_context_code,
    run_experiment=should_run_experiment,
)

concurrent.futures.wait([future_tests, future_optimizations])

test_setup_result = future_tests.result()
optimization_result = future_optimizations.result()

abort_if_cancelled(cancel_event)
if not is_successful(test_setup_result):
    return {"functionName": params.functionName, "status": "error", "message": test_setup_result.failure()}
if not is_successful(optimization_result):
    return {"functionName": params.functionName, "status": "error", "message": optimization_result.failure()}
Progress Output

The progress bar now wraps the parallel generation; console.rule() is called before and after the futures, but if an exception is raised before success, the UI may be left in an inconsistent state. Consider a try/finally so the separators are always closed/printed.

with progress_bar(
    f"Generating new tests and optimizations for function '{self.function_to_optimize.function_name}'",
    transient=True,
    revert_to_print=bool(get_pr_number()),
):
    console.rule()
    # Generate tests and optimizations in parallel
    future_tests = self.executor.submit(self.generate_and_instrument_tests, code_context)
    future_optimizations = self.executor.submit(
        self.generate_optimizations,
        read_writable_code=code_context.read_writable_code,
        read_only_context_code=code_context.read_only_context_code,
        run_experiment=should_run_experiment,
    )

    concurrent.futures.wait([future_tests, future_optimizations])

    test_setup_result = future_tests.result()
    optimization_result = future_optimizations.result()
    console.rule()
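
A possible shape for that guard, reusing the names from the snippet above (a sketch only, not the implemented code):

    console.rule()
    try:
        future_tests = self.executor.submit(self.generate_and_instrument_tests, code_context)
        future_optimizations = self.executor.submit(
            self.generate_optimizations,
            read_writable_code=code_context.read_writable_code,
            read_only_context_code=code_context.read_only_context_code,
            run_experiment=should_run_experiment,
        )
        concurrent.futures.wait([future_tests, future_optimizations])
        test_setup_result = future_tests.result()
        optimization_result = future_optimizations.result()
    finally:
        # Always close the visual block, even if generation raises
        console.rule()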
ThreadPool Sizing

Executor max_workers is derived from the number of tests, and the pool now also handles the parallel optimization tasks; ensure this sizing is sufficient and does not oversubscribe, leading to starvation when many futures are submitted at once (tests + concolic + optimization + references).

self.executor = concurrent.futures.ThreadPoolExecutor(
    max_workers=n_tests + 3 if self.experiment_id is None else n_tests + 4
)
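
One way to make the accounting explicit (a sketch; only n_tests and experiment_id come from the snippet above, the task counts follow the breakdown named in the focus area):

# Count every future that may be in flight at once so none starve.
n_concolic_tasks = 1       # concolic test generation
n_optimization_tasks = 1   # control optimization candidates
n_reference_tasks = 1      # function references / review metrics
n_experiment_tasks = 0 if self.experiment_id is None else 1  # experiment candidates

self.executor = concurrent.futures.ThreadPoolExecutor(
    max_workers=n_tests + n_concolic_tasks + n_optimization_tasks + n_reference_tasks + n_experiment_tasks
)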

@github-actions

github-actions bot commented Dec 8, 2025

PR Code Suggestions ✨

Explore these optional code suggestions:

Category | Suggestion | Impact
Possible issue
Make parallel futures failure-safe

Protect the parallel futures with try/except and cancel the sibling future if one fails, to prevent deadlocks or unhandled exceptions. Also use return_when=FIRST_EXCEPTION to react to failures faster and avoid waiting unnecessarily.

codeflash/lsp/features/perform_optimization.py [50-63]

 future_tests = function_optimizer.executor.submit(
     function_optimizer.generate_and_instrument_tests, code_context
 )
 future_optimizations = function_optimizer.executor.submit(
     function_optimizer.generate_optimizations,
     read_writable_code=code_context.read_writable_code,
     read_only_context_code=code_context.read_only_context_code,
     run_experiment=should_run_experiment,
 )
 
-concurrent.futures.wait([future_tests, future_optimizations])
+done, not_done = concurrent.futures.wait(
+    [future_tests, future_optimizations],
+    return_when=concurrent.futures.FIRST_EXCEPTION,
+)
+try:
+    # Propagate exception from whichever completed with error
+    for fut in done:
+        _ = fut.result()
+    # If both succeeded, collect results
+    test_setup_result = future_tests.result()
+    optimization_result = future_optimizations.result()
+except Exception:
+    # Cancel any remaining work to avoid dangling tasks
+    for fut in not_done:
+        fut.cancel()
+    raise
 
-test_setup_result = future_tests.result()
-optimization_result = future_optimizations.result()
-
Suggestion importance[1-10]: 7


Why: The proposal correctly targets the new parallel futures block and adds robust exception handling to avoid hangs and propagate failures promptly. It's a solid reliability improvement, though not strictly critical to correctness of the new logic.

Medium
Handle futures exceptions robustly

Use FIRST_EXCEPTION and guard .result() calls to avoid blocking if one future
errors. If concolic generation fails, return a clear Failure instead of raising,
keeping the Result contract consistent.

codeflash/optimization/function_optimizer.py [1102-1164]

-def generate_tests(
-    self,
-    testgen_context: CodeStringsMarkdown,
-    helper_functions: list[FunctionSource],
-    generated_test_paths: list[Path],
-    generated_perf_test_paths: list[Path],
-) -> Result[tuple[int, GeneratedTestsList, dict[str, set[FunctionCalledInTest]], str], str]:
-    """Generate unit tests and concolic tests for the function."""
-    n_tests = N_TESTS_TO_GENERATE_EFFECTIVE
-    assert len(generated_test_paths) == n_tests
+# Wait and short-circuit on first exception
+done, not_done = concurrent.futures.wait(
+    [*future_tests, future_concolic_tests],
+    return_when=concurrent.futures.FIRST_EXCEPTION,
+)
 
-    # Submit test generation tasks
-    future_tests = self.submit_test_generation_tasks(
-        self.executor,
-        testgen_context.markdown,
-        [definition.fully_qualified_name for definition in helper_functions],
-        generated_test_paths,
-        generated_perf_test_paths,
-    )
-
-    future_concolic_tests = self.executor.submit(
-        generate_concolic_tests, self.test_cfg, self.args, self.function_to_optimize, self.function_to_optimize_ast
-    )
-
-    # Wait for test futures to complete
-    concurrent.futures.wait([*future_tests, future_concolic_tests])
-
-    # Process test generation results
-    tests: list[GeneratedTests] = []
-    for future in future_tests:
-        res = future.result()
+tests: list[GeneratedTests] = []
+try:
+    for fut in future_tests:
+        res = fut.result()
         if res:
-            ...
+            (
+                generated_test_source,
+                instrumented_behavior_test_source,
+                instrumented_perf_test_source,
+                test_behavior_path,
+                test_perf_path,
+            ) = res
+            tests.append(
+                GeneratedTests(
+                    generated_original_test_source=generated_test_source,
+                    instrumented_behavior_test_source=instrumented_behavior_test_source,
+                    instrumented_perf_test_source=instrumented_perf_test_source,
+                    behavior_file_path=test_behavior_path,
+                    perf_file_path=test_perf_path,
+                )
+            )
     if not tests:
         logger.warning(f"Failed to generate and instrument tests for {self.function_to_optimize.function_name}")
         return Failure(f"/!\\ NO TESTS GENERATED for {self.function_to_optimize.function_name}")
 
-    function_to_concolic_tests, concolic_test_str = future_concolic_tests.result()
+    try:
+        function_to_concolic_tests, concolic_test_str = future_concolic_tests.result()
+    except Exception as e:
+        return Failure(f"Concolic test generation failed: {e}")
+except Exception as e:
+    for fut in not_done:
+        fut.cancel()
+    return Failure(f"Test generation failed: {e}")
Suggestion importance[1-10]: 7


Why: This refines the new generate_tests method by using FIRST_EXCEPTION and converting exceptions to Failure, aligning with the Result contract. It’s contextually accurate and improves resilience without altering behavior semantics.

Medium
Prevent hangs on optimization failures

Avoid hanging when one future fails by using FIRST_EXCEPTION and cancelling any remaining futures. Also handle exceptions from the experimental/control candidates and the references, returning a Failure with context instead of propagating raw exceptions.

codeflash/optimization/function_optimizer.py [1174-1229]

-def generate_optimizations(
-    self,
-    read_writable_code: CodeStringsMarkdown,
-    read_only_context_code: str,
-    run_experiment: bool = False,  # noqa: FBT001, FBT002
-) -> Result[tuple[OptimizationSet, str], str]:
-    """Generate optimization candidates for the function."""
-    n_candidates = N_CANDIDATES_EFFECTIVE
-
-    future_optimization_candidates = self.executor.submit(
-        self.aiservice_client.optimize_python_code,
+futures = [future_optimization_candidates, future_references]
+future_candidates_exp = None
+if run_experiment:
+    future_candidates_exp = self.executor.submit(
+        self.local_aiservice_client.optimize_python_code,
         read_writable_code.markdown,
         read_only_context_code,
-        self.function_trace_id[:-4] + "EXP0" if run_experiment else self.function_trace_id,
+        self.function_trace_id[:-4] + "EXP1",
         n_candidates,
-        ExperimentMetadata(id=self.experiment_id, group="control") if run_experiment else None,
+        ExperimentMetadata(id=self.experiment_id, group="experiment"),
         is_async=self.function_to_optimize.is_async,
     )
-    ...
-    future_references = self.executor.submit(
-        get_opt_review_metrics,
-        self.function_to_optimize_source_code,
-        self.function_to_optimize.file_path,
-        self.function_to_optimize.qualified_name,
-        self.project_root,
-        self.test_cfg.tests_root,
-    )
+    futures.append(future_candidates_exp)
 
-    futures = [future_optimization_candidates, future_references]
-    future_candidates_exp = None
-
-    if run_experiment:
-        future_candidates_exp = self.executor.submit(
-            self.local_aiservice_client.optimize_python_code,
-            read_writable_code.markdown,
-            read_only_context_code,
-            self.function_trace_id[:-4] + "EXP1",
-            n_candidates,
-            ExperimentMetadata(id=self.experiment_id, group="experiment"),
-            is_async=self.function_to_optimize.is_async,
-        )
-        futures.append(future_candidates_exp)
-
-    # Wait for optimization futures to complete
-    concurrent.futures.wait(futures)
-
-    # Retrieve results
+done, not_done = concurrent.futures.wait(futures, return_when=concurrent.futures.FIRST_EXCEPTION)
+try:
     candidates: list[OptimizedCandidate] = future_optimization_candidates.result()
-    logger.info(f"lsp|Generated '{len(candidates)}' candidate optimizations.")
-    ...
+    if not candidates:
+        return Failure(f"/!\\ NO OPTIMIZATIONS GENERATED for {self.function_to_optimize.function_name}")
     candidates_experiment = future_candidates_exp.result() if future_candidates_exp else None
     function_references = future_references.result()
+except Exception as e:
+    for fut in not_done:
+        fut.cancel()
+    return Failure(f"Optimization generation failed: {e}")
 
+return Success((OptimizationSet(control=candidates, experiment=candidates_experiment), function_references))
+
Suggestion importance[1-10]: 7


Why: The change directly applies to the new generate_optimizations logic, adding early-exit on exceptions and cancellation of pending futures; this enhances robustness and error reporting while fitting the PR’s structure.

Medium

@KRRT7 KRRT7 enabled auto-merge (squash) December 8, 2025 11:51
@KRRT7
Collaborator Author

KRRT7 commented Dec 8, 2025

@mohammedahmed18 should be good to go now

@KRRT7
Collaborator Author

KRRT7 commented Dec 8, 2025

@mohammedahmed18 any way we can have some sort of unit test that could catch these errors from our CI?

@mohammedahmed18
Copy link
Contributor

any way we can have some sort of unit test that could catch these errors from our CI?

It will be tricky to catch the correct order of the logs; I guess the best way is to manually test it on the extension.

@KRRT7 KRRT7 merged commit b7b82ee into main Dec 8, 2025
21 of 22 checks passed
@KRRT7 KRRT7 deleted the split-test-gens-and-optimizations branch December 8, 2025 23:56
