feat: Implement end-to-end automated Java test generation pipeline #1

Zapper9982 · 2025-06-13T06:21:31Z

This commit introduces a comprehensive system for automated Java JUnit 5 test case generation using Large Language Models (LLMs), with a focus on iterative improvement based on JaCoCo code coverage reports.

Key Features:

- Pre-processing: I remove comments and prepare your Java code for analysis.
- Code Analysis: I identify Spring Boot controllers and services as targets for test generation.
- LangChain Integration: I chunk your source code, generate embeddings (BAAI/bge-small-en-v1.5), and store them in ChromaDB.
- LLM-Powered Test Generation: I use Google's Gemini-1.5-flash model via LangChain's RetrievalQA to generate JUnit 5 test cases. When generated tests fail, I retry generation with feedback from the failure.
- JaCoCo-Based Iteration:
  - I execute the generated tests using Maven or Gradle.
  - I parse JaCoCo XML reports to determine overall line coverage and method-specific coverage.
  - If coverage is below a configurable target (default 90%), I identify under-tested methods.
  - I then refine the prompts sent to the LLM so that subsequent iterations focus on these specific methods.
  - This process repeats for a configurable maximum number of iterations.
- Configuration: Key parameters such as the Spring Boot project root, Google API key, build tool, max iterations, and target coverage are configurable via environment variables.
- Central Orchestration: `src/main.py` manages the entire pipeline flow.
- GitHub Actions Workflow: A CI workflow (`.github/workflows/coverage_check.yml`) that:
  - Runs the test generation pipeline on pushes and pull requests.
  - Fails PRs if the target code coverage is not met.
  - Requires `GOOGLE_API_KEY` to be set as a repository secret.
- Documentation:
  - `README.md` has been updated with detailed instructions on setup, configuration, and execution.
  - A `run.sh` script is provided as a helper to set environment variables and launch the pipeline.
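
For reviewers who have not worked with JaCoCo XML reports, the coverage-parsing step described under JaCoCo-Based Iteration can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the function names `line_ratio` and `summarize` and the inline `SAMPLE_REPORT` are hypothetical. The sketch relies only on the `<counter type="LINE" missed="…" covered="…"/>` elements that JaCoCo writes at report, class, and method level.

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written report fragment in the JaCoCo XML layout.
SAMPLE_REPORT = """<report name="demo">
  <package name="com/example">
    <class name="com/example/UserService" sourcefilename="UserService.java">
      <method name="findUser" desc="(J)Lcom/example/User;" line="12">
        <counter type="LINE" missed="0" covered="4"/>
      </method>
      <method name="deleteUser" desc="(J)V" line="20">
        <counter type="LINE" missed="3" covered="1"/>
      </method>
      <counter type="LINE" missed="3" covered="5"/>
    </class>
    <counter type="LINE" missed="3" covered="5"/>
  </package>
  <counter type="LINE" missed="3" covered="5"/>
</report>"""


def line_ratio(element):
    """Return covered / (covered + missed) from the element's own LINE counter.

    Only direct <counter> children are inspected, so calling this on the
    report root yields the overall figure rather than a nested one.
    Returns None when there is no LINE counter or it counts zero lines.
    """
    for counter in element.findall("counter"):
        if counter.get("type") == "LINE":
            missed = int(counter.get("missed"))
            covered = int(counter.get("covered"))
            total = missed + covered
            return covered / total if total else None
    return None


def summarize(report_xml, target=0.90):
    """Return (overall_line_coverage, under_tested_methods).

    under_tested_methods is a list of (class_name, method_name, ratio)
    tuples for every method whose line coverage is below `target`.
    """
    root = ET.fromstring(report_xml)
    overall = line_ratio(root)
    under_tested = []
    for cls in root.iter("class"):
        class_name = cls.get("name").replace("/", ".")
        for method in cls.findall("method"):
            ratio = line_ratio(method)
            if ratio is not None and ratio < target:
                under_tested.append((class_name, method.get("name"), ratio))
    return overall, under_tested


if __name__ == "__main__":
    overall, under_tested = summarize(SAMPLE_REPORT)
    print(f"overall line coverage: {overall:.1%}")
    for class_name, method_name, ratio in under_tested:
        print(f"under-tested: {class_name}.{method_name} ({ratio:.0%})")
```

In a layout like this, the list of under-tested methods is what would be fed back into the next round of prompts.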

The system is designed to significantly reduce your manual effort in writing unit tests and help maintain high code coverage standards.
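
To make the control flow concrete, here is a rough sketch of how the environment-driven configuration and the coverage loop could fit together. Everything named here is an assumption for illustration: the environment variable names, the `PipelineConfig` fields, the default of three iterations, and the `generate_tests` / `measure_coverage` callables are hypothetical stand-ins, not the identifiers used in `src/main.py`. Only the 90% default target comes from the description above.

```python
import os
from dataclasses import dataclass


@dataclass
class PipelineConfig:
    project_root: str
    build_tool: str
    max_iterations: int
    target_coverage: float  # fraction in [0, 1]


def load_config(env=os.environ):
    """Read settings from a mapping of environment variables.

    The variable names and the max-iterations default are hypothetical.
    """
    return PipelineConfig(
        project_root=env.get("SPRING_BOOT_PROJECT_ROOT", "."),
        build_tool=env.get("BUILD_TOOL", "maven"),
        max_iterations=int(env.get("MAX_ITERATIONS", "3")),
        target_coverage=float(env.get("TARGET_COVERAGE", "0.9")),
    )


def run_pipeline(config, generate_tests, measure_coverage):
    """Generate tests, measure coverage, and repeat until the target is met.

    generate_tests(focus_methods) writes or refines tests; the list is empty
    on the first pass. measure_coverage() runs the build and returns
    (overall_coverage, under_tested_methods).
    Returns (target_met, iterations_used, last_coverage).
    """
    focus_methods = []
    coverage = 0.0
    for iteration in range(1, config.max_iterations + 1):
        generate_tests(focus_methods)
        coverage, focus_methods = measure_coverage()
        if coverage >= config.target_coverage:
            return True, iteration, coverage
    return False, config.max_iterations, coverage


if __name__ == "__main__":
    # Stubbed run: coverage improves from 60% to 93% across two rounds.
    fake_results = iter([(0.60, ["UserService.deleteUser"]), (0.93, [])])
    met, rounds, final = run_pipeline(
        load_config({}),
        generate_tests=lambda focus: None,
        measure_coverage=lambda: next(fake_results),
    )
    print(met, rounds, final)
```

Passing the two steps in as callables keeps the loop testable without an LLM or a build tool. A CI job could map the first element of the returned tuple to its exit code so that a pull request fails when the target is not reached.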


Zapper9982 closed this Jun 13, 2025
