Add comprehensive integration test suite and CI workflow for gh-models CLI #90
- Implement tests for Float32Ptr to validate pointer creation for float32 values.
- Create tests for ExtractJSON to ensure correct extraction of JSON from various input formats.
- Add tests for cleanJavaScriptStringConcat to verify string concatenation handling in JavaScript context.
- Introduce tests for StringSliceContains to check for string presence in slices.
- Implement tests for MergeStringMaps to validate merging behavior of multiple string maps, including overwrites and handling of nil/empty maps.
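A minimal sketch of how three of the helpers named above might behave, to make the test descriptions easier to picture. The signatures and bodies are assumptions inferred from the helper names and the commit message, not the repository's actual code.

```go
package main

import "fmt"

// Float32Ptr returns a pointer to a copy of v (assumed signature).
func Float32Ptr(v float32) *float32 { return &v }

// StringSliceContains reports whether target occurs in items (assumed signature).
func StringSliceContains(items []string, target string) bool {
	for _, s := range items {
		if s == target {
			return true
		}
	}
	return false
}

// MergeStringMaps merges maps left to right (assumed semantics): later maps
// overwrite earlier keys, nil or empty maps contribute nothing, and the
// result is never nil.
func MergeStringMaps(maps ...map[string]string) map[string]string {
	out := make(map[string]string)
	for _, m := range maps {
		for k, v := range m { // ranging over a nil map is a no-op
			out[k] = v
		}
	}
	return out
}

func main() {
	merged := MergeStringMaps(
		map[string]string{"a": "1", "b": "2"},
		nil,
		map[string]string{"b": "3"},
	)
	fmt.Println(merged["a"], merged["b"], len(merged)) // prints "1 3 2"
	fmt.Println(StringSliceContains([]string{"x", "y"}, "y")) // prints "true"
	fmt.Println(*Float32Ptr(0.5)) // prints "0.5"
}
```

Returning a fresh map rather than mutating the first argument keeps the helper safe to call with shared inputs, which is the behavior the "overwrites and nil/empty maps" test cases suggest.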
…ove unused ChatMessage type
…Pex context conversion
… tests in export_test.go
- Changed modelParams from pointer to value in toGitHubModelsPrompt function for better clarity and safety.
- Updated the assignment of ModelParameters to use the value directly instead of dereferencing a pointer.
- Introduced a new test suite in export_test.go to cover various scenarios for GitHub models evaluation generation, including edge cases and expected outputs.
- Ensured that the tests validate the correct creation of files and their contents based on the provided context and options.
- Added NewPromptPex function to create a new PromptPex instance.
- Implemented Run method to execute the PromptPex pipeline with context management.
- Created context from prompt files or loaded existing context from JSON.
- Developed pipeline steps including intent generation, input specification, output rules, and tests.
- Added functionality for generating groundtruth outputs and evaluating test results.
- Implemented test expansion and rating features for improved test coverage.
- Introduced error handling and logging throughout the pipeline execution.
- Implemented TestCreateContext to validate various prompt YAML configurations and their expected context outputs.
- Added TestCreateContextRunIDUniqueness to ensure unique RunIDs are generated for multiple context creations.
- Created TestCreateContextWithNonExistentFile to handle cases where the prompt file does not exist.
- Developed TestCreateContextPromptValidation to check for valid and invalid prompt formats.
- Introduced TestGithubModelsEvalsGenerate to test the generation of GitHub Models eval files with various scenarios.
- Added TestToGitHubModelsPrompt to validate the conversion of prompts to GitHub Models format.
- Implemented TestExtractTemplateVariables and TestExtractVariablesFromText to ensure correct extraction of template variables.
- Created TestGetMapKeys and TestGetTestScenario to validate utility functions related to maps and test scenarios.
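To illustrate what a template-variable extraction test might exercise, here is a small sketch. It assumes a `{{name}}` placeholder syntax and uses a hypothetical `extractTemplateVariables` helper written for this example; the repository's real function may differ in name, signature, and matching rules.

```go
package main

import (
	"fmt"
	"regexp"
)

// placeholderRe matches {{ name }} with optional inner whitespace.
// The placeholder syntax is an assumption made for this sketch.
var placeholderRe = regexp.MustCompile(`\{\{\s*([A-Za-z_][A-Za-z0-9_]*)\s*\}\}`)

// extractTemplateVariables (hypothetical) returns each distinct variable
// name found in text, in order of first appearance.
func extractTemplateVariables(text string) []string {
	seen := make(map[string]bool)
	var names []string
	for _, match := range placeholderRe.FindAllStringSubmatch(text, -1) {
		name := match[1]
		if !seen[name] {
			seen[name] = true
			names = append(names, name)
		}
	}
	return names
}

func main() {
	vars := extractTemplateVariables("Hello {{name}}, rate {{ text }} for {{name}}")
	fmt.Println(vars) // prints "[name text]"
}
```

Deduplicating while preserving first-seen order gives a deterministic result, which makes table-driven tests over many prompt texts straightforward to write.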
…tPex configuration
… summary generation
… improved summary reporting
…se and restore its implementation; remove obsolete promptpex.go and summary_test.go files
…covering various scenarios and error handling
…entiment analysis test prompt
…neFlags function and update flag parsing to use consistent naming
… in generate_test.go
…ck responses for sentiment analysis stages
…odology for test generation
…derMessagesToString for message formatting
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
pelikhan
left a comment
@copilot rebase to main
Makefile
Outdated
        rm -rf ./gh-models
    .PHONY: clean

    integration-test:
add dependency on build step
Added build as a proper dependency to the integration-test target. The target now correctly depends on build rather than calling it explicitly. (0d67cc7)
    @@ -0,0 +1,42 @@
    # Integration Tests
Move this section to contributing.md
Moved the integration testing documentation from integration/README.md to CONTRIBUTING.md. The integration/README.md now contains just a brief description with a reference to the full documentation. (0d67cc7)
… docs Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
I'm unable to perform a rebase operation due to system limitations that prevent force push operations. The changes in this PR can be rebased manually if needed, or the changes can be merged as-is to preserve the commit history.
pelikhan
left a comment
@copilot Add GitHub action workflow step that builds and runs the integration tests only. Runs on pushes to main. Make sure to set the models: read permission
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Added GitHub Actions integration test workflow in …
@copilot rebase branch
I'm unable to perform a rebase operation due to system limitations that prevent force push operations. The branch contains the integration test suite commits (59e0381, 0d67cc7, 969b0a2) on top of the current main branch. If you'd like to rebase, this would need to be done manually using standard git rebase commands.
This PR adds a comprehensive integration test suite that validates the `gh-models` CLI executable against live LLM endpoints using process execution rather than mocks, plus automated CI integration.

Problem
The existing test suite only included unit tests with mocked clients, which didn't validate:

Solution
Created a new integration test suite in `integration/` that:

🏗️ Architecture
- `//go:build integration` build tags to exclude from regular test runs
- `./gh-models` binary via `os/exec`
- `make integration-test` command with proper build dependency

🧪 Test Coverage
Basic Commands (`integration_test.go`):
File Operations (`file_modification_test.go`):
- … `generate` command
Authentication Scenarios (`authenticated_test.go`):
- … `gh auth login` is configured

🚀 CI Integration
GitHub Actions Workflow (`.github/workflows/integration.yml`):
- `models: read` permission for GitHub Models API
- `make integration-test` to build binary and run integration tests

🎯 Key Features
File Modification Validation:
Live Endpoint Ready:
- … `gh auth login` is configured

🔒 Safety & CI Integration
The integration test suite provides confidence that the CLI binary works correctly in real-world scenarios while maintaining the fast feedback loop of existing unit tests, with automated validation on every main branch update.
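The process-execution approach described above can be sketched as a small helper that launches a binary with `os/exec` and captures what a test would assert on. This is an illustration under stated assumptions, not the PR's actual code: `runCLI` and `cliResult` are hypothetical names, and the `--help` flag is assumed only because it should need no network access. In a real integration test file, the `//go:build integration` constraint would sit on the first line so the file is skipped unless tests are run with `-tags integration`.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"os"
	"os/exec"
	"strings"
)

// cliResult (hypothetical) holds the observable outcome of one CLI invocation.
type cliResult struct {
	Stdout   string
	Stderr   string
	ExitCode int
}

// runCLI (hypothetical) executes binary with args as a child process.
// A non-zero exit status is reported through ExitCode; an error is returned
// only when the process could not be started or waited on.
func runCLI(binary string, args ...string) (cliResult, error) {
	var stdout, stderr bytes.Buffer
	cmd := exec.Command(binary, args...)
	cmd.Stdout = &stdout
	cmd.Stderr = &stderr

	err := cmd.Run()
	res := cliResult{
		Stdout: strings.TrimSpace(stdout.String()),
		Stderr: strings.TrimSpace(stderr.String()),
	}
	var exitErr *exec.ExitError
	if errors.As(err, &exitErr) {
		res.ExitCode = exitErr.ExitCode()
		return res, nil
	}
	return res, err
}

func main() {
	const binary = "./gh-models" // path taken from the PR description
	if _, err := os.Stat(binary); err != nil {
		fmt.Println("binary not found; build it before running integration checks")
		return
	}
	// "--help" is an assumed flag for this sketch.
	res, err := runCLI(binary, "--help")
	if err != nil {
		fmt.Println("could not start process:", err)
		return
	}
	fmt.Printf("exit=%d stdout=%d bytes\n", res.ExitCode, len(res.Stdout))
}
```

Separating "the process exited non-zero" from "the process never started" lets a test assert on expected failures (for example, an unauthenticated run) without conflating them with a missing binary.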