Copilot AI commented Aug 3, 2025

This PR adds a comprehensive integration test suite that validates the gh-models CLI executable against live LLM endpoints using process execution rather than mocks, plus automated CI integration.

Problem

The existing test suite only included unit tests with mocked clients, which didn't validate:

  • The actual compiled binary behavior
  • Real process execution and exit codes
  • File modification scenarios (e.g., prompt file augmentation)
  • Live endpoint integration when authenticated
  • End-to-end command workflows

Solution

Created a new integration test suite in integration/, organized as follows:

🏗️ Architecture

  • Uses //go:build integration build tags to exclude from regular test runs
  • Executes the actual compiled ./gh-models binary via os/exec
  • Provides dedicated make integration-test command with proper build dependency
  • Integration testing documentation moved to CONTRIBUTING.md for better discoverability
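The process-execution approach described above can be sketched roughly as follows. This is a minimal illustration, not the helper used in this PR: the `result` type and `runBinary` function are hypothetical names, and only the `./gh-models` path and the `--help` flag come from the description. In a real test file the `//go:build integration` constraint would sit on the first line, which is omitted here so the sketch compiles without extra tags.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"os/exec"
)

// result holds what a single process invocation produced.
// (Hypothetical type, introduced only for this sketch.)
type result struct {
	Stdout   string
	Stderr   string
	ExitCode int
}

// runBinary executes the binary at path with args and captures its
// output and exit code. A non-zero exit is reported through ExitCode
// rather than as an error; an error is returned for other failures,
// such as a missing binary.
func runBinary(path string, args ...string) (result, error) {
	var stdout, stderr bytes.Buffer
	cmd := exec.Command(path, args...)
	cmd.Stdout = &stdout
	cmd.Stderr = &stderr

	err := cmd.Run()
	res := result{Stdout: stdout.String(), Stderr: stderr.String()}

	var exitErr *exec.ExitError
	switch {
	case err == nil:
		res.ExitCode = 0
	case errors.As(err, &exitErr):
		// The process started and exited with a non-zero status.
		res.ExitCode = exitErr.ExitCode()
	default:
		return res, err
	}
	return res, nil
}

func main() {
	// Assumes the binary was built into the current directory.
	res, err := runBinary("./gh-models", "--help")
	if err != nil {
		fmt.Println("could not run binary:", err)
		return
	}
	fmt.Printf("exit=%d stdout=%d bytes\n", res.ExitCode, len(res.Stdout))
}
```

Separating "the process exited non-zero" from "the process could not be run" lets a test assert on an expected failure code without mistaking a missing build artifact for a passing negative case.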

🧪 Test Coverage

Basic Commands (integration_test.go):

  • Help functionality and usage information
  • Error handling for unauthenticated scenarios
  • Command argument validation
  • Exit code verification

File Operations (file_modification_test.go):

  • Prompt file parsing and validation
  • File modification tracking for generate command
  • Template variable handling
  • Output format validation (JSON/default)
  • File system interaction (permissions, working directory independence)
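As one illustration of the output format checks listed above, a test could confirm that captured stdout parses as JSON before inspecting any fields. The sketch below is an assumption about how such a check might look; `checkJSONOutput` is a hypothetical helper and the sample payload is invented for the example, not real gh-models output.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"strings"
)

// checkJSONOutput returns an error unless stdout, after trimming
// surrounding whitespace, is a single valid JSON document.
// (Hypothetical helper, not taken from the PR.)
func checkJSONOutput(stdout string) error {
	trimmed := strings.TrimSpace(stdout)
	if trimmed == "" {
		return errors.New("expected JSON output, got empty output")
	}
	if !json.Valid([]byte(trimmed)) {
		return errors.New("output is not valid JSON")
	}
	return nil
}

func main() {
	// Invented sample payloads, for illustration only.
	samples := []string{
		`[{"id": "example/model"}]`,
		"ID    NAME\nexample/model  Example",
	}
	for _, s := range samples {
		fmt.Printf("valid JSON: %v\n", checkJSONOutput(s) == nil)
	}
}
```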

Authentication Scenarios (authenticated_test.go):

  • Graceful handling of unauthenticated scenarios
  • Ready for live endpoint testing when gh auth login is configured
  • Error message validation and user guidance

🚀 CI Integration

GitHub Actions Workflow (.github/workflows/integration.yml):

  • Runs automatically on pushes to main branch
  • Includes required models: read permission for GitHub Models API
  • Uses make integration-test to build binary and run integration tests
  • Follows existing workflow patterns with proper concurrency handling

🎯 Key Features

# Integration tests are excluded from regular runs
make test        # Runs unit tests only (no integration package)

# Dedicated command for integration testing with build dependency
make integration-test   # Automatically builds binary + runs integration tests

# Exit code validation
./gh-models --help      # Exit code 0 (success)
./gh-models list        # Exit code 1 (auth required)

File Modification Validation:

// Track file changes for generate command
originalSize := getFileSize(promptFile)
runCommand("generate", promptFile)
newSize := getFileSize(promptFile)
// Validate file was augmented with test data

Live Endpoint Ready:

  • Tests detect authentication status automatically
  • Skip live tests when unauthenticated (CI-friendly)
  • Full validation when gh auth login is configured
  • Monitor actual API responses and file modifications
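How the tests detect authentication is not shown in this conversation. One simple approach, sketched below purely as an assumption, is to look for a token in the `GH_TOKEN` or `GITHUB_TOKEN` environment variables, which the gh CLI reads for credentials. The `hasToken` helper is a hypothetical name, and this check would not notice credentials stored only by `gh auth login`.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// tokenEnvVars lists environment variables that the gh CLI consults
// for credentials. Checking only these is an assumption of this sketch.
var tokenEnvVars = []string{"GH_TOKEN", "GITHUB_TOKEN"}

// hasToken reports whether any credential variable is set to a
// non-blank value. The lookup function is injected so the check can
// be exercised without touching the real environment.
func hasToken(getenv func(string) string) bool {
	for _, name := range tokenEnvVars {
		if strings.TrimSpace(getenv(name)) != "" {
			return true
		}
	}
	return false
}

func main() {
	if hasToken(os.Getenv) {
		fmt.Println("token found: live endpoint tests would run")
	} else {
		fmt.Println("no token found: live endpoint tests would be skipped")
	}
}
```

In a test function, a false result would typically gate a call to `t.Skip` so that unauthenticated runs report skipped tests instead of failures.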

🔒 Safety & CI Integration

  • Build tag isolation prevents accidental execution
  • Graceful skipping when authentication unavailable
  • No impact on existing development workflows
  • Comprehensive error handling and logging
  • Automated CI execution on main branch pushes

The integration test suite provides confidence that the CLI binary works correctly in real-world scenarios while maintaining the fast feedback loop of existing unit tests, with automated validation on every main branch update.


pelikhan added 30 commits July 21, 2025 13:41
- Implement tests for Float32Ptr to validate pointer creation for float32 values.
- Create tests for ExtractJSON to ensure correct extraction of JSON from various input formats.
- Add tests for cleanJavaScriptStringConcat to verify string concatenation handling in JavaScript context.
- Introduce tests for StringSliceContains to check for string presence in slices.
- Implement tests for MergeStringMaps to validate merging behavior of multiple string maps, including overwrites and handling of nil/empty maps.
… tests in export_test.go

- Changed modelParams from pointer to value in toGitHubModelsPrompt function for better clarity and safety.
- Updated the assignment of ModelParameters to use the value directly instead of dereferencing a pointer.
- Introduced a new test suite in export_test.go to cover various scenarios for GitHub models evaluation generation, including edge cases and expected outputs.
- Ensured that the tests validate the correct creation of files and their contents based on the provided context and options.
- Added NewPromptPex function to create a new PromptPex instance.
- Implemented Run method to execute the PromptPex pipeline with context management.
- Created context from prompt files or loaded existing context from JSON.
- Developed pipeline steps including intent generation, input specification, output rules, and tests.
- Added functionality for generating groundtruth outputs and evaluating test results.
- Implemented test expansion and rating features for improved test coverage.
- Introduced error handling and logging throughout the pipeline execution.
- Implemented TestCreateContext to validate various prompt YAML configurations and their expected context outputs.
- Added TestCreateContextRunIDUniqueness to ensure unique RunIDs are generated for multiple context creations.
- Created TestCreateContextWithNonExistentFile to handle cases where the prompt file does not exist.
- Developed TestCreateContextPromptValidation to check for valid and invalid prompt formats.
- Introduced TestGithubModelsEvalsGenerate to test the generation of GitHub Models eval files with various scenarios.
- Added TestToGitHubModelsPrompt to validate the conversion of prompts to GitHub Models format.
- Implemented TestExtractTemplateVariables and TestExtractVariablesFromText to ensure correct extraction of template variables.
- Created TestGetMapKeys and TestGetTestScenario to validate utility functions related to maps and test scenarios.
…se and restore its implementation; remove obsolete promptpex.go and summary_test.go files
…covering various scenarios and error handling
…neFlags function and update flag parsing to use consistent naming
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Add an integration test suite that invokes the compiled ./gh-models executable using process execution and runs against the live LLM endpoints (no mocks). The test suite should not run as part of the regular tests so it should be guarded by a flag and ..." to "Add comprehensive integration test suite for gh-models CLI" on Aug 3, 2025
Copilot AI requested a review from pelikhan August 3, 2025 05:38

@pelikhan pelikhan left a comment

@copilot rebase to main

Makefile Outdated
rm -rf ./gh-models
.PHONY: clean

integration-test:

add dependency on build step

Added build as a proper dependency to the integration-test target. The target now correctly depends on build rather than calling it explicitly. (0d67cc7)

@@ -0,0 +1,42 @@
# Integration Tests

Move this section to contributing.md

Moved the integration testing documentation from integration/README.md to CONTRIBUTING.md. The integration/README.md now contains just a brief description with a reference to the full documentation. (0d67cc7)

… docs

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot AI commented Aug 3, 2025

@copilot rebase to main

I'm unable to perform a rebase operation due to system limitations that prevent force push operations. The changes in this PR can be rebased manually if needed, or the changes can be merged as-is to preserve the commit history.

Copilot AI requested a review from pelikhan August 3, 2025 05:54

@pelikhan pelikhan left a comment

@copilot Add GitHub action workflow step that builds and runs the integration tests only. Runs on pushes to main. Make sure to set the models: read permission

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot AI commented Aug 3, 2025

@copilot Add GitHub action workflow step that builds and runs the integration tests only. Runs on pushes to main. Make sure to set the models: read permission

Added GitHub Actions integration test workflow in .github/workflows/integration.yml. The workflow runs on pushes to main, includes the required models: read permission, and uses make integration-test to build and run the integration tests. (969b0a2)

Copilot AI changed the title from "Add comprehensive integration test suite for gh-models CLI" to "Add comprehensive integration test suite and CI workflow for gh-models CLI" on Aug 3, 2025
Copilot AI requested a review from pelikhan August 3, 2025 06:03
@pelikhan pelikhan changed the base branch from pelikhan/promptpex to main August 3, 2025 06:15

pelikhan commented Aug 3, 2025

@copilot rebase branch

@pelikhan pelikhan closed this Aug 3, 2025
@pelikhan pelikhan deleted the copilot/fix-c18cc30c-9836-4cdc-ac69-f4be9d6cc6c2 branch August 3, 2025 06:19

Copilot AI commented Aug 3, 2025

@copilot rebase branch

I'm unable to perform a rebase operation due to system limitations that prevent force push operations. The branch contains the integration test suite commits (59e0381, 0d67cc7, 969b0a2) on top of the current main branch. If you'd like to rebase, this would need to be done manually using standard git rebase commands.
