diff --git a/CLAUDE.md b/CLAUDE.md index b8741b2..f74cb95 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -65,126 +65,15 @@ xcodebuild test \ curl http://localhost:9009/health ``` -## Installation & Distribution - -### One-Command Installer (`install.sh`) - -Users install with `curl -fsSL https://github.com/docer1990/visiontest/releases/latest/download/install.sh | bash`. The script: -1. Detects OS (macOS/Linux) and arch (arm64/x86_64) -2. Validates Java 17+ with platform-specific install suggestions -3. Fetches latest release tag from GitHub API, validates format (`v[0-9][0-9A-Za-z._-]*`) and rejects dangerous characters -4. Downloads `visiontest.jar` + SHA-256 checksum, verifies integrity -5. Downloads Android APKs (`automation-server.apk`, `automation-server-test.apk`) + checksums, verifies integrity -6. On macOS arm64: downloads `ios-automation-server.tar.gz` + checksum, extracts pre-built iOS XCUITest bundle to `ios-automation-server/` subdirectory (skipped on Linux and macOS x86_64) -7. Installs JAR, APKs, and iOS bundle to `~/.local/share/visiontest/` (customizable via `VISIONTEST_DIR` env var, must be under `$HOME`) -8. Creates wrapper script at `~/.local/bin/visiontest`, ensures PATH -9. Does not modify Claude Desktop configuration; use `run-visiontest.sh` or manual setup for Claude integration. - -**Security hardening:** `umask 077`, explicit `chmod` on all files/dirs, tag validation, checksum verification, install path restricted to `$HOME`. - -### Release Workflow (`.github/workflows/release.yaml`) - -Triggered by git tags matching `v*`. 
The workflow runs the test suite, builds the fat JAR via `shadowJar`, Android APKs, and the pre-built iOS XCUITest bundle (on a macOS runner), generates SHA-256 checksums, and creates a GitHub Release with the following assets: `visiontest.jar`, `visiontest.jar.sha256`, `automation-server.apk`, `automation-server.apk.sha256`, `automation-server-test.apk`, `automation-server-test.apk.sha256`, `ios-automation-server.tar.gz`, `ios-automation-server.tar.gz.sha256`, `install.sh`, `run-visiontest.sh`. - -All GitHub Actions in both workflows are pinned to commit SHAs for supply-chain security. When updating or adding actions, always use SHA-pinned references instead of floating version tags. - -### Launcher Script (`run-visiontest.sh`) - -Used for development and Claude Desktop config. JAR resolution order: - -1. Repo build: `app/build/libs/visiontest.jar` (sets up `ANDROID_HOME`, APK path, `cd` to project root) -2. Installed JAR: `~/.local/share/visiontest/visiontest.jar` (skips Android SDK setup) -3. Error with build/install instructions - ## Architecture Overview VisionTest is an MCP (Model Context Protocol) server enabling AI agents to interact with mobile devices. It consists of three modules: -### MCP Server (`app/`) - -Kotlin/JVM server using stdio transport. Key files: -- `Main.kt` - Entry point, initializes managers and connects via stdio -- `ToolFactory.kt` - Thin coordinator that wires registrars and delegates tool registration -- `tools/ToolDsl.kt` - `ToolScope` DSL wrapping timeout/error handling + `CallToolRequest` extension helpers -- `tools/ToolRegistrar.kt` - Interface for modular tool registration -- `tools/ToolHelpers.kt` - Pure utility functions (`extractProperty`, `formatAppInfo`, etc.) 
-- `tools/AndroidDeviceToolRegistrar.kt` - 4 Android device tools -- `tools/AndroidAutomationToolRegistrar.kt` - 14 Android automation tools -- `tools/IOSDeviceToolRegistrar.kt` - 4 iOS device tools -- `tools/IOSAutomationToolRegistrar.kt` - 12 iOS automation tools + xcodebuild process management -- `discovery/ToolDiscovery.kt` - APK, Xcode project, xctestrun, and project root discovery -- `android/Android.kt` - ADB communication via Adam library -- `android/AutomationClient.kt` - HTTP client for Android Automation Server JSON-RPC -- `ios/IOSManager.kt` - iOS simulator operations via `xcrun simctl` -- `ios/IOSAutomationClient.kt` - HTTP client for iOS Automation Server JSON-RPC -- `config/AutomationConfig.kt` - Centralized constants for Android automation -- `config/IOSAutomationConfig.kt` - Centralized constants for iOS automation -- `common/DeviceConfig.kt` - Shared interface for both platforms - -### Automation Server (`automation-server/`) - -Native Android app providing UIAutomator access via JSON-RPC. Uses **instrumentation pattern** (like Maestro/Appium) for secure, privileged access. 
- -**Main App (`src/main/`) - Configuration UI only:** -- `MainActivity.kt` - Shows setup instructions and port configuration -- `config/ServerConfig.kt` - SharedPreferences for port setting -- `jsonrpc/JsonRpcModels.kt` - Request/Response/Error data classes -- `uiautomator/BaseUiAutomatorBridge.kt` - Abstract base class with all UIAutomator logic (includes reflection-based hierarchy dumping) -- `uiautomator/UiAutomatorModels.kt` - Data classes for results - -**Instrumentation (`src/androidTest/`) - The actual working server:** -- `AutomationInstrumentationRunner.kt` - Custom test runner capturing arguments -- `AutomationServerTest.kt` - Long-running test hosting the JSON-RPC server -- `JsonRpcServerInstrumented.kt` - Ktor HTTP server with JSON-RPC 2.0 -- `UiAutomatorBridgeInstrumented.kt` - Extends BaseUiAutomatorBridge with valid Instrumentation - -**JSON-RPC Endpoints:** -- `GET /health` - Server health check -- `POST /jsonrpc` - JSON-RPC 2.0 endpoint - -**Supported JSON-RPC Methods:** -| Method | Parameters | Description | -|--------|------------|-------------| -| `ui.dumpHierarchy` | - | Get UI hierarchy XML | -| `ui.tapByCoordinates` | `x`, `y` | Tap at coordinates | -| `ui.swipe` | `startX`, `startY`, `endX`, `endY`, `steps` | Swipe by coordinates | -| `ui.swipeByDirection` | `direction`, `distance`, `speed` | Swipe by direction (up/down/left/right) | -| `ui.swipeOnElement` | `direction`, selector, `speed` | Swipe on a specific element | -| `ui.findElement` | `text`, `resourceId`, etc. | Find UI element | -| `ui.getInteractiveElements` | `includeDisabled` (optional) | Get filtered list of interactive elements | -| `device.getInfo` | - | Get display size, rotation, SDK | -| `ui.inputText` | `text` | Type text into focused element | -| `device.pressBack` | - | Press back button | -| `device.pressHome` | - | Press home button | - -### iOS Automation Server (`ios-automation-server/`) - -Native iOS app providing XCUITest access via JSON-RPC. 
Uses **XCUITest framework** (similar pattern to Android's instrumentation). - -**Host App (`IOSAutomationServer/`):** -- `AppDelegate.swift` - Minimal host app required by XCUITest - -**UI Test Bundle (`IOSAutomationServerUITests/`) - The actual working server:** -- `AutomationServerUITest.swift` - Entry point test that starts the JSON-RPC server -- `Server/JsonRpcServer.swift` - Swifter HTTP server with JSON-RPC 2.0 dispatch -- `Bridge/XCUITestBridge.swift` - All XCUITest automation logic -- `Models/JsonRpcModels.swift` - Request/Response/Error types -- `Models/AutomationModels.swift` - Result types (mirrors UiAutomatorModels.kt) - -**Key difference from Android:** iOS simulators share the Mac's network stack, so **no port forwarding is needed**. The server is directly accessible at `localhost:9009`. - -**Supported iOS JSON-RPC Methods:** -| Method | Parameters | Description | -|--------|------------|-------------| -| `ui.dumpHierarchy` | - | Get UI hierarchy XML | -| `ui.tapByCoordinates` | `x`, `y` | Tap at coordinates | -| `ui.swipe` | `startX`, `startY`, `endX`, `endY`, `steps` | Swipe by coordinates | -| `ui.swipeByDirection` | `direction`, `distance`, `speed` | Swipe by direction | -| `ui.findElement` | `text`, `resourceId`, etc. | Find UI element | -| `ui.getInteractiveElements` | `includeDisabled` (optional) | Get interactive elements | -| `ui.inputText` | `text` | Type text into focused element | -| `device.getInfo` | - | Get display size, rotation, iOS version | -| `device.pressHome` | - | Press home button | +- **MCP Server (`app/`)** — Kotlin/JVM server using stdio transport. Entry point: `Main.kt`. Tool registration via `ToolFactory.kt` + modular `ToolRegistrar` implementations. DSL for timeout/error handling in `tools/ToolDsl.kt`. +- **Automation Server (`automation-server/`)** — Native Android app providing UIAutomator access via JSON-RPC. Uses instrumentation pattern (like Maestro/Appium). 
The server runs as an instrumentation test (`src/androidTest/`), not as a regular app service. +- **iOS Automation Server (`ios-automation-server/`)** — Native iOS XCUITest bundle providing automation via JSON-RPC. Server runs as a UI test. No port forwarding needed (iOS simulators share Mac's network stack). + +Both automation servers expose `GET /health` and `POST /jsonrpc` (JSON-RPC 2.0) endpoints. See `LEARNING.md` for detailed design decisions (instrumentation pattern, Template Method, reflection-based hierarchy dumping, security practices). ## MCP Tools @@ -218,16 +107,6 @@ Native iOS app providing XCUITest access via JSON-RPC. Uses **XCUITest framework | `android_press_back` | Press the back button | | `android_press_home` | Press the home button | -**Typical Android Automation Workflow:** -1. `install_automation_server` - Install both APKs (one-time) -2. `start_automation_server` - Start JSON-RPC server via `am instrument` -3. `get_interactive_elements` - Get filtered list of interactive elements (preferred) - - OR `get_ui_hierarchy` - Get full XML hierarchy (when you need all elements) -4. `android_tap_by_coordinates` - Tap using centerX/centerY from interactive elements -5. `android_input_text` - Type text into the focused field -6. `android_swipe_direction` - Scroll/swipe by direction (simpler, no coordinates needed) - - OR `android_swipe` - Swipe by exact coordinates (for precise control) - ### UI Automation (iOS) | Tool | Description | |------|-------------| @@ -244,167 +123,23 @@ Native iOS app providing XCUITest access via JSON-RPC. Uses **XCUITest framework | `ios_press_home` | Press home button | | `ios_stop_automation_server` | Stop the running XCUITest server | -**Typical iOS Automation Workflow:** -1. `ios_start_automation_server` - Start XCUITest server (uses pre-built bundle if available, otherwise builds from source) -2. 
`ios_get_interactive_elements` - Get filtered list of interactive elements (preferred) - - OR `ios_get_ui_hierarchy` - Get full XML hierarchy (when you need all elements) -3. `ios_tap_by_coordinates` - Tap using centerX/centerY from interactive elements -4. `ios_input_text` - Type text into the focused field -5. `ios_swipe_direction` - Scroll/swipe by direction (simpler, no coordinates needed) - -## Instrumentation Pattern +### Typical Automation Workflow -The automation server uses Android's instrumentation framework (like Maestro, Appium) for UIAutomator access: +1. **Install/Start** — `install_automation_server` + `start_automation_server` (Android) or `ios_start_automation_server` (iOS, uses pre-built bundle if available) +2. **Inspect** — `get_interactive_elements` / `ios_get_interactive_elements` (preferred) or `get_ui_hierarchy` / `ios_get_ui_hierarchy` (full XML) +3. **Interact** — `android_tap_by_coordinates` / `ios_tap_by_coordinates` using centerX/centerY from interactive elements +4. **Input** — `android_input_text` / `ios_input_text` for text entry +5. 
**Navigate** — `android_swipe_direction` / `ios_swipe_direction` (simpler) or coordinate-based swipe (precise) -**Why Instrumentation?** -- UIAutomator requires valid `Instrumentation` with `UiAutomation` connection -- Regular services can't get this - creating empty `Instrumentation()` doesn't work -- Only the test framework provides proper instrumentation context -- Service remains unexported for security - -**How it works:** -```bash -# MCP tool executes this command: -adb shell am instrument -w -e port 9008 \ - -e class com.example.automationserver.AutomationServerTest#runAutomationServer \ - com.example.automationserver.test/com.example.automationserver.AutomationInstrumentationRunner -``` - -**Manual testing:** -```bash -# Start server -adb shell am instrument -w -e port 9008 \ - -e class com.example.automationserver.AutomationServerTest#runAutomationServer \ - com.example.automationserver.test/com.example.automationserver.AutomationInstrumentationRunner - -# In another terminal, set up port forwarding and test -adb forward tcp:9008 tcp:9008 -curl http://localhost:9008/health - -# Stop server -adb shell am force-stop com.example.automationserver -``` - -## iOS Automation Pattern - -The iOS automation server uses Apple's XCUITest framework, mirroring the Android instrumentation pattern: - -**Why XCUITest?** -- XCUITest provides full access to the UI element tree via XCUIElement -- It runs as a UI test bundle, with proper accessibility access -- No separate installation step — `xcodebuild test` handles build + install + run - -**How it works:** -```bash -# MCP tool executes this command: -xcodebuild test \ - -project ios-automation-server/IOSAutomationServer.xcodeproj \ - -scheme IOSAutomationServer \ - -destination 'platform=iOS Simulator,name=iPhone 16' \ - -only-testing:IOSAutomationServerUITests/AutomationServerUITest/testRunAutomationServer -``` - -**Manual testing:** -```bash -# Start server (in one terminal) -xcodebuild test \ - -project 
ios-automation-server/IOSAutomationServer.xcodeproj \ - -scheme IOSAutomationServer \ - -destination 'platform=iOS Simulator,name=iPhone 16' \ - -only-testing:IOSAutomationServerUITests/AutomationServerUITest/testRunAutomationServer - -# In another terminal, test directly -curl http://localhost:9009/health -curl -X POST http://localhost:9009/jsonrpc -H 'Content-Type: application/json' \ - -d '{"jsonrpc":"2.0","method":"device.getInfo","id":1}' - -# Stop server: kill the xcodebuild process (Ctrl+C) -``` - -**Key differences from Android:** -| Aspect | Android | iOS | -|--------|---------|-----| -| Port forwarding | Required (ADB) | Not needed (shared network) | -| Back button | `device.pressBack()` | No equivalent — tap nav bar back button | -| Starting server | `am instrument -w` | `xcodebuild test -only-testing:` | -| Stopping server | `am force-stop` | Kill xcodebuild process | -| Swipe control | Step count | Duration (steps * 0.05 seconds) | -| Build system | Gradle module | Xcode project | -| Dependencies | Ktor/Netty (Gradle) | Swifter (Swift Package Manager) | -| Default port | 9008 | 9009 | - -## Flutter App Support - -The automation server uses a reflection-based approach for UI hierarchy dumping, similar to Maestro: - -**Key features:** -- Uses `UiDevice.getWindowRoots()` via reflection to access all accessibility window roots -- Enables `FLAG_RETRIEVE_INTERACTIVE_WINDOWS` (API 24+) for cross-app window access -- Sets `compressedLayoutHierarchy` to false to expose all accessibility nodes -- Handles WebView contents that may report as invisible - -**Finding elements in Flutter apps:** -> **Important:** Flutter apps expose text labels via `content-desc` (contentDescription) attribute instead of `text`. When using `find_element` on a Flutter app: -> 1. First try finding by `text` parameter -> 2. 
If not found, retry using `contentDescription` parameter with the same value -> -> Example: To find a "Log In" button in a Flutter app, use `contentDescription: "Log In"` instead of `text: "Log In"`. - -**Abstract methods in `BaseUiAutomatorBridge`:** -- `getUiDevice()` - Returns the UiDevice instance -- `getUiAutomation()` - Returns UiAutomation with appropriate flags -- `getDisplayRect()` - Returns display bounds for visibility calculations - -## Unit Tests - -All Gradle tests are pure JVM (no device/emulator needed). iOS tests run on the simulator but don't need a running automation server. - -### MCP Server (`app/src/test/kotlin/com/example/visiontest/`) - -| Test File | What It Tests | -|-----------|---------------| -| `utils/ErrorHandlerTest.kt` | 12 exception→error-code mappings, `retryOperation` exponential backoff | -| `utils/ErrorHandlerCoroutineTest.kt` | `retryOperation` exponential backoff delays with `TestCoroutineScheduler` | -| `ios/IOSSimulatorParsingTest.kt` | `parseDeviceList`, `parseAppListFromPlist`, `isValidBundleId`, `isValidShellCommand` | -| `ios/IOSSimulatorTest.kt` | `listDevices`, `getFirstAvailableDevice`, `listApps`, `getAppInfo`, `launchApp`, `executeShell`, `ensureDeviceBooted` (MockK'd ProcessExecutor) | -| `ios/ProcessExecutorTest.kt` | Exit codes, stdout capture, timeout handling, non-existent commands | -| `ios/IOSAutomationClientTest.kt` | JSON-RPC requests, `isServerRunning`, Gson serialization (MockWebServer) | -| `android/AndroidValidationTest.kt` | `isValidPackageName`, `validateForwardArgs`, `validateShellArgs`, `validateInstallArgs` | -| `android/AutomationClientTest.kt` | `sendRequest` POST/params/errors, `isServerRunning` health check (MockWebServer) | -| `config/AppConfigTest.kt` | Default config values and log level | -| `ToolFactoryHelpersTest.kt` | `ToolHelpers.extractProperty`, `extractPattern`, `formatAppInfo` | -| `ToolFactoryPathTest.kt` | `ToolDiscovery.findProjectRoot`, `findAutomationServerApk`, 
`resolveMainApkPath`, `findXctestrun`; `IOSAutomationToolRegistrar.buildXcodebuildCommand` | - -### Automation Server (`automation-server/src/test/java/com/example/automationserver/`) - -| Test File | What It Tests | -|-----------|---------------| -| `jsonrpc/JsonRpcModelsTest.kt` | `JsonRpcError` factory methods, request/response defaults and field handling | -| `uiautomator/UiAutomatorModelsTest.kt` | All data classes, default values, enum entries (SwipeSpeed, SwipeDirection, SwipeDistance) | -| `config/ServerConfigPortTest.kt` | `isValidPort` boundary tests, constants | -| `uiautomator/XmlUtilsTest.kt` | `stripInvalidXMLChars` — invalid ranges replaced, valid chars preserved | - -### iOS Automation Server (`ios-automation-server/IOSAutomationServerTests/`) - -| Test File | What It Tests | -|-----------|---------------| -| `JsonRpcModelsTests.swift` | `JsonRpcRequest.parse` (valid/invalid/malformed JSON), error factory methods & codes, `toDictionary`, success/error responses | -| `AutomationModelsTests.swift` | All result model `toDictionary()` conversions (UiHierarchyResult, DeviceInfoResult, OperationResult, ElementResult, InteractiveElement, InteractiveElementsResult), enum raw values & properties (SwipeDirection, SwipeDistance, SwipeSpeed) | -| `HelpersTests.swift` | `escapeXML` (nil, special chars, multiple replacements), `boundsString` (CGRect to bounds string, fractional truncation), `intParam` (Int/Double/String coercion, missing keys, edge cases) | - -See `.claude/unit-testing-strategy.md` for the full testing roadmap (Plans 1-7). +> **Flutter apps:** Text labels use `content-desc` (contentDescription) instead of `text`. If `find_element` by `text` fails, retry with `contentDescription`. 
## Key Patterns - All device operations use suspend functions with coroutine-based async -- Device list caching reduces ADB overhead (validity: 1000ms default) - Retry logic with exponential backoff in `ErrorHandler.retryOperation()` - Custom exception hierarchy in `Exceptions.kt` with platform-specific error codes - Tool timeout wrapper: `ToolScope` DSL with `withTimeout` (default: 10s, 30s for UI hierarchy, 200s for iOS server startup) -- Automation Server uses Ktor/Netty for HTTP server with Gson serialization -- Template Method Pattern: `BaseUiAutomatorBridge` defines operations, subclasses provide `UiDevice`, `UiAutomation`, and display bounds -- Reflection-based hierarchy dumping via `UiDevice.getWindowRoots()` for Flutter app support -- Centralized constants in `AutomationConfig.kt` - no magic numbers +- Centralized constants in `AutomationConfig.kt` / `IOSAutomationConfig.kt` — no magic numbers ## Configuration @@ -417,37 +152,14 @@ See `.claude/unit-testing-strategy.md` for the full testing roadmap (Plans 1-7). | `VISION_TEST_IOS_PROJECT_PATH` | (auto-detected) | Explicit path to iOS `.xcodeproj` | | `VISIONTEST_DIR` | `~/.local/share/visiontest` | Override install directory (must be under `$HOME`) | -### Default Timeouts (in `config/AppConfig.kt`) - -- ADB timeout: 5000ms -- Device cache validity: 1000ms -- Tool execution timeout: 10000ms - -### Android Automation Server Defaults (in `config/AutomationConfig.kt`) - -- Server port: 9008 (range: 1024-65535) -- ADB port forwarding: `adb forward tcp:9008 tcp:9008` - -### iOS Automation Server Defaults (in `config/IOSAutomationConfig.kt`) - -- Server port: 9009 (range: 1024-65535) -- No port forwarding needed (iOS simulators share Mac's network stack) - -## Prerequisites - -- JDK 17+ -- macOS or Linux (arm64 or x86_64) -- Android Platform Tools (ADB) in PATH — for Android automation -- Xcode Command Line Tools — for iOS simulator support (macOS only). 
Pre-built iOS bundle requires the same Xcode major version used in CI (see release notes). For source builds or Intel Macs, the full Xcode IDE is needed. -- Android SDK — only needed for building the automation-server module from source +### Key Defaults -> **Quick start:** Users who just need the MCP server can run `curl -fsSL https://github.com/docer1990/visiontest/releases/latest/download/install.sh | bash` — only Java 17+ is required. +- Android automation server port: **9008** (range: 1024-65535), requires ADB port forwarding +- iOS automation server port: **9009** (range: 1024-65535), no port forwarding needed +- All Gradle tests are pure JVM (no device/emulator needed). See `.claude/unit-testing-strategy.md` for the testing roadmap. -## Important Design Decisions +## Further Reading -See `LEARNING.md` for detailed explanations of: -- Why we use instrumentation instead of services -- Template Method Pattern for UIAutomator bridge -- Reflection-based hierarchy dumping for Flutter support -- Why non-working code was deleted instead of deprecated -- Security best practices (command allowlists, JSON escaping) +- [`LEARNING.md`](LEARNING.md) — Design decisions (instrumentation, Template Method, Flutter support, security) +- [`docs/installation.md`](docs/installation.md) — Installer, release workflow, launcher script, prerequisites +- [`kotlin-mcp-server.instruction.md`](kotlin-mcp-server.instruction.md) — Required Kotlin/MCP patterns diff --git a/docs/installation.md b/docs/installation.md new file mode 100644 index 0000000..309ac24 --- /dev/null +++ b/docs/installation.md @@ -0,0 +1,40 @@ +# Installation & Distribution + +## One-Command Installer (`install.sh`) + +Users install with `curl -fsSL https://github.com/docer1990/visiontest/releases/latest/download/install.sh | bash`. The script: +1. Detects OS (macOS/Linux) and arch (arm64/x86_64) +2. Validates Java 17+ with platform-specific install suggestions +3. 
Fetches latest release tag from GitHub API, validates format (`v[0-9][0-9A-Za-z._-]*`) and rejects dangerous characters +4. Downloads `visiontest.jar` + SHA-256 checksum, verifies integrity +5. Downloads Android APKs (`automation-server.apk`, `automation-server-test.apk`) + checksums, verifies integrity +6. On macOS arm64: downloads `ios-automation-server.tar.gz` + checksum, extracts pre-built iOS XCUITest bundle to `ios-automation-server/` subdirectory (skipped on Linux and macOS x86_64) +7. Installs JAR, APKs, and iOS bundle to `~/.local/share/visiontest/` (customizable via `VISIONTEST_DIR` env var, must be under `$HOME`) +8. Creates wrapper script at `~/.local/bin/visiontest`, ensures PATH +9. Does not modify Claude Desktop configuration; use `run-visiontest.sh` or manual setup for Claude integration. + +**Security hardening:** `umask 077`, explicit `chmod` on all files/dirs, tag validation, checksum verification, install path restricted to `$HOME`. + +## Release Workflow (`.github/workflows/release.yaml`) + +Triggered by git tags matching `v*`. The workflow runs the test suite, builds the fat JAR via `shadowJar`, Android APKs, and the pre-built iOS XCUITest bundle (on a macOS runner), generates SHA-256 checksums, and creates a GitHub Release with the following assets: `visiontest.jar`, `visiontest.jar.sha256`, `automation-server.apk`, `automation-server.apk.sha256`, `automation-server-test.apk`, `automation-server-test.apk.sha256`, `ios-automation-server.tar.gz`, `ios-automation-server.tar.gz.sha256`, `install.sh`, `run-visiontest.sh`. + +All GitHub Actions in both workflows are pinned to commit SHAs for supply-chain security. When updating or adding actions, always use SHA-pinned references instead of floating version tags. + +## Launcher Script (`run-visiontest.sh`) + +Used for development and Claude Desktop config. JAR resolution order: + +1. Repo build: `app/build/libs/visiontest.jar` (sets up `ANDROID_HOME`, APK path, `cd` to project root) +2. 
Installed JAR: `~/.local/share/visiontest/visiontest.jar` (skips Android SDK setup) +3. Error with build/install instructions + +## Prerequisites + +- JDK 17+ +- macOS or Linux (arm64 or x86_64) +- Android Platform Tools (ADB) in PATH — for Android automation +- Xcode Command Line Tools — for iOS simulator support (macOS only). Pre-built iOS bundle requires the same Xcode major version used in CI (see release notes). For source builds or Intel Macs, the full Xcode IDE is needed. +- Android SDK — only needed for building the automation-server module from source + +> **Quick start:** Users who just need the MCP server can run `curl -fsSL https://github.com/docer1990/visiontest/releases/latest/download/install.sh | bash` — only Java 17+ is required. diff --git a/openspec/changes/refactor-toolfactory/.openspec.yaml b/openspec/changes/archive/2026-03-24-refactor-toolfactory/.openspec.yaml similarity index 100% rename from openspec/changes/refactor-toolfactory/.openspec.yaml rename to openspec/changes/archive/2026-03-24-refactor-toolfactory/.openspec.yaml diff --git a/openspec/changes/refactor-toolfactory/design.md b/openspec/changes/archive/2026-03-24-refactor-toolfactory/design.md similarity index 100% rename from openspec/changes/refactor-toolfactory/design.md rename to openspec/changes/archive/2026-03-24-refactor-toolfactory/design.md diff --git a/openspec/changes/refactor-toolfactory/proposal.md b/openspec/changes/archive/2026-03-24-refactor-toolfactory/proposal.md similarity index 100% rename from openspec/changes/refactor-toolfactory/proposal.md rename to openspec/changes/archive/2026-03-24-refactor-toolfactory/proposal.md diff --git a/openspec/changes/refactor-toolfactory/specs/tool-discovery/spec.md b/openspec/changes/archive/2026-03-24-refactor-toolfactory/specs/tool-discovery/spec.md similarity index 100% rename from openspec/changes/refactor-toolfactory/specs/tool-discovery/spec.md rename to 
openspec/changes/archive/2026-03-24-refactor-toolfactory/specs/tool-discovery/spec.md diff --git a/openspec/changes/refactor-toolfactory/specs/tool-registration-dsl/spec.md b/openspec/changes/archive/2026-03-24-refactor-toolfactory/specs/tool-registration-dsl/spec.md similarity index 100% rename from openspec/changes/refactor-toolfactory/specs/tool-registration-dsl/spec.md rename to openspec/changes/archive/2026-03-24-refactor-toolfactory/specs/tool-registration-dsl/spec.md diff --git a/openspec/changes/refactor-toolfactory/tasks.md b/openspec/changes/archive/2026-03-24-refactor-toolfactory/tasks.md similarity index 100% rename from openspec/changes/refactor-toolfactory/tasks.md rename to openspec/changes/archive/2026-03-24-refactor-toolfactory/tasks.md diff --git a/openspec/specs/tool-discovery/spec.md b/openspec/specs/tool-discovery/spec.md new file mode 100644 index 0000000..62c021c --- /dev/null +++ b/openspec/specs/tool-discovery/spec.md @@ -0,0 +1,94 @@ +## ADDED Requirements + +### Requirement: ToolDiscovery class encapsulates all path resolution +A `ToolDiscovery` class in the `discovery/` package SHALL encapsulate all asset discovery logic previously embedded in `ToolFactory`. It SHALL accept only a `Logger` as a constructor parameter. + +#### Scenario: ToolDiscovery is independently constructable +- **WHEN** `ToolDiscovery(logger)` is constructed +- **THEN** it SHALL be ready to use without requiring `DeviceConfig`, `AutomationClient`, or any other `ToolFactory` dependency + +### Requirement: Android APK discovery +`ToolDiscovery` SHALL provide `findAutomationServerApk()` and its testable overload `findAutomationServerApk(envApkPath, searchRoots, installDir)` with identical behavior to the current `ToolFactory` implementation. 
+ +#### Scenario: APK found via environment variable +- **WHEN** `VISION_TEST_APK_PATH` environment variable points to an existing file +- **THEN** `findAutomationServerApk()` SHALL return that file's absolute path + +#### Scenario: APK found via search roots +- **WHEN** no environment variable is set and the APK exists at `/automation-server/build/outputs/apk/androidTest/debug/automation-server-debug-androidTest.apk` +- **THEN** `findAutomationServerApk()` SHALL return the first match found across search roots + +#### Scenario: APK found in install directory +- **WHEN** no environment variable is set and no search root contains the APK, but `/automation-server-test.apk` exists +- **THEN** `findAutomationServerApk()` SHALL return the install directory APK path + +#### Scenario: No APK found +- **WHEN** no APK is found in any location +- **THEN** `findAutomationServerApk()` SHALL return null + +### Requirement: Main APK resolution from test APK path +`ToolDiscovery` SHALL provide `resolveMainApkPath(testApkPath)` with identical behavior to the current `ToolFactory` implementation. + +#### Scenario: Gradle layout derivation +- **WHEN** test APK path contains `androidTest/` and `-androidTest` substrings and the derived main APK exists +- **THEN** `resolveMainApkPath()` SHALL return the derived path + +#### Scenario: Install directory sibling lookup +- **WHEN** the test APK is named `automation-server-test.apk` and `automation-server.apk` exists in the same directory +- **THEN** `resolveMainApkPath()` SHALL return the sibling APK path + +#### Scenario: No main APK found +- **WHEN** no derivation or sibling lookup succeeds +- **THEN** `resolveMainApkPath()` SHALL return null + +### Requirement: Xcode project discovery +`ToolDiscovery` SHALL provide `findXcodeProject()` with identical cascading search behavior: environment variable → CWD → project root → code source root. 
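Reviewer note: the cascading search described in this requirement (environment variable, then CWD, then project root, then code source root) can be sketched as a lazily evaluated list of candidate sources. This is only an illustration under stated assumptions: `firstXcodeProject` and the lambda-based candidate list are hypothetical names, not the actual `ToolDiscovery` API, and the validity check (a directory whose name ends in `.xcodeproj`) is an assumption about what "valid" means here.

```kotlin
import java.io.File

// Hypothetical sketch of a cascading lookup; not the project's real implementation.
// Each candidate is a lambda so that later sources are only probed when the
// earlier ones yield nothing usable.
fun firstXcodeProject(candidates: List<() -> File?>): String? =
    candidates.asSequence()
        .mapNotNull { source -> source() }                      // skip sources that return null
        .firstOrNull { it.isDirectory && it.name.endsWith(".xcodeproj") }
        ?.absolutePath                                          // null when nothing matched

// Example wiring (assumed order, mirroring the requirement text).
fun exampleLookup(projectRoot: File?): String? = firstXcodeProject(
    listOf(
        { System.getenv("VISION_TEST_IOS_PROJECT_PATH")?.let(::File) },
        { File("ios-automation-server/IOSAutomationServer.xcodeproj") },
        { projectRoot?.let { File(it, "ios-automation-server/IOSAutomationServer.xcodeproj") } },
    )
)
```

Because the sequence is lazy, a hit on the environment variable means the file system is never probed for the remaining locations, which keeps the "first match wins" ordering explicit.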
+ +#### Scenario: Xcode project from environment variable +- **WHEN** `VISION_TEST_IOS_PROJECT_PATH` environment variable points to a valid `.xcodeproj` directory +- **THEN** `findXcodeProject()` SHALL return its absolute path + +#### Scenario: Xcode project from project root +- **WHEN** no environment variable is set and the `.xcodeproj` exists relative to the detected project root +- **THEN** `findXcodeProject()` SHALL return its absolute path + +#### Scenario: No Xcode project found +- **WHEN** no `.xcodeproj` is found in any location +- **THEN** `findXcodeProject()` SHALL return null + +### Requirement: xctestrun bundle discovery +`ToolDiscovery` SHALL provide `findXctestrun()` and its testable overload `findXctestrun(installDir)` with identical behavior. + +#### Scenario: xctestrun found in install directory +- **WHEN** `/ios-automation-server/` contains `.xctestrun` files +- **THEN** `findXctestrun()` SHALL return the absolute path of the first file alphabetically + +#### Scenario: No xctestrun found +- **WHEN** the bundle directory does not exist or contains no `.xctestrun` files +- **THEN** `findXctestrun()` SHALL return null + +### Requirement: Project root discovery +`ToolDiscovery` SHALL provide `findProjectRoot(startFrom)` that walks up the directory tree (max 10 levels) looking for `settings.gradle.kts` or `settings.gradle`. 
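Reviewer note: a minimal sketch of the bounded upward walk this requirement describes, assuming the depth limit counts the starting directory as the first level (the spec text does not say which way it counts). `findProjectRootSketch` is a hypothetical name used for illustration and is not the actual `ToolDiscovery.findProjectRoot` code.

```kotlin
import java.io.File

// Hypothetical sketch; the real method may count levels differently.
fun findProjectRootSketch(startFrom: File, maxDepth: Int = 10): File? {
    // normalize() drops "." segments, so a path ending in "." resolves to its directory.
    var current: File? = startFrom.absoluteFile.normalize()
    repeat(maxDepth) {
        val dir = current ?: return null                 // ran out of parents
        val hasSettings = File(dir, "settings.gradle.kts").isFile ||
            File(dir, "settings.gradle").isFile
        if (hasSettings) return dir                      // first directory with a settings file wins
        current = dir.parentFile
    }
    return null                                          // depth limit reached without a match
}
```

Normalizing before walking is one way to satisfy the trailing-dot scenario, since `File("some/dir/.").parentFile` would otherwise point at `some/dir` itself rather than its parent.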
+ +#### Scenario: Project root found +- **WHEN** a `settings.gradle.kts` or `settings.gradle` file exists within 10 parent directories of `startFrom` +- **THEN** `findProjectRoot()` SHALL return the directory containing it + +#### Scenario: Project root not found within depth limit +- **WHEN** no settings file exists within 10 levels up +- **THEN** `findProjectRoot()` SHALL return null + +#### Scenario: Trailing dot in path handled +- **WHEN** `startFrom` path ends with `.` +- **THEN** `findProjectRoot()` SHALL resolve the parent correctly and search normally + +### Requirement: Install directory resolution +`ToolDiscovery` SHALL provide `resolveInstallDir()` with the cascading resolution: `VISIONTEST_DIR` env var → JAR directory → `~/.local/share/visiontest` default. + +#### Scenario: Install dir from environment variable +- **WHEN** `VISIONTEST_DIR` environment variable is set and non-empty +- **THEN** `resolveInstallDir()` SHALL return a `File` pointing to that path + +#### Scenario: Install dir default fallback +- **WHEN** no environment variable is set and not running from a JAR +- **THEN** `resolveInstallDir()` SHALL return `~/.local/share/visiontest` diff --git a/openspec/specs/tool-registration-dsl/spec.md b/openspec/specs/tool-registration-dsl/spec.md new file mode 100644 index 0000000..1534197 --- /dev/null +++ b/openspec/specs/tool-registration-dsl/spec.md @@ -0,0 +1,103 @@ +## ADDED Requirements + +### Requirement: ToolScope absorbs tool registration boilerplate +The `ToolScope` class SHALL wrap `Server.addTool()` with automatic timeout enforcement, error handling via `ErrorHandler.handleToolError()`, and `CallToolResult` wrapping. Tool handlers SHALL only provide the business logic as a `suspend (CallToolRequest?) -> String` lambda. 
+
+#### Scenario: Tool registered via ToolScope executes within timeout
+- **WHEN** a tool is registered via `ToolScope.tool()` with a 10s timeout and the handler returns in 5s
+- **THEN** the tool SHALL return a `CallToolResult` containing a single `TextContent` with the handler's return value
+
+#### Scenario: Tool registered via ToolScope times out
+- **WHEN** a tool is registered via `ToolScope.tool()` with a 10s timeout and the handler takes longer than 10s
+- **THEN** the tool SHALL return an error `CallToolResult` produced by `ErrorHandler.handleToolError()` with a `TimeoutCancellationException`
+
+#### Scenario: Tool registered via ToolScope handles exceptions
+- **WHEN** a tool handler throws any `Exception`
+- **THEN** the tool SHALL return an error `CallToolResult` produced by `ErrorHandler.handleToolError()` with the thrown exception and the tool name as context
+
+#### Scenario: Tool registered with custom timeout
+- **WHEN** a tool is registered with `timeoutMs = 30000`
+- **THEN** the tool SHALL use 30s as its timeout instead of the default
+
+### Requirement: ToolRegistrar interface for modular registration
+Each platform tool group SHALL implement the `ToolRegistrar` interface with a single `registerTools(scope: ToolScope)` method. `ToolFactory.registerAllTools()` SHALL iterate over all registrars and delegate to each.
+
+#### Scenario: All tools registered via registrars
+- **WHEN** `ToolFactory.registerAllTools(server)` is called
+- **THEN** all 36 MCP tools SHALL be registered on the server with identical names, descriptions, and input schemas as the current monolithic implementation
+
+#### Scenario: Registrar receives ToolScope
+- **WHEN** a `ToolRegistrar.registerTools(scope)` is called
+- **THEN** the scope SHALL provide the server, logger, and default timeout configured in `ToolFactory`
+
+### Requirement: CallToolRequest parameter extraction helpers
+Extension functions on `CallToolRequest?` SHALL provide type-safe parameter extraction: `requireString(key)`, `requireInt(key)`, `optionalString(key)`, `optionalInt(key)`.
+
+#### Scenario: requireString returns value when present
+- **WHEN** `request.requireString("packageName")` is called and `packageName` exists in the request arguments
+- **THEN** it SHALL return the string value
+
+#### Scenario: requireString throws when missing
+- **WHEN** `request.requireString("packageName")` is called and `packageName` is not in the request arguments
+- **THEN** it SHALL throw `IllegalArgumentException` with a message containing the key name
+
+#### Scenario: requireInt parses integer from string
+- **WHEN** `request.requireInt("x")` is called and the argument value is `"100"`
+- **THEN** it SHALL return `100` as an `Int`
+
+#### Scenario: requireInt throws on non-integer
+- **WHEN** `request.requireInt("x")` is called and the argument value is `"abc"`
+- **THEN** it SHALL throw `IllegalArgumentException` with a message indicating the key must be an integer
+
+#### Scenario: optionalString returns null when missing
+- **WHEN** `request.optionalString("text")` is called and `text` is not in the request arguments
+- **THEN** it SHALL return `null`
+
+### Requirement: ToolHelpers object for pure utility functions
+The functions `extractProperty`, `extractPattern`, and `formatAppInfo` SHALL be moved to a `ToolHelpers` object in the `tools/` package with identical behavior.
+
+#### Scenario: extractProperty finds property value
+- **WHEN** `ToolHelpers.extractProperty("[ro.product.model]: [Pixel 6]", "ro.product.model")` is called
+- **THEN** it SHALL return `"Pixel 6"`
+
+#### Scenario: extractProperty returns Unknown for missing property
+- **WHEN** `ToolHelpers.extractProperty("", "ro.product.model")` is called
+- **THEN** it SHALL return `"Unknown"`
+
+#### Scenario: formatAppInfo extracts and formats app information
+- **WHEN** `ToolHelpers.formatAppInfo(rawDumpsysOutput, "com.example.app")` is called with valid dumpsys output
+- **THEN** it SHALL return a formatted string containing version name, version code, SDK targets, install dates, and up to 10 permissions
+
+### Requirement: Four platform-specific registrars
+The system SHALL provide exactly four `ToolRegistrar` implementations:
+- `AndroidDeviceToolRegistrar` — registers 4 Android device management tools
+- `AndroidAutomationToolRegistrar` — registers 14 Android UI automation tools
+- `IOSDeviceToolRegistrar` — registers 4 iOS device management tools
+- `IOSAutomationToolRegistrar` — registers 10 iOS UI automation tools plus server lifecycle management
+
+#### Scenario: Android device tools registered
+- **WHEN** `AndroidDeviceToolRegistrar.registerTools(scope)` is called
+- **THEN** tools `available_device_android`, `list_apps_android`, `info_app_android`, `launch_app_android` SHALL be registered
+
+#### Scenario: Android automation tools registered
+- **WHEN** `AndroidAutomationToolRegistrar.registerTools(scope)` is called
+- **THEN** all 14 Android automation tools SHALL be registered including `install_automation_server`, `start_automation_server`, `get_ui_hierarchy`, `find_element`, `android_tap_by_coordinates`, `android_swipe`, `android_swipe_direction`, `android_swipe_on_element`, `android_press_back`, `android_press_home`, `android_input_text`, `android_get_device_info`, `get_interactive_elements`, and `automation_server_status`
+
+#### Scenario: iOS device tools registered
+- **WHEN** `IOSDeviceToolRegistrar.registerTools(scope)` is called
+- **THEN** tools `ios_available_device`, `ios_list_apps`, `ios_info_app`, `ios_launch_app` SHALL be registered
+
+#### Scenario: iOS automation tools registered
+- **WHEN** `IOSAutomationToolRegistrar.registerTools(scope)` is called
+- **THEN** all 10 iOS automation tools SHALL be registered including `ios_start_automation_server`, `ios_automation_server_status`, `ios_get_ui_hierarchy`, `ios_get_interactive_elements`, `ios_tap_by_coordinates`, `ios_swipe`, `ios_swipe_direction`, `ios_find_element`, `ios_get_device_info`, `ios_press_home`, `ios_input_text`, and `ios_stop_automation_server`
+
+### Requirement: ToolFactory remains the public entry point
+`ToolFactory` SHALL maintain its existing constructor signature and `registerAllTools(server: Server)` method. `Main.kt` SHALL require zero changes.
+
+#### Scenario: ToolFactory constructor compatibility
+- **WHEN** `ToolFactory(android, ios, logger)` is constructed (using defaults for optional params)
+- **THEN** it SHALL compile and function identically to the pre-refactor version
+
+#### Scenario: registerAllTools delegates to registrars
+- **WHEN** `toolFactory.registerAllTools(server)` is called
+- **THEN** it SHALL create a `ToolScope` and pass it to each of the four registrars
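The parameter extraction scenarios in the `tool-registration-dsl` spec above can be illustrated with a minimal Kotlin sketch. It is not the project's implementation: `SketchRequest` is a hypothetical stand-in for the SDK's `CallToolRequest` (modelled here as a plain string map), the exception messages are invented for illustration, and the choice that `optionalInt` throws on a non-integer value is an assumption, since the spec only defines that case for `requireInt`.

```kotlin
// Hypothetical stand-in for CallToolRequest; the real SDK type is NOT reproduced here.
// Arguments are assumed to arrive as plain strings keyed by parameter name.
data class SketchRequest(val arguments: Map<String, String> = emptyMap())

// Returns the argument value, or null when the request or the key is absent.
fun SketchRequest?.optionalString(key: String): String? = this?.arguments?.get(key)

// Returns the argument value, or throws with a message that names the missing key.
fun SketchRequest?.requireString(key: String): String =
    optionalString(key) ?: throw IllegalArgumentException("Missing required parameter: $key")

// Returns null when the key is absent; throws when a value is present but not an integer.
// (Throwing here is an assumption; the spec leaves this case undefined for optionalInt.)
fun SketchRequest?.optionalInt(key: String): Int? {
    val raw = optionalString(key) ?: return null
    return raw.toIntOrNull()
        ?: throw IllegalArgumentException("Parameter '$key' must be an integer")
}

// Parses a required integer argument, reusing the optional variant for the parsing rules.
fun SketchRequest?.requireInt(key: String): Int =
    optionalInt(key) ?: throw IllegalArgumentException("Missing required parameter: $key")

fun main() {
    val request = SketchRequest(mapOf("packageName" to "com.example.app", "x" to "100"))
    println(request.requireString("packageName")) // com.example.app
    println(request.requireInt("x"))              // 100
    println(request.optionalString("text"))       // null
}
```

Defining the helpers on the nullable receiver (`SketchRequest?`) mirrors the spec's `CallToolRequest?` wording, so a handler can call them without its own null check; a null request simply behaves like one with no arguments.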