feat(mcp): unified MCP client with prompts, caching, retry, batch, REPL, metrics, and auto-routing#140
feat(mcp): unified MCP client with prompts, caching, retry, batch, REPL, metrics, and auto-routing#140toasterbook88 wants to merge 3 commits into
Conversation
There was a problem hiding this comment.
Code Review
This pull request introduces a comprehensive Model Context Protocol (MCP) client integration and an execution observation tracking system. It adds a new mcp client CLI for discovering and calling tools on external servers via stdio or HTTP, alongside an interactive REPL and a registry for managing multiple connections with retry logic. Additionally, it implements an observations command to monitor resource usage (RAM, VRAM, wall time) and updates the doctor command with MCP configuration validation. Review feedback highlights the need to use command contexts for proper cancellation, fix a typo in a flag default, and resolve input parsing bugs in the REPL where arguments with spaces were truncated. Suggestions were also made to refactor duplicate tool-calling logic and adopt more reliable patterns for identifying transient HTTP errors.
| } | ||
|
|
||
| reg := mcpclient.NewRegistry() | ||
| ctx := context.Background() |
There was a problem hiding this comment.
Using context.Background() for connection and discovery prevents the operation from being cancelled if the user terminates the command (e.g., via Ctrl+C). Since ConnectAll can take significant time depending on the number of servers and their responsiveness, it should use the command's context. This applies to all runMCPClient* functions in this file.
| }, | ||
| } | ||
| cmd.Flags().StringVar(&server, "server", "", "Filter to a specific server") | ||
| cmd.Flags().StringVar(&format, "format", "format", "Output format: text or json") |
There was a problem hiding this comment.
The default value for the format flag is set to "format", which appears to be a typo. It should be "text" to remain consistent with other commands and ensure the default output is human-readable.
| cmd.Flags().StringVar(&format, "format", "format", "Output format: text or json") | |
| cmd.Flags().StringVar(&format, "format", "text", "Output format: text or json") |
| } | ||
| var rawArgs string | ||
| if len(args) > 2 { | ||
| rawArgs = args[2] |
There was a problem hiding this comment.
In the interactive REPL, strings.Fields splits the input line by whitespace. This causes JSON arguments containing spaces to be broken into multiple parts, and only the first part is assigned to rawArgs. Joining the remaining parts ensures the full JSON string is captured.
| rawArgs = args[2] | |
| rawArgs = strings.Join(args[2:], " ") |
| } | ||
| var rawArgs string | ||
| if len(args) > 2 { | ||
| rawArgs = args[2] |
There was a problem hiding this comment.
| fmt.Fprintln(out, "Usage: search <keyword>") | ||
| continue | ||
| } | ||
| keywordLower := strings.ToLower(args[0]) |
There was a problem hiding this comment.
The search command in the REPL only uses the first word of the keyword. If a user searches for a multi-word term (e.g., search system health), only the first word is used. Joining all arguments allows searching for multi-word terms.
| keywordLower := strings.ToLower(args[0]) | |
| keywordLower := strings.ToLower(strings.Join(args, " ")) |
| if strings.Contains(err.Error(), "500") || strings.Contains(err.Error(), "502") || | ||
| strings.Contains(err.Error(), "503") || strings.Contains(err.Error(), "504") { |
There was a problem hiding this comment.
Checking for HTTP status codes using strings.Contains on the entire error string is fragile and prone to false positives. For example, an error message containing a port number like :5000 or a tool error message that happens to mention '500' would be incorrectly identified as a transient HTTP error and retried. It is safer to check if the error is an HTTP error type and inspect the status code directly.
References
- Identify data fields by unique, stable patterns rather than relying on fragile string matching or indexing.
| func (r *Registry) CallTool(ctx context.Context, serverName, toolName string, args map[string]any) CallResult { | ||
| sc := r.Get(serverName) | ||
| if sc == nil { | ||
| return CallResult{Server: serverName, Err: fmt.Errorf("server %q not configured", serverName)} | ||
| } | ||
| if !sc.Connected() { | ||
| return CallResult{Server: serverName, Err: fmt.Errorf("server %q not connected: %v", serverName, sc.Err)} | ||
| } | ||
|
|
||
| start := time.Now() | ||
| var result *mcp.CallToolResult | ||
| err := withRetry(ctx, func() error { | ||
| cctx, cancel := context.WithTimeout(ctx, 30*time.Second) | ||
| defer cancel() | ||
| req := mcp.CallToolRequest{} | ||
| req.Params.Name = toolName | ||
| req.Params.Arguments = args | ||
| res, callErr := sc.Client.CallTool(cctx, req) | ||
| result = res | ||
| return callErr | ||
| }) | ||
| sc.RecordCall(time.Since(start), err) | ||
| return CallResult{Server: serverName, Result: result, Err: err} | ||
| } |
- Replace context.Background() with cmd.Context() in all runMCPClient* functions - Fix --format flag default typo (format → text) - Fix REPL arg parsing for call/get-prompt to join remaining args with spaces - Fix REPL search to use full multi-word keyword - Refactor isTransientError to use errors.As with httpStatusCoder instead of fragile string matching - Extract shared callTool internal method to deduplicate CallTool and CallToolWithProgress - Update mcp_client_test.go to pass context.Context Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
8256910 to
61447d1
Compare
…PL, metrics, and auto-routing Implements a comprehensive axis mcp client subcommand suite: Operator UX: - axis mcp client prompts / get-prompt for prompt discovery - --pretty flag for human-readable JSON output on call/read - axis mcp client search <keyword> for tool discovery - axis doctor validates MCP server configs (stdio command existence, HTTP URL) Reliability: - Per-connection caching (60s TTL) with sync.RWMutex - Retry with exponential backoff (3 attempts, 200ms start) for transient errors - axis mcp client batch <file.json> for sequential multi-tool execution Advanced Surfaces: - Interactive REPL (axis mcp client interactive) with 10 commands - Per-server metrics surfaced in list --format json Intelligence: - Placement-aware routing (--auto-route) tries all servers offering a tool - Progress notification infrastructure (CallToolWithProgress, SetProgressHandler) Quality gates: lint, test-race, coverage, build all pass. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Fixes TestMeshPartitionCoalescing by properly handling pending triggers during refresh completion. The original logic had a race condition where pending triggers weren't properly counted when completing a refresh. Changes: - Record start time when winning the refresh lock for latency fallback - Capture pending state before clearing it - Use captured pending triggers for next trigger determination - Fix activeRequestedAt assignment logic
- Replace context.Background() with cmd.Context() in all runMCPClient* functions - Fix --format flag default typo (format → text) - Fix REPL arg parsing for call/get-prompt to join remaining args with spaces - Fix REPL search to use full multi-word keyword - Refactor isTransientError to use errors.As with httpStatusCoder instead of fragile string matching - Extract shared callTool internal method to deduplicate CallTool and CallToolWithProgress - Update mcp_client_test.go to pass context.Context Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
61447d1 to
d8900ec
Compare
This PR adds a comprehensive MCP client to the axis CLI, implementing 11 out of 12 planned improvements (with daemon integration deferred). Features include:
All quality checks pass.