Skip to content

handleCallToolVariant: panic recovery returns (nil, nil) crashing mcp-go; triggered by tiktoken cache corruption #318

@Melodeiro

Description

@Melodeiro

Bug Description

call_tool_read, call_tool_write, and call_tool_destructive panic with nil pointer dereference. The client receives an empty HTTP response and the connection is dropped.

Two independent bugs combine to cause the crash:

Bug 1: Corrupted tiktoken cache (trigger)

tiktoken-go v0.1.8 (load.go:31) does http.Get(blobpath) and caches the response body without checking resp.StatusCode. If Azure Blob Storage returns a 404 (or any HTTP error), the XML error response is cached as a valid BPE file:

/tmp/data-gym-cache/9b5ad71b2ce5302211f9c61530b329a4922fc6a4

Content (223 bytes instead of ~1.6MB):

<?xml version="1.0" encoding="utf-8"?>
<Error><Code>ResourceNotFound</Code><Message>The specified resource does not exist.</Message></Error>

This causes tokens.NewTokenizer() to fail. The fallback in runtime.go:145 also calls GetEncoding() which reads the same corrupted cache → returns (nil, error). The error is discarded with _:

tokenizer, _ = tokens.NewTokenizer(tokenizerEncoding, logger.Sugar(), false)

The nil *DefaultTokenizer is then assigned to the tokens.Tokenizer interface field — creating a non-nil interface with nil underlying value (Go spec: an interface holding a nil concrete value is itself non-nil).

Bug 2: Panic recovery returns (nil, nil) (amplifier)

handleCallToolVariant (mcp.go:1012) has a defer with recover(), but uses unnamed return values:

func (p *MCPProxyServer) handleCallToolVariant(...) (*mcp.CallToolResult, error) {
    defer func() {
        if r := recover(); r != nil {
            p.logger.Error("Recovered from panic in handleCallToolVariant", ...)
            // BUG: no return value set — function returns (nil, nil)
        }
    }()

When the tokenizer nil dereference panics at mcp.go:1245:

  1. recover() catches the panic and logs it
  2. Function returns zero values: (*mcp.CallToolResult)(nil), (error)(nil)
  3. mcp-go HandleMessage (request_handler.go:331) does return createResponse(baseMessage.ID, *result)second panic on nil dereference
  4. This second panic is not recovered → HTTP goroutine crashes → client gets empty response

Additional: unsafe type assertion (mcp.go:~1253)

Encoding: tokenizer.(*tokens.DefaultTokenizer).GetDefaultEncoding(),

This is a non-checked type assertion that panics if the underlying value is nil, even when the interface is non-nil.

Reproduction

  1. Delete tiktoken cache: rm /tmp/data-gym-cache/9b5ad71b2ce5302211f9c61530b329a4922fc6a4
  2. Block the Azure URL (simulate 404) and restart mcpproxy
  3. Any call_tool_read/write/destructive call will panic

Or simply corrupt the cache file:

echo "corrupted" > /tmp/data-gym-cache/9b5ad71b2ce5302211f9c61530b329a4922fc6a4

Suggested Fixes

1. Named return values in panic recovery (critical)

func (p *MCPProxyServer) handleCallToolVariant(...) (callResult *mcp.CallToolResult, callErr error) {
    defer func() {
        if r := recover(); r != nil {
            p.logger.Error("Recovered from panic in handleCallToolVariant", ...)
            callResult = mcp.NewToolResultError(fmt.Sprintf("Internal proxy error: %v", r))
            callErr = nil
        }
    }()

2. Safe tokenizer nil check before use

isTokenizerValid := false
if tokenizer != nil {
    if dt, ok := tokenizer.(*tokens.DefaultTokenizer); ok && dt != nil {
        isTokenizerValid = true
    }
}
if isTokenizerValid {
    // ... count tokens
}

3. Upstream: tiktoken-go should check HTTP status before caching

load.go:31 should verify resp.StatusCode == 200 before writing to cache.

Versions Tested

All versions from v0.9.0+ (when tokenizer was added in commit 4ad01d4) are affected:

  • v0.14.0 (mcp-go@v0.43.1) — same panic
  • v0.17.8, v0.18.1, v0.19.1 (mcp-go@v0.44.0-beta.2) — same panic
  • v0.8.6 and below — not affected (no tokenizer code)

Workaround

Delete the corrupted tiktoken cache file and restart mcpproxy:

rm /tmp/data-gym-cache/9b5ad71b2ce5302211f9c61530b329a4922fc6a4
systemctl --user restart mcpproxy

Environment

  • OS: Linux amd64 (Ubuntu)
  • mcpproxy: v0.19.1
  • mcp-go: v0.44.0-beta.2
  • tiktoken-go: v0.1.8
  • Go: 1.25.7

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions