Skip to content

Refactor Service Commands for Testability (Phase 2) #730

@vfusco

Description

@vfusco

Summary

Refactor service commands to enable deeper unit testing of command execution, configuration loading, and service initialization. This phase addresses the testability limitations identified in Phase 1.

Related:

Affected Services

Command Package
cartesi-rollups-advancer cmd/cartesi-rollups-advancer/
cartesi-rollups-claimer cmd/cartesi-rollups-claimer/
cartesi-rollups-evm-reader cmd/cartesi-rollups-evm-reader/
cartesi-rollups-jsonrpc-api cmd/cartesi-rollups-jsonrpc-api/
cartesi-rollups-node cmd/cartesi-rollups-node/
cartesi-rollups-prt cmd/cartesi-rollups-prt/
cartesi-rollups-validator cmd/cartesi-rollups-validator/

Motivation

The current service command structure has testability limitations:

// Current structure (validator example)
func run(cmd *cobra.Command, args []string) {
    ctx, cancel := context.WithTimeout(context.Background(), cfg.MaxStartupTime)
    defer cancel()

    // Creates real repository connection - can't mock
    createInfo.Repository, err = factory.NewRepositoryFromConnectionString(ctx, cfg.DatabaseConnection.String())
    cobra.CheckErr(err)
    defer createInfo.Repository.Close()

    // Creates and runs real service
    validatorService, err := validator.Create(ctx, &createInfo)
    cobra.CheckErr(err)

    // Blocks forever - can't test completion
    cobra.CheckErr(validatorService.Serve())
}

Problems:

  1. run function creates real dependencies (database connections, services)
  2. Serve() blocks indefinitely - can't test command execution end-to-end
  3. Config (cfg) is a package-level variable populated in PreRunE
  4. No way to inject mocks or test doubles

Proposed Refactoring

Option A: Constructor Function with Interface (Recommended)

Introduce a ServiceRunner interface and constructor function:

// root.go

// ServiceRunner abstracts the service execution for testability
type ServiceRunner interface {
    Run(ctx context.Context, cfg *config.ValidatorConfig) error
}

// DefaultServiceRunner is the production implementation
type DefaultServiceRunner struct{}

func (r *DefaultServiceRunner) Run(ctx context.Context, cfg *config.ValidatorConfig) error {
    repo, err := factory.NewRepositoryFromConnectionString(ctx, cfg.DatabaseConnection.String())
    if err != nil {
        return err
    }
    defer repo.Close()

    createInfo := validator.CreateInfo{
        CreateInfo: service.CreateInfo{
            Name:                 serviceName,
            LogLevel:             cfg.LogLevel,
            LogColor:             cfg.LogColor,
            EnableSignalHandling: true,
            TelemetryCreate:      true,
            TelemetryAddress:     cfg.TelemetryAddress,
            PollInterval:         cfg.ValidatorPollingInterval,
        },
        Config: *cfg,
    }
    createInfo.Repository = repo

    svc, err := validator.Create(ctx, &createInfo)
    if err != nil {
        return err
    }

    return svc.Serve()
}

// NewCmd creates the command with injectable runner
func NewCmd(runner ServiceRunner) *cobra.Command {
    var (
        logLevel         string
        logColor         bool
        databaseConnection string
        pollInterval     string
        maxStartupTime   string
        telemetryAddress string
        cfg              *config.ValidatorConfig
    )

    cmd := &cobra.Command{
        Use:     "cartesi-rollups-" + serviceName,
        Short:   "Runs cartesi-rollups-" + serviceName,
        Long:    "Runs cartesi-rollups-" + serviceName + " in standalone mode",
        Version: version.BuildVersion,
        PreRunE: func(cmd *cobra.Command, args []string) error {
            var err error
            cfg, err = config.LoadValidatorConfig()
            return err
        },
        RunE: func(cmd *cobra.Command, args []string) error {
            ctx, cancel := context.WithTimeout(context.Background(), cfg.MaxStartupTime)
            defer cancel()
            return runner.Run(ctx, cfg)
        },
    }

    // Register flags
    cmd.Flags().StringVar(&telemetryAddress, "telemetry-address", ":10003", "Health check and metrics address and port")
    cobra.CheckErr(viper.BindPFlag(config.TELEMETRY_ADDRESS, cmd.Flags().Lookup("telemetry-address")))

    cmd.Flags().StringVar(&logLevel, "log-level", "info", "Log level: debug, info, warn or error")
    cobra.CheckErr(viper.BindPFlag(config.LOG_LEVEL, cmd.Flags().Lookup("log-level")))

    cmd.Flags().BoolVar(&logColor, "log-color", true, "Tint the logs (colored output)")
    cobra.CheckErr(viper.BindPFlag(config.LOG_COLOR, cmd.Flags().Lookup("log-color")))

    cmd.Flags().StringVar(&databaseConnection, "database-connection", "",
        "Database connection string in the URL format\n(eg.: 'postgres://user:password@hostname:port/database') ")
    cobra.CheckErr(viper.BindPFlag(config.DATABASE_CONNECTION, cmd.Flags().Lookup("database-connection")))

    cmd.Flags().StringVar(&pollInterval, "poll-interval", "7", "Poll interval")
    cobra.CheckErr(viper.BindPFlag(config.VALIDATOR_POLLING_INTERVAL, cmd.Flags().Lookup("poll-interval")))

    cmd.Flags().StringVar(&maxStartupTime, "max-startup-time", "15", "Maximum startup time in seconds")
    cobra.CheckErr(viper.BindPFlag(config.MAX_STARTUP_TIME, cmd.Flags().Lookup("max-startup-time")))

    return cmd
}

// Cmd is the default command for production use
var Cmd = NewCmd(&DefaultServiceRunner{})

Benefits:

  • Clean separation of concerns
  • Easy to inject mock runner in tests
  • Flag variables are local to the command (no package-level state)
  • Production code unchanged (Cmd still works the same)

Option B: Function Variable (Minimal Change)

If Option A is too invasive, use a replaceable function variable:

// root.go

// RunService is the service execution function, replaceable for testing
var RunService = func(ctx context.Context, cfg *config.ValidatorConfig) error {
    repo, err := factory.NewRepositoryFromConnectionString(ctx, cfg.DatabaseConnection.String())
    if err != nil {
        return err
    }
    defer repo.Close()

    // ... rest of service setup and run
    return svc.Serve()
}

func run(cmd *cobra.Command, args []string) {
    ctx, cancel := context.WithTimeout(context.Background(), cfg.MaxStartupTime)
    defer cancel()
    cobra.CheckErr(RunService(ctx, cfg))
}

Benefits:

  • Minimal code changes
  • Easy to understand

Drawbacks:

  • Package-level mutable state
  • Must remember to restore original function after tests

Test Examples

Testing with Option A (Mock Runner)

// root_test.go

type MockServiceRunner struct {
    RunCalled  bool
    RunConfig  *config.ValidatorConfig
    RunContext context.Context
    RunError   error
}

func (m *MockServiceRunner) Run(ctx context.Context, cfg *config.ValidatorConfig) error {
    m.RunCalled = true
    m.RunConfig = cfg
    m.RunContext = ctx
    return m.RunError
}

func TestValidatorCommand_ExecutesRunner(t *testing.T) {
    viper.Reset()

    mock := &MockServiceRunner{}
    cmd := root.NewCmd(mock)

    buf := new(bytes.Buffer)
    cmd.SetOut(buf)
    cmd.SetErr(buf)
    cmd.SetArgs([]string{
        "--database-connection", "postgres://test:test@localhost:5432/testdb",
        "--log-level", "debug",
    })

    err := cmd.Execute()

    require.NoError(t, err)
    assert.True(t, mock.RunCalled)
    assert.Equal(t, "debug", mock.RunConfig.LogLevel)
}

func TestValidatorCommand_PropagatesRunnerError(t *testing.T) {
    viper.Reset()

    expectedErr := errors.New("connection failed")
    mock := &MockServiceRunner{RunError: expectedErr}
    cmd := root.NewCmd(mock)

    buf := new(bytes.Buffer)
    cmd.SetOut(buf)
    cmd.SetErr(buf)
    cmd.SetArgs([]string{
        "--database-connection", "postgres://test:test@localhost:5432/testdb",
    })

    err := cmd.Execute()

    assert.Error(t, err)
    assert.Contains(t, err.Error(), "connection failed")
}

func TestValidatorCommand_RespectsContextTimeout(t *testing.T) {
    viper.Reset()

    mock := &MockServiceRunner{}
    cmd := root.NewCmd(mock)

    cmd.SetArgs([]string{
        "--database-connection", "postgres://test:test@localhost:5432/testdb",
        "--max-startup-time", "5",
    })

    _ = cmd.Execute()

    deadline, ok := mock.RunContext.Deadline()
    assert.True(t, ok)
    assert.WithinDuration(t, time.Now().Add(5*time.Second), deadline, 1*time.Second)
}

Testing with Option B (Function Variable)

// root_test.go

func TestValidatorCommand_CallsRunService(t *testing.T) {
    viper.Reset()

    // Save original and restore after test
    originalRun := root.RunService
    t.Cleanup(func() { root.RunService = originalRun })

    var capturedConfig *config.ValidatorConfig
    root.RunService = func(ctx context.Context, cfg *config.ValidatorConfig) error {
        capturedConfig = cfg
        return nil
    }

    buf := new(bytes.Buffer)
    cmd := root.Cmd
    cmd.SetOut(buf)
    cmd.SetErr(buf)
    cmd.SetArgs([]string{
        "--database-connection", "postgres://test:test@localhost:5432/testdb",
        "--log-level", "warn",
    })

    err := cmd.Execute()

    require.NoError(t, err)
    require.NotNil(t, capturedConfig)
    assert.Equal(t, "warn", capturedConfig.LogLevel)
}

Additional Test Cases

With the refactoring, we can now test:

PreRunE Configuration Loading

func TestValidatorCommand_PreRunE_InvalidConfig(t *testing.T) {
    viper.Reset()

    mock := &MockServiceRunner{}
    cmd := root.NewCmd(mock)

    // Missing required database connection
    cmd.SetArgs([]string{
        "--log-level", "debug",
    })

    err := cmd.Execute()

    assert.Error(t, err)
    assert.False(t, mock.RunCalled) // Runner should not be called
}

Context Cancellation

func TestValidatorCommand_ContextIsCancelled(t *testing.T) {
    viper.Reset()

    mock := &MockServiceRunner{
        RunError: nil,
    }
    cmd := root.NewCmd(mock)

    cmd.SetArgs([]string{
        "--database-connection", "postgres://test:test@localhost:5432/testdb",
    })

    _ = cmd.Execute()

    // After command returns, context should have been cancelled
    select {
    case <-mock.RunContext.Done():
        // Expected - context was cancelled
    default:
        t.Error("context should have been cancelled after run completed")
    }
}

Implementation Plan

  1. Choose refactoring approach - Team decides between Option A and Option B
  2. Implement for one service - Start with cartesi-rollups-validator as proof of concept
  3. Review and iterate - Adjust pattern based on feedback
  4. Apply to remaining services - Roll out to all 7 services
  5. Add comprehensive tests - Cover execution, errors, context handling

Acceptance Criteria

  • Refactoring approach chosen and documented
  • All services refactored to chosen pattern:
    • cartesi-rollups-advancer
    • cartesi-rollups-claimer
    • cartesi-rollups-evm-reader
    • cartesi-rollups-jsonrpc-api
    • cartesi-rollups-node
    • cartesi-rollups-prt
    • cartesi-rollups-validator
  • Each service has tests for:
    • Runner execution with correct config
    • Runner error propagation
    • Context timeout handling
    • PreRunE validation errors
  • Production behavior unchanged (no functional changes)
  • Tests pass with go test ./cmd/...

References

Labels

testing, services, refactoring, enhancement

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    Status

    No status

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions