Skip to content

CLI, docs & examples#65

Merged
m-peko merged 6 commits into
mainfrom
feature/cli
Mar 21, 2026
Merged

CLI, docs & examples#65
m-peko merged 6 commits into
mainfrom
feature/cli

Conversation

@m-peko

@m-peko m-peko commented Mar 19, 2026

Copy link
Copy Markdown
Collaborator

No description provided.

@m-peko m-peko requested a review from garrettallen14 March 19, 2026 15:31
Comment thread docs/cli/commands.md Outdated
@Lzok

Lzok commented Mar 19, 2026

Copy link
Copy Markdown
Collaborator

Output for the analysis of this PR:

Issues & Suggestions

Security / Correctness

  1. _client.py:89-98 - resolve_model / resolve_benchmark swallow all errors silently. If the API returns a 500 or a malformed response, the user just gets "Model not found" with no indication of the real problem. Consider letting non-404 exceptions propagate or at least logging in verbose mode.

  2. ci.py:62 - datetime.utcnow() is deprecated in Python 3.12+. Use datetime.now(datetime.UTC) instead (or datetime.timezone.utc for 3.8 compat).

  3. _app.py:57-59 - Base URL scheme logic is fragile. scheme = "https" if port in (None, 443) else "http" means --port 8443 (common for HTTPS dev servers) falls back to http. Consider defaulting to https always and adding a --no-tls flag, or at least documenting this behavior.

  4. examples/cli/08_spaces.sh:10 - grep -oP is not portable. -P (PCRE) isn't available on macOS's default grep. Use grep -oE or parse with python3/jq for portability.

Architecture

  1. No tests. This is the biggest gap. There are no unit tests for:

    • The CLI commands (Click has CliRunner for this)
    • The new resource modules (scorers.py, integrations.py, evaluation_spaces.py)
    • The formatter/table rendering
    • The resolver logic

    At minimum, the resource modules and resolvers should have tests before merge.

  2. mypy/pyright relaxation is too broad. The PR disables multiple type-checking rules for the entire cli/ directory:

    [mypy-layerlens.cli.*]
    disallow_untyped_defs = False
    disallow_any_generics = False

    And in pyright:

    reportMissingImports = false, reportFunctionMemberAccess = false, reportCallIssue = false, reportArgumentType = false
    

    This effectively opts the CLI out of type safety. The Click decorators are the only real source of type noise; consider using click-type-test or targeted # type: ignore instead of blanket suppression.

Code Quality

  1. _client.py:13-16 - handle_errors double-wraps context. The decorator applies @click.pass_context, but every command function also receives ctx via @click.pass_context on its own. The ctx.invoke(fn, *args, **kwargs) call works around this, but it's fragile and non-obvious. Consider catching errors at the group level via cli.invoke() override or a custom click.Group subclass instead.

  2. _completions.py - All completion functions swallow exceptions with bare except Exception: pass. This is fine for shell completion (you don't want errors there), but consider logging to stderr when --verbose is set, at least for debugging.

  3. _formatter.py:43 - max() on empty iterable will crash. format_single line 43: max(len(k) for k in d) will crash on empty dict. Add a guard.

  4. Duplicate EVALUATION_COLUMNS definition. It's defined in both commands/bulk.py and commands/evaluate.py. Extract to a shared constants module.

  5. 07_scorer_lifecycle.sh:36-38 - Shell injection risk. $SCORER_NAME contains a space and timestamp, and it's interpolated directly into a Python string inside a subprocess. If the name contained a quote, this would break. Use --arg passing or jq filtering instead.

Minor / Nits

  1. The stratix entry point alias in pyproject.toml is undocumented. The README and docs only reference layerlens. Either document it or remove it.

  2. _banner.py - The ASCII art says "STRATIX" but the CLI command is layerlens. Could be confusing to users.

  3. The --quiet / -q flag suppresses the banner but isn't listed in the docs' global options table.

  4. Several resource methods (evaluation_spaces.py, scorers.py, integrations.py) use bare except Exception: return None around Pydantic model construction. This hides real bugs. At minimum log the exception in verbose mode.

@m-peko m-peko merged commit d139f70 into main Mar 21, 2026
7 checks passed
@m-peko m-peko deleted the feature/cli branch March 21, 2026 08:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants