
Design: Generic PyATS test type and open-jobfile execution #730

@oboehmer

Description

TL;DR

nac-test has no supported extension point for custom PyATS tests that don't belong to a specific architecture (SD-WAN, ACI, CC, etc.). I propose a new generic test type — identified by inheriting from a new NACCustomTestBase — that runs through the full nac-test pipeline (job execution, progress tracking, HTML report, xunit, combined dashboard) but requires no controller, no device inventory, and no SSH broker. Results appear as "PyATS Generic" alongside API, D2D, and Robot results. A second phase would add support for arbitrary PyATS jobfiles (Robot parity), and a third phase would add composable opt-in primitives (broker, device resolver, controller session) for more advanced custom tests.


Background

nac-test currently runs three kinds of tests: PyATS API tests (controller REST validation), PyATS D2D/SSH tests (direct device SSH), and Robot Framework tests. Each integrates fully with our execution, reporting, and result-aggregation pipeline.

Two gaps have emerged:

  1. No extension point for custom tests. End-users and delivery teams who do not use or cannot use the Quicksilver pipeline to generate architecture-specific tests have no supported way to add their own PyATS test cases that integrate with the nac-test pipeline — output formatting, progress tracking, HTML reports, xunit, combined dashboard. The only available base classes carry controller and credential machinery that may be unnecessary or unavailable in their context.

  2. No escape hatch for arbitrary PyATS content. Power users with existing PyATS jobfiles or test scripts have no way to run them inside nac-test without adapting them to our base classes. This contrasts with Robot Framework, where any valid .robot file can be executed without modification.


Use cases

UC-1 — End-user and delivery team custom tests (primary motivation)
Delivery teams and end-users want to add their own site-specific or deployment-specific PyATS checks alongside the architecture-generated test suite, without needing Quicksilver or an architecture-specific base class. Examples: validating a post-change configuration invariant from the data model, running a custom health check that doesn't fit the standard NRFU pattern, or augmenting an existing test suite with project-specific assertions.

UC-2 — Internal smoke tests and pipeline validation (#722)
Tests that exercise the full nac-test execution pipeline (job generation, archive extraction, report generation, result aggregation) with known-outcome fixtures — without requiring any live controller or device. A supported generic type makes these first-class citizens runnable in CI without external infrastructure.

UC-3 — Open PyATS jobfiles (Robot-parity, phase 2)
Users with existing PyATS job files written against vanilla PyATS (no nac-test base classes) should be able to run them inside nac-test and have results appear in the combined summary and dashboard. At minimum, aggregate pass/fail/error counts and a link to the native PyATS HTML report would be shown; per-test detail depends on what results.json those tests produce (see #607).

UC-4 — Composable primitive access (phase 3)
Custom test authors who want to leverage nac-test infrastructure (connection broker, device resolver, controller detection, retry/caching) should be able to opt in selectively without inheriting the full controller setup. Deferred to phase 3.


What we already have (relevant prior art)

The class hierarchy in nac_test.pyats_core.common is relevant context:

aetest.Testcase
└─ NACTestBase          (base_test.py)
    │  setup(): controller detection, credential loading,
    │           HTTP pool, result_collector, batching_reporter
    │  load_data_model(), record_result(), retry/caching helpers
    │
    └─ SSHTestBase       (ssh_base_test.py)
           setup(): broker client, device context, SSH command execution

The non-controller parts of NACTestBase.setup() — load_data_model(), _initialize_result_collector(), _initialize_batching_reporter() — are our own code. This means we have full freedom to restructure the hierarchy to extract a thinner base class that provides only those reporting parts, without inheriting controller setup. This restructure is the prerequisite for NACCustomTestBase.

The orchestrator's _run_tests_async() currently gates execution on test type via explicit if api_tests / if d2d_tests branches. Pre-flight (validate_environment()) runs unconditionally today; device inventory discovery and the connection broker are D2D-only. The generic type must not trigger any of those.


Design areas and open decisions

1. Class hierarchy restructure

NACTestBase today conflates two concerns: reporting infrastructure (result collector, batching reporter, data model loading) and controller setup (detection, credentials, HTTP pool). A new base class needs only the former.

Proposed structure:

aetest.Testcase
└─ NACBaseTest              (NEW — thin base, reporting only)
    │  setup(): load_data_model(), _initialize_result_collector(),
    │           _initialize_batching_reporter()
    │  record_result(), data_model access
    │
    ├─ NACTestBase           (existing — adds controller setup, unchanged)
    │   └─ SSHTestBase       (existing — adds broker/SSH, unchanged)
    │
    └─ NACCustomTestBase     (NEW — generic type, no controller)

NACBaseTest carries everything needed for result collection and HTML reporting. NACTestBase extends it with controller machinery — no change for existing tests or their consumers in nac_test_pyats_common. NACCustomTestBase extends NACBaseTest directly, adding nothing in phase 1 beyond being the discoverable base class for the generic type.
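A minimal sketch of the proposed split, using a stand-in for aetest.Testcase so it runs standalone. The method names come from the diagram above; the method bodies are illustrative placeholders, not nac-test's actual code:

```python
class Testcase:
    """Stand-in for pyats.aetest.Testcase, so this sketch is self-contained."""


class NACBaseTest(Testcase):
    """NEW thin base: reporting infrastructure only, no controller."""

    def setup(self) -> None:
        self.load_data_model()
        self._initialize_result_collector()
        self._initialize_batching_reporter()

    def load_data_model(self) -> None:
        self.data_model = {}          # placeholder: real code loads the NAC data model

    def _initialize_result_collector(self) -> None:
        self.result_collector = []    # placeholder: real code wires TestResultCollector

    def _initialize_batching_reporter(self) -> None:
        self.batching_reporter = object()  # placeholder for BatchingReporter


class NACTestBase(NACBaseTest):
    """Existing base, now extending the thin base with controller machinery."""

    def setup(self) -> None:
        super().setup()               # reporting init from NACBaseTest
        self.controller = "detected"  # placeholder: detection, credentials, HTTP pool


class NACCustomTestBase(NACBaseTest):
    """Generic type: phase 1 adds nothing beyond being the discoverable base."""
```

The key property: NACCustomTestBase.setup() initialises reporting without ever touching controller state, while NACTestBase consumers see no behavioural change.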

Alternative: Keep NACTestBase as-is, extract the reporting helpers into a mixin, and compose NACCustomTestBase from aetest.Testcase + that mixin. More flexible for phase-3 composition but higher structural change upfront.

Decision needed: Which restructuring approach? Are there constraints from nac_test_pyats_common consumers that would make either approach problematic?

2. NACCustomTestBase public interface

Minimum viable interface for phase 1:

class NACCustomTestBase(NACBaseTest):
    """Base class for generic/custom PyATS tests.

    Provides: data model access, result collection, HTML reporting.
    Does not require: controller, device inventory, SSH broker.

    Module-level constants (recommended):
        TITLE: str          — test title shown in HTML report
        DESCRIPTION: str    — markdown description (optional)
        SETUP: str          — markdown setup notes (optional)
        PROCEDURE: str      — markdown procedure (optional)
        PASS_FAIL_CRITERIA: str — markdown criteria (optional)
    """
    @aetest.setup
    def setup(self) -> None: ...  # calls NACBaseTest reporting init only

    def record_result(self, status: ResultStatus, message: str) -> None: ...
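To make the interface concrete, here is a hypothetical phase-1 custom test. The base class and ResultStatus are local stand-ins so the sketch runs standalone (real import paths and the aetest decorators are omitted), and the VLAN check itself is invented for illustration:

```python
from enum import Enum


class ResultStatus(Enum):
    """Stand-in for nac-test's result status values (assumed names)."""
    PASSED = "passed"
    FAILED = "failed"


class NACCustomTestBase:
    """Stand-in base: the real one loads the data model in setup()."""

    def __init__(self, data_model: dict) -> None:
        self.data_model = data_model
        self.recorded: list[tuple[ResultStatus, str]] = []

    def record_result(self, status: ResultStatus, message: str) -> None:
        self.recorded.append((status, message))


TITLE = "Site-specific management VLAN invariant"   # shown in the HTML report


class VlanInvariantTest(NACCustomTestBase):
    groups = ["custom", "invariants"]  # same groups convention as existing types,
                                       # so --include-tags / --exclude-tags apply

    def check_management_vlan(self) -> None:
        # Hypothetical data-model keys, purely for illustration.
        sites = self.data_model.get("sites", [])
        missing = [s["name"] for s in sites if "mgmt_vlan" not in s]
        if missing:
            self.record_result(ResultStatus.FAILED, f"missing mgmt VLAN: {missing}")
        else:
            self.record_result(ResultStatus.PASSED, "all sites define a mgmt VLAN")
```

In a real module the check method would carry an @aetest.test decorator and the base class would come from the package chosen in the "Package location" question above.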

Open questions:

  • Package location: Both nac_test and nac_test_pyats_common are external-facing. NACCustomTestBase in nac_test_pyats_common is the more natural home for external test authors, but it requires the common library to depend on nac-test's reporting infrastructure (TestResultCollector, BatchingReporter), or those pieces need to be extracted into a shared package. nac_test is simpler but less discoverable. Which is the right home?
  • Public name: NACCustomTestBase communicates "extensibility". Alternatives: NACGenericTestBase, CustomTestBase, NACExtTestBase. This is the name external test authors will import — worth choosing deliberately.
  • Metadata requirements: Should TITLE be the only required module-level constant, with the rest optional? Or should all metadata fields be optional for custom tests?

3. Test type discovery

Discovery maps base class names → type strings via BASE_CLASS_MAPPING. Adding NACCustomTestBase → "generic" is a small additive change. The categorisation logic that splits discovered files into execution buckets needs a corresponding third path.

Generic test files live in the same test directory tree as API and D2D tests and are discovered alongside them via the same base-class scanning mechanism — no separate directory or CLI flag required for phase 1.
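For concreteness, the additive change could look roughly like this; the actual shape of BASE_CLASS_MAPPING and the categorisation helper in nac-test may differ:

```python
from typing import Optional

# Hypothetical shape of the discovery mapping (base class name -> type string).
BASE_CLASS_MAPPING = {
    "NACTestBase": "api",
    "SSHTestBase": "d2d",
    "NACCustomTestBase": "generic",  # the new additive entry
}


def categorise(base_class_name: str) -> Optional[str]:
    """Bucket a discovered test file by the base class it inherits from.

    Returns None for files that inherit from no known base, which the
    phase-2 open-jobfile discussion would need to handle differently.
    """
    return BASE_CLASS_MAPPING.get(base_class_name)
```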

Tag-based filtering (--include-tags / --exclude-tags) should work uniformly: custom tests should support the same groups = [...] class-variable convention as existing types.

Separate opportunity: D2D tests currently use controller detection to infer the architecture (SD-WAN, CC, IOSXE), even though D2D itself doesn't require a reachable controller at test time. Enhancing test discovery to also infer architecture type from the base class would be more robust and remove an implicit dependency. This deserves its own ticket but is worth flagging as related work.

4. Execution sequencing and gating

The orchestrator needs to know which setup steps each type requires. Currently this is implicit. With a third type it should become explicit.

Step                          API   D2D   Generic
Controller env validation     yes   yes   no
Device inventory discovery    no    yes   no
Connection broker             no    yes   no
Job file generation           yes   yes   yes (same as API)
Archive / report generation   yes   yes   yes

Execution timing — open question: Should generic tests run in parallel with API and D2D tests, or sequentially after them? Parallel is consistent with how API and D2D already run concurrently, and there is no inherent ordering dependency. Sequential-after could be useful if generic tests are expected to validate outputs produced by the API/D2D run, but that is a niche use case. Recommendation: parallel by default, with no sequencing guarantee, consistent with the existing model.

Structural proposal: Introduce a TestTypeStrategy per type (or equivalent capability flags) so the orchestrator's dispatch loop becomes type-agnostic — new types declare their capabilities and the loop does not grow another if generic_tests branch. This is the right investment before adding a third type, given more types are anticipated.
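A sketch of what such a strategy could look like, using capability flags that mirror the gating table above; the class and field names are illustrative, not an existing nac-test API:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TestTypeStrategy:
    """Per-type capability flags; one instance per supported test type."""
    name: str
    needs_controller_env: bool
    needs_device_inventory: bool
    needs_connection_broker: bool


# Values taken from the gating table above.
STRATEGIES = {
    "api": TestTypeStrategy("api", True, False, False),
    "d2d": TestTypeStrategy("d2d", True, True, True),
    "generic": TestTypeStrategy("generic", False, False, False),
}


def required_setup_steps(test_types: list[str]) -> set[str]:
    """Union of setup steps across the types present in a run.

    The orchestrator asks each type what it needs, so adding a fourth
    type never grows another `if generic_tests` branch in the loop.
    """
    steps: set[str] = set()
    for type_name in test_types:
        strategy = STRATEGIES[type_name]
        if strategy.needs_controller_env:
            steps.add("controller_env_validation")
        if strategy.needs_device_inventory:
            steps.add("device_inventory")
        if strategy.needs_connection_broker:
            steps.add("connection_broker")
    return steps
```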

5. Results and reporting

Adding generic requires the following additive changes:

  • generic: TestResults | None field on PyATSResults and CombinedResults
  • Entry in FRAMEWORK_METADATA in combined_generator.py with display name and report path (pyats_results/generic/html_reports/summary_report.html)
  • if results.generic is not None block in xunit_merger.py and _print_execution_summary()
  • NAC_TEST_TYPE=generic env var set during execution so TestResultCollector writes to a type-specific temp directory, preventing race conditions with concurrent API/D2D runs
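The first and third bullets can be sketched together; the field names on TestResults are assumptions, and the None-guard mirrors the pattern the xunit merger and summary printer would follow:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TestResults:
    """Stand-in aggregate; real field names in nac-test may differ."""
    passed: int = 0
    failed: int = 0
    errors: int = 0


@dataclass
class PyATSResults:
    api: Optional[TestResults] = None
    d2d: Optional[TestResults] = None
    generic: Optional[TestResults] = None  # the new additive field


def merged_totals(results: PyATSResults) -> TestResults:
    """Fold per-type results into one total, skipping types that didn't run."""
    total = TestResults()
    for part in (results.api, results.d2d, results.generic):
        if part is not None:  # the `if results.generic is not None` guard
            total.passed += part.passed
            total.failed += part.failed
            total.errors += part.errors
    return total
```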

Display name decision needed: "PyATS Generic" is the working title. "PyATS Custom", "Custom Tests", "PyATS Extension" are alternatives. This label appears in the combined dashboard and as the xunit suite name — visible to end-users and CI integrations consuming the xunit file.

6. Open PyATS jobfile support (phase 2)

The Robot-parity goal: allow users to run arbitrary PyATS content without any nac-test base class inheritance.

For discovery, the intent is to use the same mechanism as for generic/api/d2d — scanning the test directory tree — rather than requiring a separate CLI flag. What constitutes a "discoverable" open jobfile test (file naming convention, marker, directory convention) is an open question.

The relationship between PyATS jobfiles and test scripts is worth discussing explicitly. For our current test types (API, D2D, generic), nac-test generates the jobfile at runtime from a list of discovered test scripts. For open PyATS tests, the jobfile may already exist or may not exist at all — it's unclear whether the generated-jobfile approach can be sustained, or whether a different execution path is needed. This is deliberately left open for phase 2 discussion.

Reporting would draw on results.json from the PyATS archive, which ArchiveInspector already reads. Full per-test detail depends on #607; without it, reporting degrades to aggregate counts plus a link to the native PyATS HTML report.
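Without #607, the degraded aggregate-counts path could be as simple as a recursive fold over results.json. The traversal below assumes a hypothetical layout in which testcase records carry a "result" string; the real schema must be confirmed against what ArchiveInspector reads:

```python
def count_results(node, counts=None):
    """Fold a (hypothetical) results.json structure into aggregate counts.

    Walks arbitrarily nested dicts/lists and tallies every record whose
    "result" value is one of the known outcome strings.
    """
    if counts is None:
        counts = {"passed": 0, "failed": 0, "errored": 0}
    if isinstance(node, dict):
        result = node.get("result")
        if isinstance(result, str) and result in counts:
            counts[result] += 1
        for value in node.values():
            count_results(value, counts)
    elif isinstance(node, list):
        for item in node:
            count_results(item, counts)
    return counts
```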

7. Composable primitives (phase 3)

Custom test authors who need nac-test infrastructure should be able to opt in selectively:

class MyTest(NACCustomTestBase, SSHCapableMixin):
    # gets: data model, result collector, broker client, SSH execution
    # does not get: controller detection, HTTP pool
    ...

NACCustomTestBase's phase-1 design should not foreclose this — avoid a setup() structure that would prevent mixin composition later.
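One concrete way to keep that door open is cooperative super().setup() chaining, sketched here with stand-in classes. Note the base-class order: the mixin must come first in the MRO so its setup() is reached, and the base's setup() must not assume it is the end of the chain. These are exactly the structural details phase 1 should not foreclose:

```python
class NACCustomTestBase:
    """Stand-in: the real phase-1 base does reporting init in setup()."""

    def setup(self) -> None:
        self.reporting_ready = True  # placeholder for reporting init


class SSHCapableMixin:
    """Stand-in phase-3 mixin: adds broker/SSH init on top of the base."""

    def setup(self) -> None:
        super().setup()          # cooperative: let the next class in MRO run
        self.broker_ready = True  # placeholder for broker-client init


class MyTest(SSHCapableMixin, NACCustomTestBase):
    # MRO: MyTest -> SSHCapableMixin -> NACCustomTestBase, so one setup()
    # call initialises both reporting and broker state.
    pass
```

If NACCustomTestBase.setup() were written so that subclass and mixin setup could never chain through it (for example by being the aetest entry point with no super() path), this composition would break; that is the constraint to respect in phase 1.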


Phasing summary

Phase 1 — Generic type
    Scope: NACBaseTest hierarchy refactor, NACCustomTestBase, discovery, execution gating, PyATSResults/CombinedResults field, reporting entry, xunit
    Blocking dependencies: none

Phase 2 — Open jobfiles
    Scope: discovery of vanilla PyATS tests, execution path discussion (jobfile generation vs. provided jobfile), results.json-based reporting
    Blocking dependencies: #607 (for full per-test detail)

Phase 3 — Composable primitives
    Scope: SSHCapableMixin, APICapableMixin, public primitive API in nac_test_pyats_common
    Blocking dependencies: phase 1

Questions for the team

  1. Class hierarchy: Extract NACBaseTest as a new thin intermediate, or use mixin composition? Constraints from nac_test_pyats_common consumers?
  2. Package location: Should NACCustomTestBase live in nac_test or nac_test_pyats_common? What are the dependency implications?
  3. Public name: NACCustomTestBase or another name?
  4. Metadata requirements: TITLE mandatory only, rest optional — or fully optional?
  5. Execution timing: Parallel with API/D2D, or sequential after?
  6. TestTypeStrategy refactor: Prerequisite to phase 1, or deferred until before phase 2?
  7. Display name / xunit suite name: "PyATS Generic", "PyATS Custom", or something else?
  8. Open jobfiles (phase 2): Same discovery mechanism as other types — what makes a file discoverable as an open PyATS test? Can the generated-jobfile approach be sustained, or is a different execution path needed?


    Labels

    decision: Architectural or design decisions needed
    discussion: Issues requiring team discussion and input
    enhancement: New feature or request
    new-infra: Issues related to the new pyats/robot infra currently under development
    pyats: PyATS framework related
