Verify: Add 5-class node classification (Source/Input/Process/Output/Claim) #147

@ywatanabe1989

Summary

Add semantic role classification to DAG nodes in the verify module: Source, Input, Processing, Output, Claim.

Current State

The file_hashes table tracks files but does not classify them by pipeline role. The three existing states (VERIFIED, FAILED, UNKNOWN) describe a file's integrity, not its function in the pipeline.

Proposed Changes

  1. Add role column to file_hashes schema: source | input | process | output | claim
  2. Auto-infer role from file type/location (.py scripts → source/process, .csv/.npy → input/output)
  3. Surface role in Mermaid DAG visualization (shape/color per class)
  4. Enable role-based severity analysis in BPV output
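
Changes 1 and 2 could be sketched as follows. This is a minimal illustration, not the verify module's actual code: it assumes file_hashes lives in SQLite, and the helper names (`add_role_column`, `infer_role`) and the `upstream_roles` argument are hypothetical. The rule used to split source from process and input from output is also an assumption, chosen to match the examples in the class table.

```python
import sqlite3
from pathlib import PurePosixPath

ROLES = ("source", "input", "process", "output", "claim")
SCRIPT_SUFFIXES = {".py"}
# .yaml/.yml are an assumption based on the config.yaml example.
DATA_SUFFIXES = {".csv", ".npy", ".yaml", ".yml"}


def add_role_column(conn):
    """Add a nullable role column to file_hashes if it is not there yet."""
    cols = [row[1] for row in conn.execute("PRAGMA table_info(file_hashes)")]
    if "role" not in cols:
        allowed = ", ".join(f"'{r}'" for r in ROLES)
        conn.execute(
            "ALTER TABLE file_hashes "
            f"ADD COLUMN role TEXT CHECK (role IN ({allowed}))"
        )


def infer_role(path, upstream_roles=()):
    """Guess a role from the file suffix and the roles of upstream nodes.

    Assumed heuristic: a script with no tracked upstream nodes is a source,
    otherwise a process; a data file written by a process is an output,
    otherwise an input. Returns None when no rule applies.
    """
    suffix = PurePosixPath(path).suffix.lower()
    if suffix in SCRIPT_SUFFIXES:
        return "process" if upstream_roles else "source"
    if suffix in DATA_SUFFIXES:
        return "output" if "process" in upstream_roles else "input"
    return None
```

Leaving the column nullable lets existing rows survive the migration, and returning None instead of guessing leaves room for a manual override. Claims have no file suffix to infer from, so they would need to be registered explicitly.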

Class Definitions

| Class | Description | Examples |
|---|---|---|
| Source | Data acquisition scripts | 01_source.py |
| Input | Raw data, configuration | source.csv, config.yaml |
| Processing | Transform/analysis scripts | 02_preprocess.py |
| Output | Intermediate/final data | clean_A.csv, results.csv |
| Claim | Paper assertions | p=0.003 (L.42), Figure 1 |
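
One way to surface these classes in the Mermaid DAG is a shape and a style class per role. The sketch below is only an illustration: the function name `to_mermaid`, the shape assignments, and the colors are placeholders, not an agreed design.

```python
# Opening/closing Mermaid flowchart delimiters per role (placeholder choices).
SHAPES = {
    "source": ("[[", "]]"),   # subroutine box
    "input": ("[(", ")]"),    # cylinder
    "process": ("[", "]"),    # rectangle
    "output": ("[/", "/]"),   # parallelogram
    "claim": ("{{", "}}"),    # hexagon
}
# Placeholder fill colors, one per role.
COLORS = {
    "source": "#cfe8ff",
    "input": "#e3f6d8",
    "process": "#fff3c4",
    "output": "#ffe0cc",
    "claim": "#f5d0f0",
}


def to_mermaid(nodes, edges):
    """Render a role-annotated DAG as Mermaid flowchart text.

    nodes: dict mapping a node label to its role.
    edges: iterable of (source_label, target_label) pairs.
    """
    ids = {label: f"n{i}" for i, label in enumerate(nodes)}
    lines = ["flowchart TD"]
    for label, role in nodes.items():
        left, right = SHAPES[role]
        safe = label.replace('"', "#quot;")  # keep quoted labels valid
        lines.append(f'    {ids[label]}{left}"{safe}"{right}:::{role}')
    for src, dst in edges:
        lines.append(f"    {ids[src]} --> {ids[dst]}")
    for role, color in COLORS.items():
        lines.append(f"    classDef {role} fill:{color}")
    return "\n".join(lines)
```

Labels are wrapped in quotes so that claim text such as p=0.003 (L.42), which contains parentheses, does not collide with Mermaid's shape syntax.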

Motivation

  • Enables severity analysis: Source-level tampering → full invalidation; Output-level tampering → invalidation of only the specific Claims that depend on that output
  • Provides a framework-agnostic vocabulary that readers can map to their own pipelines
  • Aligns verify module with figrecipe Schematic's node classes (figrecipe#95)
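
The severity idea in the first bullet amounts to a downstream traversal of the DAG: every Claim reachable from a tampered node is affected. A minimal sketch, assuming the DAG is available as a list of edges plus a label-to-role mapping (both hypothetical structures, as is the name `affected_claims`):

```python
from collections import deque


def affected_claims(edges, roles, tampered):
    """Return the sorted Claim nodes reachable downstream of `tampered`."""
    children = {}
    for src, dst in edges:
        children.setdefault(src, []).append(dst)

    # Breadth-first walk over everything downstream of the tampered node.
    seen = {tampered}
    queue = deque([tampered])
    while queue:
        node = queue.popleft()
        for child in children.get(node, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return sorted(n for n in seen if roles.get(n) == "claim")


# Hypothetical pipeline wired from the example names in the class table.
edges = [
    ("01_source.py", "source.csv"),
    ("source.csv", "02_preprocess.py"),
    ("02_preprocess.py", "clean_A.csv"),
    ("02_preprocess.py", "results.csv"),
    ("clean_A.csv", "p=0.003 (L.42)"),
    ("results.csv", "Figure 1"),
]
roles = {
    "01_source.py": "source",
    "source.csv": "input",
    "02_preprocess.py": "process",
    "clean_A.csv": "output",
    "results.csv": "output",
    "p=0.003 (L.42)": "claim",
    "Figure 1": "claim",
}

print(affected_claims(edges, roles, "01_source.py"))  # ['Figure 1', 'p=0.003 (L.42)']
print(affected_claims(edges, roles, "clean_A.csv"))   # ['p=0.003 (L.42)']
```

In this toy graph, tampering at the Source reaches every Claim, while tampering with a single Output reaches only the Claim derived from it.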

Labels: enhancement
