Feature Request: Add Structured Tagging & Labeling Layer to AMB for Distillery / Model Training #2

@zzhang82

Summary

Add a structured tagging and labeling layer to AMB so it can serve not only as a runtime memory bridge, but also as a high-quality source of distillery, evaluation, and specialty-model training data.

The goal is to make AMB-generated memories and interaction traces more useful, auditable, filterable, and safe for downstream training workflows.

Background

AMB is being defined as the Agent Memory Bridge: a runtime memory architecture that selects, retrieves, and bridges relevant context during agent interactions.

As we explore building specialized model layers on top of open-source/open-weight models, AMB can become one of the most important data sources for training. However, raw memory records or chat logs are not sufficient. We need structured metadata that describes what the data is, where it came from, how reliable it is, how fresh it is, and whether it can safely be used for training.

Problem

Current AMB memory/context records may support basic memory bridging, but they likely do not yet support the richer labels needed for distillery-quality training data.

Without structured tags and labels, we risk:

  • Mixing stable principles with temporary state
  • Training on stale or superseded information
  • Training on private or sensitive data by accident
  • Losing feedback signals from user confirmations or corrections
  • Making it hard to build eval datasets from real AMB traces
  • Making model training noisy, unsafe, or difficult to audit

Proposed Feature

Add a structured tagging and labeling system to AMB records and AMB interaction traces.

Example AMB record:

{
  "memory_id": "amb_001",
  "content": "AMB should be treated as a runtime memory bridge, not an authority/rules layer.",
  "type": "architecture_principle",
  "scope": "WandersCop",
  "source": "user_confirmed",
  "confidence": 0.95,
  "stability": "high",
  "freshness": "current",
  "privacy": "internal",
  "training_eligible": true,
  "training_use": ["architecture_reasoning", "retrieval_eval", "style_alignment"],
  "tags": ["AMB", "memory", "runtime", "distillery"],
  "created_at": "...",
  "last_confirmed_at": "...",
  "supersedes": []
}
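
A minimal sketch of how the example record could be typed in Python, assuming a dataclass representation. The class name `AmbRecord` and the range check on `confidence` are illustrative assumptions and not an existing AMB API; the field names mirror the example record, and the elided timestamps are left unset.

```python
from dataclasses import dataclass, field, asdict
from typing import List, Optional


@dataclass
class AmbRecord:
    """Hypothetical typed view of the example record; not an existing AMB class."""
    memory_id: str
    content: str
    type: str
    scope: str
    source: str
    confidence: float
    stability: str
    freshness: str
    privacy: str
    training_eligible: bool
    training_use: List[str] = field(default_factory=list)
    tags: List[str] = field(default_factory=list)
    created_at: Optional[str] = None          # timestamp format is left open
    last_confirmed_at: Optional[str] = None
    supersedes: List[str] = field(default_factory=list)  # ids this record replaces

    def __post_init__(self) -> None:
        # Assumption: confidence is a score in the closed interval [0, 1].
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0 and 1")


record = AmbRecord(
    memory_id="amb_001",
    content="AMB should be treated as a runtime memory bridge, not an authority/rules layer.",
    type="architecture_principle",
    scope="WandersCop",
    source="user_confirmed",
    confidence=0.95,
    stability="high",
    freshness="current",
    privacy="internal",
    training_eligible=True,
    training_use=["architecture_reasoning", "retrieval_eval", "style_alignment"],
    tags=["AMB", "memory", "runtime", "distillery"],
)
as_dict = asdict(record)  # plain dict, ready for json.dumps
```

Using `default_factory` for the list fields keeps each record's `tags` and `supersedes` independent of every other record's.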

Suggested Label Categories

1. Type Labels

Examples:

  • architecture_principle
  • project_state
  • user_preference
  • workflow_pattern
  • tool_usage_trace
  • decision_record
  • correction
  • temporary_context

2. Source Labels

Examples:

  • user_confirmed
  • internal_doc
  • tool_result
  • model_generated
  • inferred
  • external_source

3. Stability / Freshness Labels

Examples:

  • static_principle
  • slowly_changing
  • dynamic_state
  • temporary
  • stale
  • superseded

4. Quality Labels

Examples:

  • accepted_by_user
  • corrected_by_user
  • rejected
  • needs_review
  • gold_sample
  • uncertain

5. Privacy / Training Eligibility Labels

Examples:

  • trainable
  • eval_only
  • retrieval_only
  • do_not_train
  • private_internal
  • sensitive
  • needs_anonymization

6. Task Labels

Examples:

  • architecture_reasoning
  • memory_update
  • tool_routing
  • internal_docs_qa
  • code_planning
  • customer_support
  • product_strategy
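
One way to keep these vocabularies closed is to model each category as an enum and reject unknown values at write time. The sketch below covers only the source and quality categories for brevity; the class names and the `parse_label` helper are hypothetical, and the values are copied from the lists above.

```python
from enum import Enum


class SourceLabel(str, Enum):
    USER_CONFIRMED = "user_confirmed"
    INTERNAL_DOC = "internal_doc"
    TOOL_RESULT = "tool_result"
    MODEL_GENERATED = "model_generated"
    INFERRED = "inferred"
    EXTERNAL_SOURCE = "external_source"


class QualityLabel(str, Enum):
    ACCEPTED_BY_USER = "accepted_by_user"
    CORRECTED_BY_USER = "corrected_by_user"
    REJECTED = "rejected"
    NEEDS_REVIEW = "needs_review"
    GOLD_SAMPLE = "gold_sample"
    UNCERTAIN = "uncertain"


def parse_label(enum_cls, raw):
    """Return the enum member for `raw`, or raise ValueError listing allowed values."""
    try:
        return enum_cls(raw)
    except ValueError:
        allowed = ", ".join(member.value for member in enum_cls)
        raise ValueError(
            f"unknown {enum_cls.__name__} {raw!r}; expected one of: {allowed}"
        ) from None
```

For example, `parse_label(SourceLabel, "user_confirmed")` returns `SourceLabel.USER_CONFIRMED`, while a value outside the list raises a `ValueError` that names the allowed options.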

Training Usage Rules

AMB should help determine how each record can be used downstream.

Trainable

Stable and reusable patterns, such as:

  • Architecture principles
  • Repeated workflow patterns
  • Preferred response structures
  • Tool-routing patterns
  • Generalized reasoning patterns

Eval-only

Useful for testing behavior but not suitable for model-weight training.

Examples:

  • Sensitive internal examples
  • Real project scenarios
  • Private customer-like workflows

Retrieval-only

Should stay in AMB, a state store, or a database rather than being absorbed into model weights.

Examples:

  • Live project state
  • Recent decisions
  • Customer-specific information
  • User-specific private context

Do-not-train

Should not be used for model training.

Examples:

  • Sensitive personal data
  • Confidential data
  • Temporary state
  • Credentials or access information
  • Legally restricted information
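
The four buckets above could be derived from a record's labels with a small rule function. This is a sketch under stated assumptions: `classify_training_use` is a hypothetical name, the rule order (most restrictive first) and the conservative `retrieval_only` fallback are design guesses, and it reads the label vocabulary from the Suggested Label Categories lists, so values such as `"high"` from the example record would need mapping first.

```python
DO_NOT_TRAIN = "do_not_train"
RETRIEVAL_ONLY = "retrieval_only"
EVAL_ONLY = "eval_only"
TRAINABLE = "trainable"


def classify_training_use(record: dict) -> str:
    """Map a record's labels to one downstream usage bucket (illustrative rules)."""
    rtype = record.get("type")
    stability = record.get("stability")
    privacy = record.get("privacy")

    # Most restrictive bucket first: sensitive data and temporary state.
    if privacy == "sensitive" or stability == "temporary" or rtype == "temporary_context":
        return DO_NOT_TRAIN

    # Live state stays in the store. Assumption: stale or superseded records
    # are also kept out of weights; whether they are still retrieved is a
    # separate decision.
    if stability in {"dynamic_state", "stale", "superseded"} or rtype == "project_state":
        return RETRIEVAL_ONLY

    # Private but useful examples can test behavior without entering weights.
    if privacy in {"private_internal", "needs_anonymization"}:
        return EVAL_ONLY

    # Stable, reusable patterns are the only records promoted to training.
    if rtype in {"architecture_principle", "workflow_pattern"} and stability in {
        "static_principle",
        "slowly_changing",
    }:
        return TRAINABLE

    # Anything unrecognized falls back to the store, not the training set.
    return RETRIEVAL_ONLY
```

Checking the restrictive cases first means a record that is both stable and sensitive is never classified as trainable.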

Proposed AMB Direction

AMB v1:
Runtime memory bridge

AMB v1.4 / v1.5:
Runtime memory bridge + dedicated state/memory separation

AMB v2:
Runtime memory bridge + labeled trace store + training-data refinery

Why This Matters

This feature is foundational for a future distillery pipeline.

With structured AMB tags, we can later build:

  • Better retrieval
  • Better memory selection
  • Training datasets
  • Evaluation datasets
  • Router models
  • Verifier models
  • Fine-tuning pipelines
  • Safer internal model training
  • Better distinction between stable knowledge and dynamic state

Long-term architecture:

User interaction
  ↓
AMB runtime context selection
  ↓
Model / agent response
  ↓
User feedback or correction
  ↓
Labeled AMB trace
  ↓
Distillery pipeline
  ↓
Training / eval dataset
  ↓
Specialty model improvement
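
The "user feedback or correction" to "labeled AMB trace" step in the flow above might look like the following sketch. The feedback values (`"accept"`, `"correct"`, `"reject"`), the `quality` field name, and the `label_trace` function are all hypothetical; only the quality label values come from this proposal.

```python
from typing import Optional

# Hypothetical feedback events mapped to the quality labels proposed above.
FEEDBACK_TO_QUALITY = {
    "accept": "accepted_by_user",
    "correct": "corrected_by_user",
    "reject": "rejected",
}


def label_trace(trace: dict, feedback: Optional[str]) -> dict:
    """Return a copy of `trace` with a quality label derived from user feedback."""
    labeled = dict(trace)  # shallow copy so the raw trace is left untouched
    # Missing or unrecognized feedback is queued for review rather than trusted.
    labeled["quality"] = FEEDBACK_TO_QUALITY.get(feedback, "needs_review")
    return labeled
```

Returning a copy keeps the raw trace available for audit alongside its labeled version.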

Acceptance Criteria

  • AMB memory records support structured metadata fields.
  • AMB traces can be labeled by type, source, stability, freshness, quality, privacy, and training eligibility.
  • Records can be marked as trainable, eval_only, retrieval_only, or do_not_train.
  • AMB supports supersession or stale-state handling.
  • User-confirmed corrections can update confidence and quality labels.
  • Training/export pipelines can filter records based on eligibility labels.
  • Sensitive/private records are excluded from training by default.
  • Minimal schema is implemented first, with room to expand later.
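
Two of the criteria above, filtering by eligibility and excluding records by default, can be sketched together with supersession handling. The `eligibility` field name and the `export_for_training` function are assumptions for illustration; `memory_id` and `supersedes` follow the example record.

```python
def export_for_training(records, allowed=("trainable",)):
    """Return records that are eligible for export and not superseded.

    Records with no eligibility label are treated as do_not_train, so
    unlabeled or private data is excluded unless explicitly opted in.
    """
    # Every id named in some record's `supersedes` list is considered replaced.
    superseded_ids = {
        old_id for record in records for old_id in record.get("supersedes", [])
    }

    exported = []
    for record in records:
        if record.get("memory_id") in superseded_ids:
            continue  # a newer record replaces this one
        if record.get("eligibility", "do_not_train") not in allowed:
            continue  # default-deny: missing label means excluded
        exported.append(record)
    return exported


records = [
    {"memory_id": "amb_001", "eligibility": "trainable"},
    {"memory_id": "amb_002", "eligibility": "trainable", "supersedes": ["amb_001"]},
    {"memory_id": "amb_003"},  # unlabeled, so excluded by default
    {"memory_id": "amb_004", "eligibility": "eval_only"},
]
training_ids = [r["memory_id"] for r in export_for_training(records)]
```

An eval export would pass `allowed=("trainable", "eval_only")` or just `("eval_only",)`, depending on how the eval buckets end up being defined.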

Minimal First Version

Start with a small schema:

{
  "type": "architecture_principle",
  "source": "user_confirmed",
  "scope": "WandersCop",
  "stability": "high",
  "privacy": "internal",
  "training_eligible": true,
  "tags": ["AMB", "distillery"]
}
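
A minimal validation sketch for this first-version schema, assuming all seven fields are required and checking only their basic types. `REQUIRED_FIELDS` and `validate_minimal` are hypothetical names; extra fields are deliberately tolerated so the schema can expand later.

```python
# Field name -> expected Python type for the minimal schema above.
REQUIRED_FIELDS = {
    "type": str,
    "source": str,
    "scope": str,
    "stability": str,
    "privacy": str,
    "training_eligible": bool,
    "tags": list,
}


def validate_minimal(record: dict) -> list:
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    for name, expected in REQUIRED_FIELDS.items():
        if name not in record:
            errors.append(f"missing field: {name}")
        elif not isinstance(record[name], expected):
            errors.append(f"{name} should be {expected.__name__}")
    return errors


minimal = {
    "type": "architecture_principle",
    "source": "user_confirmed",
    "scope": "WandersCop",
    "stability": "high",
    "privacy": "internal",
    "training_eligible": True,
    "tags": ["AMB", "distillery"],
}
problems = validate_minimal(minimal)
```

Returning a list of problems instead of raising on the first one lets a backfill job report everything wrong with an old trace in a single pass.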

Then expand later into:

  • Quality labels
  • Feedback labels
  • Supersession chains
  • Freshness checks
  • Anonymization rules
  • Eval bucket assignment
  • Distillery export filters

Notes

This should be treated as a core AMB capability, not a side feature.

If AMB is going to become the foundation for future internal model training and distillery loops, then structured tagging and labeling need to be designed early. Retrofitting labels later will make old traces messy, unsafe, and harder to trust.
