li-ukumar/seasons

SeasonShift

Autonomous demand forecasting with a seasonal regime-change agent.

Watch the model fail in December (missing the holiday spike), the agent detect the recurring pattern via Arize telemetry, decline a one-time decoy feature, and switch its forecast strategy — live.


Architecture

Time cursor advances
  → model predicts (pack_default)
  → predictions + feature performance logged to Arize
  → agent reads Arize history (observe)
  → agent evaluates recurrence (decide)
     ├─ banner_red (fluke) → decline, log reasoning
     └─ days_until_christmas (regime) → switch to pack_holiday
  → model predicts better (verify)
  → UI shows failure → recovery
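
The decide, act, and verify stages of the flow above can be sketched in a few lines. This is a minimal illustration, not the code in agent.py: `AgentState`, `decide`, `act`, and `verify` are hypothetical names, and the only details taken from this README are the pack names and the recurrence threshold of 2.

```python
from dataclasses import dataclass, field


@dataclass
class AgentState:
    """Hypothetical container for the active feature pack and the reasoning log."""
    active_pack: str = "pack_default"
    log: list = field(default_factory=list)


def decide(prior_years_elevated: int, min_recurrence: int = 2) -> str:
    """Label a spiking feature as a recurring 'regime' or a one-off 'fluke'."""
    return "regime" if prior_years_elevated >= min_recurrence else "fluke"


def act(state: AgentState, feature: str, verdict: str) -> None:
    """Switch packs on a regime verdict; otherwise decline and log the reason."""
    if verdict == "regime":
        state.active_pack = "pack_holiday"
        state.log.append(f"{feature}: regime, switching to pack_holiday")
    else:
        state.log.append(f"{feature}: fluke, declining")


def verify(mape_before: float, mape_after: float) -> bool:
    """The switch counts as verified only if the forecast error went down."""
    return mape_after < mape_before


state = AgentState()
act(state, "banner_red", decide(prior_years_elevated=0))
act(state, "days_until_christmas", decide(prior_years_elevated=2))
```

After both calls, the state holds pack_holiday and two log entries, one per decision.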

Components

| File | Purpose |
| --- | --- |
| prepare_data.py | Download + preprocess Rossmann data, inject banner_red decoy |
| model.py | LightGBM with two switchable feature packs (no retraining) |
| arize_logging.py | Log predictions to Arize; read per-feature history |
| agent.py | Observe→decide→act→verify loop; Gemini or rule-based reasoning |
| server.py | Flask API serving the UI |
| ui/ | React two-panel demo UI |

Quick Start

1. Get the data

Download the rossmann-store-sales dataset from Kaggle and place train.csv and store.csv in the data/ folder.

Or, if you have the Kaggle CLI configured:

./setup.sh --download

2. Configure credentials (optional)

cp .env.example .env
# edit .env — add Arize keys and/or Google Cloud project for Gemini

The system works fully offline without Arize or Gemini credentials — it uses local feature-performance storage and rule-based reasoning.

3. Run setup

chmod +x setup.sh
./setup.sh

This installs Python deps, prepares the dataset, trains the model, seeds the Arize performance log, and runs the initial agent simulation.

4. Start the servers

Terminal 1 — API:

python3 server.py

Terminal 2 — UI:

cd ui
npm install
npm run dev

Open http://localhost:3000


Demo Script (3 minutes)

  1. Stable period — Cursor at the start of the year. Forecast tracks actuals. Agent: quiet.
  2. Decoy fires — Advance to March (year 1). banner_red spikes. Agent logs: "fluke — no prior recurrence → declining."
  3. Holiday break — Advance to November/December. Default forecast diverges badly from actuals. Panel A turns red.
  4. Agent acts — Agent finds days_until_christmas recurring across ≥2 prior years → logs "regime — switching to holiday strategy."
  5. Recovery — Forecast snaps back to tracking actuals. Panel A turns green. Verification shows MAPE drop.

Feature Packs

| Pack | Features | When active |
| --- | --- | --- |
| pack_default | day_of_week, month, day_of_year, is_promo, lags, banner_red | Default |
| pack_holiday | + days_until_christmas, holiday proximity | After agent detects December regime |

The agent's action is a config switch — instant, no retraining. The model was trained on all features; inactive ones are zeroed at inference.
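
A rough sketch of what zeroing inactive features at inference could look like. The feature lists here are abbreviated and partly assumed: sales_lag_365 stands in for the lag features, the holiday-proximity feature is omitted, and `mask_features` is a hypothetical helper rather than a function from model.py.

```python
# Abbreviated, illustrative feature lists (assumption: not the full sets in model.py).
PACK_DEFAULT = ["day_of_week", "month", "day_of_year", "is_promo",
                "sales_lag_365", "banner_red"]
PACK_HOLIDAY = PACK_DEFAULT + ["days_until_christmas"]
ALL_FEATURES = PACK_HOLIDAY  # the model is trained once on every feature


def mask_features(row: dict, active: list) -> dict:
    """Return a full-width feature row with every inactive feature set to 0.0."""
    return {name: (row.get(name, 0.0) if name in active else 0.0)
            for name in ALL_FEATURES}


row = {"day_of_week": 3, "month": 12, "day_of_year": 354, "is_promo": 1,
       "sales_lag_365": 5200.0, "banner_red": 0, "days_until_christmas": 5}

default_input = mask_features(row, PACK_DEFAULT)  # days_until_christmas zeroed
holiday_input = mask_features(row, PACK_HOLIDAY)  # days_until_christmas kept
```

Because the input width never changes, the same trained model can score either row, which is what makes the switch instant.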


Acceptance Criteria

  • data/plots/acceptance_check.png shows repeating December spikes + isolated spring banner_red bump
  • December MAPE gap: pack_default vs pack_holiday is large (>20pp)
  • Agent outputs fluke for banner_red with evidence citing no prior recurrence
  • Agent outputs regime for days_until_christmas with evidence citing ≥2 prior years
  • UI Panel A: failure visually obvious to a non-technical viewer; recovery unmissable
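
For reference, the MAPE gap criterion can be checked with a small helper. This is a sketch under assumptions: `mape` and `mape_gap_pp` are hypothetical names, zero actuals are skipped to avoid division by zero, and the sample numbers are made up for illustration.

```python
def mape(actuals, forecasts):
    """Mean absolute percentage error in percent, skipping zero actuals."""
    pairs = [(a, f) for a, f in zip(actuals, forecasts) if a != 0]
    if not pairs:
        raise ValueError("need at least one non-zero actual")
    return 100.0 * sum(abs(a - f) / abs(a) for a, f in pairs) / len(pairs)


def mape_gap_pp(actuals, default_forecast, holiday_forecast):
    """Gap in percentage points: default-pack MAPE minus holiday-pack MAPE."""
    return mape(actuals, default_forecast) - mape(actuals, holiday_forecast)


# Toy numbers only: the default pack misses by half, the holiday pack by a tenth.
actuals = [100.0, 200.0]
gap = mape_gap_pp(actuals, [50.0, 100.0], [90.0, 180.0])
passes = gap > 20.0  # the ">20pp" acceptance threshold
```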

Audit: What is real vs. simulated

This section exists so judges and reviewers can evaluate the demo honestly.

Real (not fabricated)

| Artifact | Detail |
| --- | --- |
| Dataset | Rossmann Store Sales, Store 1, Open==1 only: 781 trading days from 2013-01-02 to 2015-07-31 |
| Model | LightGBM trained on ALL_FEATURES; single train run, 85/15 time split, early stopping |
| December MAPE gap | 27.5 percentage points (default 42.8% vs. holiday 15.3%), verified by run_visual_acceptance_checks() in model.py |
| Feature masking | Inactive features are zeroed at inference; the model is never retrained on a different set |
| Recurrence evidence | Counts only prior calendar years (< current simulation year) with elevated signal; current year is excluded |
| Leakage guard | TelemetryStore._visible() hard-filters all records to date <= current_date; belt-and-suspenders assertions in agent.py will raise if any future record slips through |
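
The leakage guard can be pictured as a filter plus a second, independent check. The sketch below is an assumption about the shape of that logic, not the code in TelemetryStore or agent.py: `visible` and `assert_no_future` are hypothetical names, and records are modelled as plain dicts with a `date` key.

```python
from datetime import date


def visible(records, current_date):
    """Keep only records dated on or before the simulation cursor."""
    return [r for r in records if r["date"] <= current_date]


def assert_no_future(records, current_date):
    """Second line of defence: raise if any record is dated after the cursor."""
    leaked = [r for r in records if r["date"] > current_date]
    if leaked:
        raise AssertionError(f"{len(leaked)} future record(s) leaked past {current_date}")


records = [
    {"date": date(2013, 12, 1), "signal": 0.7},
    {"date": date(2014, 1, 5), "signal": 0.1},
    {"date": date(2014, 12, 1), "signal": 0.8},  # in the future for the cursor below
]
cursor = date(2014, 6, 30)
history = visible(records, cursor)
assert_no_future(history, cursor)  # passes: the filter already removed the leak
```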

Simulated / pre-seeded

| Artifact | Detail |
| --- | --- |
| Time cursor | The UI slider does not ingest live sales data; it re-runs the stored agent log |
| Arize telemetry | The feature performance CSV is seeded by walking history month-by-month via seed_simulation_log(); it mimics what Arize would have recorded in a live deployment |
| Holiday signal | strengthen_holiday_signal() applies a 1.65× multiplier to the 21 days before Dec 25 each year in the Rossmann data; this amplifies an already-present real signal |
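
To make the holiday amplification concrete, here is one way the date window could be computed. It is a sketch under a stated assumption: "the 21 days before Dec 25" is read as Dec 4 through Dec 24 inclusive, with Dec 25 itself excluded. `holiday_multiplier` is a hypothetical helper, not the actual strengthen_holiday_signal() implementation.

```python
from datetime import date


def holiday_multiplier(day: date, factor: float = 1.65, window_days: int = 21) -> float:
    """Return the sales multiplier for a day: `factor` inside the window, else 1.0."""
    days_until = (date(day.year, 12, 25) - day).days
    # Assumed window: 1 to 21 days before Christmas (Dec 4 to Dec 24).
    return factor if 1 <= days_until <= window_days else 1.0


boosted = 5000.0 * holiday_multiplier(date(2013, 12, 20))    # inside the window
unchanged = 5000.0 * holiday_multiplier(date(2013, 3, 15))   # outside the window
```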

Deterministic / rule-based

| Artifact | Detail |
| --- | --- |
| Recurrence scorer | score_recurrence() in agent.py is a structured evidence checker, not ML; it is predictable and auditable |
| banner_red verdict | Always fluke: March 2013 only, no recurrence across years |
| days_until_christmas verdict | Regime after Dec 2014: two prior December cycles (2013, 2014) with elevated signal |
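
A minimal sketch of the recurrence rule as this README describes it: count prior calendar years with elevated signal, exclude the current year, and call it a regime at 2 or more. The signature and return shape are assumptions, so the function is named `score_recurrence_sketch` to keep it distinct from the real score_recurrence() in agent.py.

```python
def score_recurrence_sketch(elevated_years, current_year, min_recurrence=2):
    """Classify a feature from the years in which its signal was elevated."""
    # Only strictly earlier years count as evidence; the current year is excluded.
    prior = sorted(y for y in set(elevated_years) if y < current_year)
    verdict = "regime" if len(prior) >= min_recurrence else "fluke"
    return {"verdict": verdict, "recurrence_count": len(prior), "prior_years": prior}


banner_red = score_recurrence_sketch([2013], current_year=2013)
christmas_2014 = score_recurrence_sketch([2013, 2014], current_year=2014)
christmas_2015 = score_recurrence_sketch([2013, 2014], current_year=2015)
```

With these inputs, banner_red has no prior years and is a fluke, and days_until_christmas reaches two prior years only once the simulation year is 2015, which matches the "after Dec 2014" wording in the table.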

Partner-backed (optional)

| Partner | How it is used |
| --- | --- |
| Arize Phoenix | Set ARIZE_API_KEY + ARIZE_SPACE_KEY; ArizeTelemetryStore logs predictions and feature signals; the read path falls back to local CSV |
| Gemini via Vertex AI | Set GOOGLE_CLOUD_PROJECT; agent sends structured evidence to Gemini for natural-language explanation; falls back to score_recurrence() if not configured |
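
The fallback behaviour could be driven by a check like the one below. Only the three environment variable names come from this README; `resolve_backends` and the returned labels are hypothetical and shown purely to illustrate the idea.

```python
import os


def resolve_backends(env=None):
    """Pick partner-backed or local backends based on which credentials are set."""
    env = os.environ if env is None else env
    has_arize = bool(env.get("ARIZE_API_KEY")) and bool(env.get("ARIZE_SPACE_KEY"))
    has_gemini = bool(env.get("GOOGLE_CLOUD_PROJECT"))
    return {
        "telemetry": "arize" if has_arize else "local_csv",
        "reasoning": "gemini" if has_gemini else "rule_based",
    }


offline = resolve_backends({})  # no credentials: fully local operation
partial = resolve_backends({"ARIZE_API_KEY": "k", "GOOGLE_CLOUD_PROJECT": "p"})
```

In the second call Arize stays disabled because only one of its two keys is present.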

Known limitations

  • Rossmann data ends July 2015; only 2 complete December cycles are available. The recurrence threshold (recurrence_count >= 2) is at the lower bound of statistical credibility.
  • sales_lag_365 echoes the holiday uplift from the prior year into pack_default predictions, inflating them in December and exaggerating the apparent failure — this is acknowledged, not hidden.
  • Feature signal strength uses abs(Pearson r) of raw feature values vs. target in a sliding window. This is not the same as SHAP or model gain importance; the proxy is labeled feature_signal_strength to distinguish it from model-based importance.
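
The signal-strength proxy in the last point can be illustrated with a plain-Python version of a windowed absolute Pearson correlation. This is a sketch, not the project's code: the function names, the window handling, and the choice to return 0.0 for a constant series are all assumptions.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation of two equal-length sequences; 0.0 if either is constant."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def windowed_signal_strength(feature, target, window):
    """abs(Pearson r) between raw feature values and target for each full window."""
    return [abs(pearson_r(feature[i - window:i], target[i - window:i]))
            for i in range(window, len(feature) + 1)]


# A feature that falls as sales rise still scores as a strong signal.
strengths = windowed_signal_strength([4, 3, 2, 1], [10, 20, 30, 40], window=3)
```

Taking the absolute value means the proxy measures how strongly a feature moves with the target, not in which direction.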

License

MIT
