31 commits
c0c410a
Kaika phase 0: project skeleton, packaging, smoke tests
claude Jun 9, 2026
233ef2f
Kaika phase 1: E1 audio analysis (score model + analyzer) with tests
claude Jun 9, 2026
5caf56a
Kaika phase 2: E2 fluid simulation + recipe model with tests
claude Jun 9, 2026
f82147a
Kaika phase 3: E3 control signals (depth/canny/flow) with tests
claude Jun 9, 2026
c524bc2
Kaika phase 4: E5 post-production (mux, sync check, transfer codec) w…
claude Jun 9, 2026
2b890cf
Kaika phase 5: E4 diffusion (interface, scheduling, local fallback, C…
claude Jun 9, 2026
052f69e
Kaika phase 6: pipeline orchestration + run directory + CLI run, end-…
claude Jun 9, 2026
197b9c9
Kaika phase 7: FastAPI server, SQLite job queue, WebSocket progress, …
claude Jun 9, 2026
bcef644
Kaika phase 8: React+Vite+TS frontend (Studio/Render/Gallery), built …
claude Jun 9, 2026
200eb2e
Kaika phase 9: packaging (wheel includes recipes+frontend), robust re…
claude Jun 9, 2026
42e02ab
Address PR #2 review: prefix-safe path check, cv2 import, lookahead-0…
claude Jun 9, 2026
9c17ed2
Fluid quality core: exact FFT pressure projection + MacCormack advect…
claude Jun 9, 2026
802547a
Fluid redesign: continuous curl-noise forcing + dye emitters, velocit…
claude Jun 9, 2026
d1ce910
Event-driven emission: replace persistent emitters with transient sou…
claude Jun 10, 2026
a10a740
Fix central blob: kicks spawn around a wandering anchor; impulsive de…
claude Jun 10, 2026
01a2010
Directional sources: kicks radiate outward from the wandering centre,…
claude Jun 10, 2026
3a4df61
Segment editor S1: Project model (per-segment prompts + fluid overrid…
claude Jun 10, 2026
6650a25
Segment editor S2: staged pipeline (run_fluid preview / run_diffuse r…
claude Jun 10, 2026
ad032f6
Segment editor S3: project API (create/edit/preview/generate) + gener…
claude Jun 10, 2026
7636954
Segment editor S4: project-based Studio (waveform segments, per-segme…
claude Jun 10, 2026
e4f9326
Docs: segment editor workflow + staged rendering
claude Jun 10, 2026
5d77cc3
Cleanup pass: extract simulate render/source helpers + named tuning c…
claude Jun 10, 2026
29a5787
Structural cleanup: generic nested-dataclass recipe builder (no per-f…
claude Jun 10, 2026
1d53009
Package A: fast iteration — windowed segment preview with warmup, dra…
claude Jun 10, 2026
27a0138
Package B: full timeline — audio playback + playhead, draggable bound…
claude Jun 10, 2026
58fedd6
Package C: render quality — boundary param smoothing, intentional pal…
claude Jun 10, 2026
f309576
Package D: gallery compare (synced side-by-side), open-in-Studio, liv…
claude Jun 10, 2026
c441973
CLI: print immediate startup message before the (slow) engine import …
claude Jun 10, 2026
bcb7ec7
Ignore runtime data dir (.kaika/)
claude Jun 10, 2026
e8b327d
UI: cap video width to container (prevents oversized players overflow…
claude Jun 10, 2026
38e954f
UI polish: pinned Preview action (always visible), collapsible advanc…
claude Jun 10, 2026
11 changes: 11 additions & 0 deletions kaika/.gitignore
Original file line number Diff line number Diff line change
@@ -0,0 +1,11 @@
.venv/
__pycache__/
*.pyc
*.egg-info/
runs/
.pytest_cache/
node_modules/
webapp/dist/
.DS_Store
*.tsbuildinfo
.kaika/
122 changes: 122 additions & 0 deletions kaika/README.md
@@ -0,0 +1,122 @@
# Kaika 開花

Turn a piece of music into a video clip. A fluid simulation is *danced* by audio
analysis, then metamorphosed into living forms by a video diffusion model — all
driven from a local web app launched with a single command.

Full specification: [`../project_ideas/kaika.md`](../project_ideas/kaika.md).

```
SOUND ──▶ FLUID ──▶ BLOOM
from sound, a fluid; from the fluid, a flower
```

## Quickstart

```bash
uv venv && . .venv/bin/activate
uv pip install -e ".[dev]"

pytest # the whole suite (no GPU needed)
kaika run path/to/track.wav --recipe eclosion --seconds 4 # render a 4s extract
kaika serve # launch the local app (http://localhost:8400)
kaika # bare command = serve + open browser
```

`uvx kaika` works once published: the compiled frontend and the recipes ship
embedded in the wheel, so there is no npm at runtime and no config file to edit.

## The pipeline

Five stages in a chain, each with files on disk, each independently testable.

| Stage | Module | In → Out |
| --- | --- | --- |
| **E1** analyze | `kaika.core.analyze` | audio → `score.json` (frame-aligned score) |
| **E2** simulate | `kaika.core.simulate` | score + recipe → `fluid/*.png`, `velocity/*.npy`, `fluid_stats.json` |
| **E3** control | `kaika.core.control` | fluid → `control/{depth,canny,flow}/` |
| **E4** diffuse | `kaika.core.diffuse` | fluid + control → `styled/*.png` |
| **E5** post | `kaika.core.post` | styled + audio → `kaika_final.mp4` |

`kaika.core.pipeline.run_pipeline` orchestrates them into a reproducible
`runs/<id>/` directory (frozen recipe + score + every intermediate + manifest).
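
The "files on disk plus manifest" idea can be sketched as follows. This is a minimal illustration, not the actual `run_pipeline` code: the stage functions, the `Stage` signature, and the manifest layout are all assumptions made for the example, and the stages write small stand-in files instead of real audio analysis or simulation output.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable

# Hypothetical stage signature: a stage reads earlier outputs from the run
# directory and returns the relative names of the files it wrote there.
Stage = Callable[[Path], list[str]]


def stage_analyze(run_dir: Path) -> list[str]:
    # Stand-in for E1: write a tiny fake score instead of analysing audio.
    (run_dir / "score.json").write_text(json.dumps({"fps": 24, "n_frames": 3}))
    return ["score.json"]


def stage_simulate(run_dir: Path) -> list[str]:
    # Stand-in for E2: consume only what the previous stage left on disk.
    score = json.loads((run_dir / "score.json").read_text())
    (run_dir / "fluid").mkdir(exist_ok=True)
    names = []
    for i in range(score["n_frames"]):
        name = f"fluid/{i:05d}.txt"
        (run_dir / name).write_text(f"frame {i}")
        names.append(name)
    return names


def run_stages(run_dir: Path, stages: dict[str, Stage]) -> dict:
    # Run the stages in order and record what each one produced.
    manifest = {"stages": {}}
    for name, stage in stages.items():
        manifest["stages"][name] = stage(run_dir)
    (run_dir / "manifest.json").write_text(json.dumps(manifest, indent=2))
    return manifest


with tempfile.TemporaryDirectory() as tmp:
    manifest = run_stages(Path(tmp), {"E1": stage_analyze, "E2": stage_simulate})
    print(sorted(manifest["stages"]))
```

Because each stage communicates only through the run directory, any stage can be re-run or tested alone against a directory prepared by hand.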

### Design notes

- **E2 is the movement skeleton**, not the final image: a deterministic NumPy
stable-fluids solver (toroidal, in the style of Jos Stam). Same seed → identical video.
(Taichi/GPU is a drop-in acceleration; NumPy keeps it runnable and testable
everywhere.)
- **The E3→E4 boundary is the most important interface** — "control frames in,
styled frames out". Everything model-specific lives behind `Diffuser`, so E4
is replaceable when vid2vid models churn.
- **E4 has two backends.** `local` is a deterministic, GPU-free stylizer so the
whole pipeline produces a clip on any machine (it is *not* the figurative
metamorphosis — that needs the GPU). `comfyui` drives ComfyUI / Wan 2.2 on a
rented GPU: chunking with section-aligned seams, a prompt schedule from the
score, near-lossless **video** transfer (never thousands of PNGs), and a
versioned workflow template (`diffuse/workflows/`). Provisioning scaffold in
`diffuse/provision.py`.
- **Sync check** (E5) correlates the audio RMS envelope with the fluid's
kinetic energy, which is deterministically audio-driven, rather than with
styled-frame luminance.
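
To make the sync check concrete, here is a rough sketch of estimating a frame lag by correlating two per-frame series. It is an assumption about how such a check could work, not the code in `kaika.core.post`: the function names, the lag search window, and the use of Pearson correlation on the overlapping frames are all illustrative choices.

```python
import math


def pearson(xs, ys):
    # Pearson correlation of two equal-length sequences (0.0 if either is flat).
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0.0 or vy == 0.0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def best_lag(audio_rms, kinetic_energy, max_lag=12):
    """Return (lag, corr) for the frame offset where the two series agree best.

    A positive lag means the fluid energy trails the audio envelope.
    """
    n = min(len(audio_rms), len(kinetic_energy))
    best = (0, -2.0)
    for lag in range(-max_lag, max_lag + 1):
        # Keep only the frames where both shifted series are defined.
        idx = [i for i in range(n) if 0 <= i + lag < n]
        if len(idx) < 2:
            continue
        corr = pearson([audio_rms[i] for i in idx],
                       [kinetic_energy[i + lag] for i in idx])
        if corr > best[1]:
            best = (lag, corr)
    return best


# Synthetic data: four irregular pulses, and the same pulses two frames later.
audio_rms = [1.0 if i in (3, 8, 15, 20) else 0.0 for i in range(30)]
kinetic_energy = [0.0, 0.0] + audio_rms[:-2]
lag, corr = best_lag(audio_rms, kinetic_energy, max_lag=5)
print(lag, round(corr, 3))
```

Comparing against the fluid's kinetic energy keeps the measurement independent of whatever the diffusion stage does to brightness.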

## The app

`kaika serve` runs FastAPI + a single-worker job queue + SQLite + WebSocket
progress, and serves the React/Vite/TS frontend. Three screens:

1. **Studio**: drop an audio file; analysis splits it into **editable segments**. Click
a segment to set *its* prompt and *its* fluid parameters (vorticity, kick/hat
emit, ambient stir). Then **Preview fluid** (no GPU) to iterate on the motion.
2. **Render**: the stages run live with progress. Watch the fluid preview and, when
the motion is right, **Generate** runs the diffusion through to the final clip.
3. **Gallery**: every run, replayable, with its frozen recipe and sync info.
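
As an illustration of the single-worker queue backed by SQLite, the sketch below uses only the standard library. The table name, columns, and helper functions are hypothetical and are not taken from `kaika.server`; the select-then-update claim is only safe under the assumption that exactly one worker consumes the queue.

```python
import sqlite3


def open_queue(path=":memory:"):
    # Hypothetical schema: one row per job, with a status and a progress value.
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " kind TEXT NOT NULL,"
        " status TEXT NOT NULL DEFAULT 'queued',"
        " progress REAL NOT NULL DEFAULT 0.0)"
    )
    return db


def enqueue(db, kind):
    cur = db.execute("INSERT INTO jobs (kind) VALUES (?)", (kind,))
    db.commit()
    return cur.lastrowid


def claim_next(db):
    # Oldest queued job first; no locking needed with a single worker.
    row = db.execute(
        "SELECT id, kind FROM jobs WHERE status = 'queued' ORDER BY id LIMIT 1"
    ).fetchone()
    if row is None:
        return None
    db.execute("UPDATE jobs SET status = 'running' WHERE id = ?", (row[0],))
    db.commit()
    return row


def finish(db, job_id):
    db.execute(
        "UPDATE jobs SET status = 'done', progress = 1.0 WHERE id = ?", (job_id,)
    )
    db.commit()


db = open_queue()
first = enqueue(db, "fluid")
second = enqueue(db, "diffuse")
job = claim_next(db)
finish(db, job[0])
statuses = [r[0] for r in db.execute("SELECT status FROM jobs ORDER BY id")]
print(job, statuses)
```

Keeping job state in a table means a progress view can be rebuilt after a restart by reading rows rather than relying on in-memory state.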

### Projects & staged rendering

A **Project** (`runs/<id>/project.json`) is the mutable working document: the track's
segments, each with a prompt and partial fluid overrides. A single *continuous*
simulation reads these per-frame, so parameters vary by segment without breaking
the flow. The pipeline runs in two resumable stages:

- **fluid** (`run_fluid`) — E1+E2+E3 + a previewable fluid MP4. Fast, no GPU.
- **diffuse** (`run_diffuse`) — E4+E5, resuming the cached fluid with the
project's per-segment prompts.

Nothing the UI shows is hidden state: runs live on disk under `runs/`.
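
One way to picture "partial overrides read per-frame" is sketched below. The `Segment` fields, the parameter names, and `params_at_frame` are assumptions for illustration only; the real `Project` and `Segment` models in `kaika.core.project` may be shaped differently.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    # Hypothetical fields: a time range, a prompt, and only the fluid
    # parameters this segment wants to change.
    start_s: float
    end_s: float
    prompt: str
    fluid_overrides: dict = field(default_factory=dict)


def params_at_frame(frame, fps, base, segments):
    """Resolve the fluid parameters in effect at a given frame.

    Base values apply everywhere; a segment's overrides win inside its range.
    """
    t = frame / fps
    for seg in segments:
        if seg.start_s <= t < seg.end_s:
            return {**base, **seg.fluid_overrides}
    return dict(base)


base = {"vorticity_max": 38, "dissipation": 0.90}
segments = [
    Segment(0.0, 2.0, "closed flower buds"),
    Segment(2.0, 4.0, "explosive bloom", {"vorticity_max": 60}),
]

print(params_at_frame(0, 24, base, segments))
print(params_at_frame(48, 24, base, segments))
```

Since the lookup happens inside one continuous simulation loop, the fluid state carries across segment boundaries and only the parameters change.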

## Developing the frontend

The built frontend is committed under `src/kaika/webapp_dist/`. To change it:

```bash
cd webapp
npm install
npm run dev # http://localhost:5173, proxies /api + /ws to :8400 (run `kaika serve` too)
npm run build # re-emits into ../src/kaika/webapp_dist
```

## Layout

```
kaika/
├── recipes/              # YAML visual identities (eclosion, encre)
├── src/kaika/
│   ├── core/             # E1–E5 library + pipeline (UI and CLI both call this)
│   │   ├── analyze.py  simulate.py  control.py  post.py  pipeline.py
│   │   ├── recipe.py  score.py  media.py
│   │   └── diffuse/      # E4: base, local, comfy, provision, workflows/
│   ├── server/           # FastAPI app, job queue, SQLite
│   ├── webapp_dist/      # built frontend (embedded)
│   └── cli.py            # `kaika` (serve) · `kaika run …` (scripting)
├── webapp/               # React/Vite/TS sources
├── tests/                # pytest, one module per stage + server + e2e
└── runs/                 # one dir per render (gitignored)
```

## Sandbox honesty

Everything in this repo runs and is tested with **no GPU** (`pytest` is green
end-to-end). The figurative flower metamorphosis requires the `comfyui` backend
on a rented NVIDIA GPU; that code path is structured, unit-tested offline, and
gated behind a reachable ComfyUI endpoint, but is not exercised here.
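
The gating described above could look roughly like this. It is a sketch under stated assumptions: the `Diffuser` protocol shape, the two classes, `pick_diffuser`, and the silent fallback to `local` are hypothetical and not confirmed by the repository; the reachability probe is injected so the logic can be exercised offline.

```python
from typing import Callable, Optional, Protocol, Sequence


class Diffuser(Protocol):
    # Hypothetical interface: control frames in, styled frames out.
    name: str

    def diffuse(self, control_frames: Sequence[str], prompt: str) -> list[str]: ...


class LocalDiffuser:
    name = "local"

    def diffuse(self, control_frames: Sequence[str], prompt: str) -> list[str]:
        # Stand-in for a deterministic, GPU-free stylizer.
        return [f"styled:{frame}" for frame in control_frames]


class ComfyDiffuser:
    name = "comfyui"

    def __init__(self, endpoint: str) -> None:
        self.endpoint = endpoint

    def diffuse(self, control_frames: Sequence[str], prompt: str) -> list[str]:
        # A real backend would submit a workflow to the remote endpoint here.
        raise NotImplementedError("remote diffusion is not part of this sketch")


def pick_diffuser(backend: str, endpoint: Optional[str],
                  is_reachable: Callable[[str], bool]) -> Diffuser:
    # Use the remote backend only when it is requested and answers the probe.
    if backend == "comfyui" and endpoint and is_reachable(endpoint):
        return ComfyDiffuser(endpoint)
    return LocalDiffuser()


chosen = pick_diffuser("comfyui", "http://gpu.example:8188", lambda url: False)
print(chosen.name, chosen.diffuse(["depth_00000.png"], "botanical"))
```

Passing the probe in as a function keeps the selection logic testable without a network or a GPU.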
43 changes: 43 additions & 0 deletions kaika/pyproject.toml
@@ -0,0 +1,43 @@
[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"

[project]
name = "kaika"
version = "0.1.0"
description = "Turn a piece of music into a video clip: audio-driven fluid simulation, metamorphosed by a video diffusion model, from a single local command."
readme = "README.md"
requires-python = ">=3.10"
authors = [{ name = "Florent Lejoly" }]
dependencies = [
    "numpy>=1.24",
    "scipy>=1.10",
    "librosa>=0.10",
    "soundfile>=0.12",
    "opencv-python-headless>=4.8",
    "pillow>=10.0",
    "imageio>=2.31",
    "imageio-ffmpeg>=0.4",
    "pyyaml>=6.0",
    "fastapi>=0.110",
    "uvicorn[standard]>=0.27",
    "pydantic>=2.6",
    "websockets>=12.0",
    "python-multipart>=0.0.9",
]

[project.optional-dependencies]
dev = ["pytest>=8.0", "httpx>=0.27"]

[project.scripts]
kaika = "kaika.cli:main"

[tool.hatch.build.targets.wheel]
packages = ["src/kaika"]

[tool.hatch.build.targets.wheel.force-include]
"recipes" = "kaika/recipes"

[tool.pytest.ini_options]
testpaths = ["tests"]
addopts = "-q"
33 changes: 33 additions & 0 deletions kaika/recipes/eclosion.yaml
@@ -0,0 +1,33 @@
name: eclosion
seed: 4217

fluid:
  resolution: 256
  render_resolution: 512
  dissipation: 0.90
  lookahead_s: 8.0
  splats:
    low: { radius: 0.12, force: 9000, placement: anchored, lifetime_s: 0.8, emit: 0.22, drift: 0.7 }
    high: { radius: 0.03, force: 3500, placement: scatter, max_per_beat: 5, lifetime_s: 0.3, emit: 0.11, drift: 0.3 }
  vorticity: { min: 8, max: 38, driver: rms }

diffusion:
  model: wan-2.2-vace
  backend: local
  strength: 0.5
  control: [depth, flow]
  chunk_s: 5.0
  overlap_frames: 24

post:
  fps: 24
  aspect: square

prompts:
  base: "macro photography, botanical, dark background, soft light"
  intro: "closed flower buds emerging from black water, mist"
  build: "buds swelling, petals straining, tension"
  drop: "explosive bloom of peonies, petals suspended mid-air"
  verse: "slow drifting petals, calm water"
  outro: "petals dissolving back into dark water"
  default: "botanical organic forms, abstract motion"
34 changes: 34 additions & 0 deletions kaika/recipes/encre.yaml
@@ -0,0 +1,34 @@
name: encre
seed: 1107

fluid:
  resolution: 256
  render_resolution: 512
  dissipation: 0.92
  lookahead_s: 6.0
  palette: ["#2a2a2a", "#4a4a4a", "#6c6c6c", "#9a9a9a", "#d9d9d9"]
  splats:
    low: { radius: 0.16, force: 7000, placement: anchored, lifetime_s: 1.4, emit: 0.16, drift: 0.7 }
    high: { radius: 0.025, force: 2600, placement: scatter, max_per_beat: 7, lifetime_s: 0.25, emit: 0.10, drift: 0.3 }
  vorticity: { min: 4, max: 24, driver: rms }

diffusion:
  model: wan-2.2-vace
  backend: local
  strength: 0.45
  control: [depth, flow]
  chunk_s: 5.0
  overlap_frames: 24

post:
  fps: 24
  aspect: square

prompts:
  base: "black ink diffusing in water, sumi-e, monochrome, high contrast"
  intro: "a single drop of ink hitting still water"
  build: "ink tendrils reaching, gathering"
  drop: "violent bloom of black ink, fractal plumes"
  verse: "slow ink drift, grey washes"
  outro: "ink settling, clearing water"
  default: "ink in water, abstract"
1 change: 1 addition & 0 deletions kaika/src/kaika/__init__.py
@@ -0,0 +1 @@
__version__ = "0.1.0"
74 changes: 74 additions & 0 deletions kaika/src/kaika/cli.py
@@ -0,0 +1,74 @@
"""Kaika command line.

The terminal starts the app; it never creates. ``kaika`` launches the local
app and opens the browser. ``kaika run`` is the second-class scripting entry
point; it calls the very same library the UI does.
"""
from __future__ import annotations

import argparse
import sys
from pathlib import Path


def _cmd_run(args) -> int:
    from .core.pipeline import run_pipeline

    def progress(stage, done, total):
        bar = f"{done}/{total}" if total else ""
        print(f"\r[{stage:9}] {bar} ", end="", flush=True)

    res = run_pipeline(args.audio, args.recipe, runs_root=args.out,
                       seconds=args.seconds, progress=progress)
    print()
    print(f"run {res.run_id} -> {res.final}")
    print(f" frames={res.n_frames} backend={res.backend} "
          f"sync lag={res.sync_lag}f corr={res.sync_corr}")
    return 0


def _cmd_serve(args) -> int:
    url = f"http://{args.host}:{args.port}"
    print(f"Starting Kaika… loading the analysis engine (first launch ~10s)\n"
          f" → {url}", flush=True)
    from .server.app import serve
    serve(host=args.host, port=args.port, runs_root=args.out,
          open_browser=not args.no_browser)
    return 0


def build_parser() -> argparse.ArgumentParser:
    p = argparse.ArgumentParser(prog="kaika",
                                description="Turn music into a video clip.")
    sub = p.add_subparsers(dest="cmd")

    pr = sub.add_parser("run", help="render a clip (scripting/CI)")
    pr.add_argument("audio", help="path to an audio file")
    pr.add_argument("--recipe", default="eclosion", help="recipe name or path")
    pr.add_argument("--seconds", type=float, default=None,
                    help="render only the first N seconds (fast iteration)")
    pr.add_argument("--out", default="runs", help="runs root directory")
    pr.set_defaults(func=_cmd_run)

    ps = sub.add_parser("serve", help="launch the local app")
    ps.add_argument("--host", default="127.0.0.1")
    ps.add_argument("--port", type=int, default=8400)
    ps.add_argument("--out", default="runs")
    ps.add_argument("--no-browser", action="store_true")
    ps.set_defaults(func=_cmd_serve)

    return p


def main(argv=None) -> int:
    argv = list(sys.argv[1:] if argv is None else argv)
    parser = build_parser()
    args = parser.parse_args(argv)
    if not getattr(args, "cmd", None):
        # bare `kaika` launches the app
        args = parser.parse_args(["serve"])
    return args.func(args)


if __name__ == "__main__":
    raise SystemExit(main())
17 changes: 17 additions & 0 deletions kaika/src/kaika/core/__init__.py
@@ -0,0 +1,17 @@
"""Kaika core library (E1–E5 + orchestration).

The UI and the CLI both call this package. Common entry points are re-exported
here so callers can ``from kaika.core import run_pipeline, Project, load_recipe``.
"""
from .recipe import Recipe, load_recipe, from_dict as recipe_from_dict
from .score import Score
from .project import Project, Segment
from .analyze import analyze
from .pipeline import (run_pipeline, run_fluid, run_diffuse, init_project_run,
                       load_run, list_runs, RunResult)

__all__ = [
    "Recipe", "load_recipe", "recipe_from_dict", "Score", "Project", "Segment",
    "analyze", "run_pipeline", "run_fluid", "run_diffuse", "init_project_run",
    "load_run", "list_runs", "RunResult",
]