Implement Kaika: full audio→fluid→video pipeline + local app #2

Open
FloLey wants to merge 31 commits into claude/keen-darwin-cg59kx from claude/kaika-impl

@FloLey FloLey commented Jun 9, 2026

Owner

Complete, working, tested implementation of the Kaika spec, in a new kaika/ folder. Built phase by phase; pytest is green end-to-end (55 tests) with no GPU.

Stacked on top of the spec PR (#1) so this diff shows only the implementation. Merge #1 first, or merge this into main directly.

What's here

The five-stage pipeline, each stage independently testable:

| Stage | Module | In → Out |
|---|---|---|
| E1 analyze | core/analyze.py | audio → frame-aligned score.json (librosa: tempo, beats, per-band onsets, RMS, centroid, sections) |
| E2 simulate | core/simulate.py | score + recipe → deterministic NumPy stable-fluids frames + velocity + energy stats |
| E3 control | core/control.py | fluid → depth / canny / exact optical-flow (ground truth, nothing estimated) |
| E4 diffuse | core/diffuse/ | fluid + control → styled frames |
| E5 post | core/post.py | styled + audio → kaika_final.mp4 (ffmpeg mux + sync check) |

Plus: core/pipeline.py (reproducible runs/<id>/ dirs), FastAPI server with a single-worker job queue + SQLite + WebSocket progress, and a React/Vite/TS frontend (Studio / Render / Gallery) built and embedded in the package. One command — kaika — launches everything; kaika run is the scripting path calling the same library.

Honoured the review feedback from #1

  • E1 uses an integer hop, with the resulting drift documented.
  • The E5 sync check correlates RMS against fluid kinetic energy, not styled luminance.
  • Recipe prompts always prefix base and fall back to default.
  • The ComfyUI backend transfers compressed video, never raw PNG sequences.
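
For reviewers who want a concrete picture of the sync check, here is a minimal sketch of correlating a per-frame RMS series against a per-frame fluid kinetic-energy series to estimate their offset. It is an illustration only: the function name `estimate_lag_frames`, the `max_lag` parameter and the plain-list inputs are assumptions, not the code in core/post.py.

```python
from math import sqrt
from typing import Sequence, Tuple


def estimate_lag_frames(
    audio_rms: Sequence[float],
    fluid_energy: Sequence[float],
    max_lag: int = 30,
) -> Tuple[int, float]:
    """Return (lag, score) maximizing the normalized cross-correlation.

    A positive lag means the fluid energy trails the audio by that many frames.
    Hypothetical helper for illustration; not part of the kaika API.
    """
    n = min(len(audio_rms), len(fluid_energy))
    if n < 2:
        return 0, 0.0

    # Centre both series so a constant offset does not dominate the score.
    a = [float(v) for v in audio_rms[:n]]
    b = [float(v) for v in fluid_energy[:n]]
    mean_a, mean_b = sum(a) / n, sum(b) / n
    a = [v - mean_a for v in a]
    b = [v - mean_b for v in b]

    denom = sqrt(sum(v * v for v in a)) * sqrt(sum(v * v for v in b))
    if denom == 0.0:
        # A flat series carries no timing information.
        return 0, 0.0

    max_lag = min(max_lag, n - 1)
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            pairs = zip(a[: n - lag], b[lag:])
        else:
            pairs = zip(a[-lag:], b[: n + lag])
        score = sum(x * y for x, y in pairs) / denom
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag, best_score
```

Comparing against the fluid's energy rather than the styled frames keeps the check independent of whatever the E4 backend does to brightness.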

E4 backends

  • local — deterministic, GPU-free stylizer so the whole pipeline runs and is tested anywhere (this is not the figurative metamorphosis).
  • comfyui — ComfyUI / Wan 2.2 on a rented GPU: section-aligned chunking, score-derived prompt schedule, versioned workflow template, provisioning scaffold. Real orchestration is unit-tested offline; the live HTTP path is gated behind a reachable endpoint and not exercised in CI.
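
Section-aligned chunking can be pictured as splitting the frame range so that no chunk crosses a section boundary. The sketch below is one plausible reading, under assumptions: the name `chunk_sections`, the boundary-list input and the `max_frames` limit are hypothetical and not taken from core/diffuse/.

```python
from typing import List, Tuple


def chunk_sections(boundaries: List[int], max_frames: int) -> List[Tuple[int, int]]:
    """Split sections into half-open [start, end) chunks of at most max_frames.

    `boundaries` is a sorted list of frame indices that includes 0 and the
    total frame count. Hypothetical helper for illustration only.
    """
    if max_frames <= 0:
        raise ValueError("max_frames must be positive")

    chunks: List[Tuple[int, int]] = []
    for start, end in zip(boundaries, boundaries[1:]):
        pos = start
        while pos < end:
            # Never run past the end of the current section.
            stop = min(pos + max_frames, end)
            chunks.append((pos, stop))
            pos = stop
    return chunks
```

With boundaries `[0, 100, 130, 300]` and an arbitrary limit of 81 frames, the first section yields `(0, 81)` and `(81, 100)`, so a chunk ends at the section change instead of straddling it.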

Verification

  • pytest → 55 passed (E1–E5, recipe, pipeline e2e, server via TestClient, live HTTP smoke, frontend serving).
  • uv build wheel confirmed to include recipes/ and webapp_dist/.
  • Frontend builds clean under strict TS.

A 6-second proof clip (synthetic track → full pipeline, local backend) was shared in the session.


Generated by Claude Code

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request introduces Kaika, a comprehensive pipeline and local web application that converts music into video clips using audio analysis, fluid simulation, and video diffusion models. The implementation includes a FastAPI backend, a React-based frontend, and an extensive test suite. The code review identified several critical issues and improvement opportunities: a path traversal vulnerability in the file-serving endpoint, performance overhead from importing cv2 inside a hot simulation loop, a potential ZeroDivisionError when lookahead_s is zero, potential ffmpeg failures due to odd-dimension padding in wide aspect ratios or missing explicit stream mappings, and a configuration merging bug where None values can overwrite default settings and cause downstream crashes.
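
Two of the findings above, the path traversal and the None-overwriting merge, have well-known generic guards. The sketch below shows the usual shape of each using only the standard library. The names `resolve_inside` and `merge_config` are hypothetical, and this is not the fix that landed in server/app.py or core/recipe.py.

```python
from pathlib import Path
from typing import Any, Dict, Mapping


def resolve_inside(root: Path, requested: str) -> Path:
    """Resolve `requested` under `root`, rejecting anything that escapes it.

    Hypothetical helper: a file-serving endpoint would call this before
    opening the file.
    """
    root = root.resolve()
    candidate = (root / requested).resolve()
    # After resolution, ".." segments and absolute paths show up as a
    # candidate that no longer has `root` among its parents.
    if candidate != root and root not in candidate.parents:
        raise PermissionError(f"path escapes {root}: {requested!r}")
    return candidate


def merge_config(defaults: Mapping[str, Any], overrides: Mapping[str, Any]) -> Dict[str, Any]:
    """Merge overrides onto defaults, treating None as 'not set'.

    Nested mappings are merged recursively so a partial override keeps the
    remaining default keys. Hypothetical helper for illustration.
    """
    merged: Dict[str, Any] = dict(defaults)
    for key, value in overrides.items():
        if value is None:
            continue  # keep the default instead of overwriting it with None
        if isinstance(value, Mapping) and isinstance(merged.get(key), Mapping):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = value
    return merged
```

Resolving before comparing matters: a string-prefix check on the unresolved path would still let `runs/../../etc/passwd` through.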

Comment thread kaika/src/kaika/server/app.py
Comment thread kaika/src/kaika/core/simulate.py Outdated
Comment thread kaika/src/kaika/core/simulate.py
Comment thread kaika/src/kaika/core/post.py Outdated
Comment thread kaika/src/kaika/core/post.py
Comment thread kaika/src/kaika/core/recipe.py Outdated
claude added 19 commits June 9, 2026 22:35
… guard, even pad, explicit ffmpeg maps, null-merge guard
…y damping, calibrated force/vorticity scales, HDR tonemap+bloom+palette rendering
…rces spawned by onsets (born->emit->die); colour follows the music, ambient stirring is RMS-driven only
…cay + expanding puffs so each kick blooms and fades distinctly
… each source streams matter along its heading (jet + self-propulsion) instead of an isotropic blob -> no central pile-up
…es) and per-frame config variation in a single continuous simulation
…nt prompt + fluid inspector), preview->generate Render flow; rebuilt frontend
…onstants, remove dead color code, dedupe control-signals + cross-module audio helper, centralize test scaffolding
…ield special-casing), ComfyUI workflow registry, curated core public API, job worker logging
…ft mode, E3 deferred to generate, audio slicing
…aries, split/merge, beats/onsets on waveform, rich per-segment inspector, palette/seed/YAML editing; generate is self-sufficient
…ette mapping + centroid brightness, flow-advected texture in local stylizer, HPSS strict onsets, grain/vignette in E5
…e frame peek while rendering, cooperative job cancel
claude added 2 commits June 10, 2026 12:18
…ed controls, segment nav + override indicators + reset, representative video posters, labelled recipe picker