Add descriptor-driven multimodal composition loading #38

Draft
jayden0701 wants to merge 13 commits into nntrainer:main from jayden0701:codex/multimodal-composition-v0.4.0

@jayden0701

Summary

  • Add loadMultimodalCompositionJson C API/JNI support for descriptor-driven multimodal compositions.
  • Add role-based component handling for LLM, vision encoder, connector, and composition descriptors.
  • Integrate LFM2 + SigLIP CPU and LFM2 + JEPA mixed CPU/NPU composition paths.
  • Update Android request handling, SampleTestAPP UI, image processor selection, and documentation.
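
The role-based handling described above could be checked along these lines. This is only a sketch: the `Role` enum, the `Component` struct, the `missingRoles` helper, and the assumption that a composition needs exactly one LLM, one vision encoder, and one connector are all hypothetical and are not taken from this PR's code.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical roles, named after the component kinds listed in the summary.
enum class Role { Llm, VisionEncoder, Connector };

// Hypothetical component entry: a string model id plus the role it fills.
struct Component {
  std::string modelId;
  Role role;
};

// Returns the required roles that no component fills.
// An empty result means the composition is complete under this assumption.
std::vector<Role> missingRoles(const std::vector<Component> &components) {
  std::vector<Role> missing;
  for (Role required : {Role::Llm, Role::VisionEncoder, Role::Connector}) {
    bool present = std::any_of(
        components.begin(), components.end(),
        [required](const Component &c) { return c.role == required; });
    if (!present) {
      missing.push_back(required);
    }
  }
  return missing;
}
```

A loader built this way could reject a descriptor early, before any model weights are touched, and report which role is absent.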

dlwlzzero and others added 13 commits June 4, 2026 12:52
## Summary

Brings the Quick.AI public API up to the v0.4.0 surface in a single commit.
Models are now identified by a string model id (not a C enum): each model
self-registers its descriptor at load time and the catalog is exposed via
`getModelCatalogJson()` (C API) and `ModelCatalog` (Android AAR). The build
discovers model directories generically, so additional models can be dropped
in without editing build files or the public API.
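
As a rough illustration of string-id self-registration, a minimal sketch follows. `ModelDescriptor`, `ModelRegistry`, `ModelRegistrar`, `catalogJson`, and the `family` field are hypothetical names chosen for this example; the actual descriptor layout and the output of `getModelCatalogJson()` are not shown in this PR and may differ.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical descriptor; the real fields are not shown in this PR.
struct ModelDescriptor {
  std::string id;      // string model id (illustrative values below)
  std::string family;  // assumed field, for illustration only
};

class ModelRegistry {
public:
  // Function-local static: constructed on first use, so registrars in any
  // translation unit or library can call this during static initialization.
  static ModelRegistry &instance() {
    static ModelRegistry registry;
    return registry;
  }

  // Returns false if the id is already registered.
  bool add(const ModelDescriptor &d) { return models_.emplace(d.id, d).second; }

  // Minimal JSON array, sorted by id; assumes values need no escaping.
  std::string catalogJson() const {
    std::string out = "[";
    bool first = true;
    for (const auto &kv : models_) {
      if (!first) out += ",";
      first = false;
      out += "{\"id\":\"" + kv.second.id + "\",\"family\":\"" +
             kv.second.family + "\"}";
    }
    return out + "]";
  }

private:
  ModelRegistry() = default;
  std::map<std::string, ModelDescriptor> models_;
};

// Constructing a registrar registers one descriptor.
struct ModelRegistrar {
  explicit ModelRegistrar(const ModelDescriptor &d) {
    bool added = ModelRegistry::instance().add(d);
    assert(added && "duplicate model id");
    (void)added;
  }
};

// Each model's source file would define one registrar at namespace scope.
static ModelRegistrar qwenRegistrar{ModelDescriptor{"qwen3-0.6b", "qwen3"}};
static ModelRegistrar gemmaRegistrar{ModelDescriptor{"gemma4-e2b-qnn", "gemma4"}};
```

With this shape, adding a model means adding one source file with its own registrar; nothing in the registry or the catalog function names the model.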

The tree and history contain no proprietary model sources or references;
`git grep -i gauss` matches only the Qualcomm SDK `GAUSSIAN` constant.

## Change

- API: string-id descriptor registry + catalog JSON; per-model self-registration
  via constructors; a lazy (Meyers-singleton) registry so cross-library
  registration survives static-init order.
- API: generic multimodal composer and a vision-encoder capability, decoupled
  from any specific model.
- QNN: set the HTP backend-ext-config before multi-model sub-model loads;
  gemma4-e2b-qnn (NATIVE/NPU) bring-up.
- AAR: `ModelCatalog.selectableFamilies()` to hide embedding-only models in the
  Run/OpenAI and Chat family pickers.
- Build: model build hooks (meson + ndk-build) auto-discover model directories
  instead of naming them, so proprietary models plug in cleanly.
- Guards: allow-list `.gitignore` and a `pre-push` hook that block any
  non-allow-listed model source directory from reaching the public remote
  (allow-list = `src/models/qnn/gemma4-e2b-qnn`).
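
The family filtering in the AAR item above can be sketched as follows. The real `ModelCatalog.selectableFamilies()` lives in the Android AAR and its signature is not shown here; this C++ version, the `CatalogEntry` struct, and the `embeddingOnly` flag are assumptions made purely for illustration.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical catalog entry; field names are assumptions.
struct CatalogEntry {
  std::string id;
  std::string family;
  bool embeddingOnly;  // assumed flag: the model only produces embeddings
};

// Returns each family that has at least one non-embedding-only model,
// in first-seen order and without duplicates.
std::vector<std::string> selectableFamilies(
    const std::vector<CatalogEntry> &catalog) {
  std::vector<std::string> families;
  std::set<std::string> seen;
  for (const auto &entry : catalog) {
    if (entry.embeddingOnly) {
      continue;  // hidden from the family pickers
    }
    if (seen.insert(entry.family).second) {
      families.push_back(entry.family);
    }
  }
  return families;
}
```

Filtering at the catalog layer keeps the pickers free of per-model special cases: a UI tab only needs the returned list.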

Verified on device (S26 Ultra): the Chat and OpenAI tabs run qwen3-0.6b,
gemma4-e2b-qnn (NPU), and function_gemma; the catalog lists no proprietary
families.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ction UI

- Add user-editable MODEL BASE PATH field (default: /sdcard/Download/aistudio-mobile/models/)
  replacing the previously hardcoded path, available in both OpenAI and Chat tabs
- Replace MODEL NAME dropdown with read-only folder name display derived from
  the model descriptor; show error message when the expected folder is missing
- Remove Quantization chip selector from the OpenAI tab (W4A32 used internally)
- Change default model from Gemma4 LiteRT/GPU to Gemma4 Native/NPU (GEMMA4_E2B_QNN)
- Pass modelBasePath through createEngine() to LiteRTLm and buildLoadRequest()
- Add bordered card style to Chat tab's model selection section
- Preserve modelBasePathText across theme rebuilds from both OpenAI and Chat tabs

Signed-off-by: jrock-oh <jrock.oh@samsung.com>
Introduce the first planning and catalog layer for pluggable multimodal model composition. This adds role-aware model descriptors, Android catalog decoding, placeholder LFM/SigLIP/JEPA component descriptors, and lightweight contract checks for the new catalog shape.

Also document the repository orientation and the staged implementation plan so that later sessions can continue task by task without relying on conversation context.

Signed-off-by: jrock-oh <jrock.oh@samsung.com>
Signed-off-by: jrock-oh <jrock.oh@samsung.com>
Signed-off-by: jrock-oh <jrock.oh@samsung.com>
Signed-off-by: jrock-oh <jrock.oh@samsung.com>
Signed-off-by: jrock-oh <jrock.oh@samsung.com>
Signed-off-by: jrock-oh <jrock.oh@samsung.com>
Signed-off-by: jrock-oh <jrock.oh@samsung.com>
Signed-off-by: jrock-oh <jrock.oh@samsung.com>
Signed-off-by: jrock-oh <jrock.oh@samsung.com>
2 participants