Layer 1: Cognitive Core | oo-system architecture
UEFI x86_64 bare-metal LLM + Mamba SSM inference engine. Boots from USB. No OS. Part of the Operating Organism ecosystem.
By Djiby Diop
llm-baremetal is the sovereign runtime of the larger Operating Organism vision.
It is meant to be preserved and evolved as the bare-metal / survival / recovery pillar of the system, not replaced.
Model weights (.gguf / legacy .bin) are intentionally not tracked in git.
Download them from Hugging Face (or any direct URL) into models/.
Windows:
./scripts/get-weights.ps1 -Url "https://huggingface.co/<org>/<repo>/resolve/main/<file>.gguf" -OutName "<file>.gguf"
Stable public test models for this project are also published at djibydiop/llm-baremetal. To fetch one directly into models/:
./scripts/get-stable-model.ps1 -File stories15M.q8_0.gguf
# example for the larger legacy llama2.c export
./scripts/get-stable-model.ps1 -File stories110M.bin
Linux:
./scripts/get-weights.sh "https://huggingface.co/<org>/<repo>/resolve/main/<file>.gguf" "<file>.gguf"
Then pass the model path to the build.
- Ensure tokenizer.bin is present (this repo includes it by default).
- Download a model file into models/ (see above).
  - Supported today for inference: .bin (llama2.c export)
  - Supported today for inference: .gguf (F16/F32 + common quant types like Q4/Q5/Q8; see below)
  - You can also use a base name without extension (the image builder will copy .bin and/or .gguf if present)
- Build + create boot image:
./build.ps1
Example (base name):
./build.ps1 -ModelBin models/stories110M
# or explicit file
./build.ps1 -ModelBin models/my-model.gguf
Prereqs (Ubuntu/Debian):
sudo apt-get update
sudo apt-get install -y build-essential gnu-efi mtools parted dosfstools grub-pc-bin
Then:
cd llm-baremetal
make clean
make repl
# Build an image with a bundled model:
# MODEL=stories110M ./create-boot-mtools.sh
# Or build a small image without embedding weights (copy your model later):
NO_MODEL=1 ./create-boot-mtools.sh
GitHub Releases provides a prebuilt x86_64 no-model boot image. It intentionally does not bundle any model weights, and it does not hardcode a model path.
Download these assets from the latest Release:
- llm-baremetal-boot-nomodel-x86_64.img.xz
- SHA256SUMS.txt
Verify + extract (Linux):
sha256sum -c SHA256SUMS.txt
xz -d llm-baremetal-boot-nomodel-x86_64.img.xz
Flash to a USB drive (Linux, replace /dev/sdX):
sudo dd if=llm-baremetal-boot-nomodel-x86_64.img of=/dev/sdX bs=4M conv=fsync status=progress
Copy your model to the USB EFI/FAT partition:
- Copy your model file (.gguf or legacy .bin) to the root of the FAT partition (or create a models/ folder and put it there).
- tokenizer.bin is already included in the Release image.
Note: some UEFI FAT drivers can be unreliable with long filenames. If you hit "file not found / open failed" issues, prefer an 8.3-compatible filename (at most 8 characters, a dot, and at most 3 characters, e.g. STORIES1.GGU) or use the FAT 8.3 alias (e.g. STORIE~1.GGU) when setting model= in repl.cfg.
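The 8.3 constraint in the note above can be checked on the host before copying a model to the USB stick. The following is a minimal sketch, not part of this repo: `is_83_compatible` and `suggest_83_name` are hypothetical helpers, and the accepted character set is an assumption based on common FAT short-name rules rather than on the firmware's actual FAT driver.

```python
import re

# Assumed short-name alphabet: A-Z, 0-9 and a few punctuation marks.
# Lowercase input is normalised to uppercase before matching.
_SHORT_CHARS = r"[A-Z0-9!#$%&'()\-@^_`{}~]"
_SHORT_NAME = re.compile(rf"^{_SHORT_CHARS}{{1,8}}(\.{_SHORT_CHARS}{{1,3}})?$")


def is_83_compatible(filename: str) -> bool:
    """Return True if the name has a base of 1-8 chars and an optional 1-3 char extension."""
    return bool(_SHORT_NAME.match(filename.upper()))


def suggest_83_name(filename: str) -> str:
    """Hypothetical helper: truncate base and extension to get an 8.3 candidate.

    This is NOT the alias a FAT driver generates (such as STORIE~1.GGU);
    it only proposes a name to rename the file to before copying it.
    """
    base, _, ext = filename.upper().rpartition(".")
    if not base:  # no dot in the name: everything is the base
        base, ext = ext, ""
    keep = re.compile(_SHORT_CHARS)
    base = "".join(c for c in base if keep.match(c))[:8]
    ext = "".join(c for c in ext if keep.match(c))[:3]
    return f"{base}.{ext}" if ext else base
```

With this sketch, `is_83_compatible("stories110M.gguf")` is false (11-character base, 4-character extension), and `suggest_83_name("stories110M.gguf")` proposes `STORIES1.GGU`.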
Boot the USB on an x86_64 UEFI machine, then select/load your model from the REPL.
On an 8GB machine, "conversational" works best with a small instruct/chat GGUF model rather than a large 7B model.
Recommended target:
- Size: ~0.5B-1B parameters
- Format: .gguf
- Quantization: prefer variants that are supported by the current GGUF inferencer: Q4_0/Q4_1/Q5_0/Q5_1/Q8_0 (avoid Q4_K_*/Q5_K_* for now)
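A quick host-side filter can catch an unsupported quantization before the file is copied to the USB stick. This is a sketch under one loud assumption: the quantization tag appears in the filename (as in stories15M.q8_0.gguf). It does not read GGUF metadata, and `quant_tag` / `looks_supported` are hypothetical names, not tools shipped with this repo.

```python
import re

# Tags listed in this README as supported by the current GGUF inferencer,
# plus the unquantized F16/F32 formats.
SUPPORTED = {"F16", "F32", "Q4_0", "Q4_1", "Q5_0", "Q5_1", "Q8_0"}

# Assumption: the tag is a separate token in the filename, e.g. "model.Q4_K_M.gguf".
_TAG = re.compile(
    r"(?<![A-Za-z0-9])(F16|F32|Q\d_(?:K(?:_[SML])?|\d))(?![A-Za-z0-9])",
    re.IGNORECASE,
)


def quant_tag(filename):
    """Return the uppercase quantization tag found in the filename, or None."""
    m = _TAG.search(filename)
    return m.group(1).upper() if m else None


def looks_supported(filename):
    """True/False when a tag is found; None when the filename carries no tag."""
    tag = quant_tag(filename)
    return None if tag is None else tag in SUPPORTED
```

A `None` result only means the filename gives no hint; the model may still load.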
Suggested first-run settings:
- Keep context small at first (e.g. 256-512) to avoid running out of RAM (KV cache grows with context).
- If your model is Q8_0 and you want lower RAM usage, enable gguf_q8_blob=1 (default in the Release image).
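To see why a small context is the safer starting point, the KV cache size can be estimated from the model dimensions. The formula below is the standard transformer estimate (one K and one V tensor per layer); the example dimensions and the 4-byte element size are illustrative assumptions, not values read from this firmware or from any specific model file.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, ctx_len, bytes_per_elem=4):
    """Estimate KV cache size in bytes.

    Two tensors (K and V) per layer, each ctx_len x (n_kv_heads * head_dim).
    bytes_per_elem=4 assumes F32 storage; use 2 for F16.
    """
    return 2 * n_layers * ctx_len * n_kv_heads * head_dim * bytes_per_elem


def mib(n_bytes):
    """Convert bytes to MiB."""
    return n_bytes / (1024 * 1024)


# Illustrative dimensions only: 12 layers, 12 KV heads, head size 64.
small_ctx = kv_cache_bytes(12, 12, 64, ctx_len=256)   # 18 MiB at F32
large_ctx = kv_cache_bytes(12, 12, 64, ctx_len=512)   # 36 MiB at F32
```

The cost is linear in context length, so doubling the context doubles the cache on top of the model weights.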
Useful REPL commands:
- /diag to inspect GOP, RAM, CPU features, and detected model paths
- /diag_report to save the same diagnostic view plus model inventory to llmk-diag.txt
- /models to list .gguf/.bin found in the root and models\\
- /model_info <file> to inspect a model before loading, including files in root, models\\, and FAT 8.3-resolved names
- /oo_status to inspect runtime engine state plus persistence/continuity artifacts (OOSTATE.BIN, OORECOV.BIN, OOJOUR.LOG, OOCONSULT.LOG, OOHANDOFF.TXT)
- /oo_outcome to inspect OOOUTCOME.LOG, pending next-boot checks, and confirmed adaptation outcomes
- /oo_explain to explain the latest consult decision, with /oo_explain verbose for confidence/plan/dynamics details and /oo_explain boot for latest confirmed boot comparison plus recent confirmed history
- /oo_reboot_probe to arm a reboot continuity check, reboot, then verify that OO state came back aligned on the next boot
- /cfg to confirm effective repl.cfg settings
Recent OO consult builds also expose higher-level operator fields in /oo_status, /oo_log, and /oo_explain verbose, including:
- last.consult.boot_relation / boot_bias
- last.consult.trend / trend_bias
- last.consult.saturation / saturation_bias
- last.consult.operator_summary
This makes it easier to see cases such as positive_but_saturated, where a previously successful action is still favored by history but is no longer directly applicable because the target is already at its bound.
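The positive_but_saturated case can be illustrated with a small predicate. This is only a sketch of the idea described above, not the firmware's logic: the function name, its parameters, and the use of a signed history bias with a clamped integer setting are all assumptions made for illustration.

```python
def is_positive_but_saturated(history_bias, current, step, lo, hi):
    """Hypothetical check for the 'positive_but_saturated' situation.

    history_bias: > 0 means past outcomes favor repeating the action (assumed encoding).
    current:      current value of the tuned setting.
    step:         signed change the action would apply.
    lo, hi:       allowed bounds for the setting.
    """
    favored = history_bias > 0
    # Apply the action, then clamp the result into the allowed range.
    clamped = min(max(current + step, lo), hi)
    # If clamping leaves the value unchanged, the action has no effect.
    no_effect = clamped == current
    return favored and no_effect
```

For example, an action that raises a setting by 64 is still favored by history, but when the setting already sits at its upper bound of 512 the predicate returns true and another action has to be chosen.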
For a first real-machine no-model check, the image also ships with llmk-autorun-real-hw-oo-smoke.txt. Run it with /autorun llmk-autorun-real-hw-oo-smoke.txt or point autorun_file to it in repl.cfg.
For a real-machine reboot continuity check, the image also ships with llmk-autorun-real-hw-oo-reboot-smoke.txt. Run it with /autorun llmk-autorun-real-hw-oo-reboot-smoke.txt; the first /oo_reboot_probe arms the check and reboots, then the next boot verifies continuity and continues the script.
- Use Rufus: select the .img (or extract from .img.xz first), partition scheme GPT, target UEFI (non CSM).
./run.ps1 -Preflight -Gui
Host -> sovereign handoff smoke:
./test-qemu-handoff.ps1
# optional if oo-host is not in the default sibling path
./test-qemu-handoff.ps1 -OoHostRoot ..\oo-host
This smoke flow also extracts OOHANDOFF.TXT beside the repo so oo-host/sync-check can verify the aligned host/export/receipt state.
Model-backed OO consult smoke in QEMU:
./test-qemu-autorun.ps1 -Mode oo_consult_smoke -ModelBin stories15M.q8_0.gguf -SkipPrebuild
This validates /oo_consult, /oo_log, and OOCONSULT.LOG creation with a small bundled model before moving to real hardware.
No-model OO outcome / adaptation learning smoke in QEMU:
./test-qemu-autorun.ps1 -Mode oo_outcome_smoke -Accel tcg -SkipPrebuild
This validates the consult -> persist -> reboot-verified outcome -> learned reselection loop, including /oo_outcome, /oo_explain boot, recent confirmed history, and operator-facing summaries persisted in OOCONSULT.LOG.
For faster iteration, use the unified QEMU wrapper run-qemu-oo-validation.ps1:
# run one focused lane
./run-qemu-oo-validation.ps1 -Mode consult -ModelBin stories15M.q8_0.gguf -Accel tcg -SkipPrebuild
./run-qemu-oo-validation.ps1 -Mode reboot -Accel tcg
./run-qemu-oo-validation.ps1 -Mode handoff -Accel tcg
# or run the core QEMU matrix end to end
./run-qemu-oo-validation.ps1 -Mode all-core -ModelBin stories15M.q8_0.gguf -Accel tcg -SkipPrebuild
The wrapper keeps QEMU as the primary iteration loop for no-model smoke, reboot continuity, host -> sovereign handoff, and model-backed OO consult so hardware reboots are reserved for larger milestones only.
For a real UEFI/USB handoff check, copy sovereign_export.json from the host runtime onto the FAT root of the USB image, then run llmk-autorun-real-hw-handoff-smoke.txt with /autorun llmk-autorun-real-hw-handoff-smoke.txt.
To stage that file from the sibling host workspace, use llm-baremetal/prepare-real-hw-handoff.ps1. It refreshes oo-host/data/sovereign_export.json, can copy both the export and the real-hardware handoff autorun script onto a mounted FAT/USB root, and can also build a dedicated llm-baremetal-boot-real-hw-handoff.img image with the export already injected.
For the next milestone — model-backed sovereign chat on a real machine — use prepare-real-hw-chat.ps1. It generates a dedicated llm-baremetal-boot-real-hw-chat.img with a bundled model, a generated repl.cfg, and conversational defaults already set:
./prepare-real-hw-chat.ps1 -ModelBin stories110M.bin
# optional: boot straight into a tiny chat smoke
./prepare-real-hw-chat.ps1 -ModelBin stories110M.bin -AutoSmoke
The helper keeps the image interactive by default. With -AutoSmoke, it points autorun_file at llmk-autorun-real-hw-model-chat-smoke.txt so the machine can prove model load + first response automatically.
To continue the OO path with a real model, the same helper also supports -AutoOoConsultSmoke. That enables oo_enable=1, oo_llm_consult=1, and boots into llmk-autorun-real-hw-oo-consult-smoke.txt to prove model-backed /oo_consult plus OOCONSULT.LOG creation:
./prepare-real-hw-chat.ps1 -ModelBin stories110M.bin -AutoOoConsultSmoke
For an interactive real-hardware OO image without autorun or auto-shutdown, use -EnableOoConsult instead. This keeps the boot in the REPL while pre-enabling oo_enable=1 and oo_llm_consult=1:
./prepare-real-hw-chat.ps1 -ModelBin stories110M.bin -EnableOoConsult -OutImagePath ..\llm-baremetal-boot-real-hw-oo-consult-interactive.img
Validated demo image:
./prepare-real-hw-chat.ps1 -ModelBin stories110M.bin -EnableOoConsult -SkipPrebuild -CtxLen 256 -MaxTokens 96 -Temperature 0.75 -TopP 0.95 -TopK 80 -RepeatPenalty 1.15 -OutImagePath ..\llm-baremetal-boot-demo-stories110M.img
This produces a clean interactive USB/demo image with the bundled stories110M.bin model, conversational defaults, OO consult enabled, and no autorun shutdown path. After boot, a short live demo can be:
- /cfg
- /diag
- hi
- /oo_status
- /oo_consult
- /oo_explain
Published demo artifacts on Hugging Face now include both the raw and compressed forms:
- llm-baremetal-boot-demo-stories110M.img
- llm-baremetal-boot-demo-stories110M.img.xz
- SHA256SUMS-demo-stories110M.txt
- SHA256SUMS-demo-stories110M-xz.txt
After the real-machine run, collect the produced OO artifacts from the mounted FAT partition or from an image copy with collect-real-hw-oo-artifacts.ps1:
./collect-real-hw-oo-artifacts.ps1 -UsbRoot E:\
# or directly from an image file
./collect-real-hw-oo-artifacts.ps1 -ImagePath .\llm-baremetal-boot-real-hw-chat.img
It gathers OOCONSULT.LOG, OOJOUR.LOG, OOSTATE.BIN, OORECOV.BIN, OOHANDOFF.TXT, and llmk-diag.txt into a timestamped folder under artifacts/ and writes a small summary file for review.
Then validate the collected folder with validate-real-hw-oo-artifacts.ps1:
./validate-real-hw-oo-artifacts.ps1
# explicit folder also works
./validate-real-hw-oo-artifacts.ps1 -ArtifactsDir .\artifacts\real-hw-oo-20260316-012323
By default it expects OOSTATE.BIN, OORECOV.BIN, OOJOUR.LOG, and a consult trace in OOCONSULT.LOG. Optional stricter checks are available with -RequireDiag and -RequireHandoff.
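For a quick look at a collected folder without PowerShell, the default expectations above can be approximated with a presence check. This is a rough sketch, not a port of validate-real-hw-oo-artifacts.ps1: `missing_artifacts` is a hypothetical helper, it treats an empty file as missing (an assumption), and it does not parse the consult trace inside OOCONSULT.LOG.

```python
from pathlib import Path

# Files the validator expects by default, per the description above.
REQUIRED = ("OOSTATE.BIN", "OORECOV.BIN", "OOJOUR.LOG", "OOCONSULT.LOG")


def missing_artifacts(folder, require_diag=False, require_handoff=False):
    """Return the expected artifact names that are absent or empty in `folder`."""
    expected = list(REQUIRED)
    if require_diag:        # mirrors the intent of -RequireDiag
        expected.append("llmk-diag.txt")
    if require_handoff:     # mirrors the intent of -RequireHandoff
        expected.append("OOHANDOFF.TXT")
    root = Path(folder)
    missing = []
    for name in expected:
        path = root / name
        # Assumption: a zero-byte file is as good as missing.
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing
```

An empty list means every expected file is present and non-empty; anything else is the list of names to investigate.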
If you want a single entrypoint for the whole real-machine consult milestone, use run-real-hw-oo-consult-validation.ps1:
# phase 1: prepare the real-hardware image
./run-real-hw-oo-consult-validation.ps1 -Phase prepare -ModelBin stories110M.bin
# phase 2: after the physical boot, collect + validate from the mounted USB FAT root
./run-real-hw-oo-consult-validation.ps1 -Phase collect -UsbRoot E:\
The prepare phase builds the image with -AutoOoConsultSmoke; the collect phase chains collection plus validation automatically.
For the real-machine host -> sovereign handoff milestone, use run-real-hw-handoff-validation.ps1:
# phase 1: refresh host export + build the dedicated handoff image
./run-real-hw-handoff-validation.ps1 -Phase prepare
# phase 2: after the physical boot, collect + validate from the mounted USB FAT root
./run-real-hw-handoff-validation.ps1 -Phase collect -UsbRoot E:\
The prepare phase refreshes oo-host/data/sovereign_export.json and builds llm-baremetal-boot-real-hw-handoff.img; the collect phase requires OOHANDOFF.TXT, allows a missing consult log, writes a handoff-focused validation report, and runs oo-bot sync-check when the sibling oo-host workspace is available.
For the real-machine reboot continuity milestone, use run-real-hw-oo-reboot-validation.ps1:
# phase 1: build the dedicated reboot continuity image
./run-real-hw-oo-reboot-validation.ps1 -Phase prepare
# phase 2: after the physical reboot cycle, collect + validate from the mounted USB FAT root
./run-real-hw-oo-reboot-validation.ps1 -Phase collect -UsbRoot E:\
The prepare phase builds llm-baremetal-boot-real-hw-oo-reboot.img with oo_enable=1 and the reboot smoke autorun; the firmware also makes a best-effort attempt to set UEFI BootNext to the current USB boot entry before resetting so the second boot returns to the USB device more reliably. The collect phase requires the reboot_probe_arm and reboot_probe_verified journal markers, allows a missing consult log, and writes a reboot-focused validation report.
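The marker requirement of the collect phase can be spot-checked by hand on a copied OOJOUR.LOG. The sketch below assumes the markers appear as plain text in the journal and that the journal is written in chronological order; the real line format is not documented here, so `reboot_probe_status` (a hypothetical helper) only does substring matching.

```python
def reboot_probe_status(journal_text):
    """Report whether both reboot probe markers are present and in order.

    Assumes the journal is append-only, so a later position in the text
    means a later event.
    """
    armed_at = journal_text.find("reboot_probe_arm")
    verified_at = journal_text.rfind("reboot_probe_verified")
    return {
        "armed": armed_at >= 0,
        "verified": verified_at >= 0,
        # The verification must come after the first arm marker.
        "ok": armed_at >= 0 and verified_at > armed_at,
    }
```

Usage: read the file with `open("OOJOUR.LOG", errors="replace").read()` and check the `ok` field of the result.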
The chained collect phase also writes oo-real-validation-report.md into the artifact folder so the real-machine milestone has a human-readable receipt with artifact sizes, consult decision, confidence fields, and parsed journal events.
The host runtime lives in the separate oo-host repository and is expected by default as a sibling clone beside this repo.
Validate everything (recommended after pulling updates):
./validate.ps1
# explicit override also works with a relative sibling path
./validate.ps1 -OoHostRoot ..\oo-host
When the sibling oo-host workspace is present, validation also runs the handoff smoke plus oo-bot sync-check end to end. Relative -OoHostRoot overrides are resolved against the repo root first, so sibling-path invocations stay stable.
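The "repo root first" resolution rule can be sketched as follows. This illustrates the rule as described above rather than the code in validate.ps1: the function name is hypothetical, and the fallback to the current working directory when the repo-relative path does not exist is an assumption.

```python
from pathlib import Path


def resolve_oo_host_root(override, repo_root, cwd):
    """Resolve an -OoHostRoot style override to a concrete directory path.

    Absolute paths are returned unchanged. Relative paths are tried against
    the repo root first; the cwd fallback is an assumed behavior.
    """
    path = Path(override)
    if path.is_absolute():
        return path
    candidate = (Path(repo_root) / path).resolve()
    if candidate.is_dir():
        return candidate
    return (Path(cwd) / path).resolve()
```

Because the repo root wins, the same relative override resolves to the same sibling clone whether the script is started from the repo root or from a subfolder. On Windows, pathlib accepts ..\oo-host as written; on other platforms use ../oo-host.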
The current release-candidate status is tracked in RELEASE_CANDIDATE.md.
OS-G is included as a self-contained kernel-governor prototype (Memory Warden + D+ pipeline) under:
OS-G (Operating System Genesis)/
Quick validation (UEFI/QEMU smoke test, prints RESULT: PASS/FAIL):
./run-osg-smoke.ps1 -Profile release
# or via the main runner
./run.ps1 -OsgSmoke
Host-side tests/tools (requires std feature):
cd 'OS-G (Operating System Genesis)'
cargo test --features std
engine/ssm/ contains a complete freestanding bare-metal Mamba SSM inference engine: no libc, no heap allocator, no KV cache. It suits bare-metal targets for three reasons:
- O(1) memory per token: the recurrent SSM state h is fixed-size regardless of sequence length
- No KV cache: context length does not inflate RAM usage during generation
- Serializable state: h can be saved to disk and restored across reboots for OO identity continuity
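The three properties above can be seen in a toy recurrence. The sketch below is a deliberately simplified linear, diagonal SSM in pure Python, not the Mamba kernel in engine/ssm/: it omits the input-dependent (selective) parameters and the convolution, and the little-endian float32 layout used for saving is an assumption, not the OOSTATE.BIN format.

```python
import struct

D_STATE = 4  # toy state size, chosen for illustration


def ssm_step(h, x, a, b, c):
    """One recurrent update: h <- a*h + b*x (elementwise), y = dot(c, h)."""
    h_new = [ai * hi + bi * x for ai, hi, bi in zip(a, h, b)]
    y = sum(ci * hi for ci, hi in zip(c, h_new))
    return h_new, y


def save_state(h):
    """Serialize the state as little-endian float32 values (assumed layout)."""
    return struct.pack(f"<{len(h)}f", *h)


def load_state(blob):
    """Inverse of save_state."""
    return list(struct.unpack(f"<{len(blob) // 4}f", blob))


def run(tokens, h=None):
    """Feed a sequence of scalar inputs; the state keeps the same length throughout."""
    a, b, c = [0.5] * D_STATE, [1.0] * D_STATE, [1.0] * D_STATE
    h = [0.0] * D_STATE if h is None else h
    y = 0.0
    for x in tokens:
        h, y = ssm_step(h, x, a, b, c)
    return h, y
```

Whether `run` consumes 3 inputs or 1000, `save_state` yields the same number of bytes, which is what makes the state cheap to persist across a reboot and resume with `run(more_tokens, load_state(blob))`.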
Requirements: Python ≥ 3.10, PyTorch ≥ 2.0, NumPy
python engine/ssm/export_mamba_baremetal.py \
--model /path/to/checkpoint.pt \
--out models/my_model.mamb
The exporter auto-detects d_model, n_layers, vocab_size, d_state, d_conv, expand, and dt_rank from the checkpoint. It supports:
- HuggingFace backbone.layers.{l}.mixer.* key layout (Mamba-2.7B, state-spaces/mamba)
- Raw state dict or wrapped checkpoint ({'model': state_dict, 'step': ...})
- BF16 checkpoints (converted to F32 on export)
mamba2backbonerecursion checkpoints (trained with this project's RLF pipeline):
# Export the phase-14c boolean-reanchored model (recommended, 24-layer, d_model=768)
python engine/ssm/export_mamba_baremetal.py \
--model ~/.gemini/antigravity/scratch/mamba2backbonerecursion/checkpoints/mamba3_p14c_bool_reanchored.pt \
--out models/mamba3_p14c.mamb
# Or any other phase checkpoint
python engine/ssm/export_mamba_baremetal.py \
--model ~/.gemini/antigravity/scratch/mamba2backbonerecursion/checkpoints/mamba3_p15_conversational_thoughts.pt \
--out models/mamba3_p15.mamb
Output: flat binary .mamb file (~640 MB at FP32 for the 24-layer / d_model=768 model).
Once booted, use these REPL commands to interact with the SSM engine:
| Command | Description |
|---|---|
| /ssm_load <file> | Load a .mamb model from the FAT root or models/ |
| /ssm_info | Print loaded model config (d_model, n_layers, d_state, …) |
| /ssm_infer <prompt> | Run inference from a prompt |
| /ssm_reset | Reset recurrent SSM state (clear context) |
The SSM state is automatically serialized to OOSTATE.BIN on reboot for continuity.
| Item | Size |
|---|---|
| .mamb model binary | ~640 MB |
| SSM recurrent state h per layer | ~96 KB |
| Total state for 24 layers | ~2.3 MB |
| Min RAM to run | ~700 MB (model + state + firmware) |
Note: The SSM model binary must fit in the UEFI COLD memory zone.
For machines with less than 1 GB RAM, export smaller checkpoints or reduce n_layers.
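The per-layer and total state figures in the table can be reproduced with a short calculation. d_model=768 and 24 layers come from this README; expand=2, d_state=16, and F32 storage are assumptions (common Mamba defaults) chosen because they reproduce the ~96 KB figure. The sketch counts only the recurrent state h and ignores the small convolution buffer.

```python
def ssm_state_bytes(d_model, expand, d_state, bytes_per_elem=4):
    """Size of one layer's recurrent state h: (expand * d_model) x d_state elements."""
    d_inner = expand * d_model
    return d_inner * d_state * bytes_per_elem


# Assumed: expand=2, d_state=16, F32 (4 bytes per element).
per_layer = ssm_state_bytes(d_model=768, expand=2, d_state=16)  # 98304 bytes = 96 KiB
total = 24 * per_layer                                          # 2359296 bytes = 2.25 MiB
```

The total does not depend on how many tokens have been processed, which is why the state stays around 2.3 MB while the ~640 MB model binary dominates the RAM budget.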
- Model weights are intentionally not tracked in git; use GitHub Releases or your own files.
- Optional config: copy repl.cfg.example -> repl.cfg (not committed) and rebuild.
Optional OO policy gate:
- If a file named policy.dplus exists on the FAT root, the firmware treats it as a D+ policy (OS-G style) and gates /oo* commands from it.
- Otherwise, it falls back to a simpler legacy file oo-policy.dplus.
- If neither file is present, behavior is unchanged.
Example policy.dplus (D+ style; deny-by-default; requires @@LAW + @@PROOF):
@@LAW
allow /oo_list
allow /oo_new
allow /oo_note
deny /oo_exec*
@@PROOF
proof op:7
Legacy example oo-policy.dplus (best-effort):
mode=deny_by_default
allow=/oo_list
allow=/oo_new
allow=/oo_note
deny=/oo_exec*
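To reason about what a legacy policy would permit before copying it to the FAT root, the key=value format can be evaluated on the host. This is a sketch of one plausible reading of the format, not the firmware's parser: treating `*` as a shell-style wildcard, letting deny rules win over allow rules, and the helper names are all assumptions.

```python
from fnmatch import fnmatchcase


def parse_legacy_policy(text):
    """Parse key=value lines into a mode plus allow/deny pattern lists."""
    policy = {"mode": "", "allow": [], "deny": []}
    for raw in text.splitlines():
        line = raw.strip()
        if "=" not in line:
            continue  # skip blank or unrecognized lines
        key, value = (part.strip() for part in line.split("=", 1))
        if key == "mode":
            policy["mode"] = value
        elif key in ("allow", "deny"):
            policy[key].append(value)
    return policy


def is_allowed(policy, command):
    """Assumed precedence: deny match, then allow match, then the mode default."""
    if any(fnmatchcase(command, pat) for pat in policy["deny"]):
        return False
    if any(fnmatchcase(command, pat) for pat in policy["allow"]):
        return True
    return policy["mode"] != "deny_by_default"


EXAMPLE = """mode=deny_by_default
allow=/oo_list
allow=/oo_new
allow=/oo_note
deny=/oo_exec*
"""
```

Under these assumptions, the example policy allows /oo_list, blocks anything starting with /oo_exec, and blocks unlisted commands such as /oo_status because of deny_by_default.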