AMNI-AI

LOCAL INTELLIGENCE ENGINE • DICTIONARY ATLAS • GPU TEXTURE INFERENCE

INTELLIGENCE WITHOUT THE CLOUD

A self-improving AI engine that runs entirely on your machine. No API keys required for core operation. No data leaves your hardware. Knowledge stored as GPU textures, not neural weights.

📚

DICTIONARY ATLAS

147K words with deterministic Reffelt Constant Nonces (SHA-256-seeded 512d vectors). Knowledge stored as TMU textures, not neural network weights.

🎨

GPU TEXTURE INFERENCE

Full atlas loaded to VRAM (~451MB). Hardware-native tex1Dfetch lookups. 4.5ms average query latency via a single cosine-similarity matmul.

🧠

DUAL-MIND

Red Team / Blue Team architecture. Proposer generates, auditor cross-validates. Role rotation every 10 generations. Consensus on disagreement.

🔬

SELF-IMPROVING

3-core dialectical engine: Thesis, Antithesis, Mediator. Overnight daemon runs GROW / COMPRESS / DREAM cycles autonomously.

🛡️

ASIMOV-GATED

5-law safety framework enforced at every decision point. SHA-256 hash seals on protected files. Immutable axioms.
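The hash-seal idea can be sketched in a few lines. This is a minimal, hypothetical reconstruction (function names and chunk size are illustrative, not AMNI-AI's actual implementation): hash a protected file with SHA-256 and refuse to proceed if the stored seal no longer matches.

```python
import hashlib

def seal(path: str) -> str:
    """Compute a SHA-256 seal for a protected file (illustrative sketch)."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Stream in chunks so large files don't need to fit in memory.
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

def verify_seal(path: str, expected: str) -> bool:
    """Return False if the file has been tampered with since sealing."""
    return seal(path) == expected
```

Any single-bit change to a sealed file produces a completely different digest, so tampering is detected before the engine acts on the file.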

🚀

KERNEL EVOLUTION

Self-profiling identifies bottlenecks. Kernel evolver generates optimized replacements. A/B benchmarks promote winners automatically.
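The A/B promotion step might look like the following sketch (the function names, best-of-N timing, and 5% noise margin are assumptions, not the shipped evolver): benchmark incumbent and challenger, and promote the challenger only on a clear win.

```python
import time

def benchmark(fn, args, repeats=5):
    """Best-of-N wall-clock timing for one candidate kernel."""
    best = float("inf")
    for _ in range(repeats):
        t0 = time.perf_counter()
        fn(*args)
        best = min(best, time.perf_counter() - t0)
    return best

def promote(incumbent, challenger, args, margin=0.95):
    """Promote the challenger only if it beats the incumbent by a clear
    margin (a hypothetical 5% guard against benchmark noise)."""
    if benchmark(challenger, args) < margin * benchmark(incumbent, args):
        return challenger
    return incumbent

# Toy A/B: a naive summation "kernel" vs. a closed-form replacement.
naive = lambda n: sum(range(n))
closed_form = lambda n: n * (n - 1) // 2
winner = promote(naive, closed_form, (500_000,))
```

Keeping a margin below 1.0 prevents thrashing: a challenger that is merely noise-level faster never displaces a proven incumbent.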

ARCHITECTURE

LAYER               FUNCTION
L0 — Asimov         5-law safety enforcement (immutable, always active)
L1 — NonceNet       POS-aware word graph (573K links, 147K words). Truth ROM.
L2 — TandemEngine   GPU cosine similarity + batch relationship decode (4.5ms)
L3 — DualMind       LLM voice (Qwen 7B via Vulkan). Red/Blue team debate format.
L4 — Output         Confidence-scored response with delta enrichment
L5 — Reflector      Self-improvement: dialectic engine, kernel evolver, panel of experts

HOW IT WORKS

1. ATLAS LOOKUP

Query enters NonceNet. Reffelt Constant Nonces resolve words to deterministic 512d vectors. TMU texture pages provide relationships, co-occurrences, and domain classification in a single GPU pass.

2. EXPERT PANEL

25 domain specialists fire based on query topic. TandemEngine runs GPU cosine similarity across the full 147K-word atlas. Batch relationship decode extracts context.
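The single-matmul similarity pass reduces to one matrix-vector product when query and atlas rows are unit-normalized. A minimal CPU sketch (NumPy standing in for the GPU path; shapes and names are illustrative):

```python
import numpy as np

def top_k_matches(query_vec, atlas, k=5):
    """One matmul yields cosine similarity against every atlas row,
    assuming query and rows are pre-normalized to unit length."""
    scores = atlas @ query_vec              # (N, 512) @ (512,) -> (N,)
    top = np.argpartition(-scores, k)[:k]   # unordered top-k candidates
    return top[np.argsort(-scores[top])]    # indices sorted best-first

# Toy atlas: 1,000 unit vectors standing in for the 147K-word atlas.
rng = np.random.default_rng(0)
atlas = rng.standard_normal((1000, 512)).astype(np.float32)
atlas /= np.linalg.norm(atlas, axis=1, keepdims=True)
query = atlas[42]                           # query identical to row 42
ranked = top_k_matches(query, atlas)        # row 42 ranks first (cos = 1.0)
```

On the GPU the same product runs as a single batched kernel over the texture-resident atlas, which is what keeps query latency in the low milliseconds.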

3. DUAL-MIND DEBATE

Proposer model generates a response from atlas context. Auditor model scores it on four axes: factual accuracy, safety, quality, and hallucination. On failure, the two minds debate to consensus.
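The auditor's gate can be sketched as a simple all-axes check. This is an assumed shape, not AMNI-AI's actual scoring code: thresholds, field names, and the inverted hallucination axis are illustrative.

```python
from dataclasses import dataclass

@dataclass
class AuditScores:
    factual: float        # 0..1, higher is better
    safety: float         # 0..1, higher is better
    quality: float        # 0..1, higher is better
    hallucination: float  # 0..1, higher = more hallucination suspected

def audit_passes(s: AuditScores, threshold: float = 0.7) -> bool:
    """Hypothetical gate: accept the proposer's response only if every
    axis clears its threshold (hallucination axis is inverted)."""
    return (s.factual >= threshold
            and s.safety >= threshold
            and s.quality >= threshold
            and s.hallucination <= 1 - threshold)
```

A response that fails any single axis is rejected outright, which is what forces the two minds back into a consensus round rather than shipping a weak answer.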

4. SELF-IMPROVEMENT

Overnight daemon: Thesis proposes, Antithesis attacks, Mediator synthesizes. Verified deltas committed to learnings sandbox. Kernel evolver optimizes hot paths.
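One round of the dialectic loop has a simple shape, sketched below with stub cores. Everything here is illustrative scaffolding (the real cores are models, not lambdas): thesis proposes a delta, antithesis raises objections, and the mediator commits only what survives.

```python
def dialectic_round(thesis_fn, antithesis_fn, mediator_fn, claim):
    """One hypothetical GROW-cycle round: thesis proposes, antithesis
    attacks, mediator synthesizes a verified delta or rejects it."""
    proposal = thesis_fn(claim)
    objections = antithesis_fn(proposal)
    return mediator_fn(proposal, objections)  # delta to commit, or None

# Toy run with stub cores: an evidenced claim survives the round.
delta = dialectic_round(
    thesis_fn=lambda c: {"claim": c, "evidence": ["atlas hit"]},
    antithesis_fn=lambda p: [] if p["evidence"] else ["unsupported"],
    mediator_fn=lambda p, obj: p if not obj else None,
    claim="'ember' relates to 'fire'",
)
```

Only deltas the mediator returns reach the learnings sandbox; rejected proposals are discarded, so the knowledge base only grows through claims that survived an attack.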

SPECIFICATIONS

PARAMETER           VALUE
Knowledge Base      147K words, 573K links (WordNet + Brown + Reuters + Gutenberg)
Texture Format      4096×4096 RGBA (64MB/page), TMU hardware fetch
VRAM Usage          ~451MB atlas + ~4.4GB LLM (Vulkan offload)
LLM Backend         llama-cpp-python + Vulkan (Qwen2.5-7B-Q4_K_M)
Inference Speed     ~18 tok/s (Vulkan), 4.5ms atlas query
Safety              5-law Asimov framework, SHA-256 sealed
Self-Improvement    3-core dialectic, kernel evolution, panel of 5 experts
Code Atlas          5K+ entities, 9K+ links (12 languages, 18 topics)
Platform            Windows desktop (Python + Vulkan/DirectML)
GPU                 AMD RDNA2/3 optimized (RX 7800 XT primary target)

GET AMNI-AI

Desktop application. Runs locally on your hardware. No cloud dependency.

DESKTOP DOWNLOAD — COMING SOON

Windows 10/11 • AMD GPU w/ Vulkan recommended • 16GB+ VRAM optimal