Private alpha · invite only

Your shadow,
in the machine.

A personal AI that lives on your device, runs agents on your terms, and shares nothing with anyone but you.

影 · kah-geh · the sovereign shadow of you

Manifesto

Every assistant you've used was someone else's product. Every memory you trusted it with became their training data. Kage inverts that. Your shadow doesn't live on our servers. Your memory doesn't become our model. Your agents don't report to anyone. They work for you, and they stop when you say stop.

Founder's letter

What it does

Three promises.
No asterisks.

Local

Lives on your device.

Inference, memory, credentials — all on your hardware. The small model runs on Apple Neural Engine, CUDA, or Metal. The 72B runs on your GPU box. Cloud is opt-in, single-request, never passive.

Agentic

Acts on your behalf.

Triage, draft, book, execute. Headless agents complete tasks while you sleep — each tool call scoped to a 60-second Vault token, single-use, revoked the moment the step ends.

Sovereign

Owned by you, forever.

Your Kage is an age-encrypted bundle. One file. Move it to new hardware, tap your YubiKey, keep going. Delete and it's truly gone. No lock-in. No quiet retention.

Under the hood

One architecture.
Three ways to run it.

Your data, retrieval, agents, and audit ledger — always local. Only the model lane changes. Pick the profile that matches your hardware and threat model.

The core · always local

kage-core

Retrieval: HKDF-keyed, SipHash pre-filter, AES-GCM per-chunk
Router: 3B small-model, picks tool & lane deterministically
Agent: Plans, calls tools, waits for your tap on every write
Guardrails: tool_input scanning is non-disableable
Memory: Per-member HKDF key, top-k decrypt only at read time
Audit: Ed25519 hash-chained ledger, replay by audit_id
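
To make the per-member HKDF key concrete, here is a minimal sketch of deriving one key per member from a single master secret, using HKDF with SHA-256 as specified in RFC 5869. The function names, the `kage/memory/` info label, and the 32-byte key length are assumptions for illustration, not Kage's actual parameters. AES-GCM chunk encryption is left out because it needs a third-party library.

```python
import hashlib
import hmac


def hkdf_sha256(ikm: bytes, salt: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF (RFC 5869) with SHA-256: extract a PRK, then expand it."""
    if length > 255 * 32:
        raise ValueError("requested length too large for HKDF-SHA256")
    # Extract: an empty salt is replaced by a block of zero bytes.
    prk = hmac.new(salt or b"\x00" * 32, ikm, hashlib.sha256).digest()
    # Expand: chain HMAC blocks until enough output is produced.
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def member_key(master_key: bytes, member_id: str) -> bytes:
    # Hypothetical info label: it binds the derived key to exactly one member.
    return hkdf_sha256(master_key, salt=b"", info=b"kage/memory/" + member_id.encode())
```

Because the member id only enters through the `info` input, two members sharing a master secret still get unrelated keys, and the same member always gets the same key back.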

Trust boundary

The model lane · you choose

full

The appliance profile.

72B on your GPU box.

Qwen 2.5 72B · R1-32B reasoning · 3B router

All three model lanes on your own appliance. Voice, reasoning, and deterministic audit — every byte stays on the local network. Recommended for households and small teams.

No external boundary crossed.

Connectors · your sources

oauth · pkce

Gmail

iMessage

Slack

Calendar

Photos

Files

Notes

Refresh tokens live in Vault at secret/kage/connectors/{member}. Every backfill runs the same chunk → encrypt → ingest pipeline. Revoke any connector in one click.

Channels · how you reach it

thin adapters

Web

CLI

iMessage

Slack

Voice

Every channel is a thin adapter that translates inbound events into one canonical /ask call. Planning, retrieval, and guardrails live in the core — never reimplemented per channel.
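
The thin-adapter idea can be sketched as follows: each channel does nothing but map its own event shape onto one shared request type, and only the core handles that type. The `AskRequest` fields, the event shapes, and the stub `ask` function are all assumptions for illustration; the real /ask contract is not described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AskRequest:
    """Hypothetical canonical request that every channel produces."""
    member: str
    channel: str
    text: str


def from_slack(event: dict) -> AskRequest:
    # Assumed Slack-like event shape: {"user": ..., "text": ...}.
    return AskRequest(member=event["user"], channel="slack", text=event["text"].strip())


def from_cli(member: str, argv: list[str]) -> AskRequest:
    # The CLI adapter only joins its arguments; it does no planning.
    return AskRequest(member=member, channel="cli", text=" ".join(argv).strip())


def ask(request: AskRequest) -> str:
    # Stand-in for the core: planning, retrieval, and guardrails would live here.
    return f"[{request.channel}] {request.member}: {request.text}"
```

Adding a new channel in this sketch means writing one more `from_*` function; nothing in `ask` changes.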

Trust model

Privacy by architecture, not by promise.

Every claim other assistants make in a privacy policy is something Kage enforces in code. If we can't prove it with a client-side key check or a signed audit entry, we don't claim it.

End-to-end encrypted vault

Encryption with age happens client-side, before any byte leaves your device. MinIO and Postgres never see plaintext. The Vault is sealed by your YubiKey: nothing unwraps without a physical tap.

On-device inference (first)

Small-model tasks run locally via llama.cpp on Apple Neural Engine, CUDA, or Metal. Cloud fallback is opt-in, encrypted, and per-request. The big model never sees your member id or tool results, only an allowlisted set of chat fields.
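
An allowlist of chat fields can be sketched as a filter that copies only named keys into the outbound payload, so anything not listed is dropped by default. The field names `role` and `content` are assumptions for illustration; the actual allowlist is not given here.

```python
# Assumed allowlist: only these keys may ever reach a cloud model.
ALLOWED_FIELDS = frozenset({"role", "content"})


def to_cloud_payload(messages: list[dict]) -> list[dict]:
    """Copy only allowlisted fields; unknown fields are dropped, never forwarded."""
    return [
        {key: value for key, value in message.items() if key in ALLOWED_FIELDS}
        for message in messages
    ]
```

An allowlist fails closed: a field added later, such as a new identifier, stays local until someone deliberately lists it.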

Per-step agent sandbox

Every tool call gets a Vault-minted token, 60-second TTL, single-use. Gmail.read, Calendar.write — scoped, revocable. Guardrails on tool_input are non-disableable. No secret leaks to a third-party connector.
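
The token lifecycle described above (scoped, 60-second TTL, single-use) can be illustrated with a small in-memory sketch. This is not the Vault API: the `StepTokens` class, its method names, and the injectable clock are assumptions made to keep the example self-contained.

```python
import secrets
import time


class StepTokens:
    """In-memory stand-in for a token minter: scoped, short-lived, single-use."""

    def __init__(self, ttl_seconds: float = 60.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._live = {}  # token -> (scope, expiry time)

    def mint(self, scope: str) -> str:
        token = secrets.token_urlsafe(32)
        self._live[token] = (scope, self._clock() + self._ttl)
        return token

    def redeem(self, token: str, scope: str) -> bool:
        # pop() removes the token on first use, so it can never be replayed.
        entry = self._live.pop(token, None)
        if entry is None:
            return False
        granted_scope, expiry = entry
        return granted_scope == scope and self._clock() < expiry
```

In this sketch a token is consumed even when it is presented with the wrong scope, which errs on the side of revoking rather than retrying.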

Hash-chained audit ledger

Every reply, every retrieval, every confirmation — signed and hash-chained. Replay any action with one audit_id. We can't silently edit the past. You can verify the chain yourself.
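
Verifying a hash chain yourself can be sketched in a few lines. Each entry stores the hash of the previous entry, so editing any past entry breaks every link after it. The entry layout, the SHA-256 choice, and the function names are assumptions for illustration, and the Ed25519 signature over each entry is omitted because it needs a third-party library.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder hash for the first entry


def entry_hash(prev_hash: str, payload: dict) -> str:
    # Canonical JSON so the same payload always hashes the same way.
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode()).hexdigest()


def append(ledger: list[dict], payload: dict) -> None:
    prev = ledger[-1]["hash"] if ledger else GENESIS
    ledger.append({"prev": prev, "payload": payload, "hash": entry_hash(prev, payload)})


def verify(ledger: list[dict]) -> bool:
    """Recompute every link; any edited or reordered entry fails the check."""
    prev = GENESIS
    for entry in ledger:
        if entry["prev"] != prev or entry["hash"] != entry_hash(prev, entry["payload"]):
            return False
        prev = entry["hash"]
    return True
```

Verification needs only the ledger itself, which is what makes an independent check possible.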

A day with your Kage