CapaKit Docs#

CapaKit is a free runtime and CLI toolkit for building AI app Kits.

Current Alpha Scope#

macOS only.
Bun workloads only.

Known Limitations#

Some workloads do not run under the current macOS sandbox.
Chromium workloads are known not to work yet.

Quick Start#

Install#

Install CapaKit with the shell installer:

curl -fsSL https://capakit.com/install.sh | sh

The installer places capakit under ~/.capakit/bin by default and updates your shell profile unless CAPAKIT_NO_MODIFY_PATH=1 is set.

Or install with Homebrew:

brew install capakit/tap/capakit

Manual release archives and checksums are available from GitHub Releases.

Verify the install. If capakit is not found, restart your shell and try again:

capakit --version

Run a Kit#

Run the hello-world Kit from GitHub:

capakit run https://github.com/capakit/hello-world-demo-kit

CapaKit downloads the Kit, prepares the workload, starts the local runtime, and prints local URLs.

Keep the command running while you use the app. Press Ctrl-C to stop it.

Run as a Skill#

Kits that expose MCP can run as skills. CapaKit generates the provider-specific skill structure, including agent instructions and a command entrypoint for calling the Kit's MCP tools.

capakit run https://github.com/capakit/hello-world-demo-kit --global-skill codex
~/.codex/skills/hello-world/hello-world hello-world

Supported global skill providers:

codex: installs in ~/.codex/skills/<skill-name>/.
claude: installs in ~/.claude/skills/<skill-name>/.

Supported project skill providers:

generic: installs in <project>/<skill-name>/.
claude: installs in <project>/.claude/skills/<skill-name>/.

Generated skill files are temporary and are removed when capakit run exits.

Edit a Local Kit#

Clone a Kit's source repository, test it, then run it:

git clone https://github.com/capakit/hello-world-demo-kit
cd hello-world-demo-kit
capakit test
capakit run

Use capakit exec [WORKLOAD] -- <COMMAND> to safely run workload commands such as bun add <package>.

Where CapaKit Writes#

~/.capakit/bin: default install location for the capakit executable.
~/.capakit: CapaKit home. Stores materialized package sources, runtime state, and persistent secret stores.
~/.cache/capakit: persistent build cache.
<kit>/_state/: generated state for a local Kit.

Storage commands:

capakit storage status
capakit storage status --detailed

Working with Coding Agents#

CapaKit is designed to be used with coding agents. Tell the agent to use capakit to create or edit a Kit, then describe the app you want in plain language.

Useful terms:

AI app: the user-facing or agent-facing product you want.
Kit: a CapaKit project the agent can create, edit, test, and run.
Web UI: a local browser app for forms, dashboards, workflows, uploads, or visual output.
Tool: a named action an assistant can call, such as summarize_file, search_docs, or extract_invoice_fields.
MCP server: a set of tools exposed to AI assistants and coding agents.
Skill: a tool app installed into an agent environment, such as a Codex skill.

Recommended Prompt Shape#

Start with the outcome you want. Only mention the user-facing surface when you already know it.

Create a web app:

Use capakit to create a new Kit called invoice-helper.
I want a web UI where I can upload invoice text and see extracted fields: vendor, date, total, and line items.
Please test it and leave clear run instructions.

Create assistant tools:

Use capakit to create a Kit that works as an MCP server.
It should provide a tool named extract_invoice_fields that accepts invoice text and returns structured invoice fields.
Please make the tool description clear enough for an AI assistant to choose it correctly.

Modify an existing Kit:

This directory contains a Kit for CapaKit.
Add a web UI for reviewing the results, keep the existing tools working, run the CapaKit tests, and update the README.

Agent-Readable Context#

CapaKit ships agent instructions that coding agents can reuse when creating or editing Kits.

capakit agents-md print

Glossary#

These terms appear throughout the docs.

A2A (Agent-to-Agent): Communication and collaboration between autonomous AI agents. A2A enables agents to delegate tasks, share context, and invoke each other's tools or skills.
Agent protocols: Standardized communication rules and data formats that allow AI agents, assistants, and external applications to securely exchange context and invoke tools across different environments.
AI app: The user-facing or agent-facing product that has AI functionality. It can be a web UI, a set of assistant tools, or a combination of both.
CapaKit CLI: The downloadable capakit command-line tool.
CapaKit runtime: The local CapaKit process that prepares, runs, and orchestrates an AI app.
CapaKit runtime and CLI toolkit: Shorthand for the CapaKit CLI, the local runtime, and its supporting toolkit.
CapaKit toolkit: The supporting developer tooling built around the CLI and runtime, including manifests, packaging, scaffolding, testing, registry access, and workload SDKs.
Endpoint: A specific network address or URL path exposed by a workload to receive and respond to incoming requests.
Kit: One AI app project for CapaKit, stored as a local directory, a GitHub repository, or a compressed .capakit archive.
Kit secret: A per-Kit secret that source workload code is allowed to resolve and access only when explicitly granted.
Local models: AI models executed locally on your machine's hardware rather than relying on external cloud APIs.
MCP (Model Context Protocol): An open standard that enables AI models, coding agents, and assistants to securely connect to external data sources and invoke tools.
OpenAI-compatible: An API standard that mimics OpenAI's REST API endpoints (such as /v1/chat/completions). This allows tools and clients originally built for OpenAI to work seamlessly with local or third-party models.
Public path: The local routing path CapaKit exposes for a specific service, such as /ui for a web interface or /mcp for tool access.
Registry: The public catalog of reusable Kits and examples.
Relay: A secure, CapaKit-managed proxy workload that handles calls to trusted external providers without exposing provider secrets to source workload code.
Skill: A bundled capability or specialized toolset installed into an agent environment (such as a Codex skill) that expands what the agent can accomplish.
Tool: An executable function or external API that an AI assistant can invoke to perform a specific action, such as search_docs or extract_invoice_fields.
Vault secret: A highly protected secret shared between Kits and strictly used by trusted, CapaKit-managed integrations.
Workload: An isolated execution process running inside a Kit, such as a Bun script serving a web UI or a Python process exposing MCP tools.

Anatomy of a Kit#

A Kit is one AI app project for CapaKit.

The root file is capability.yml. It declares the AI app name, workloads, public paths, options, dependencies, host mounts, secrets, and connection policy. Workload source code lives under workloads/<workload-name>/.

Kit Sources#

Source	Editable	Supported Commands
Local source directory	Yes	`run`, `test`, `exec`, `kit package`, `kit clean`, `kit workloads`, `kit secrets`
Local `.capakit` archive	No	`run`
GitHub repository URL	No	`run`

Examples:

capakit run .
capakit run ./hello-world.capakit
capakit run https://github.com/capakit/hello-world-demo-kit

When CapaKit runs a packaged or remote source, it materializes the Kit under CapaKit-managed storage. To change the Kit, edit the original source directory.

Kit Files#

AGENTS.md: Kit-local instructions for coding agents.
capability.yml: runtime manifest.
capability-test.yml: optional test manifest for capakit test.
tests/: optional fixture directories used by capakit test.
workloads/<workload>/: workload source roots.
README.md: user-facing usage notes for the Kit.
.gitignore: excludes generated state and dependency/build output.
_state/: generated local CapaKit state. Do not edit it.

`AGENTS.md`#

Points coding agents to capakit agents-md print.

`README.md`#

README.md is the user-facing guide for the Kit. Keep it specific to the AI app.

Useful README content:

what the app does
how to run it
how to test it
mounts or secrets
public paths or tools users should call
screenshots for UI or visual-output Kits
known limitations

Coding agents should update README.md when app behavior, setup, commands, secrets, mounts, or user-visible outputs change.

`capability.yml` Manifest Reference#

Reference for the Kit runtime manifest.

capability.yml declares the Kit runtime graph: workloads, public paths, options, dependencies, host mounts, secrets, and connection policy.

Minimal Manifest#

version: '1'
name: hello-world

workloads:
  hello:
    endpoints:
      - mcp
    runtime:
      source:
        toolchain: bun
        prepare:
          command: bun install
        start:
          command: bun run src/index.ts

expose:
  - path: /mcp
    target: hello
    endpoints:
      - mcp
    default_mcp: true

YAML Conventions#

Use .yml or .yaml; generated Kits use capability.yml.
YAML map order is not significant unless a field explicitly says otherwise.
Quote version: '1' so YAML treats it as a string.
HTTP paths and endpoint paths start with /.
null and omitted optional fields are equivalent unless a field says otherwise.

ID Formats#

Manifest ids name Kit objects: name, workload ids, option ids, dependency ids, mount ids, secret ids, and variant ids.

Manifest ids start with a lowercase ASCII letter, end with a lowercase ASCII letter or digit, and contain only lowercase ASCII letters, digits, _, or -.
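
To make the rule concrete, the fragment below uses ids that satisfy it and lists, in comments, shapes the rule rejects. All ids here are hypothetical and chosen only for illustration.

```yaml
# Hypothetical ids that satisfy the manifest id rule.
name: invoice-helper        # lowercase letters and -
options:
  model_v2:                 # _ inside the id, digit at the end
    kind: string

# Shapes the rule rejects:
#   Invoice-Helper   uppercase letters
#   2fast            starts with a digit
#   tools-           ends with -
#   web.ui           contains a character other than a-z, 0-9, _, or -
```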

Top-Level Fields#

`version`#

Purpose: selects the manifest format.
Type: string.
Required: yes.
Allowed values: '1'.
Example:

version: '1'

`name`#

Purpose: names the Kit at runtime.
Type: manifest id.
Required: yes.
Example:

name: local-image-tagger

`workloads`#

Purpose: declares addressable workloads in the Kit.
Type: map of workload id to workload definition.
Required: yes.
Example:

workloads:
  app:
    endpoints:
      - http
    runtime:
      source:
        toolchain: bun
        start:
          command: bun run start

`expose`#

Purpose: maps host-reachable public paths to workload endpoints.
Type: list of expose entries.
Required: no.
Default: no public paths.
Example:

expose:
  - path: /
    target: app
    endpoints:
      - http

`options`#

Purpose: declares typed Kit inputs used by commands, dependencies, and runtime capabilities.
Type: map of option id to option declaration.
Required: no.
Default: no options.
Example:

options:
  model:
    kind: string
    default: ggml-org/gemma-3-270m-it-GGUF:Q8_0

`dependencies`#

Purpose: declares other Kits that this Kit imports.
Type: map of dependency id to dependency declaration.
Required: no.
Default: no dependencies.
Example:

dependencies:
  llama:
    source:
      git: https://github.com/capakit/llama-cpp-local-kit

`host_mounts`#

Purpose: declares host folders that users can bind at run, test, MCP, or exec time.
Type: map of mount id to mount declaration.
Required: no.
Default: no host mounts.
Example:

host_mounts:
  models:
    usage: Local model cache
    access: read_write

`kit_secrets`#

Purpose: declares secrets that source workload code may resolve through the workload SDK.
Type: list of secret declarations.
Required: no.
Default: no Kit secrets.
Example:

kit_secrets:
  - key: vendor_api_key
    usage: Vendor API key

`vault_secrets`#

Purpose: declares protected secrets for trusted CapaKit-managed integrations such as Relays.
Type: list of secret declarations.
Required: no.
Default: no vault secrets.
Example:

vault_secrets:
  - key: openai_api_key
    usage: OpenAI API key

Workloads#

Each workload entry is one of:

a source or Relay runtime workload with runtime
an imported workload with import
a variant workload with variants

Shared workload fields can be used with all workload shapes.
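
As a rough sketch, the three workload shapes can sit side by side in one manifest. The fragment is assembled from the per-field examples in this reference; the workload ids app, llama, and model are illustrative, and it assumes the dependency Kit exposes /oaic.

```yaml
dependencies:
  llama:
    source:
      git: https://github.com/capakit/llama-cpp-local-kit

workloads:
  # Shape 1: source workload, declared with runtime.
  app:
    endpoints:
      - http
    connections:
      - llama
    runtime:
      source:
        toolchain: bun
        start:
          command: bun run start

  # Shape 2: imported workload, declared with import.
  llama:
    import:
      dependency: llama
      exposed_path: /oaic

  # Shape 3: variant workload, declared with variants.
  model:
    endpoints:
      - oaic
    variants:
      default:
        runtime:
          source:
            toolchain: bun
            start:
              command: bun run src/model.ts
      gpu:
        runtime:
          source:
            toolchain: bun
            start:
              command: bun run src/model-gpu.ts
          capabilities:
            gpu: metal
```

endpoints and connections appear here as shared fields; they sit next to runtime, import, or variants rather than inside them.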

`workloads.<workload_id>`#

Purpose: defines one addressable workload.
Type: workload definition.
Required: yes.
Example:

workloads:
  app:
    endpoints:
      - http
    runtime:
      source:
        toolchain: bun
        start:
          command: bun run start

`workloads.<workload_id>.runtime`#

Purpose: declares how CapaKit runs or attaches the workload.
Type: runtime object.
Required: yes for source and Relay workloads.
Fields:
- source: source workload command runtime.
- attachments.relays: Relay attachments.
- capabilities: optional runtime capability grants.

`workloads.<workload_id>.runtime.source`#

Purpose: declares source code execution for a workload.
Type: source runtime object.
Required: yes for source workloads.
Fields:
- toolchain: workload toolchain.
- prepare: optional stack-start prep work.
- hydrate: optional per-instance pre-request work.
- start: source workload main entry point.
Example:

runtime:
  source:
    toolchain: bun
    prepare:
      command: bun install
    start:
      command: bun run src/index.ts

Source workload lifecycle stages:

prepare: prep work before the stack starts.
hydrate: per-instance work before the workload starts handling requests.
start: source workload main entry point.

`workloads.<workload_id>.runtime.source.prepare`#

Purpose: optional prep work such as bun install or other work before the stack starts.
Type: workload command.
Required: no.
Default: no prepare command.
Default network: full.
Example:

prepare:
  command: bun install

`workloads.<workload_id>.runtime.source.hydrate`#

Purpose: optional work done for every workload instance before it starts handling requests.
Type: workload command.
Required: no.
Default: no hydrate command.
Default network: full.
Latency: hydrate work can slow initial request handling.
Example:

hydrate:
  command: bun run src/hydrate.ts

`workloads.<workload_id>.runtime.source.start`#

Purpose: source workload main entry point.
Type: workload command.
Required: yes for source workloads.
Default network: none.
Example:

start:
  command: bun run src/index.ts

Workload Command Fields#

command: required shell command string.
env: optional map of literal environment variables.
env_from_options: optional map of environment variable name to option id.
network: optional network policy.

Allowed network values:

none: no IP networking.
loopback: local loopback networking.
full: full networking.

Example:

start:
  command: bun run start
  env:
    NODE_ENV: production
  env_from_options:
    MODEL_NAME: model
  network: loopback

Environment behavior:

Define each environment variable in one env source.
env_from_options values are converted to strings.
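
A minimal sketch of the one-source-per-variable rule, assuming a hypothetical workload app and the model option shown earlier: each variable is set either literally under env or from an option under env_from_options, never both.

```yaml
options:
  model:
    kind: string
    default: ggml-org/gemma-3-270m-it-GGUF:Q8_0

workloads:
  app:
    runtime:
      source:
        toolchain: bun
        start:
          command: bun run start
          env:
            NODE_ENV: production    # literal value
          env_from_options:
            MODEL_NAME: model       # value of option "model", passed as a string
          # Avoid also listing MODEL_NAME under env: it would then have two sources.
```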

`workloads.<workload_id>.runtime.attachments.relays`#

Purpose: declares CapaKit-managed provider exit attachments.
Type: list of Relay attachments.
Required: no.
Default: no Relay attachments.
Example:

runtime:
  attachments:
    relays:
      - kind: exit_to_open_ai
        endpoint: oaic
        api_key_secret: openai_api_key

`workloads.<workload_id>.import`#

Purpose: imports a public path exposed by a dependency Kit.
Type: import object.
Required: yes for imported workloads.
Fields:
- dependency: dependency id from top-level dependencies.
- exposed_path: public path exposed by the dependency Kit.
Example:

workloads:
  llama:
    import:
      dependency: llama
      exposed_path: /oaic

`workloads.<workload_id>.variants`#

Purpose: declares multiple runtime implementations behind one workload id.
Type: map of variant id to variant object.
Required: yes for variant workloads.
Scope: variants share one trust model.
Example:

workloads:
  model:
    endpoints:
      - oaic
    variants:
      default:
        runtime:
          source:
            toolchain: bun
            start:
              command: bun run src/model.ts
      gpu:
        runtime:
          source:
            toolchain: bun
            start:
              command: bun run src/model-gpu.ts
          capabilities:
            gpu: metal

`workloads.<workload_id>.endpoints`#

Purpose: declares internal endpoints served by the workload.
Type: list of endpoint declarations.
Required: no.
Default: no endpoints.
Allowed protocols: http, mcp, oaic, a2a.
Example:

endpoints:
  - http
  - mcp

`workloads.<workload_id>.connections`#

Purpose: grants outbound workload-to-workload access through the CapaKit service mesh.
Type: list of workload ids.
Required: no.
Default: no outbound workload access.
Example:

connections:
  - llama
  - tools

`workloads.<workload_id>.mounts`#

Purpose: grants declared host mounts to the workload.
Type: list of mount ids or mount grant objects.
Required: no.
Default: no host mount access.
Example:

mounts:
  - images
  - mount_mid: models
    access: read_write

`workloads.<workload_id>.exposed_secrets`#

Purpose: grants Kit secrets to source workload code.
Type: list of secret ids.
Required: no.
Default: no secret access.
Example:

exposed_secrets:
  - vendor_api_key

Endpoints#

Endpoint declarations are used by workloads.<workload_id>.endpoints.

Expose entries use mesh endpoints, which accept protocol shorthand or endpoint paths.

Endpoint Forms#

Workload endpoint entries can use any of these forms:

Form	Example	Result
Protocol shorthand	`mcp`	MCP endpoint at `/mcp`
HTTP path shorthand	`/health`	HTTP endpoint at `/health`
Protocol/path map	`mcp: /agent`	MCP endpoint at `/agent`
Endpoint config	see below	Endpoint with explicit protocol, path, and optional MCP config

Protocol shorthand values are http, mcp, oaic, and a2a. Default paths are /http, /mcp, /oaic, and /a2a.

Example:

endpoints:
  - mcp
  - /health
  - mcp: /agent
  - protocol: mcp
    path: /tool
    config:
      timeout_ms: 30000

Protocol Config#

Endpoint config fields:
- protocol: http, mcp, oaic, or a2a.
- path: internal endpoint path.
- config: optional MCP config.
mcp config fields:
- timeout_ms: optional timeout in milliseconds.

Expose#

expose maps public host paths to internal workload endpoints.

`expose[].path`#

Purpose: public path reachable from the host.
Type: absolute HTTP path.
Required: yes.
Example:

path: /mcp

`expose[].target`#

Purpose: workload id to expose.
Type: workload id.
Required: yes.
Example:

target: app

`expose[].endpoints`#

Purpose: internal endpoints on the target workload to expose.
Type: list of mesh endpoint declarations.
Required: yes.
Example:

endpoints:
  - mcp

`expose[].default_mcp`#

Purpose: marks an MCP expose entry as the Kit's primary MCP endpoint for capakit mcp and skill installs.
Type: boolean.
Required: no.
Default: false.
Constraint: at most one expose entry can set default_mcp: true.
Example:

default_mcp: true
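
For example, a Kit that exposes two MCP paths would mark only one of them as the default. The admin workload and the /admin-mcp path below are hypothetical.

```yaml
expose:
  - path: /mcp
    target: hello
    endpoints:
      - mcp
    default_mcp: true     # the only entry allowed to set this
  - path: /admin-mcp
    target: admin
    endpoints:
      - mcp
    # default_mcp omitted, so it defaults to false
```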

Options#

options declares typed Kit inputs.

`options.<option_id>.kind`#

Purpose: declares the option value type.
Type: option kind.
Required: yes.
Allowed values: string, number, boolean, enum, path.
Example:

kind: enum

`options.<option_id>.default`#

Purpose: fallback value when no override is provided.
Type: string, number, or boolean.
Required: no.
Default: unset.
Example:

default: metal

`options.<option_id>.required`#

Purpose: requires the user to provide a value when no default exists.
Type: boolean.
Required: no.
Default: false.
Example:

required: true

`options.<option_id>.enum_values`#

Purpose: restricts values for kind: enum.
Type: list of strings.
Required: no.
Default: no enum restriction.
Example:

enum_values:
  - none
  - metal

`options.<option_id>.description`#

Purpose: user-facing helper text for generated guidance and capakit kit info.
Type: description string.
Required: no.
Example:

description: Local GPU acceleration mode.
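
Putting the option fields together, a declaration block might look like the sketch below. The option ids gpu and input_dir and their descriptions are hypothetical; only the field names and allowed kinds come from this reference.

```yaml
options:
  model:
    kind: string
    default: ggml-org/gemma-3-270m-it-GGUF:Q8_0
    description: Model to load.
  gpu:
    kind: enum
    enum_values:
      - none
      - metal
    default: none
    description: Local GPU acceleration mode.
  input_dir:
    kind: path
    required: true        # no default, so the user has to supply a value
    description: Folder to read input files from.
```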

Option References#

Option references use:

from_option: <option_id>

Supported locations:

dependency option bindings
runtime.capabilities.gpu

Workload command environment uses env_from_options:

env_from_options:
  MODEL_NAME: model

Host Mounts#

`host_mounts.<mount_id>.usage`#

Purpose: describes the expected user-provided binding.
Type: usage string.
Required: yes.
Example:

usage: Local model cache

`host_mounts.<mount_id>.access`#

Purpose: default maximum access for the mount.
Type: mount access.
Required: no.
Default: read_only.
Allowed values: read_only, read_write.
Example:

access: read_write

Workload Mount Grants#

Grant a declared mount to a workload:

workloads:
  app:
    mounts:
      - models
      - mount_mid: uploads
        access: read_only

Mount behavior:

access defaults to the top-level mount access.
workload grants use access at or below the top-level mount access.
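
The sketch below shows how the top-level access acts as a ceiling for workload grants. The uploads mount and the indexer workload are hypothetical, and runtime is left out of each workload for brevity.

```yaml
host_mounts:
  models:
    usage: Local model cache
    access: read_write
  uploads:
    usage: Files uploaded by the user
    # access omitted: defaults to read_only

workloads:
  app:
    mounts:
      - models                # inherits read_write from host_mounts.models
      - mount_mid: uploads
        access: read_only     # equal to the top-level maximum
  indexer:
    mounts:
      - mount_mid: models
        access: read_only     # narrower than the read_write maximum
```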

Dependency Mount Bindings#

Bind a dependency mount to a local mount:

dependencies:
  llama:
    source:
      git: https://github.com/capakit/llama-cpp-local-kit
    mount_bindings:
      models: models

Binding behavior:

keys are mount ids declared by the dependency Kit.
values are mount ids declared by the current Kit.
local mounts provide the access required by the dependency mount.

Secrets#

Secret Declaration Fields#

Secret declarations appear under kit_secrets and vault_secrets.

Fields:

key: required secret id.
usage: required user-facing helper text.
source: optional, prompt or generate; defaults to prompt.
format: optional, opaque or password.

Example:

kit_secrets:
  - key: vendor_api_key
    usage: Vendor API key
    source: prompt
    format: opaque

Format defaults:

Kit secrets default to format: opaque.
Vault secrets default to format: password.
source: generate requires format: password.
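
As an illustration of these defaults, the fragment below declares one generated Kit secret and one prompted vault secret. The session_signing_key id and its usage text are hypothetical.

```yaml
kit_secrets:
  - key: session_signing_key
    usage: Key used to sign local sessions
    source: generate
    format: password      # required with source: generate; Kit secrets otherwise default to opaque

vault_secrets:
  - key: openai_api_key
    usage: OpenAI API key
    # source omitted: defaults to prompt
    # format omitted: vault secrets default to password
```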

Kit Secrets#

Purpose: secrets source workload code may resolve through the workload SDK.
Declaration field: kit_secrets.
Workload grant field: exposed_secrets.
Example:

kit_secrets:
  - key: vendor_api_key
    usage: Vendor API key

workloads:
  app:
    exposed_secrets:
      - vendor_api_key

Vault Secrets#

Purpose: protected secrets for trusted CapaKit-managed integrations.
Declaration field: vault_secrets.
Example:

vault_secrets:
  - key: openai_api_key
    usage: OpenAI API key

Relays reference vault secrets through api_key_secret.
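
A minimal sketch of that reference, combining the vault secret declaration with the Relay attachment example from earlier in this reference. The workload id openai is hypothetical.

```yaml
vault_secrets:
  - key: openai_api_key
    usage: OpenAI API key

workloads:
  openai:
    runtime:
      attachments:
        relays:
          - kind: exit_to_open_ai
            endpoint: oaic
            api_key_secret: openai_api_key   # matches the vault_secrets key above
```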

Workload Secret Exposure#

Expose a Kit secret to source workload code:

workloads:
  app:
    exposed_secrets:
      - vendor_api_key

Secret values are resolved at runtime through the workload SDK. capability.yml stores only secret references, not secret values.

Dependency Secret Bindings#

Bind a dependency secret to a local secret:

dependencies:
  child:
    source:
      git: https://github.com/capakit/child-kit
    secret_bindings:
      child_api_key: parent_api_key

Binding behavior:

keys are secret ids declared by the dependency Kit.
values are secret ids declared by the current Kit.

Dependencies#

`dependencies.<dependency_id>.source`#

Purpose: declares where to load a dependency Kit from.
Type: source object.
Required: yes.
Supported shapes:
- local path: { path: <path> }
- Git URL: { git: <url>, ref?: <ref> }

`dependencies.<dependency_id>.source.path`#

Purpose: local dependency Kit path.
Type: filesystem path.
Required: yes for local path sources.
Example:

source:
  path: ../shared-tools-kit

`dependencies.<dependency_id>.source.git`#

Purpose: Git dependency Kit source.
Type: Git repository URL.
Required: yes for Git sources.
Example:

source:
  git: https://github.com/capakit/llama-cpp-local-kit

`dependencies.<dependency_id>.source.ref`#

Purpose: optional Git ref.
Type: branch, tag, or commit SHA.
Required: no.
Example:

source:
  git: https://github.com/capakit/llama-cpp-local-kit
  ref: main

`dependencies.<dependency_id>.options`#

Purpose: passes option values to the dependency Kit.
Type: map from dependency option id to literal value or option reference.
Required: no.
Example:

options:
  default_model:
    from_option: model
  context_size: 8192

Binding behavior:

keys are option ids declared by the dependency Kit.
values can be string, number, boolean, or { from_option: <local_option_id> }.

`dependencies.<dependency_id>.mount_bindings`#

Purpose: maps dependency host mounts to local host mounts.
Type: map from dependency mount id to local mount id.
Required: no.
Example:

mount_bindings:
  models: models

`dependencies.<dependency_id>.secret_bindings`#

Purpose: maps dependency secrets to local secrets.
Type: map from dependency secret id to local secret id.
Required: no.
Example:

secret_bindings:
  api_key: vendor_api_key
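
Taken together, a dependency declaration with option and mount bindings might look like the following sketch. It assumes the dependency Kit declares the options default_model and context_size, a mount named models, and a public path /oaic; those names are taken from the examples in this reference, not verified against the Kit itself.

```yaml
options:
  model:
    kind: string
    default: ggml-org/gemma-3-270m-it-GGUF:Q8_0

host_mounts:
  models:
    usage: Local model cache
    access: read_write

dependencies:
  llama:
    source:
      git: https://github.com/capakit/llama-cpp-local-kit
      ref: main
    options:
      default_model:
        from_option: model    # forwards the local "model" option
      context_size: 8192      # literal value
    mount_bindings:
      models: models          # dependency mount id: local mount id

workloads:
  llama:
    import:
      dependency: llama
      exposed_path: /oaic
```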

Runtime Capabilities#

`runtime.capabilities.gpu`#

Purpose: grants GPU access to a source workload.
Type: GPU capability or option reference.
Required: no.
Default: no GPU access.
Allowed values: metal, { from_option: <option_id> }.
Example:

runtime:
  capabilities:
    gpu: metal

Option-backed example:

runtime:
  capabilities:
    gpu:
      from_option: gpu

GPU values:

direct GPU capability currently supports metal.
option-backed GPU capability accepts none or metal.
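
To show how the option-backed form connects to an option declaration, here is a sketch that declares a hypothetical gpu option and references it from a workload. The workload id and start command are illustrative.

```yaml
options:
  gpu:
    kind: enum
    enum_values:
      - none
      - metal
    default: none

workloads:
  model:
    endpoints:
      - oaic
    runtime:
      source:
        toolchain: bun
        start:
          command: bun run src/model.ts
      capabilities:
        gpu:
          from_option: gpu    # resolves to none or metal at run time
```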

Relay Kinds#

`exit_to_open_ai`#

Purpose: routes mesh calls to OpenAI-compatible upstream APIs.
Required fields:
- endpoint: Relay endpoint.
- api_key_secret: vault secret id.
Upstream host: api.openai.com.
Auth header written by Relay: Authorization: Bearer <api-key>.
Endpoint protocol: oaic.

`exit_to_anthropic`#

Purpose: routes mesh calls to Anthropic.
Required fields:
- endpoint: Relay endpoint.
- api_key_secret: vault secret id.
Upstream host: api.anthropic.com.
Auth header written by Relay: x-api-key: <api-key>.
Endpoint protocol: http.

`exit_to_google_ai_studio`#

Purpose: routes mesh calls to Google AI Studio.
Required fields:
- endpoint: Relay endpoint.
- api_key_secret: vault secret id.
Upstream host: generativelanguage.googleapis.com.
Auth header written by Relay: x-goog-api-key: <api-key>.
Endpoint protocol: http.

Example Manifests#

hello-world-demo-kit: minimal Bun MCP workload.
local-image-tagger-demo-kit: HTTP and MCP endpoints, host mounts, and imported local model dependency.
llama-cpp-local-kit: local model workload with hydration, options, mount cache, GPU capability, MCP, and OAIC.
stable-diffusion-local-kit: local image generation workload with model/backend options and cache mount.
kids-storybook-creator-demo-kit: multi-dependency Kit with option and mount bindings.

`capability-test.yml` Manifest Reference#

Reference for the Kit test manifest.

capability-test.yml declares test cases run by capakit test.

Minimal Manifest#

tests:
  - id: hello-world
    kind: mcp
    target:
      exposed_path: /mcp
    request:
      tool: hello-world
    validations:
      - $.message.exists()

YAML Conventions#

Generated Kits use capability-test.yml.
YAML map order is not significant unless a field explicitly says otherwise.
validations entries are string expressions.
JSON paths start with $.

ID Formats#

Test ids are used for test output and fixture discovery. They contain ASCII letters, digits, -, or _.

Document Shapes#

Single Test Case#

capability-test.yml can contain one test case directly.

id: hello-world-mcp-smoke
kind: mcp
target:
  exposed_path: /mcp
request:
  tool: hello-world
validations:
  - $.message.exists()

`tests`#

capability-test.yml can contain a list of test cases under tests.

tests:
  - id: mcp-smoke
    kind: mcp
    target:
      exposed_path: /mcp
    request:
      tool: hello-world
  - id: typecheck
    kind: exec
    target:
      workload: hello
    request:
      command: bun run build

Shared Test Case Fields#

`id`#

Purpose: stable test id used for test output and tests/<id>/<mount-id>/ fixture discovery.
Type: test id.
Required: yes.
Example:

id: image-fixture-smoke

`kind`#

Purpose: selects the test runner.
Type: test kind.
Required: yes.
Allowed values: mcp, http, exec.
Example:

kind: http

`validations`#

Purpose: assertions evaluated against the validation value.
Type: assertion expression string or list of assertion expression strings.
Required: no.
Default: no validations.
Example:

validations:
  - $.ok.eq(true)
  - $.items.lenGte(1)

`verbose_output`#

Purpose: JSON paths from the validation value printed by capakit test --verbose.
Type: JSON path string or list of JSON path strings.
Required: no.
Default: no verbose output.
Constraint: each path starts with $ and exists in the validation value.
Example:

verbose_output:
  - $.summary
  - $.items[0]

Target Selection#

MCP and HTTP tests can target either:

an existing public path from capability.yml
an internal workload endpoint, which CapaKit exposes temporarily for that test

Exec tests target a workload artifact directly.

For MCP and HTTP tests, choose either target.exposed_path or both target.workload and target.endpoint.

`target.exposed_path`#

Purpose: selects an existing public path from capability.yml.
Type: absolute HTTP path string.
Required: yes when no workload endpoint target is provided for MCP or HTTP.
Example:

target:
  exposed_path: /mcp

`target.workload`#

Purpose: selects a workload for internal target-based MCP, HTTP, or exec tests.
Type: workload id.
Required: yes for workload-targeted MCP and HTTP tests, and for all exec tests.
Example:

target:
  workload: app

`target.endpoint`#

Purpose: selects a protocol-matching internal endpoint path for target-based MCP or HTTP tests.
Type: endpoint path string.
Required: yes for target-based MCP and HTTP tests.
Example:

target:
  workload: app
  endpoint: /http
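
For comparison with the exposed_path form, a complete MCP test that targets an internal workload endpoint might look like this sketch. It assumes the hello workload from the minimal capability.yml, whose mcp shorthand serves the endpoint at /mcp; the test id is hypothetical.

```yaml
tests:
  - id: internal-mcp-smoke
    kind: mcp
    target:
      workload: hello
      endpoint: /mcp        # internal endpoint path, exposed temporarily for this test
    request:
      tool: hello-world
    validations:
      - $.message.exists()
```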

MCP Tests#

`request.tool`#

Purpose: MCP tool name to call.
Type: MCP tool name.
Required: yes for MCP tests.
Example:

request:
  tool: extract_invoice_fields

`request.inputs`#

Purpose: MCP tool arguments.
Type: YAML mapping; nested values can be any JSON value.
Required: no.
Default: {}.
Example:

request:
  tool: extract_invoice_fields
  inputs:
    text: "Invoice total: 42.00"

HTTP Tests#

`request.method`#

Purpose: HTTP method.
Type: HTTP method.
Required: no.
Default: POST.
Supported values: GET, POST, PUT, PATCH, DELETE, HEAD.
Example:

request:
  method: GET

`request.path`#

Purpose: request path appended to the selected public test endpoint.
Type: absolute HTTP path string.
Required: yes for HTTP tests.
Note: request.path: / calls the selected public test endpoint exactly.
Example:

request:
  path: /health

`request.json`#

Purpose: JSON request body for POST, PUT, and PATCH.
Type: JSON value.
Required: no.
Default: {}.
Example:

request:
  json:
    prompt: lighthouse

Exec Tests#

`target.workload`#

Purpose: workload whose prepared artifact runs the command.
Type: workload id.
Required: yes.
Example:

target:
  workload: app

`target.variant`#

Purpose: workload variant to test.
Type: variant id.
Required: no.
Default: default variant.
Example:

target:
  workload: app
  variant: gpu

`request.command`#

Purpose: command to run against the prepared workload artifact.
Type: shell string or non-empty argv list.
Required: yes for exec tests.
Example:

request:
  command: bun run build

Argv example:

request:
  command:
    - bun
    - x
    - tsc
    - --noEmit

Default network: full.

Assertions#

Assertions use:

<json-path>.<operator>(<expected>)

Examples:

validations:
  - $.ok.eq(true)
  - $.items[0].total.eq(42)
  - $.summary.contains(done)

Expected values are parsed as JSON when possible. Otherwise they are treated as strings.
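
Read against that rule, the expected values in a few typical assertions would be interpreted as annotated below. The field names are hypothetical, and the null case is an inference from the rule rather than a documented example.

```yaml
validations:
  - $.total.eq(42)              # 42 parses as a JSON number
  - $.ok.eq(true)               # true parses as a JSON boolean
  - $.error.eq(null)            # null parses as JSON null
  - $.summary.contains(done)    # done is not valid JSON, so it is treated as the string "done"
```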

JSON Path#

Supported root: $.
Supported object form: .field.
Supported array form: [index].
Examples: $, $.items, $.items[0].total.

`exists`#

Purpose: passes when the JSON path exists.
Syntax:

$.field.exists()

`notNull`#

Purpose: passes when the JSON path exists and is not null.
Syntax:

$.field.notNull()

`eq`#

Purpose: passes when actual value equals expected value.
Syntax:

$.field.eq(value)

Numeric equality allows small floating-point differences.

`ne`#

Purpose: passes when actual value does not equal expected value.
Syntax:

$.field.ne(value)

`gte`#

Purpose: passes when actual numeric value is greater than or equal to expected.
Expected value: number.
Syntax:

$.field.gte(10)

`lte`#

Purpose: passes when actual numeric value is less than or equal to expected.
Expected value: number.
Syntax:

$.field.lte(10)

`lenEq`#

Purpose: passes when array, object, or string length equals expected.
Expected value: non-negative integer.
Syntax:

$.items.lenEq(3)

`lenGte`#

Purpose: passes when array, object, or string length is greater than or equal to expected.
Expected value: non-negative integer.
Syntax:

$.items.lenGte(1)

`lenLte`#

Purpose: passes when array, object, or string length is less than or equal to expected.
Expected value: non-negative integer.
Syntax:

$.items.lenLte(10)
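The three length operators share one notion of length. The sketch below is an assumption about how that could look, not CapaKit's code: `lengthOf` and `checkLength` are hypothetical helpers, object length is taken to be the number of keys, and string length is counted in UTF-16 code units, which may differ from how CapaKit counts characters.

```typescript
// Length of an array, object (key count), or string; undefined otherwise.
function lengthOf(value: unknown): number | undefined {
  if (typeof value === "string" || Array.isArray(value)) {
    return value.length;
  }
  if (typeof value === "object" && value !== null) {
    return Object.keys(value).length;
  }
  return undefined;
}

type LengthOperator = "lenEq" | "lenGte" | "lenLte";

// Evaluate a length operator against a non-negative integer.
function checkLength(
  op: LengthOperator,
  actual: unknown,
  expected: number,
): boolean {
  if (!Number.isInteger(expected) || expected < 0) {
    throw new Error("expected must be a non-negative integer");
  }
  const length = lengthOf(actual);
  if (length === undefined) {
    return false; // numbers, booleans, and null have no length here
  }
  switch (op) {
    case "lenEq":
      return length === expected;
    case "lenGte":
      return length >= expected;
    case "lenLte":
      return length <= expected;
  }
}
```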

`contains`#

Purpose: passes when actual contains expected.
Syntax:

$.field.contains(value)

Expected value:

arrays: value equal to at least one item.
objects: string key present in the object.
strings: substring.
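The per-type behavior of `contains` can be sketched as one function. This is a hypothetical helper, not CapaKit's implementation. In particular, comparing array items through `JSON.stringify` is a simplification chosen for brevity (it is sensitive to object key order), and CapaKit's own equality rules may differ.

```typescript
// Evaluate `contains` for strings, arrays, and objects.
function contains(actual: unknown, expected: unknown): boolean {
  if (typeof actual === "string") {
    // strings: expected must be a substring
    return typeof expected === "string" && actual.includes(expected);
  }
  if (Array.isArray(actual)) {
    // arrays: expected must equal at least one item
    const wanted = JSON.stringify(expected);
    return actual.some((item) => JSON.stringify(item) === wanted);
  }
  if (typeof actual === "object" && actual !== null) {
    // objects: expected must be a string key present in the object
    return (
      typeof expected === "string" &&
      Object.prototype.hasOwnProperty.call(actual, expected)
    );
  }
  return false;
}
```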

Validation Value#

validations and verbose_output evaluate against the normalized JSON result for the test case.

MCP#

For MCP tests, the validation value is the tool result value.

Failure behavior:

MCP tool errors fail the test before validations run.

HTTP#

For HTTP tests, the validation value is:

{
  "status_code": 200,
  "body": {
    "ok": true
  }
}

Validation value fields:

status_code is the HTTP response status.
body is the parsed JSON response body.
empty response bodies evaluate as null.
non-JSON response bodies evaluate as strings.
HTTP response status is part of the validation value; validate $.status_code when status matters.
transport and protocol errors fail the test before validations run.
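The body rules above can be sketched as a small normalization function. The `toHttpValidationValue` name and its inputs are hypothetical. Only the `status_code` and `body` fields and the null, JSON, and string cases come from this page. Whether a whitespace-only body counts as empty is not stated here, so this sketch treats only a zero-length body as empty.

```typescript
// Shape of the HTTP validation value described above.
interface HttpValidationValue {
  status_code: number;
  body: unknown;
}

// Build the validation value from a status and the raw response text.
function toHttpValidationValue(
  status: number,
  rawBody: string,
): HttpValidationValue {
  if (rawBody.length === 0) {
    return { status_code: status, body: null }; // empty body evaluates as null
  }
  try {
    return { status_code: status, body: JSON.parse(rawBody) }; // JSON body
  } catch {
    return { status_code: status, body: rawBody }; // non-JSON body stays a string
  }
}
```

Because the status is part of the value rather than a pass/fail gate, a test that cares about it would include an assertion such as `$.status_code.eq(200)`.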

Exec#

For exec tests, the validation value is:

{
  "exit_code": 0,
  "target_count": 1,
  "targets": [
    {
      "workload": "app",
      "variant": "default",
      "artifact_root": "/path/to/prepared/artifact",
      "command_root": "/path/to/command/root"
    }
  ]
}

Failure behavior:

exec tests produce a result object only when the command exits successfully.
non-zero command exits fail the test before validations run.

Fixtures#

Fixture Directory Layout#

Fixture directories live under:

tests/<test-id>/<mount-id>/

Example:

tests/
  image-fixture-smoke/
    images/
      photo.jpg

Mount Binding#

Fixture mount binding:

<test-id> is the test id.
<mount-id> matches a declared host_mounts key in capability.yml.
fixture mounts are auto-bound for that test.
explicit --mount <mount-id>=<path> values override fixture mounts.
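The binding rules above amount to a two-step merge: fixture directories that match declared mounts are bound first, then explicit values replace them. The sketch below is illustrative only. `bindFixtureMounts` and its parameters are hypothetical, and the relative path format is an assumption based on the directory layout shown earlier.

```typescript
// Map of mount id to host path.
type MountBindings = Record<string, string>;

// Compute mount bindings for one test.
function bindFixtureMounts(
  testId: string,
  declaredMountIds: string[], // host_mounts keys from capability.yml
  fixtureDirNames: string[], // directory names found under tests/<test-id>/
  explicitMounts: MountBindings, // values passed with --mount
): MountBindings {
  const bindings: MountBindings = {};
  for (const mountId of declaredMountIds) {
    // Only directories named after a declared mount are auto-bound.
    if (fixtureDirNames.includes(mountId)) {
      bindings[mountId] = `tests/${testId}/${mountId}`;
    }
  }
  // Explicit --mount values override fixture mounts.
  return { ...bindings, ...explicitMounts };
}
```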

Execution Order#

Exec tests run first as preflight checks.
MCP and HTTP tests run after exec tests.
Exec tests run against prepared workload artifacts.
MCP and HTTP tests run against a live test runtime.
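The ordering rule can be sketched as a stable partition of the test list. The `TestCase` shape, the `orderTests` name, and the literal kind values `exec`, `mcp`, and `http` are assumptions for illustration. This sketch also does not model what happens to later tests when a preflight check fails.

```typescript
// Assumed kind values; check the `kind` field reference for the real ones.
type TestKind = "exec" | "mcp" | "http";

interface TestCase {
  id: string;
  kind: TestKind;
}

// Put exec tests first, keeping the declared order within each group.
function orderTests(tests: TestCase[]): TestCase[] {
  const preflight = tests.filter((test) => test.kind === "exec");
  const live = tests.filter((test) => test.kind !== "exec");
  return [...preflight, ...live];
}
```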

Example Test Manifests#

hello-world-demo-kit: MCP structured result validation and exec typecheck.
local-image-tagger-demo-kit: MCP test with fixture-backed host mount input.
stable-diffusion-local-kit: HTTP test against a workload-local test endpoint plus exec typecheck.
realtime-voice-demo-kit: HTTP test covering connected workloads and generated result assertions.

Workloads#

Each source workload lives in workloads/<workload-name>/.

The workload workspace is the root filesystem view available to that workload. It includes the workload's own files, declared mounts, runtime home/cache/tmp dirs, and CapaKit toolchain paths.

A workload can only see inside its workspace. Sibling workload source directories are outside that workspace.

Workload Layout#

Typical generated Bun workload:

workloads/app/
  .gitignore
  package.json
  tsconfig.json
  src/
    index.ts
    capakit_mcp.ts

`capakit exec`#

Run workload commands through CapaKit:

capakit exec app -- bun install
capakit exec app -- bun test
capakit exec --all -- bun run build

capakit exec runs commands inside the workload workspace. File writes land in the local workload directory, while the process keeps the same workspace boundaries as the workload runtime.

Bun Workloads#

Generated Bun workloads use @capakit/sdk and protocol-specific helpers:

MCP: @capakit/sdk/mcp.
HTTP: @capakit/sdk plus Hono/Vite/React scaffold.
A2A: @capakit/sdk/a2a.
OpenAI-compatible: @capakit/sdk/oaic.

Generated Bun manifests use prepare.command: bun install. Generated MCP, A2A, and OpenAI-compatible workers start with bun run src/index.ts; app scaffolds can use scripts such as bun run start.

CLI Reference#

The capakit CLI is the runtime and toolkit for coding agents.

Global Flags#

Use -h or --help on any command:

capakit --help
capakit --version
capakit <command> --help

Output Formats#

capakit defaults to human-readable text output.

Use --output json for commands that produce structured summaries or tables:

capakit kit info --output json
capakit kit workloads list --output json
capakit storage status --output json

Set the default output format with:

CAPAKIT_OUTPUT_FORMAT=json

Supported values are text and json. --output takes precedence over CAPAKIT_OUTPUT_FORMAT.
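The precedence rule can be sketched as a small resolver. `resolveOutputFormat` is a hypothetical function, and rejecting unsupported values with an error is an assumption. This page only states the supported values, the text default, and that `--output` wins over `CAPAKIT_OUTPUT_FORMAT`.

```typescript
type OutputFormat = "text" | "json";

// Pick the output format: flag first, then environment, then the default.
function resolveOutputFormat(
  flagValue: string | undefined, // value of --output, if given
  envValue: string | undefined, // value of CAPAKIT_OUTPUT_FORMAT, if set
): OutputFormat {
  const candidate = flagValue ?? envValue ?? "text";
  if (candidate !== "text" && candidate !== "json") {
    throw new Error(`unsupported output format: ${candidate}`);
  }
  return candidate;
}
```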

Logging#

Runtime logs are separate from command output. For machine-readable logs, set:

CAPAKIT_LOG_FORMAT=json

Useful diagnostic controls:

--verbose: raise CapaKit log verbosity for the command.
--log-details lineage|fields|all|none: include internal lineage and/or structured log fields.
CAPAKIT_LOG_DETAILS=lineage|fields|all|none: default log detail mode.
CAPAKIT_CLI_THEME=auto|dark|light|mono: override terminal log styling.

Commands#

Command	Purpose
`capakit run`	Run a Kit.
`capakit test`	Run a Kit's `capability-test.yml`.
`capakit exec`	Run a command inside a workload context.
`capakit kit package`	Package a local Kit as a `.capakit` archive.
`capakit kit info`	Inspect a Kit.
`capakit kit workloads list`	List Kit workloads.
`capakit storage status`	Inspect CapaKit storage usage.
`capakit registry search`	Search public registry Kits.

Workload SDKs#

Workload SDKs are used inside workload source code. They give a workload its CapaKit runtime context, expose protocol servers to the runtime, call connected workloads, resolve declared secrets, and read declared host mounts.

TypeScript SDK#

Bun workloads use the TypeScript SDK package @capakit/sdk.

Use the root module for runtime primitives:

import {
  createWorkloadSdk,
  endpointPath,
  hostMountMid,
  secretMid,
  workloadMid,
} from "@capakit/sdk";

Core APIs#

createWorkloadSdk(): creates the workload runtime SDK.
sdk.start(): starts the workload's mounted protocol endpoints.
sdk.stop(): stops mounted endpoints and SDK clients.
sdk.mount(...): mounts an HTTP-style endpoint handler.
sdk.workloads.endpoint(...): resolves a manifest-declared connected workload endpoint.
sdk.secrets.resolve(secretMid("...")): resolves a Kit secret listed in exposed_secrets.
sdk.mounts.get(hostMountMid("...")): reads a declared host mount binding.
endpointPath, workloadMid, secretMid, hostMountMid: typed id helpers for SDK calls.

HTTP Workloads#

Minimal HTTP workload:

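import { createWorkloadSdk, endpointPath } from "@capakit/sdk";

// Create the workload runtime SDK.
const sdk = createWorkloadSdk();

// Mount an HTTP endpoint handler at /http that returns a JSON body.
sdk.mount({
  protocol: "http",
  endpoint: endpointPath("/http"),
  handler: async () =>
    Response.json({
      ok: true,
    }),
});

// Start the mounted endpoints.
await sdk.start();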

MCP Workloads#

Provider helpers live in protocol-specific modules:

@capakit/sdk/mcp: mountMcp, createMcpClient.
@capakit/sdk/oaic: mountOaic, createOaicClient.
@capakit/sdk/a2a: mountA2a, createA2aClient.
@capakit/sdk/anthropic: createAnthropicClient.
@capakit/sdk/google-ai-studio: createGoogleAiStudioClient.
@capakit/sdk/websocket: connectWebSocket.

Minimal MCP workload:

import { createWorkloadSdk } from "@capakit/sdk";
import { mountMcp } from "@capakit/sdk/mcp";
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";

// Create the workload runtime SDK and an MCP server to expose through it.
const sdk = createWorkloadSdk();
const server = new McpServer({
  name: "hello",
  version: "1.0.0",
});

// Mount the MCP server on the /mcp endpoint.
mountMcp(sdk, {
  endpoint: "/mcp",
  server,
});

// Start the mounted endpoints.
await sdk.start();

Connected Workloads#

Calling a connected MCP workload:

import { createWorkloadSdk, endpointPath, workloadMid } from "@capakit/sdk";
import { createMcpClient } from "@capakit/sdk/mcp";

const sdk = createWorkloadSdk();

// Create an MCP client for the /mcp endpoint of the connected "tools" workload.
const tools = await createMcpClient(
  sdk,
  workloadMid("tools"),
  endpointPath("/mcp"),
);

// Call the "lookup" tool on the connected workload.
const result = await tools.callTool({
  name: "lookup",
  arguments: { query: "hello" },
});

The caller must list the target workload under its connections in capability.yml.

Example Kits:

hello-world-demo-kit: minimal Bun MCP workload.
local-image-tagger-demo-kit: SDK usage with mounts, connected model dependency, HTTP, and MCP.
llama-cpp-local-kit: OpenAI-compatible local model workload.

Security Model#

CapaKit treats Kits and their workloads as code that runs inside explicit boundaries. The local runtime prepares the Kit, starts workload processes, routes requests between them, and limits what each workload can see or call based on the Kit manifest.

A workload's lifecycle is sandboxed from the host across prepare, hydrate, and start commands.

Runtime Isolation#

Runtime isolation boundaries:

Kit
  Package/source with host-exposed public paths

  Workloads
    Code processes that implement the app
    Connected through an isolated workload service mesh

    Workload
      Sandboxed runtime process, such as `bun run start`
      Access to its workload files and declared resources

Secrets#

CapaKit has two secret stores:

Kit secrets: per-Kit secrets exposed to configured workloads.
Vault secrets: protected secrets shared between Kits and used by trusted CapaKit-managed integrations.

capability.yml stores secret declarations and grants, not secret values.

Kit secrets are for values a workload may need directly, such as an API key used by Kit source code. Workloads receive Kit secret access by listing the secret under exposed_secrets, then resolving it through the workload SDK.

Vault secrets are reserved for trusted CapaKit-managed integrations, such as Relays.

Isolation Resources#

Workload resource access is explicit and granular:

Environment: CapaKit constructs workload-specific env.
Filesystem: workloads can read their workload root, runtime home/cache/tmp dirs, CapaKit toolchain paths, and declared host mounts.
Network: workload-to-workload access is limited to manifest-declared connections.
Secrets: workload code resolves configured exposed Kit secrets.
RPC: CapaKit service-mesh RPC links use mTLS with SPIFFE-style service identities.

Prepare commands default to full network access for dependency installation. Start commands default to no IP networking. Workload commands can override the default with network: none, network: loopback, or network: full.
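The network defaults and overrides can be sketched as a lookup. `effectiveNetwork` and the `CommandPhase` type are hypothetical names. The sketch covers only the prepare and start defaults stated above and leaves out hydrate commands, whose default is not given in this section.

```typescript
type NetworkMode = "none" | "loopback" | "full";
type CommandPhase = "prepare" | "start";

// Resolve the network mode for a workload command.
function effectiveNetwork(
  phase: CommandPhase,
  override?: NetworkMode, // the command's `network:` value, if set
): NetworkMode {
  if (override !== undefined) {
    return override; // an explicit setting replaces the default
  }
  // Prepare needs the network for dependency installs; start gets none.
  return phase === "prepare" ? "full" : "none";
}
```

For example, a start command that serves a local port to a browser would need an explicit `network: loopback` or `network: full` under this model, since the start default is no IP networking.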

Run Modes#

CapaKit has two main local run modes:

Source mode: runs root Kit workloads from local source roots.
Managed mode: builds workload artifacts and runs from those artifacts.

Imported Kit dependencies use managed artifact execution by default.

Sandbox Backends#

CapaKit translates the same workload isolation model into different runtime backends.

On macOS, CapaKit runs workload commands through a generated Seatbelt policy. The generated policy starts from default deny, then allowlists required system files, runtime paths, host mounts, IPC paths, network access, and GPU access based on the workload manifest.

Set CAPAKIT_DEBUG_SEATBELT_POLICY=1 to have CapaKit write the generated Seatbelt policy path to stderr for debugging.

Relays#

Relays are CapaKit-managed exit workloads for calling trusted external providers.

Instead of injecting provider secrets into source workloads, Relay exit nodes attach headers in transit and route connections to preconfigured provider URLs. The source workload talks to the Relay like any other connected workload.

The result is a narrower trust surface:

provider API keys stay out of source workload code
the provider key can live in the global vault secret store
external provider access is represented as a workload connection in capability.yml
the Relay is a CapaKit-managed runtime attachment

Relays are provider-specific adapters with explicit authentication behavior. They are not general-purpose proxies.

Registry#

The public Registry is a catalog of example and reusable Kits. Its source of truth is github.com/capakit/registry, and the CLI reads the catalog from that repository.

Package Shape#

A packaged Kit is a .capakit archive created from a local Kit source directory.

capakit kit package

Run packaged Kits with capakit run. Edit, test, and package from local source directories.

Registry Metadata#

Use the Registry to discover current public Kits:

capakit registry tags
capakit registry search --tag llama.cpp
capakit registry info llama-cpp-local

Publishing Flow#

Public Registry updates are managed through the capakit/registry repository.

Support#

For runtime bugs, CLI failures, docs issues, or confusing behavior, open an issue in github.com/capakit/cli.

Issue Reports#

Include:

capakit --version
macOS version and CPU architecture
install method: shell installer, Homebrew, or GitHub Release/manual download
the command you ran
the Kit source, preferably a GitHub link when possible
relevant error output or logs

Runtime Diagnostics#

Useful diagnostic inputs:

--verbose
--log-details lineage|fields|all|none
CAPAKIT_LOG_FORMAT=json
CAPAKIT_DEBUG_SEATBELT_POLICY=1

Do not open public GitHub issues for vulnerabilities or sensitive reports. Use the security contact flow on capakit.com/security.
