Architecture v1 — MVP

This is the architecture for the first shippable version of the Jan Agent Framework. It covers the core trait system, the runtime orchestrator, the memory and policy plugins, and CLI integration: everything needed to run jan agent chat through the framework with pluggable components.

Items marked as future in the complete architecture (v2) are excluded here: hooks, I/O type erasure, the turn engine, the builder pattern, the benchmark harness, and the structured error strategy.

Component architecture

The system is organized into four layers — consumer, orchestration, core, and execution:

Figure: Component architecture, four layers from consumer to execution

The agent core never makes direct HTTP calls. All LLM inference goes through ctx.provider.chat_completion() and all tool execution goes through ctx.tools.dispatch().

Component interaction

The AgentRuntime sits at the center. Consumers call run_turn(), which builds an AgentContext containing all real components and passes it to the core. Events flow out via a broadcast channel.

Figure: Component interaction between AgentRuntime, AgentContext, and AgentCore

Trait hierarchy

These are the contract interfaces that define the framework's extension points:


AgentCore                               LlmProvider
├── core_type() -> &str                 ├── name() -> &str
├── run_turn(input, ctx)                └── chat_completion(messages, tools, config)
├── init(ctx)                                   -> Result<LlmResponse, ProviderError>
├── shutdown(ctx)
└── built_in_tools() -> Vec<ToolMeta>

ToolDispatcher                          MemoryPlugin
├── dispatch(tool_id, args)             ├── read(query) -> Vec<MemoryEntry>
│       -> Result<DispatchResult>       ├── write(key, content)
└── tool_schemas() -> Vec<ToolMeta>     ├── forget(key)
                                        ├── pre_turn_context(msg, history)
RuntimePolicy                           └── post_turn_observe(msg, response, history)
├── check_permission(tool, args)
│       -> Result<ExecutionGrant>       SensorPlugin
└── name() -> &str                      ├── latest_frame() -> Option<SensorFrame>
                                        ├── frame_count() -> u64
                                        └── is_active() -> bool
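
To make the shape of these contracts concrete, here is a minimal sketch of one of them, ToolDispatcher, written as a Rust trait. It is a simplification under stated assumptions: the method is synchronous (the real trait is likely async), errors are plain strings, and the fields of ToolMeta and DispatchResult, the EchoDispatcher type, and the call_tool helper are hypothetical and exist only for illustration.

```rust
// Simplified, synchronous stand-in for the ToolDispatcher contract.
// Field layouts and the String error type are assumptions, not the framework's API.

#[derive(Debug, Clone, PartialEq)]
pub struct ToolMeta {
    pub id: String,
    pub description: String,
}

#[derive(Debug, Clone, PartialEq)]
pub struct DispatchResult {
    pub output: String,
}

pub trait ToolDispatcher {
    /// Execute one tool call by id.
    fn dispatch(&mut self, tool_id: &str, args: &str) -> Result<DispatchResult, String>;
    /// Advertise the tools the LLM may call.
    fn tool_schemas(&self) -> Vec<ToolMeta>;
}

/// Hypothetical dispatcher exposing a single `echo` tool.
pub struct EchoDispatcher;

impl ToolDispatcher for EchoDispatcher {
    fn dispatch(&mut self, tool_id: &str, args: &str) -> Result<DispatchResult, String> {
        match tool_id {
            "echo" => Ok(DispatchResult { output: args.to_string() }),
            other => Err(format!("unknown tool: {other}")),
        }
    }

    fn tool_schemas(&self) -> Vec<ToolMeta> {
        vec![ToolMeta {
            id: "echo".to_string(),
            description: "Returns its arguments unchanged".to_string(),
        }]
    }
}

/// Callers only see the trait object, so implementations can be swapped freely.
pub fn call_tool(tools: &mut dyn ToolDispatcher, id: &str, args: &str) -> String {
    match tools.dispatch(id, args) {
        Ok(result) => result.output,
        Err(e) => format!("error: {e}"),
    }
}
```

Because call_tool takes &mut dyn ToolDispatcher, replacing EchoDispatcher with another implementation requires no change on the caller's side, which is the property the extension points rely on.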

AgentRuntime — the orchestrator

The runtime owns all components as trait objects and manages the agent lifecycle:


pub struct AgentRuntime {
    core:     Box<dyn AgentCore>,
    provider: Box<dyn LlmProvider>,
    tools:    Box<dyn ToolDispatcher>,
    memory:   Box<dyn MemoryPlugin>,
    policy:   Box<dyn RuntimePolicy>,
    config:   FrameworkConfig,
    event_tx: EventSender,
}

During a turn, the runtime temporarily moves real components into an AgentContext using std::mem::replace with noop placeholders, gives the core exclusive ownership, then restores them after the turn completes.


impl AgentRuntime {
    pub fn new(core, provider, tools, memory, policy) -> Self;
    pub fn with_config(self, config: FrameworkConfig) -> Self;
    pub async fn init(&mut self) -> Result<()>;
    pub async fn run_turn(&mut self, message, history, cancel) -> Result<TurnOutput>;
    pub fn events(&self) -> EventReceiver;
    pub async fn shutdown(&mut self) -> Result<()>;
}
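
The take/restore step described above can be sketched with a single component. This is an illustrative reduction, not the framework's code: only the memory slot is shown, the methods are synchronous, and MemoryPlugin::name, NoopMemory, and memory_name are hypothetical names introduced for the example. The take_context and restore_context names follow the data-flow description in this document.

```rust
use std::mem;

// Assumed, simplified component trait with one method for observability.
pub trait MemoryPlugin {
    fn name(&self) -> &str;
}

pub struct WorkingMemory;
impl MemoryPlugin for WorkingMemory {
    fn name(&self) -> &str { "working" }
}

/// Placeholder left in the runtime while the real component is lent out.
pub struct NoopMemory;
impl MemoryPlugin for NoopMemory {
    fn name(&self) -> &str { "noop" }
}

/// Owned bundle handed to the core for the duration of one turn.
pub struct AgentContext {
    pub memory: Box<dyn MemoryPlugin>,
}

pub struct AgentRuntime {
    memory: Box<dyn MemoryPlugin>,
}

impl AgentRuntime {
    pub fn new(memory: Box<dyn MemoryPlugin>) -> Self {
        Self { memory }
    }

    /// Move the real component out, leaving a noop in its place.
    fn take_context(&mut self) -> AgentContext {
        let placeholder: Box<dyn MemoryPlugin> = Box::new(NoopMemory);
        let memory = mem::replace(&mut self.memory, placeholder);
        AgentContext { memory }
    }

    /// Move the real component back once the turn is over.
    fn restore_context(&mut self, ctx: AgentContext) {
        self.memory = ctx.memory;
    }

    /// Returns (component seen by the core, component left in the runtime mid-turn).
    pub fn run_turn(&mut self) -> (String, String) {
        let ctx = self.take_context();
        let seen_by_core = ctx.memory.name().to_string();
        let left_behind = self.memory.name().to_string();
        self.restore_context(ctx);
        (seen_by_core, left_behind)
    }

    pub fn memory_name(&self) -> &str {
        self.memory.name()
    }
}
```

During the turn the context owns the real component outright, so the core needs no locking to use it. One consequence worth noting: if a turn exits early without calling restore_context, the runtime is left holding the placeholder.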

Data flow — one turn end to end


User input (text)
    │
    ▼
AgentRuntime::run_turn(message, history, cancel)
    │
    ├── memory.pre_turn_context(message, history)
    │       → returns context entries to inject
    │
    ├── take_context()
    │       → swaps real components into AgentContext
    │
    ├── core.run_turn(TurnInput, AgentContext)
    │   │
    │   │   ┌── REACT LOOP (may iterate many steps) ──────────────┐
    │   │   │                                                      │
    │   │   │  1. Build messages (system + history + user + vision) │
    │   │   │  2. Check cancellation                               │
    │   │   │  3. Check token budget → compact if needed           │
    │   │   │  4. provider.chat_completion(messages, tools, config) │
    │   │   │     ├── on success → LlmResponse                    │
    │   │   │     ├── on ContextLength → compact and retry         │
    │   │   │     ├── on RateLimited → exponential backoff         │
    │   │   │     └── on AuthError → fatal                         │
    │   │   │  5. If final text (no tool calls) → return           │
    │   │   │  6. If empty + no tools → nudge model, continue      │
    │   │   │  7. For each tool call:                              │
    │   │   │     a. ctx.tools.dispatch(name, args)                │
    │   │   │     b. Compress result, add to messages              │
    │   │   │     c. Emit AgentEvent::ToolResult                   │
    │   │   │  8. If vision → inject latest frame                  │
    │   │   │  9. Loop back to step 2                              │
    │   │   └──────────────────────────────────────────────────────┘
    │   │
    │   └── returns TurnOutput { response, tokens_used, steps,
    │                             finish_reason, history }
    │
    ├── restore_context()
    │
    ├── memory.post_turn_observe(message, response, history)
    │
    └── returns TurnOutput to consumer
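
The error-handling branches of the loop (steps 4 to 7) can be sketched as follows. This is a synchronous toy under several assumptions: ScriptedProvider is a hypothetical test double that replays canned results, compaction is reduced to dropping the oldest message, tool dispatch is replaced by an echo line, the backoff base of 1 ms is arbitrary, and steps counts loop iterations including retries. Cancellation, token budgeting, and vision are omitted.

```rust
use std::collections::VecDeque;
use std::thread;
use std::time::Duration;

#[derive(Debug, Clone, PartialEq)]
pub enum ProviderError { ContextLength, RateLimited, AuthError }

#[derive(Debug, Clone, PartialEq)]
pub struct LlmResponse {
    pub text: String,
    pub tool_calls: Vec<(String, String)>, // (tool name, args)
}

/// Hypothetical provider that replays a fixed script of results.
pub struct ScriptedProvider {
    pub script: VecDeque<Result<LlmResponse, ProviderError>>,
}

impl ScriptedProvider {
    pub fn chat_completion(&mut self, _messages: &[String]) -> Result<LlmResponse, ProviderError> {
        // An exhausted script is treated as a fatal error to keep the demo finite.
        self.script.pop_front().unwrap_or(Err(ProviderError::AuthError))
    }
}

#[derive(Debug, Clone, PartialEq)]
pub enum FinishReason { Stop, MaxSteps }

#[derive(Debug, Clone, PartialEq)]
pub struct TurnOutput {
    pub response: String,
    pub steps: u32,
    pub finish_reason: FinishReason,
}

pub fn run_turn(
    provider: &mut ScriptedProvider,
    user: &str,
    max_steps: u32,
) -> Result<TurnOutput, ProviderError> {
    let mut messages = vec![format!("user: {user}")];
    let mut attempt: u32 = 0;
    for step in 1..=max_steps {
        match provider.chat_completion(&messages) {
            // Final text with no tool calls ends the turn.
            Ok(resp) if resp.tool_calls.is_empty() && !resp.text.is_empty() => {
                return Ok(TurnOutput { response: resp.text, steps: step, finish_reason: FinishReason::Stop });
            }
            // Empty response with no tool calls: nudge the model and continue.
            Ok(resp) if resp.tool_calls.is_empty() => {
                messages.push("system: please answer or call a tool".to_string());
            }
            // Tool calls: stand-in for ctx.tools.dispatch(name, args).
            Ok(resp) => {
                for (name, args) in resp.tool_calls {
                    messages.push(format!("tool {name}: echoed {args}"));
                }
            }
            // Context too long: crude compaction, then retry on the next iteration.
            Err(ProviderError::ContextLength) => {
                if messages.len() > 1 {
                    messages.remove(0);
                }
            }
            // Rate limited: exponential backoff (1, 2, 4, ... ms, capped).
            Err(ProviderError::RateLimited) => {
                thread::sleep(Duration::from_millis(1u64 << attempt.min(10)));
                attempt += 1;
            }
            // Authentication failures are fatal.
            Err(ProviderError::AuthError) => return Err(ProviderError::AuthError),
        }
    }
    Ok(TurnOutput { response: String::new(), steps: max_steps, finish_reason: FinishReason::MaxSteps })
}
```

Feeding the provider a script of RateLimited, a tool call, ContextLength, and then a final text exercises each recoverable branch once before the turn returns.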

Event system

The core emits events through a tokio::sync::broadcast channel:


ReActCore (inside run_turn)
    │
    │  ctx.events.send(AgentEvent::Thinking { step })
    │  ctx.events.send(AgentEvent::ToolCall { step, tool_id, args })
    │  ctx.events.send(AgentEvent::ToolResult { ... })
    │  ctx.events.send(AgentEvent::Retrying { step, attempt, delay_ms })
    │  ctx.events.send(AgentEvent::ContextCompacted { turns_removed })
    │  ctx.events.send(AgentEvent::TokenBudget { used, total })
    │
    ▼
EventSender (tokio::sync::broadcast)
    │
    ├──▶ Direct subscriber (Jan Desktop, API server)
    │
    └──▶ spawn_event_bridge() → mpsc channel (jan-cli TUI)
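
The bridge at the bottom of the diagram is essentially a forwarding task that converts each event into whatever the consumer expects. The sketch below shows that forwarding step only, using the standard library so it stays self-contained: std::sync::mpsc stands in for the tokio broadcast channel, a thread stands in for a spawned task, and the to_tui_line text format and the fs.read tool id are hypothetical. The function reuses the name spawn_event_bridge, but its signature here is an assumption.

```rust
use std::sync::mpsc::{channel, Receiver};
use std::thread;

#[derive(Debug, Clone, PartialEq)]
pub enum AgentEvent {
    Thinking { step: u32 },
    ToolCall { step: u32, tool_id: String },
    ContextCompacted { turns_removed: u32 },
}

/// Hypothetical line format consumed by the TUI side.
fn to_tui_line(event: &AgentEvent) -> String {
    match event {
        AgentEvent::Thinking { step } => format!("[{step}] thinking"),
        AgentEvent::ToolCall { step, tool_id } => format!("[{step}] tool {tool_id}"),
        AgentEvent::ContextCompacted { turns_removed } => {
            format!("compacted {turns_removed} turns")
        }
    }
}

/// Forward framework events to a second channel, converting them on the way.
pub fn spawn_event_bridge(events: Receiver<AgentEvent>) -> Receiver<String> {
    let (tx, rx) = channel();
    thread::spawn(move || {
        // The loop ends when the framework side drops its sender.
        for event in events {
            if tx.send(to_tui_line(&event)).is_err() {
                break; // the TUI side went away
            }
        }
    });
    rx
}

/// Small end-to-end run: emit two events, close the channel, collect the output.
pub fn demo_bridge() -> Vec<String> {
    let (event_tx, event_rx) = channel();
    let tui_rx = spawn_event_bridge(event_rx);
    event_tx.send(AgentEvent::Thinking { step: 1 }).unwrap();
    event_tx
        .send(AgentEvent::ToolCall { step: 1, tool_id: "fs.read".to_string() })
        .unwrap();
    drop(event_tx);
    tui_rx.iter().collect()
}
```

A real broadcast channel differs in one important way: every subscriber receives its own copy of each event, so the bridge would be just one subscriber among several rather than the sole consumer.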

Crate layout


crates/
├── jan-framework/          ← Traits + types (the contract crate)
│   └── src/
│       ├── agent_core.rs   AgentCore trait, AgentContext, TurnInput/Output
│       ├── provider.rs     LlmProvider trait, CompletionConfig, ProviderError
│       ├── tool.rs         ToolDispatcher trait
│       ├── memory.rs       MemoryPlugin + NullMemory, WorkingMemory
│       ├── policy.rs       RuntimePolicy + HostPolicy, StandardPolicy
│       ├── sensor.rs       SensorPlugin trait
│       ├── events.rs       AgentEvent enum, EventSender/Receiver
│       ├── types.rs        ChatMessage, ToolMeta, FrameworkConfig, LlmResponse
│       ├── registry.rs     PluginRegistry
│       └── runtime.rs      AgentRuntime (orchestrator)
│
├── jan-agent-core/         ← Agent implementations
│   └── src/
│       ├── react_core.rs   ReActCore (standalone AgentCore impl)
│       ├── openai_provider.rs  OpenAiProvider (LlmProvider impl)
│       ├── helpers.rs      Context compaction, system prompts
│       ├── dispatcher.rs   Dispatcher (WASM/host/microVM routing)
│       └── vision.rs       VisionProvider + captures
│
├── jan-cli/                ← CLI binary (reference consumer)
│   └── src/
│       ├── main.rs         CLI entry, TUI loop, one-shot mode
│       ├── runtime_builder.rs  build_runtime() + spawn_event_bridge()
│       └── agent_tui.rs    Ratatui TUI state and rendering
│
├── jan-agent-sandbox/      ← WASM execution engine (wasmtime)
├── jan-data/               ← Model discovery, thread storage
├── jan-llamacpp/           ← LlamaCPP process management
└── jan-utils/              ← Shared utilities

Security model


User input
    │
    ▼
LLM inference (via ctx.provider)
    │
    ▼
Tool call requested by LLM
    │
    ▼
RuntimePolicy::check_permission()
    │
    ├── Denied → error returned to LLM, turn continues
    ├── Granted(Host) → native execution, full access
    ├── Granted(Wasm) → wasmtime sandbox, fuel-limited
    ├── Granted(MicroVm) → microsandbox, network-isolated
    └── Granted(Remote) → forwarded to remote endpoint
    │
    ▼
Tool execution (in granted sandbox)

Policy	Behavior
HostPolicy	All tools run natively on host (development)
StandardPolicy	WASM default, microVM for `code.exec`, host for `robot.*`
ProfilePolicy	Configurable per-tool overrides with deny-list
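
The three policies in the table can be sketched as implementations of a single trait. The routing rules for StandardPolicy come from the table; everything else is an assumption made for illustration: the synchronous signature, the String error type, the field names of ProfilePolicy, its fallback to WASM for tools without an override, and tool ids other than code.exec and robot.*.

```rust
use std::collections::{HashMap, HashSet};

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ExecutionGrant { Host, Wasm, MicroVm, Remote }

pub trait RuntimePolicy {
    fn name(&self) -> &str;
    fn check_permission(&self, tool: &str, args: &str) -> Result<ExecutionGrant, String>;
}

/// Development policy: everything runs natively on the host.
pub struct HostPolicy;
impl RuntimePolicy for HostPolicy {
    fn name(&self) -> &str { "host" }
    fn check_permission(&self, _tool: &str, _args: &str) -> Result<ExecutionGrant, String> {
        Ok(ExecutionGrant::Host)
    }
}

/// WASM by default, microVM for `code.exec`, host for `robot.*`.
pub struct StandardPolicy;
impl RuntimePolicy for StandardPolicy {
    fn name(&self) -> &str { "standard" }
    fn check_permission(&self, tool: &str, _args: &str) -> Result<ExecutionGrant, String> {
        if tool == "code.exec" {
            Ok(ExecutionGrant::MicroVm)
        } else if tool.starts_with("robot.") {
            Ok(ExecutionGrant::Host)
        } else {
            Ok(ExecutionGrant::Wasm)
        }
    }
}

/// Per-tool overrides plus a deny-list; field names are hypothetical.
pub struct ProfilePolicy {
    pub overrides: HashMap<String, ExecutionGrant>,
    pub deny: HashSet<String>,
}
impl RuntimePolicy for ProfilePolicy {
    fn name(&self) -> &str { "profile" }
    fn check_permission(&self, tool: &str, _args: &str) -> Result<ExecutionGrant, String> {
        // The deny-list wins over any override.
        if self.deny.contains(tool) {
            return Err(format!("tool `{tool}` is denied by profile"));
        }
        // Assumed fallback: tools without an override run in the WASM sandbox.
        Ok(self.overrides.get(tool).copied().unwrap_or(ExecutionGrant::Wasm))
    }
}
```

A denied call surfaces as an Err, which matches the flow above where the error is returned to the LLM and the turn continues.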

Key design decisions

Decision	Rationale
Trait objects over generics	`AgentRuntime` stores `Box<dyn Trait>`. Consumers never deal with `AgentRuntime<C, P, T, M, Pol>`. Dynamic dispatch cost is negligible vs LLM latency.
Take/restore over shared references	`std::mem::replace` moves real components into `AgentContext`. No `Arc<Mutex<>>` in the hot loop.
Event bridge for TUI	Framework uses `tokio::sync::broadcast`; TUI expects `mpsc`. `spawn_event_bridge()` converts between the two via JSON (serde). This bridge is temporary.
Two ChatMessage types	Legacy `agent.rs` has its own type. ReActCore uses the framework type exclusively. The legacy type persists for backward compatibility only.

What v1 does NOT include

These are designed but deferred to Architecture v2:

Capability	Why deferred
Lifecycle hooks (HookChain)	Designed in doc 08; AgentContext has no `.hooks` field yet
I/O type erasure (AgentInput/Output)	Designed in doc 09; AgentCore uses fixed TurnInput/TurnOutput
Turn engine extraction	Shared single-turn runner for all cores (doc 12)
Builder pattern	`AgentRuntimeBuilder` with fluent API (doc 15)
Benchmark harness	Criterion benchmarks with mock LLM (doc 16)
Structured error strategy	Layered error handling by severity (doc 17)

Architecture v2 — Complete