Lifecycle Hooks

⚠️ This is a v2 design and is not yet implemented. See Architecture v2 for context.

The AgentHook system provides composable, priority-ordered hooks for cross-cutting concerns. Unlike plugins (which provide capabilities), hooks observe and optionally intercept lifecycle events.

The problem

The current framework has hook-like mechanisms scattered across layers — init() / shutdown() on AgentCore, pre_turn_context() / post_turn_observe() on MemoryPlugin, check_permission() on RuntimePolicy. There is no unified mechanism for code that needs to observe or intercept behavior at multiple lifecycle points.

Use cases that cannot be cleanly implemented today:

Use Case	Required Hook Points
Audit logging	pre-turn, post-turn, pre-tool, post-tool, errors
Cost tracking	post-LLM-call (token counts), post-turn (aggregate)
Input guardrails	pre-turn (before LLM sees the message)
Output guardrails	post-LLM-call (before tool dispatch or response)
Tool argument rewriting	pre-tool (modify args before execution)
Human-in-the-loop	pre-tool (for dangerous operations)
Provider fallback	on-LLM-error (switch to backup provider)

The `AgentHook` trait


#[async_trait]
pub trait AgentHook: Send + Sync {
    fn name(&self) -> &str;
    fn priority(&self) -> u32 { 100 }  // lower = runs first
    fn fail_open(&self) -> bool { true } // true = errors are non-fatal
    // Session lifecycle
    async fn on_session_start(&self, ctx: &HookContext) -> HookResult<()>;
    async fn on_session_end(&self, ctx: &HookContext) -> HookResult<()>;
    // Turn lifecycle
    async fn pre_turn(&self, input: &TurnInput, ctx: &HookContext)
        -> HookResult<HookAction<TurnInput>>;
    async fn post_turn(&self, input: &TurnInput, output: &TurnOutput, ctx: &HookContext)
        -> HookResult<HookAction<TurnOutput>>;
    async fn on_turn_error(&self, input: &TurnInput, error: &AgentError, ctx: &HookContext)
        -> HookResult<HookAction<TurnOutput>>;
    // LLM call lifecycle
    async fn pre_llm_call(&self, messages: &[ChatMessage], tools: &[ToolMeta], ctx: &HookContext)
        -> HookResult<HookAction<LlmCallInput>>;
    async fn post_llm_call(&self, response: &LlmResponse, ctx: &HookContext)
        -> HookResult<HookAction<LlmResponse>>;
    async fn on_llm_error(&self, error: &ProviderError, attempt: usize, ctx: &HookContext)
        -> HookResult<HookAction<RetryDecision>>;
    // Tool lifecycle
    async fn pre_tool_call(&self, tool_name: &str, args: &Value, grant: &ExecutionGrant, ctx: &HookContext)
        -> HookResult<HookAction<Value>>;
    async fn post_tool_call(&self, tool_name: &str, args: &Value, result: &Result<DispatchResult, String>, elapsed: Duration, ctx: &HookContext)
        -> HookResult<HookAction<DispatchResult>>;
    // Context management
    async fn pre_compaction(&self, messages: &[ChatMessage], budget: &TokenBudget, ctx: &HookContext)
        -> HookResult<HookAction<CompactionHint>>;
    async fn post_compaction(&self, removed_count: usize, summary: &str, ctx: &HookContext)
        -> HookResult<()>;
}

The listing above shows signatures only. Every method has a default no-op implementation, so a hook overrides only the points it cares about.
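The `fail_open` flag implies that the chain runner treats a failing hook differently depending on the hook's own setting. The following is a minimal sketch of one way that could work, not the framework's implementation: `SimpleHook`, `on_event`, `run_observers`, and the `String` error type are hypothetical stand-ins, and the sketch is synchronous while the real trait is async.

```rust
/// Hypothetical, synchronous stand-in for `AgentHook` (the real trait is
/// async and has one method per lifecycle point).
trait SimpleHook {
    fn name(&self) -> &str;
    /// true = an error from this hook is tolerated; false = it aborts the operation.
    fn fail_open(&self) -> bool { true }
    /// Default no-op, so implementors override only what they need.
    fn on_event(&self, _event: &str) -> Result<(), String> { Ok(()) }
}

/// Runs every hook in order. Returns the names of hooks whose errors were
/// tolerated, or the first error raised by a fail-closed hook.
fn run_observers(hooks: &[Box<dyn SimpleHook>], event: &str) -> Result<Vec<String>, String> {
    let mut tolerated = Vec::new();
    for hook in hooks {
        if let Err(err) = hook.on_event(event) {
            if hook.fail_open() {
                // Non-fatal: remember the failure and keep going.
                tolerated.push(hook.name().to_string());
            } else {
                // Fatal: stop the chain and surface the error.
                return Err(format!("{}: {}", hook.name(), err));
            }
        }
    }
    Ok(tolerated)
}

/// Relies entirely on the defaults.
struct Noop;
impl SimpleHook for Noop {
    fn name(&self) -> &str { "noop" }
}

/// Fails, but is fail-open by default.
struct FlakyMetrics;
impl SimpleHook for FlakyMetrics {
    fn name(&self) -> &str { "flaky-metrics" }
    fn on_event(&self, _event: &str) -> Result<(), String> { Err("backend down".into()) }
}

/// Fails and is fail-closed, like the approval hook in this design.
struct StrictApproval;
impl SimpleHook for StrictApproval {
    fn name(&self) -> &str { "strict-approval" }
    fn fail_open(&self) -> bool { false }
    fn on_event(&self, _event: &str) -> Result<(), String> { Err("no approver".into()) }
}
```

With `Noop` and `FlakyMetrics` registered, the operation proceeds and the failure is only recorded. Adding `StrictApproval` turns the same kind of failure into an abort.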

HookAction — what hooks can do


pub enum HookAction<T> {
    Continue,          // pass through to next hook
    Transform(T),      // replace value, continue chain
    Reject(String),    // abort operation with reason
    Replace(T),        // replace value, skip remaining hooks
}

Two categories of hooks:

Gate hooks (2): on_session_start and pre_tool_call, which can abort the operation
Observation hooks (8): everything else, which can transform data but not abort

Hook composition

Multiple hooks execute in priority order (lower number = earlier):


pre_turn hooks (priority order):
  [guardrail (p=10)] → [logging (p=50)] → [metrics (p=100)]
       │                      │                     │
  check input             log input            start timer
       ▼                      ▼                     ▼
                   AgentCore::run_turn()
       │                      │                     │
post_turn hooks (reverse priority):
  [metrics (p=100)] → [logging (p=50)] → [guardrail (p=10)]
       │                      │                     │
  stop timer             log output           check output
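
The ordering in the diagram can be sketched as a sort by priority for the pre phase and the reverse of that order for the post phase. This is an illustrative sketch only: `Registered`, `pre_order`, and `post_order` are hypothetical names, and keeping registration order for equal priorities is an assumption the design does not state.

```rust
/// Hypothetical registration record holding only what ordering needs.
#[derive(Debug, Clone, PartialEq)]
struct Registered {
    name: &'static str,
    priority: u32,
}

/// Pre-phase order: ascending priority (lower number runs first).
/// `sort_by_key` is a stable sort, so hooks with equal priority keep
/// their registration order (an assumption, not part of the design).
fn pre_order(hooks: &[Registered]) -> Vec<&'static str> {
    let mut sorted: Vec<&Registered> = hooks.iter().collect();
    sorted.sort_by_key(|h| h.priority);
    sorted.into_iter().map(|h| h.name).collect()
}

/// Post-phase order: the exact reverse of the pre-phase order, so the
/// hook that ran first on the way in runs last on the way out.
fn post_order(hooks: &[Registered]) -> Vec<&'static str> {
    let mut names = pre_order(hooks);
    names.reverse();
    names
}
```

Reversing the order on the way out gives the hooks onion-like nesting: the guardrail wraps logging, which wraps metrics.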

Short-circuit rules

Continue — pass to next hook
Transform(val) — next hook sees the transformed value
Reject(reason) — remaining hooks skipped; operation fails
Replace(val) — remaining hooks skipped; value used as-is
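
The four rules above can be expressed as a small fold over the chain. The sketch below is a simplified, synchronous model under stated assumptions: each hook is a plain closure already sorted by priority, and `Hook` and `run_chain` are hypothetical names rather than part of the proposed API.

```rust
/// Mirrors the `HookAction` enum from the design.
enum HookAction<T> {
    Continue,
    Transform(T),
    Reject(String),
    Replace(T),
}

/// Hypothetical hook shape for this sketch: a closure over the current value.
type Hook<T> = Box<dyn Fn(&T) -> HookAction<T>>;

/// Applies the short-circuit rules. Returns the final value, or the
/// rejection reason if any hook rejects.
fn run_chain<T>(hooks: &[Hook<T>], initial: T) -> Result<T, String> {
    let mut current = initial;
    for hook in hooks {
        let action = hook(&current);
        match action {
            // Pass the value through unchanged.
            HookAction::Continue => {}
            // Later hooks see the transformed value.
            HookAction::Transform(next) => current = next,
            // Remaining hooks are skipped and the operation fails.
            HookAction::Reject(reason) => return Err(reason),
            // Remaining hooks are skipped and the value is used as-is.
            HookAction::Replace(last) => return Ok(last),
        }
    }
    Ok(current)
}
```

The only difference between `Transform` and `Replace` in this model is whether the loop keeps going, which is why a `Replace` from a high-priority hook hides the value from every later hook.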

HookContext and shared state


pub struct HookContext {
    pub session_id: String,
    pub turn_number: usize,
    pub step_number: usize,
    pub core_type: String,
    pub events: EventSender,
    pub state: HookState,  // shared key-value state across hooks
}
pub struct HookState {
    inner: Arc<RwLock<HashMap<String, Value>>>,
}

Hooks share state via HookState — for example, a cost tracker accumulates token costs across LLM calls.
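
The design does not list the methods on `HookState`. One possible shape is sketched below, assuming `get` and `set` accessors plus a hypothetical `add` helper. `Value` is narrowed to `f64` here only to keep the sketch dependency-free; the design presumably uses a JSON value.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

/// Stand-in for the design's `Value` (assumption: the real type is a JSON value).
type Value = f64;

/// Cloning shares the same underlying map, so every hook sees the same state.
#[derive(Clone, Default)]
pub struct HookState {
    inner: Arc<RwLock<HashMap<String, Value>>>,
}

impl HookState {
    /// Returns a copy of the stored value, if any.
    pub fn get(&self, key: &str) -> Option<Value> {
        self.inner.read().unwrap().get(key).copied()
    }

    /// Inserts or overwrites the value for `key`.
    pub fn set(&self, key: &str, value: Value) {
        self.inner.write().unwrap().insert(key.to_string(), value);
    }

    /// Read-modify-write under a single write lock, so two hooks adding to
    /// the same counter cannot lose an update between a `get` and a `set`.
    /// Returns the new total.
    pub fn add(&self, key: &str, delta: Value) -> Value {
        let mut map = self.inner.write().unwrap();
        let slot = map.entry(key.to_string()).or_insert(0.0);
        *slot += delta;
        *slot
    }
}
```

An accumulator such as a cost tracker would likely prefer something like `add` over a separate `get` followed by `set`, since the pair is not atomic when several calls run concurrently.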

Integration with AgentContext

The AgentContext gains a hooks: HookChain field:


pub struct AgentContext {
    pub provider: Box<dyn LlmProvider>,
    pub tools:    Box<dyn ToolDispatcher>,
    pub memory:   Box<dyn MemoryPlugin>,
    pub policy:   Box<dyn RuntimePolicy>,
    pub hooks:    HookChain,              // NEW
    pub events:   EventSender,
    pub config:   FrameworkConfig,
}

Cores call ctx.hooks.run_pre_llm_call() and its counterparts for the other hook points at the matching lifecycle points. Hooks are opt-in for cores: a minimal core can ignore hooks entirely.

Updated runtime flow


AgentRuntime::run_turn(message)
│
├─ hooks.pre_turn(input)              ← NEW
├─ memory.pre_turn_context()          ← existing
├─ core.run_turn(input, ctx)          ← existing
│    ├─ hooks.pre_llm_call()          ← NEW (inside core)
│    ├─ provider.chat_completion()
│    ├─ hooks.post_llm_call()         ← NEW
│    ├─ hooks.pre_tool_call()         ← NEW (per tool)
│    ├─ tool.execute()
│    └─ hooks.post_tool_call()        ← NEW
├─ memory.post_turn_observe()         ← existing
└─ hooks.post_turn(input, output)     ← NEW

Example hooks

Audit logging


pub struct AuditLogHook { logger: slog::Logger }
#[async_trait]  // the impl needs the same attribute as the trait to use async fn
impl AgentHook for AuditLogHook {
    fn name(&self) -> &str { "audit-log" }
    fn priority(&self) -> u32 { 50 }
    async fn pre_turn(&self, input: &TurnInput, ctx: &HookContext)
        -> HookResult<HookAction<TurnInput>>
    {
        slog::info!(self.logger, "turn_start";
            "session" => &ctx.session_id,
            "turn" => ctx.turn_number,
        );
        Ok(HookAction::Continue)
    }
}

Input guardrail


pub struct InputGuardrailHook { blocked_patterns: Vec<Regex> }
#[async_trait]
impl AgentHook for InputGuardrailHook {
    fn name(&self) -> &str { "input-guardrail" }
    fn priority(&self) -> u32 { 10 }  // run first
    async fn pre_turn(&self, input: &TurnInput, _ctx: &HookContext)
        -> HookResult<HookAction<TurnInput>>
    {
        for pattern in &self.blocked_patterns {
            if pattern.is_match(&input.message) {
                return Ok(HookAction::Reject("Message blocked by content policy".into()));
            }
        }
        Ok(HookAction::Continue)
    }
}

Cost tracking


pub struct CostTrackingHook {
    cost_per_input_token: f64,
    cost_per_output_token: f64,
}
#[async_trait]
impl AgentHook for CostTrackingHook {
    fn name(&self) -> &str { "cost-tracker" }
    async fn post_llm_call(&self, response: &LlmResponse, ctx: &HookContext)
        -> HookResult<HookAction<LlmResponse>>
    {
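        // Cost of this call: input tokens at the input rate plus output tokens at the output rate.
        let cost = (response.usage.input_tokens as f64 * self.cost_per_input_token)
                 + (response.usage.output_tokens as f64 * self.cost_per_output_token);
        // Assumes HookState exposes get/set. The read-then-write pair is not atomic,
        // so concurrent LLM calls in one session could lose an update.
        let prev: f64 = ctx.state.get("total_cost").and_then(|v| v.as_f64()).unwrap_or(0.0);
        ctx.state.set("total_cost", json!(prev + cost));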
        Ok(HookAction::Continue)
    }
}

Human-in-the-loop approval


pub struct ApprovalHook {
    patterns: Vec<String>,       // tools requiring approval
    approval_tx: mpsc::Sender<ApprovalRequest>,
    approval_rx: Arc<Mutex<mpsc::Receiver<ApprovalResponse>>>,
}
#[async_trait]
impl AgentHook for ApprovalHook {
    fn name(&self) -> &str { "human-approval" }
    fn priority(&self) -> u32 { 5 }       // run before other pre-tool hooks
    fn fail_open(&self) -> bool { false }  // approval failures must block
    async fn pre_tool_call(&self, tool_name: &str, args: &Value, ...) -> HookResult<HookAction<Value>> {
        if !self.requires_approval(tool_name) { return Ok(HookAction::Continue); }
        // Ask UI for approval, wait for response
        match self.request_approval(tool_name, args).await? {
            ApprovalResponse::Approved => Ok(HookAction::Continue),
            ApprovalResponse::Denied(reason) => Ok(HookAction::Reject(reason)),
            ApprovalResponse::Modified(new_args) => Ok(HookAction::Transform(new_args)),
        }
    }
}

Relationship to existing mechanisms

AgentEvent vs AgentHook

Aspect	`AgentEvent`	`AgentHook`
Direction	Outward (agent to observers)	Inward (interceptor to agent)
Can modify?	No — read-only	Yes — transform, reject, replace
Performance	Zero-cost if no subscribers	Async call overhead per hook
Use case	UI updates, logging	Guardrails, approval, arg rewriting

RuntimePolicy vs pre_tool_call hook

Policy remains the first gate (sandbox mode, limits, allow/deny). Hooks run after policy:


tool requested → Policy: where/how → Hooks: whether/what → execution
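
The two-stage gate above can be sketched as a function that consults the policy first and reaches the hook only when the policy allows the call. Everything here is a simplified assumption: `Gate` and `gate_tool_call` are hypothetical names, arguments are plain strings, and the hook's result is modeled as `Ok(None)` for Continue, `Ok(Some(args))` for Transform, and `Err(reason)` for Reject.

```rust
/// Hypothetical outcome of the policy-then-hooks gate.
#[derive(Debug, PartialEq)]
enum Gate {
    DeniedByPolicy(String),
    RejectedByHook(String),
    Execute { tool: String, args: String },
}

/// Policy is the first gate; hooks run only if the policy allows the tool.
fn gate_tool_call(
    tool: &str,
    args: &str,
    policy_allows: impl Fn(&str) -> bool,
    hook: impl Fn(&str, &str) -> Result<Option<String>, String>,
) -> Gate {
    if !policy_allows(tool) {
        // Hooks never see a call that the policy has already denied.
        return Gate::DeniedByPolicy(format!("policy denies `{}`", tool));
    }
    match hook(tool, args) {
        // Reject: the call does not execute.
        Err(reason) => Gate::RejectedByHook(reason),
        // Transform: execute with the rewritten arguments.
        Ok(Some(new_args)) => Gate::Execute { tool: tool.to_string(), args: new_args },
        // Continue: execute with the original arguments.
        Ok(None) => Gate::Execute { tool: tool.to_string(), args: args.to_string() },
    }
}
```

Running the policy first means an approval hook never prompts a human about a call that would have been denied anyway.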
