Lifecycle Hooks

⚠️ This is a v2 design and is not yet implemented. See Architecture v2 for context.

The AgentHook system provides composable, priority-ordered hooks for cross-cutting concerns. Unlike plugins (which provide capabilities), hooks observe and optionally intercept lifecycle events.

The problem

The current framework has hook-like mechanisms scattered across layers — init() / shutdown() on AgentCore, pre_turn_context() / post_turn_observe() on MemoryPlugin, check_permission() on RuntimePolicy. There is no unified mechanism for code that needs to observe or intercept behavior at multiple lifecycle points.

Use cases that cannot be cleanly implemented today:

| Use Case | Required Hook Points |
|---|---|
| Audit logging | pre-turn, post-turn, pre-tool, post-tool, errors |
| Cost tracking | post-LLM-call (token counts), post-turn (aggregate) |
| Input guardrails | pre-turn (before LLM sees the message) |
| Output guardrails | post-LLM-call (before tool dispatch or response) |
| Tool argument rewriting | pre-tool (modify args before execution) |
| Human-in-the-loop | pre-tool (for dangerous operations) |
| Provider fallback | on-LLM-error (switch to backup provider) |

The AgentHook trait


#[async_trait]
pub trait AgentHook: Send + Sync {
    fn name(&self) -> &str;
    fn priority(&self) -> u32 { 100 }     // lower = runs first
    fn fail_open(&self) -> bool { true }  // true = errors are non-fatal

    // Signatures only: each async method below also has a default no-op body.

    // Session lifecycle
    async fn on_session_start(&self, ctx: &HookContext) -> HookResult<()>;
    async fn on_session_end(&self, ctx: &HookContext) -> HookResult<()>;

    // Turn lifecycle
    async fn pre_turn(&self, input: &TurnInput, ctx: &HookContext)
        -> HookResult<HookAction<TurnInput>>;
    async fn post_turn(&self, input: &TurnInput, output: &TurnOutput, ctx: &HookContext)
        -> HookResult<HookAction<TurnOutput>>;
    async fn on_turn_error(&self, input: &TurnInput, error: &AgentError, ctx: &HookContext)
        -> HookResult<HookAction<TurnOutput>>;

    // LLM call lifecycle
    async fn pre_llm_call(&self, messages: &[ChatMessage], tools: &[ToolMeta], ctx: &HookContext)
        -> HookResult<HookAction<LlmCallInput>>;
    async fn post_llm_call(&self, response: &LlmResponse, ctx: &HookContext)
        -> HookResult<HookAction<LlmResponse>>;
    async fn on_llm_error(&self, error: &ProviderError, attempt: usize, ctx: &HookContext)
        -> HookResult<HookAction<RetryDecision>>;

    // Tool lifecycle
    async fn pre_tool_call(&self, tool_name: &str, args: &Value, grant: &ExecutionGrant, ctx: &HookContext)
        -> HookResult<HookAction<Value>>;
    async fn post_tool_call(&self, tool_name: &str, args: &Value, result: &Result<DispatchResult, String>, elapsed: Duration, ctx: &HookContext)
        -> HookResult<HookAction<DispatchResult>>;

    // Context management
    async fn pre_compaction(&self, messages: &[ChatMessage], budget: &TokenBudget, ctx: &HookContext)
        -> HookResult<HookAction<CompactionHint>>;
    async fn post_compaction(&self, removed_count: usize, summary: &str, ctx: &HookContext)
        -> HookResult<()>;
}

All methods have default no-op implementations — hooks only override the points they care about.
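A minimal sketch of how default no-op bodies let a hook override a single point. It is a synchronous, std-only simplification: the `HookResult` alias, the `&str` session argument, and `SessionCounter` are assumptions made for illustration, not part of the design above.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Assumed alias; the design does not spell out the error type.
pub type HookResult<T> = Result<T, String>;

// Simplified, synchronous stand-in for AgentHook with two hook points.
pub trait AgentHook {
    fn name(&self) -> &str;
    fn priority(&self) -> u32 { 100 }
    fn fail_open(&self) -> bool { true }

    // Default no-op bodies: hooks that do not override these do nothing.
    fn on_session_start(&self, _session_id: &str) -> HookResult<()> { Ok(()) }
    fn on_session_end(&self, _session_id: &str) -> HookResult<()> { Ok(()) }
}

// Hypothetical hook that only cares about session starts.
pub struct SessionCounter {
    started: AtomicUsize,
}

impl AgentHook for SessionCounter {
    fn name(&self) -> &str { "session-counter" }

    fn on_session_start(&self, _session_id: &str) -> HookResult<()> {
        self.started.fetch_add(1, Ordering::SeqCst);
        Ok(())
    }
    // on_session_end, priority, and fail_open fall back to the defaults.
}
```

Only `name` is mandatory here, so adding a new hook point to the trait later would not break existing hooks.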

HookAction — what hooks can do


pub enum HookAction<T> {
    Continue,        // pass through to next hook
    Transform(T),    // replace value, continue chain
    Reject(String),  // abort operation with reason
    Replace(T),      // replace value, skip remaining hooks
}

Two categories of hooks:

  • Gate hooks: on_session_start and pre_tool_call, which can abort the operation
  • Observation hooks: everything else, which can transform data but not abort

Hook composition

Multiple hooks execute in priority order (lower number = earlier):


pre_turn hooks (priority order):
  [guardrail (p=10)] → [logging (p=50)] → [metrics (p=100)]
          │                  │                  │
     check input         log input         start timer
          ▼                  ▼                  ▼
                   AgentCore::run_turn()
          │                  │                  │
post_turn hooks (reverse priority):
  [metrics (p=100)] → [logging (p=50)] → [guardrail (p=10)]
          │                  │                  │
      stop timer         log output        check output
Short-circuit rules

  • Continue — pass to next hook
  • Transform(val) — next hook sees the transformed value
  • Reject(reason) — remaining hooks skipped; operation fails
  • Replace(val) — remaining hooks skipped; value used as-is
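The four rules above, together with fail_open, can be sketched as a small chain runner. This is a synchronous simplification under stated assumptions: `Step`, `ChainOutcome`, `run_chain`, and the sample step functions are hypothetical, the steps are assumed to be already sorted by priority, and treating an error from a fail-closed hook as a rejection is one possible reading of fail_open, not something the design confirms.

```rust
#[derive(Debug, PartialEq)]
pub enum HookAction<T> {
    Continue,
    Transform(T),
    Reject(String),
    Replace(T),
}

// Hypothetical result of running a whole chain.
#[derive(Debug, PartialEq)]
pub enum ChainOutcome<T> {
    Proceed(T),
    Rejected(String),
}

// Hypothetical hook step: a plain function plus its fail_open flag.
pub struct Step<T> {
    pub fail_open: bool,
    pub run: fn(&T) -> Result<HookAction<T>, String>,
}

pub fn run_chain<T>(initial: T, steps: &[Step<T>]) -> ChainOutcome<T> {
    let mut value = initial;
    for step in steps {
        match (step.run)(&value) {
            Ok(HookAction::Continue) => {}                 // pass to next hook
            Ok(HookAction::Transform(v)) => value = v,     // next hook sees v
            Ok(HookAction::Replace(v)) => return ChainOutcome::Proceed(v),
            Ok(HookAction::Reject(reason)) => return ChainOutcome::Rejected(reason),
            Err(_) if step.fail_open => {}                 // non-fatal: skip this hook
            Err(e) => return ChainOutcome::Rejected(e),    // fail-closed: abort
        }
    }
    ChainOutcome::Proceed(value)
}

// Sample steps used to exercise the runner.
pub fn upper(s: &String) -> Result<HookAction<String>, String> {
    Ok(HookAction::Transform(s.to_uppercase()))
}
pub fn reject_secret(s: &String) -> Result<HookAction<String>, String> {
    if s.contains("SECRET") {
        Ok(HookAction::Reject("blocked".into()))
    } else {
        Ok(HookAction::Continue)
    }
}
pub fn canned(_: &String) -> Result<HookAction<String>, String> {
    Ok(HookAction::Replace("cached".into()))
}
pub fn broken(_: &String) -> Result<HookAction<String>, String> {
    Err("hook crashed".into())
}
```

Because `Transform` feeds the next hook, running `upper` before `reject_secret` on the input "secret" is rejected, while putting `canned` first skips the guardrail entirely. That second case is why `Replace` is best reserved for hooks that run after any gate.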

HookContext and shared state


pub struct HookContext {
    pub session_id: String,
    pub turn_number: usize,
    pub step_number: usize,
    pub core_type: String,
    pub events: EventSender,
    pub state: HookState, // shared key-value state across hooks
}

pub struct HookState {
    inner: Arc<RwLock<HashMap<String, Value>>>,
}

Hooks share state via HookState — for example, a cost tracker accumulates token costs across LLM calls.
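A possible shape for the HookState accessors, sketched with std only. The `get`, `set`, and `add` methods are assumptions (the design shows only the `inner` field), and `f64` stands in for `Value` to keep the example dependency-free.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

// Cloning shares the same underlying map, so every hook sees the same state.
#[derive(Clone, Default)]
pub struct HookState {
    inner: Arc<RwLock<HashMap<String, f64>>>,
}

impl HookState {
    pub fn get(&self, key: &str) -> Option<f64> {
        self.inner.read().unwrap().get(key).copied()
    }

    pub fn set(&self, key: &str, value: f64) {
        self.inner.write().unwrap().insert(key.to_string(), value);
    }

    // Read-modify-write under a single write lock. A separate get() followed
    // by set() could lose an update if two hooks ran concurrently.
    pub fn add(&self, key: &str, delta: f64) -> f64 {
        let mut map = self.inner.write().unwrap();
        let slot = map.entry(key.to_string()).or_insert(0.0);
        *slot += delta;
        *slot
    }
}
```

An accumulator such as a cost tracker would likely prefer something like `add` over a get/set pair whenever LLM calls can overlap.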

Integration with AgentContext

The AgentContext gains a hooks: HookChain field:


pub struct AgentContext {
    pub provider: Box<dyn LlmProvider>,
    pub tools: Box<dyn ToolDispatcher>,
    pub memory: Box<dyn MemoryPlugin>,
    pub policy: Box<dyn RuntimePolicy>,
    pub hooks: HookChain, // NEW
    pub events: EventSender,
    pub config: FrameworkConfig,
}

Cores call ctx.hooks.run_pre_llm_call() etc. at appropriate points. Hooks are opt-in for cores — a minimal core can ignore hooks entirely.

Updated runtime flow


AgentRuntime::run_turn(message)
├─ hooks.pre_turn(input)                 ← NEW
├─ memory.pre_turn_context()             ← existing
├─ core.run_turn(input, ctx)             ← existing
│   ├─ hooks.pre_llm_call()              ← NEW (inside core)
│   ├─ provider.chat_completion()
│   ├─ hooks.post_llm_call()             ← NEW
│   ├─ hooks.pre_tool_call()             ← NEW (per tool)
│   ├─ tool.execute()
│   └─ hooks.post_tool_call()            ← NEW
├─ memory.post_turn_observe()            ← existing
└─ hooks.post_turn(input, output)        ← NEW

Example hooks

Audit logging


use async_trait::async_trait;

pub struct AuditLogHook { logger: slog::Logger }

#[async_trait] // required on impls of an #[async_trait] trait
impl AgentHook for AuditLogHook {
    fn name(&self) -> &str { "audit-log" }
    fn priority(&self) -> u32 { 50 }

    async fn pre_turn(&self, input: &TurnInput, ctx: &HookContext)
        -> HookResult<HookAction<TurnInput>>
    {
        // Record the start of the turn, then pass the input through unchanged.
        slog::info!(self.logger, "turn_start";
            "session" => &ctx.session_id,
            "turn" => ctx.turn_number,
        );
        Ok(HookAction::Continue)
    }
}

Input guardrail


use async_trait::async_trait;
use regex::Regex;

pub struct InputGuardrailHook { blocked_patterns: Vec<Regex> }

#[async_trait] // required on impls of an #[async_trait] trait
impl AgentHook for InputGuardrailHook {
    fn name(&self) -> &str { "input-guardrail" }
    fn priority(&self) -> u32 { 10 } // run first

    async fn pre_turn(&self, input: &TurnInput, _ctx: &HookContext)
        -> HookResult<HookAction<TurnInput>>
    {
        // Reject on the first blocked pattern that matches the message.
        for pattern in &self.blocked_patterns {
            if pattern.is_match(&input.message) {
                return Ok(HookAction::Reject("Message blocked by content policy".into()));
            }
        }
        Ok(HookAction::Continue)
    }
}

Cost tracking


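use async_trait::async_trait;
use serde_json::json;

pub struct CostTrackingHook {
    cost_per_input_token: f64,
    cost_per_output_token: f64,
}

#[async_trait] // required on impls of an #[async_trait] trait
impl AgentHook for CostTrackingHook {
    fn name(&self) -> &str { "cost-tracker" }

    async fn post_llm_call(&self, response: &LlmResponse, ctx: &HookContext)
        -> HookResult<HookAction<LlmResponse>>
    {
        // Cost of this call: tokens in each direction times the per-token rate.
        let cost = (response.usage.input_tokens as f64 * self.cost_per_input_token)
            + (response.usage.output_tokens as f64 * self.cost_per_output_token);
        // Add to the running total kept in shared HookState.
        // Note: get followed by set is not atomic across concurrent calls.
        let prev: f64 = ctx.state.get("total_cost").and_then(|v| v.as_f64()).unwrap_or(0.0);
        ctx.state.set("total_cost", json!(prev + cost));
        Ok(HookAction::Continue)
    }
}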

Human-in-the-loop approval


use async_trait::async_trait;

pub struct ApprovalHook {
    patterns: Vec<String>, // tools requiring approval
    approval_tx: mpsc::Sender<ApprovalRequest>,
    approval_rx: Arc<Mutex<mpsc::Receiver<ApprovalResponse>>>,
}

#[async_trait] // required on impls of an #[async_trait] trait
impl AgentHook for ApprovalHook {
    fn name(&self) -> &str { "human-approval" }
    fn priority(&self) -> u32 { 5 }        // run before other pre-tool hooks
    fn fail_open(&self) -> bool { false }  // approval failures must block

    async fn pre_tool_call(&self, tool_name: &str, args: &Value, ...) -> HookResult<HookAction<Value>> {
        if !self.requires_approval(tool_name) { return Ok(HookAction::Continue); }
        // Ask UI for approval, wait for response
        match self.request_approval(tool_name, args).await? {
            ApprovalResponse::Approved => Ok(HookAction::Continue),
            ApprovalResponse::Denied(reason) => Ok(HookAction::Reject(reason)),
            ApprovalResponse::Modified(new_args) => Ok(HookAction::Transform(new_args)),
        }
    }
}
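The example calls `requires_approval` without defining it. One way it might work, shown here as a free function over the `patterns` list, is exact matching plus a trailing `*` for prefixes. The wildcard syntax is purely an assumption for illustration; the design does not say how patterns are interpreted.

```rust
// Hypothetical matcher: a pattern either names a tool exactly or ends in '*'
// to match every tool name that starts with the text before the '*'.
pub fn requires_approval(patterns: &[String], tool_name: &str) -> bool {
    patterns.iter().any(|p| match p.strip_suffix('*') {
        Some(prefix) => tool_name.starts_with(prefix),
        None => p.as_str() == tool_name,
    })
}
```

With the patterns `shell` and `fs_*`, the tools `shell` and `fs_delete` would need approval, while `shell_safe` and `web_search` would not.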

Relationship to existing mechanisms

AgentEvent vs AgentHook

| Aspect | AgentEvent | AgentHook |
|---|---|---|
| Direction | Outward (agent to observers) | Inward (interceptor to agent) |
| Can modify? | No (read-only) | Yes (transform, reject, replace) |
| Performance | Zero-cost if no subscribers | Async call overhead per hook |
| Use case | UI updates, logging | Guardrails, approval, arg rewriting |

RuntimePolicy vs pre_tool_call hook

Policy remains the first gate (sandbox mode, limits, allow/deny). Hooks run after policy:


tool requested → Policy: where/how → Hooks: whether/what → execution
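The ordering above can be sketched as a dispatch function in which the policy check runs first and the pre-tool hook is consulted only if policy allows the call. All names here (`ToolOutcome`, `dispatch_tool`, `example_policy`, `example_hook`) are hypothetical, and the hook is reduced to a function returning optional rewritten arguments or a rejection reason.

```rust
#[derive(Debug, PartialEq)]
pub enum ToolOutcome {
    DeniedByPolicy(String),
    RejectedByHook(String),
    Executed { args: String },
}

pub fn dispatch_tool(
    tool: &str,
    args: &str,
    policy_allows: impl Fn(&str) -> bool,
    pre_tool_hook: impl Fn(&str, &str) -> Result<Option<String>, String>,
) -> ToolOutcome {
    // 1. Policy is the first gate; hooks never see a call that policy denies.
    if !policy_allows(tool) {
        return ToolOutcome::DeniedByPolicy(format!("policy denies {}", tool));
    }
    // 2. Hooks decide whether the call proceeds and with which arguments.
    match pre_tool_hook(tool, args) {
        Err(reason) => ToolOutcome::RejectedByHook(reason),
        Ok(Some(new_args)) => ToolOutcome::Executed { args: new_args },
        Ok(None) => ToolOutcome::Executed { args: args.to_string() },
    }
}

// Sample policy: everything except "rm" is allowed.
pub fn example_policy(tool: &str) -> bool {
    tool != "rm"
}

// Sample hook: rejects forced operations and rewrites args for "ls".
pub fn example_hook(tool: &str, args: &str) -> Result<Option<String>, String> {
    if args.contains("--force") {
        Err("force flag needs review".to_string())
    } else if tool == "ls" {
        Ok(Some(format!("{} --color=never", args)))
    } else {
        Ok(None)
    }
}
```

Keeping policy ahead of hooks means a misbehaving or fail-open hook cannot widen what the sandbox permits.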