budget

Context budget computation — tracks token allocation across the LLM context window to prevent overflow and optimize utilization.

Token budget allocation for an LLM context window.

Breaks the total context window into reserved zones and computes the remaining space available for conversation history.

┌────────────────────────────────────────────────────┐
│               Context Window (total)               │
├────────────┬────────────┬────────┬─────────────────┤
│ System     │ Tools      │ Output │ Available       │
│ (fixed)    │ (schemas)  │ (gen)  │ for History     │
└────────────┴────────────┴────────┴─────────────────┘

Fields

Field                  Type  Description
total                  u32   Total context window in tokens (from ModelCapabilities).
reserved_output        u32   Tokens reserved for model output generation.
reserved_system        u32   Tokens consumed by the system prompt.
reserved_tools         u32   Tokens consumed by tool schema definitions.
available_for_history  u32   Remaining tokens available for conversation history + memories.
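
The relationship between these fields can be sketched as a simple subtraction. The following is a minimal illustration, not the module's actual code: the type name `BudgetSketch` and the constructor `from_counts` are hypothetical, and clamping to zero when the reserved zones exceed the window is an assumption.

```rust
/// Hypothetical stand-in for the documented budget type.
/// Field names follow the table above; everything else is assumed.
#[derive(Debug, Clone, PartialEq)]
pub struct BudgetSketch {
    pub total: u32,
    pub reserved_output: u32,
    pub reserved_system: u32,
    pub reserved_tools: u32,
    pub available_for_history: u32,
}

impl BudgetSketch {
    /// Build a budget from token counts that have already been estimated.
    pub fn from_counts(
        total: u32,
        reserved_output: u32,
        reserved_system: u32,
        reserved_tools: u32,
    ) -> Self {
        // Sum in u64 so the addition itself cannot overflow.
        let reserved =
            reserved_output as u64 + reserved_system as u64 + reserved_tools as u64;
        // Assumption: clamp to zero instead of underflowing when the
        // reserved zones are larger than the whole window.
        let available = (total as u64).saturating_sub(reserved) as u32;
        Self {
            total,
            reserved_output,
            reserved_system,
            reserved_tools,
            available_for_history: available,
        }
    }
}
```

With a window of 8192 tokens and reservations of 1024, 500, and 700 tokens, this sketch leaves 5968 tokens for history.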

Methods

fn new(context_window: u32, max_output_tokens: u32, system_message: Option<&str>, tools_json: &[serde_json::Value]) -> Self

Create a new budget from model capabilities and current context.

  • context_window: Total tokens the model can accept
  • max_output_tokens: Tokens to reserve for generation
  • system_message: The system prompt text (will be estimated)
  • tools_json: Tool schema definitions (will be estimated)
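
How the system prompt and tool schemas are estimated is not specified here. As a rough sketch only, the reservation step might look like the code below, assuming a heuristic of about four bytes of text per token. The helpers `estimate_tokens` and `available_for_history` are hypothetical, and tool schemas are passed as pre-serialized strings rather than `serde_json::Value` to keep the example free of dependencies.

```rust
/// Assumed heuristic (not the documented estimator): about four bytes
/// of text per token, rounded up.
fn estimate_tokens(text: &str) -> u32 {
    let tokens = (text.len() as u64 + 3) / 4;
    u32::try_from(tokens).unwrap_or(u32::MAX)
}

/// Hypothetical helper mirroring the arguments of `new`, with tool
/// schemas supplied as serialized JSON strings.
fn available_for_history(
    context_window: u32,
    max_output_tokens: u32,
    system_message: Option<&str>,
    tools_json: &[&str],
) -> u32 {
    // A missing system prompt reserves nothing.
    let reserved_system = system_message.map_or(0, estimate_tokens);
    // Each tool schema is estimated separately and the results are summed.
    let reserved_tools = tools_json
        .iter()
        .fold(0u32, |acc, schema| acc.saturating_add(estimate_tokens(schema)));
    // Whatever is left after the three reserved zones goes to history.
    context_window
        .saturating_sub(max_output_tokens)
        .saturating_sub(reserved_system)
        .saturating_sub(reserved_tools)
}
```

A real tokenizer would give different counts; the point of the sketch is only that each reserved zone is estimated once, up front, and subtracted from the window.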

fn history_usage(&self, history_messages: &[serde_json::Value]) -> u32

Compute how many tokens are actually used by a set of history messages.

fn would_exceed(&self, history_messages: &[serde_json::Value]) -> bool

Check whether adding the given messages would exceed the history budget.

fn utilization(&self, history_messages: &[serde_json::Value]) -> f32

Utilization ratio (0.0 to 1.0) of the full context window.

fn remaining(&self, history_messages: &[serde_json::Value]) -> u32

Remaining tokens available after current history usage.
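
The three query methods above can be illustrated with a small sketch that takes an already-estimated history token count instead of the message slice. The type `HistoryBudget` is hypothetical. The strict `>` comparison, the clamping to zero and to 1.0, and counting the reserved zones toward utilization are all assumptions about behavior this page does not spell out.

```rust
/// Hypothetical, reduced view of the budget: only the two fields the
/// query methods need.
pub struct HistoryBudget {
    pub total: u32,
    pub available_for_history: u32,
}

impl HistoryBudget {
    /// Assumption: exactly filling the history zone does not count as exceeding it.
    pub fn would_exceed(&self, history_tokens: u32) -> bool {
        history_tokens > self.available_for_history
    }

    /// Tokens left in the history zone, clamped to zero when over budget.
    pub fn remaining(&self, history_tokens: u32) -> u32 {
        self.available_for_history.saturating_sub(history_tokens)
    }

    /// Assumption: utilization counts the reserved zones plus history,
    /// divided by the full window, and is capped at 1.0.
    pub fn utilization(&self, history_tokens: u32) -> f32 {
        if self.total == 0 {
            return 0.0;
        }
        let reserved = self.total.saturating_sub(self.available_for_history);
        let used = reserved as u64 + history_tokens as u64;
        (used as f32 / self.total as f32).min(1.0)
    }
}
```

A caller could use `would_exceed` as the trigger for trimming or summarizing old messages, and `remaining` to decide how many more messages fit before that point.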