# Context Management API
PatchPal's context management system handles token estimation, context window limits, and automatic compaction.
## TokenEstimator

### `patchpal.context.TokenEstimator(model_id)`

Estimate tokens in messages for context management.

Source code in `patchpal/context.py`
### `estimate_tokens(text)`

Estimate tokens in text.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `text` | `str` | Text to estimate tokens for | *required* |

Returns:

| Type | Description |
|---|---|
| `int` | Estimated token count |

Source code in `patchpal/context.py`
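For example, a minimal sketch of estimating tokens for a raw string (the model identifier is illustrative; any LiteLLM model ID should work):

```python
from patchpal.context import TokenEstimator

estimator = TokenEstimator("gpt-4")  # illustrative model ID
count = estimator.estimate_tokens("Refactor the parser to handle nested blocks.")
print(f"Estimated tokens: {count}")
```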
### `estimate_message_tokens(message)`

Estimate tokens in a single message.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `message` | `Dict[str, Any]` | Message dict with role, content, tool_calls, etc. | *required* |

Returns:

| Type | Description |
|---|---|
| `int` | Estimated token count |

Source code in `patchpal/context.py`
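A sketch of estimating a single message; the dict follows the role/content shape described above, and the model ID is again illustrative:

```python
from patchpal.context import TokenEstimator

estimator = TokenEstimator("gpt-4")  # illustrative model ID
message = {"role": "user", "content": "List the failing tests in test_pager.py."}
print(estimator.estimate_message_tokens(message))
```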
### `estimate_messages_tokens(messages)`

Estimate tokens in a list of messages.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `messages` | `List[Dict[str, Any]]` | List of message dicts | *required* |

Returns:

| Type | Description |
|---|---|
| `int` | Total estimated token count |

Source code in `patchpal/context.py`
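And a sketch for a whole history; the messages mirror a typical chat transcript:

```python
from patchpal.context import TokenEstimator

estimator = TokenEstimator("gpt-4")  # illustrative model ID
history = [
    {"role": "system", "content": "You are a code-patching assistant."},
    {"role": "user", "content": "Fix the off-by-one error in pager.py."},
    {"role": "assistant", "content": "Looking at pager.py now."},
]
print(estimator.estimate_messages_tokens(history))
```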
## ContextManager

### `patchpal.context.ContextManager(model_id, system_prompt)`

Manage context window with auto-compaction and pruning.

Initialize context manager.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `model_id` | `str` | LiteLLM model identifier | *required* |
| `system_prompt` | `str` | System prompt text | *required* |

Source code in `patchpal/context.py`
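A sketch of constructing a standalone manager; the model ID string and prompt text are assumptions, and in normal use the agent builds this for you (see the Usage Example below):

```python
from patchpal.context import ContextManager

manager = ContextManager(
    model_id="anthropic/claude-3-5-sonnet-20241022",  # illustrative LiteLLM ID
    system_prompt="You are PatchPal, an automated code-patching agent.",
)
```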
### `needs_compaction(messages)`

Check if context window needs compaction.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `messages` | `List[Dict[str, Any]]` | Current message history | *required* |

Returns:

| Type | Description |
|---|---|
| `bool` | True if compaction is needed |

Source code in `patchpal/context.py`
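For instance, a pre-flight check before sending a request (a sketch; the model ID and messages are illustrative, and compaction normally triggers automatically):

```python
from patchpal.context import ContextManager

manager = ContextManager(
    model_id="anthropic/claude-3-5-sonnet-20241022",  # illustrative LiteLLM ID
    system_prompt="You are PatchPal, an automated code-patching agent.",
)
history = [{"role": "user", "content": "Fix the off-by-one error in pager.py."}]

# Useful for logging or warning the user before the window fills up
if manager.needs_compaction(history):
    print("Context window is near the compaction threshold.")
```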
### `get_usage_stats(messages)`

Get current context usage statistics.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `messages` | `List[Dict[str, Any]]` | Current message history | *required* |

Returns:

| Type | Description |
|---|---|
| `Dict[str, Any]` | Dict with usage statistics |

Source code in `patchpal/context.py`
## Usage Example

```python
from patchpal.agent import create_agent

agent = create_agent()

# Check context usage
stats = agent.context_manager.get_usage_stats(agent.messages)
print(f"Token usage: {stats['total_tokens']:,} / {stats['context_limit']:,}")
print(f"Usage: {stats['usage_percent']}%")
print(f"Output budget remaining: {stats['output_budget_remaining']:,} tokens")

# Check if compaction is needed
if agent.context_manager.needs_compaction(agent.messages):
    print("Context window getting full - compaction will trigger soon")
    # Manually trigger compaction (usually automatic)
    agent._perform_auto_compaction()
```
## How Context Management Works

- **Token Estimation**: Uses tiktoken (or a character-based fallback estimate) to estimate message tokens
- **Context Limits**: Tracks model-specific context window sizes (e.g., 200K for Claude Sonnet)
- **Automatic Compaction**: When the context reaches 70% full, summarizes old messages to free space (see the sketch after this list)
- **Output Budget**: Reserves tokens for model output based on the context window size
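The sketch below illustrates two of the mechanics above: a tiktoken-based estimate with a character-count fallback, and the 70% compaction threshold. The helper names and the 4-characters-per-token fallback ratio are assumptions for illustration, not PatchPal internals:

```python
COMPACTION_THRESHOLD = 0.70  # compaction triggers at 70% usage (per the docs)

def rough_token_estimate(text: str) -> int:
    """Estimate tokens with tiktoken, falling back to a character heuristic."""
    try:
        import tiktoken
        return len(tiktoken.get_encoding("cl100k_base").encode(text))
    except ImportError:
        # Assumed fallback ratio: roughly 4 characters per token
        return max(1, len(text) // 4)

def should_compact(used_tokens: int, context_limit: int) -> bool:
    """Return True once estimated usage crosses the compaction threshold."""
    return used_tokens >= context_limit * COMPACTION_THRESHOLD

# 150K tokens in a 200K window is 75% full, so compaction would trigger
print(should_compact(150_000, 200_000))  # True
```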
## Context Limits by Model Family

The context manager automatically detects limits for common models:

- Claude 3.5 Sonnet: 200,000 tokens
- Claude 3 Opus: 200,000 tokens
- GPT-4 Turbo: 128,000 tokens
- GPT-4: 8,192 tokens
- GPT-3.5: 16,385 tokens

For unknown models, the manager falls back to 128,000 tokens.
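A hypothetical lookup mirroring the table above; this is a sketch of the idea, not PatchPal's actual detection code:

```python
# More specific families come first so substring matching resolves correctly
# (e.g., "gpt-4-turbo" must match before "gpt-4")
MODEL_CONTEXT_LIMITS = {
    "claude-3-5-sonnet": 200_000,
    "claude-3-opus": 200_000,
    "gpt-4-turbo": 128_000,
    "gpt-4": 8_192,
    "gpt-3.5": 16_385,
}
DEFAULT_CONTEXT_LIMIT = 128_000  # fallback for unknown models

def context_limit_for(model_id: str) -> int:
    for family, limit in MODEL_CONTEXT_LIMITS.items():
        if family in model_id:
            return limit
    return DEFAULT_CONTEXT_LIMIT

print(context_limit_for("anthropic/claude-3-5-sonnet-20241022"))  # 200000
```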
## Related

- Context Management Guide - Overview of context management
- Agent API - Using the agent with automatic context management