Anthropic
Use Anthropic's Claude models with streaming, tool use, vision, and large context windows.
Install
pip install synapsekit[anthropic]
Usage
from synapsekit.llms import AnthropicLLM, LLMConfig
llm = AnthropicLLM(LLMConfig(
    model="claude-sonnet-4-6",
    api_key="sk-ant-...",
    temperature=0.7,
    max_tokens=1024,
))
# Streaming
async for token in llm.stream("Explain RAG in simple terms."):
    print(token, end="", flush=True)
# Full response
response = await llm.generate("Explain RAG in simple terms.")
print(response)
Available models
| Model | Context | Input (per 1M tokens) | Output (per 1M tokens) | Notes |
|---|---|---|---|---|
| claude-opus-4-6 | 200K | $15.00 | $75.00 | Most capable |
| claude-sonnet-4-6 | 200K | $3.00 | $15.00 | Best balance |
| claude-haiku-4-5-20251001 | 200K | $0.25 | $1.25 | Fastest, cheapest |
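To make the per-token pricing concrete, here is a minimal sketch that estimates the cost of a single request. The prices are copied from the table above; the `estimate_cost_usd` helper and the `PRICES_PER_MILLION` mapping are illustrative names and are not part of SynapseKit.

```python
# Prices in USD per 1M tokens (input, output), copied from the table above.
PRICES_PER_MILLION = {
    "claude-opus-4-6": (15.00, 75.00),
    "claude-sonnet-4-6": (3.00, 15.00),
    "claude-haiku-4-5-20251001": (0.25, 1.25),
}


def estimate_cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request from its token counts (hypothetical helper)."""
    input_price, output_price = PRICES_PER_MILLION[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# 10,000 input tokens and 1,000 output tokens on Sonnet.
cost = estimate_cost_usd("claude-sonnet-4-6", input_tokens=10_000, output_tokens=1_000)
print(f"${cost:.4f}")  # prints $0.0450
```

Output tokens cost five times as much as input tokens for every model in the table, so a long generated answer can outweigh a much longer prompt.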
Note: max_tokens is required for Anthropic models. The API rejects requests that omit it.
Function calling (tool use)
Anthropic uses a tool_use flow. SynapseKit handles the multi-step protocol automatically:
- Send user message + tool schemas
- Receive tool_use block from Claude
- Execute the tool and collect results
- Send tool_result back in the next message
- Receive final text response
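The steps above can be sketched as a small loop. This is not SynapseKit's implementation: `fake_model` is a stand-in that imitates Claude (one tool request, then a text answer), and `run_tool_loop` is a hypothetical helper. The message shapes follow the Anthropic convention of returning a `tool_result` block, tied to the request by `tool_use_id`, inside the next user message.

```python
import json


def fake_model(messages, tools):
    """Stand-in for Claude: requests a tool once, then answers in text."""
    last = messages[-1]
    if last["role"] == "user" and isinstance(last["content"], str):
        # Step 2: the model replies with a tool_use block.
        return {"type": "tool_use", "id": "toolu_demo", "name": "add",
                "input": {"a": 2, "b": 3}}
    # Step 5: the last message carries a tool_result, so answer in text.
    result = last["content"][0]["content"]
    return {"type": "text", "text": f"The sum is {result}."}


def run_tool_loop(prompt, tools, model=fake_model, max_steps=5):
    """Drive the tool_use / tool_result exchange until a text reply arrives."""
    messages = [{"role": "user", "content": prompt}]  # Step 1
    for _ in range(max_steps):
        reply = model(messages, tools)
        if reply["type"] == "text":
            return reply["text"]
        # Step 3: execute the requested tool locally.
        output = tools[reply["name"]](**reply["input"])
        # Step 4: echo the tool_use block, then send the tool_result back.
        messages.append({"role": "assistant", "content": [reply]})
        messages.append({"role": "user", "content": [
            {"type": "tool_result", "tool_use_id": reply["id"],
             "content": json.dumps(output)},
        ]})
    raise RuntimeError("tool loop did not finish within max_steps")


answer = run_tool_loop("What is 2 + 3?", tools={"add": lambda a, b: a + b})
print(answer)  # prints: The sum is 5.
```

The `max_steps` cap is a design choice for the sketch: it keeps a model that repeatedly asks for tools from looping forever.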
from synapsekit import tool, FunctionCallingAgent
from synapsekit.llms import AnthropicLLM, LLMConfig
@tool
def get_stock_price(ticker: str) -> dict:
    """Get current stock price for a ticker symbol."""
    prices = {"AAPL": 185.20, "GOOG": 142.50, "MSFT": 415.30}
    return {"ticker": ticker, "price": prices.get(ticker, 0), "currency": "USD"}
@tool
def calculate_portfolio_value(holdings: dict) -> float:
    """Calculate total portfolio value given ticker to shares mapping."""
    # Placeholder pricing: every share is valued at a flat $100.
    return sum(shares * 100 for shares in holdings.values())
llm = AnthropicLLM(LLMConfig(
    model="claude-sonnet-4-6",
    api_key="sk-ant-...",
    max_tokens=2048,
))
agent = FunctionCallingAgent(llm=llm, tools=[get_stock_price, calculate_portfolio_value])
answer = await agent.run("What's the current price of AAPL and MSFT?")
print(answer)
Raw call_with_tools
tools = [
    {
        "type": "function",
        "function": {
            "name": "run_sql",
            "description": "Run a SQL SELECT query",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {"type": "string"},
                    "database": {"type": "string", "default": "main"},
                },
                "required": ["query"],
            },
        },
    }
]
result = await llm.call_with_tools(
    messages=[{"role": "user", "content": "How many users signed up last week?"}],
    tools=tools,
)
# {"content": None, "tool_calls": [{"id": "toolu_01...", "name": "run_sql", "arguments": {...}}]}
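With the raw API you execute the requested tools yourself. The sketch below dispatches each entry in `tool_calls` to a local function, assuming the result has the shape shown in the comment above. `TOOL_REGISTRY`, `dispatch_tool_calls`, the canned `run_sql` body, and the `toolu_demo` id are all illustrative and not part of SynapseKit.

```python
def run_sql(query: str, database: str = "main") -> list:
    """Hypothetical local tool; returns canned rows instead of querying a database."""
    return [{"count": 42}]


# Maps the tool name from the schema to the Python callable that implements it.
TOOL_REGISTRY = {"run_sql": run_sql}


def dispatch_tool_calls(result: dict) -> list:
    """Run every requested tool and pair each output with its call id."""
    outputs = []
    for call in result.get("tool_calls") or []:
        func = TOOL_REGISTRY[call["name"]]
        outputs.append({
            "tool_call_id": call["id"],
            "output": func(**call["arguments"]),
        })
    return outputs


# A result shaped like the comment above, with an illustrative id.
sample_result = {
    "content": None,
    "tool_calls": [
        {"id": "toolu_demo", "name": "run_sql",
         "arguments": {"query": "SELECT COUNT(*) FROM users"}},
    ],
}
outputs = dispatch_tool_calls(sample_result)
print(outputs)
```

Keeping the call id next to each output matters because the follow-up message has to say which tool request every result answers.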
Vision
Claude models support image inputs via MultimodalMessage:
from synapsekit.multimodal import MultimodalMessage, ImageContent
# From URL
message = MultimodalMessage(
    role="user",
    content=[
        ImageContent.from_url("https://example.com/diagram.png"),
        "Explain what this architecture diagram shows.",
    ],
)
response = await llm.generate(message)
# From file bytes
with open("screenshot.png", "rb") as f:
    image_bytes = f.read()
message = MultimodalMessage(
    role="user",
    content=[
        ImageContent.from_bytes(image_bytes, media_type="image/png"),
        "Describe this UI and identify any accessibility issues.",
    ],
)
response = await llm.generate(message)
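For reference, the Anthropic Messages API accepts inline images as base64-encoded content blocks. The sketch below builds such a block from raw bytes. Whether `ImageContent.from_bytes` produces exactly this structure internally is an assumption, and `to_anthropic_image_block` is a hypothetical helper name.

```python
import base64


def to_anthropic_image_block(image_bytes: bytes, media_type: str = "image/png") -> dict:
    """Wrap raw image bytes in a base64 image content block (hypothetical helper)."""
    return {
        "type": "image",
        "source": {
            "type": "base64",
            "media_type": media_type,
            # The API expects text, so encode the bytes and decode to str.
            "data": base64.b64encode(image_bytes).decode("ascii"),
        },
    }


# The 8-byte PNG signature stands in for a real image file here.
block = to_anthropic_image_block(b"\x89PNG\r\n\x1a\n")
print(block["source"]["media_type"], len(block["source"]["data"]))
```

Base64 grows the payload by roughly a third, so large screenshots are worth resizing before they are sent.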
Large context (200K tokens)
Claude's 200K context window lets you load entire codebases or documents:
import os
# Load all Python files in a project
code_files = []
for root, _, files in os.walk("./myproject"):
    for f in files:
        if f.endswith(".py"):
            with open(os.path.join(root, f)) as fh:
                code_files.append(f"# {f}\n{fh.read()}")
full_codebase = "\n\n".join(code_files)
llm = AnthropicLLM(LLMConfig(
    model="claude-opus-4-6",
    api_key="sk-ant-...",
    max_tokens=4096,
))
response = await llm.generate(
    f"Here is the full codebase:\n\n{full_codebase}\n\nIdentify any security vulnerabilities."
)
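Before sending a large prompt, it can help to check roughly whether it fits. The sketch below uses a crude heuristic of about 4 characters per token, which is an assumption and not Claude's actual tokenizer, so treat the result as a ballpark. `fits_in_context` is a hypothetical helper.

```python
CONTEXT_WINDOW = 200_000  # tokens, per the models table
CHARS_PER_TOKEN = 4       # rough heuristic, not Claude's real tokenizer


def fits_in_context(text: str, max_output_tokens: int,
                    context_window: int = CONTEXT_WINDOW) -> tuple:
    """Return (fits, estimated_input_tokens) for a prompt plus its output budget."""
    estimated_input = len(text) // CHARS_PER_TOKEN + 1
    return estimated_input + max_output_tokens <= context_window, estimated_input


# A 400,000-character prompt is estimated at about 100K tokens.
ok, estimated = fits_in_context("x" * 400_000, max_output_tokens=4096)
print(ok, estimated)  # prints: True 100001
```

The output budget is counted against the window as well, which is why `max_output_tokens` is part of the check.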
LLMConfig options
| Parameter | Type | Default | Description |
|---|---|---|---|
| model | str | required | Claude model name |
| api_key | str | required | Your Anthropic API key |
| max_tokens | int | required | Max output tokens |
| temperature | float | 1.0 | Sampling temperature (0-1) |
| seed | int | None | For reproducible outputs |
| max_retries | int | 3 | Auto-retry on transient errors |
| requests_per_minute | int | None | Rate throttle |
| cache_backend | str | None | "sqlite" or "lru" |
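To illustrate what a setting like requests_per_minute implies, here is a minimal sketch of a throttle that spaces calls evenly. How SynapseKit actually implements its throttle is not documented here, so the `MinuteThrottle` class and its even-spacing strategy are assumptions. The clock and sleep functions are injectable so the demo runs without real waiting.

```python
import time


class MinuteThrottle:
    """Spaces calls so that at most `requests_per_minute` start per minute."""

    def __init__(self, requests_per_minute, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 60.0 / requests_per_minute
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # earliest time the next call may start

    def wait(self) -> float:
        """Block until the next call is allowed; return the delay applied."""
        now = self._clock()
        delay = 0.0
        if self._next_allowed is not None and now < self._next_allowed:
            delay = self._next_allowed - now
            self._sleep(delay)
            now = self._next_allowed
        self._next_allowed = now + self.min_interval
        return delay


# Demo with a frozen clock and a recording sleep, so nothing actually blocks.
slept = []
throttle = MinuteThrottle(30, clock=lambda: 0.0, sleep=slept.append)
first = throttle.wait()   # first call goes through immediately
second = throttle.wait()  # second call waits one interval (60 / 30 = 2 seconds)
print(first, second)      # prints: 0.0 2.0
```

Even spacing is the simplest policy. A token bucket would allow short bursts at the cost of more state.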
Cost tracking
from synapsekit.observability import CostTracker
tracker = CostTracker()
llm = AnthropicLLM(LLMConfig(
    model="claude-sonnet-4-6",
    api_key="sk-ant-...",
    max_tokens=2048,
))
llm.attach_tracker(tracker)
await llm.generate("Summarize the French Revolution in 3 bullet points.")
print(f"Cost: ${tracker.total_cost_usd:.6f}")
Error handling
from synapsekit.exceptions import LLMError, RateLimitError, AuthenticationError
try:
    response = await llm.generate("Hello")
except AuthenticationError:
    print("Invalid API key -- check sk-ant-...")
except RateLimitError as e:
    print(f"Rate limited. Retry after {e.retry_after}s")
except LLMError as e:
    print(f"Anthropic error: {e}")
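If you want to retry after a rate limit instead of only reporting it, a loop that honors retry_after is one option. The sketch below is self-contained: it defines a local stand-in `RateLimitError` with the same `retry_after` attribute used above, and `generate_with_retry` and `flaky_call` are hypothetical names. In real code you would catch `synapsekit.exceptions.RateLimitError` and pass a function that calls `llm.generate`.

```python
import asyncio


class RateLimitError(Exception):
    """Local stand-in exposing retry_after, like the exception handled above."""

    def __init__(self, retry_after: float):
        super().__init__(f"rate limited, retry after {retry_after}s")
        self.retry_after = retry_after


async def generate_with_retry(call, max_attempts: int = 3, sleep=asyncio.sleep):
    """Await call(); on RateLimitError, wait retry_after seconds and try again."""
    for attempt in range(1, max_attempts + 1):
        try:
            return await call()
        except RateLimitError as exc:
            if attempt == max_attempts:
                raise  # out of attempts, surface the error to the caller
            await sleep(exc.retry_after)


attempts = []


async def flaky_call():
    """Fails twice with a rate limit, then succeeds (demo only)."""
    attempts.append(1)
    if len(attempts) < 3:
        raise RateLimitError(retry_after=0)
    return "ok"


result = asyncio.run(generate_with_retry(flaky_call))
print(result, len(attempts))  # prints: ok 3
```

Since LLMConfig already exposes max_retries for transient errors, a manual loop like this is mainly useful when you need custom behavior such as logging or a fallback to another model.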
Using the RAG facade
from synapsekit import RAG
rag = RAG(
    model="claude-sonnet-4-6",
    api_key="sk-ant-...",
    provider="anthropic",
)
rag.add("SynapseKit documentation goes here.")
answer = rag.ask_sync("What is SynapseKit?")
Tip: Set ANTHROPIC_API_KEY in your environment to avoid passing api_key in code.
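If you prefer to resolve the key explicitly in your own code, a small helper can read the variable and fail with a clear message when it is missing. `resolve_api_key` is a hypothetical helper, not a SynapseKit function, and the `env` parameter exists only so the lookup can be demonstrated without touching your real environment.

```python
import os


def resolve_api_key(explicit=None, env=os.environ) -> str:
    """Prefer an explicitly passed key, then fall back to ANTHROPIC_API_KEY."""
    key = explicit or env.get("ANTHROPIC_API_KEY")
    if not key:
        raise RuntimeError("Set ANTHROPIC_API_KEY or pass api_key explicitly.")
    return key


# Demo with a fake environment mapping and a placeholder key.
key = resolve_api_key(env={"ANTHROPIC_API_KEY": "sk-ant-example"})
print(key)  # prints: sk-ant-example
```

Failing early here gives a clearer error than an authentication failure on the first request.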