OpenAI

Use OpenAI's GPT models with streaming, function calling, vision, and structured output.

Install

pip install synapsekit[openai]

Usage

from synapsekit.llms import OpenAILLM, LLMConfig

llm = OpenAILLM(LLMConfig(
    model="gpt-4o-mini",
    api_key="sk-...",
    temperature=0.7,
))

# Streaming
async for token in llm.stream("Tell me about async Python."):
    print(token, end="", flush=True)

# Full response
response = await llm.generate("Tell me about async Python.")
print(response)

Available models

Model          Context  Input (per 1M tokens)  Output (per 1M tokens)  Notes
gpt-4o         128K     $2.50                  $10.00                  Best quality, multimodal
gpt-4o-mini    128K     $0.15                  $0.60                   Fast and cheap
gpt-4-turbo    128K     $10.00                 $30.00                  Legacy high-quality
gpt-3.5-turbo  16K      $0.50                  $1.50                   Legacy, cheapest
o1             200K     $15.00                 $60.00                  Reasoning, no streaming
o1-mini        128K     $3.00                  $12.00                  Reasoning, fast
o3-mini        200K     $1.10                  $4.40                   Latest reasoning

Any model supported by the OpenAI API works.
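
For example, a dated snapshot from OpenAI's model list works the same way:

llm = OpenAILLM(LLMConfig(
    model="gpt-4o-2024-08-06",  # any model name the OpenAI API accepts
    api_key="sk-...",
))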

Function calling

Use the @tool decorator or pass raw JSON Schema. SynapseKit auto-generates schemas from Python type hints.

from synapsekit import tool, FunctionCallingAgent
from synapsekit.llms import OpenAILLM, LLMConfig

@tool
def get_weather(city: str, unit: str = "celsius") -> str:
    """Get current weather for a city."""
    return f"Weather in {city}: 22 degrees {unit}, sunny"

@tool
def calculate(expression: str) -> float:
    """Evaluate a math expression."""
    # eval() is used here for brevity only -- never eval untrusted input.
    return eval(expression)

llm = OpenAILLM(LLMConfig(model="gpt-4o-mini", api_key="sk-..."))
agent = FunctionCallingAgent(llm=llm, tools=[get_weather, calculate])

answer = await agent.run("What's the weather in Paris? Also, what's 144 / 12?")
print(answer)

Raw tool schema

tools = [
    {
        "type": "function",
        "function": {
            "name": "search_web",
            "description": "Search the web for information",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {"type": "string", "description": "Search query"},
                    "num_results": {"type": "integer", "default": 5},
                },
                "required": ["query"],
            },
        },
    }
]

result = await llm.call_with_tools(
    messages=[{"role": "user", "content": "What happened in AI this week?"}],
    tools=tools,
)
# {"content": None, "tool_calls": [{"id": "call_abc123", "name": "search_web", "arguments": {"query": "AI news this week"}}]}

Vision

Pass images alongside text using ImageContent:

from synapsekit.multimodal import ImageContent

message = {
    "role": "user",
    "content": [
        ImageContent.from_url("https://example.com/chart.png"),
        {"type": "text", "text": "Describe this chart."},
    ],
}

response = await llm.generate(message)

# From a local file
with open("screenshot.png", "rb") as f:
    image = ImageContent.from_bytes(f.read(), media_type="image/png")

response = await llm.generate([image, "What's in this image?"])

LLMConfig options

Parameter            Type   Default   Description
model                str    required  OpenAI model name
api_key              str    required  Your OpenAI API key
temperature          float  1.0       Sampling temperature (0-2)
max_tokens           int    None      Maximum output tokens
seed                 int    None      For deterministic outputs
max_retries          int    3         Auto-retry on transient errors
requests_per_minute  int    None      Rate limit (RPM) throttle
cache_backend        str    None      "sqlite" or "lru"
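
These options compose. For instance, a deterministic, throttled client with extra retries (all parameters from the table above):

llm = OpenAILLM(LLMConfig(
    model="gpt-4o-mini",
    api_key="sk-...",
    temperature=0.0,         # greedy-ish sampling
    seed=42,                 # request reproducible outputs
    max_retries=5,           # retry transient errors up to 5 times
    requests_per_minute=60,  # client-side RPM throttle
))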

Response caching

Enable caching to avoid re-requesting the same prompt:

llm = OpenAILLM(LLMConfig(
    model="gpt-4o-mini",
    api_key="sk-...",
    cache_backend="sqlite",
    cache_path="~/.synapsekit/cache.db",
))

# First call hits the API
response1 = await llm.generate("What is Python?")

# Second call is served from cache
response2 = await llm.generate("What is Python?")

Cost tracking

from synapsekit.observability import CostTracker

tracker = CostTracker()
llm = OpenAILLM(LLMConfig(model="gpt-4o", api_key="sk-..."))
llm.attach_tracker(tracker)

await llm.generate("Summarize the Python docs.")
await llm.generate("Write a haiku about async.")

print(tracker.total_cost_usd)
print(tracker.summary())

Reasoning models (o1, o3)

The o1 and o3 series have different constraints:

# o1 does not support streaming or system messages
llm = OpenAILLM(LLMConfig(
    model="o1",
    api_key="sk-...",
    # Do not set temperature -- unsupported for o1
))

response = await llm.generate("Solve this logic puzzle: ...")
caution

o1 and o3-mini do not support stream(), system messages, or temperature. Use generate() only.
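
If an application mixes chat and reasoning models, a small routing helper keeps this constraint in one place. The helper below is illustrative, not part of SynapseKit:

REASONING_MODELS = {"o1", "o1-mini", "o3-mini"}

async def answer(llm, model: str, prompt: str) -> str:
    # Reasoning models must use generate(); everything else can stream.
    if model in REASONING_MODELS:
        return await llm.generate(prompt)
    chunks = []
    async for token in llm.stream(prompt):
        print(token, end="", flush=True)
        chunks.append(token)
    return "".join(chunks)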

Error handling

from synapsekit.exceptions import LLMError, RateLimitError, AuthenticationError

try:
    response = await llm.generate("Hello")
except AuthenticationError:
    print("Invalid API key")
except RateLimitError as e:
    print(f"Rate limited. Retry after {e.retry_after}s")
except LLMError as e:
    print(f"LLM error: {e}")

SynapseKit automatically retries on 429 Too Many Requests and 5xx errors up to max_retries times with exponential backoff.
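
The built-in behavior is roughly equivalent to this manual loop (a sketch for illustration only; the real backoff schedule is internal to SynapseKit):

import asyncio

from synapsekit.exceptions import RateLimitError

async def generate_with_backoff(llm, prompt: str, max_retries: int = 3) -> str:
    for attempt in range(max_retries + 1):
        try:
            return await llm.generate(prompt)
        except RateLimitError as e:
            if attempt == max_retries:
                raise
            # Honor the server's hint when present, else back off exponentially.
            await asyncio.sleep(e.retry_after or 2 ** attempt)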

Using the RAG facade

from synapsekit import RAG

rag = RAG(model="gpt-4o-mini", api_key="sk-...")
rag.add("SynapseKit is a Python library for building LLM applications.")
rag.add_file("docs/readme.txt")

answer = rag.ask_sync("What is SynapseKit?")
print(answer)
tip

Set OPENAI_API_KEY in your environment and omit api_key from LLMConfig -- SynapseKit will pick it up automatically.
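
For example, with OPENAI_API_KEY exported in your shell:

from synapsekit.llms import OpenAILLM, LLMConfig

# api_key omitted -- read from the OPENAI_API_KEY environment variable.
llm = OpenAILLM(LLMConfig(model="gpt-4o-mini"))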