Skip to main content

LLM API Reference

BaseLLM

Abstract base class for all LLM providers.

class BaseLLM(ABC):
async def stream(self, prompt: str, **kwargs) -> AsyncIterator[str]: ...
async def generate(self, prompt: str, **kwargs) -> str: ...
async def stream_with_messages(self, messages: list[dict], **kwargs) -> AsyncIterator[str]: ...
async def generate_with_messages(self, messages: list[dict], **kwargs) -> str: ...

@property
def tokens_used(self) -> dict: ... # {"input": int, "output": int}

Note (v1.9.1): stream_with_messages on OllamaLLM now correctly awaits AsyncClient.chat() before iterating the response stream. Earlier versions raised TypeError: 'async for' requires an object with __aiter__ method, got coroutine on recent ollama-python SDK releases. Upgrade to synapsekit>=1.9.1 to get the fix.

LLMConfig

@dataclass
class LLMConfig:
model: str
api_key: str
provider: str
system_prompt: str = "You are a helpful assistant."
temperature: float = 0.2
max_tokens: int = 1024
# Caching (v0.5.0)
cache: bool = False
cache_maxsize: int = 128
# Retries (v0.5.0)
max_retries: int = 0
retry_delay: float = 1.0
FieldTypeDefaultDescription
modelstrrequiredModel name
api_keystrrequiredAPI key
providerstrrequiredProvider string ("openai", "anthropic", etc.)
system_promptstr"You are a helpful assistant."System instruction
temperaturefloat0.2Sampling temperature
max_tokensint1024Maximum output tokens
cacheboolFalseEnable LRU response caching
cache_maxsizeint128Maximum cached responses
max_retriesint0Retry attempts with exponential backoff
retry_delayfloat1.0Initial retry delay in seconds

See Caching & Retries for usage details.

Provider classes

ClassImport pathExtra
OpenAILLMsynapsekit.llm.openaisynapsekit[openai]
AnthropicLLMsynapsekit.llm.anthropicsynapsekit[anthropic]
OllamaLLMsynapsekit.llm.ollamasynapsekit[ollama]
CohereLLMsynapsekit.llm.coheresynapsekit[cohere]
MistralLLMsynapsekit.llm.mistralsynapsekit[mistral]
GeminiLLMsynapsekit.llm.geminisynapsekit[gemini]
BedrockLLMsynapsekit.llm.bedrocksynapsekit[bedrock]
AzureOpenAILLMsynapsekit.llm.azure_openaisynapsekit[openai]
GroqLLMsynapsekit.llm.groqsynapsekit[groq]
DeepSeekLLMsynapsekit.llm.deepseeksynapsekit[openai]
OpenRouterLLMsynapsekit.llm.openroutersynapsekit[openai]
TogetherLLMsynapsekit.llm.togethersynapsekit[openai]
FireworksLLMsynapsekit.llm.fireworkssynapsekit[openai]

All providers share the same constructor signature:

LLM(config: LLMConfig)

BedrockLLM accepts an additional optional region: str = "us-east-1" argument. AzureOpenAILLM accepts additional azure_endpoint, api_version, and azure_deployment arguments.

call_with_tools()

Available on providers that support native function calling:

Providercall_with_tools()
OpenAILLM
AnthropicLLM
GeminiLLM
MistralLLM
DeepSeekLLM
OpenRouterLLM
TogetherLLM
FireworksLLM
AzureOpenAILLM
GroqLLM
OllamaLLM❌ — use ReActAgent
CohereLLM❌ — use ReActAgent
BedrockLLM❌ — use ReActAgent
result = await llm.call_with_tools(messages, tools)
# Returns: {"content": str | None, "tool_calls": list | None}

Voice exports (lazy-loaded)

The following voice symbols are exported from the top-level synapsekit namespace but are lazily imported — they are only loaded from disk when first accessed:

SymbolCategory
VoicePipelinePipeline orchestrator
BaseSTTBase speech-to-text
BaseTTSBase text-to-speech
LocalWhisperSTTLocal Whisper STT
OpenAIWhisperSTTOpenAI Whisper STT
DeepgramSTTDeepgram STT
OpenAITTSOpenAI TTS
ElevenLabsTTSElevenLabs TTS
CartesiaTTSCartesia TTS
PiperTTSPiper (local, offline) TTS
EnergyVADEnergy-based voice activity detection
SileroVADSilero VAD model

Note (v1.9.1): Prior to v1.9.1, import synapsekit eagerly imported the entire voice module tree, which pulled in sounddevice and other audio dependencies at startup even for users who never use voice features. All voice exports are now lazily loaded — sounddevice and other voice deps are only imported when a voice symbol is first accessed.