LLM Overview

All LLMs in SynapseKit extend BaseLLM and share the same interface.

Interface

from abc import ABC
from typing import AsyncIterator

class BaseLLM(ABC):
    async def stream(self, prompt: str, **kwargs) -> AsyncIterator[str]: ...
    async def generate(self, prompt: str, **kwargs) -> str: ...
    async def stream_with_messages(self, messages: list[dict], **kwargs) -> AsyncIterator[str]: ...
    async def generate_with_messages(self, messages: list[dict], **kwargs) -> str: ...

generate() is always implemented by joining the stream, i.e. "".join([chunk async for chunk in self.stream(prompt)]); streaming is the primary code path.
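To make the stream-first design concrete, here is a minimal, self-contained sketch (not SynapseKit source; EchoLLM is a toy class invented for illustration) of a generate() built by joining stream():

```python
import asyncio
from typing import AsyncIterator

class EchoLLM:
    """Toy LLM whose stream() yields the prompt word by word."""

    async def stream(self, prompt: str) -> AsyncIterator[str]:
        for word in prompt.split():
            yield word + " "

    async def generate(self, prompt: str) -> str:
        # Batch generation is just the concatenated stream.
        return "".join([chunk async for chunk in self.stream(prompt)])

result = asyncio.run(EchoLLM().generate("hello streaming world"))
print(result.strip())  # hello streaming world
```

Implementing stream() is therefore the only hard requirement for a new provider; the batch method falls out for free.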

LLMConfig

from synapsekit import LLMConfig

config = LLMConfig(
    model="gpt-4o-mini",
    api_key="sk-...",
    provider="openai",
    system_prompt="You are a helpful assistant.",
    temperature=0.2,
    max_tokens=1024,
    # Optional: caching and retries
    cache=False,               # Enable LRU response caching
    cache_maxsize=128,         # Max cached responses
    cache_backend="memory",    # "memory" or "sqlite"
    max_retries=0,             # Retry attempts (0 = disabled)
    retry_delay=1.0,           # Initial retry delay in seconds
    # Rate limiting
    requests_per_minute=None,  # Token-bucket rate limiter
)

See Caching & Retries for details on response caching and exponential backoff.
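As a rough mental model of the max_retries / retry_delay semantics, here is a hypothetical retry-with-exponential-backoff helper (illustrative only; with_retries is not a SynapseKit API, and the library's actual backoff logic may differ):

```python
import asyncio

async def with_retries(call, max_retries=3, retry_delay=1.0):
    """Retry an async call, doubling the delay after each failure."""
    for attempt in range(max_retries + 1):
        try:
            return await call()
        except Exception:
            if attempt == max_retries:
                raise  # retries exhausted; surface the last error
            # Delays grow geometrically: retry_delay, 2x, 4x, ...
            await asyncio.sleep(retry_delay * (2 ** attempt))

# Demo: a call that fails twice, then succeeds on the third attempt.
attempts = {"n": 0}

async def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RuntimeError("transient error")
    return "ok"

result = asyncio.run(with_retries(flaky, max_retries=3, retry_delay=0.01))
print(result)  # ok
```

With max_retries=0 the loop runs once and any error propagates immediately, matching the "0 = disabled" default above.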

Available providers

| Provider | Class | Extra | Provider string |
| --- | --- | --- | --- |
| OpenAI | OpenAILLM | pip install synapsekit[openai] | "openai" |
| Anthropic | AnthropicLLM | pip install synapsekit[anthropic] | "anthropic" |
| Ollama | OllamaLLM | pip install synapsekit[ollama] | "ollama" |
| Cohere | CohereLLM | pip install synapsekit[cohere] | "cohere" |
| Mistral | MistralLLM | pip install synapsekit[mistral] | "mistral" |
| Google Gemini | GeminiLLM | pip install synapsekit[gemini] | "gemini" |
| AWS Bedrock | BedrockLLM | pip install synapsekit[bedrock] | "bedrock" |
| Azure OpenAI | AzureOpenAILLM | pip install synapsekit[openai] | "azure" |
| Groq | GroqLLM | pip install synapsekit[groq] | "groq" |
| DeepSeek | DeepSeekLLM | pip install synapsekit[openai] | "deepseek" |
| OpenRouter | OpenRouterLLM | pip install synapsekit[openai] | "openrouter" |
| Together AI | TogetherLLM | pip install synapsekit[openai] | "together" |
| Fireworks AI | FireworksLLM | pip install synapsekit[openai] | "fireworks" |
| Perplexity AI | PerplexityLLM | pip install synapsekit[openai] | "perplexity" |
| Cerebras | CerebrasLLM | pip install synapsekit[openai] | "cerebras" |

Auto-detection

The RAG facade auto-detects the provider from the model name:

| Model prefix | Detected provider |
| --- | --- |
| claude-* | anthropic |
| gemini-* | gemini |
| command-* | cohere |
| mistral-*, open-mistral-* | mistral |
| deepseek-* | deepseek |
| llama-*, mixtral-*, gemma-* | groq |
| */... (contains /) | openrouter |
| everything else | openai |
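The rules above can be sketched as a simple prefix check. This is a hypothetical re-implementation written from the table (detect_provider is not a SynapseKit function, and the library's real logic may order or phrase the checks differently):

```python
def detect_provider(model: str) -> str:
    """Map a model name to a provider string per the table above."""
    if model.startswith("claude-"):
        return "anthropic"
    if model.startswith("gemini-"):
        return "gemini"
    if model.startswith("command-"):
        return "cohere"
    if model.startswith(("mistral-", "open-mistral-")):
        return "mistral"
    if model.startswith("deepseek-"):
        return "deepseek"
    if model.startswith(("llama-", "mixtral-", "gemma-")):
        return "groq"
    if "/" in model:  # org/model names route to OpenRouter
        return "openrouter"
    return "openai"  # fallback

print(detect_provider("claude-3-5-sonnet"))      # anthropic
print(detect_provider("meta-llama/Llama-3-8b"))  # openrouter
print(detect_provider("gpt-4o"))                 # openai
```

Note the fallback: any unrecognized name is treated as an OpenAI model, which is why local or self-hosted models need an explicit provider= override.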

Override with the provider= argument:

rag = RAG(model="llama3", api_key="", provider="ollama")

Tokens and cost tracking

Every provider tracks input/output tokens:

llm = OpenAILLM(config)
await llm.generate("Hello!")
print(llm.tokens_used) # {"input": 12, "output": 8}

The TokenTracer in RAGPipeline aggregates this across all calls.
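Aggregation across calls amounts to summing the per-call tokens_used dicts. A minimal sketch of that bookkeeping (illustrative only; TokenTracer's actual API is not shown here, and the usage numbers below are made up):

```python
from collections import Counter

# Hypothetical per-call usage dicts, shaped like llm.tokens_used.
calls = [
    {"input": 12, "output": 8},
    {"input": 40, "output": 25},
    {"input": 7, "output": 3},
]

# Counter.update() adds values key-wise, giving running totals.
totals = Counter()
for usage in calls:
    totals.update(usage)

print(dict(totals))  # {'input': 59, 'output': 36}
```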