Troubleshooting
Diagnosis steps and fixes for common SynapseKit symptoms, organized by what you observe.
RAG returns irrelevant results
Symptoms: Answers are off-topic, contain hallucinations, or say "I don't know" when the document clearly has the answer.
Diagnosis checklist:
- Chunk size too large or too small. Large chunks dilute the signal; small chunks lose context.
# Start here: 512 tokens with 64-token overlap
from synapsekit.rag.splitters import RecursiveCharacterSplitter
splitter = RecursiveCharacterSplitter(chunk_size=512, chunk_overlap=64)
- k too small. You may not be retrieving the relevant chunk at all.
retriever = store.as_retriever(k=6) # retrieve more, then rerank
- Wrong retrieval strategy. Try hybrid (BM25 + vector) or MMR for diverse coverage.
retriever = store.as_retriever(strategy="hybrid", k=6)
retriever = store.as_retriever(strategy="mmr", k=6, lambda_mult=0.5)
- Embedding model mismatch. The embedding model used at index time must match query time.
# Both must use the same model
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
store = ChromaVectorStore(embeddings=embeddings)
- Inspect what is actually retrieved:
docs = await retriever.retrieve("your query")
for doc in docs:
    print(doc.metadata, doc.content[:200])
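To build intuition for what lambda_mult trades off, the following is a minimal, library-independent sketch of maximal marginal relevance. It assumes the common convention that lambda_mult=1.0 means pure relevance and lower values favour diversity; whether SynapseKit scores candidates in exactly this way is an assumption. The names cosine and mmr_select are illustrative only.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr_select(query_vec, doc_vecs, k=3, lambda_mult=0.5):
    """Return indices of up to k documents chosen by maximal marginal relevance."""
    selected: list[int] = []
    candidates = list(range(len(doc_vecs)))
    while candidates and len(selected) < k:

        def score(i: int) -> float:
            relevance = cosine(query_vec, doc_vecs[i])
            # Similarity to the closest already-selected document (0 if none yet).
            redundancy = max(
                (cosine(doc_vecs[i], doc_vecs[j]) for j in selected), default=0.0
            )
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy vectors: documents 0 and 1 are near-duplicates, document 2 is different.
query = [1.0, 0.0]
docs = [[1.0, 0.1], [1.0, 0.12], [0.6, 0.8]]
relevance_only = mmr_select(query, docs, k=2, lambda_mult=1.0)
diverse = mmr_select(query, docs, k=2, lambda_mult=0.3)
```

With these toy vectors, pure relevance keeps both near-duplicates, while the lower lambda_mult swaps the second one for the dissimilar document. If your retrieved chunks look repetitive, that is the behaviour a lower value is meant to counter.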
My agent is looping
Symptoms: The agent keeps calling tools in circles, exceeds max_steps, or never returns a final answer.
Diagnosis checklist:
- Set max_steps to catch infinite loops early:
agent = ReActAgent(llm=llm, tools=tools, max_steps=10)
- Inspect the thought process. Add verbose logging to see each step:
agent = ReActAgent(llm=llm, tools=tools, verbose=True)
- Temperature too high. High temperature causes inconsistent action selection.
from synapsekit.llms.config import LLMConfig
llm = OpenAILLM(model="gpt-4o", config=LLMConfig(temperature=0.0))
- Tool descriptions are ambiguous. The LLM may not know when to stop using a tool. Make the stopping condition explicit in the tool description.
- Prompt missing a termination instruction. Add a clear stopping condition:
When you have enough information to answer the question, stop using tools and return FINAL ANSWER: <answer>.
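Once you have a step trace, repeated identical tool calls are usually the clearest sign of a loop. The sketch below counts them. The trace format (a list of dicts with "tool" and "args" keys) is hypothetical; adapt it to whatever your verbose output or logging actually provides.

```python
import json
from collections import Counter


def find_repeated_calls(steps: list[dict], min_repeats: int = 3) -> list[tuple[str, int]]:
    """Return (call signature, count) for tool calls seen at least min_repeats times."""
    counts = Counter(
        # Serialise args with sorted keys so equal calls compare equal.
        (step["tool"], json.dumps(step["args"], sort_keys=True))
        for step in steps
    )
    return [
        (f"{tool}({args})", n)
        for (tool, args), n in counts.items()
        if n >= min_repeats
    ]


# Hypothetical trace: the agent searched for the same thing three times.
trace = [
    {"tool": "search", "args": {"q": "x"}},
    {"tool": "search", "args": {"q": "x"}},
    {"tool": "calc", "args": {"expr": "1+1"}},
    {"tool": "search", "args": {"q": "x"}},
]
repeated = find_repeated_calls(trace)
```

A call that repeats with identical arguments suggests the model is not using the tool result, which points back at the tool description or the termination instruction.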
Streaming is not working
Symptoms: async for chunk in llm.stream(prompt) hangs, returns all at once, or raises RuntimeError: no running event loop.
Diagnosis checklist:
- Running in a sync context. Streaming requires an async context.
# Wrong: a plain for loop cannot iterate an async stream
for chunk in llm.stream("Hello"):  # usually a TypeError, since an async stream is not a plain iterable
    print(chunk)
# Correct: iterate with async for inside a coroutine
import asyncio
async def main():
    async for chunk in llm.stream("Hello"):
        print(chunk, end="", flush=True)
asyncio.run(main())
- Jupyter notebooks. Use nest_asyncio if you get RuntimeError: This event loop is already running.
import nest_asyncio
nest_asyncio.apply()
- FastAPI / Starlette. Use StreamingResponse with an async generator:
from fastapi.responses import StreamingResponse
@app.get("/stream")
async def stream_endpoint(q: str):
    async def generate():
        async for chunk in llm.stream(q):
            yield chunk
    return StreamingResponse(generate(), media_type="text/plain")
- Provider does not support streaming. Check the provider's docs. OpenAI and Anthropic chat models generally support streaming; some smaller providers may not.
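To tell real streaming apart from a response that arrives all at once, compare the time to the first chunk with the total time. The sketch below does this for any async iterator; fake_stream is a stand-in for llm.stream() and is not part of SynapseKit.

```python
import asyncio
import time
from typing import AsyncIterator


async def fake_stream(prompt: str) -> AsyncIterator[str]:
    """Stand-in for llm.stream(): yields one word at a time with a small delay."""
    for word in prompt.split():
        await asyncio.sleep(0.05)
        yield word + " "


async def measure_stream(stream: AsyncIterator[str]) -> tuple[float, float, int]:
    """Return (seconds to first chunk, total seconds, number of chunks)."""
    start = time.perf_counter()
    first = None
    count = 0
    async for _ in stream:
        if first is None:
            first = time.perf_counter() - start
        count += 1
    total = time.perf_counter() - start
    return (first if first is not None else total), total, count
```

Call it as await measure_stream(llm.stream(prompt)) from async code. If the first-chunk time is close to the total time, or only one chunk arrives, something between the provider and your code is buffering the response.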
ImportError for a backend
Symptoms: ImportError: ChromaDB is not installed. Run: pip install synapsekit[chroma]
Fix: Install the extra indicated in the error message.
pip install "synapsekit[chroma]"
pip install "synapsekit[redis]"
pip install "synapsekit[postgres]"
pip install "synapsekit[serve]"
pip install "synapsekit[all]" # install everything
See Error Reference — ImportError for the full extras table.
Graph checkpoint not saving
Symptoms: graph.resume() raises CheckpointNotFoundError, or state is lost between runs.
Diagnosis checklist:
- Checkpointer not attached. You must pass a checkpointer at compile time.
from synapsekit.graph.checkpointing import SQLiteCheckpointer
checkpointer = SQLiteCheckpointer(path="checkpoints.db")
graph = workflow.compile(checkpointer=checkpointer)
- Different thread_id between run and resume. Both calls must use the same ID.
thread_id = "my-conversation-1"
await graph.run(state, thread_id=thread_id)
await graph.resume(thread_id=thread_id, updates={...})
- Redis/Postgres not reachable. Verify the REDIS_URL or DATABASE_URL environment variable is set and the service is running.
redis-cli -u "$REDIS_URL" ping # should return PONG
psql "$DATABASE_URL" -c "SELECT 1"
- Checkpointer table not created. For Postgres, run the schema migration:
await PostgresCheckpointer.create_tables(DATABASE_URL)
Rate limit exceeded
Symptoms: RateLimitError: Rate limit reached for gpt-4o after many requests.
Fix: Configure LLMConfig with a rate limiter and automatic retries.
from synapsekit.llms.config import LLMConfig
llm = OpenAILLM(
    model="gpt-4o",
    config=LLMConfig(
        max_retries=5,
        requests_per_minute=60, # stay under the limit
        tokens_per_minute=90_000,
    )
)
For batch workloads, add delays between calls:
import asyncio
results = []
for chunk in batches:
    result = await llm.complete(chunk)
    results.append(result)
    await asyncio.sleep(1.0) # 1 second between calls
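A fixed delay slows every call, including those that would have succeeded. If you need retry behaviour outside of LLMConfig, exponential backoff with jitter is a common alternative. The sketch below is generic: RateLimitError here is a local placeholder for whatever exception your provider raises, and with_backoff is a hypothetical helper, not a SynapseKit API.

```python
import asyncio
import random


class RateLimitError(Exception):
    """Placeholder for the provider's rate-limit exception."""


async def with_backoff(call, max_retries: int = 5, base_delay: float = 1.0,
                       max_delay: float = 30.0):
    """Await call(); on RateLimitError, wait progressively longer and try again."""
    for attempt in range(max_retries + 1):
        try:
            return await call()
        except RateLimitError:
            if attempt == max_retries:
                raise  # out of retries, surface the error
            # Delay cap doubles each attempt; a random wait below the cap
            # ("full jitter") keeps parallel workers from retrying in lockstep.
            cap = min(max_delay, base_delay * 2 ** attempt)
            await asyncio.sleep(random.uniform(0, cap))
```

Usage would look like result = await with_backoff(lambda: llm.complete(chunk)), assuming llm.complete returns an awaitable as in the loop above.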
My eval cases always pass
Symptoms: @eval_case tests never fail even when the output is clearly wrong.
Diagnosis checklist:
- Threshold too low. The default threshold may be 0.5, which lets weak outputs pass. Set a stricter value explicitly.
@eval_case(metric="faithfulness", threshold=0.85)
async def test_rag():
    ...
- Wrong metric for the task. Use faithfulness for RAG (does the answer match the context), answer_relevancy for Q&A, groundedness for factual claims.
- MockLLM grader. If you are using MockLLM in tests, swap in a real LLM for the evaluation judge.
from synapsekit.evaluation import Evaluator
evaluator = Evaluator(judge_llm=OpenAILLM(model="gpt-4o"))
- Empty or trivial outputs. If the RAG returns "I don't know" every time, faithfulness is technically 1.0 (no hallucinations). Check answer_relevancy as well.
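The last point can be made concrete with a small gate that requires every metric to clear its own threshold. This is a plain-Python sketch, independent of the @eval_case decorator; the passes helper and the scores are illustrative, not measured values.

```python
def passes(scores: dict[str, float], thresholds: dict[str, float]) -> bool:
    """True only if every required metric is present and meets its threshold."""
    return all(
        scores.get(name, 0.0) >= minimum for name, minimum in thresholds.items()
    )


# Illustrative scores for an answer of "I don't know": nothing is hallucinated,
# but the question is not answered either.
idk_scores = {"faithfulness": 1.0, "answer_relevancy": 0.1}

faithfulness_only = passes(idk_scores, {"faithfulness": 0.85})
both_metrics = passes(idk_scores, {"faithfulness": 0.85, "answer_relevancy": 0.7})
```

Checking faithfulness alone lets the trivial answer through; adding a relevancy threshold rejects it.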
synapsekit serve returns 500
Symptoms: All requests to the served endpoint return HTTP 500 Internal Server Error.
Diagnosis checklist:
- Import path is wrong. Verify the module:attribute format.
# Check the object loads correctly in Python first
python -c "from myapp import rag; print(rag)"
# Then serve it
synapsekit serve myapp:rag
- Object is not an Agent, Pipeline, or Graph. synapsekit serve only accepts these types. Pass a compatible object.
- Missing environment variables. The server process inherits the shell environment. Ensure API keys are exported.
export OPENAI_API_KEY=sk-...
synapsekit serve myapp:rag
- Check the server logs. Run with --log-level debug for the full traceback.
synapsekit serve myapp:rag --log-level debug
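If the import path is the suspect, it can help to resolve the module:attribute string yourself and see exactly where it fails. The sketch below shows how such strings are conventionally resolved with importlib; it is not taken from the synapsekit serve source, and load_target is a hypothetical helper.

```python
import importlib


def load_target(spec: str):
    """Resolve a 'module:attribute' string to the object it names."""
    module_name, sep, attr_path = spec.partition(":")
    if not sep or not module_name or not attr_path:
        raise ValueError(f"expected 'module:attribute', got {spec!r}")
    # ImportError here means the module itself cannot be imported.
    obj = importlib.import_module(module_name)
    # AttributeError here means the module loaded but the name is wrong.
    for part in attr_path.split("."):
        obj = getattr(obj, part)
    return obj
```

Running load_target("myapp:rag") in the same environment as the server separates three cases: a malformed string (ValueError), a module that cannot be imported (ImportError), and a missing attribute (AttributeError).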
Memory not persisting between runs
Symptoms: The agent starts fresh on every run and has no recall of previous conversations.
Diagnosis checklist:
- In-memory store. The default ConversationMemory is in-process and lost on restart. Switch to a persistent backend.
from synapsekit.memory import RedisConversationMemory, SQLiteConversationMemory
# Either backend persists across restarts; pick one
memory = SQLiteConversationMemory(path="memory.db", session_id="user-123")
memory = RedisConversationMemory(redis_url="redis://localhost:6379/0", session_id="user-123")
- Different session_id each run. The session ID must be the same to retrieve the same history.
# Use a stable identifier — user ID, conversation ID, etc.
memory = SQLiteConversationMemory(path="memory.db", session_id="user-123")
- SQLite file path changes. If you use a relative path, it resolves differently depending on your working directory. Use an absolute path.
import os
memory = SQLiteConversationMemory(
    path=os.path.expanduser("~/.synapsekit/memory.db"),
    session_id="user-123"
)
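One related pitfall: SQLite creates the database file on first use but not its parent directory, so a path under a directory that does not exist yet fails to open. A small helper can build the absolute path and create the directory first. This is a sketch; stable_db_path is a hypothetical name, and the ~/.synapsekit default simply mirrors the example above.

```python
from pathlib import Path
from typing import Optional


def stable_db_path(filename: str = "memory.db", base: Optional[Path] = None) -> Path:
    """Return an absolute path for the database file, creating its directory if needed."""
    directory = (base if base is not None else Path.home() / ".synapsekit")
    directory = directory.expanduser().resolve()
    directory.mkdir(parents=True, exist_ok=True)
    return directory / filename
```

The result can be passed as the path argument, for example path=str(stable_db_path()), and resolves to the same file regardless of the working directory the process starts in.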
Function calling not working
Symptoms: Tools are never called, the agent always responds with plain text, or you get NotImplementedError.
Supported providers: Function calling is only available on providers that expose a tool-call API.
| Provider | Function calling |
|---|---|
| OpenAI (gpt-4o, gpt-4o-mini, gpt-3.5-turbo) | Yes |
| Anthropic (claude-3+) | Yes |
| Groq (llama-3, mixtral) | Yes |
| Mistral (mistral-large, mistral-medium) | Yes |
| Google Gemini (gemini-1.5+) | Yes |
| AWS Bedrock (Claude, Llama 3) | Yes |
| Ollama | Model-dependent |
| Cohere | Yes (command-r+) |
| DeepSeek | Yes (deepseek-chat) |
Fix for unsupported models: Use ReActAgent (text-based tool use via prompting) instead of FunctionCallingAgent.
# FunctionCallingAgent requires native tool-call support
agent = FunctionCallingAgent(llm=OpenAILLM(model="gpt-4o"), tools=[...])
# ReActAgent works with any model via prompting
agent = ReActAgent(llm=OllamaLLM(model="llama3"), tools=[...])
Embeddings are slow
Symptoms: Indexing or retrieval takes too long, especially in a loop.
Diagnosis and fixes:
- Enable embedding caching. Re-embedding the same text repeatedly is wasteful.
from synapsekit.llms.caching import RedisLLMCache
cache = RedisLLMCache(redis_url="redis://localhost:6379/0")
embeddings = OpenAIEmbeddings(model="text-embedding-3-small", cache=cache)
- Use batch embedding. Embed many texts in one API call.
texts = [doc.content for doc in documents]
vectors = await embeddings.embed_batch(texts) # one API call
- Use a local embedding model for development. Avoids network round-trips entirely.
from synapsekit.llms.embeddings import SentenceTransformerEmbeddings
embeddings = SentenceTransformerEmbeddings(model="all-MiniLM-L6-v2")
- Parallelise large indexing jobs.
import asyncio
async def embed_all(texts: list[str]):
    tasks = [embeddings.embed(t) for t in texts]
    return await asyncio.gather(*tasks)
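Launching one task per text with asyncio.gather sends every request at once, which can itself trigger rate limits on large jobs. A semaphore caps the number of requests in flight. The sketch below takes the embedding coroutine as a parameter, so it does not assume anything about the embeddings object beyond an awaitable embed call; embed_all_bounded is a hypothetical name.

```python
import asyncio
from typing import Awaitable, Callable


async def embed_all_bounded(
    texts: list[str],
    embed: Callable[[str], Awaitable[list[float]]],
    max_concurrency: int = 8,
) -> list[list[float]]:
    """Embed every text with at most max_concurrency calls in flight; order is preserved."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def embed_one(text: str) -> list[float]:
        async with semaphore:  # wait here while the limit is reached
            return await embed(text)

    # gather returns results in the order the awaitables were passed in.
    return await asyncio.gather(*(embed_one(t) for t in texts))
```

For example, vectors = await embed_all_bounded(texts, embeddings.embed, max_concurrency=4). Where a batch endpoint is available it is usually still the cheaper option; bounded concurrency helps mainly when batches must be split.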
BudgetExceededError immediately
Symptoms: Every call raises BudgetExceededError even for cheap requests, including the very first one.
Diagnosis checklist:
- Circuit breaker is tripped. After any budget exceedance, the guard stays tripped until you reset it.
guard.reset() # reset the tripped circuit breaker
- Limit is too tight. A per_request limit of 0.001 (0.1 cent) will fire even for short responses with gpt-4o.
from synapsekit import BudgetLimit
# Realistic limits (2026 prices)
limit = BudgetLimit(
    per_request=0.05, # 5 cents per call
    per_day=5.00, # $5 / day
    per_month=50.00, # $50 / month
)
- Cost estimate is using the wrong model pricing. Ensure the model name passed to BudgetGuard matches the model you are actually using.
guard = BudgetGuard(limit, model="gpt-4o-mini") # uses gpt-4o-mini pricing
See Observability — Cost Tracker for the full BudgetGuard reference.
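A quick back-of-the-envelope estimate shows why a very tight per_request limit trips straight away. The sketch below uses the usual per-million-token pricing arithmetic. The prices are placeholders chosen for illustration, not actual rates for any model, and estimate_cost is not part of BudgetGuard.

```python
def estimate_cost(
    input_tokens: int,
    output_tokens: int,
    input_price_per_million: float,
    output_price_per_million: float,
) -> float:
    """Estimated request cost in dollars from token counts and per-million-token prices."""
    return (
        input_tokens * input_price_per_million
        + output_tokens * output_price_per_million
    ) / 1_000_000


# Placeholder prices in dollars per million tokens (not real rates).
cost = estimate_cost(
    input_tokens=300,
    output_tokens=150,
    input_price_per_million=2.00,
    output_price_per_million=8.00,
)
exceeds_tight_limit = cost > 0.001
```

Under these placeholder prices a 300-token prompt with a 150-token reply comes to about $0.0018, already above a 0.001 limit. Substitute your provider's current prices to choose a realistic value.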
TypeError: 'async for' requires an object with __aiter__ method, got coroutine (Ollama)
Symptoms: Streaming from OllamaLLM (via stream(), stream_with_messages(), or the RAG facade with provider="ollama") raises:
TypeError: 'async for' requires an object with __aiter__ method, got coroutine
Cause: A breaking change in the ollama-python SDK turned AsyncClient.chat() from an async generator function into a regular coroutine that returns an async generator. Earlier SynapseKit releases iterated the coroutine directly instead of awaiting it first, which raises this error on newer ollama-python versions.
Fix: Upgrade to synapsekit>=1.9.1, which correctly awaits the coroutine before iterating:
pip install --upgrade "synapsekit[ollama]"
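The error is easy to reproduce without Ollama, which helps if you hit the same pattern in your own wrappers. In the sketch below, chat is a stand-in for any coroutine function that returns an async generator; it is not the ollama-python API.

```python
import asyncio


async def chat():
    """Coroutine that returns an async generator (the newer behaviour described above)."""

    async def chunks():
        for piece in ("Hel", "lo"):
            yield piece

    return chunks()


async def broken() -> str:
    out = []
    async for piece in chat():  # iterates the coroutine object itself: TypeError
        out.append(piece)
    return "".join(out)


async def fixed() -> str:
    out = []
    async for piece in await chat():  # await first, then iterate the generator
        out.append(piece)
    return "".join(out)
```

asyncio.run(broken()) raises the TypeError shown above, while asyncio.run(fixed()) collects the chunks normally. The only difference is the await before iteration.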
sounddevice import error on import synapsekit (non-voice users)
Symptoms: Users who do not use SynapseKit's voice features see an import error such as:
ImportError: No module named 'sounddevice'
or:
OSError: PortAudio library not found
immediately on import synapsekit, even though they never use VoicePipeline or any STT/TTS class.
Cause: Prior to v1.9.1, the top-level synapsekit package eagerly imported the voice module tree at startup, which in turn imported sounddevice and other native audio dependencies. This caused failures in environments where those libraries are not installed (CI, serverless, Docker images without PortAudio, etc.).
Fix: Upgrade to synapsekit>=1.9.1. All voice exports (VoicePipeline, BaseSTT, BaseTTS, LocalWhisperSTT, OpenAIWhisperSTT, DeepgramSTT, OpenAITTS, ElevenLabsTTS, CartesiaTTS, PiperTTS, EnergyVAD, SileroVAD, and associated types) are now lazily loaded: each is imported only when first accessed, so import synapsekit no longer loads any audio library.
pip install --upgrade synapsekit
If you actively use voice features and want to ensure the extras are present:
pip install "synapsekit[voice]"
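For readers curious how lazy exports of this kind generally work, Python supports a module-level __getattr__ hook (PEP 562) that runs only when an attribute is not found the normal way. The sketch below demonstrates the mechanism on a throwaway module built at runtime. It is a generic illustration, not SynapseKit's actual implementation, and the export table uses standard-library names as stand-ins for the voice classes.

```python
import importlib
import types

# Hypothetical table: exported name -> module that really defines it.
LAZY_EXPORTS = {"JSONDecoder": "json.decoder", "OrderedDict": "collections"}


def make_lazy_package(name: str, exports: dict[str, str]) -> types.ModuleType:
    """Build a module whose listed attributes are imported on first access."""
    package = types.ModuleType(name)

    def __getattr__(attr: str):
        if attr not in exports:
            raise AttributeError(f"module {name!r} has no attribute {attr!r}")
        value = getattr(importlib.import_module(exports[attr]), attr)
        setattr(package, attr, value)  # cache so later lookups skip this hook
        return value

    # Placing the hook in the module namespace is what enables PEP 562 lookup.
    package.__getattr__ = __getattr__
    return package


demo = make_lazy_package("demo", LAZY_EXPORTS)
```

Nothing is imported when demo is created; accessing demo.OrderedDict triggers the import and caches the result. In a real package the same __getattr__ function would sit at the top level of __init__.py.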