Roadmap
Phase 1 — Core RAG ✅ Done
- `BaseLLM` ABC + `LLMConfig`
- `OpenAILLM` — async streaming
- `AnthropicLLM` — async streaming
- `SynapsekitEmbeddings` — sentence-transformers backend
- `InMemoryVectorStore` — numpy cosine sim + `.npz` persistence
- `Retriever` — vector search + BM25 rerank
- `TextSplitter` — pure Python, zero deps
- `ConversationMemory` — sliding window
- `TokenTracer` — tokens, latency, cost per call
- `TextLoader`, `StringLoader`
- `RAGPipeline` — full orchestrator
- `RAG` facade — 3-line happy path
- `run_sync()` — works inside/outside event loops
- 52 tests, all passing
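The sliding-window behavior behind `ConversationMemory` can be sketched with a `deque` in a few lines. This is a hypothetical stand-in for illustration; SynapseKit's actual class and method names may differ:

```python
from collections import deque

class SlidingWindowMemory:
    """Keep only the most recent `window` messages; older ones fall off."""

    def __init__(self, window: int = 6) -> None:
        self._messages: deque = deque(maxlen=window)

    def add(self, role: str, content: str) -> None:
        self._messages.append({"role": role, "content": content})

    def get(self) -> list:
        return list(self._messages)

memory = SlidingWindowMemory(window=2)
memory.add("user", "hi")
memory.add("assistant", "hello")
memory.add("user", "bye")                    # evicts the oldest message
print([m["content"] for m in memory.get()])  # → ['hello', 'bye']
```

`deque(maxlen=...)` handles the eviction automatically, which is why a sliding window needs no bookkeeping of its own.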
Phase 2 — Own the Niche ✅ Done
- Loaders: `PDFLoader`, `HTMLLoader`, `CSVLoader`, `JSONLoader`, `DirectoryLoader`, `WebLoader`
- Output parsers: `JSONParser`, `PydanticParser`, `ListParser`
- Vector store backends: `ChromaVectorStore`, `FAISSVectorStore`, `QdrantVectorStore`, `PineconeVectorStore`
- LLM providers: `OllamaLLM`, `CohereLLM`, `MistralLLM`, `GeminiLLM`, `BedrockLLM`
- Prompt templates: `PromptTemplate`, `ChatPromptTemplate`, `FewShotPromptTemplate`
- VectorStore ABC — all backends share one interface
- `Retriever.add()` — cleaner API, no internal `_store` access
- `RAGPipeline.add_documents(docs)` — ingest `List[Document]` directly
- `RAG.add_documents()` + `RAG.add_documents_async()`
- 141 tests, all passing
Phase 3 — Agents ✅ Done
- `BaseTool` ABC — `run()`, `schema()`, `anthropic_schema()`, `ToolResult`
- `ToolRegistry` — lookup by name, OpenAI + Anthropic schema generation
- `AgentMemory` — step scratchpad, `format_scratchpad()`, max_steps limit
- `ReActAgent` — Thought → Action → Observation loop, any `BaseLLM`, no function calling required
- `FunctionCallingAgent` — native OpenAI tool_calls / Anthropic tool_use, multi-tool per step
- `AgentExecutor` — unified runner, `run()`/`stream()`/`run_sync()`, picks agent from config
- `call_with_tools()` — added to `OpenAILLM` and `AnthropicLLM`
- Built-in tools:
  - `CalculatorTool` — safe math eval, no deps
  - `PythonREPLTool` — exec with persistent namespace, stdout capture
  - `FileReadTool` — read local files
  - `WebSearchTool` — DuckDuckGo search, no API key (`pip install synapsekit[search]`)
  - `SQLQueryTool` — SQLite (stdlib) + SQLAlchemy for other databases
- 223 tests, all passing
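The Thought → Action → Observation loop of a ReAct agent reduces to parsing the model's text output, running the named tool, and feeding the observation back. A minimal sketch with a scripted stand-in LLM; the prompt format and helper names here are illustrative assumptions, not SynapseKit's actual code:

```python
import re

def react_step(llm_output: str):
    """Parse one ReAct step: either a (tool, input) action or a final answer."""
    final = re.search(r"Final Answer:\s*(.+)", llm_output)
    if final:
        return ("final", final.group(1).strip())
    action = re.search(r"Action:\s*(\w+)\s*Action Input:\s*(.+)", llm_output, re.S)
    return ("action", (action.group(1), action.group(2).strip()))

def run_react(llm, tools, question, max_steps=5):
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        output = llm(transcript)
        kind, payload = react_step(output)
        if kind == "final":
            return payload
        tool, tool_input = payload
        observation = tools[tool](tool_input)          # run the chosen tool
        transcript += f"{output}\nObservation: {observation}\n"
    raise RuntimeError("max_steps exceeded")

# Scripted "LLM": first calls the calculator, then answers.
responses = iter([
    "Thought: need math\nAction: calculator\nAction Input: 2+3",
    "Thought: done\nFinal Answer: 5",
])
answer = run_react(lambda _: next(responses),
                   {"calculator": lambda e: eval(e)},  # stand-in calculator tool
                   "What is 2+3?")
print(answer)  # → 5
```

Because the loop only needs plain text in and out, this style works with any completion model, which is why the roadmap notes `ReActAgent` requires no native function calling.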
Phase 4 — Graph Workflows ✅ Done
- `StateGraph` — fluent DAG builder with compile-time validation and cycle detection
- `CompiledGraph` — wave-based async executor, `run()`/`stream()`/`run_sync()`
- `Node`, `Edge`, `ConditionalEdge` — sync + async node functions and routing
- `agent_node()`, `rag_node()` — wrap agents and RAG pipelines as graph nodes
- Parallel execution — nodes in the same wave run via `asyncio.gather()`
- Mermaid diagram export — `get_mermaid()` for any compiled graph
- `_MAX_STEPS = 100` guard against infinite conditional loops
- 267 tests, all passing
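Wave-based execution means repeatedly collecting every node whose dependencies are already done and running that group concurrently. A toy sketch of the idea; the node and state shapes are assumptions, not the `CompiledGraph` internals:

```python
import asyncio

async def run_waves(nodes, deps, state):
    """Execute a DAG in waves: all ready nodes run concurrently via gather()."""
    done = set()
    while len(done) < len(nodes):
        wave = [n for n in nodes if n not in done and deps[n] <= done]
        if not wave:
            raise ValueError("cycle detected")
        results = await asyncio.gather(*(nodes[n](state) for n in wave))
        for name, update in zip(wave, results):
            state.update(update)       # merge each node's output into shared state
        done.update(wave)
    return state

async def fetch(state): return {"doc": "raw text"}
async def split(state): return {"chunks": state["doc"].split()}
async def embed(state): return {"vectors": len(state["chunks"])}

state = asyncio.run(run_waves(
    {"fetch": fetch, "split": split, "embed": embed},
    {"fetch": set(), "split": {"fetch"}, "embed": {"split"}},
    {},
))
print(state["vectors"])  # → 2
```

The empty-wave check doubles as runtime cycle detection, mirroring the compile-time validation mentioned above.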
Phase 5 — Production Features ✅ Done
- Text Splitters — `BaseSplitter` ABC, `CharacterTextSplitter`, `RecursiveCharacterTextSplitter`, `TokenAwareSplitter`, `SemanticSplitter` (cosine similarity boundaries)
- Function calling for Gemini + Mistral — `call_with_tools()` added to `GeminiLLM` and `MistralLLM` (4 providers now support native tool use)
- LLM Response Caching — `AsyncLRUCache` with SHA-256 cache keys, opt-in via `LLMConfig(cache=True)`
- LLM Retries — exponential backoff via `retry_async()`, skips auth errors, opt-in via `LLMConfig(max_retries=N)`
- Graph Cycles — `compile(allow_cycles=True)` skips static cycle detection for intentional loops
- Configurable max_steps — `compile(max_steps=N)` overrides the default 100-step guard
- Graph Checkpointing — `BaseCheckpointer` ABC, `InMemoryCheckpointer`, `SQLiteCheckpointer`
- `CompiledGraph.resume()` — re-execute from saved state
- Adjacency optimization — pre-built index for faster edge lookup
- `RAGConfig.splitter` — plug any `BaseSplitter` into the RAG pipeline
- 332 tests, all passing
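The retry behavior described above, exponential backoff with non-retryable errors propagating immediately, can be sketched as follows. This `retry_async` is an illustrative reimplementation, not SynapseKit's:

```python
import asyncio

async def retry_async(fn, max_retries=3, base_delay=0.01, retry_on=(TimeoutError,)):
    """Retry an async callable with exponential backoff: delay doubles per attempt.
    Exceptions not listed in retry_on (e.g. auth errors) propagate immediately."""
    for attempt in range(max_retries + 1):
        try:
            return await fn()
        except retry_on:
            if attempt == max_retries:
                raise
            await asyncio.sleep(base_delay * 2 ** attempt)

calls = {"n": 0}

async def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("transient")
    return "ok"

print(asyncio.run(retry_async(flaky)))  # → ok
```

Skipping auth errors matters because retrying a bad API key only burns time; only transient failures are worth the backoff.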
Phase 6 — Polish & Ecosystem ✅ Done (v0.5.1–v0.5.3)
v0.5.1
- `@tool` decorator — create agent tools from plain functions with auto-generated JSON Schema
- Metadata filtering — `VectorStore.search(metadata_filter={"key": "value"})`
- Vector store lazy exports — all backends importable from `synapsekit`
- File existence checks — loaders raise `FileNotFoundError` before attempting to read
- Parameter validation — agents and memory reject invalid config
- 357 tests, all passing
v0.5.2
- `__repr__` methods — human-readable repr on `StateGraph`, `CompiledGraph`, `RAGPipeline`, `ReActAgent`, `FunctionCallingAgent`
- Empty document handling — `RAGPipeline.add()` silently skips empty text
- Retry for `call_with_tools()` — `LLMConfig(max_retries=N)` applies to function calling
- Cache hit/miss statistics — `BaseLLM.cache_stats` property
- MMR retrieval — `search_mmr()` and `retrieve_mmr()` for diversity-aware retrieval
- Rate limiting — `LLMConfig(requests_per_minute=N)` with token-bucket algorithm
- Structured output with retry — `generate_structured(llm, prompt, schema=Model)` parses to Pydantic
- 389 tests, all passing
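Token-bucket rate limiting refills capacity continuously and makes callers wait when the bucket is empty. A minimal synchronous sketch of the algorithm; SynapseKit's limiter is presumably async, so treat this as illustrative only:

```python
import time

class TokenBucket:
    """requests_per_minute limiting: the bucket refills continuously;
    each request spends one token or waits until one is available."""

    def __init__(self, requests_per_minute: int):
        self.capacity = requests_per_minute
        self.tokens = float(requests_per_minute)
        self.refill_rate = requests_per_minute / 60.0   # tokens per second
        self.last = time.monotonic()

    def acquire(self) -> float:
        """Take one token; return how long the caller should sleep first."""
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.refill_rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return 0.0
        wait = (1 - self.tokens) / self.refill_rate   # time until a token exists
        self.tokens = 0.0                             # that token is spoken for
        return wait

bucket = TokenBucket(requests_per_minute=60)   # ~1 request/second
print(bucket.acquire())                         # → 0.0  (bucket starts full)
```

Unlike a fixed-window counter, a token bucket allows short bursts up to `capacity` while still enforcing the average rate.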
v0.5.3
- Azure OpenAI —
AzureOpenAILLMfor enterprise Azure deployments - Groq —
GroqLLMfor ultra-fast inference (Llama, Mixtral, Gemma) - DeepSeek —
DeepSeekLLMwith function calling support - SQLite LLM cache — persistent cache via
LLMConfig(cache_backend="sqlite") - RAG Fusion —
RAGFusionRetrieverwith multi-query + Reciprocal Rank Fusion - Excel loader —
ExcelLoaderfor.xlsxfiles - PowerPoint loader —
PowerPointLoaderfor.pptxfiles - 10 LLM providers, 10 document loaders, 415 tests passing
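Reciprocal Rank Fusion, used here and in several later retrievers, merges the ranked lists produced by multiple query variants: each document scores the sum of 1/(k + rank) over every list it appears in, so items ranked well by several queries rise to the top. A standalone sketch:

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists: score(doc) = sum over lists of 1 / (k + rank)."""
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Three query variants returned overlapping results:
fused = reciprocal_rank_fusion([
    ["a", "b", "c"],
    ["b", "a", "d"],
    ["b", "c", "a"],
])
print(fused[0])  # → b
```

The constant `k` (60 in the original RRF paper) damps the advantage of a single #1 ranking, which is what makes the fusion robust to one bad list.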
Phase 7 — Tools, Providers & Advanced Retrieval ✅ Done (v0.6.0)
- Built-in tools (6 new):
  - `HTTPRequestTool` — GET/POST/PUT/DELETE/PATCH with aiohttp
  - `FileWriteTool` — write/append with auto-mkdir
  - `FileListTool` — list directories with glob patterns, recursive
  - `DateTimeTool` — current time, parse, format with tz support
  - `RegexTool` — findall, match, search, replace, split
  - `JSONQueryTool` — dot-notation path queries on JSON data
- LLM providers (3 new, all OpenAI-compatible):
  - `OpenRouterLLM` — unified API for 200+ models
  - `TogetherLLM` — Together AI fast inference
  - `FireworksLLM` — Fireworks AI optimized serving
- Advanced retrieval (2 new):
  - `ContextualRetriever` — Anthropic-style contextual retrieval
  - `SentenceWindowRetriever` — sentence-level embedding with window expansion
- 13 LLM providers, 11 built-in tools, 12 document loaders, 452 tests passing
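Dot-notation querying of the kind `JSONQueryTool` performs can be sketched in a few lines. This resolver is illustrative, not the tool's actual code:

```python
def json_query(data, path):
    """Resolve a dot-notation path like 'users.0.name' against nested
    dicts and lists; numeric segments index into lists."""
    for segment in path.split("."):
        if isinstance(data, list):
            data = data[int(segment)]
        else:
            data = data[segment]
    return data

payload = {"users": [{"name": "Ada"}, {"name": "Grace"}]}
print(json_query(payload, "users.1.name"))  # → Grace
```

Treating numeric segments as list indices is what lets one flat string address an arbitrarily nested JSON document.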
Phase 7.1 — Graph Power-ups & Advanced Retrieval ✅ Done (v0.6.1)
- Graph: Human-in-the-Loop — `GraphInterrupt` exception pauses execution for human review; `InterruptState` holds interrupt details; `resume(updates=...)` applies human edits and continues
- Graph: Subgraphs — `subgraph_node(compiled_graph, input_mapping, output_mapping)` nests a `CompiledGraph` as a node in a parent graph
- Graph: Token Streaming — `llm_node(llm, stream=True)` + `compiled.stream_tokens(state)` yields `{"type": "token", "node", "token"}` events for real-time LLM output
- Retrieval: SelfQueryRetriever — LLM decomposes natural-language queries into semantic search + metadata filters automatically
- Retrieval: ParentDocumentRetriever — embeds small chunks for precision, returns full parent documents for context
- Retrieval: CrossEncoderReranker — reranks retrieval results with cross-encoder models for higher precision (`pip install synapsekit[semantic]`)
- Memory: HybridMemory — sliding window of recent messages + LLM summary of older messages for token-efficient long conversations
Phase 7.2 — Retrieval Strategies, Memory & Tools ✅ Done (v0.6.2)
- Retrieval: CRAGRetriever — Corrective RAG: grades retrieved docs for relevance, rewrites query and retries when too few are relevant
- Retrieval: QueryDecompositionRetriever — breaks complex queries into sub-queries, retrieves for each, deduplicates
- Retrieval: ContextualCompressionRetriever — compresses retrieved documents to only relevant excerpts using an LLM
- Retrieval: EnsembleRetriever — fuses results from multiple retrievers using weighted Reciprocal Rank Fusion
- Memory: SQLiteConversationMemory — persistent chat history in SQLite, multi-conversation support, optional sliding window
- Memory: SummaryBufferMemory — token-budget-aware memory that progressively summarizes older messages
- Tools: HumanInputTool — pauses agent execution to ask the user a question (custom sync/async input functions)
- Tools: WikipediaTool — searches and fetches Wikipedia article summaries via REST API, no extra deps
- 13 providers, 13 tools, 12 loaders, 10 retrieval strategies, 4 memory backends, 512 tests passing
Phase 7.3 — Typed State, Fan-Out, SSE & LLM Tools ✅ Done (v0.6.3)
- Graph: TypedState with reducers — `StateField` with per-field reducers for safe parallel state merging
- Graph: fan_out_node() — run multiple subgraphs concurrently with `asyncio.gather()`, custom merge functions
- Graph: SSE streaming — `sse_stream()` for HTTP Server-Sent Events streaming
- Graph: Event callbacks — `EventHooks` with `on_node_start`, `on_node_complete`, `on_wave_start`, `on_wave_complete`
- LLM: SemanticCache — similarity-based cache lookup using embeddings, configurable threshold
- Tools: SummarizationTool — summarize text with concise, bullet_points, or detailed styles
- Tools: SentimentAnalysisTool — sentiment analysis with confidence and explanation
- Tools: TranslationTool — translate text between languages
- 13 providers, 16 tools, 12 loaders, 10 retrieval strategies, 4 memory backends, 540 tests passing
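Per-field reducers are what make parallel state merging safe: fields with a reducer are combined (e.g. lists concatenated), everything else is last-write-wins. A sketch of the merge step; the function and argument names are illustrative:

```python
import operator

def merge_states(base, updates, reducers):
    """Merge parallel node outputs into one state: fields with a reducer
    are combined; all other fields are last-write-wins."""
    merged = dict(base)
    for update in updates:
        for key, value in update.items():
            if key in reducers and key in merged:
                merged[key] = reducers[key](merged[key], value)
            else:
                merged[key] = value
    return merged

state = merge_states(
    {"docs": ["seed"], "step": 0},
    [{"docs": ["a"], "step": 1}, {"docs": ["b"], "step": 2}],  # two parallel branches
    reducers={"docs": operator.add},                           # concat instead of overwrite
)
print(state["docs"])  # → ['seed', 'a', 'b']
```

Without the reducer, the second branch's `docs` would silently overwrite the first; with it, both contributions survive the fan-in.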
Phase 7.4 — Loaders, HyDE, Tools, Caching & Checkpointing ✅ Done (v0.6.4)
- Loaders: DocxLoader — load Microsoft Word (.docx) files with python-docx (`pip install synapsekit[docx]`)
- Loaders: MarkdownLoader — load Markdown files with optional YAML frontmatter stripping (stdlib, no deps)
- Retrieval: HyDERetriever — Hypothetical Document Embeddings: generates a hypothetical answer with an LLM and uses it as the search query for improved retrieval
- Tools: ShellTool — async shell command execution with configurable timeout and allowed_commands whitelist
- Tools: SQLSchemaInspectionTool — inspect database schema (list_tables, describe_table) for SQLite and SQLAlchemy backends
- LLM: FilesystemLLMCache — persistent file-based LLM cache using JSON files on disk (`cache_backend="filesystem"`)
- Graph: JSONFileCheckpointer — file-based graph checkpointing using JSON files
- Observability: COST_TABLE — updated with GPT-4.1 family, o3/o4-mini, Gemini 2.5, DeepSeek, Groq-hosted models
- 13 providers, 19 tools, 14 loaders, 11 retrieval strategies, 3 cache backends, 3 checkpointers, 587 tests passing
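A filesystem cache keyed by a SHA-256 hash of the prompt, in the spirit of `FilesystemLLMCache`, might look like the following. The file layout and class name here are assumptions, not the shipped implementation:

```python
import hashlib
import json
import tempfile
from pathlib import Path

class FilesystemCache:
    """Persist responses as JSON files keyed by SHA-256 of the prompt,
    so identical prompts are answered from disk across processes."""

    def __init__(self, directory: str):
        self.dir = Path(directory)
        self.dir.mkdir(parents=True, exist_ok=True)

    def _path(self, prompt: str) -> Path:
        key = hashlib.sha256(prompt.encode()).hexdigest()
        return self.dir / f"{key}.json"

    def get(self, prompt: str):
        path = self._path(prompt)
        return json.loads(path.read_text())["response"] if path.exists() else None

    def set(self, prompt: str, response: str) -> None:
        self._path(prompt).write_text(
            json.dumps({"prompt": prompt, "response": response}))

cache = FilesystemCache(tempfile.mkdtemp())
assert cache.get("hello") is None      # cache miss
cache.set("hello", "hi there")
print(cache.get("hello"))              # → hi there
```

Hashing the prompt keeps filenames fixed-length and filesystem-safe regardless of prompt content, the same reason the in-memory cache above uses SHA-256 keys.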
Phase 7.5 — Retrieval, Tools, Memory & Redis Cache ✅ Done (v0.6.5)
- Retrieval: CohereReranker — rerank results using the Cohere Rerank API (`pip install synapsekit[cohere]`)
- Retrieval: StepBackRetriever — generate a step-back question, retrieve in parallel for both, deduplicate
- Retrieval: FLARERetriever — Forward-Looking Active REtrieval: iterative generate/retrieve loop with `[SEARCH: ...]` markers
- Tools: DuckDuckGoSearchTool — extended DuckDuckGo search with text and news search types
- Tools: PDFReaderTool — read and extract text from PDF files with optional page selection
- Tools: GraphQLTool — execute GraphQL queries against any endpoint
- Memory: TokenBufferMemory — token-budget-aware memory that drops oldest messages (no LLM needed)
- LLM: RedisLLMCache — distributed Redis cache backend (`pip install synapsekit[redis]`)
- 13 providers, 22 tools, 14 loaders, 14 retrieval strategies, 4 cache backends, 6 memory backends, 642 tests passing
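A drop-oldest token budget like `TokenBufferMemory` can be sketched as follows. The 4-characters-per-token estimate is a stand-in assumption; the real class presumably uses an actual tokenizer:

```python
class TokenBufferMemory:
    """Keep messages under a token budget by evicting the oldest first.
    len(text) // 4 approximates token count for illustration only."""

    def __init__(self, max_tokens: int):
        self.max_tokens = max_tokens
        self.messages = []

    def _tokens(self, text: str) -> int:
        return max(1, len(text) // 4)

    def add(self, content: str) -> None:
        self.messages.append(content)
        while sum(self._tokens(m) for m in self.messages) > self.max_tokens:
            self.messages.pop(0)   # drop the oldest message

memory = TokenBufferMemory(max_tokens=10)
memory.add("x" * 20)   # 5 tokens
memory.add("y" * 20)   # 5 tokens — exactly at budget
memory.add("z" * 20)   # 5 tokens — evicts the oldest
print(len(memory.messages))  # → 2
```

Because eviction needs no LLM call, this memory is the cheap counterpart to `SummaryBufferMemory`, which spends tokens to summarize instead of discarding.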
Phase 7.6 — Providers, Retrieval, Tools & Memory ✅ Done (v0.6.6)
- LLM: PerplexityLLM — Perplexity AI with Sonar models, OpenAI-compatible
- LLM: CerebrasLLM — Cerebras ultra-fast inference, OpenAI-compatible
- Retrieval: HybridSearchRetriever — BM25 keyword matching + vector similarity via Reciprocal Rank Fusion (RRF)
- Retrieval: SelfRAGRetriever — self-reflective RAG: retrieve, grade relevance, generate, check support, retry if needed
- Retrieval: AdaptiveRAGRetriever — LLM classifies query complexity (simple/moderate/complex) and routes to different retrievers
- Retrieval: MultiStepRetriever — iterative retrieval-generation with automatic gap identification and follow-up queries
- Tools: ArxivSearchTool — search arXiv for academic papers via Atom API (stdlib only, no deps)
- Tools: TavilySearchTool — AI-optimized web search via Tavily API (`pip install synapsekit[tavily]`)
- Memory: BufferMemory — simplest unbounded buffer, keeps all messages until cleared
- Memory: EntityMemory — LLM-based entity extraction with running descriptions and eviction policy
- 15 providers, 24 tools, 14 loaders, 18 retrieval strategies, 4 cache backends, 8 memory backends, 698 tests passing
Phase 7.8 — Tools & Graph Routing ✅ Done (v0.6.9)
- Tools: SlackTool — send messages via Slack webhook or bot token (stdlib only, no deps)
- Tools: JiraTool — Jira REST API v2: search, get, create issues, add comments (stdlib only)
- Tools: BraveSearchTool — web search via Brave Search API (stdlib only)
- Graph: approval_node() — gate graph execution on human approval; raises `GraphInterrupt` when the state key is falsy; dynamic messages via callable
- Graph: dynamic_route_node() — route to different compiled subgraphs at runtime based on a routing function; sync/async, input/output mapping
- 15 providers, 32 tools, 14 loaders, 18 retrieval strategies, 4 cache backends, 8 memory backends, 795 tests passing
Phase 8 — MCP + Multi-Agent ✅ Done (v0.7.0)
- MCP: MCPClient — connect to MCP servers via `connect_stdio()` or `connect_sse()`
- MCP: MCPToolAdapter — wrap MCP tools as `BaseTool` instances for use with any agent
- MCP: MCPServer — expose SynapseKit tools as an MCP-compatible server
- Multi-Agent: SupervisorAgent + WorkerAgent — supervisor orchestrates workers using DELEGATE/FINAL protocol
- Multi-Agent: HandoffChain + Handoff — condition-based agent transfers
- Multi-Agent: Crew + CrewAgent + Task — role-based teams, sequential or parallel task execution
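A DELEGATE/FINAL protocol means the supervisor's every turn is either a delegation to a named worker or a final answer. One possible reading, with a scripted supervisor; the exact message format is an assumption, not SynapseKit's wire protocol:

```python
def supervisor_loop(supervisor_llm, workers, task, max_rounds=5):
    """Supervisor emits either 'DELEGATE <worker>: <subtask>' or
    'FINAL: <answer>'; worker results are appended to its notes."""
    notes = [f"Task: {task}"]
    for _ in range(max_rounds):
        decision = supervisor_llm("\n".join(notes))
        if decision.startswith("FINAL:"):
            return decision[len("FINAL:"):].strip()
        name, subtask = decision[len("DELEGATE "):].split(":", 1)
        notes.append(f"{name} reported: {workers[name](subtask.strip())}")
    raise RuntimeError("supervisor did not finish")

# Scripted "supervisor": delegates once, then answers.
scripted = iter([
    "DELEGATE researcher: find the capital of France",
    "FINAL: The capital of France is Paris",
])
answer = supervisor_loop(
    lambda _: next(scripted),
    {"researcher": lambda q: "Paris"},   # stand-in worker agent
    "What is the capital of France?",
)
print(answer)  # → The capital of France is Paris
```

Keeping the protocol textual means any completion model can act as supervisor, the same design trade-off `ReActAgent` makes for tools.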
Phase 9 — Evaluation + Observability ✅ Done (v0.8.0)
- Evaluation: FaithfulnessMetric — measure answer faithfulness to source documents
- Evaluation: RelevancyMetric — measure answer relevancy to the question
- Evaluation: GroundednessMetric — measure groundedness in retrieved context
- Evaluation: EvaluationPipeline + EvaluationResult — multi-metric pipeline with `mean_score`
- Observability: OTelExporter + Span — export traces in OpenTelemetry format
- Observability: TracingMiddleware — auto-trace LLM calls with zero code changes
- Observability: TracingUI — HTML dashboard for viewing traces
Phase 10 — A2A + Guardrails + Distributed Tracing ✅ Done (v0.9.0)
- A2A: A2AClient + A2AServer — Agent-to-Agent protocol for inter-agent communication
- A2A: AgentCard — agent metadata for discovery
- A2A: A2ATask + A2AMessage + TaskState — task lifecycle management
- Guardrails: ContentFilter — block harmful or inappropriate content
- Guardrails: PIIDetector — detect and redact personally identifiable information
- Guardrails: TopicRestrictor — restrict agent conversations to allowed topics
- Guardrails: Guardrails — compose multiple guardrail checks into a pipeline
- Tracing: DistributedTracer + TraceSpan — distributed tracing across services/agents
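Regex-based detect-and-redact of the kind `PIIDetector` performs can be sketched like this. The patterns are deliberately simplistic examples; a production detector needs many more, plus validation:

```python
import re

# Illustrative patterns only — not the shipped detector's rules.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}

def redact_pii(text: str) -> str:
    """Replace each detected PII span with a [TYPE] placeholder."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

print(redact_pii("Mail ada@example.com or call 555-123-4567."))
# → Mail [EMAIL] or call [PHONE].
```

Redacting to typed placeholders rather than deleting keeps the sentence readable for the LLM while removing the sensitive value itself.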
Phase 11 — Multimodal + API Markers ✅ Done (v1.0.0)
- Multimodal: ImageContent — image payloads with `from_file()`, `from_url()`, `from_base64()`, provider-specific formatting
- Multimodal: AudioContent — audio payloads with `from_file()`, `from_base64()`
- Multimodal: MultimodalMessage — compose text + images + audio, convert to OpenAI/Anthropic formats
- Multimodal: ImageLoader — sync `load()` and `async_load()` with optional vision LLM description
- API Markers: @public_api — mark stable public API surfaces
- API Markers: @experimental — mark experimental features
- API Markers: @deprecated(reason, alternative) — deprecation with migration guidance
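A `@deprecated(reason, alternative)` marker is typically a decorator that emits a `DeprecationWarning` pointing at the replacement API. A sketch of one plausible implementation, not necessarily SynapseKit's:

```python
import functools
import warnings

def deprecated(reason: str, alternative: str):
    """Mark a function as deprecated; each call emits a DeprecationWarning
    that names the reason and the replacement API."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            warnings.warn(
                f"{fn.__name__} is deprecated ({reason}); use {alternative} instead",
                DeprecationWarning,
                stacklevel=2,   # point the warning at the caller, not the wrapper
            )
            return fn(*args, **kwargs)
        return wrapper
    return decorator

@deprecated(reason="renamed", alternative="retrieve()")
def search(query: str) -> str:
    return f"results for {query}"

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    print(search("rag"))              # → results for rag
print(caught[0].category.__name__)    # → DeprecationWarning
```

`stacklevel=2` is the detail that makes the warning actionable: it attributes the deprecated call to the user's code rather than to the decorator itself.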
Phase 12 — Retrieval, Memory, Providers & Visualization ✅ Done (v1.1.0)
- Retrieval: GraphRAGRetriever — knowledge-graph-augmented retrieval: extract entities via LLM, traverse KG, merge with vector results
- Retrieval: KnowledgeGraph — in-memory triple store with BFS traversal, entity-document linking, LLM-powered extraction
- Memory: RedisConversationMemory — Redis-backed conversation memory with windowing, multi-conversation, JSON serialization
- LLM: VertexAILLM — Google Vertex AI provider with ADC auth, streaming, native function calling
- Text Splitters: MarkdownTextSplitter — header-hierarchy-aware splitting with parent context preservation
- Graph: GraphVisualizer — ASCII timeline, Mermaid trace highlighting, step replay, HTML export
- Graph: get_mermaid_with_trace() — Mermaid diagrams with CSS status classes (completed/errored/skipped)
- 16 providers, 20 retrieval strategies, 6 text splitters, 9 memory backends, 1047 tests passing
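An in-memory triple store with BFS traversal, as in `KnowledgeGraph`, can be sketched compactly. The entity names and API here are illustrative, not the shipped class:

```python
from collections import deque

class TripleStore:
    """Minimal knowledge graph: (subject, relation, object) triples
    with breadth-first traversal from a seed entity."""

    def __init__(self):
        self.triples = []

    def add(self, subj, rel, obj):
        self.triples.append((subj, rel, obj))

    def neighbors(self, entity):
        return [o for s, _, o in self.triples if s == entity] + \
               [s for s, _, o in self.triples if o == entity]

    def bfs(self, start, max_hops=2):
        seen, queue = {start}, deque([(start, 0)])
        while queue:
            entity, hops = queue.popleft()
            if hops == max_hops:
                continue            # don't expand beyond the hop limit
            for nxt in self.neighbors(entity):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, hops + 1))
        return seen

kg = TripleStore()
kg.add("SynapseKit", "written_in", "Python")
kg.add("Python", "created_by", "Guido van Rossum")
kg.add("Rust", "used_by", "Cargo")
print(sorted(kg.bfs("SynapseKit", max_hops=2)))
# → ['Guido van Rossum', 'Python', 'SynapseKit']
```

In a GraphRAG flow, the entities reached within `max_hops` would then pull in their linked documents for merging with the vector search results.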
Phase 13 — Platform 🔜
- `synapsekit serve` — deploy any app as FastAPI in one command
- Prompt hub — versioned prompt registry
- Plugin system for community extensions
- Conversation branching and tree-of-thought