ConversationMemory maintains a sliding window of recent messages for multi-turn conversations.
```python
from synapsekit.memory import ConversationMemory

memory = ConversationMemory(window_size=10)

memory.add_user("What is SynapseKit?")
memory.add_assistant("SynapseKit is an async-first RAG framework.")
memory.add_user("How do I install it?")
memory.add_assistant("Run: pip install synapsekit[openai]")

history = memory.get()
memory.clear()
```
| Parameter | Default | Description |
|---|---|---|
| `window_size` | 10 | Max number of message pairs to keep |
When the window fills up, the oldest messages are dropped automatically.
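The eviction behavior can be sketched in a few lines of plain Python. This is an illustrative stand-in built on `collections.deque`, not SynapseKit's actual implementation:

```python
from collections import deque

class SlidingWindowSketch:
    """Illustrative sketch of pair-based window eviction (not SynapseKit's code)."""

    def __init__(self, window_size: int = 10):
        # Each entry is one user/assistant pair; deque evicts from the left
        # automatically once maxlen is reached.
        self.pairs = deque(maxlen=window_size)

    def add_pair(self, user_msg: str, assistant_msg: str) -> None:
        self.pairs.append((user_msg, assistant_msg))

    def get(self) -> list[dict]:
        out = []
        for user_msg, assistant_msg in self.pairs:
            out.append({"role": "user", "content": user_msg})
            out.append({"role": "assistant", "content": assistant_msg})
        return out

memory = SlidingWindowSketch(window_size=2)
for i in range(3):
    memory.add_pair(f"question {i}", f"answer {i}")

# The oldest pair ("question 0" / "answer 0") was evicted automatically
print([m["content"] for m in memory.get()])
```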
HybridMemory combines a sliding window of recent messages with an LLM-generated summary of older messages. This gives the model both precise recent context and compressed long-term context, reducing token usage while preserving important information.
```python
from synapsekit.memory.hybrid import HybridMemory

memory = HybridMemory(llm=llm, window=5, summary_max_tokens=200)

memory.add("user", "What is SynapseKit?")
memory.add("assistant", "SynapseKit is an async-first RAG framework.")
memory.add("user", "How do I install it?")
memory.add("assistant", "Run: pip install synapsekit")
```
When the conversation exceeds the window size, older messages are summarized:
```python
messages = await memory.get_messages_with_summary()
```
Use `format_context()` to get a formatted string suitable for prompt injection:

```python
context = await memory.format_context()

all_msgs = memory.get_messages()       # full history, no summary
recent = memory.get_recent_messages()  # only the recent window
print(memory.summary)                  # the current summary text
print(len(memory))
memory.clear()
```
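The window/summary split might look roughly like the sketch below, where a toy `_summarize` stub stands in for the real LLM call. Class and method bodies here are illustrative assumptions, not the library's code:

```python
class HybridSketch:
    """Illustrative sketch of a recent window plus a summary of older messages."""

    def __init__(self, window: int = 2):
        self.window = window  # number of message *pairs* kept verbatim
        self.messages: list[dict] = []
        self.summary = ""

    def add(self, role: str, content: str) -> None:
        self.messages.append({"role": role, "content": content})

    def _summarize(self, msgs: list[dict]) -> str:
        # Stand-in for the LLM call: just note how much was folded away.
        return f"[{len(msgs)} earlier messages summarized]"

    def get_messages_with_summary(self) -> list[dict]:
        keep = self.window * 2  # window counts pairs; messages count singly
        older, recent = self.messages[:-keep], self.messages[-keep:]
        if not older:
            return list(recent)
        self.summary = self._summarize(older)
        return [{"role": "system", "content": self.summary}] + recent

mem = HybridSketch(window=1)
for i in range(3):
    mem.add("user", f"q{i}")
    mem.add("assistant", f"a{i}")

msgs = mem.get_messages_with_summary()
print(msgs[0])    # system message summarizing the 4 older messages
print(len(msgs))  # 3: summary + the most recent user/assistant pair
```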
| Parameter | Default | Description |
|---|---|---|
| `llm` | — | LLM instance used for summarization |
| `window` | 5 | Number of message pairs to keep in the recent window |
| `summary_max_tokens` | 200 | Max tokens for the summary |
| Method | Returns | Description |
|---|---|---|
| `add(role, content)` | `None` | Append a message |
| `get_messages()` | `list[dict]` | All messages, no summary |
| `get_recent_messages()` | `list[dict]` | Only the recent window of messages |
| `get_messages_with_summary()` | `list[dict]` | Recent messages + summary of older ones (async) |
| `format_context()` | `str` | Formatted conversation string with summary (async) |
| `clear()` | `None` | Clear all messages and summary |
SQLiteConversationMemory persists chat history to SQLite. Messages survive process restarts. Supports multiple conversations via `conversation_id` and an optional sliding window.
```python
from synapsekit.memory.sqlite import SQLiteConversationMemory

memory = SQLiteConversationMemory(
    db_path="chat.db",
    conversation_id="user-1",
    window=10,
)

memory.add("user", "Hello!")
memory.add("assistant", "Hi there!")
messages = memory.get_messages()

# Each conversation_id gets its own history in the same database file
user1_memory = SQLiteConversationMemory(db_path="chat.db", conversation_id="user-1")
user2_memory = SQLiteConversationMemory(db_path="chat.db", conversation_id="user-2")
conversations = user1_memory.list_conversations()

# Messages can carry optional metadata
memory.add("user", "Hello!", metadata={"timestamp": "2026-03-13", "source": "web"})

context = memory.format_context()
```
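The persistence model can be approximated with the standard-library `sqlite3` module. The schema and helper functions below are illustrative assumptions, not SynapseKit's actual schema:

```python
import sqlite3

# Illustrative per-conversation message store (not SynapseKit's schema)
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE messages (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        conversation_id TEXT NOT NULL,
        role TEXT NOT NULL,
        content TEXT NOT NULL
    )"""
)

def add(conversation_id: str, role: str, content: str) -> None:
    conn.execute(
        "INSERT INTO messages (conversation_id, role, content) VALUES (?, ?, ?)",
        (conversation_id, role, content),
    )

def get_messages(conversation_id: str) -> list[dict]:
    rows = conn.execute(
        "SELECT role, content FROM messages WHERE conversation_id = ? ORDER BY id",
        (conversation_id,),
    ).fetchall()
    return [{"role": r, "content": c} for r, c in rows]

def list_conversations() -> list[str]:
    rows = conn.execute("SELECT DISTINCT conversation_id FROM messages").fetchall()
    return [r[0] for r in rows]

add("user-1", "user", "Hello!")
add("user-1", "assistant", "Hi there!")
add("user-2", "user", "Hey")

print(get_messages("user-1"))
print(sorted(list_conversations()))  # ['user-1', 'user-2']
```

Because rows are keyed by `conversation_id`, separate conversations coexist in one database file and survive process restarts.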
| Parameter | Default | Description |
|---|---|---|
| `db_path` | "conversations.db" | Path to the SQLite database file |
| `conversation_id` | "default" | Identifier for this conversation |
| `window` | None | Max message pairs to keep (oldest are deleted) |
| Method | Returns | Description |
|---|---|---|
| `add(role, content, metadata=None)` | `None` | Append a message |
| `get_messages()` | `list[dict]` | All messages for this conversation |
| `format_context()` | `str` | Formatted conversation string |
| `clear()` | `None` | Delete all messages for this conversation |
| `list_conversations()` | `list[str]` | All conversation IDs in the database |
| `close()` | `None` | Close the database connection |
SummaryBufferMemory tracks approximate token count and progressively summarizes older messages when the buffer exceeds a token limit. Unlike HybridMemory (fixed window), this uses token estimation to decide when to summarize.
```python
from synapsekit.memory.summary_buffer import SummaryBufferMemory

memory = SummaryBufferMemory(
    llm=llm,
    max_tokens=2000,
    chars_per_token=4,
)

memory.add("user", "Hello!")
memory.add("assistant", "Hi there!")

messages = await memory.get_messages()
```
- When `get_messages()` is called, the buffer's token count is estimated
- If under `max_tokens`, all messages are returned unchanged
- If over `max_tokens`, the oldest 2 messages are summarized by the LLM
- The summary is stored and prepended as a system message
- This repeats until the buffer is under the limit (keeping at least 2 messages)
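The loop above can be sketched in plain Python. `estimate_tokens` and `summarize` below are illustrative stand-ins for the library's internals and its LLM call:

```python
def estimate_tokens(messages: list[dict], chars_per_token: int = 4) -> int:
    # Rough character-based estimate, as described above
    return sum(len(m["content"]) // chars_per_token for m in messages)

def summarize(summary: str, oldest: list[dict]) -> str:
    # Stand-in for the LLM call: fold a terse note per message into the summary
    folded = "; ".join(m["content"][:10] for m in oldest)
    return (summary + " | " if summary else "") + folded

def get_messages(buffer: list[dict], max_tokens: int) -> list[dict]:
    summary = ""
    # Summarize the 2 oldest messages at a time until under budget,
    # keeping at least 2 messages verbatim
    while estimate_tokens(buffer) > max_tokens and len(buffer) > 2:
        summary = summarize(summary, buffer[:2])
        buffer = buffer[2:]
    if summary:
        return [{"role": "system", "content": summary}] + buffer
    return buffer

buffer = [
    {"role": "user", "content": "x" * 400},       # ~100 tokens each
    {"role": "assistant", "content": "y" * 400},
    {"role": "user", "content": "z" * 400},
    {"role": "assistant", "content": "w" * 400},
]
result = get_messages(buffer, max_tokens=150)
print(result[0]["role"])  # 'system': the two oldest messages were summarized
print(len(result))        # 3: summary + the two newest messages
```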
```python
context = memory.format_context()
```
| Parameter | Default | Description |
|---|---|---|
| `llm` | — | LLM instance used for summarization |
| `max_tokens` | 2000 | Token budget for the buffer (must be >= 100) |
| `chars_per_token` | 4 | Characters per token for estimation |
| Method | Returns | Description |
|---|---|---|
| `add(role, content)` | `None` | Append a message |
| `get_messages()` | `list[dict]` | Messages with automatic summarization (async) |
| `format_context()` | `str` | Formatted string, sync, no summarization |
| `clear()` | `None` | Clear all messages and summary |
| `summary` | `str` | The current running summary (property) |
TokenBufferMemory tracks approximate token count and drops the oldest messages when the buffer exceeds a token limit. Unlike SummaryBufferMemory, this does not use an LLM — it simply discards the oldest messages to stay within budget.
```python
from synapsekit.memory.token_buffer import TokenBufferMemory

memory = TokenBufferMemory(
    max_tokens=4000,
    chars_per_token=4,
)

memory.add("user", "Hello!")
memory.add("assistant", "Hi there! How can I help you today?")

messages = memory.get_messages()
```
- When `add()` is called, the message is appended to the buffer
- The total token count is estimated (`len(content) // chars_per_token`)
- If the total exceeds `max_tokens`, the oldest messages are dropped one at a time until the buffer fits
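The trimming logic can be sketched as follows; this is an illustrative stand-in, not the library's actual internals:

```python
def estimate_tokens(messages: list[dict], chars_per_token: int = 4) -> int:
    return sum(len(m["content"]) // chars_per_token for m in messages)

def add(buffer: list[dict], role: str, content: str, max_tokens: int) -> None:
    buffer.append({"role": role, "content": content})
    # Drop the oldest message one at a time until the buffer fits the budget
    while estimate_tokens(buffer) > max_tokens and len(buffer) > 1:
        buffer.pop(0)

# Three ~50-token messages against a 120-token budget: the oldest is dropped
buffer: list[dict] = []
add(buffer, "user", "a" * 200, max_tokens=120)
add(buffer, "assistant", "b" * 200, max_tokens=120)
add(buffer, "user", "c" * 200, max_tokens=120)

print([m["role"] for m in buffer])  # ['assistant', 'user']
print(len(buffer))                  # 2
```

No LLM is involved: trimming happens eagerly inside `add`, so `get_messages()` can stay synchronous.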
```python
context = memory.format_context()
```
| Parameter | Default | Description |
|---|---|---|
| `max_tokens` | 4000 | Token budget for the buffer (must be >= 1) |
| `chars_per_token` | 4 | Characters per token for estimation |
| Method | Returns | Description |
|---|---|---|
| `add(role, content)` | `None` | Append a message and trim if over budget |
| `get_messages()` | `list[dict]` | Current message history (sync) |
| `format_context()` | `str` | Formatted conversation string (sync) |
| `clear()` | `None` | Clear all messages |
BufferMemory is the simplest memory backend — an unbounded buffer that keeps all messages until cleared. No windowing, no trimming, no LLM calls.
```python
from synapsekit.memory.buffer import BufferMemory

memory = BufferMemory()

memory.add("user", "Hello!")
memory.add("assistant", "Hi there!")

messages = memory.get_messages()
context = memory.format_context()
print(len(memory))
memory.clear()
```
| Method | Returns | Description |
|---|---|---|
| `add(role, content)` | `None` | Append a message |
| `get_messages()` | `list[dict]` | All messages (copy) |
| `format_context()` | `str` | Formatted conversation string |
| `clear()` | `None` | Clear all messages |
EntityMemory uses an LLM to extract named entities from each message and maintains running descriptions. Useful for tracking people, places, organizations, and concepts across a conversation.
```python
from synapsekit.memory.entity import EntityMemory

memory = EntityMemory(llm=llm, max_entities=50)

await memory.add("user", "Alice works at Acme Corp in Paris.")
await memory.add("assistant", "That's great! Acme Corp is a tech company.")

entities = memory.get_entities()
```
- When `add()` is called, the LLM extracts entity names from the message
- For each entity, the LLM generates or updates a running description
- Entities are stored in an `OrderedDict`, with the most recently updated at the end
- When the entity count exceeds `max_entities`, the oldest entities are evicted
`format_context()` includes both entities and messages:

```python
context = memory.format_context()
```
| Parameter | Default | Description |
|---|---|---|
| `llm` | — | LLM instance for entity extraction and summarization |
| `max_entities` | 50 | Maximum entities to track (oldest evicted first) |
| Method | Returns | Description |
|---|---|---|
| `add(role, content)` | `None` | Add message and extract/update entities (async) |
| `get_messages()` | `list[dict]` | All messages (copy) |
| `get_entities()` | `dict[str, str]` | Entity name → description mapping |
| `format_context()` | `str` | Entities section + messages |
| `clear()` | `None` | Clear messages and entities |
RedisConversationMemory provides persistent conversation memory backed by Redis. Messages survive process restarts and support multiple concurrent conversations.
```bash
pip install synapsekit[redis]
```
```python
from synapsekit.memory.redis import RedisConversationMemory

memory = RedisConversationMemory(
    url="redis://localhost:6379",
    conversation_id="user-42",
    window=10,
)

memory.add("user", "What is SynapseKit?")
memory.add("assistant", "An async-first RAG framework.")

messages = memory.get_messages()
context = memory.format_context()
conversations = memory.list_conversations()

memory.clear()
memory.close()
```
| Parameter | Default | Description |
|---|---|---|
| `url` | "redis://localhost:6379" | Redis connection URL |
| `conversation_id` | "default" | Conversation identifier for multi-conversation support |
| `window` | None | Max message pairs to keep (None = unlimited) |
| `prefix` | "synapsekit:memory:" | Redis key prefix for namespacing |
| Method | Description |
|---|---|
| `add(role, content, metadata=None)` | Append a message |
| `get_messages()` | Return all messages for this conversation |
| `format_context()` | Flatten history to a plain string |
| `clear()` | Delete all messages for this conversation |
| `list_conversations()` | List all conversation IDs |
| `close()` | Close the Redis connection |