
Conversation Memory

ConversationMemory maintains a sliding window of recent messages for multi-turn conversations.

Usage

from synapsekit.memory import ConversationMemory

memory = ConversationMemory(window_size=10)

memory.add_user("What is SynapseKit?")
memory.add_assistant("SynapseKit is an async-first RAG framework.")

memory.add_user("How do I install it?")
memory.add_assistant("Run: pip install synapsekit[openai]")

# Get full history as a list of dicts
history = memory.get()
# [{"role": "user", "content": "..."}, {"role": "assistant", "content": "..."}, ...]

# Clear memory
memory.clear()

Parameters

| Parameter | Default | Description |
| --- | --- | --- |
| window_size | 10 | Max number of message pairs to keep |

When the window fills up, the oldest messages are dropped automatically.
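The eviction behavior can be illustrated with a plain `collections.deque`, which drops its oldest entries once `maxlen` is reached. This is a sketch of the mechanism, not SynapseKit's actual implementation:

```python
from collections import deque

# A deque with a maxlen silently evicts its oldest entries, which is
# the behavior ConversationMemory's window provides for message pairs.
window_size = 3  # keep at most 3 message pairs
window = deque(maxlen=window_size)

for i in range(5):
    window.append((f"question {i}", f"answer {i}"))

# Only the 3 most recent pairs remain; pairs 0 and 1 were dropped.
print(list(window))
# [('question 2', 'answer 2'), ('question 3', 'answer 3'), ('question 4', 'answer 4')]
```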


HybridMemory

HybridMemory combines a sliding window of recent messages with an LLM-generated summary of older messages. This gives the model both precise recent context and compressed long-term context, reducing token usage while preserving important information.

Usage

from synapsekit.memory.hybrid import HybridMemory

memory = HybridMemory(llm=llm, window=5, summary_max_tokens=200)

memory.add("user", "What is SynapseKit?")
memory.add("assistant", "SynapseKit is an async-first RAG framework.")
memory.add("user", "How do I install it?")
memory.add("assistant", "Run: pip install synapsekit")
# ... many more messages ...

Getting messages with summary

When the conversation exceeds the window size, older messages are summarized:

# If <= window*2 messages, returns all messages unchanged
messages = await memory.get_messages_with_summary()

# If > window*2 messages, older messages are replaced by a summary:
# [
#   {"role": "system", "content": "Summary of earlier conversation:\n..."},
#   {"role": "user", "content": "most recent question"},
#   {"role": "assistant", "content": "most recent answer"},
#   ...
# ]
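The `window*2` threshold (each pair is two messages) determines which messages are kept verbatim and which feed the summary. A minimal sketch of that split, with the LLM call omitted:

```python
def split_for_summary(messages, window=5):
    """Sketch of the documented split: if the history fits in the window
    (window pairs = window*2 messages), nothing is summarized; otherwise
    the older messages become summary input and only the most recent
    window*2 messages are kept verbatim. Illustrative only."""
    if len(messages) <= window * 2:
        return [], messages  # nothing to summarize
    cutoff = len(messages) - window * 2
    return messages[:cutoff], messages[cutoff:]

msgs = [{"role": "user", "content": f"m{i}"} for i in range(12)]
older, recent = split_for_summary(msgs, window=5)
print(len(older), len(recent))  # 2 10
```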

Formatting for prompts

Use format_context() to get a formatted string suitable for prompt injection:

context = await memory.format_context()
# "System: Summary of earlier conversation:\n...\nUser: latest question\nAssistant: latest answer"

Other methods

# All messages (no summary)
all_msgs = memory.get_messages()

# Only the recent window
recent = memory.get_recent_messages()

# Current summary (empty until first summarization)
print(memory.summary)

# Message count
print(len(memory))

# Clear everything
memory.clear()

Parameters

| Parameter | Default | Description |
| --- | --- | --- |
| llm | required | LLM instance used for summarization |
| window | 5 | Number of message pairs to keep in the recent window |
| summary_max_tokens | 200 | Max tokens for the summary |

Methods

| Method | Returns | Description |
| --- | --- | --- |
| add(role, content) | None | Append a message |
| get_messages() | list[dict] | All messages, no summary |
| get_recent_messages() | list[dict] | Only the recent window of messages |
| get_messages_with_summary() | list[dict] | Recent messages + summary of older ones (async) |
| format_context() | str | Formatted conversation string with summary (async) |
| clear() | None | Clear all messages and summary |

SQLiteConversationMemory

SQLiteConversationMemory persists chat history to SQLite. Messages survive process restarts. Supports multiple conversations via conversation_id and an optional sliding window.

Usage

from synapsekit.memory.sqlite import SQLiteConversationMemory

memory = SQLiteConversationMemory(
    db_path="chat.db",
    conversation_id="user-1",
    window=10,  # Optional: keep only last N message pairs
)

memory.add("user", "Hello!")
memory.add("assistant", "Hi there!")

# Messages are persisted to disk
messages = memory.get_messages()
# [{"role": "user", "content": "Hello!"}, {"role": "assistant", "content": "Hi there!"}]
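The persistence model can be sketched with the standard-library sqlite3 module. The schema below (a messages table keyed by conversation_id) is an assumption for illustration, not SynapseKit's actual schema:

```python
import sqlite3

# Minimal sketch of SQLite-backed chat persistence. Table and column
# names are assumptions, not SynapseKit's real schema.
conn = sqlite3.connect(":memory:")  # use a file path for real persistence
conn.execute(
    """CREATE TABLE IF NOT EXISTS messages (
           id INTEGER PRIMARY KEY AUTOINCREMENT,
           conversation_id TEXT NOT NULL,
           role TEXT NOT NULL,
           content TEXT NOT NULL
       )"""
)

def add(conversation_id, role, content):
    conn.execute(
        "INSERT INTO messages (conversation_id, role, content) VALUES (?, ?, ?)",
        (conversation_id, role, content),
    )
    conn.commit()

def get_messages(conversation_id):
    rows = conn.execute(
        "SELECT role, content FROM messages WHERE conversation_id = ? ORDER BY id",
        (conversation_id,),
    )
    return [{"role": r, "content": c} for r, c in rows]

add("user-1", "user", "Hello!")
add("user-1", "assistant", "Hi there!")
print(get_messages("user-1"))
```

Because every row carries a conversation_id, several conversations can share one database file, which matches the multi-conversation behavior described below.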

Multiple conversations

# Each conversation_id has its own history
user1_memory = SQLiteConversationMemory(db_path="chat.db", conversation_id="user-1")
user2_memory = SQLiteConversationMemory(db_path="chat.db", conversation_id="user-2")

# List all conversations in the database
conversations = user1_memory.list_conversations()
# ["user-1", "user-2"]

Metadata support

memory.add("user", "Hello!", metadata={"timestamp": "2026-03-13", "source": "web"})

messages = memory.get_messages()
# [{"role": "user", "content": "Hello!", "metadata": {"timestamp": "2026-03-13", "source": "web"}}]

Formatting for prompts

context = memory.format_context()
# "User: Hello!\nAssistant: Hi there!"

Parameters

| Parameter | Default | Description |
| --- | --- | --- |
| db_path | "conversations.db" | Path to the SQLite database file |
| conversation_id | "default" | Identifier for this conversation |
| window | None | Max message pairs to keep (oldest are deleted) |

Methods

| Method | Returns | Description |
| --- | --- | --- |
| add(role, content, metadata=None) | None | Append a message |
| get_messages() | list[dict] | All messages for this conversation |
| format_context() | str | Formatted conversation string |
| clear() | None | Delete all messages for this conversation |
| list_conversations() | list[str] | All conversation IDs in the database |
| close() | None | Close the database connection |

SummaryBufferMemory

SummaryBufferMemory tracks approximate token count and progressively summarizes older messages when the buffer exceeds a token limit. Unlike HybridMemory (fixed window), this uses token estimation to decide when to summarize.

Usage

from synapsekit.memory.summary_buffer import SummaryBufferMemory

memory = SummaryBufferMemory(
    llm=llm,
    max_tokens=2000,
    chars_per_token=4,  # Estimation ratio
)

memory.add("user", "Hello!")
memory.add("assistant", "Hi there!")

# When under token limit, returns all messages as-is
messages = await memory.get_messages()

# When over limit, oldest message pairs are summarized
# and replaced with a system message containing the summary

How summarization works

  1. When get_messages() is called, the buffer's token count is estimated
  2. If under max_tokens, all messages are returned unchanged
  3. If over max_tokens, the oldest 2 messages are summarized by the LLM
  4. The summary is stored and prepended as a system message
  5. This repeats until the buffer is under the limit (keeping at least 2 messages)
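The loop in steps 3 to 5 can be sketched in plain Python. A simple string join stands in for the LLM summarization call, so this is illustrative only, not the library's implementation:

```python
def trim_with_summary(messages, summary, max_tokens, chars_per_token=4):
    """While the estimated token count is over budget and more than 2
    messages remain, fold the two oldest messages into the running
    summary. A string join stands in for the LLM call."""
    def estimate(msgs):
        return sum(len(m["content"]) for m in msgs) // chars_per_token

    while estimate(messages) > max_tokens and len(messages) > 2:
        folded = " ".join(m["content"] for m in messages[:2])
        summary = (summary + " " + folded).strip()  # stand-in for the LLM
        messages = messages[2:]
    return summary, messages

# Six ~10-token messages against a 25-token budget: two rounds of
# folding leave the last 2 messages plus a running summary.
msgs = [{"role": "user", "content": "x" * 40} for _ in range(6)]
summary, remaining = trim_with_summary(msgs, "", max_tokens=25)
print(len(remaining))  # 2
```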

Formatting for prompts

# Sync, no summarization — just flattens current state
context = memory.format_context()
# "Summary: ...\nUser: latest question\nAssistant: latest answer"

Parameters

| Parameter | Default | Description |
| --- | --- | --- |
| llm | required | LLM instance used for summarization |
| max_tokens | 2000 | Token budget for the buffer (must be >= 100) |
| chars_per_token | 4 | Characters per token for estimation |

Methods

| Method | Returns | Description |
| --- | --- | --- |
| add(role, content) | None | Append a message |
| get_messages() | list[dict] | Messages with automatic summarization (async) |
| format_context() | str | Formatted string (sync, no summarization) |
| clear() | None | Clear all messages and summary |
| summary (property) | str | The current running summary |

TokenBufferMemory

TokenBufferMemory tracks approximate token count and drops the oldest messages when the buffer exceeds a token limit. Unlike SummaryBufferMemory, this does not use an LLM — it simply discards the oldest messages to stay within budget.

Usage

from synapsekit.memory.token_buffer import TokenBufferMemory

memory = TokenBufferMemory(
    max_tokens=4000,
    chars_per_token=4,  # Estimation ratio
)

memory.add("user", "Hello!")
memory.add("assistant", "Hi there! How can I help you today?")

# Messages are returned as-is (no async, no LLM needed)
messages = memory.get_messages()
# [{"role": "user", "content": "Hello!"}, {"role": "assistant", "content": "..."}]

How it works

  1. When add() is called, the message is appended to the buffer
  2. The total token count is estimated (len(content) // chars_per_token)
  3. If the total exceeds max_tokens, the oldest messages are dropped one at a time until the buffer fits
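The steps above can be sketched in a few lines; this is an illustration of the trimming logic, not the library source:

```python
def trim_oldest(messages, max_tokens, chars_per_token=4):
    """Estimate tokens as len(content) // chars_per_token and drop the
    oldest messages until the buffer fits the budget."""
    def total(msgs):
        return sum(len(m["content"]) // chars_per_token for m in msgs)

    while messages and total(messages) > max_tokens:
        messages = messages[1:]  # drop the oldest message
    return messages

# Five ~10-token messages against a 25-token budget: the three oldest
# are dropped, leaving 2 messages (~20 tokens).
buf = [{"role": "user", "content": "x" * 40} for _ in range(5)]
print(len(trim_oldest(buf, max_tokens=25)))  # 2
```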

Formatting for prompts

context = memory.format_context()
# "User: Hello!\nAssistant: Hi there! How can I help you today?"

Parameters

| Parameter | Default | Description |
| --- | --- | --- |
| max_tokens | 4000 | Token budget for the buffer (must be >= 1) |
| chars_per_token | 4 | Characters per token for estimation |

Methods

| Method | Returns | Description |
| --- | --- | --- |
| add(role, content) | None | Append a message and trim if over budget |
| get_messages() | list[dict] | Current message history (sync) |
| format_context() | str | Formatted conversation string (sync) |
| clear() | None | Clear all messages |

BufferMemory

BufferMemory is the simplest memory backend — an unbounded buffer that keeps all messages until cleared. No windowing, no trimming, no LLM calls.

Usage

from synapsekit.memory.buffer import BufferMemory

memory = BufferMemory()

memory.add("user", "Hello!")
memory.add("assistant", "Hi there!")

messages = memory.get_messages()
# [{"role": "user", "content": "Hello!"}, {"role": "assistant", "content": "Hi there!"}]

context = memory.format_context()
# "User: Hello!\nAssistant: Hi there!"

print(len(memory)) # 2
memory.clear()

Methods

| Method | Returns | Description |
| --- | --- | --- |
| add(role, content) | None | Append a message |
| get_messages() | list[dict] | All messages (copy) |
| format_context() | str | Formatted conversation string |
| clear() | None | Clear all messages |

EntityMemory

EntityMemory uses an LLM to extract named entities from each message and maintains running descriptions. Useful for tracking people, places, organizations, and concepts across a conversation.

Usage

from synapsekit.memory.entity import EntityMemory

memory = EntityMemory(llm=llm, max_entities=50)

await memory.add("user", "Alice works at Acme Corp in Paris.")
await memory.add("assistant", "That's great! Acme Corp is a tech company.")

# View tracked entities
entities = memory.get_entities()
# {"Alice": "A person who works at Acme Corp in Paris.",
# "Acme Corp": "A tech company located in Paris.",
# "Paris": "A city where Acme Corp is located."}

How it works

  1. When add() is called, the LLM extracts entity names from the message
  2. For each entity, the LLM generates or updates a running description
  3. Entities are stored in an OrderedDict — most recently updated at the end
  4. When the entity count exceeds max_entities, the oldest entities are evicted
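The OrderedDict bookkeeping in steps 3 and 4 can be sketched as follows; the LLM extraction step is omitted, so this illustrates only the recency tracking and eviction:

```python
from collections import OrderedDict

def update_entity(entities, name, description, max_entities=3):
    """Move an updated entity to the end of the OrderedDict and evict
    from the front when over capacity. Illustrative only."""
    entities[name] = description
    entities.move_to_end(name)  # most recently updated goes last
    while len(entities) > max_entities:
        entities.popitem(last=False)  # evict the least recently updated
    return entities

ents = OrderedDict()
for name in ["Alice", "Acme Corp", "Paris", "Bob"]:
    update_entity(ents, name, f"About {name}", max_entities=3)

# With max_entities=3, adding "Bob" evicts "Alice" (the oldest).
print(list(ents))  # ['Acme Corp', 'Paris', 'Bob']
```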

Formatting for prompts

format_context() includes both entities and messages:

context = memory.format_context()
# "Known entities:
# - Alice: A person who works at Acme Corp in Paris.
# - Acme Corp: A tech company located in Paris.
#
# User: Alice works at Acme Corp in Paris.
# Assistant: That's great! Acme Corp is a tech company."

Parameters

| Parameter | Default | Description |
| --- | --- | --- |
| llm | required | LLM instance for entity extraction and summarization |
| max_entities | 50 | Maximum entities to track (oldest evicted first) |

Methods

| Method | Returns | Description |
| --- | --- | --- |
| add(role, content) | None | Add message and extract/update entities (async) |
| get_messages() | list[dict] | All messages (copy) |
| get_entities() | dict[str, str] | Entity name → description mapping |
| format_context() | str | Entities section + messages |
| clear() | None | Clear messages and entities |

RedisConversationMemory

RedisConversationMemory provides persistent conversation memory backed by Redis. Messages survive process restarts and support multiple concurrent conversations.

pip install synapsekit[redis]

Usage

from synapsekit.memory.redis import RedisConversationMemory

memory = RedisConversationMemory(
    url="redis://localhost:6379",
    conversation_id="user-42",
    window=10,
)

memory.add("user", "What is SynapseKit?")
memory.add("assistant", "An async-first RAG framework.")

messages = memory.get_messages()
context = memory.format_context()

# List all conversations
conversations = memory.list_conversations()

# Clean up
memory.clear()
memory.close()

Parameters

| Parameter | Default | Description |
| --- | --- | --- |
| url | "redis://localhost:6379" | Redis connection URL |
| conversation_id | "default" | Conversation identifier for multi-conversation support |
| window | None | Max message pairs to keep (None = unlimited) |
| prefix | "synapsekit:memory:" | Redis key prefix for namespacing |
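The key scheme implied by prefix and conversation_id can be sketched without a Redis server; a plain dict stands in for Redis below, and the assumption that the real backend stores one list per conversation (e.g. via RPUSH and LTRIM) is ours, not documented:

```python
# Plain-dict stand-in for Redis, illustrating key namespacing and
# window trimming. Not SynapseKit's actual storage layout.
store = {}
PREFIX = "synapsekit:memory:"

def add(conversation_id, role, content, window=None):
    key = PREFIX + conversation_id  # one list per conversation
    store.setdefault(key, []).append({"role": role, "content": content})
    if window is not None:
        store[key] = store[key][-window * 2:]  # keep last `window` pairs

add("user-42", "user", "Hi")
add("user-42", "assistant", "Hello")
add("user-42", "user", "Again", window=1)
print(store[PREFIX + "user-42"])
```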

Methods

| Method | Description |
| --- | --- |
| add(role, content, metadata=None) | Append a message |
| get_messages() | Return all messages for this conversation |
| format_context() | Flatten history to a plain string |
| clear() | Delete all messages for this conversation |
| list_conversations() | List all conversation IDs |
| close() | Close the Redis connection |