
Conversation Memory

ConversationMemory maintains a sliding window of recent messages for multi-turn conversations.

Usage

from synapsekit.memory import ConversationMemory

memory = ConversationMemory(window_size=10)

memory.add_user("What is SynapseKit?")
memory.add_assistant("SynapseKit is an async-first RAG framework.")

memory.add_user("How do I install it?")
memory.add_assistant("Run: pip install synapsekit[openai]")

# Get full history as a list of dicts
history = memory.get()
# [{"role": "user", "content": "..."}, {"role": "assistant", "content": "..."}, ...]

# Clear memory
memory.clear()

Parameters

| Parameter | Default | Description |
| --- | --- | --- |
| window_size | 10 | Max number of message pairs to keep |

When the window fills up, the oldest messages are dropped automatically.
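The eviction behavior can be illustrated with a plain `collections.deque`, which drops its oldest entries once `maxlen` is reached. This is a sketch of the mechanism, not SynapseKit's actual implementation:

```python
from collections import deque

# A deque with a maxlen silently evicts its oldest entries, which is
# the behavior ConversationMemory's window provides for message pairs.
window_size = 3  # keep at most 3 message pairs
window = deque(maxlen=window_size)

for i in range(5):
    window.append((f"question {i}", f"answer {i}"))

# Only the 3 most recent pairs remain; pairs 0 and 1 were dropped.
print(list(window))
# [('question 2', 'answer 2'), ('question 3', 'answer 3'), ('question 4', 'answer 4')]
```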


HybridMemory

HybridMemory combines a sliding window of recent messages with an LLM-generated summary of older messages. This gives the model both precise recent context and compressed long-term context, reducing token usage while preserving important information.

Usage

from synapsekit.memory.hybrid import HybridMemory

memory = HybridMemory(llm=llm, window=5, summary_max_tokens=200)

memory.add("user", "What is SynapseKit?")
memory.add("assistant", "SynapseKit is an async-first RAG framework.")
memory.add("user", "How do I install it?")
memory.add("assistant", "Run: pip install synapsekit")
# ... many more messages ...

Getting messages with summary

When the conversation exceeds the window size, older messages are summarized:

# If <= window*2 messages, returns all messages unchanged
messages = await memory.get_messages_with_summary()

# If > window*2 messages, older messages are replaced by a summary:
# [
#   {"role": "system", "content": "Summary of earlier conversation:\n..."},
#   {"role": "user", "content": "most recent question"},
#   {"role": "assistant", "content": "most recent answer"},
#   ...
# ]
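The `window*2` threshold (each pair is two messages) determines which messages are kept verbatim and which feed the summary. A minimal sketch of that split, with the LLM call omitted:

```python
def split_for_summary(messages, window=5):
    """Sketch of the documented split: if the history fits in the window
    (window pairs = window*2 messages), nothing is summarized; otherwise
    the older messages become summary input and only the most recent
    window*2 messages are kept verbatim. Illustrative only."""
    if len(messages) <= window * 2:
        return [], messages  # nothing to summarize
    cutoff = len(messages) - window * 2
    return messages[:cutoff], messages[cutoff:]

msgs = [{"role": "user", "content": f"m{i}"} for i in range(12)]
older, recent = split_for_summary(msgs, window=5)
print(len(older), len(recent))  # 2 10
```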

Formatting for prompts

Use format_context() to get a formatted string suitable for prompt injection:

context = await memory.format_context()
# "System: Summary of earlier conversation:\n...\nUser: latest question\nAssistant: latest answer"

Other methods

# All messages (no summary)
all_msgs = memory.get_messages()

# Only the recent window
recent = memory.get_recent_messages()

# Current summary (empty until first summarization)
print(memory.summary)

# Message count
print(len(memory))

# Clear everything
memory.clear()

Parameters

| Parameter | Default | Description |
| --- | --- | --- |
| llm | required | LLM instance used for summarization |
| window | 5 | Number of message pairs to keep in the recent window |
| summary_max_tokens | 200 | Max tokens for the summary |

Methods

| Method | Returns | Description |
| --- | --- | --- |
| add(role, content) | None | Append a message |
| get_messages() | list[dict] | All messages, no summary |
| get_recent_messages() | list[dict] | Only the recent window of messages |
| get_messages_with_summary() | list[dict] | Recent messages + summary of older ones (async) |
| format_context() | str | Formatted conversation string with summary (async) |
| clear() | None | Clear all messages and summary |

SQLiteConversationMemory

SQLiteConversationMemory persists chat history to SQLite. Messages survive process restarts. Supports multiple conversations via conversation_id and an optional sliding window.

Usage

from synapsekit.memory.sqlite import SQLiteConversationMemory

memory = SQLiteConversationMemory(
    db_path="chat.db",
    conversation_id="user-1",
    window=10,  # Optional: keep only last N message pairs
)

memory.add("user", "Hello!")
memory.add("assistant", "Hi there!")

# Messages are persisted to disk
messages = memory.get_messages()
# [{"role": "user", "content": "Hello!"}, {"role": "assistant", "content": "Hi there!"}]
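The persistence model can be sketched with the standard-library sqlite3 module. The schema below (a messages table keyed by conversation_id) is an assumption for illustration, not SynapseKit's actual schema:

```python
import sqlite3

# Minimal sketch of SQLite-backed chat persistence. Table and column
# names are assumptions, not SynapseKit's real schema.
conn = sqlite3.connect(":memory:")  # use a file path for real persistence
conn.execute(
    """CREATE TABLE IF NOT EXISTS messages (
           id INTEGER PRIMARY KEY AUTOINCREMENT,
           conversation_id TEXT NOT NULL,
           role TEXT NOT NULL,
           content TEXT NOT NULL
       )"""
)

def add(conversation_id, role, content):
    conn.execute(
        "INSERT INTO messages (conversation_id, role, content) VALUES (?, ?, ?)",
        (conversation_id, role, content),
    )
    conn.commit()

def get_messages(conversation_id):
    rows = conn.execute(
        "SELECT role, content FROM messages WHERE conversation_id = ? ORDER BY id",
        (conversation_id,),
    )
    return [{"role": r, "content": c} for r, c in rows]

add("user-1", "user", "Hello!")
add("user-1", "assistant", "Hi there!")
print(get_messages("user-1"))
```

Because every row carries a conversation_id, several conversations can share one database file, which matches the multi-conversation behavior described below.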

Multiple conversations

# Each conversation_id has its own history
user1_memory = SQLiteConversationMemory(db_path="chat.db", conversation_id="user-1")
user2_memory = SQLiteConversationMemory(db_path="chat.db", conversation_id="user-2")

# List all conversations in the database
conversations = user1_memory.list_conversations()
# ["user-1", "user-2"]

Metadata support

memory.add("user", "Hello!", metadata={"timestamp": "2026-03-13", "source": "web"})

messages = memory.get_messages()
# [{"role": "user", "content": "Hello!", "metadata": {"timestamp": "2026-03-13", "source": "web"}}]

Formatting for prompts

context = memory.format_context()
# "User: Hello!\nAssistant: Hi there!"

Parameters

| Parameter | Default | Description |
| --- | --- | --- |
| db_path | "conversations.db" | Path to the SQLite database file |
| conversation_id | "default" | Identifier for this conversation |
| window | None | Max message pairs to keep (oldest are deleted) |

Methods

| Method | Returns | Description |
| --- | --- | --- |
| add(role, content, metadata=None) | None | Append a message |
| get_messages() | list[dict] | All messages for this conversation |
| format_context() | str | Formatted conversation string |
| clear() | None | Delete all messages for this conversation |
| list_conversations() | list[str] | All conversation IDs in the database |
| close() | None | Close the database connection |

SummaryBufferMemory

SummaryBufferMemory tracks approximate token count and progressively summarizes older messages when the buffer exceeds a token limit. Unlike HybridMemory (fixed window), this uses token estimation to decide when to summarize.

Usage

from synapsekit.memory.summary_buffer import SummaryBufferMemory

memory = SummaryBufferMemory(
    llm=llm,
    max_tokens=2000,
    chars_per_token=4,  # Estimation ratio
)

memory.add("user", "Hello!")
memory.add("assistant", "Hi there!")

# When under token limit, returns all messages as-is
messages = await memory.get_messages()

# When over limit, oldest message pairs are summarized
# and replaced with a system message containing the summary

How summarization works

  1. When get_messages() is called, the buffer's token count is estimated
  2. If under max_tokens, all messages are returned unchanged
  3. If over max_tokens, the oldest 2 messages are summarized by the LLM
  4. The summary is stored and prepended as a system message
  5. This repeats until the buffer is under the limit (keeping at least 2 messages)
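The loop in steps 3 to 5 can be sketched in plain Python. A simple string join stands in for the LLM summarization call, so this is illustrative only, not the library's implementation:

```python
def trim_with_summary(messages, summary, max_tokens, chars_per_token=4):
    """While the estimated token count is over budget and more than 2
    messages remain, fold the two oldest messages into the running
    summary. A string join stands in for the LLM call."""
    def estimate(msgs):
        return sum(len(m["content"]) for m in msgs) // chars_per_token

    while estimate(messages) > max_tokens and len(messages) > 2:
        folded = " ".join(m["content"] for m in messages[:2])
        summary = (summary + " " + folded).strip()  # stand-in for the LLM
        messages = messages[2:]
    return summary, messages

# Six ~10-token messages against a 25-token budget: two rounds of
# folding leave the last 2 messages plus a running summary.
msgs = [{"role": "user", "content": "x" * 40} for _ in range(6)]
summary, remaining = trim_with_summary(msgs, "", max_tokens=25)
print(len(remaining))  # 2
```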

Formatting for prompts

# Sync, no summarization — just flattens current state
context = memory.format_context()
# "Summary: ...\nUser: latest question\nAssistant: latest answer"

Parameters

| Parameter | Default | Description |
| --- | --- | --- |
| llm | required | LLM instance used for summarization |
| max_tokens | 2000 | Token budget for the buffer (must be >= 100) |
| chars_per_token | 4 | Characters per token for estimation |

Methods

| Method | Returns | Description |
| --- | --- | --- |
| add(role, content) | None | Append a message |
| get_messages() | list[dict] | Messages with automatic summarization (async) |
| format_context() | str | Formatted string (sync, no summarization) |
| clear() | None | Clear all messages and summary |
| summary (property) | str | The current running summary |

TokenBufferMemory

TokenBufferMemory tracks approximate token count and drops the oldest messages when the buffer exceeds a token limit. Unlike SummaryBufferMemory, this does not use an LLM — it simply discards the oldest messages to stay within budget.

Usage

from synapsekit.memory.token_buffer import TokenBufferMemory

memory = TokenBufferMemory(
    max_tokens=4000,
    chars_per_token=4,  # Estimation ratio
)

memory.add("user", "Hello!")
memory.add("assistant", "Hi there! How can I help you today?")

# Messages are returned as-is (no async, no LLM needed)
messages = memory.get_messages()
# [{"role": "user", "content": "Hello!"}, {"role": "assistant", "content": "..."}]

How it works

  1. When add() is called, the message is appended to the buffer
  2. The total token count is estimated (len(content) // chars_per_token)
  3. If the total exceeds max_tokens, the oldest messages are dropped one at a time until the buffer fits
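The steps above can be sketched in a few lines; this is an illustration of the trimming logic, not the library source:

```python
def trim_oldest(messages, max_tokens, chars_per_token=4):
    """Estimate tokens as len(content) // chars_per_token and drop the
    oldest messages until the buffer fits the budget."""
    def total(msgs):
        return sum(len(m["content"]) // chars_per_token for m in msgs)

    while messages and total(messages) > max_tokens:
        messages = messages[1:]  # drop the oldest message
    return messages

# Five ~10-token messages against a 25-token budget: the three oldest
# are dropped, leaving 2 messages (~20 tokens).
buf = [{"role": "user", "content": "x" * 40} for _ in range(5)]
print(len(trim_oldest(buf, max_tokens=25)))  # 2
```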

Formatting for prompts

context = memory.format_context()
# "User: Hello!\nAssistant: Hi there! How can I help you today?"

Parameters

| Parameter | Default | Description |
| --- | --- | --- |
| max_tokens | 4000 | Token budget for the buffer (must be >= 1) |
| chars_per_token | 4 | Characters per token for estimation |

Methods

| Method | Returns | Description |
| --- | --- | --- |
| add(role, content) | None | Append a message and trim if over budget |
| get_messages() | list[dict] | Current message history (sync) |
| format_context() | str | Formatted conversation string (sync) |
| clear() | None | Clear all messages |

BufferMemory

BufferMemory is the simplest memory backend — an unbounded buffer that keeps all messages until cleared. No windowing, no trimming, no LLM calls.

Usage

from synapsekit.memory.buffer import BufferMemory

memory = BufferMemory()

memory.add("user", "Hello!")
memory.add("assistant", "Hi there!")

messages = memory.get_messages()
# [{"role": "user", "content": "Hello!"}, {"role": "assistant", "content": "Hi there!"}]

context = memory.format_context()
# "User: Hello!\nAssistant: Hi there!"

print(len(memory)) # 2
memory.clear()

Methods

| Method | Returns | Description |
| --- | --- | --- |
| add(role, content) | None | Append a message |
| get_messages() | list[dict] | All messages (copy) |
| format_context() | str | Formatted conversation string |
| clear() | None | Clear all messages |

EntityMemory

EntityMemory uses an LLM to extract named entities from each message and maintains running descriptions. Useful for tracking people, places, organizations, and concepts across a conversation.

Usage

from synapsekit.memory.entity import EntityMemory

memory = EntityMemory(llm=llm, max_entities=50)

await memory.add("user", "Alice works at Acme Corp in Paris.")
await memory.add("assistant", "That's great! Acme Corp is a tech company.")

# View tracked entities
entities = memory.get_entities()
# {"Alice": "A person who works at Acme Corp in Paris.",
# "Acme Corp": "A tech company located in Paris.",
# "Paris": "A city where Acme Corp is located."}

How it works

  1. When add() is called, the LLM extracts entity names from the message
  2. For each entity, the LLM generates or updates a running description
  3. Entities are stored in an OrderedDict — most recently updated at the end
  4. When the entity count exceeds max_entities, the oldest entities are evicted
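The OrderedDict bookkeeping in steps 3 and 4 can be sketched as follows; the LLM extraction step is omitted, so this illustrates only the recency tracking and eviction:

```python
from collections import OrderedDict

def update_entity(entities, name, description, max_entities=3):
    """Move an updated entity to the end of the OrderedDict and evict
    from the front when over capacity. Illustrative only."""
    entities[name] = description
    entities.move_to_end(name)  # most recently updated goes last
    while len(entities) > max_entities:
        entities.popitem(last=False)  # evict the least recently updated
    return entities

ents = OrderedDict()
for name in ["Alice", "Acme Corp", "Paris", "Bob"]:
    update_entity(ents, name, f"About {name}", max_entities=3)

# With max_entities=3, adding "Bob" evicts "Alice" (the oldest).
print(list(ents))  # ['Acme Corp', 'Paris', 'Bob']
```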

Formatting for prompts

format_context() includes both entities and messages:

context = memory.format_context()
# "Known entities:
# - Alice: A person who works at Acme Corp in Paris.
# - Acme Corp: A tech company located in Paris.
#
# User: Alice works at Acme Corp in Paris.
# Assistant: That's great! Acme Corp is a tech company."

Parameters

| Parameter | Default | Description |
| --- | --- | --- |
| llm | required | LLM instance for entity extraction and summarization |
| max_entities | 50 | Maximum entities to track (oldest evicted first) |

Methods

| Method | Returns | Description |
| --- | --- | --- |
| add(role, content) | None | Add message and extract/update entities (async) |
| get_messages() | list[dict] | All messages (copy) |
| get_entities() | dict[str, str] | Entity name → description mapping |
| format_context() | str | Entities section + messages |
| clear() | None | Clear messages and entities |

RedisConversationMemory

RedisConversationMemory provides persistent conversation memory backed by Redis. Messages survive process restarts and support multiple concurrent conversations.

pip install synapsekit[redis]

Usage

from synapsekit.memory.redis import RedisConversationMemory

memory = RedisConversationMemory(
    url="redis://localhost:6379",
    conversation_id="user-42",
    window=10,
)

memory.add("user", "What is SynapseKit?")
memory.add("assistant", "An async-first RAG framework.")

messages = memory.get_messages()
context = memory.format_context()

# List all conversations
conversations = memory.list_conversations()

# Clean up
memory.clear()
memory.close()

Parameters

| Parameter | Default | Description |
| --- | --- | --- |
| url | "redis://localhost:6379" | Redis connection URL |
| conversation_id | "default" | Conversation identifier for multi-conversation support |
| window | None | Max message pairs to keep (None = unlimited) |
| prefix | "synapsekit:memory:" | Redis key prefix for namespacing |
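The key scheme implied by prefix and conversation_id can be sketched without a Redis server; a plain dict stands in for Redis below, and the assumption that the real backend stores one list per conversation (e.g. via RPUSH and LTRIM) is ours, not documented:

```python
# Plain-dict stand-in for Redis, illustrating key namespacing and
# window trimming. Not SynapseKit's actual storage layout.
store = {}
PREFIX = "synapsekit:memory:"

def add(conversation_id, role, content, window=None):
    key = PREFIX + conversation_id  # one list per conversation
    store.setdefault(key, []).append({"role": role, "content": content})
    if window is not None:
        store[key] = store[key][-window * 2:]  # keep last `window` pairs

add("user-42", "user", "Hi")
add("user-42", "assistant", "Hello")
add("user-42", "user", "Again", window=1)
print(store[PREFIX + "user-42"])
```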

Methods

| Method | Description |
| --- | --- |
| add(role, content, metadata=None) | Append a message |
| get_messages() | Return all messages for this conversation |
| format_context() | Flatten history to a plain string |
| clear() | Delete all messages for this conversation |
| list_conversations() | List all conversation IDs |
| close() | Close the Redis connection |