Advanced Retrieval

Standard vector search is a starting point, not a ceiling. These seven guides cover the retrieval strategies that close the gap between a proof-of-concept RAG pipeline and one that holds up in production — when queries are ambiguous, documents are large, or the cost of a wrong answer is high.

Each guide is self-contained, includes a Google Colab notebook, and assumes you already have a working RAG pipeline. If you're new to RAG, start with RAG Fundamentals first.

Guides in this section

| Guide | Strategy | Best for | Difficulty |
| --- | --- | --- | --- |
| RAG Fusion | Multi-query + Reciprocal Rank Fusion | Queries with ambiguous or varied phrasing | Intermediate |
| GraphRAG | Entity extraction + relationship graph | Documents with dense entity relationships | Advanced |
| Self-RAG | Relevance grading + hallucination detection | High-stakes answers requiring verifiable grounding | Advanced |
| Parent Document Retriever | Small chunks retrieved, large chunks returned | Long documents where chunk context is lost | Intermediate |
| Query Decomposition | Sub-query generation + synthesis | Multi-part or multi-hop questions | Intermediate |
| Cross-Encoder Reranking | Retrieve wide, rerank narrow | Any pipeline where precision matters more than recall | Intermediate |
| Adaptive RAG | Complexity routing to fast/strong LLM | Cost-sensitive pipelines with mixed query difficulty | Advanced |

When to use which strategy

Your answers are inconsistent across phrasings — use RAG Fusion. Generating multiple query variants and fusing their ranked results smooths out sensitivity to wording.
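The fusion step is simple enough to sketch without any library code. This is a minimal, self-contained Reciprocal Rank Fusion implementation; the document IDs and the three ranked lists are illustrative, and `k=60` is the conventional RRF constant:

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of doc IDs into a single ranking.

    A document's fused score is the sum of 1 / (k + rank) over every
    list it appears in, so documents ranked consistently well across
    query variants rise to the top.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Three query variants produced three ranked lists of document IDs;
# "d2" never wins a single list outright but is consistently near the top:
fused = reciprocal_rank_fusion([
    ["d1", "d2", "d3"],
    ["d2", "d1", "d4"],
    ["d2", "d3", "d1"],
])
```

In a full pipeline, each ranked list would come from a vector search over one LLM-generated rephrasing of the user's query.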

Your documents are about entities and their relationships — use GraphRAG. A relationship graph lets you answer multi-hop questions that defeat flat vector search.

You need to cite sources and cannot tolerate hallucinations — use Self-RAG. The grading loop surfaces low-relevance retrievals and detects unsupported claims before they reach the user.
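The control flow of the grading loop matters more than any particular grader. The sketch below shows that flow with hypothetical keyword-based stubs standing in for the LLM-backed retriever and graders; only the structure is the point:

```python
def self_rag_answer(query, retrieve, grade_relevance, generate, grade_support):
    """Retrieve, keep only docs graded relevant, generate an answer,
    then verify the answer is supported before returning it."""
    docs = [d for d in retrieve(query) if grade_relevance(query, d)]
    if not docs:
        return None  # nothing relevant retrieved: refuse rather than guess
    answer = generate(query, docs)
    return answer if grade_support(answer, docs) else None

# Hypothetical stubs for illustration (an LLM would play each role):
corpus = ["Paris is the capital of France.", "Bananas are yellow."]
retrieve = lambda q: corpus
relevant = lambda q, d: any(w in d.lower() for w in q.lower().split())
generate = lambda q, docs: docs[0]
supported = lambda a, docs: a in docs

grounded = self_rag_answer("capital of France", retrieve, relevant, generate, supported)
refused = self_rag_answer("quantum physics", retrieve, relevant, generate, supported)
```

Returning `None` on a failed grade is the key design choice: the pipeline degrades to "I don't know" instead of an unsupported answer.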

Your chunks are too small to give the LLM enough context — use the Parent Document Retriever. Retrieve on fine-grained child chunks, but pass the full parent chunk to the model.
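The core of the pattern is a child-to-parent mapping alongside the search index. A minimal sketch, with keyword overlap standing in for vector search and all chunk contents invented for illustration:

```python
parents = {
    "p1": "Full section on retrieval strategies ... (long parent chunk)",
    "p2": "Full section on evaluation ... (long parent chunk)",
}

# Small child chunks are what gets indexed and searched; each one
# records the parent it was split from.
children = [
    ("retrieval strategies overview", "p1"),
    ("reciprocal rank fusion", "p1"),
    ("evaluation metrics", "p2"),
]

def retrieve_parents(query, top_k=2):
    """Search over child chunks, then return the deduplicated
    parent chunks those children belong to."""
    q = set(query.lower().split())
    scored = sorted(children, key=lambda c: -len(q & set(c[0].split())))
    seen, out = set(), []
    for _, pid in scored[:top_k]:
        if pid not in seen:
            seen.add(pid)
            out.append(parents[pid])
    return out
```

Searching fine-grained children keeps retrieval precise; returning the parent restores the surrounding context the LLM needs to answer well.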

Your users ask compound questions — use Query Decomposition. Breaking a question into focused sub-queries produces richer, more accurate retrieval than searching the compound question directly.
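The pipeline shape is decompose, retrieve per sub-query, synthesize. In the sketch below, a hypothetical rule-based split on " and " stands in for the LLM that would generate sub-queries, and the search/synthesis steps are stubs:

```python
def decompose(question):
    """Stand-in for an LLM sub-query generator: split on ' and '."""
    return [part.strip() for part in question.split(" and ")]

def decomposed_answer(question, search, synthesize):
    sub_queries = decompose(question)                 # 1. break the question up
    retrieved = {q: search(q) for q in sub_queries}   # 2. retrieve per sub-query
    return synthesize(question, retrieved)            # 3. synthesize one answer

# Stubs for illustration:
search = lambda q: [f"doc about {q}"]
synthesize = lambda q, ctx: " | ".join(d for docs in ctx.values() for d in docs)

result = decomposed_answer("pricing of plan A and limits of plan B", search, synthesize)
```

Each sub-query hits the index with a focused embedding, so the retrieved set covers both halves of the question instead of averaging them into one fuzzy vector.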

Your top-k results are noisy — use Cross-Encoder Reranking. Retrieve a large candidate set cheaply, then score each candidate against the query precisely.
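The two-stage shape is easy to show in isolation. Here a hypothetical word-overlap scorer stands in for a real cross-encoder (e.g. a pairwise relevance model that reads query and document together); the candidate texts are invented:

```python
def rerank(query, candidates, score, top_k=2):
    """Stage 2 of retrieve-wide/rerank-narrow: score every candidate
    against the query with a precise (but slower) pairwise scorer
    and keep only the best few."""
    return sorted(candidates, key=lambda d: score(query, d), reverse=True)[:top_k]

def overlap(query, doc):
    """Stand-in for a cross-encoder: count shared words."""
    return len(set(query.lower().split()) & set(doc.lower().split()))

candidates = [            # wide, cheap first-stage retrieval results
    "unrelated filler text",
    "reciprocal rank fusion explained",
    "rank fusion for retrieval",
]
top = rerank("reciprocal rank fusion", candidates, overlap)
```

The economics are the point: the cheap first stage can fetch 50-100 candidates, and the expensive scorer only runs on that shortlist rather than the whole corpus.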

Your query volume is high and queries vary in complexity — use Adaptive RAG. Route simple queries to a fast, cheap model and reserve a stronger model for queries that need it.
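The router is the only new moving part. In practice a small classifier LLM labels each query's complexity; the sketch below uses a hypothetical heuristic (query length plus a few analytical keywords) purely to show where the routing decision sits:

```python
def route(query):
    """Pick a model tier for this query.

    Heuristic stand-in for an LLM-based complexity classifier:
    long or analytical queries go to the strong model, everything
    else to the fast, cheap one.
    """
    analytical = any(w in query.lower() for w in ("compare", "why", "explain"))
    return "strong-model" if analytical or len(query.split()) > 12 else "fast-model"
```

Because most traffic in a mixed workload is simple lookups, even a crude router can shift the bulk of queries to the cheap tier while preserving quality on the hard ones.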

Prerequisites

  • Python 3.10+
  • pip install synapsekit
  • An OpenAI API key (set as OPENAI_API_KEY)
  • Familiarity with the RAG Fundamentals guide series

Common imports

Every guide in this section shares the same import baseline:

```python
import asyncio
import os

from synapsekit import RAG
from synapsekit.llms.openai import OpenAILLM
from synapsekit.embeddings.openai import OpenAIEmbeddings
from synapsekit.vectorstores.memory import InMemoryVectorStore
```