RAG Pipeline

ContextBrain provides the retrieval backend for RAG. ContextRouter’s Cortex Pipeline orchestrates the full RAG flow, while Brain handles storage and search.

RAG Architecture

[Diagram: brain-rag]

Hybrid Search

Brain combines two search strategies:

1. Vector Similarity (Dense)

Semantic search using pgvector cosine distance (the <=> operator; smaller values mean closer matches):

-- query_embedding stands for the embedding of the user's query, bound as a parameter
SELECT content, embedding <=> query_embedding AS distance
FROM knowledge
WHERE tenant_id = $1
ORDER BY embedding <=> query_embedding
LIMIT $2;
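
To make the ordering concrete, the sketch below computes in plain Python the cosine distance that pgvector's <=> operator returns (1 minus cosine similarity) and ranks rows by it. The names cosine_distance and nearest are illustrative and not part of Brain, and raising on zero vectors is a choice made for this sketch only.

```python
import math


def cosine_distance(a, b):
    """Cosine distance = 1 - cosine similarity (0 = same direction, 2 = opposite)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # Sketch-only choice: the distance is undefined for zero vectors.
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / (norm_a * norm_b)


def nearest(query, rows, limit):
    """rows is a list of (content, embedding); return the `limit` closest rows."""
    scored = [(content, cosine_distance(emb, query)) for content, emb in rows]
    scored.sort(key=lambda pair: pair[1])  # ascending distance, like ORDER BY ... LIMIT
    return scored[:limit]
```

Sorting by ascending distance mirrors the ORDER BY ... LIMIT pattern in the query above; in the database the same ordering can be served by a vector index rather than a full scan.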

2. Full-Text Search (Sparse)

Keyword matching using PostgreSQL tsvector:

-- $1 is the raw search text; plainto_tsquery parses it into a tsquery
SELECT content, ts_rank(search_vector, plainto_tsquery($1)) AS rank
FROM knowledge
WHERE search_vector @@ plainto_tsquery($1)
ORDER BY rank DESC;

Results from both paths are merged and reranked.
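
This page does not say how the two result lists are merged. One common technique for combining a dense and a sparse ranking is Reciprocal Rank Fusion (RRF), sketched below as an assumption rather than as Brain's confirmed method. The function name, the document ids, and the constant k=60 (a conventional default for RRF) are illustrative.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. Higher fused score ranks first.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by id so the output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


dense_hits = ["a", "b", "c"]   # hypothetical ids from the vector path
sparse_hits = ["b", "d", "a"]  # hypothetical ids from the full-text path
fused = reciprocal_rank_fusion([dense_hits, sparse_hits])
```

RRF uses only rank positions, so it sidesteps the fact that cosine distance and ts_rank are on different scales. A separate reranking model could then reorder the fused list.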

Ingestion

The ingestion pipeline in contextbrain/ingestion/rag/ handles:

- Document processing: split, chunk, clean
- Embedding generation: via the configured embedder (OpenAI or local)
- Storage: vector + metadata into PostgreSQL
- Taxonomy classification: automatic category assignment
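
As an illustration of the document-processing step, here is a minimal sketch of cleaning and chunking text before embedding. It assumes fixed-size word windows with overlap, which may differ from what contextbrain/ingestion/rag/ actually does; the function names and default sizes are hypothetical.

```python
import re


def clean(text):
    """Collapse runs of whitespace into single spaces and trim the ends."""
    return re.sub(r"\s+", " ", text).strip()


def chunk_words(text, chunk_size=200, overlap=20):
    """Split text into chunks of up to `chunk_size` words.

    Consecutive chunks share `overlap` words so that a sentence cut at a
    boundary still appears whole in at least one chunk.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    cleaned = clean(text)
    words = cleaned.split(" ") if cleaned else []
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks
```

Each chunk would then be passed to the configured embedder and stored alongside its metadata.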

News Engine

Brain also includes a specialised news engine. News items are stored with UpsertNewsItem and read back as a stream with GetNewsItems:

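from contextcore import ContextUnit, context_unit_pb2

# `stub` is assumed to be an already-connected gRPC client stub for Brain,
# created elsewhere; its construction is not shown here.

# Store a news fact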
unit = ContextUnit(payload={
    "tenant_id": "my_project",
    "title": "AI Breakthrough",
    "content": "Researchers achieved...",
    "category": "technology",
    "source_url": "https://...",
})
stub.UpsertNewsItem(unit.to_protobuf(context_unit_pb2))

# Retrieve news (streaming)
unit = ContextUnit(payload={
    "tenant_id": "my_project",
    "category": "technology",
    "limit": 10,
})
for item_pb in stub.GetNewsItems(unit.to_protobuf(context_unit_pb2)):
    item = ContextUnit.from_protobuf(item_pb)
    print(item.payload)