Knowledge, Context, and Retrieval

Memory: From Raw Data to Contextual Evidence

In MRP-VM, knowledge is an actively governed store of Context Units expressed in Context CNL. See DS008.

The Knowledge Base Ingest Pipeline

  1. Semantic Chunking (DS018) — segment documents into coherent fragments, either by Markdown headings or with a sliding window
  2. Context Normalization — enrich each chunk with pragmatic Roles (Comparison, Explanation, Narrative, ...) and UtilityActs via the active processing strategy
  3. Indexing — add each unit to the BM25 inverted index and the HDC/VSA vector cache
  4. Persistence (DS010) — write units atomically to data/kb/
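Step 1 above can be sketched as a heading-based chunker. This is a minimal illustration, not the DS018 API; the function name, the returned dict shape, and the heading regex are assumptions.

```python
import re

def chunk_by_headings(doc: str) -> list[dict]:
    """Split a Markdown document into fragments at heading boundaries
    (illustrative sketch of ingest step 1; not the DS018 implementation)."""
    chunks: list[dict] = []
    current = {"heading": None, "lines": []}
    for line in doc.splitlines():
        if re.match(r"^#{1,6}\s", line):
            # A new heading closes the previous fragment.
            if current["lines"]:
                chunks.append(current)
            current = {"heading": line.lstrip("# ").strip(), "lines": []}
        else:
            current["lines"].append(line)
    if current["lines"] or current["heading"] is not None:
        chunks.append(current)
    return [
        {"heading": c["heading"], "text": "\n".join(c["lines"]).strip()}
        for c in chunks
    ]
```

The sliding-window alternative mentioned in step 1 would instead emit fixed-size, overlapping spans regardless of heading structure.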

Two Evidence Stores

| Store | Scope | Lifetime | Source |
|---|---|---|---|
| Session Context | Per-session | Until session expires (30 min) | Extracted from user messages each turn |
| Persistent KB | Global | Until source deleted | Ingested documents (.md, .txt) |

Both stores are searched by the same retrieval strategies; session evidence receives a 1.15× score boost. See Session Lifecycle.
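The boost applies as a simple multiplier at ranking time, so session evidence outranks persistent-KB evidence of equal base relevance. A minimal sketch (the function name and store labels are illustrative, not the actual scoring API):

```python
SESSION_BOOST = 1.15  # multiplier for session-scoped evidence (per the table above)

def boosted_score(base_score: float, store: str) -> float:
    """Apply the session boost to a retrieval score.

    `store` is "session" for Session Context units, anything else
    for Persistent KB units. Labels are illustrative.
    """
    return base_score * SESSION_BOOST if store == "session" else base_score
```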

Retrieval Strategies

Two complementary strategies find evidence. For full technical details see Retrieval Algorithms.

| Strategy | Type | Strength |
|---|---|---|
| BM25 | Lexical (term frequency) | Precise on exact token matches |
| HDC/VSA | Associative (hypervectors) | Structural similarity when lexical overlap is partial |
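The two strategies can be sketched side by side: classic Okapi BM25 for the lexical path, and random bipolar hypervectors with bundling and cosine similarity for the associative path. Everything here is a textbook illustration under stated assumptions (pre-tokenized input, a 10,000-dimensional codebook seeded per token), not the engine's implementation:

```python
import math
import random
from collections import Counter

# --- BM25 (lexical): rewards exact token overlap, length-normalized ---
def bm25_score(query, doc, corpus, k1=1.2, b=0.75):
    """Okapi BM25 of `doc` for `query`; all inputs are token lists."""
    N = len(corpus)
    avgdl = sum(len(d) for d in corpus) / N
    tf = Counter(doc)
    score = 0.0
    for term in set(query):
        df = sum(1 for d in corpus if term in d)
        idf = math.log((N - df + 0.5) / (df + 0.5) + 1)
        norm = k1 * (1 - b + b * len(doc) / avgdl)
        score += idf * tf[term] * (k1 + 1) / (tf[term] + norm)
    return score

# --- HDC/VSA (associative): similarity survives partial lexical overlap ---
DIM = 10_000  # illustrative hypervector dimensionality

def hypervector(token, _cache={}):
    """Deterministic random ±1 vector per token (a toy codebook)."""
    if token not in _cache:
        rng = random.Random(token)
        _cache[token] = [rng.choice((-1, 1)) for _ in range(DIM)]
    return _cache[token]

def bundle(tokens):
    """Superpose token hypervectors; overlapping token sets stay similar."""
    return [sum(vals) for vals in zip(*(hypervector(t) for t in tokens))]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / ((na * nb) or 1.0)
```

BM25 gives zero credit to a document sharing no query tokens, while two bundles sharing most of their tokens stay close in cosine similarity even when a query term is missing, which is why the two strategies complement each other.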

KB Plugins

KB plugins control the precision/recall tradeoff:

| KB Plugin | Strategies | Results | Use Case |
|---|---|---|---|
| kb-fast | BM25 only | Top 3 | Simple questions, max precision |
| kb-balanced | BM25 + HDC escalation | Top 7 | Default, good tradeoff |
| kb-thinkingdb | BM25 + bounded symbolic expansion | Top 8 | Multi-hop or relation-sensitive retrieval |
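The table above can be mirrored as a plugin registry. The dataclass shape and strategy identifiers below are assumptions for illustration, not the DS025 configuration schema; only the plugin names and result counts come from the table.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class KBPlugin:
    name: str
    strategies: tuple  # retrieval strategies, in escalation order
    top_k: int         # number of results returned

# Illustrative registry mirroring the plugin table.
PLUGINS = {
    "kb-fast": KBPlugin("kb-fast", ("bm25",), 3),
    "kb-balanced": KBPlugin("kb-balanced", ("bm25", "hdc"), 7),
    "kb-thinkingdb": KBPlugin("kb-thinkingdb", ("bm25", "symbolic-expansion"), 8),
}
```

Moving down the table trades precision for recall: each plugin widens the result set and adds a fallback strategy on top of BM25.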

kb-thinkingdb is the current symbolic-expansion path, specified in DS025. A separate symbolic-execution strategy, ThinkingDB, offering stronger constraint-based pruning, remains future work.