Knowledge, Context, and Retrieval

Memory: From Raw Data to Contextual Evidence

In MRP-VM, knowledge is an actively governed store of Context Units expressed in Context CNL. See DS008.

The Knowledge Base Ingest Pipeline

  1. Semantic Chunking (DS018) — segment documents into coherent fragments, either by Markdown headings or with a sliding window
  2. Context Normalization — enrich each chunk with pragmatic Roles (Comparison, Explanation, Narrative, ...) and UtilityActs via the active processing strategy
  3. Indexing — add each unit to the BM25 inverted index and the HDC/VSA vector cache
  4. Persistence (DS010) — write units atomically to data/kb/
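Step 1 above can be sketched as a heading-based chunker. This is a minimal illustration, not the DS018 API; the function name, the returned dict shape, and the heading regex are assumptions.

```python
import re

def chunk_by_headings(doc: str) -> list[dict]:
    """Split a Markdown document into fragments at heading boundaries
    (illustrative sketch of ingest step 1; not the DS018 implementation)."""
    chunks: list[dict] = []
    current = {"heading": None, "lines": []}
    for line in doc.splitlines():
        if re.match(r"^#{1,6}\s", line):
            # A new heading closes the previous fragment.
            if current["lines"]:
                chunks.append(current)
            current = {"heading": line.lstrip("# ").strip(), "lines": []}
        else:
            current["lines"].append(line)
    if current["lines"] or current["heading"] is not None:
        chunks.append(current)
    return [
        {"heading": c["heading"], "text": "\n".join(c["lines"]).strip()}
        for c in chunks
    ]
```

The sliding-window alternative mentioned in step 1 would instead emit fixed-size, overlapping spans regardless of heading structure.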

Two Evidence Stores

| Store | Scope | Lifetime | Source |
|---|---|---|---|
| Session Context | Per-session | Until session expires (30 min) | Extracted from user messages each turn |
| Persistent KB | Global | Until source deleted | Ingested documents (.md, .txt) |

Both stores are searched by the same retrieval strategies; session evidence receives a 1.15× score boost. See Session Lifecycle.
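The boost applies as a simple multiplier at ranking time, so session evidence outranks persistent-KB evidence of equal base relevance. A minimal sketch (the function name and store labels are illustrative, not the actual scoring API):

```python
SESSION_BOOST = 1.15  # multiplier for session-scoped evidence (per the table above)

def boosted_score(base_score: float, store: str) -> float:
    """Apply the session boost to a retrieval score.

    `store` is "session" for Session Context units, anything else
    for Persistent KB units. Labels are illustrative.
    """
    return base_score * SESSION_BOOST if store == "session" else base_score
```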

Retrieval Strategies

Two complementary strategies find evidence. For full technical details see Retrieval Algorithms.

| Strategy | Type | Strength |
|---|---|---|
| BM25 | Lexical (term frequency) | Precise on exact token matches |
| HDC/VSA | Associative (hypervectors) | Structural similarity when lexical overlap is partial |
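The two strategies can be sketched side by side: classic Okapi BM25 for the lexical path, and random bipolar hypervectors with bundling and cosine similarity for the associative path. Everything here is a textbook illustration under stated assumptions (pre-tokenized input, a 10,000-dimensional codebook seeded per token), not the engine's implementation:

```python
import math
import random
from collections import Counter

# --- BM25 (lexical): rewards exact token overlap, length-normalized ---
def bm25_score(query, doc, corpus, k1=1.2, b=0.75):
    """Okapi BM25 of `doc` for `query`; all inputs are token lists."""
    N = len(corpus)
    avgdl = sum(len(d) for d in corpus) / N
    tf = Counter(doc)
    score = 0.0
    for term in set(query):
        df = sum(1 for d in corpus if term in d)
        idf = math.log((N - df + 0.5) / (df + 0.5) + 1)
        norm = k1 * (1 - b + b * len(doc) / avgdl)
        score += idf * tf[term] * (k1 + 1) / (tf[term] + norm)
    return score

# --- HDC/VSA (associative): similarity survives partial lexical overlap ---
DIM = 10_000  # illustrative hypervector dimensionality

def hypervector(token, _cache={}):
    """Deterministic random ±1 vector per token (a toy codebook)."""
    if token not in _cache:
        rng = random.Random(token)
        _cache[token] = [rng.choice((-1, 1)) for _ in range(DIM)]
    return _cache[token]

def bundle(tokens):
    """Superpose token hypervectors; overlapping token sets stay similar."""
    return [sum(vals) for vals in zip(*(hypervector(t) for t in tokens))]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / ((na * nb) or 1.0)
```

BM25 gives zero credit to a document sharing no query tokens, while two bundles sharing most of their tokens stay close in cosine similarity even when a query term is missing, which is why the two strategies complement each other.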

KB Plugins

KB plugins control the precision/recall tradeoff:

| KB Plugin | Strategies | Results | Use Case |
|---|---|---|---|
| kb-fast | BM25 only | Top 3 | Simple questions, max precision |
| kb-balanced | BM25 + HDC escalation | Top 7 | Default, good tradeoff |
| kb-thinkingdb | BM25 + bounded symbolic expansion | Top 8 | Multi-hop or relation-sensitive retrieval |
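The table above can be mirrored as a plugin registry. The dataclass shape and strategy identifiers below are assumptions for illustration, not the DS025 configuration schema; only the plugin names and result counts come from the table.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class KBPlugin:
    name: str
    strategies: tuple  # retrieval strategies, in escalation order
    top_k: int         # number of results returned

# Illustrative registry mirroring the plugin table.
PLUGINS = {
    "kb-fast": KBPlugin("kb-fast", ("bm25",), 3),
    "kb-balanced": KBPlugin("kb-balanced", ("bm25", "hdc"), 7),
    "kb-thinkingdb": KBPlugin("kb-thinkingdb", ("bm25", "symbolic-expansion"), 8),
}
```

Moving down the table trades precision for recall: each plugin widens the result set and adds a fallback strategy on top of BM25.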

kb-thinkingdb is the current symbolic-expansion path, specified in DS025. A separate symbolic-execution strategy, ThinkingDB, offering stronger constraint-based pruning, remains future work.