Meta-Rational Planning DS029

The mrp-plan-plugin is the decision layer that orders plugin execution per request. It decides which seed detector, KB plugin, and goal solver to try first — and in what fallback order.

Why "Meta-Rational"?

The planner reasons about reasoning strategies rather than reasoning about the task directly. It selects which combination of plugins is most likely to succeed for a given request, based on task signals, historical outcomes, and plugin self-descriptions.

Built-in Planners

ID	Style	When Used
`planner-default`	Adaptive cheap-first	Primary planner. Tries cheap plugins first, escalates on failure.
`planner-depth`	Heavy-first fallback	Used when `planner-default` fails. Starts with expensive deep plugins.

How the Planner Decides

The planner combines four information sources to rank candidates per stage:

Task signals — regex-based cues from the user message: depth requests, speed requests, symbolic cues, retrieval-heavy patterns, topic tags (legal, literature, procedural, technical, symbolic)
Plugin self-description — plannerHints in the plugin descriptor: supported acts, topic tags, evidence style, preferred depth, expected latency/cost
Historical EWMA stats — per-plugin success rate, latency, LLM cost, sufficiency rate from the shared stats store
Default priors — cheap-first ordering from config/plugins.json
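
As an illustration of the first source, task signals could be extracted roughly as follows. This is a minimal sketch, not the planner's actual code: the `TaskSignals` shape, the `detectTaskSignals` function, and every regex pattern are hypothetical stand-ins, since the real cue set is not listed here.

```typescript
// Hypothetical cue patterns; the planner's real regex set is not shown in this document.
interface TaskSignals {
  wantsDepth: boolean;
  wantsSpeed: boolean;
  topicTags: string[];
}

// Assumed topic cues, one pattern per tag (only three of the five tags are sketched).
const TOPIC_PATTERNS: Array<{ tag: string; pattern: RegExp }> = [
  { tag: "legal", pattern: /\b(contract|statute|liability|clause)\b/i },
  { tag: "technical", pattern: /\b(api|stack trace|compiler|config)\b/i },
  { tag: "symbolic", pattern: /\b(prove|theorem|equation)\b/i },
];

function detectTaskSignals(message: string): TaskSignals {
  return {
    // Depth and speed requests are detected independently; both may be false.
    wantsDepth: /\b(in depth|detailed|thorough|step by step)\b/i.test(message),
    wantsSpeed: /\b(quick|quickly|briefly|short answer)\b/i.test(message),
    // Collect every topic tag whose pattern matches the message.
    topicTags: TOPIC_PATTERNS
      .filter((entry) => entry.pattern.test(message))
      .map((entry) => entry.tag),
  };
}

console.log(detectTaskSignals("Give me a quick summary of this contract clause"));
```

For that message the sketch flags a speed request and the legal topic tag, which a cheap-first planner could use to keep shallow, low-latency plugins at the front of the order.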

plannerHints Fields

Field	Type	Purpose
`supportedActs`	string[]	Pragmatic acts this plugin handles well
`topicTags`	string[]	Domain tags: legal, literature, procedural, technical, symbolic
`evidenceStyle`	string[]	Retrieval approach: hybrid, symbolic-facts, lexical-only
`preferredDepth`	string	shallow | medium | deep
`fallbackRole`	string	cheap-probe | default | heavy-recovery
`expectedLatencyMs`	number	Estimated latency for scoring
`expectedLLMCalls`	number	Estimated LLM calls for scoring
`relativeCost`	number	0–1 cost factor for budget-aware ranking
`confidenceWhenMatched`	number	0–1 self-assessed confidence when task matches
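
The fields above could be typed as shown below. This is a sketch under assumptions: the field names and enumerated values come from the table, but the `PlannerHints` interface name, the `validateHints` helper, and all example values (including the `supportedActs` entries) are hypothetical.

```typescript
type PreferredDepth = "shallow" | "medium" | "deep";
type FallbackRole = "cheap-probe" | "default" | "heavy-recovery";

interface PlannerHints {
  supportedActs: string[];
  topicTags: string[];
  evidenceStyle: string[];
  preferredDepth: PreferredDepth;
  fallbackRole: FallbackRole;
  expectedLatencyMs: number;
  expectedLLMCalls: number;
  relativeCost: number;          // expected range: 0 to 1
  confidenceWhenMatched: number; // expected range: 0 to 1
}

// Illustrative values only; not taken from any real plugin descriptor.
const exampleHints: PlannerHints = {
  supportedActs: ["question", "lookup"], // hypothetical act names
  topicTags: ["technical", "procedural"],
  evidenceStyle: ["lexical-only"],
  preferredDepth: "shallow",
  fallbackRole: "cheap-probe",
  expectedLatencyMs: 150,
  expectedLLMCalls: 0,
  relativeCost: 0.1,
  confidenceWhenMatched: 0.6,
};

// Hypothetical sanity check for the numeric ranges documented in the table.
function validateHints(hints: PlannerHints): string[] {
  const problems: string[] = [];
  const inUnitRange = (x: number): boolean => x >= 0 && x <= 1;
  if (!inUnitRange(hints.relativeCost)) problems.push("relativeCost must be in [0, 1]");
  if (!inUnitRange(hints.confidenceWhenMatched)) problems.push("confidenceWhenMatched must be in [0, 1]");
  if (hints.expectedLatencyMs < 0) problems.push("expectedLatencyMs must be non-negative");
  if (hints.expectedLLMCalls < 0) problems.push("expectedLLMCalls must be non-negative");
  return problems;
}

console.log(validateHints(exampleHints)); // an empty list means no range violations
```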

Adaptive Learning

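After each request, the planner updates each plugin's EWMA (exponentially weighted moving average) statistics with the observed outcome. Plugins are then ranked by a utility score derived from those statistics: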

utility = successEWMA + 0.15 × sufficiencyEWMA
        - latencyPenalty - llmCostPenalty

Stats are tracked per plugin ID: attempts, successes, failures, insufficiency count, latency EWMA, LLM call EWMA. Cold-start priors favor cheaper plugins.
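
The update and scoring steps might look roughly like this. Only the utility formula and the 0.15 sufficiency weight come from the text above; the smoothing factor `ALPHA`, the penalty caps and weight, and the names `PluginStats`, `Outcome`, `recordOutcome`, and `utility` are assumptions made for the sketch.

```typescript
interface PluginStats {
  attempts: number;
  successEWMA: number;
  sufficiencyEWMA: number;
  latencyEWMA: number;  // milliseconds
  llmCallsEWMA: number;
}

interface Outcome {
  success: boolean;
  sufficient: boolean;
  latencyMs: number;
  llmCalls: number;
}

const ALPHA = 0.2;            // assumed smoothing factor
const LATENCY_CAP_MS = 10000; // assumed: latency at or above this gets the full penalty
const LLM_CALLS_CAP = 5;      // assumed: call counts at or above this get the full penalty
const PENALTY_WEIGHT = 0.2;   // assumed maximum size of each penalty

// Standard EWMA step: blend the new sample into the running average.
function ewma(prev: number, sample: number, alpha: number = ALPHA): number {
  return alpha * sample + (1 - alpha) * prev;
}

// Returns a new stats record; the input is not mutated.
function recordOutcome(stats: PluginStats, outcome: Outcome): PluginStats {
  return {
    attempts: stats.attempts + 1,
    successEWMA: ewma(stats.successEWMA, outcome.success ? 1 : 0),
    sufficiencyEWMA: ewma(stats.sufficiencyEWMA, outcome.sufficient ? 1 : 0),
    latencyEWMA: ewma(stats.latencyEWMA, outcome.latencyMs),
    llmCallsEWMA: ewma(stats.llmCallsEWMA, outcome.llmCalls),
  };
}

// utility = successEWMA + 0.15 * sufficiencyEWMA - latencyPenalty - llmCostPenalty
function utility(stats: PluginStats): number {
  const latencyPenalty = PENALTY_WEIGHT * Math.min(stats.latencyEWMA / LATENCY_CAP_MS, 1);
  const llmCostPenalty = PENALTY_WEIGHT * Math.min(stats.llmCallsEWMA / LLM_CALLS_CAP, 1);
  return stats.successEWMA + 0.15 * stats.sufficiencyEWMA - latencyPenalty - llmCostPenalty;
}
```

With linear, capped penalties like these, two plugins with equal success rates are separated by latency and LLM usage, which matches the cheap-first bias described for cold-start priors.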

Planner-level stats are also recorded, allowing the engine to rank planner candidates when no explicit planner is pinned.

Cross-Planner Backtracking

The engine maintains a plannerFallbackOrder (default: planner-default → planner-depth). If the first planner's plan fails with a retryable error, the engine tries the next planner with the remaining budget. Explicit planner pinning disables this backtracking.
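
A minimal sketch of this backtracking loop follows. It assumes a synchronous planner signature and a single numeric budget; `PlanResult`, `Planner`, and `runWithBacktracking` are hypothetical names, and the engine's real interfaces may differ.

```typescript
type PlanResult =
  | { status: "ok"; answer: string; costUsed: number }
  | { status: "error"; retryable: boolean; costUsed: number };

// Hypothetical planner signature: runs a full plan within the given budget.
type Planner = (request: string, budget: number) => PlanResult;

function runWithBacktracking(
  request: string,
  planners: Record<string, Planner>,
  fallbackOrder: string[],
  budget: number,
  pinnedPlanner?: string,
): { plannerId: string; answer: string } | null {
  // An explicitly pinned planner disables backtracking: only that planner runs.
  const order = pinnedPlanner ? [pinnedPlanner] : fallbackOrder;
  let remaining = budget;
  for (const id of order) {
    const planner = planners[id];
    if (!planner || remaining <= 0) break;
    const result = planner(request, remaining);
    remaining -= result.costUsed; // the next planner only gets what is left
    if (result.status === "ok") return { plannerId: id, answer: result.answer };
    if (!result.retryable) return null; // non-retryable errors stop the chain
  }
  return null;
}
```

Passing the remaining budget to each planner, rather than the original budget, keeps a failed first attempt from doubling the total cost of a request.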

Weak Outcome Recovery

If a planner produces only a no-context answer after insufficient KB retrieval, the engine escalates to the next planner. If all planners produce weak results, the best weak answer is accepted rather than returning an error.
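
The recovery rule can be sketched as below. The `weak` flag and the `score` used to compare weak answers are assumptions, since the text does not say how "best" is measured; `PlannerAnswer` and `resolveAnswer` are hypothetical names.

```typescript
interface PlannerAnswer {
  plannerId: string;
  text: string;
  weak: boolean;  // true for a no-context answer after insufficient KB retrieval
  score: number;  // hypothetical quality score used only to compare weak answers
}

function resolveAnswer(
  fallbackOrder: string[],
  run: (plannerId: string) => PlannerAnswer,
): PlannerAnswer | null {
  let bestWeak: PlannerAnswer | null = null;
  for (const id of fallbackOrder) {
    const answer = run(id);
    // A non-weak answer ends the search immediately.
    if (!answer.weak) return answer;
    // Otherwise remember the strongest weak answer and escalate to the next planner.
    if (bestWeak === null || answer.score > bestWeak.score) bestWeak = answer;
  }
  // Every planner was weak: return the best weak answer instead of an error.
  // The result is null only when the fallback order is empty.
  return bestWeak;
}
```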

Configuration

// config/plugins.json
{
  "defaultPlannerPlugin": "planner-default",
  "plannerFallbackOrder": ["planner-default", "planner-depth"],
  "defaultSeedDetectorPlan": ["sd-symbolic", "sd-llm-fast", "sd-llm-deep"],
  "defaultKBPlan": ["kb-fast", "kb-balanced", "kb-thinkingdb"],
  "defaultGoalSolverPlan": ["gs-symbolic", "gs-llm-fast", "gs-llm-deep"]
}
