Meta-Rational Planning DS029

The mrp-plan-plugin is the decision layer that orders plugin execution per request. It decides which seed detector, KB plugin, and goal solver to try first — and in what fallback order.

Why "Meta-Rational"?

The planner reasons about reasoning strategies rather than reasoning about the task directly. It selects which combination of plugins is most likely to succeed for a given request, based on task signals, historical outcomes, and plugin self-descriptions.

Built-in Planners

ID               Style                 When Used
---------------  --------------------  ---------------------------------------------
planner-default  Adaptive cheap-first  Primary planner. Tries cheap plugins first, escalates on failure.
planner-depth    Heavy-first fallback  Used when planner-default fails. Starts with expensive deep plugins.

How the Planner Decides

The planner combines four information sources to rank candidates per stage:

  1. Task signals: regex-based cues from the user message, including depth requests, speed requests, symbolic cues, retrieval-heavy patterns, and topic tags (legal, literature, procedural, technical, symbolic)
  2. Plugin self-description: plannerHints in the plugin descriptor, covering supported acts, topic tags, evidence style, preferred depth, and expected latency/cost
  3. Historical EWMA stats: per-plugin success rate, latency, LLM cost, and sufficiency rate from the shared stats store
  4. Default priors: cheap-first ordering from config/plugins.json
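
As a rough illustration of how these four sources could be blended, the sketch below extracts regex-based task signals and scores candidates on topic match, depth match, historical utility, and a cheap-first prior. The regex patterns, the weights, the candidate values, and every name (detectSignals, rankCandidates, Candidate) are assumptions made for illustration, not the planner's actual rules.

```typescript
// Hypothetical sketch: patterns, weights, and names are illustrative assumptions.
type Depth = "shallow" | "medium" | "deep";

interface TaskSignals {
  wantsDepth: boolean;
  wantsSpeed: boolean;
  topicTags: string[];
}

interface Candidate {
  id: string;
  topicTags: string[];       // from plannerHints
  preferredDepth: Depth;     // from plannerHints
  historicalUtility: number; // derived from EWMA stats
  priorRank: number;         // position in the default cheap-first order (0 = first)
}

// Assumed cue patterns; the real regexes are not documented here.
const TOPIC_PATTERNS: Record<string, RegExp> = {
  legal: /\b(contract|statute|liability|clause)\b/i,
  technical: /\b(api|stack trace|compile|config)\b/i,
};

function detectSignals(message: string): TaskSignals {
  return {
    wantsDepth: /\b(in depth|thorough|detailed)\b/i.test(message),
    wantsSpeed: /\b(quick|briefly|short answer)\b/i.test(message),
    topicTags: Object.entries(TOPIC_PATTERNS)
      .filter(([, pattern]) => pattern.test(message))
      .map(([tag]) => tag),
  };
}

function score(c: Candidate, s: TaskSignals): number {
  const topicMatch = c.topicTags.some((t) => s.topicTags.includes(t)) ? 1 : 0;
  const depthMatch =
    (s.wantsDepth && c.preferredDepth === "deep") ||
    (s.wantsSpeed && c.preferredDepth === "shallow")
      ? 1
      : 0;
  const prior = 1 / (1 + c.priorRank); // earlier default position scores higher
  return 0.4 * topicMatch + 0.3 * depthMatch + 0.2 * c.historicalUtility + 0.1 * prior;
}

function rankCandidates(candidates: Candidate[], s: TaskSignals): Candidate[] {
  // Copy before sorting so the caller's array is left untouched.
  return [...candidates].sort((a, b) => score(b, s) - score(a, s));
}

// Usage with made-up candidate values:
const signals = detectSignals("Give me a thorough review of this contract clause");
const ranked = rankCandidates(
  [
    { id: "kb-fast", topicTags: ["technical"], preferredDepth: "shallow", historicalUtility: 0.6, priorRank: 0 },
    { id: "kb-thinkingdb", topicTags: ["legal"], preferredDepth: "deep", historicalUtility: 0.5, priorRank: 2 },
  ],
  signals,
);
console.log(ranked.map((c) => c.id));
```

In this example the depth request and the legal topic cue outweigh the cheap-first prior, so the deeper plugin is ranked first.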

plannerHints Fields

Field                  Type      Purpose
---------------------  --------  ---------------------------------------------
supportedActs          string[]  Pragmatic acts this plugin handles well
topicTags              string[]  Domain tags: legal, literature, procedural, technical, symbolic
evidenceStyle          string[]  Retrieval approach: hybrid, symbolic-facts, lexical-only
preferredDepth         string    shallow | medium | deep
fallbackRole           string    cheap-probe | default | heavy-recovery
expectedLatencyMs      number    Estimated latency for scoring
expectedLLMCalls       number    Estimated LLM calls for scoring
relativeCost           number    0–1 cost factor for budget-aware ranking
confidenceWhenMatched  number    0–1 self-assessed confidence when task matches
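
The field table above can be read as a type. The following sketch transcribes it into a TypeScript interface and adds a small range check; the interface name, the validateHints helper, and the example values (including the "answer-question" act) are hypothetical and only show the shape of a descriptor.

```typescript
// Type sketch derived from the field table; example values are hypothetical.
interface PlannerHints {
  supportedActs: string[];
  topicTags: string[];
  evidenceStyle: string[];
  preferredDepth: "shallow" | "medium" | "deep";
  fallbackRole: "cheap-probe" | "default" | "heavy-recovery";
  expectedLatencyMs: number;
  expectedLLMCalls: number;
  relativeCost: number;          // 0 to 1
  confidenceWhenMatched: number; // 0 to 1
}

const exampleHints: PlannerHints = {
  supportedActs: ["answer-question"], // placeholder act name, not a documented value
  topicTags: ["technical"],
  evidenceStyle: ["lexical-only"],
  preferredDepth: "shallow",
  fallbackRole: "cheap-probe",
  expectedLatencyMs: 150,
  expectedLLMCalls: 0,
  relativeCost: 0.1,
  confidenceWhenMatched: 0.6,
};

// Hypothetical helper: returns a list of problems, empty when the hints look sane.
function validateHints(h: PlannerHints): string[] {
  const problems: string[] = [];
  const inUnitRange = (x: number) => x >= 0 && x <= 1;
  if (!inUnitRange(h.relativeCost)) problems.push("relativeCost must be in [0, 1]");
  if (!inUnitRange(h.confidenceWhenMatched)) problems.push("confidenceWhenMatched must be in [0, 1]");
  if (h.expectedLatencyMs < 0) problems.push("expectedLatencyMs must be non-negative");
  if (h.expectedLLMCalls < 0) problems.push("expectedLLMCalls must be non-negative");
  return problems;
}

console.log(validateHints(exampleHints));
```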

Adaptive Learning

After each request, the planner updates per-plugin statistics using an EWMA (Exponentially Weighted Moving Average) and derives a utility score from them:

utility = successEWMA + 0.15 × sufficiencyEWMA
        - latencyPenalty - llmCostPenalty

Stats are tracked per plugin ID: attempts, successes, failures, insufficiency count, latency EWMA, LLM call EWMA. Cold-start priors favor cheaper plugins.
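
A minimal sketch of the update and the utility formula follows. Only the 0.15 sufficiency weight comes from the formula above; the smoothing factor, the shape of the two penalties, and the names ewma, recordOutcome, and utility are assumptions, since the document does not define them.

```typescript
// Hypothetical sketch: ALPHA and both penalty formulas are assumed, not documented.
interface PluginStats {
  attempts: number;
  successEWMA: number;
  sufficiencyEWMA: number;
  latencyEWMA: number;  // milliseconds
  llmCallsEWMA: number;
}

interface Outcome {
  success: boolean;
  sufficient: boolean;
  latencyMs: number;
  llmCalls: number;
}

const ALPHA = 0.2; // assumed smoothing factor; higher values weight recent requests more

// Standard EWMA step: blend the new sample into the running average.
function ewma(prev: number, sample: number, alpha: number = ALPHA): number {
  return alpha * sample + (1 - alpha) * prev;
}

function recordOutcome(s: PluginStats, o: Outcome): PluginStats {
  return {
    attempts: s.attempts + 1,
    successEWMA: ewma(s.successEWMA, o.success ? 1 : 0),
    sufficiencyEWMA: ewma(s.sufficiencyEWMA, o.sufficient ? 1 : 0),
    latencyEWMA: ewma(s.latencyEWMA, o.latencyMs),
    llmCallsEWMA: ewma(s.llmCallsEWMA, o.llmCalls),
  };
}

function utility(s: PluginStats): number {
  // Assumed penalties: scaled to a reference value and capped.
  const latencyPenalty = 0.1 * Math.min(s.latencyEWMA / 10000, 1);
  const llmCostPenalty = 0.05 * Math.min(s.llmCallsEWMA / 4, 1);
  return s.successEWMA + 0.15 * s.sufficiencyEWMA - latencyPenalty - llmCostPenalty;
}

// Usage: a successful, sufficient, fast request raises the plugin's utility.
const before: PluginStats = { attempts: 0, successEWMA: 0.5, sufficiencyEWMA: 0.5, latencyEWMA: 1000, llmCallsEWMA: 1 };
const after = recordOutcome(before, { success: true, sufficient: true, latencyMs: 500, llmCalls: 0 });
console.log(utility(before), utility(after));
```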

Planner-level stats are also recorded, allowing the engine to rank planner candidates when no explicit planner is pinned.

Cross-Planner Backtracking

The engine maintains a plannerFallbackOrder (default: planner-default → planner-depth). If the first planner's plan fails with a retryable error, the engine tries the next planner with the remaining budget. Explicit planner pinning disables this backtracking.
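
The backtracking loop described above might look roughly like this. The PlanResult and Planner types, the synchronous signature, and the budget accounting in milliseconds are assumptions chosen to keep the sketch short; only the fallback order, the retryable-error condition, the remaining-budget handoff, and the pinning rule come from the text.

```typescript
// Hypothetical sketch of cross-planner backtracking; types and names are assumed.
type PlanResult =
  | { ok: true; answer: string }
  | { ok: false; retryable: boolean; error: string };

type Planner = (request: string, budgetMs: number) => { result: PlanResult; spentMs: number };

function runWithBacktracking(
  request: string,
  planners: Record<string, Planner>,
  fallbackOrder: string[],
  budgetMs: number,
  pinned?: string,
): PlanResult {
  // Explicit pinning disables backtracking: only the pinned planner is tried.
  const order = pinned ? [pinned] : fallbackOrder;
  let remaining = budgetMs;
  let last: PlanResult = { ok: false, retryable: false, error: "no planner available" };

  for (const id of order) {
    const planner = planners[id];
    if (!planner) continue;
    if (remaining <= 0) break;
    const { result, spentMs } = planner(request, remaining);
    remaining -= spentMs; // the next planner only gets what is left
    last = result;
    if (result.ok || !result.retryable) return result;
  }
  return last;
}

// Usage with stub planners: the first fails with a retryable error, the second succeeds.
const seenBudgets: number[] = [];
const planners: Record<string, Planner> = {
  "planner-default": (_req, budget) => {
    seenBudgets.push(budget);
    return { result: { ok: false, retryable: true, error: "timeout" }, spentMs: 300 };
  },
  "planner-depth": (_req, budget) => {
    seenBudgets.push(budget);
    return { result: { ok: true, answer: "deep answer" }, spentMs: 400 };
  },
};
const order = ["planner-default", "planner-depth"];
const withFallback = runWithBacktracking("q", planners, order, 1000);
const pinnedResult = runWithBacktracking("q", planners, order, 1000, "planner-default");
console.log(withFallback, pinnedResult);
```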

Weak Outcome Recovery

If a planner produces only a no-context answer after insufficient KB retrieval, the engine escalates to the next planner. If all planners produce weak results, the best weak answer is accepted rather than returning an error.
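
One way to picture this rule is a loop that returns the first non-weak outcome and otherwise keeps the best weak one. The PlannerOutcome type, the weak flag, and the numeric score used to compare weak answers are assumptions; the document does not say how "best" is measured.

```typescript
// Hypothetical sketch of weak outcome recovery; the score field is an assumed quality measure.
interface PlannerOutcome {
  plannerId: string;
  answer: string;
  weak: boolean; // true for a no-context answer after insufficient KB retrieval
  score: number;
}

function resolveOutcome(
  order: string[],
  run: (plannerId: string) => PlannerOutcome,
): PlannerOutcome | undefined {
  let bestWeak: PlannerOutcome | undefined;
  for (const id of order) {
    const outcome = run(id);
    if (!outcome.weak) return outcome; // a grounded answer ends the search
    if (!bestWeak || outcome.score > bestWeak.score) bestWeak = outcome;
  }
  // Every planner was weak: accept the best weak answer instead of failing.
  return bestWeak;
}

// Usage: both planners return weak answers, so the higher-scoring one is accepted.
const allWeak = resolveOutcome(["planner-default", "planner-depth"], (id) => ({
  plannerId: id,
  answer: `no-context answer from ${id}`,
  weak: true,
  score: id === "planner-depth" ? 0.5 : 0.3,
}));
console.log(allWeak?.plannerId);
```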

Configuration

// config/plugins.json
{
  "defaultPlannerPlugin": "planner-default",
  "plannerFallbackOrder": ["planner-default", "planner-depth"],
  "defaultSeedDetectorPlan": ["sd-symbolic", "sd-llm-fast", "sd-llm-deep"],
  "defaultKBPlan": ["kb-fast", "kb-balanced", "kb-thinkingdb"],
  "defaultGoalSolverPlan": ["gs-symbolic", "gs-llm-fast", "gs-llm-deep"]
}