LLM Response Cache (implemented)

All LLM calls go through a disk cache in data/cache/. Same prompt plus same model means same response body, returned immediately. Spec: DS015.

How it works

  1. Before calling the LLM, compute SHA-256(model + "\n" + prompt)
  2. Check whether data/cache/{hash}.json exists
  3. If yes, return the cached response body
  4. If no, call the LLM and persist the response body
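The four steps above can be sketched as a small wrapper. This is a minimal illustration, not the actual implementation: the names `cache_key`, `cached_call`, and the `call_llm` callable are hypothetical, and the entry written here omits the `mode` field shown in the cache file format below.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path

CACHE_DIR = Path("data/cache")

def cache_key(model: str, prompt: str) -> str:
    # Step 1: SHA-256 over model + "\n" + prompt, hex-encoded.
    return hashlib.sha256(f"{model}\n{prompt}".encode("utf-8")).hexdigest()

def cached_call(model: str, prompt: str, call_llm) -> str:
    # Step 2: check whether data/cache/{hash}.json exists.
    path = CACHE_DIR / f"{cache_key(model, prompt)}.json"
    if path.exists():
        # Step 3: cache hit, return the stored response body.
        return json.loads(path.read_text())["response"]
    # Step 4: cache miss, call the LLM and persist the raw response.
    response = call_llm(model, prompt)
    CACHE_DIR.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps({
        "prompt": prompt,
        "response": response,
        "cachedAt": datetime.now(timezone.utc).isoformat(),
    }))
    return response
```

On a second call with the same model and prompt, the wrapper returns the stored body without invoking `call_llm` again.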

Cache file format

{
  "mode": "test-fast",
  "prompt": "You are a seed detection engine...\n\nCurrent request:\nCompare X and Y for CPU-only deployment",
  "response": "# Intent CNL\n@i1 intent compare \"X and Y for CPU-only deployment\"\n@i2 set $i1 output comparative_recommendation\n@s1 seed $i1 explore locate \"tradeoffs\"\n@s2 set $s1 domain general\n@s3 set $s1 evidenceNeed structural\n@s4 set $s1 state active\n",
  "cachedAt": "2026-03-27T12:00:00Z"
}

The cache stores the raw prompt and raw model response so prompt migrations, validation failures, and SOP formatting issues remain debuggable.
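Because the raw prompt and response are stored verbatim, a cache entry can be inspected directly when debugging a migration or formatting issue. A hypothetical helper (the name `inspect_entry` is illustrative, not part of the codebase):

```python
import json
from pathlib import Path

def inspect_entry(path: Path) -> None:
    """Dump the raw prompt and response stored in one cache entry."""
    entry = json.loads(path.read_text())
    print("cached at:", entry.get("cachedAt"))
    print("--- prompt ---")
    print(entry["prompt"])
    print("--- response ---")
    print(entry["response"])
```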

What gets cached

| Scenario | Cached? |
| --- | --- |
| Successful, non-empty LLM response | ✅ Yes |
| LLM error (timeout, 429, network) | ❌ No |
| Empty response | ❌ No |
| Validation failure (bad SOP/CNL) | ✅ Yes — the raw model text is still useful for debugging |
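The policy in the table reduces to a small predicate: transport-level errors and empty bodies are never persisted, while validation happens after the raw text is already written. A sketch (the name `should_cache` is hypothetical; treating whitespace-only bodies as empty is an assumption):

```python
from typing import Optional

def should_cache(response_text: Optional[str], error: Optional[Exception]) -> bool:
    """Persist only successful, non-empty responses.

    Timeouts, 429s, and network errors are not cached, so the next run
    retries the call. Validation failures are still cached because the
    raw model text is written before SOP/CNL validation runs.
    """
    if error is not None:
        return False
    # Assumption: a whitespace-only body counts as empty.
    return bool(response_text and response_text.strip())
```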

Cache stability

Cache keys must stay stable across restarts and evaluation reruns, so the key is derived only from deterministic inputs: the exact model identifier and the exact prompt string, hashed with SHA-256. Any nondeterminism in prompt construction (timestamps, unordered maps, random IDs) would change the hash and turn every lookup into a miss.
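A quick way to see why SHA-256 is the right choice here: it is a pure function of the input bytes, so the same (model, prompt) pair maps to the same `data/cache/{hash}.json` file in every process. By contrast, Python's built-in `hash()` is salted per process and would break replay. (The `cache_key` name below is illustrative.)

```python
import hashlib

def cache_key(model: str, prompt: str) -> str:
    # Deterministic across processes and machines: depends only on the bytes.
    return hashlib.sha256(f"{model}\n{prompt}".encode("utf-8")).hexdigest()

# Same inputs, same key, every run.
k1 = cache_key("test-fast", "Compare X and Y for CPU-only deployment")
k2 = cache_key("test-fast", "Compare X and Y for CPU-only deployment")
assert k1 == k2
```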

Performance impact

| Scenario | Without cache | With cache |
| --- | --- | --- |
| Full eval | minutes | seconds |
| Single llm-assisted request | seconds | tens of milliseconds |
| Context ingest normalization | seconds | near-instant on replay |

Configuration

// config/llm.json
{
  "cache": true,
  "cacheDir": "data/cache"
}

Clearing the cache

rm -f data/cache/*.json