LLM Response Cache (implemented)

All LLM calls go through a disk cache in data/cache/. Same prompt plus same model means same response body, returned immediately. Spec: DS015.

How it works

  1. Before calling the LLM, compute SHA-256(model + "\n" + prompt)
  2. Check whether data/cache/{hash}.json exists
  3. If yes, return the cached response body
  4. If no, call the LLM and persist the response body
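The four steps above can be sketched as a small wrapper. This is a minimal illustration, not the actual implementation: the names `cache_key`, `cached_call`, and the `call_llm` callable are hypothetical, and the entry written here omits the `mode` field shown in the cache file format below.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path

CACHE_DIR = Path("data/cache")

def cache_key(model: str, prompt: str) -> str:
    # Step 1: SHA-256 over model + "\n" + prompt, hex-encoded.
    return hashlib.sha256(f"{model}\n{prompt}".encode("utf-8")).hexdigest()

def cached_call(model: str, prompt: str, call_llm) -> str:
    # Step 2: check whether data/cache/{hash}.json exists.
    path = CACHE_DIR / f"{cache_key(model, prompt)}.json"
    if path.exists():
        # Step 3: cache hit, return the stored response body.
        return json.loads(path.read_text())["response"]
    # Step 4: cache miss, call the LLM and persist the raw response.
    response = call_llm(model, prompt)
    CACHE_DIR.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps({
        "prompt": prompt,
        "response": response,
        "cachedAt": datetime.now(timezone.utc).isoformat(),
    }))
    return response
```

On a second call with the same model and prompt, the wrapper returns the stored body without invoking `call_llm` again.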

Cache file format

{
  "mode": "test-fast",
  "prompt": "You are a seed detection engine...\n\nCurrent request:\nCompare X and Y for CPU-only deployment",
  "response": "# Intent CNL\n@i1 intent compare \"X and Y for CPU-only deployment\"\n@i2 set $i1 output comparative_recommendation\n@s1 seed $i1 explore locate \"tradeoffs\"\n@s2 set $s1 domain general\n@s3 set $s1 evidenceNeed structural\n@s4 set $s1 state active\n",
  "cachedAt": "2026-03-27T12:00:00Z"
}

The cache stores the raw prompt and raw model response so prompt migrations, validation failures, and SOP formatting issues remain debuggable.
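Because the raw prompt and response are stored verbatim, a cache entry can be inspected directly when debugging a migration or formatting issue. A hypothetical helper (the name `inspect_entry` is illustrative, not part of the codebase):

```python
import json
from pathlib import Path

def inspect_entry(path: Path) -> None:
    """Dump the raw prompt and response stored in one cache entry."""
    entry = json.loads(path.read_text())
    print("cached at:", entry.get("cachedAt"))
    print("--- prompt ---")
    print(entry["prompt"])
    print("--- response ---")
    print(entry["response"])
```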

What gets cached

| Scenario | Cached? |
| --- | --- |
| Successful, non-empty LLM response | ✅ Yes |
| LLM error (timeout, 429, network) | ❌ No |
| Empty response | ❌ No |
| Validation failure (bad SOP/CNL) | ✅ Yes — the raw model text is still useful for debugging |
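The policy in the table reduces to a small predicate: transport-level errors and empty bodies are never persisted, while validation happens after the raw text is already written. A sketch (the name `should_cache` is hypothetical; treating whitespace-only bodies as empty is an assumption):

```python
from typing import Optional

def should_cache(response_text: Optional[str], error: Optional[Exception]) -> bool:
    """Persist only successful, non-empty responses.

    Timeouts, 429s, and network errors are not cached, so the next run
    retries the call. Validation failures are still cached because the
    raw model text is written before SOP/CNL validation runs.
    """
    if error is not None:
        return False
    # Assumption: a whitespace-only body counts as empty.
    return bool(response_text and response_text.strip())
```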

Cache stability

Cache keys must stay stable across restarts and evaluation reruns, so the key is derived only from deterministic inputs: the exact model identifier and the exact prompt string, hashed with SHA-256. Any nondeterminism in prompt construction (timestamps, unordered maps, random IDs) would change the hash and turn every lookup into a miss.
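A quick way to see why SHA-256 is the right choice here: it is a pure function of the input bytes, so the same (model, prompt) pair maps to the same `data/cache/{hash}.json` file in every process. By contrast, Python's built-in `hash()` is salted per process and would break replay. (The `cache_key` name below is illustrative.)

```python
import hashlib

def cache_key(model: str, prompt: str) -> str:
    # Deterministic across processes and machines: depends only on the bytes.
    return hashlib.sha256(f"{model}\n{prompt}".encode("utf-8")).hexdigest()

# Same inputs, same key, every run.
k1 = cache_key("test-fast", "Compare X and Y for CPU-only deployment")
k2 = cache_key("test-fast", "Compare X and Y for CPU-only deployment")
assert k1 == k2
```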

Performance impact

| Scenario | Without cache | With cache |
| --- | --- | --- |
| Full eval | minutes | seconds |
| Single llm-assisted request | seconds | tens of milliseconds |
| Context ingest normalization | seconds | near-instant on replay |

Configuration

// config/llm.json
{
  "cache": true,
  "cacheDir": "data/cache"
}

Clearing the cache

rm -f data/cache/*.json