All LLM calls go through a disk cache in `data/cache/`. The same prompt with the same model yields the same response body, returned immediately from disk. Spec: DS015.
The cache key is `SHA-256(model + "\n" + prompt)`; if `data/cache/{hash}.json` exists, its stored body is returned. A cache entry looks like:

```json
{
  "mode": "test-fast",
  "prompt": "You are a seed detection engine...\n\nCurrent request:\nCompare X and Y for CPU-only deployment",
  "response": "# Intent CNL\n@i1 intent compare \"X and Y for CPU-only deployment\"\n@i2 set $i1 output comparative_recommendation\n@s1 seed $i1 explore locate \"tradeoffs\"\n@s2 set $s1 domain general\n@s3 set $s1 evidenceNeed structural\n@s4 set $s1 state active\n",
  "cachedAt": "2026-03-27T12:00:00Z"
}
```
The cache stores the raw prompt and raw model response so prompt migrations, validation failures, and SOP formatting issues remain debuggable.
| Scenario | Cached? |
|---|---|
| Successful, non-empty LLM response | ✅ Yes |
| LLM error (timeout, 429, network) | ❌ No |
| Empty response | ❌ No |
| Validation failure (bad SOP/CNL) | ✅ Yes — the raw model text is still useful for debugging |
Cache keys are a deterministic function of model and prompt, so they stay stable across restarts and evaluation reruns. The practical impact:
| Scenario | Without cache | With cache |
|---|---|---|
| Full eval | minutes | seconds |
| Single LLM-assisted request | seconds | tens of milliseconds |
| Context ingest normalization | seconds | near-instant on replay |
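Key stability follows from hashing only the model and prompt; nothing run-specific (timestamps, PIDs) enters the key. A quick check of that property, assuming the SHA-256 scheme described above:

```python
import hashlib

def cache_key(model: str, prompt: str) -> str:
    # Same scheme as above: SHA-256 over model + "\n" + prompt.
    return hashlib.sha256(f"{model}\n{prompt}".encode("utf-8")).hexdigest()

# Identical inputs produce identical keys on every run and every machine,
# so a rerun of the full eval replays straight from disk.
assert cache_key("model-x", "Compare X and Y") == cache_key("model-x", "Compare X and Y")

# Changing either component produces a fresh key, i.e. a cache miss.
assert cache_key("model-y", "Compare X and Y") != cache_key("model-x", "Compare X and Y")
```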
Caching is configured in `config/llm.json`:

```jsonc
// config/llm.json
{
  "cache": true,
  "cacheDir": "data/cache"
}
```
To clear the cache, delete the entry files:

```shell
rm -f data/cache/*.json
```