Hybrid Memory — Phases Roadmap
This document tracks the phased cleanup and improvement plan from the combined recommendations report. Phase 1 and Phase 2 are done; Phase 3 is planned.
Phase 1: Cleanup & Stabilization (done)
-
Domain converters removed from builtin
Home Assistant, ESPHome, Victron VRM, and Zigbee2MQTT converters are no longer shipped with the memory plugin. Use a separate plugin (e.g.openclaw-ha-converters) andregisterConverter()to add them back. -
HyDE / query expansion removed from use (Phase 1)
HyDE was a major source of timeouts (1,491/day). In 2026.3.140+ the Phase 1 migration forcesqueryExpansion.enabled: falsefor all configs — you cannot turn it back on in this version. The code path remains for a future opt-in; config and recall paths still checkqueryExpansion.enabledand skip the LLM call when false. -
Non-core features disabled by default
Frustration detection, Hebbian link strengthening on recall, and all optional modules (nightly cycle, passive observer, workflow tracking, self-extension, crystallization, verification, provenance, aliases, cross-agent learning, reranking, contextual variants, documents) are off in every preset. Enable only what you need. -
Hebbian on read path is opt-in
New config:graph.strengthenOnRecall(defaultfalse). Whentrue, facts recalled together get RELATED_TO links strengthened; whenfalse, the read path no longer mutates the graph. -
VectorDB single long-lived connection
The plugin no longer callsopen()/removeSession()per agent session. The VectorDB connection is kept open until pluginstop(), reducing reconnects and refcount issues.
Phase 2: Performance & Stability (done)
Focus: performance and stability without large structural changes.
| Priority | Task | Rationale | Status |
|---|---|---|---|
| 1 | Hard degradation mode | When main-lane queue depth > 10 or recall latency > 5s, skip enrichment and use FTS-only + HOT facts. Add a degraded flag to recall result for observability. | Done — recallInFlightRef, degradationQueueDepth/degradationMaxLatencyMs, FTS-only+HOT path, <!-- recall degraded: queue|latency --> marker. |
| 2 | Per-stage timing in recall pipeline | Wrap each stage (FTS, embed, vector, graph, rerank, pack) in a timer and log totals at debug. Essential for finding bottlenecks. | Done — FTS, embed, vector, merge timed in auto-recall path; debug log with totals. |
| 3 | Decompose hooks.ts into staged pipeline | Replace the monolithic hook with 5 named stages (setup, recall, injection, capture, cleanup), each in its own file with config toggle and timeout. Dispatcher stays <200 lines. | Done — All stages in lifecycle/stage-*.ts (setup, recall, injection, capture, cleanup); session state in session-state.ts; active-task, auth-failure, credential-hint, frustration in separate stage modules. Dispatcher hooks.ts <200 lines. |
| 4 | Reduce prompt injections to max 3 blocks | Merge recalled context into one <recalled-context> block; keep <active-task> if present; allow one optional warning block. Everything else tool-accessible only. | Done — Single <recalled-context> via wrapRecalledContext; active-task and one optional warning remain separate. |
| 5 | Agent detection: downgrade to debug or fix | If agentId is missing, log at debug (not warn) to cut noise; separately fix payload so agentId is present where expected. | Done — Both agent-detection messages log at api.logger.debug. |
| 6 | Replace module-level mutable state with PluginContext | Pass a PluginContext object into subsystems instead of relying on 16+ module-level variables in index.ts. Prepares for concurrency and testing. | Done — Single pluginContext built in index.ts, passed to registerLifecycleHooks and registerTools. |
| 7 | Cleanup cron jobs for removed/disabled features | Remove or disable scheduled jobs that only served functionality that has been removed or is now off by default (e.g. nightly cycle, passive observer, cross-agent learning), so they do not run unnecessarily. Can be done in Phase 2 or 3. | Done — nightly-dream-cycle gated by featureGate: "nightlyCycle.enabled"; not installed when disabled. |
Phase 3: Modularization (suggested)
Focus: optional features as modules or separate plugins.
| Area | Suggested action | Status |
|---|---|---|
| Domain converters | Already removed from builtin. Ship as optional plugin openclaw-ha-converters (or similar). | Done — Built-in registry is empty; implementations remain in-tree for tests; use registerConverter() to add back. |
| Analysis & maintenance | Dream cycle, monthly review, topic clusters, knowledge gaps, cross-agent learning, retrieval-aliases generation → optional “analysis” module, triggered by cron/CLI only. | — |
| Learning & procedures | Procedure extraction, workflow tracking, pattern detection, trajectory tracking, reinforcement extraction → optional “learning” module; procedure injection in core stays but capped and off by default. | — |
| Self-extension | Skill crystallization, tool proposals, self-correction extraction, persona proposals, contextual variants → optional “self-extension” module, batch/CLI only. | — |
| Observability | Issue store, verification store, provenance, memory diagnostics, context audit, cost tracking, health dashboard → optional “observability” module. | — |
| Stable internal API | Define a well-typed MemoryPluginAPI that optional modules depend on, to avoid circular deps and make modules testable. | Done — api/memory-plugin-api.ts defines MemoryPluginAPI; index builds one implementation; registerTools and registerLifecycleHooks accept it; optional modules can depend on this type only. |
Success metrics (from recommendations)
After Phase 1+2, targets:
| Metric | Before Phase 1 | Target |
|---|---|---|
| HyDE timeouts/day | 1,491 | 0 (Phase 1: forced off in 2026.3.140+) |
| Recall pipeline timeouts/day | 808 | <50 |
| VectorDB reconnects/day | 490 | <10 |
| VectorDB refcount underflows/day | 148 | 0 |
| Main-lane waits >60s | 1,748 | <100 |
| Agent detection warnings/day | 2,296 | <50 or debug-only |
| Prompt injection blocks per turn | Up to 11 | Max 3 |
hooks.ts lines | 2,580 | <200 (dispatcher) + stage files |
After Phase 3: core plugin ~35–40 source files, ~15K–20K lines; non-core features cannot cause recall failure.
Recall hot path (default / tight ship)
The reports called out overlapping recall features causing delays, lag, and diminishing returns. After Phase 1+2 the default path is optimized:
Always off (forced or default):
- HyDE / query expansion — forced off (2026.3.140+). No LLM call on recall.
- Ambient multi-query — default off (no preset enables it). Topic-shift multi-query and issue retrieval only run if user enables
ambient.enabled. - Frustration detection — forced off. No frustration hint injection.
- Hebbian on read — forced off. No graph mutation during recall.
- Reranking, cross-agent learning, contextual variants, etc. — forced off.
When overloaded (hard degradation):
- If queue depth > 10 or recall latency > 5s → FTS-only + HOT facts only. No vector, graph, procedures, or ambient.
Capped / bounded:
- Procedure injection —
procedures.maxInjectionTokens(default 500). Cannot dominate the prompt. - Prompt blocks — max 3: one
<recalled-context>, optional<active-task>, one optional warning.
Still on by preset (user can turn off):
- Entity lookup — enhanced/complete presets set
autoRecall.entityLookup.enabled: true. Local/minimal keep it off. Whenentitiesis empty, defaultautoFromFactsloads distinct entity names from the facts DB (capped); see CONFIGURATION-MODES.md. - Graph in recall — minimal+ have
graph.useInRecall: true(zero-LLM expansion from seeds). - Procedures — minimal+ have procedures enabled (but capped).
- HOT tier — minimal+ have memory tiering (bounded by
hotMaxTokens).
Essential mode = local-only, zero LLM/API calls.
Local preset sets retrieval.strategies: ["fts5"]. Recall and capture then use only SQLite FTS and local files — no embedding, no vector search, no HyDE, no chat LLM. You get persistent structured memory, auto-capture, auto-recall by keyword, and WAL — still well above vanilla OpenClaw (which has no durable memory). Minimal/enhanced/complete add semantic (embedding + LanceDB) and the optional layers above; Minimal uses only nano/flash-tier LLM for distill and auto-classify. The worst latency sources (HyDE, ambient, frustration, Hebbian) remain removed or default-off.
Recommendations alignment (combined report)
Compared with hybrid-memory-combined-recommendations.md:
Phase 1 (all done): 1.1 HyDE → forced off. 1.2 Hard degradation → done (Phase 2). 1.3 Ambient by default → default off (no preset enables it). 1.4 Frustration by default → forced off. 1.5 Hebbian by default → strengthenOnRecall: false. 1.6 Agent detection → debug log. 1.7 Per-stage timing → done.
Phase 2 (all done): 2.1 Decompose hooks → staged pipeline. 2.2 Max 3 blocks → single <recalled-context>. 2.3 VectorDB lifecycle → single long-lived connection. 2.4 PluginContext → single pluginContext + Phase 3 MemoryPluginAPI.
Optional fast fixes (done):
- Credential auto-detect: Forced
credentials.autoDetect: falsein Phase 1 migration (2026.3.140+). User must set explicitly to enable; aligns with “make auto-detect opt-in”. - Procedure injection cap: Added
procedures.maxInjectionTokens(default 500). Procedure block is trimmed from the end until within cap before injection so procedure context cannot dominate recall.