Features — Categories, Decay, Tags, and Auto-Classify

The memory-hybrid plugin makes your agent more personal and better tuned by classifying what it stores, decaying old content, and improving recall over time. This page is the technical reference for how that works: categories, decay, tagging, and LLM auto-classify. For the big-picture “why” and benefits, see the README.

Feature documentation (by topic)

Feature	Document	Description
Persona proposals	PERSONA-PROPOSALS.md	Agent self-evolution with human approval: propose identity file changes, review/apply via CLI
Auto-tagging	AUTO-TAGGING.md	Regex-inferred topic tags, built-in patterns, tag-filtered search and recall
Decay & pruning	DECAY-AND-PRUNING.md	Decay classes, TTLs, refresh-on-access, hard/soft prune, when they run
Reflection	REFLECTION.md	Pattern synthesis from facts (reflect, reflect-rules, reflect-meta)
Graph memory	GRAPH-MEMORY.md	Typed links between facts, spreading activation
Contacts, orgs & NER	MULTILINGUAL-SUPPORT.md, GRAPH-MEMORY.md	`memory_directory` tool; SQLite tables for people/orgs; multilingual PERSON/ORG extraction (franc + LLM) when `graph.enabled`; CLI `enrich-entities` (#985–#987)
Session distillation	SESSION-DISTILLATION.md	Extracting facts from session logs
Procedural memory	PROCEDURAL-MEMORY.md	Procedure tagging, recall, auto-skills, promotion gates, telemetry (issue #23)
Skill pipelines	SKILL-PIPELINES.md	Crystallization vs procedure promotion, shared lifecycle, operator playbooks
Credentials	CREDENTIALS.md	Opt-in encrypted credential vault
WAL	WAL-CRASH-RESILIENCE.md	Write-ahead log for crash resilience
Conflicting memories	CONFLICTING-MEMORIES.md	Classify-before-write (ADD/UPDATE/DELETE/NOOP), supersession, bi-temporal
Automatic categories	AUTOMATIC-CATEGORIES.md	Category discovery from “other” facts (LLM labels, threshold, .discovered-categories.json)
Dynamic derived data	DYNAMIC-DERIVED-DATA.md	Index: tags, categories, decay, entity/key/value, conflicting memories
Dynamic salience	DYNAMIC-SALIENCE.md	Access-based importance — access boost, time decay, Hebbian co-recall links
Memory scoping	MEMORY-SCOPING.md	Global, user-private, agent-specific, session-scoped memories; privacy in multi-user environments
Memory tiering	MEMORY-TIERING.md	Hot/warm/cold tiers, compaction (tasks→COLD, preferences→WARM, blockers→HOT), `hybrid-mem compact`
Retrieval directives	CONFIGURATION.md (autoRecall.retrievalDirectives)	Targeted recall by entity mention, keywords, task type, or session start. Added in 2026.3.70.
Workflow crystallization & self-extension	CONFIGURATION.md, release notes 2026.3.70	Tool-sequence patterns, skill proposals (`memory_crystallize`), tool proposals (`memory_propose_tool`). Added in 2026.3.70.
Future-date decay protection	CONFIGURATION.md	Facts with future dates have their decay frozen until the date passes. Default: enabled. (#144)
Episodic event log (Layer 1)	event-log.md	Append-only session event journal; raw capture layer for Dream Cycle consolidation. (#150)
Local embeddings (Ollama/ONNX)	CONFIGURATION.md	Run embeddings locally without an API key using Ollama or ONNX providers. (#153)
Local LLM pre-filtering (Ollama)	CONFIGURATION.md	Two-tier session triage: uses a local model (e.g. qwen3) to filter out uninteresting sessions before sending to the cloud LLM. (#290)
Sensor Sweep (Event Bus)	CONFIGURATION.md	Cron-based background data collection (Garmin, GitHub, Session History, System Health, HA anomalies, Weather, Yarbo) without LLM overhead. Events stored in `event-bus.db`. (#236)
Multi-model embedding registry	CONFIGURATION.md	Embed each fact with multiple models in parallel; merge results via RRF at recall. (#158)
Contextual variants at index time	CONFIGURATION.md	LLM-generated alternative phrasings embedded alongside facts to improve recall. (#159)
LLM re-ranking	CONFIGURATION.md	Re-order RRF fusion results with an LLM for higher-precision recall. (#161)
Verification store	CONFIGURATION.md	Integrity layer for critical facts: `memory_verify`, `memory_verified_list`, re-verification schedule. (#162)
Provenance tracing	CONFIGURATION.md	Full origin chain from any fact back to its source events via `memory_provenance`. (#163)
Document ingestion	CONFIGURATION.md	Ingest PDF, DOCX, HTML, images, and more as searchable fact chunks (`memory_ingest_document`, `memory_ingest_folder`). (#206)
Mission Control dashboard	CONFIGURATION.md	Real-time web dashboard (port 7700) — memory stats, cron jobs, task queue, agent status, git activity, and 7-day LLM cost tracking. Auto-refreshes every 60s. (#309)
Cost optimization playbook	COST-OPTIMIZATION-PLAYBOOK.md	Concrete low-cost rollout plan: mode-by-use-case, cheap-first model tiers, distill guardrails, token budgets, local embeddings, weekly auditing.
Generated skill validation	SKILL-PIPELINES.md, PROCEDURAL-MEMORY.md	Content gates, static validation, `skills rescan` / quarantine, `staticValidation` semantics (2026.5.190+)

Categories

Default categories

Category	Typical content	Examples
`preference`	Likes, dislikes, working-style choices	“I prefer dark mode”, “I hate tabs”
`fact`	Biographical or factual statements	“My birthday is Nov 13”, “lives in Prague”
`decision`	Architectural or process decisions with rationale	“Decided to use Postgres because …”
`entity`	Named things: people, projects, tools, identifiers	“John’s email is john@example.com”
`pattern`	Behavioral patterns synthesized by the reflection layer	“User consistently favors composition over inheritance”
`rule`	Actionable one-line rules from reflection	“Always suggest composition over inheritance”
`other`	Anything the heuristics can’t classify	Catch-all; reclassified later by auto-classify

Classification Pipeline

Every fact passes through up to three stages:

conversation text
      |
      v
 1. Auto-capture filter          shouldCapture() — regex triggers
    (hot path, no LLM)           + sensitive-content exclusion
      |
      v
 2. Heuristic classification     detectCategory() — fast regex
    (hot path, no LLM)           matching on the text
      |
      v
 3. LLM auto-classify            Runs in background:
    (background, cheap LLM)      daily batch + 5 min after startup
      |
      v
  stored fact

Stage 1: Auto-capture filter. shouldCapture() checks regex triggers (e.g. “remember”, “prefer”, “decided”, email/phone patterns). It rejects sensitive content (passwords, API keys, SSNs, credit cards), messages that are too short or too long, and messages that look like structured markup.

Stage 2: Heuristic classification. detectCategory() runs a fast regex pass with no LLM call. Anything that doesn’t match falls through to "other".

Stage 3: LLM auto-classify. A background job periodically queries all "other" facts and sends them to a cheap LLM in batches. When category discovery is enabled, it first groups them by free-form topic labels.

Manual override. The memory_store tool accepts an explicit category parameter that bypasses heuristic detection.
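
The stage 1 filter described above could look roughly like the sketch below. It is only an illustration: the function name shouldCaptureSketch, every regex, and the MIN_LEN / MAX_LEN bounds are assumptions, since this page does not list the plugin's actual triggers or limits.

```typescript
// Hypothetical sketch of an auto-capture filter. All patterns and the
// length bounds are illustrative assumptions, not the plugin's own values.
const TRIGGERS: RegExp[] = [
  /\b(remember|prefer|decided)\b/i,
  /[\w.+-]+@[\w-]+\.[\w.-]+/, // email-like
  /\+\d{10,}/, // phone-like: "+" followed by 10+ digits
];

const SENSITIVE: RegExp[] = [
  /\b(password|passwd|api[_ -]?key|secret)\b/i,
  /\b\d{3}-\d{2}-\d{4}\b/, // US SSN shape
  /\b(?:\d[ -]?){13,16}\b/, // credit-card-like digit run
];

const MIN_LEN = 10; // assumed lower bound
const MAX_LEN = 500; // assumed upper bound

function shouldCaptureSketch(text: string): boolean {
  const t = text.trim();
  // Reject messages that are too short or too long.
  if (t.length < MIN_LEN || t.length > MAX_LEN) return false;
  // Reject text that starts like structured markup (HTML/XML/JSON).
  if (/^[<{\[]/.test(t)) return false;
  // Reject anything that looks sensitive before checking triggers.
  if (SENSITIVE.some((re) => re.test(t))) return false;
  // Capture only when at least one trigger matches.
  return TRIGGERS.some((re) => re.test(t));
}
```

Checking sensitive content before the triggers means a message such as "remember my password is ..." is dropped even though it contains a trigger word.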

Heuristic detection patterns

Category	Patterns matched
`decision`	`decided`, `chose`, `went with`, `selected`, `always use`, `never use`, `will use`
`preference`	`prefer`, `like`, `love`, `hate`, `want`
`entity`	Phone numbers (`+` followed by 10+ digits), email addresses, `is called`
`fact`	`born`, `birthday`, `lives`, `works`, `is`, `are`, `has`, `have`
`other`	Everything else (fallback)
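
As a rough illustration of the table above, the sketch below checks the listed keywords in table order. The keyword lists come from the table; the function name detectCategorySketch, the check order, and the use of word boundaries are assumptions, so the real detectCategory() may behave differently on edge cases.

```typescript
type Category = "decision" | "preference" | "entity" | "fact" | "other";

// Keywords are copied from the table above. The check order and the
// word-boundary matching are assumptions made for this sketch.
function detectCategorySketch(text: string): Category {
  const lower = text.toLowerCase();
  if (/\b(decided|chose|went with|selected|always use|never use|will use)\b/.test(lower)) {
    return "decision";
  }
  if (/\b(prefer|like|love|hate|want)\b/.test(lower)) {
    return "preference";
  }
  if (
    /\+\d{10,}/.test(lower) || // phone: "+" followed by 10+ digits
    /[\w.+-]+@[\w-]+\.[\w.-]+/.test(lower) || // email address
    /\bis called\b/.test(lower)
  ) {
    return "entity";
  }
  if (/\b(born|birthday|lives|works|is|are|has|have)\b/.test(lower)) {
    return "fact";
  }
  return "other"; // fallback, picked up later by auto-classify
}
```

Order matters here: the fact keywords include very common words such as "is", so in this sketch the more specific categories are tested first.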

Structured field extraction

After category detection, extractStructuredFields() extracts entity / key / value triples:

Pattern	Extracted fields	Example
`decided/chose X because Y`	entity=`decision`, key=`X`, value=`Y`	“Decided to use Postgres because JSONB”
`always/never X`	entity=`convention`, key=`X`	“Always use strict mode”
`X's Y is Z` / `My Y is Z`	entity=`X`/`user`, key=`Y`, value=`Z`	“My birthday is Nov 13”
`I prefer/like/hate X`	entity=`user`, key=`prefer`/`like`/`hate`, value=`X`	“I prefer dark mode”
Email found	key=`email`, value=address	“john@example.com”
Phone found	key=`phone`, value=number	“+1234567890”
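
The extraction rules in the table can be sketched as a chain of regex matches. This is an assumption-heavy illustration: the name extractStructuredFieldsSketch, the StructuredFields type, the exact regexes, the rule order, and the choice to drop a leading "to" from the decision key are all hypothetical.

```typescript
interface StructuredFields {
  entity?: string;
  key?: string;
  value?: string;
}

const EMAIL_RE = /[\w.+-]+@[\w-]+\.[\w.-]+/;
const PHONE_RE = /\+\d{10,}/;

// Rules are tried in table order; the first match wins (assumed behavior).
function extractStructuredFieldsSketch(text: string): StructuredFields {
  const t = text.trim();
  let m: RegExpMatchArray | null;

  // "decided/chose X because Y"
  if ((m = t.match(/\b(?:decided|chose)\s+(?:to\s+)?(.+?)\s+because\s+(.+)/i))) {
    return { entity: "decision", key: m[1], value: m[2] };
  }
  // "always/never X"
  if ((m = t.match(/\b(?:always|never)\s+(.+)/i))) {
    return { entity: "convention", key: m[1] };
  }
  // "My Y is Z"
  if ((m = t.match(/^my\s+(.+?)\s+is\s+(.+)/i))) {
    return { entity: "user", key: m[1], value: m[2] };
  }
  // "X's Y is Z" (straight or curly apostrophe)
  if ((m = t.match(/^(\w+)['’]s\s+(.+?)\s+is\s+(.+)/i))) {
    return { entity: m[1], key: m[2], value: m[3] };
  }
  // "I prefer/like/hate X"
  if ((m = t.match(/\bI\s+(prefer|like|hate)\s+(.+)/i))) {
    return { entity: "user", key: String(m[1]).toLowerCase(), value: m[2] };
  }
  // Bare email or phone number anywhere in the text
  if ((m = t.match(EMAIL_RE))) return { key: "email", value: m[0] };
  if ((m = t.match(PHONE_RE))) return { key: "phone", value: m[0] };
  return {};
}
```

For example, under these assumptions "My birthday is Nov 13" yields entity `user`, key `birthday`, value `Nov 13`.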

Adding heuristic patterns for custom categories

The built-in detectCategory() only recognizes a subset of the default categories (not pattern or rule, which are assigned by the reflection layer). To add a heuristic for a custom category, edit detectCategory() in index.ts:

// Before the final return:
if (/research|paper|study|journal|arxiv/i.test(lower)) return "research";
return "other";

Without this, custom categories are only assigned via explicit memory_store calls or the LLM auto-classifier.

Auto-Classify (LLM Reclassification)

How it works

- No inline LLM calls. During auto-capture, facts are classified by fast heuristics only.
- Background batch job. If autoClassify.enabled is true, the job runs:
  - Once on startup (after a 5-minute delay).
  - Then every 24 hours.
- Safe. It only reclassifies facts currently categorized as "other".
- Batched. Facts are sent in batches of batchSize (default 20) with a 500 ms pause between batches.
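
The schedule and batching described above can be sketched as follows. The timing values and the default batch size come from this page; the function names (chunk, runAutoClassifySketch, scheduleAutoClassify) and the classifyBatch callback are hypothetical stand-ins for the plugin's internals.

```typescript
const STARTUP_DELAY_MS = 5 * 60 * 1000; // 5 minutes after startup
const INTERVAL_MS = 24 * 60 * 60 * 1000; // then every 24 hours

// Split a list into consecutive batches of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) out.push(items.slice(i, i + size));
  return out;
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// `classifyBatch` is a hypothetical callback that returns one label per fact.
async function runAutoClassifySketch(
  otherFacts: string[],
  classifyBatch: (facts: string[]) => Promise<string[]>,
  batchSize = 20,
  pauseMs = 500,
): Promise<Map<string, string>> {
  const updates = new Map<string, string>();
  const batches = chunk(otherFacts, batchSize);
  for (const [index, batch] of batches.entries()) {
    const labels = await classifyBatch(batch);
    batch.forEach((fact, i) => {
      const label = labels[i];
      // Facts the LLM still labels "other" are left unchanged.
      if (label !== undefined && label !== "other") updates.set(fact, label);
    });
    if (index < batches.length - 1) await sleep(pauseMs); // pause between batches
  }
  return updates;
}

// Returns a function that cancels both the startup timer and the daily timer.
function scheduleAutoClassify(run: () => void): () => void {
  let interval: ReturnType<typeof setInterval> | undefined;
  const startup = setTimeout(() => {
    run();
    interval = setInterval(run, INTERVAL_MS);
  }, STARTUP_DELAY_MS);
  return () => {
    clearTimeout(startup);
    if (interval !== undefined) clearInterval(interval);
  };
}
```

Because the runner only ever receives facts already categorized as "other", a bad LLM response cannot overwrite a category assigned by the heuristics or by an explicit memory_store call.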

LLM prompt

You are a memory classifier. Categorize each fact into exactly one category.
Available categories: preference, fact, decision, entity (plus custom categories, minus “other”)
Use “other” ONLY if no category fits at all.
Respond with ONLY a JSON array of category strings.
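
A minimal sketch of how such a prompt might be assembled and its JSON reply validated. The prompt sentences are taken from above; the helper names, the numbered-list layout for the facts, and the fall-back-to-"other" policy for malformed replies are assumptions rather than documented behavior.

```typescript
// Build the prompt text; "other" is excluded from the offered categories.
function buildClassifyPrompt(categories: string[], facts: string[]): string {
  const available = categories.filter((c) => c !== "other").join(", ");
  // Numbering the facts is an assumed layout, not the plugin's confirmed format.
  const numbered = facts.map((f, i) => `${i + 1}. ${f}`).join("\n");
  return [
    "You are a memory classifier. Categorize each fact into exactly one category.",
    `Available categories: ${available}`,
    'Use "other" ONLY if no category fits at all.',
    "Respond with ONLY a JSON array of category strings.",
    "",
    numbered,
  ].join("\n");
}

// Parse the reply defensively: anything unexpected leaves facts as "other".
function parseClassifyResponse(raw: string, expected: number, allowed: string[]): string[] {
  const fallback = new Array<string>(expected).fill("other");
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return fallback; // not valid JSON
  }
  if (!Array.isArray(parsed) || parsed.length !== expected) return fallback;
  return parsed.map((c) => (typeof c === "string" && allowed.includes(c) ? c : "other"));
}
```

Mapping unknown labels back to "other" keeps such facts eligible for the next batch run instead of storing a category that does not exist.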

CLI commands

Command	Description
`hybrid-mem classify --dry-run`	Preview classifications without applying
`hybrid-mem classify`	Run LLM auto-classify immediately
`hybrid-mem classify --limit N`	Classify at most N facts
`hybrid-mem classify --model M`	Override the LLM model
`hybrid-mem categories`	List all categories with fact counts

Decay and Pruning

No cron or external jobs are required. The plugin handles decay automatically: it hard-deletes expired facts on gateway start, and every 60 minutes it runs a hard prune and soft-decays confidence. Decay classes: permanent, durable (~3mo), normal (2w), short (2d), ephemeral (4h), plus the legacy classes stable, active, session, and checkpoint. Facts in the durable, normal, stable, and active classes get their expiry refreshed when recalled.
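
To make the TTL and refresh-on-recall rules concrete, here is a small sketch covering the non-legacy classes only. The TTL values follow the list above, with "~3mo" approximated as 90 days; the type and function names are hypothetical, and the legacy classes are omitted because this page does not state their TTLs.

```typescript
type DecayClass = "permanent" | "durable" | "normal" | "short" | "ephemeral";

const HOUR = 60 * 60; // seconds
const DAY = 24 * HOUR;

const TTL_SECONDS: Record<DecayClass, number | null> = {
  permanent: null, // never expires
  durable: 90 * DAY, // "~3mo" approximated as 90 days (assumption)
  normal: 14 * DAY,
  short: 2 * DAY,
  ephemeral: 4 * HOUR,
};

// Of the classes modeled here, only these get their expiry refreshed on recall.
const REFRESH_ON_RECALL = new Set<DecayClass>(["durable", "normal"]);

interface DecayingFact {
  decayClass: DecayClass;
  expiresAt: number | null; // Unix seconds, or null for no expiry
}

function computeExpiry(decayClass: DecayClass, nowSec: number): number | null {
  const ttl = TTL_SECONDS[decayClass];
  return ttl === null ? null : nowSec + ttl;
}

function isExpired(fact: DecayingFact, nowSec: number): boolean {
  return fact.expiresAt !== null && fact.expiresAt <= nowSec;
}

// Recalling a refreshable fact pushes its expiry out by a full TTL.
function onRecall(fact: DecayingFact, nowSec: number): DecayingFact {
  if (!REFRESH_ON_RECALL.has(fact.decayClass)) return fact;
  return { ...fact, expiresAt: computeExpiry(fact.decayClass, nowSec) };
}
```

In this model a normal fact that keeps being recalled never reaches its two-week expiry, while a short or ephemeral fact expires on schedule regardless of access.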

Manual controls: openclaw hybrid-mem prune (options: --soft, --dry-run) runs a prune on demand, and openclaw hybrid-mem backfill-decay re-classifies existing facts.

→ Full detail: DECAY-AND-PRUNING.md

Source date

Facts have an optional source_date (Unix seconds): when the fact originated, not when it was stored.

Source	`source_date`	`created_at`
Live capture	`null` (uses `created_at`)	Insertion time
Distillation from a Jan 15 session	`2026-01-15` (as Unix seconds)	Insertion time (e.g. Feb 16)
Backfill from `[2026-01-15]` prefix	Parsed from text	Insertion time

Ordering: Lookup, search, and recall use COALESCE(source_date, created_at) for temporal ordering.

memory_store tool: Optional sourceDate (ISO-8601 or Unix seconds). CLI: openclaw hybrid-mem store --text "..." --source-date 2026-01-15
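
The COALESCE ordering and the two accepted sourceDate formats can be mirrored in a few lines. This is a sketch under assumptions: the DatedFact shape and the function names are hypothetical, and sorting newest-first is chosen only for illustration since the page does not state the sort direction.

```typescript
interface DatedFact {
  text: string;
  source_date: number | null; // Unix seconds, when the fact originated
  created_at: number; // Unix seconds, when the fact was stored
}

// Accepts Unix seconds (number or all-digit string) or an ISO-8601 string.
// Note: an all-digit string such as "20260115" is read as Unix seconds.
function parseSourceDate(input: string | number): number {
  if (typeof input === "number") return Math.floor(input);
  const s = input.trim();
  if (/^\d+$/.test(s)) return Number(s);
  const ms = Date.parse(s); // date-only ISO strings are parsed as UTC
  if (Number.isNaN(ms)) throw new Error(`Unrecognized sourceDate: ${input}`);
  return Math.floor(ms / 1000);
}

// Same idea as COALESCE(source_date, created_at).
const effectiveDate = (f: DatedFact): number => f.source_date ?? f.created_at;

// Newest-first is an assumed direction for this example.
function newestFirst(facts: DatedFact[]): DatedFact[] {
  return [...facts].sort((a, b) => effectiveDate(b) - effectiveDate(a));
}
```

With this ordering, a fact distilled today from a January session sorts by its January origin date rather than by today's insertion time.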

Auto-tagging

Facts can have optional tags for topic filtering. When tags are omitted, the plugin infers tags from fact text (and entity) via regex patterns. Tag-filtered search/lookup and memory_recall(tag="…") use only SQLite with a tag filter. Manual override: pass tags to memory_store or hybrid-mem store --tags "a,b".
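
Regex-based tag inference with a manual override might be sketched like this. The tag names and patterns are invented for illustration (the built-in patterns are documented in AUTO-TAGGING.md), the helper names are hypothetical, and treating an empty explicit list as "omitted" is an assumption.

```typescript
// Hypothetical tag patterns; the plugin's built-in patterns may differ.
const TAG_PATTERNS: ReadonlyArray<readonly [string, RegExp]> = [
  ["database", /\b(postgres|sqlite|mysql)\b/i],
  ["ui", /\b(dark mode|light mode|theme)\b/i],
];

// Infer tags from the fact text and, if present, its entity.
function inferTags(text: string, entity?: string): string[] {
  const haystack = entity ? `${text} ${entity}` : text;
  return TAG_PATTERNS.filter(([, re]) => re.test(haystack)).map(([tag]) => tag);
}

// Parse a comma-separated value such as the one passed to --tags "a,b".
function parseTagsFlag(flag: string): string[] {
  return flag
    .split(",")
    .map((t) => t.trim())
    .filter((t) => t.length > 0);
}

// Explicit tags win; inference runs only when none are supplied.
function resolveTags(text: string, entity?: string, explicit?: string[]): string[] {
  return explicit && explicit.length > 0 ? explicit : inferTags(text, entity);
}
```

For instance, `resolveTags("I prefer dark mode")` would infer the hypothetical `ui` tag, while passing `parseTagsFlag("a,b")` as the explicit list skips inference entirely.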

→ Full detail: AUTO-TAGGING.md

Related docs

PERSONA-PROPOSALS.md — Persona proposals (agent self-evolution, human approval)
AUTO-TAGGING.md — Auto-tagging (patterns, storage, filtering)
DECAY-AND-PRUNING.md — Decay classes, TTLs, pruning
DEEP-DIVE.md — Storage internals, search algorithms, tags, links, deduplication
CONFIGURATION.md — Config reference for all features
CLI-REFERENCE.md — All CLI commands
ARCHITECTURE.md — System design overview
REFLECTION.md — Reflection layer (pattern synthesis from facts)
GRAPH-MEMORY.md — Graph-based spreading activation (fact linking)
CREDENTIALS.md — Credential vault (opt-in encrypted store)
SESSION-DISTILLATION.md — Extracting facts from session logs
PROCEDURAL-MEMORY.md — Procedural memory (procedure tagging, recall, auto-skills)
CONFLICTING-MEMORIES.md — Conflicting/contradictory memories (classify-before-write, supersession)
AUTOMATIC-CATEGORIES.md — Automatic category discovery
DYNAMIC-DERIVED-DATA.md — Overview of tags, categories, decay, and other derived data
DYNAMIC-SALIENCE.md — Access-based importance
