Enhancement Issue: Centralized Timer Lifecycle Management
To file this as a GitHub issue, copy the content below into a new issue on the repository.
Title: Enhancement: Centralized Timer Lifecycle Management
Labels: enhancement, refactoring, technical-debt
Priority: Medium
Effort: 6-8 weeks
Summary
Create a centralized TimerManager utility class to consolidate timer lifecycle management across the plugin, preventing race conditions and ensuring consistent cleanup patterns.
Motivation
The codebase currently has 150+ timer call sites (setTimeout, setInterval, setImmediate) scattered across multiple files with inconsistent lifecycle management patterns. Recent fixes (see branch claude/identify-critical-bug-scope) addressed critical race conditions where timers accessed closed database connections during shutdown, but each component still manages timers independently, making the code error-prone and difficult to audit.
Current Problems
- Inconsistent Patterns: Each component implements its own timer management with varying levels of safety
- Error-Prone: Easy to forget database connection checks or cleanup handlers
- Hard to Audit: No centralized view of active timers
- Shutdown Complexity: Each timer requires manual clearInterval/clearTimeout calls
- Race Conditions: Deferred operations can execute after databases close
Current Workarounds
After recent fixes, most timers now have:
- Manual shuttingDown checks
- Manual factsDb.isOpen() checks
- Manual cleanup in shutdown handlers
- Error suppression for “database connection is not open”
This works but is repetitive and maintenance-heavy.
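The repetition is easier to see in code. The sketch below is illustrative only: the shuttingDown flag, factsDb.isOpen(), and the suppressed error message come from this issue, while pruneTick, startPruneTimer, shutdown, and the in-memory factsDb stand-in are hypothetical names, not the plugin's actual code.

```typescript
// Hypothetical stand-ins for plugin state. Only the shuttingDown flag,
// factsDb.isOpen(), and the error message are taken from this issue.
let shuttingDown = false;
let dbOpen = true;
const factsDb = {
  isOpen: (): boolean => dbOpen,
  prune: (): void => {
    if (!dbOpen) throw new Error('database connection is not open');
  },
};

let pruneTimer: ReturnType<typeof setInterval> | undefined;

// Every timer callback repeats the same guards by hand.
// Returns true only when the database work actually ran.
function pruneTick(): boolean {
  if (shuttingDown) return false; // manual shutdown check
  if (!factsDb.isOpen()) return false; // manual connection check
  try {
    factsDb.prune();
    return true;
  } catch (err) {
    // Manual suppression of the known shutdown race; anything else propagates.
    if (err instanceof Error && err.message.includes('database connection is not open')) {
      return false;
    }
    throw err;
  }
}

function startPruneTimer(): void {
  pruneTimer = setInterval(pruneTick, 60000);
}

// Each timer also needs its own entry in the shutdown handler.
function shutdown(): void {
  shuttingDown = true;
  if (pruneTimer !== undefined) {
    clearInterval(pruneTimer);
    pruneTimer = undefined;
  }
  dbOpen = false;
}
```

Multiplied across many call sites, each of these guards is a line that can be forgotten or written slightly differently.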
Proposed Solution
Create a TimerManager class with the following features and API.
Core Features
- Tracks Active Timers: Maintains a registry of all scheduled operations
- Lifecycle-Aware: Checks database connection state before executing callbacks
- Automatic Cleanup: Clears all timers on shutdown
- Type-Safe APIs: Provides TypeScript-friendly scheduling methods
- Graceful Degradation: Handles shutdown gracefully without errors
API Design
```typescript
interface TimerManagerOptions {
  // Returns true once plugin shutdown has started
  shuttingDown: () => boolean;
  // Optional check that the database connection is still open
  isDbOpen?: () => boolean;
  logger?: { warn?: (msg: string) => void; debug?: (msg: string) => void };
}

declare class TimerManager {
  constructor(options: TimerManagerOptions);

  // Schedule a one-time callback
  schedule(callback: () => void | Promise<void>, delayMs: number): TimerHandle;

  // Schedule a repeating callback
  scheduleInterval(callback: () => void | Promise<void>, intervalMs: number): TimerHandle;

  // Schedule for the next event loop iteration (setImmediate semantics)
  scheduleImmediate(callback: () => void | Promise<void>): TimerHandle;

  // Clear a specific timer
  clear(handle: TimerHandle): void;

  // Clear all timers (called during shutdown)
  clearAll(): void;

  // Get stats for monitoring
  getStats(): { active: number; scheduled: number; immediate: number; interval: number };
}

interface TimerHandle {
  readonly id: string;
  readonly type: 'timeout' | 'interval' | 'immediate';
  cancel(): void;
}
```
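One way the API above could be implemented is sketched below. It is a minimal sketch under stated assumptions, not a finished design: scheduleImmediate (and the immediate counter in getStats) is left out for brevity, the id format and the private helpers register, clearById, and run are hypothetical, and callback failures are assumed to be logged rather than rethrown.

```typescript
// Hypothetical sketch of the proposed TimerManager, not the plugin's actual code.
// scheduleImmediate is left out to keep the example short.
type TimerType = 'timeout' | 'interval';
type TimerCallback = () => void | Promise<void>;

interface TimerManagerOptions {
  shuttingDown: () => boolean;
  isDbOpen?: () => boolean;
  logger?: { warn?: (msg: string) => void; debug?: (msg: string) => void };
}

interface TimerHandle {
  readonly id: string;
  readonly type: TimerType;
  cancel(): void;
}

class TimerManager {
  private readonly options: TimerManagerOptions;
  // Registry of live timers: handle id -> type plus a function clearing the native timer.
  private readonly timers = new Map<string, { type: TimerType; clear: () => void }>();
  private nextId = 0;

  constructor(options: TimerManagerOptions) {
    this.options = options;
  }

  schedule(callback: TimerCallback, delayMs: number): TimerHandle {
    const id = `timeout-${this.nextId++}`;
    const native = setTimeout(() => {
      this.timers.delete(id); // a one-shot timer leaves the registry when it fires
      this.run(id, callback);
    }, delayMs);
    return this.register(id, 'timeout', () => clearTimeout(native));
  }

  scheduleInterval(callback: TimerCallback, intervalMs: number): TimerHandle {
    const id = `interval-${this.nextId++}`;
    const native = setInterval(() => this.run(id, callback), intervalMs);
    return this.register(id, 'interval', () => clearInterval(native));
  }

  clear(handle: TimerHandle): void {
    this.clearById(handle.id);
  }

  clearAll(): void {
    this.timers.forEach((timer) => timer.clear());
    this.timers.clear();
  }

  getStats(): { active: number; scheduled: number; interval: number } {
    let scheduled = 0;
    let interval = 0;
    this.timers.forEach((timer) => {
      if (timer.type === 'timeout') scheduled += 1;
      else interval += 1;
    });
    return { active: this.timers.size, scheduled, interval };
  }

  private register(id: string, type: TimerType, clear: () => void): TimerHandle {
    this.timers.set(id, { type, clear });
    return { id, type, cancel: () => this.clearById(id) };
  }

  private clearById(id: string): void {
    const timer = this.timers.get(id);
    if (timer === undefined) return; // already fired or cleared
    timer.clear();
    this.timers.delete(id);
  }

  // Lifecycle guard: skip the callback during shutdown or when the database is
  // closed, and log (rather than throw) if the callback fails.
  private run(id: string, callback: TimerCallback): void {
    const opts = this.options;
    if (opts.shuttingDown() || (opts.isDbOpen !== undefined && !opts.isDbOpen())) {
      opts.logger?.debug?.(`timer ${id} skipped: shutting down or database closed`);
      return;
    }
    const onError = (err: unknown): void => opts.logger?.warn?.(`timer ${id} failed: ${String(err)}`);
    try {
      Promise.resolve(callback()).catch(onError);
    } catch (err) {
      onError(err);
    }
  }
}
```

With a shape like this, the shutdown handler would call clearAll() once instead of clearing each timer individually, and the guard in run would cover callbacks that fire between the start of shutdown and that call.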
Migration Strategy
Phase 1: Create Utility (Week 1)
- Create extensions/memory-hybrid/utils/timer-manager.ts
- Add comprehensive unit tests
- Document API and usage patterns
Phase 2: Migrate High-Risk Sites (Week 2-3)
Priority order based on risk:
1. lifecycle/stage-injection.ts - Hebbian link strengthening
2. index.ts - Cron verification
3. services/python-bridge.ts - Request timers
4. lifecycle/stage-cleanup.ts - Stale session sweep
Phase 3: Migrate Plugin Service Timers (Week 4-5)
- setup/plugin-service.ts - Prune timer
- setup/plugin-service.ts - Classify timer
- setup/plugin-service.ts - Language keywords timer
- setup/plugin-service.ts - Passive observer timer
- setup/plugin-service.ts - Watchdog timer
Phase 4: Migrate Remaining Sites (Week 6-8)
- Auto-classifier delays
- Embedding migration delays
- CLI command timeouts
- Test utilities (if beneficial)
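A migrated call site might look roughly like the sketch below. Only scheduleInterval, TimerHandle, and cancel() come from the API design; the narrow IntervalScheduler interface, startPruneTimer, pruneStaleFacts, and the one-hour interval are hypothetical names and values chosen for illustration.

```typescript
// Handle shape as proposed in the API design.
interface TimerHandle {
  readonly id: string;
  readonly type: 'timeout' | 'interval' | 'immediate';
  cancel(): void;
}

// Narrow view of the proposed TimerManager: a call site depends only on the
// method it uses, which also keeps it easy to test with a fake.
interface IntervalScheduler {
  scheduleInterval(callback: () => void | Promise<void>, intervalMs: number): TimerHandle;
}

// Before (the manual pattern described under "Current Workarounds"):
//   const timer = setInterval(() => {
//     if (shuttingDown || !factsDb.isOpen()) return;
//     pruneStaleFacts();
//   }, PRUNE_INTERVAL_MS);
//   ...plus a matching clearInterval(timer) in the shutdown handler.

const PRUNE_INTERVAL_MS = 60 * 60 * 1000; // hypothetical interval

// After: lifecycle checks and cleanup become the manager's responsibility,
// so the call site only states what to run and how often.
function startPruneTimer(timers: IntervalScheduler, pruneStaleFacts: () => void): TimerHandle {
  return timers.scheduleInterval(pruneStaleFacts, PRUNE_INTERVAL_MS);
}
```

Passing the scheduler in as a parameter is one design option; it keeps the open singleton-versus-instance question independent of how individual call sites are written.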
Benefits
For Developers
- Single Source of Truth: One place to understand timer behavior
- Reduced Boilerplate: No need to manually check shutdown state
- Type Safety: TypeScript ensures correct usage
- Easy Testing: Mock the TimerManager for tests
For Operations
- Better Observability: getStats() shows active timer count
- Cleaner Shutdown: Automatic cleanup prevents orphaned timers
- Reduced Errors: Fewer “database connection is not open” errors in logs
For Maintenance
- Easier Audits: All timer call sites use the same API
- Consistent Patterns: Same error handling everywhere
- Future-Proof: Easy to add features (e.g., rate limiting, retry logic)
Testing Requirements
- Unit Tests:
- Timer creation and cleanup
- Shutdown state checks
- Database connection checks
- Error handling and suppression
- Stats reporting
- Integration Tests:
- Plugin reload with active timers
- Graceful shutdown scenarios
- Database connection loss during timer execution
- Performance Tests:
- Overhead of wrapped callbacks
- Memory usage with many timers
- Cleanup performance
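The shutdown-state and database-connection checks can be unit tested without real timers if the lifecycle guard is factored into a standalone function. The sketch below assumes such a helper, here called guardCallback (a hypothetical name), and uses a tiny check function as a stand-in for whatever test framework the repository uses.

```typescript
interface GuardOptions {
  shuttingDown: () => boolean;
  isDbOpen?: () => boolean;
}

// Hypothetical helper: wraps a callback so it becomes a no-op once the plugin
// is shutting down or the database is closed. The wrapper returns whether the
// callback actually ran.
function guardCallback(options: GuardOptions, callback: () => void): () => boolean {
  return () => {
    if (options.shuttingDown()) return false;
    if (options.isDbOpen !== undefined && !options.isDbOpen()) return false;
    callback();
    return true;
  };
}

// Minimal assertion helper standing in for the project's test framework.
function check(condition: boolean, message: string): void {
  if (!condition) throw new Error(`check failed: ${message}`);
}

function testSkipsCallbackDuringShutdown(): void {
  let shuttingDown = false;
  let runs = 0;
  const guarded = guardCallback({ shuttingDown: () => shuttingDown }, () => { runs += 1; });
  check(guarded(), 'runs while the plugin is live');
  shuttingDown = true;
  check(!guarded(), 'is skipped after shutdown starts');
  check(runs === 1, 'callback ran exactly once');
}

function testSkipsCallbackWhenDbClosed(): void {
  let dbOpen = true;
  let runs = 0;
  const guarded = guardCallback(
    { shuttingDown: () => false, isDbOpen: () => dbOpen },
    () => { runs += 1; },
  );
  guarded();
  dbOpen = false;
  guarded();
  check(runs === 1, 'callback is skipped once the database closes');
}

testSkipsCallbackDuringShutdown();
testSkipsCallbackWhenDbClosed();
```

Because the guard is synchronous and takes its state through injected functions, these tests need neither real delays nor a real database.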
Success Metrics
- Zero “database connection is not open” errors from timer callbacks
- All timers cleared within 5 seconds of shutdown initiation
- <1ms overhead per timer callback
- 100% test coverage for TimerManager
- All high-risk timer sites migrated
Related
- Timer race condition fixes in branch claude/identify-critical-bug-scope
- See TIMER_RACE_CONDITIONS_ANALYSIS.md for complete technical analysis
- Builds on fixes for race conditions in Hebbian links, cron verification, Python bridge, etc.
Out of Scope
- Changing Node.js timer behavior
- Replacing async/await patterns
- Modifying test utilities (separate effort)
Questions for Discussion
- Should TimerManager be a singleton or per-context instance?
- Should it integrate with GlitchTip for timer telemetry?
- Should it support priority scheduling?
- Should it provide timer coalescing for performance?
Implementation Note: This is an optional enhancement that builds on the immediate race condition fixes. The current fixes work well, but this consolidation would make the codebase more maintainable long-term.