# Full History Pattern
The Full History pattern maintains a complete record of all interactions, contexts, and states throughout an agent’s lifecycle. This approach provides maximum information retention but requires careful implementation to handle scale and performance challenges.
## Overview
The Full History pattern stores every piece of information the agent encounters, including:
- Complete conversation transcripts
- All system states and context switches
- User inputs, agent outputs, and intermediate reasoning steps
- Environmental observations and action results
- Error states and recovery attempts
This pattern is ideal when you need complete auditability, complex reasoning chains, or when the cost of losing information exceeds storage and processing overhead.
## Architecture

```python
from datetime import datetime


class FullHistoryMemory:
    def __init__(self):
        self.interactions = []   # Every user/agent exchange, in order
        self.states = []         # Snapshots of system state over time
        self.context_stack = []  # Currently active contexts (e.g., nested tasks)
        self.metadata = {}

    def store_interaction(self, user_input, agent_response, context):
        interaction = {
            'timestamp': datetime.now(),
            'user_input': user_input,
            'agent_response': agent_response,
            'context': context,
            'state_id': len(self.states),  # Links to the most recent state
        }
        self.interactions.append(interaction)

    def store_state(self, state_data):
        state = {
            'timestamp': datetime.now(),
            'data': state_data,
            'interaction_id': len(self.interactions),  # Links back to interactions
        }
        self.states.append(state)

    def retrieve_full_context(self):
        return {
            'interactions': self.interactions,
            'states': self.states,
            'context_stack': self.context_stack,
        }
```

## Implementation Considerations
### Storage Strategy
- Append-only logs: Use immutable storage patterns for data integrity
- Compression: Implement compression for older entries
- Partitioning: Split data by time period or conversation session (see the sketch after this list)
- Indexing: Create indices on timestamps, interaction types, and key entities
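To make the partitioning idea concrete, here is a minimal sketch of an append-only log split by calendar month. The `PartitionedLog` class and its method names are illustrative, not from any particular library; a production system would back each partition with durable storage.

```python
from collections import defaultdict
from datetime import datetime

class PartitionedLog:
    """Minimal sketch: an append-only log partitioned by calendar month."""

    def __init__(self):
        # One append-only list per partition key; entries are never mutated
        self.partitions = defaultdict(list)

    @staticmethod
    def partition_key(timestamp: datetime) -> str:
        return timestamp.strftime("%Y-%m")  # e.g. "2024-06"

    def append(self, entry: dict) -> None:
        self.partitions[self.partition_key(entry["timestamp"])].append(entry)

    def scan(self, key: str):
        # Queries touch only one partition instead of the whole history
        return iter(self.partitions[key])

log = PartitionedLog()
log.append({"timestamp": datetime.now(), "user_input": "Hello"})
```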
### Memory Management
```python
class OptimizedFullHistory:
    def __init__(self, max_memory_mb=1000):
        self.max_memory = max_memory_mb * 1024 * 1024
        self.hot_memory = []      # Recent interactions kept in RAM
        self.cold_storage = None  # Database/file storage backend

    def manage_memory(self):
        if self.get_memory_usage() > self.max_memory:
            # Move the oldest 20% of hot entries to cold storage
            cutoff = len(self.hot_memory) // 5
            self.cold_storage.store_batch(self.hot_memory[:cutoff])
            self.hot_memory = self.hot_memory[cutoff:]
```

### Data Structure Optimization
- Use efficient serialization formats such as Protocol Buffers or MessagePack (see the sketch after this list)
- Implement lazy loading for historical data
- Consider columnar storage for analytical queries
- Use memory mapping for large datasets
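For example, MessagePack serialization combined with lazy decoding might look like the sketch below, assuming the third-party `msgpack` package is installed; the `LazyEntry` wrapper is illustrative, not a standard API.

```python
import msgpack  # pip install msgpack

class LazyEntry:
    """Sketch: keep entries as packed bytes, decode only on first access."""

    def __init__(self, record: dict):
        # default=str converts unsupported types such as datetime to strings
        self._packed = msgpack.packb(record, default=str)
        self._cache = None

    def load(self) -> dict:
        if self._cache is None:  # Lazy loading: unpack once, on demand
            self._cache = msgpack.unpackb(self._packed)
        return self._cache

entry = LazyEntry({"user_input": "Hello", "agent_response": "Hi there"})
print(entry.load()["agent_response"])  # Bytes are decoded only here
```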
## Performance Characteristics

### Pros
- Complete Information: No data loss, perfect recall
- Audit Trail: Full traceability of agent decisions
- Complex Reasoning: Can reference any past interaction
- Debugging: Excellent for troubleshooting and analysis
- Learning: Rich dataset for training and improvement
### Cons
- Storage Cost: Linear growth with usage
- Query Latency: Slower retrieval as history grows
- Memory Pressure: RAM usage increases over time
- Processing Overhead: More data to filter and process
### Performance Metrics
```python
# Typical performance characteristics
STORAGE_GROWTH = "O(n)"      # Linear in the number of interactions
RETRIEVAL_TIME = "O(log n)"  # With proper indexing
MEMORY_USAGE = "O(k)"        # Where k is the hot-memory limit
QUERY_COMPLEXITY = "O(n)"    # Worst case: full scan
```

## When to Use
### Ideal Scenarios
- High-stakes applications where any information loss is unacceptable
- Complex multi-turn conversations requiring deep context
- Compliance requirements needing complete audit trails
- Research applications where data richness matters
- Long-running agents with evolving capabilities
### Not Recommended For
- Resource-constrained environments with storage/memory limits
- High-throughput systems where latency is critical
- Simple task-oriented agents with limited context needs
- Privacy-sensitive applications requiring data minimization
## Implementation Example
```python
from cachetools import LRUCache  # pip install cachetools


class ProductionFullHistory:
    def __init__(self, config):
        self.db = self._init_database(config.db_url)
        self.cache = LRUCache(maxsize=config.cache_size)
        self.compression = config.enable_compression

    def store(self, interaction_data):
        # Store with automatic time-based partitioning
        partition_key = self._get_partition(interaction_data.timestamp)
        if self.compression:
            interaction_data = self._compress(interaction_data)
        self.db.insert(
            table=f"interactions_{partition_key}",
            data=interaction_data,
        )
        # Keep the freshest entries in the cache
        self.cache[interaction_data.id] = interaction_data

    def retrieve(self, query_params):
        # Try the cache first
        if query_params.interaction_id in self.cache:
            return self.cache[query_params.interaction_id]
        # Fall back to the database, using the indices defined below
        results = self.db.query(
            query=self._build_query(query_params),
            use_index=True,
        )
        return self._decompress_results(results)
```

## Best Practices
### Data Lifecycle Management
- Tiered Storage: Move old data to cheaper storage tiers (a sketch follows this list)
- Compression Policies: Compress data older than X days
- Retention Policies: Define legal/business retention requirements
- Backup Strategy: Implement incremental backups
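As a sketch of how these policies might be expressed in code, the function below classifies entries into hypothetical `hot`/`warm`/`archive` tiers by age. The thresholds are placeholders; real values should come from your legal and business retention requirements.

```python
from datetime import datetime, timedelta

# Placeholder thresholds; replace with your actual policy
COMPRESS_AFTER = timedelta(days=30)
ARCHIVE_AFTER = timedelta(days=365)

def lifecycle_tier(entry_timestamp: datetime) -> str:
    """Classify an entry into a storage tier based on its age."""
    age = datetime.now() - entry_timestamp
    if age > ARCHIVE_AFTER:
        return "archive"  # Cheapest tier: compressed, rarely read
    if age > COMPRESS_AFTER:
        return "warm"     # Compressed but still directly queryable
    return "hot"          # Recent data in fast, uncompressed storage

print(lifecycle_tier(datetime.now() - timedelta(days=90)))  # -> "warm"
```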
### Query Optimization
```sql
-- Index strategy for common queries
CREATE INDEX idx_timestamp ON interactions (timestamp);
CREATE INDEX idx_user_session ON interactions (user_id, session_id);
CREATE INDEX idx_content_type ON interactions (interaction_type);
CREATE INDEX idx_entities ON interactions USING gin(extracted_entities);  -- PostgreSQL GIN index
```

### Monitoring and Alerting
- Track storage growth rates (see the sketch after this list)
- Monitor query performance
- Alert on memory pressure
- Watch for data corruption
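One lightweight way to implement the growth check is sketched below: compare the current storage footprint against the previous sample and emit a warning when the ratio exceeds a threshold. The logger name and the threshold value are illustrative.

```python
import logging

logger = logging.getLogger("full_history.monitoring")
GROWTH_ALERT_RATIO = 1.5  # Warn if storage grew >50% between samples

def check_storage_growth(previous_bytes: int, current_bytes: int) -> None:
    if previous_bytes and current_bytes / previous_bytes > GROWTH_ALERT_RATIO:
        logger.warning(
            "History storage grew from %d to %d bytes since the last sample",
            previous_bytes,
            current_bytes,
        )

check_storage_growth(1_000_000, 1_800_000)  # Triggers a warning
```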
## Integration with Other Patterns

### Hybrid Approach

Combine the Full History pattern with other memory patterns to balance completeness against latency:
```python
class HybridMemory:
    def __init__(self):
        self.full_history = FullHistoryMemory()
        self.vector_store = VectorMemory()           # For semantic search
        self.sliding_window = SlidingWindowMemory()  # For recent context

    def retrieve(self, query):
        # Use the sliding window for immediate context
        recent = self.sliding_window.get_recent(limit=10)
        # Use vector search for semantic relevance
        relevant = self.vector_store.similarity_search(query.text)
        # Fall back to full history for a comprehensive search
        historical = self.full_history.search(query)
        return self._merge_results(recent, relevant, historical)
```

## Testing and Validation
### Unit Tests
```python
def test_full_history_storage():
    memory = FullHistoryMemory()
    # Test storage
    memory.store_interaction("Hello", "Hi there", {"mood": "friendly"})
    assert len(memory.interactions) == 1
    # Test retrieval
    context = memory.retrieve_full_context()
    assert len(context['interactions']) == 1

def test_memory_management():
    memory = OptimizedFullHistory(max_memory_mb=1)
    # Fill memory beyond the limit (store_large_interaction is assumed to
    # write an oversized entry and to invoke manage_memory internally)
    for i in range(1000):
        memory.store_large_interaction(f"Data {i}")
    # Verify memory constraints are respected
    assert memory.get_memory_usage() <= memory.max_memory
```

### Performance Benchmarks
- Measure storage growth over time
- Test store and query performance at different scales (see the sketch after this list)
- Validate memory management effectiveness
- Monitor compression ratios
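A minimal benchmark using only the standard library is sketched below; it times `store_interaction` and a full retrieval at increasing history sizes so the growth in cost is visible. The sizes are arbitrary.

```python
import time

def benchmark_memory(sizes=(1_000, 10_000, 100_000)):
    for n in sizes:
        memory = FullHistoryMemory()
        # Time n consecutive stores
        start = time.perf_counter()
        for i in range(n):
            memory.store_interaction(f"in-{i}", f"out-{i}", {})
        store_s = time.perf_counter() - start
        # Time one full-context retrieval at this history size
        start = time.perf_counter()
        memory.retrieve_full_context()
        retrieve_s = time.perf_counter() - start
        print(f"{n:>7} interactions: {store_s / n * 1e6:.1f} µs/store, "
              f"full retrieval {retrieve_s * 1e3:.2f} ms")

benchmark_memory()
```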
## Migration and Scaling

### Data Migration
```python
def migrate_to_full_history(existing_memory):
    """Migrate from another memory pattern to full history."""
    full_history = FullHistoryMemory()
    # Convert existing interactions one by one
    for interaction in existing_memory.get_all():
        full_history.store_interaction(
            interaction.input,
            interaction.output,
            interaction.context,
        )
    return full_history
```

### Horizontal Scaling
- Shard by user ID or time ranges (see the sketch after this list)
- Use distributed databases for storage
- Implement read replicas for query distribution
- Consider event sourcing patterns
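To illustrate sharding by user ID, the sketch below maps each user to one of N shards with a stable hash; `hashlib` is used because Python's built-in `hash()` is salted per process. The shard count and table naming are illustrative.

```python
import hashlib

NUM_SHARDS = 8  # Illustrative; size to your write volume

def shard_for_user(user_id: str) -> str:
    # SHA-256 gives a stable assignment across processes and restarts
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    shard = int.from_bytes(digest[:4], "big") % NUM_SHARDS
    return f"interactions_shard_{shard}"

print(shard_for_user("user-42"))  # Same user always routes to the same shard
```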
## Related Patterns
- Vector Retrieval: Combine for semantic search over full history
- Sliding Window: Use together for recent + historical context
- Graph Memory: Build knowledge graphs from full history
- Hierarchical Memory: Organize full history into levels
## Next Steps
- Assess your storage and performance requirements
- Choose appropriate database technology
- Implement data lifecycle policies
- Set up monitoring and alerting
- Consider hybrid approaches for optimization
- Plan for scaling and migration strategies
The Full History pattern provides the most comprehensive approach to agent memory but requires careful engineering to handle scale and performance effectively.