Full History Pattern

The Full History pattern maintains a complete record of all interactions, contexts, and states throughout an agent’s lifecycle. This approach provides maximum information retention but requires careful implementation to handle scale and performance challenges.

Overview

The Full History pattern stores every piece of information the agent encounters, including:

  • Complete conversation transcripts
  • All system states and context switches
  • User inputs, agent outputs, and intermediate reasoning steps
  • Environmental observations and action results
  • Error states and recovery attempts

This pattern is ideal when you need complete auditability or deep reasoning chains, or when the cost of losing information outweighs the storage and processing overhead.

Architecture

from datetime import datetime


class FullHistoryMemory:
    def __init__(self):
        self.interactions = []
        self.states = []
        self.context_stack = []
        self.metadata = {}

    def store_interaction(self, user_input, agent_response, context):
        interaction = {
            'timestamp': datetime.now(),
            'user_input': user_input,
            'agent_response': agent_response,
            'context': context,
            'state_id': len(self.states)
        }
        self.interactions.append(interaction)

    def store_state(self, state_data):
        state = {
            'timestamp': datetime.now(),
            'data': state_data,
            'interaction_id': len(self.interactions)
        }
        self.states.append(state)

    def retrieve_full_context(self):
        return {
            'interactions': self.interactions,
            'states': self.states,
            'context_stack': self.context_stack
        }

Implementation Considerations

Storage Strategy

  • Append-only logs: Use immutable storage patterns for data integrity
  • Compression: Implement compression for older entries
  • Partitioning: Split data by time periods or conversation sessions (sketched below)
  • Indexing: Create indices on timestamps, interaction types, and key entities
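
A minimal sketch of the append-only and partitioning items above, assuming month-based JSON-lines files on local disk (the directory layout and record shape are illustrative, not a prescribed schema):

import json
from datetime import datetime, timezone
from pathlib import Path

LOG_DIR = Path("history_logs")  # hypothetical location for partition files

def append_interaction(record: dict) -> None:
    """Append one interaction to the partition file for the current month."""
    ts = datetime.now(timezone.utc)
    partition = LOG_DIR / f"interactions_{ts:%Y_%m}.jsonl"
    partition.parent.mkdir(parents=True, exist_ok=True)
    # Append-only: existing lines are never rewritten, preserving integrity
    with partition.open("a", encoding="utf-8") as f:
        f.write(json.dumps({"timestamp": ts.isoformat(), **record}) + "\n")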

Memory Management

class OptimizedFullHistory:
    def __init__(self, max_memory_mb=1000):
        self.max_memory = max_memory_mb * 1024 * 1024
        self.hot_memory = []      # Recent interactions kept in RAM
        self.cold_storage = None  # Database/file storage backend

    def manage_memory(self):
        if self.get_memory_usage() > self.max_memory:
            # Move the oldest 20% of hot memory to cold storage
            cutoff = len(self.hot_memory) // 5
            self.cold_storage.store_batch(self.hot_memory[:cutoff])
            self.hot_memory = self.hot_memory[cutoff:]

Data Structure Optimization

  • Use efficient serialization formats such as Protocol Buffers or MessagePack (see the sketch after this list)
  • Implement lazy loading for historical data
  • Consider columnar storage for analytical queries
  • Use memory mapping for large datasets
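
One way to illustrate the serialization item: a sketch using MessagePack (assumes the third-party msgpack package is installed; the record shape is hypothetical):

import msgpack  # third-party: pip install msgpack

def pack_interaction(interaction: dict) -> bytes:
    """Serialize an interaction to a compact binary blob."""
    return msgpack.packb(interaction, use_bin_type=True)

def unpack_interaction(blob: bytes) -> dict:
    """Restore an interaction from its binary form."""
    return msgpack.unpackb(blob, raw=False)

blob = pack_interaction({"user_input": "Hello", "agent_response": "Hi there"})
assert unpack_interaction(blob)["user_input"] == "Hello"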

Performance Characteristics

Pros

  • Complete Information: No data loss, perfect recall
  • Audit Trail: Full traceability of agent decisions
  • Complex Reasoning: Can reference any past interaction
  • Debugging: Excellent for troubleshooting and analysis
  • Learning: Rich dataset for training and improvement

Cons

  • Storage Cost: Linear growth with usage
  • Query Latency: Slower retrieval as history grows
  • Memory Pressure: RAM usage increases over time
  • Processing Overhead: More data to filter and process

Performance Metrics

# Typical performance characteristics
STORAGE_GROWTH = "O(n)"      # Linear with interactions
RETRIEVAL_TIME = "O(log n)"  # With proper indexing
MEMORY_USAGE = "O(k)"        # Where k is the hot-memory limit
QUERY_COMPLEXITY = "O(n)"    # Worst case: full scan

When to Use

Ideal Scenarios

  • High-stakes applications where information loss is unacceptable
  • Complex multi-turn conversations requiring deep context
  • Compliance requirements needing complete audit trails
  • Research applications where data richness matters
  • Long-running agents with evolving capabilities

Avoid When

  • Resource-constrained environments with storage/memory limits
  • High-throughput systems where latency is critical
  • Simple task-oriented agents with limited context needs
  • Privacy-sensitive applications requiring data minimization

Implementation Example

class ProductionFullHistory:
    def __init__(self, config):
        self.db = self._init_database(config.db_url)
        self.cache = LRUCache(maxsize=config.cache_size)
        self.compression = config.enable_compression

    def store(self, interaction_data):
        # Store with automatic time-based partitioning
        partition_key = self._get_partition(interaction_data.timestamp)
        if self.compression:
            interaction_data = self._compress(interaction_data)
        self.db.insert(
            table=f"interactions_{partition_key}",
            data=interaction_data
        )
        # Update the cache
        self.cache[interaction_data.id] = interaction_data

    def retrieve(self, query_params):
        # Try the cache first
        if query_params.interaction_id in self.cache:
            return self.cache[query_params.interaction_id]
        # Fall back to the database with optimized indices
        results = self.db.query(
            query=self._build_query(query_params),
            use_index=True
        )
        return self._decompress_results(results)

Best Practices

Data Lifecycle Management

  1. Tiered Storage: Move old data to cheaper storage tiers
  2. Compression Policies: Compress data older than X days (see the sketch after this list)
  3. Retention Policies: Define legal/business retention requirements
  4. Backup Strategy: Implement incremental backups
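
A minimal sketch of the compression policy item, assuming entries carry a datetime timestamp and a JSON-serializable payload (the 30-day threshold and record shape are illustrative):

import json
import zlib
from datetime import datetime, timedelta

COMPRESS_AFTER_DAYS = 30  # hypothetical policy threshold

def apply_compression_policy(entries):
    """Compress payloads of entries older than the policy threshold, in place."""
    cutoff = datetime.now() - timedelta(days=COMPRESS_AFTER_DAYS)
    for entry in entries:
        if entry['timestamp'] < cutoff and not entry.get('compressed'):
            raw = json.dumps(entry['payload']).encode('utf-8')
            entry['payload'] = zlib.compress(raw)  # zlib keeps the sketch stdlib-only
            entry['compressed'] = True             # flag so we never double-compress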

Query Optimization

-- Index strategy for common queries
CREATE INDEX idx_timestamp ON interactions (timestamp);
CREATE INDEX idx_user_session ON interactions (user_id, session_id);
CREATE INDEX idx_content_type ON interactions (interaction_type);
-- GIN index (PostgreSQL-specific) for entity lookups
CREATE INDEX idx_entities ON interactions USING gin(extracted_entities);

Monitoring and Alerting

  • Track storage growth rates (sketched below)
  • Monitor query performance
  • Alert on memory pressure
  • Watch for data corruption
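
A sketch of the growth-rate and memory-pressure items, assuming a periodic health check against fixed thresholds (the thresholds and logger name are illustrative; the resource module is Unix-only):

import logging
import resource  # stdlib, Unix-only

log = logging.getLogger("history.monitor")

GROWTH_ALERT_BYTES_PER_HOUR = 50 * 1024 * 1024  # hypothetical threshold
MEMORY_ALERT_BYTES = 2 * 1024 ** 3

def check_health(prev_size: int, curr_size: int, hours_elapsed: float) -> None:
    """Warn when storage growth or process memory exceeds the thresholds."""
    growth_rate = (curr_size - prev_size) / max(hours_elapsed, 1e-9)
    if growth_rate > GROWTH_ALERT_BYTES_PER_HOUR:
        log.warning("Storage growing at %.1f MB/h", growth_rate / 2**20)
    # ru_maxrss is reported in kilobytes on Linux
    rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss * 1024
    if rss > MEMORY_ALERT_BYTES:
        log.warning("Memory pressure: RSS %.1f GB", rss / 2**30)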

Integration with Other Patterns

Hybrid Approach

Combine with other patterns for optimization:

class HybridMemory:
    def __init__(self):
        self.full_history = FullHistoryMemory()
        self.vector_store = VectorMemory()           # For semantic search
        self.sliding_window = SlidingWindowMemory()  # For recent context

    def retrieve(self, query):
        # Use the sliding window for immediate context
        recent = self.sliding_window.get_recent(limit=10)
        # Use vector search for semantic relevance
        relevant = self.vector_store.similarity_search(query.text)
        # Fall back to full history for a comprehensive search
        historical = self.full_history.search(query)
        return self._merge_results(recent, relevant, historical)

Testing and Validation

Unit Tests

def test_full_history_storage():
    memory = FullHistoryMemory()

    # Test storage
    memory.store_interaction("Hello", "Hi there", {"mood": "friendly"})
    assert len(memory.interactions) == 1

    # Test retrieval
    context = memory.retrieve_full_context()
    assert len(context['interactions']) == 1


def test_memory_management():
    memory = OptimizedFullHistory(max_memory_mb=1)

    # Fill memory beyond the limit
    for i in range(1000):
        memory.store_large_interaction(f"Data {i}")

    # Verify memory constraints are respected
    assert memory.get_memory_usage() <= memory.max_memory

Performance Benchmarks

  • Measure storage growth over time
  • Test query performance at different scales (see the benchmark after this list)
  • Validate memory management effectiveness
  • Monitor compression ratios
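
A minimal benchmark sketch for the scaling item, timing a worst-case linear scan over the FullHistoryMemory class from the Architecture section (the history sizes are illustrative):

import time

def benchmark_scan(sizes=(1_000, 10_000, 100_000)):
    """Time an unindexed scan at increasing history sizes to expose O(n) growth."""
    for n in sizes:
        memory = FullHistoryMemory()
        for i in range(n):
            memory.store_interaction(f"input {i}", f"output {i}", {})
        start = time.perf_counter()
        hits = [x for x in memory.interactions if "999" in x['user_input']]
        elapsed = time.perf_counter() - start
        print(f"{n:>7} interactions: found {len(hits)} hits in {elapsed:.4f}s")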

Migration and Scaling

Data Migration

def migrate_to_full_history(existing_memory):
    """Migrate from other memory patterns to full history."""
    full_history = FullHistoryMemory()

    # Convert existing interactions
    for interaction in existing_memory.get_all():
        full_history.store_interaction(
            interaction.input,
            interaction.output,
            interaction.context
        )
    return full_history

Horizontal Scaling

  • Shard by user ID or time ranges (sketched below)
  • Use distributed databases for storage
  • Implement read replicas for query distribution
  • Consider event sourcing patterns
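
A sketch of the sharding item, routing each record to a shard by hashing the user ID (the shard count and choice of MD5 are illustrative; any stable hash works):

import hashlib

NUM_SHARDS = 16  # hypothetical cluster size

def shard_for(user_id: str) -> int:
    """Map a user ID to a stable shard index, consistent across processes.

    Python's built-in hash() is salted per process, so a hashlib digest
    is used to keep routing deterministic.
    """
    digest = hashlib.md5(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % NUM_SHARDS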

Next Steps

  1. Assess your storage and performance requirements
  2. Choose appropriate database technology
  3. Implement data lifecycle policies
  4. Set up monitoring and alerting
  5. Consider hybrid approaches for optimization
  6. Plan for scaling and migration strategies

The Full History pattern provides the most comprehensive approach to agent memory but requires careful engineering to handle scale and performance effectively.