Document-Reading Agents
and read-write memory
Document-Writing Agents
Giving LLMs memory lets them remember who they are talking to.
Read-write memory also allows LLMs to "take notes", so they can read a document or book one chapter at a time while remembering what they read earlier.
I walk through how to:
1. Build a document database.
2. Let LLMs read chapters from it one by one.
3. Let LLMs use Ctrl+F!
4. Let LLMs store notes on their reading in local memory (i.e. in context).
Cheers, Ronan
Building Document-Reading Agents with Read-Write Memory
A technical overview of implementing document reading capabilities for LLMs by combining read-write memory with basic search and retrieval functions.
Core Memory Components
The system uses three distinct memory types:
Local context memory (recent conversation turns)
Read-only disk memory (conversation history database)
Read-write memory (new addition for persistent notes)
The read-write memory allows the LLM to:
Store user preferences and details
Take notes while reading documents
Maintain information between conversations
Manage its notes first-in-first-out (FIFO), dropping the oldest entries when space runs out
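The FIFO behaviour described above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation from the walkthrough: the `ReadWriteMemory` class and the `count_tokens` helper are hypothetical names, and the word-count "tokenizer" is a crude stand-in for a real one.

```python
from collections import deque


def count_tokens(text: str) -> int:
    # Assumption: one token per whitespace-separated word.
    # A real system would use the model's own tokenizer.
    return len(text.split())


class ReadWriteMemory:
    """Note store that evicts the oldest notes once a token budget is exceeded."""

    def __init__(self, max_tokens: int = 4000):
        self.max_tokens = max_tokens
        self.notes = deque()

    def total_tokens(self) -> int:
        return sum(count_tokens(note) for note in self.notes)

    def write(self, note: str) -> None:
        self.notes.append(note)
        # FIFO eviction: drop notes from the front until the budget fits.
        # The newest note is always kept, even if it alone exceeds the budget.
        while self.total_tokens() > self.max_tokens and len(self.notes) > 1:
            self.notes.popleft()

    def render(self) -> str:
        # Text that would be placed into the prompt.
        return "\n".join(self.notes)


# Tiny budget so that eviction is visible.
memory = ReadWriteMemory(max_tokens=6)
memory.write("user name is Ada")          # 4 tokens
memory.write("chapter one covers setup")  # 4 more tokens, so the first note is evicted
```

A deque keeps eviction cheap, and rendering the notes as plain text makes it easy to drop them into the prompt on every turn.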
Document Database Implementation
Key features:
Documents stored as markdown files
Simple keyword search (Ctrl+F style)
Line-by-line reading capability
JSON-based document metadata storage
Document conversion using marker-pdf library
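To make the search and read features concrete, here is a rough sketch of a markdown-backed document database with Ctrl+F style search and line-range reads. The `DocumentDatabase` class and its method names are assumptions made for illustration; PDF conversion and the JSON metadata are left out.

```python
import tempfile
from pathlib import Path


class DocumentDatabase:
    """Hypothetical wrapper around a folder of markdown files."""

    def __init__(self, root):
        self.root = Path(root)

    def search(self, query, max_hits=10):
        """Case-insensitive substring search; returns (filename, line_number, line)."""
        hits = []
        needle = query.lower()
        for path in sorted(self.root.glob("*.md")):
            lines = path.read_text(encoding="utf-8").splitlines()
            for number, line in enumerate(lines, start=1):
                if needle in line.lower():
                    hits.append((path.name, number, line.strip()))
                    if len(hits) >= max_hits:
                        return hits
        return hits

    def read_lines(self, filename, start_line, end_line):
        """Return lines start_line..end_line (1-indexed, inclusive) as one string."""
        lines = (self.root / filename).read_text(encoding="utf-8").splitlines()
        return "\n".join(lines[max(start_line, 1) - 1:end_line])


# Usage with a throwaway document.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "book.md").write_text(
        "# Chapter 1\nAgents read text.\n# Chapter 2\nAgents take notes.\n",
        encoding="utf-8",
    )
    db = DocumentDatabase(tmp)
    hits = db.search("chapter")
    excerpt = db.read_lines("book.md", 3, 4)
```

Returning line numbers with each hit lets the model follow a search with a targeted read of the surrounding lines.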
Technical Details
Memory allocation:
Read-write memory: 4,000 tokens
Read-only memory: 16,000 tokens
Total context: 32,000 tokens
User messages: 250 tokens
Assistant responses: 500 tokens
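As a back-of-the-envelope check, the figures above can be combined to estimate how many recent turns fit in local context. This assumes the two memory stores and the recent turns share the 32,000-token window additively, that 250 and 500 are per-message caps, and that the system prompt is ignored; none of that is stated explicitly, so treat the result as illustrative.

```python
TOTAL_CONTEXT = 32_000
READ_WRITE_MEMORY = 4_000
READ_ONLY_MEMORY = 16_000
USER_MESSAGE = 250
ASSISTANT_RESPONSE = 500

# Tokens left for recent conversation turns once both stores are reserved.
local_budget = TOTAL_CONTEXT - READ_WRITE_MEMORY - READ_ONLY_MEMORY

# One full turn is a user message plus an assistant response.
tokens_per_turn = USER_MESSAGE + ASSISTANT_RESPONSE

# Whole turns that fit before the oldest one has to be dropped.
max_recent_turns = local_budget // tokens_per_turn

print(local_budget, tokens_per_turn, max_recent_turns)
```

Under these assumptions, 12,000 tokens remain for local context, enough for 16 full turns of 750 tokens each.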
Command structure:
XML-style tags for operations
Search syntax: query
Read syntax: filename:start_line:end_line
Write syntax: content
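A command processor for this structure might look like the sketch below. The tag names `search`, `read` and `write` are placeholders chosen for illustration, since the exact tags are not given here; only the `filename:start_line:end_line` format comes from the text above.

```python
import re

# Matches <search>...</search>, <read>...</read> and <write>...</write>.
# The tag names are hypothetical.
COMMAND_PATTERN = re.compile(r"<(search|read|write)>(.*?)</\1>", re.DOTALL)


def parse_commands(model_output):
    """Extract commands from model output, in the order they appear."""
    commands = []
    for tag, body in COMMAND_PATTERN.findall(model_output):
        body = body.strip()
        if tag == "read":
            # Split from the right so filenames containing ':' still parse.
            filename, start, end = body.rsplit(":", 2)
            commands.append(("read", filename, int(start), int(end)))
        else:
            commands.append((tag, body))
    return commands


parsed = parse_commands(
    "Let me look. <search>memory</search> "
    "<read>book.md:10:40</read><write>Chapter 1 is about setup</write>"
)
```

A malformed read command raises `ValueError` in this sketch; a real processor would more likely catch it and return an error message to the model.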
Practical Limitations
Current constraints:
Basic keyword search may require multiple iterations
Models struggle with comprehensive document review
Context length limits document chunk size
Memory management needed for long conversations
Human oversight recommended for thorough analysis
System Architecture
Core components:
Memory manager for FIFO operations
Document database interface
Tag-based command processor
Context window management
Stop token handling for commands
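One way these components could fit together is a loop that stops generation at a closing tag, runs the command, and feeds the result back to the model. The sketch below is an assumption about the control flow, not the actual implementation: `generate` is a hypothetical callable standing in for the model, and the tag names and the `<result>` wrapper are invented for illustration.

```python
def run_turn(generate, handlers, prompt, max_steps=5):
    """Alternate between model generation and command execution.

    generate(prompt, stops) is a hypothetical callable returning
    (text, stop_hit): the text produced before a stop string, and the stop
    string that ended generation (None if the model finished normally).
    """
    stops = ["</search>", "</read>", "</write>"]
    transcript = prompt
    for _ in range(max_steps):
        text, stop_hit = generate(transcript, stops)
        transcript += text
        if stop_hit is None:
            break  # No command issued; the answer is complete.
        transcript += stop_hit
        tag = stop_hit[2:-1]  # "</search>" -> "search"
        body = text.rsplit(f"<{tag}>", 1)[-1].strip()
        result = handlers[tag](body)
        transcript += f"\n<result>{result}</result>\n"
    return transcript


# Scripted stand-in for a model: one search command, then a final answer.
script = iter([
    ("I will search. <search>memory", "</search>"),
    ("The notes mention memory on line 3.", None),
])


def fake_generate(prompt, stops):
    return next(script)


handlers = {
    "search": lambda query: f"1 hit for '{query}'",
    "read": lambda spec: f"(lines for {spec})",
    "write": lambda note: "saved",
}

out = run_turn(fake_generate, handlers, "User: find memory\nAssistant: ")
```

Stopping at the closing tag prevents the model from inventing a tool result itself, and `max_steps` bounds how many commands a single turn can issue.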
The system demonstrates how basic tools like keyword search and line-by-line reading can enable document comprehension tasks, though with some limitations requiring human oversight.

