Document-Reading Agents and read-write memory

Mar 06, 2025

Document-Writing Agents

Giving LLMs memory lets them remember who they are talking to.

Read-write memory also allows LLMs to "take notes" and read a document or book one chapter at a time, while remembering what they read earlier.

I walk through how to:

1. Build a document database.

2. Let LLMs read chapters from it one by one.

3. Let LLMs use Ctrl+F!

4. Let LLMs store notes on their reading in local memory (i.e. in context).

Cheers, Ronan

Building Document-Reading Agents with Read-Write Memory

A technical overview of implementing document reading capabilities for LLMs by combining read-write memory with basic search and retrieval functions.

Core Memory Components

The system uses three distinct memory types:

Local context memory (recent conversation turns)
Read-only disk memory (conversation history database)
Read-write memory (new addition for persistent notes)

The read-write memory allows the LLM to:

Store user preferences and details
Take notes while reading documents
Maintain information between conversations
Stay within its token budget by dropping the oldest notes first (first-in-first-out, FIFO)
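
As a rough illustration of the FIFO behaviour, here is a minimal sketch of a read-write memory that evicts its oldest notes once a token budget is exceeded. The class name, method names and the whitespace-based token count are all assumptions made for this example; a real setup would count tokens with the model's tokenizer.

```python
from collections import deque


class ReadWriteMemory:
    """Hypothetical note store: keeps notes under a token budget, oldest out first."""

    def __init__(self, max_tokens=4000):
        self.max_tokens = max_tokens
        self.notes = deque()

    @staticmethod
    def count_tokens(text):
        # Assumption: whitespace-separated words stand in for real tokenizer counts.
        return len(text.split())

    def total_tokens(self):
        return sum(self.count_tokens(note) for note in self.notes)

    def write(self, note):
        """Append a note, then evict from the front until back under budget."""
        self.notes.append(note)
        # Always keep the newest note, even if it alone exceeds the budget.
        while self.total_tokens() > self.max_tokens and len(self.notes) > 1:
            self.notes.popleft()

    def render(self):
        """Text to place into the prompt as the read-write memory block."""
        return "\n".join(self.notes)
```

A deque keeps eviction from the front cheap. Anything the model wants to keep for longer has to be re-written before it falls off the front of the queue.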

Document Database Implementation

Key features:

Documents stored as markdown files
Simple keyword search (Ctrl+F style)
Line-by-line reading capability
JSON-based document metadata storage
Document conversion using marker-pdf library
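
To make the search and reading features concrete, here is a minimal sketch of a document database over a folder of markdown files. The class and method names are hypothetical, and it assumes the PDFs have already been converted to markdown; the JSON metadata is left out.

```python
from pathlib import Path


class DocumentDB:
    """Hypothetical interface over a folder of markdown files."""

    def __init__(self, root):
        self.root = Path(root)

    def search(self, query, max_hits=10):
        """Case-insensitive keyword search, returning 'filename:line_no: text' hits."""
        hits = []
        needle = query.lower()
        for path in sorted(self.root.glob("*.md")):
            lines = path.read_text(encoding="utf-8").splitlines()
            for line_no, line in enumerate(lines, start=1):
                if needle in line.lower():
                    hits.append(f"{path.name}:{line_no}: {line.strip()}")
                    if len(hits) >= max_hits:
                        return hits
        return hits

    def read(self, filename, start_line, end_line):
        """Return lines start_line..end_line (1-indexed, inclusive) as one string."""
        lines = (self.root / filename).read_text(encoding="utf-8").splitlines()
        return "\n".join(lines[start_line - 1:end_line])
```

Returning line numbers with each hit matters: it lets the model follow up a search with a targeted read of the surrounding lines instead of pulling the whole file into context.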

Technical Details

Memory allocation:

Read-write memory: 4,000 tokens
Read-only memory: 16,000 tokens
Total context: 32,000 tokens
User messages: 250 tokens
Assistant responses: 500 tokens
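
A quick bit of arithmetic shows what these numbers leave for the recent conversation. The sketch below assumes the 250 and 500 token figures are per-message caps and that everything outside the two memory blocks goes to recent turns; it ignores the system prompt and tool outputs, so treat the result as an upper bound.

```python
TOTAL_CONTEXT = 32_000   # total context window
READ_WRITE = 4_000       # read-write memory (notes)
READ_ONLY = 16_000       # read-only memory (retrieved history)
USER_MSG_CAP = 250       # assumption: cap per user message
ASSISTANT_MSG_CAP = 500  # assumption: cap per assistant response


def local_context_budget():
    """Tokens left once both memory blocks are reserved."""
    return TOTAL_CONTEXT - READ_WRITE - READ_ONLY


def max_recent_turns():
    """Whole user + assistant turns that fit in the leftover budget."""
    return local_context_budget() // (USER_MSG_CAP + ASSISTANT_MSG_CAP)


print(local_context_budget())  # 12000
print(max_recent_turns())      # 16
```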

Command structure:

XML-style tags for operations
Search syntax: query, wrapped in a search tag
Read syntax: filename:start_line:end_line, wrapped in a read tag
Write syntax: content, wrapped in a write tag
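
Here is a minimal sketch of how commands like these could be pulled out of a model response. The exact tag names are not spelled out above, so `search`, `read` and `write` are assumptions for illustration, as are the function names.

```python
import re
from dataclasses import dataclass

# Assumption: illustrative tag names; the real ones may differ.
COMMAND_RE = re.compile(r"<(search|read|write)>(.*?)</\1>", re.DOTALL)


@dataclass
class Command:
    name: str
    argument: str


def parse_commands(text):
    """Return every tagged command found in the model output, in order."""
    return [
        Command(match.group(1), match.group(2).strip())
        for match in COMMAND_RE.finditer(text)
    ]


def parse_read_argument(argument):
    """Split 'filename:start_line:end_line' into (filename, start, end)."""
    # rsplit from the right so a colon inside the filename does not break parsing.
    filename, start, end = argument.rsplit(":", 2)
    return filename, int(start), int(end)
```

For example, `parse_commands("Let me check. <search>whale</search>")` yields one `search` command with the argument `whale`, and `parse_read_argument("book.md:10:40")` gives the filename plus the two line numbers as integers.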

Practical Limitations

Current constraints:

Basic keyword search may require multiple iterations
Models struggle with comprehensive document review
Context length limits document chunk size
Memory management needed for long conversations
Human oversight recommended for thorough analysis

System Architecture

Core components:

Memory manager for FIFO operations
Document database interface
Tag-based command processor
Context window management
Stop token handling for commands
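
One detail of stop token handling is worth a sketch. If closing tags are used as stop sequences, many inference APIs return the text without the stop sequence itself, so the closing tag has to be put back before the command is processed. The tag names and the function below are assumptions for illustration, not the exact implementation.

```python
# Assumption: hypothetical tag names, with the closing tags used as stop sequences.
STOP_SEQUENCES = ["</search>", "</read>", "</write>"]


def restore_closing_tag(output):
    """Re-append a closing tag that was trimmed when generation hit a stop sequence."""
    for stop in STOP_SEQUENCES:
        opening = stop.replace("</", "<")
        # More opening tags than closing tags means the last command was cut short.
        if output.count(opening) > output.count(stop):
            return output + stop
    return output
```

Stopping at the closing tag keeps the model from inventing the search or read result itself: generation pauses, the tool runs, and the real result is appended to the context before the model continues.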

The system demonstrates how basic tools like keyword search and line-by-line reading can enable document comprehension tasks, though with some limitations requiring human oversight.

Trelis Research
