Chapter 26: Semantic Memory and Workspace I/O #
"Memory is the treasury and guardian of all things." -- Cicero
An agent that forgets everything after each conversation is little more than a stateless function. It cannot build relationships with users, recall past decisions, learn from experience, or maintain a knowledge base that grows over time. In production, memory is what separates a useful assistant from a toy demo.
Neam addresses this with a three-tier memory architecture that gives agents access to volatile session history, persistent workspace files, and indexed semantic search -- all through native language constructs. You do not need to wire up databases, manage file systems manually, or implement vector search from scratch. You declare what you need, and the runtime handles the rest.
This chapter teaches you how to use all three memory tiers. You will learn the workspace filesystem, native read/write/append functions, semantic memory configuration, search modes, session history access, automatic memory injection, pre-compaction flush, and memory isolation rules. By the end, you will build a complete knowledge-aware personal assistant that stores notes, recalls them semantically, and maintains continuity across sessions.
Specifically, you will learn to:
- Distinguish the three memory tiers and when to use each
- Configure agent-scoped workspaces with path security
- Read, write, and append files using native workspace functions
- Configure semantic memory with backend, search mode, and flush options
- Use `memory_search()` to perform hybrid, vector, or keyword searches
- Access session history with `session_history()`
- Control automatic memory injection into agent prompts
- Implement pre-compaction flush for long-running conversations
- Apply memory isolation rules across different execution contexts
Imagine hiring an executive assistant who has perfect amnesia. Every morning, you walk in and they have no idea who you are, what projects you are working on, or what you discussed yesterday. You would spend half your day re-explaining context. Now imagine that assistant has a notebook (workspace), a searchable filing cabinet (semantic index), and a short-term memory of your current conversation (session history). That is the difference between a stateless agent and a memory-equipped one. In production, agents that remember are agents that deliver value.
26.1 The Three-Tier Memory Architecture #
Neam provides three distinct memory tiers, each optimized for a different access pattern. Understanding when to use each tier is essential for building agents that remember effectively without wasting resources.
How Each Tier Serves Different Recall Needs #
Each memory tier answers a different kind of question:
| Tier | What It Stores | Access Pattern | Lifetime | Speed |
|---|---|---|---|---|
| Session History | Conversation messages (role + content) | Sequential, recent-first | Current session only | Instant (in-memory) |
| Workspace Files | Arbitrary text files | Path-based read/write | Persistent across sessions | Fast (disk I/O) |
| Semantic Index | Chunked, embedded file content | Query-based similarity search | Persistent, re-indexed on change | Moderate (search + rank) |
Session history is your agent's short-term memory. It holds the current conversation and is ideal for questions like "What did the user just say?" or "What was the third message in this session?" It is volatile -- when the session ends, the history is gone unless you explicitly save it.
Workspace files are your agent's persistent notebook. They survive restarts and are ideal for storing structured data, configuration, user preferences, accumulated notes, and any information the agent needs to recall by path. You know the file name and you read it directly.
Semantic search is your agent's searchable filing cabinet. When workspace files grow large or numerous, you cannot read every file on every query. The semantic index chunks your workspace files, embeds them as vectors, and lets you search by meaning. You do not need to know the file name -- you describe what you are looking for, and the index returns the most relevant chunks.
Agent Type Support #
Not all agent types support all three tiers:
| Agent Type | Session History | Workspace | Semantic Index |
|---|---|---|---|
| `claw agent` | Yes | Yes | Yes |
| `forge agent` | No | Yes | No |
| `agent` (stateless) | No | No | No |
Claw agents are the fully memory-equipped agent type. They maintain conversation sessions,
read and write workspace files, and run semantic searches against their indexed content.
Forge agents have workspace access for reading and writing build artifacts, plans, and
iteration logs, but they do not maintain session history (each iteration starts fresh) and
do not use semantic search. Stateless agents have no memory at all -- each .ask() call
is independent.
Think of a knowledge worker at their desk. Session history is their short-term memory of the conversation happening right now -- they remember what was said in the last few minutes, but it fades quickly. Workspace files are the notebook on their desk -- they can write things down, flip back to a specific page, and the notebook persists overnight. Semantic search is the searchable filing cabinet behind them -- it holds thousands of documents they could never keep in their head or on their desk, but they can search it by topic and pull out the right folder in seconds.
26.2 Workspace Concept #
A workspace is an agent-scoped filesystem root. Every claw agent and forge agent that
declares a workspace field gets its own isolated directory on disk where it can read,
write, and append files. The workspace is the foundation of persistent memory in Neam.
Declaring a Workspace #
The workspace field specifies the path to the agent's workspace directory:
claw agent Librarian {
provider: "openai"
model: "gpt-4o"
system: "You are a librarian who remembers everything users tell you."
workspace: "./data/librarian"
channels: [
{ type: "cli" }
]
}
When this agent starts, the runtime creates the directory ./data/librarian if it does
not already exist. All workspace operations are rooted at this path.
For forge agents, the workspace serves a similar role:
forge agent Builder {
provider: "openai"
model: "gpt-4o"
system: "You are a code builder."
workspace: "./data/builder"
plan: "Build a REST API"
verify: "cargo test"
}
Path Security #
Workspace paths are strictly sandboxed. All paths passed to workspace functions are resolved relative to the workspace root, and any attempt to escape the sandbox is rejected at runtime.
The following rules are enforced:
- All paths are relative. Absolute paths (starting with `/`) are rejected.
- Parent traversal is rejected. Any path containing `..` is rejected.
- Null bytes are rejected. Paths containing `\0` are rejected.
- Symbolic links are not followed outside the workspace root.
// These work:
workspace_write("notes/meeting.md", content); // OK
workspace_read("data/users.json"); // OK
workspace_append("logs/activity.log", entry); // OK
// These are rejected at runtime:
workspace_read("../../../etc/passwd"); // ERROR: path traversal
workspace_read("/etc/passwd"); // ERROR: absolute path
workspace_write("notes/../../../secrets", data); // ERROR: path traversal
When a path violation is detected, the runtime returns nil for read operations and
false for write/append operations, and logs a security warning.
Do not use absolute paths in workspace functions. Even if the absolute path points
inside the workspace directory, it will be rejected. Always use relative paths. If you
need to construct paths dynamically, use string concatenation and ensure the result
never starts with / or contains ...
Workspace Directory Structure #
A typical workspace directory grows organically as the agent operates. Here is a common layout:
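A sketch of such a layout (the specific file names here are illustrative; only the `sessions/` and `index/` directories have the conventional roles described below):

```
./data/librarian/
├── notes/
│   ├── meeting.md
│   └── preferences.md
├── logs/
│   └── activity.log
├── sessions/
│   └── summary_2025-01-15.md
└── index/
    └── (managed by the semantic memory subsystem)
```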
The sessions/ directory is commonly used to persist conversation summaries. The index/
directory is managed by the semantic memory subsystem (if enabled). The agent can create
any directory structure it needs -- the runtime creates parent directories automatically
when writing files.
26.3 Native Workspace Functions #
Neam provides three native functions for workspace I/O. These functions are available
inside any agent that has a workspace field declared. They operate relative to the
agent's workspace root and enforce path security automatically.
workspace_read(path) #
Reads the contents of a file from the workspace. Returns the file content as a string
on success, or nil if the file does not exist or the path is invalid.
Signature:
workspace_read(path: string) → string | nil
Example:
claw agent NoteTaker {
provider: "openai"
model: "gpt-4o"
system: "You are a note-taking assistant."
workspace: "./data/notes"
channels: [{ type: "cli" }]
}
impl NoteTaker {
skill read_note(name: string) -> string {
description: "Read a saved note by name"
let content = workspace_read("notes/" + name + ".md");
if (content == nil) {
return "No note found with name: " + name;
}
return content;
}
}
workspace_write(path, content) #
Writes content to a file in the workspace. If the file already exists, it is
truncated (overwritten). If the file does not exist, it is created. If the parent
directories do not exist, they are created automatically. Returns true on success,
false on failure.
Signature:
workspace_write(path: string, content: string) → bool
Example:
impl NoteTaker {
skill save_note(name: string, content: string) -> string {
description: "Save a note with the given name and content"
let path = "notes/" + name + ".md";
let ok = workspace_write(path, content);
if (ok) {
return "Note saved: " + name;
}
return "Failed to save note: " + name;
}
}
When workspace_write is called, the runtime performs these steps:
- Validate the path (no `..`, no absolute paths, no null bytes).
- Resolve the full path: `workspace_root + "/" + path`.
- Create any missing parent directories.
- Open the file in write mode (truncate if exists, create if not).
- Write the content.
- Return `true` on success.
workspace_append(path, content) #
Appends content to the end of a file in the workspace. If the file does not exist, it is
created (equivalent to workspace_write for the first call). Returns true on success,
false on failure.
Signature:
workspace_append(path: string, content: string) → bool
Example:
impl NoteTaker {
skill log_interaction(summary: string) -> string {
description: "Append an interaction summary to the activity log"
let entry = "[" + timestamp() + "] " + summary + "\n";
let ok = workspace_append("logs/activity.log", entry);
if (ok) {
return "Logged.";
}
return "Failed to log interaction.";
}
}
workspace_append is ideal for log files, accumulating data over time, and any scenario
where you want to add content without overwriting existing data.
Error Handling Patterns #
All three workspace functions handle errors gracefully. They do not throw exceptions --
instead, they return sentinel values (nil for reads, false for writes/appends). This
design encourages defensive programming:
// Pattern 1: Check-and-proceed
let data = workspace_read("config.json");
if (data == nil) {
// File does not exist or path is invalid
// Use defaults or create the file
workspace_write("config.json", default_config);
data = default_config;
}
// Pattern 2: Write-with-verification
let success = workspace_write("output/report.md", report);
if (!success) {
emit "WARNING: Failed to write report to workspace.";
// Fall back to emitting the report directly
emit report;
}
// Pattern 3: Append-or-create log
fn log_event(event: string) -> bool {
return workspace_append("events.log", event + "\n");
}
Use workspace_write for data that should be replaced entirely on each update (user
preferences, current state, generated reports). Use workspace_append for data that
accumulates over time (logs, conversation summaries, event streams). A common mistake
is using workspace_write for log files, which causes you to lose all previous entries.
Security Summary #
| Operation | Path Traversal | Absolute Path | Null Bytes | Missing Parents |
|---|---|---|---|---|
| `workspace_read` | Rejected → `nil` | Rejected → `nil` | Rejected → `nil` | Returns `nil` |
| `workspace_write` | Rejected → `false` | Rejected → `false` | Rejected → `false` | Auto-created |
| `workspace_append` | Rejected → `false` | Rejected → `false` | Rejected → `false` | Auto-created |
26.4 Semantic Memory Configuration #
While workspace files give agents persistent storage, semantic memory gives agents the ability to search that storage by meaning. Instead of reading files by path, the agent can ask a question and get back the most relevant chunks from across all its workspace files.
Semantic memory is configured with the semantic_memory block on a claw agent
declaration:
claw agent ResearchAssistant {
provider: "openai"
model: "gpt-4o"
system: "You are a research assistant with long-term memory."
workspace: "./data/research"
semantic_memory: {
backend: "sqlite"
search: "hybrid"
flush_on_compact: true
}
channels: [{ type: "cli" }]
}
Configuration Fields #
backend #
The storage backend for the semantic index. This determines where chunked, embedded content is stored.
| Value | Description | Best For |
|---|---|---|
| `"sqlite"` | SQLite database file in workspace | Most use cases, single-agent |
| `"local"` | In-memory with periodic flush to disk | High-throughput, ephemeral workloads |
| `"none"` | Semantic memory disabled | Agents that only need workspace files |
The "sqlite" backend stores the BM25 index and vector embeddings in a SQLite database
inside the workspace's index/ directory. It is persistent, crash-safe, and efficient
for workspaces with up to hundreds of thousands of chunks.
The "local" backend keeps everything in memory for maximum speed. It flushes the index
to disk periodically and on shutdown. If the process crashes, recently indexed content
may be lost.
search #
The search mode determines how queries are matched against indexed content.
| Value | Description | Ranking Method |
|---|---|---|
| `"hybrid"` | Combined vector and keyword search | 70% cosine + 30% BM25 |
| `"vector"` | Semantic similarity only | Cosine similarity |
| `"keyword"` | Lexical matching only | BM25 scoring |
| `"none"` | Indexing enabled, search disabled | N/A |
The "hybrid" mode is the default and recommended setting. It combines the strengths of
both vector search (understanding meaning and synonyms) and keyword search (exact term
matching and rare word retrieval).
flush_on_compact #
When set to true, the agent writes important facts from the current session to the
workspace before compaction occurs. This ensures that critical information survives the
summarization process that compaction performs. See Section 26.9 for details.
| Value | Behavior |
|---|---|
| `true` | Flush facts to workspace before compaction |
| `false` | No pre-compaction flush (default) |
Defaults Table #
If you omit the semantic_memory block entirely, no semantic memory is configured. If
you include the block, the following defaults apply for any omitted field:
| Field | Default Value |
|---|---|
| `backend` | `"sqlite"` |
| `search` | `"hybrid"` |
| `flush_on_compact` | `false` |
So the minimal semantic memory configuration is:
semantic_memory: {}
This gives you a SQLite-backed, hybrid-search semantic memory with no pre-compaction flush -- which is a reasonable starting point for most applications.
Semantic memory requires a workspace to be declared. If you specify a
semantic_memory block without a workspace field, the compiler will produce an
error. The semantic index is stored inside the workspace directory and indexes the
workspace files.
26.5 How Semantic Search Works #
Understanding the internals of semantic search helps you tune it for your use case. This section explains the indexing pipeline, chunking parameters, and the three search modes in detail.
MemoryIndex Architecture #
The MemoryIndex is the runtime component that manages the semantic index. It sits
between the workspace filesystem and the search functions:
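Conceptually, its position looks like this (an illustrative sketch, not runtime output):

```
workspace_write / workspace_append
        │ file content
        ▼
  ┌─────────────┐  chunks + embeddings  ┌───────────────────────┐
  │ MemoryIndex │ ────────────────────▶ │ BM25 + vector indexes │
  └─────────────┘                       └───────────────────────┘
        ▲
        │ queries
memory_search() / automatic memory injection
```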
Chunking #
When a file is added to the workspace (via workspace_write or workspace_append), the
MemoryIndex automatically chunks the file's content for indexing. The chunking
parameters are:
| Parameter | Value | Description |
|---|---|---|
| Chunk size | 400 characters | Maximum length of each chunk |
| Chunk overlap | 80 characters | Characters shared between adjacent chunks |
The overlap ensures that concepts spanning a chunk boundary are captured in at least one chunk. For example, if a paragraph about "quarterly revenue targets" straddles the boundary between chunk 3 and chunk 4, the 80-character overlap ensures that the full context appears in at least one of those chunks.
Example of chunking a 1000-character document:
Characters: 0 ─────────────────────────────────────────── 1000
Chunk 1:    ├──── 0–399 ────┤
Chunk 2:           ├──── 320–719 ────┤
Chunk 3:                       ├──── 640–999 ────┤
Chunk 1: characters 0-399 (400 chars)
Chunk 2: characters 320-719 (400 chars, overlaps 80 with chunk 1)
Chunk 3: characters 640-999 (360 chars, overlaps 80 with chunk 2)
Each chunk becomes a MemoryChunk record with the following fields:
- `file_path` -- the workspace-relative path of the source file
- `chunk_text` -- the chunk content
- `chunk_index` -- the ordinal position of this chunk within the file
- `embedding` -- the vector embedding of the chunk text
Three Search Modes #
Hybrid Search (Default) #
Hybrid search combines vector similarity and keyword relevance. The final score for each chunk is computed as:
score = 0.7 * cosine_similarity(query_embedding, chunk_embedding)
+ 0.3 * bm25_score(query_terms, chunk_terms)
This weighting favors semantic understanding (70%) while still rewarding exact keyword matches (30%). Hybrid search excels when users mix natural-language questions with specific terms:
- "What did we discuss about the Q3 revenue targets?" -- The vector component matches chunks about financial goals and quarterly planning, while the BM25 component boosts chunks containing the exact terms "Q3" and "revenue."
Vector Search #
Vector search uses cosine similarity exclusively:
score = cosine_similarity(query_embedding, chunk_embedding)
Vector search understands synonyms, paraphrases, and conceptual similarity. It finds chunks about "quarterly financial goals" even if the query says "Q3 targets." However, it can miss chunks that contain rare or specific terms (product codes, acronyms, proper nouns) that are not well-represented in the embedding space.
Best for: Broad, conceptual queries where exact wording does not matter.
Keyword Search #
Keyword search uses the BM25 algorithm exclusively:
score = bm25(query_terms, chunk_terms)
BM25 (Best Matching 25) is a probabilistic information retrieval function that ranks documents based on term frequency, inverse document frequency, and document length normalization. It excels at finding exact matches and rare terms.
Best for: Queries with specific identifiers, product codes, error messages, or proper nouns that must match exactly.
Indexing Flow #
The complete indexing flow when a file is written to the workspace:
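Assembled from the chunking and search machinery described above, the flow can be sketched as follows (a reconstruction from this section, not an official runtime trace):

```
1. workspace_write("notes/x.md", content) validates the path and writes the file.
2. MemoryIndex splits the content into 400-character chunks with 80-character overlap.
3. Each chunk is embedded as a vector.
4. Chunk terms are added to the BM25 index.
5. Chunk embeddings are added to the vector index.
6. The chunks are now reachable via memory_search() and automatic injection.
```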
When a file is overwritten, the old chunks are removed from both indexes before the new chunks are inserted. When a file is appended to, the entire file is re-chunked and re-indexed (since chunk boundaries shift when content is added).
If you are appending many small entries to a single file (like a log), consider
writing each entry to a separate file instead. This avoids re-indexing the entire file
on every append. For example, write to logs/2025-01-15-001.txt,
logs/2025-01-15-002.txt, and so on. Each file is indexed independently, so only the
new file is processed.
26.6 memory_search() Function #
The memory_search() function is the primary interface for querying the semantic index.
It takes a natural-language query and returns the most relevant chunks from the agent's
indexed workspace files.
Signature #
memory_search(query: string, top_k?: int) → list of {file_path, chunk, score}
| Parameter | Type | Default | Description |
|---|---|---|---|
| `query` | `string` | (required) | The search query in natural language |
| `top_k` | `int` | `5` | Maximum number of results to return |
Return Value #
The function returns a list of maps, each containing:
| Field | Type | Description |
|---|---|---|
| `file_path` | `string` | Workspace-relative path of the source file |
| `chunk` | `string` | The matched chunk text |
| `score` | `float` | Relevance score (0.0 to 1.0, higher is better) |
Results are sorted by score in descending order (most relevant first).
Code Examples #
Basic search:
impl ResearchAssistant {
skill recall(query: string) -> string {
description: "Search memory for information related to a query"
let results = memory_search(query);
if (len(results) == 0) {
return "I do not have any information about that topic.";
}
let response = "Here is what I found:\n\n";
for (result in results) {
response = response + "**" + result["file_path"] + "** ";
response = response + "(score: " + str(result["score"]) + ")\n";
response = response + result["chunk"] + "\n\n";
}
return response;
}
}
Search with custom top_k:
// Get the single most relevant chunk
let best = memory_search("deployment procedure", 1);
if (len(best) > 0) {
let answer = best[0]["chunk"];
emit "Most relevant: " + answer;
}
// Get a broad set of results for comprehensive answers
let broad = memory_search("user feedback on onboarding", 10);
emit "Found " + str(len(broad)) + " relevant chunks.";
Combining search with workspace read for full context:
impl ResearchAssistant {
skill deep_recall(query: string) -> string {
description: "Search memory and return the full source file"
let results = memory_search(query, 3);
if (len(results) == 0) {
return "No relevant information found.";
}
// Read the full file that contains the best match
let best_file = results[0]["file_path"];
let full_content = workspace_read(best_file);
if (full_content == nil) {
return "Found match in " + best_file + " but could not read file.";
}
return "Source: " + best_file + "\n\n" + full_content;
}
}
Interpreting Scores #
The score returned by memory_search() ranges from 0.0 to 1.0, but the interpretation
depends on the search mode:
| Score Range | Hybrid | Vector | Keyword |
|---|---|---|---|
| 0.9 -- 1.0 | Near-exact match on both meaning and terms | Very high semantic similarity | Near-exact term match |
| 0.7 -- 0.9 | Strong match, related content | Related content, correct topic | Many shared terms |
| 0.5 -- 0.7 | Moderate relevance, partially related | Same general domain | Some shared terms |
| 0.3 -- 0.5 | Weak match, tangentially related | Loosely related | Few shared terms |
| 0.0 -- 0.3 | Likely not relevant | Different topic | Almost no shared terms |
When building skills that surface memory search results to users, consider filtering results below a threshold. A score of 0.5 is a reasonable cutoff for hybrid search -- results below this are often noise rather than signal. For vector-only search, a cutoff of 0.6 is more appropriate because vector scores tend to be lower for truly unrelated content.
let results = memory_search(query, 10);
let filtered = [];
for (result in results) {
if (result["score"] >= 0.5) {
filtered = filtered + [result];
}
}
Choosing the Right Search Mode #
The search mode is set globally on the semantic_memory block, but understanding when
each mode excels helps you choose correctly:
| Scenario | Recommended Mode | Why |
|---|---|---|
| General-purpose assistant | `"hybrid"` | Balances meaning and exact terms |
| Creative writing assistant | `"vector"` | Prioritizes thematic similarity |
| Code assistant searching by function name | `"keyword"` | Exact identifiers matter |
| Log analysis with error codes | `"keyword"` | Specific codes must match exactly |
| Research assistant with varied queries | `"hybrid"` | Handles both conceptual and specific queries |
26.7 session_history() Function #
While memory_search() queries persistent workspace content, session_history() queries
the volatile conversation history of the current (or a named) session. This is useful for
summarization, context extraction, and building agents that reference earlier parts of the
conversation.
Signature #
session_history(session_key?: string, limit?: int) → list of {role, content}
| Parameter | Type | Default | Description |
|---|---|---|---|
| `session_key` | `string` | `"default"` | The session identifier to query |
| `limit` | `int` | All messages | Maximum number of messages to return (most recent first) |
Return Value #
The function returns a list of maps, each containing:
| Field | Type | Description |
|---|---|---|
| `role` | `string` | `"user"`, `"assistant"`, or `"system"` |
| `content` | `string` | The message content |
Messages are returned in chronological order (oldest first), but the limit parameter
selects the N most recent messages.
Code Examples #
Get the entire conversation history:
impl Librarian {
skill show_history() -> string {
description: "Show the conversation history for this session"
let history = session_history();
if (len(history) == 0) {
return "No conversation history yet.";
}
let output = "Conversation so far:\n\n";
for (msg in history) {
output = output + "**" + msg["role"] + "**: " + msg["content"] + "\n\n";
}
return output;
}
}
Get the last 5 messages:
let recent = session_history("default", 5);
for (msg in recent) {
emit msg["role"] + ": " + msg["content"];
}
Access a named session:
Claw agents can maintain multiple named sessions (for example, one per user or one per channel). You can query any session by key:
// Get history from a specific user's session
let user_session = session_history("user_42", 10);
// Get history from a group channel session
let group_session = session_history("group_engineering", 20);
Use Cases #
Summarization: Build a skill that summarizes the conversation so far, useful for handoff to another agent or for saving a session summary to the workspace.
impl Librarian {
skill summarize_session() -> string {
description: "Create a summary of the current conversation"
let history = session_history();
if (len(history) < 3) {
return "Not enough conversation to summarize.";
}
// Build a summary prompt from the history
let transcript = "";
for (msg in history) {
transcript = transcript + msg["role"] + ": " + msg["content"] + "\n";
}
// Ask the agent itself to summarize
let summary = self.ask("Summarize this conversation in 3 bullet points:\n\n" + transcript);
// Save the summary to workspace for persistence
let filename = "sessions/summary_" + timestamp() + ".md";
workspace_write(filename, summary);
return summary;
}
}
Context extraction: Pull specific facts from the conversation to store in the workspace for long-term recall.
impl Librarian {
skill extract_facts() -> string {
description: "Extract key facts from the conversation and save them"
let history = session_history("default", 20);
let transcript = "";
for (msg in history) {
if (msg["role"] == "user") {
transcript = transcript + msg["content"] + "\n";
}
}
let facts = self.ask(
"Extract all factual statements from this text. " +
"Return each fact on its own line:\n\n" + transcript
);
workspace_append("facts/extracted.md", "\n## Session " + timestamp() + "\n" + facts);
return "Extracted and saved facts:\n" + facts;
}
}
session_history() is only available in claw agent declarations. Calling it from a
forge agent or a stateless agent will return an empty list, because those agent
types do not maintain session state.
26.8 Automatic Memory Injection #
One of the most powerful features of Neam's semantic memory system is automatic memory
injection. When a claw agent has semantic memory enabled (with search set to anything
other than "none"), the runtime automatically injects relevant memory context into the
system prompt before each .ask() call.
How It Works #
When a user sends a message to a claw agent with semantic memory enabled, the runtime performs these steps before the LLM call:
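In outline, the sequence looks like this (a sketch assembled from the behavior described in this section):

```
1. Receive the user's message.
2. Use the message text as a query against the semantic index.
3. Retrieve the top-ranked chunks with their scores and source paths.
4. Format them into a [MEMORY CONTEXT] block.
5. Prepend the block to the system prompt.
6. Send the augmented prompt to the LLM.
```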
What Gets Injected #
The runtime retrieves the top 3 most relevant chunks from the semantic index using the user's message as the query. These are formatted into a structured block prepended to the system prompt:
[MEMORY CONTEXT]
The following information was retrieved from your memory. Use it to inform
your response if relevant, but do not reference it unless the user asks
about these topics.
Source: notes/project-alpha.md (score: 0.87)
> Project Alpha is scheduled for Q2 launch. The team consists of 4 engineers
> and 1 designer. Primary risk: API integration with legacy systems.
Source: meetings/standup-jan-15.md (score: 0.74)
> Discussed blockers: CI pipeline failing on integration tests. John to
> investigate by end of week.
Source: notes/team-roster.md (score: 0.62)
> Engineering team: Alice (lead), Bob, Carol, Dave. Designer: Eve.
[END MEMORY CONTEXT]
When Injection Occurs #
Automatic injection occurs on every .ask() call when all of the following conditions
are met:
- The agent is a `claw agent`.
- The `semantic_memory` block is present.
- The `search` field is not `"none"`.
- The semantic index contains at least one chunk.
If the semantic index is empty (no workspace files have been written yet), injection is skipped silently.
Controlling Injection Behavior #
You can control automatic injection in several ways:
Disable injection entirely by setting search to "none":
semantic_memory: {
backend: "sqlite"
search: "none"
}
This still enables workspace file operations and manual memory_search() calls, but
the runtime will not automatically inject context into prompts.
Use manual search instead when you want full control over what context is injected.
Set search to "none" and build the context yourself in a skill:
claw agent CustomMemory {
provider: "openai"
model: "gpt-4o"
system: "You are an assistant with manual memory control."
workspace: "./data/custom"
semantic_memory: {
backend: "sqlite"
search: "none"
}
channels: [{ type: "cli" }]
}
impl CustomMemory {
skill answer_with_context(question: string) -> string {
description: "Answer a question using manually retrieved memory context"
// Search with custom top_k and threshold
let results = memory_search(question, 8);
let context = "";
for (result in results) {
if (result["score"] >= 0.6) {
context = context + result["chunk"] + "\n\n";
}
}
if (context == "") {
return self.ask(question);
}
let augmented = "Use this context to help answer:\n\n" + context +
"\n\nQuestion: " + question;
return self.ask(augmented);
}
}
Do not confuse automatic injection with RAG (Chapter 15). Automatic memory injection
searches the agent's own workspace files -- content the agent itself has written or
that was placed in its workspace directory. RAG searches an externally defined
knowledge block that you populate with documents. They serve different purposes:
memory injection is for the agent's personal accumulated knowledge, while RAG is for
external reference material.
26.9 Pre-Compaction Flush #
Long-running claw agent sessions accumulate conversation history that grows without bound. At some point, the runtime compacts the session by summarizing older messages and discarding the originals. This keeps the context window manageable but risks losing important details buried in the summarized portion.
Pre-compaction flush addresses this by writing critical facts to the workspace before
compaction occurs. When flush_on_compact is set to true, the agent gets a chance to
preserve important information in persistent storage before the volatile session history
is trimmed.
How Compaction Works #
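The sequence below is a sketch assembled from the behavior described in this section (the exact trigger thresholds are runtime-defined):

```
1. Session history grows until it approaches the context-window limit.
2. If flush_on_compact is true, key facts are extracted and written to the
   workspace before anything is discarded.
3. Older messages are summarized into a compact digest.
4. The original messages are dropped; the digest takes their place.
5. The session continues with the digest plus the recent, uncompacted messages.
```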
Enabling Pre-Compaction Flush #
claw agent LongTermAssistant {
provider: "openai"
model: "gpt-4o"
system: "You are a long-running assistant that remembers everything important."
workspace: "./data/long-term"
semantic_memory: {
backend: "sqlite"
search: "hybrid"
flush_on_compact: true
}
channels: [{ type: "cli" }]
}
What Gets Flushed #
When compaction triggers and flush_on_compact is true, the runtime performs an
additional LLM call before summarization. This call asks the agent to extract key facts,
decisions, action items, and user preferences from the conversation segment that is about
to be compacted. The extracted content is appended to a flush file in the workspace.
Each flush file contains timestamped facts:
[Flush at 2025-01-15T14:30:00Z]
- User prefers responses in bullet point format
- Project Alpha deadline moved to March 15
- Budget approved for 3 additional engineers
- User's timezone is PST
- Blocker: CI pipeline integration tests failing since Monday
Because these files are in the workspace, they are automatically indexed by the semantic
memory system. Future memory_search() calls and automatic injection will find these
facts even though the original conversation messages have been compacted away.
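As a sketch of what that recall looks like in practice, a later lookup against the flushed facts shown above could call memory_search() directly. The query string and top_k value here are illustrative, not prescribed by the runtime:

```
// Hypothetical lookup against previously flushed facts.
// The flush file is ordinary workspace content, so it is chunked
// and indexed like any other file.
let results = memory_search("Project Alpha deadline", 3);
for (result in results) {
    // file_path would point at the flush file in the workspace
    emit result["file_path"] + " (score: " + str(result["score"]) + ")";
    emit result["chunk"];
}
```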
Why This Matters for Long-Running Conversations #
Without pre-compaction flush, a claw agent that runs for hours or days loses detail progressively. The compaction summary captures the gist of the conversation, but specific numbers, dates, names, and action items are often lost in summarization.
With flush_on_compact: true, those details are extracted and persisted to the workspace
before they are summarized away. The agent's semantic memory grows richer over time,
and it can answer specific questions about past conversations even after the session
history has been compacted multiple times.
Imagine taking meeting notes. As the day goes on, your short-term memory of earlier meetings fades (compaction). Without notes, you would forget the specific action items and deadlines discussed at 9 AM by the time it is 4 PM. But if you jot down the key takeaways before each meeting fades from memory (flush), you can always refer back to your notes. Pre-compaction flush is the agent writing meeting notes before its short-term memory is summarized.
26.10 Memory Isolation Rules #
Memory isolation determines which agents can see which data. Neam enforces strict isolation boundaries to prevent agents from accidentally (or maliciously) accessing each other's memory.
Isolation by Context #
The following table shows what memory each agent has access to, depending on the execution context:
| Context | Session History | Workspace | Semantic Index |
|---|---|---|---|
| Private DM | Per-session | Per-agent | Per-agent |
| Group channel | Per-group | Per-agent | Per-agent |
| Cron job | No session | Per-agent | Per-agent |
| Forge agent | No session | Per-agent | No |
| Subagent (spawn) | No session | Inherited | No |
Let us examine each context in detail.
Private DM #
When a user interacts with a claw agent through a private channel (CLI, direct message), the session history is scoped to that specific session. The workspace and semantic index are scoped to the agent -- all sessions share the same workspace and index.
claw agent PersonalBot {
    provider: "openai"
    model: "gpt-4o"
    system: "You are a personal assistant."
    workspace: "./data/personal"
    semantic_memory: { search: "hybrid" }
    channels: [{ type: "cli" }]
}
If two users both interact with PersonalBot, they have separate session histories but
share the same workspace and semantic index. Notes written by one user are visible to
the other through memory_search().
If you need per-user isolation, use the user identifier in workspace paths:
workspace_write("users/" + user_id + "/notes.md", content). This partitions the
workspace logically even though the filesystem root is shared.
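To make the partitioning idea concrete, here is a minimal sketch of a skill that namespaces writes by user. The skill name and the convention of passing user_id as a parameter are assumptions, not part of the runtime; workspace_append is used so repeated notes accumulate:

```
impl PersonalBot {
    skill save_user_note(user_id: string, content: string) -> string {
        description: "Append a note under a per-user subdirectory"
        // Namespace the path by user so reads can be scoped per user
        let path = "users/" + user_id + "/notes.md";
        let ok = workspace_append(path, content + "\n");
        if (!ok) {
            return "Failed to save note.";
        }
        return "Saved under " + path;
    }
}
```

Keep in mind this partitions only direct file reads and writes. The semantic index still spans the whole workspace, so memory_search() results can cross user boundaries.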
Group Channel #
In a group channel (Slack, Discord, or similar), the session history is shared across all participants in that group. The workspace and semantic index remain per-agent.
Cron Job #
When a claw agent runs on a cron schedule (no user-initiated message), there is no session history. The agent can still access its workspace and semantic index, which makes cron jobs ideal for periodic maintenance tasks like summarizing accumulated notes or cleaning up old files.
Forge Agent #
Forge agents have their own workspace but no session history (each iteration starts fresh) and no semantic index. They read and write files in their workspace for plan tracking, build artifacts, and iteration logs.
Subagent (Spawn) #
When a claw agent spawns a subagent using spawn(), the subagent inherits the
parent's workspace. This means the subagent can read files the parent wrote and write
files the parent can later read. However, the subagent has no session history of its own
and no semantic index access.
impl PersonalBot {
    skill delegate_research(topic: string) -> string {
        description: "Spawn a subagent to research a topic and save findings"
        let researcher = spawn({
            provider: "openai",
            model: "gpt-4o",
            system: "You are a research specialist. Save your findings to the workspace."
        });
        // The subagent inherits PersonalBot's workspace
        // It can write files that PersonalBot can later read and search
        let result = researcher.ask("Research " + topic + " and save a summary.");
        return result;
    }
}
Isolation Guarantees #
The following boundaries are never crossed:
- Agent A cannot read Agent B's workspace. Even if Agent A knows the filesystem path to Agent B's workspace, the runtime prevents cross-agent file access.
- Agent A cannot search Agent B's semantic index. Each agent's index is isolated.
- Session history from one session is not visible in another. Even for the same agent, different sessions have independent history.
- Forge agents cannot access claw agent memory, and vice versa. Each agent type operates in its own memory space.
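A quick way to observe the workspace boundary: paths that try to escape the workspace root simply fail at runtime. The relative path below is hypothetical, chosen to point toward another agent's directory:

```
// Parent traversal ("..") and absolute paths are rejected by the
// runtime's path sandbox, so this write never reaches the disk
let ok = workspace_write("../agent-b/stolen.md", "test");
if (!ok) {
    emit "Write rejected: path escapes the workspace sandbox.";
}
```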
Do not assume that spawning a subagent gives it access to the parent's semantic index. The subagent inherits the workspace filesystem but not the semantic index. If the subagent needs to search the parent's indexed content, have the parent perform the search and pass the results to the subagent as part of the prompt.
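Following that advice, a sketch of the search-then-delegate pattern might look like this. The skill name and prompt wording are illustrative:

```
impl PersonalBot {
    skill delegate_with_context(topic: string) -> string {
        description: "Search memory first, then hand the findings to a subagent"
        // The subagent cannot query this agent's semantic index,
        // so run the search here and inline the results
        let results = memory_search(topic, 3);
        let context = "";
        for (result in results) {
            context = context + result["chunk"] + "\n\n";
        }
        let helper = spawn({
            provider: "openai",
            model: "gpt-4o",
            system: "You are a research assistant."
        });
        return helper.ask("Relevant context from my memory:\n\n" + context +
            "\n\nTask: investigate " + topic + " and report back.");
    }
}
```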
26.11 Real-World Example: Knowledge-Aware Personal Assistant #
Let us bring together everything from this chapter into a complete, working example. We will build a claw agent that functions as a knowledge-aware personal assistant. It can save notes to its workspace, recall notes using semantic search, and maintain continuity across sessions thanks to persistent storage and automatic memory injection.
The Complete Program #
// knowledge_assistant.neam
// A personal assistant with persistent memory and semantic search
claw agent Sage {
    provider: "openai"
    model: "gpt-4o"
    temperature: 0.4
    system: "You are Sage, a personal knowledge assistant. You help users
        store, organize, and recall information. When users share facts,
        preferences, or notes, save them. When users ask questions, search
        your memory first. Be concise and helpful. If you recall relevant
        information from memory, reference it naturally in your response."
    workspace: "./data/sage"
    semantic_memory: {
        backend: "sqlite"
        search: "hybrid"
        flush_on_compact: true
    }
    channels: [
        { type: "cli" }
    ]
}
impl Sage {
    skill save_note(title: string, content: string) -> string {
        description: "Save a note with a title and content to persistent memory"
        // Use the title as the filename; the workspace sandbox rejects
        // parent traversal and absolute paths at runtime
        let safe_title = title;
        let path = "notes/" + safe_title + ".md";
        // Build the note with metadata
        let note = "# " + title + "\n\n";
        note = note + "Saved: " + timestamp() + "\n\n";
        note = note + content + "\n";
        let ok = workspace_write(path, note);
        if (!ok) {
            return "Failed to save note: " + title;
        }
        // Record the note in the index so list_notes can find it
        workspace_append("notes/_index.txt", title + " (" + path + ")\n");
        return "Saved note: " + title + " (stored at " + path + ")";
    }

    skill recall_notes(query: string) -> string {
        description: "Search memory for notes related to a query"
        let results = memory_search(query, 5);
        if (len(results) == 0) {
            return "I could not find any notes matching your query.";
        }
        let response = "I found " + str(len(results)) + " relevant results:\n\n";
        let i = 1;
        for (result in results) {
            response = response + "### Result " + str(i) + "\n";
            response = response + "**Source:** " + result["file_path"] + "\n";
            response = response + "**Relevance:** " + str(result["score"]) + "\n";
            response = response + result["chunk"] + "\n\n";
            i = i + 1;
        }
        return response;
    }

    skill list_notes() -> string {
        description: "List all saved notes"
        let index = workspace_read("notes/_index.txt");
        if (index == nil) {
            return "No notes saved yet.";
        }
        return "Saved notes:\n\n" + index;
    }

    skill save_preference(key: string, value: string) -> string {
        description: "Save a user preference"
        let prefs_raw = workspace_read("preferences.json");
        let prefs = {};
        if (prefs_raw != nil) {
            prefs = parse_json(prefs_raw);
        }
        prefs[key] = value;
        let ok = workspace_write("preferences.json", to_json(prefs));
        if (ok) {
            return "Preference saved: " + key + " = " + value;
        }
        return "Failed to save preference.";
    }

    skill get_session_summary() -> string {
        description: "Summarize the current conversation session"
        let history = session_history("default", 20);
        if (len(history) < 2) {
            return "Not enough conversation to summarize.";
        }
        let transcript = "";
        for (msg in history) {
            transcript = transcript + msg["role"] + ": " + msg["content"] + "\n";
        }
        let summary = self.ask(
            "Summarize this conversation in 3-5 bullet points. " +
            "Focus on key facts and decisions:\n\n" + transcript
        );
        // Persist the summary so it gets indexed for future recall
        let filename = "sessions/summary_" + timestamp() + ".md";
        workspace_write(filename, summary);
        return summary;
    }
}
// Entry point
{
    emit "Sage is ready. Type your messages below.";
    emit "Skills: save_note, recall_notes, list_notes, save_preference, get_session_summary";
    Sage.listen();
}
Usage Example #
Here is a sample interaction showing how the assistant stores and recalls information across sessions.
Session 1:
$ neam run knowledge_assistant.neam
Sage is ready. Type your messages below.
Skills: save_note, recall_notes, list_notes, save_preference, get_session_summary
> My project is called Helios. It is a solar panel monitoring system.
[Sage calls save_note("helios-project", "Project called Helios...")]
Saved note: helios-project. I have recorded that Helios is a solar panel
monitoring system.
> The tech stack is Rust for the backend, React for the frontend, and
PostgreSQL for the database.
[Sage calls save_note("helios-tech-stack", "Tech stack: Rust backend...")]
Saved note: helios-tech-stack. I have recorded the Helios tech stack.
> The launch deadline is March 15.
[Sage calls save_note("helios-deadline", "Launch deadline: March 15")]
Saved note: helios-deadline. Got it, March 15 launch deadline for Helios.
> /quit
Session 2 (next day):
$ neam run knowledge_assistant.neam
Sage is ready. Type your messages below.
> What do you know about my project?
[Sage receives automatic memory injection with top 3 relevant chunks]
[MEMORY CONTEXT includes helios-project, helios-tech-stack, helios-deadline]
Your project is Helios, a solar panel monitoring system. The tech stack is
Rust for the backend, React for the frontend, and PostgreSQL for the
database. The launch deadline is March 15.
> What database are we using?
[Sage calls recall_notes("database")]
Based on my records, Helios uses PostgreSQL for the database.
Notice that in Session 2, the agent has no session history from Session 1. But because
the notes were saved to the workspace and indexed semantically, the agent can recall
the information through automatic memory injection and explicit recall_notes calls.
Create the knowledge_assistant.neam file above and run it. Save several notes about
a topic you care about. Quit the program, restart it, and ask questions about those
notes. Observe how the semantic search retrieves relevant information even when your
question uses different wording than the original notes.
Summary #
In this chapter you learned:
- Three-tier memory architecture: Session history (volatile, in-memory), workspace files (persistent, on-disk), and semantic search (indexed, queryable by meaning). Each tier serves a different recall pattern.
- Workspace concept: Agent-scoped filesystem roots declared with the `workspace` field. All paths are relative and sandboxed -- parent traversal (`..`) and absolute paths are rejected at runtime.
- Native workspace functions: `workspace_read(path)` returns file content or `nil`, `workspace_write(path, content)` creates or overwrites files (returning `bool`), and `workspace_append(path, content)` adds to files without overwriting (returning `bool`). All three enforce path security and auto-create parent directories.
- Semantic memory configuration: The `semantic_memory` block on claw agents with three fields -- `backend` (`"sqlite"`, `"local"`, `"none"`), `search` (`"hybrid"`, `"vector"`, `"keyword"`, `"none"`), and `flush_on_compact` (`true`/`false`).
- How semantic search works: Files are chunked (400 chars, 80 overlap), embedded as vectors, and indexed in both a vector index (cosine similarity) and a BM25 index (keyword relevance). Hybrid search combines both with 70/30 weighting.
- `memory_search()`: Queries the semantic index with a natural-language string and returns ranked results with `file_path`, `chunk`, and `score`. Default `top_k` is 5.
- `session_history()`: Retrieves conversation messages from the current or a named session, useful for summarization and context extraction.
- Automatic memory injection: When semantic memory is enabled, the runtime automatically prepends the top 3 relevant chunks to the system prompt before each `.ask()` call, giving the agent contextual awareness without manual search.
- Pre-compaction flush: Setting `flush_on_compact: true` causes the agent to extract and persist key facts to the workspace before session compaction, preventing detail loss in long-running conversations.
- Memory isolation rules: Session history is per-session, workspace is per-agent, semantic index is per-agent. Subagents inherit workspace but not semantic index. Forge agents have workspace only. Cross-agent memory access is never permitted.
Exercises #
Exercise 26.1: Workspace Basics #
Create a claw agent with a workspace. Write three skills: one that saves a key-value pair to a JSON file, one that reads a value by key, and one that deletes a key. Test all three operations and verify that data persists across program restarts.
Exercise 26.2: Append-Only Log #
Build a claw agent that uses workspace_append to maintain an activity log. Every time
the user sends a message, the agent should append a timestamped entry to activity.log.
Write a show_log skill that reads and displays the log. Run 10 interactions and verify
the log contains all 10 entries.
Exercise 26.3: Search Mode Comparison #
Create three claw agents, each with a different search mode ("hybrid", "vector",
"keyword"). Write the same 5 notes to each agent's workspace. Then run the same 5
queries against each agent using memory_search(). Compare the results and scores across
search modes. Which mode performed best for factual queries? Which performed best for
conceptual queries?
Exercise 26.4: Session Summarization #
Build a claw agent that automatically summarizes the conversation every 10 messages using
session_history(). The summary should be saved to the workspace with workspace_write.
After 20 messages, verify that two summary files exist and that the semantic index can
find relevant content from both summaries via memory_search().
Exercise 26.5: Pre-Compaction Flush #
Create a claw agent with flush_on_compact: true. Have a long conversation (at least 30
messages) to trigger compaction. After compaction, verify that the compaction/ directory
contains a flush file. Then ask the agent about specific details from the early part of
the conversation. Does it recall them through the flushed content?
Exercise 26.6: Memory Isolation #
Create two claw agents (AgentA and AgentB) with separate workspaces. Write a note
using AgentA and attempt to read it from AgentB. Verify that AgentB cannot access
AgentA's workspace or semantic index. Then spawn a subagent from AgentA and verify
that the subagent can read files from AgentA's workspace.
Exercise 26.7: Knowledge Assistant Extension #
Extend the Sage example from Section 26.11 with the following features:
- A `tag_note` skill that adds tags to notes (stored as metadata in the file).
- A `search_by_tag` skill that searches only notes with a specific tag.
- A `delete_note` skill that removes a note and triggers re-indexing.
Test the complete system with at least 10 notes across 3 different tags.
Exercise 26.8: Multi-Session Memory #
Build a claw agent that maintains separate named sessions for different "projects." The
agent should have skills to switch_project(name), save_to_project(content), and
recall_from_project(query). Each project uses a different workspace subdirectory and
session key. Verify that searching within one project does not return results from
another project's notes.