Chapter 26: Semantic Memory and Workspace I/O #
"Memory is the treasury and guardian of all things." -- Cicero
An agent that forgets everything after each conversation is little more than a stateless function. It cannot build relationships with users, recall past decisions, learn from experience, or maintain a knowledge base that grows over time. In production, memory is what separates a useful assistant from a toy demo.
Neam addresses this with a three-tier memory architecture that gives agents access to volatile session history, persistent workspace files, and indexed semantic search -- all through native language constructs. You do not need to wire up databases, manage file systems manually, or implement vector search from scratch. You declare what you need, and the runtime handles the rest.
This chapter teaches you how to use all three memory tiers. You will learn the workspace filesystem, native read/write/append functions, semantic memory configuration, search modes, session history access, automatic memory injection, pre-compaction flush, and memory isolation rules. By the end, you will build a complete knowledge-aware personal assistant that stores notes, recalls them semantically, and maintains continuity across sessions.
Specifically, you will learn to:
- Distinguish the three memory tiers and when to use each
- Configure agent-scoped workspaces with path security
- Read, write, and append files using native workspace functions
- Configure semantic memory with backend, search mode, and flush options
- Use `memory_search()` to perform hybrid, vector, or keyword searches
- Access session history with `session_history()`
- Control automatic memory injection into agent prompts
- Implement pre-compaction flush for long-running conversations
- Apply memory isolation rules across different execution contexts
Imagine hiring an executive assistant who has perfect amnesia. Every morning, you walk in and they have no idea who you are, what projects you are working on, or what you discussed yesterday. You would spend half your day re-explaining context. Now imagine that assistant has a notebook (workspace), a searchable filing cabinet (semantic index), and a short-term memory of your current conversation (session history). That is the difference between a stateless agent and a memory-equipped one. In production, agents that remember are agents that deliver value.
26.1 The Three-Tier Memory Architecture #
Neam provides three distinct memory tiers, each optimized for a different access pattern. Understanding when to use each tier is essential for building agents that remember effectively without wasting resources.
How Each Tier Serves Different Recall Needs #
Each memory tier answers a different kind of question:
| Tier | What It Stores | Access Pattern | Lifetime | Speed |
|---|---|---|---|---|
| Session History | Conversation messages (role + content) | Sequential, recent-first | Current session only | Instant (in-memory) |
| Workspace Files | Arbitrary text files | Path-based read/write | Persistent across sessions | Fast (disk I/O) |
| Semantic Index | Chunked, embedded file content | Query-based similarity search | Persistent, re-indexed on change | Moderate (search + rank) |
Session history is your agent's short-term memory. It holds the current conversation and is ideal for questions like "What did the user just say?" or "What was the third message in this session?" It is volatile -- when the session ends, the history is gone unless you explicitly save it.
Workspace files are your agent's persistent notebook. They survive restarts and are ideal for storing structured data, configuration, user preferences, accumulated notes, and any information the agent needs to recall by path. You know the file name and you read it directly.
Semantic search is your agent's searchable filing cabinet. When workspace files grow large or numerous, you cannot read every file on every query. The semantic index chunks your workspace files, embeds them as vectors, and lets you search by meaning. You do not need to know the file name -- you describe what you are looking for, and the index returns the most relevant chunks.
Agent Type Support #
Not all agent types support all three tiers:
| Agent Type | Session History | Workspace | Semantic Index |
|---|---|---|---|
| `claw agent` | Yes | Yes | Yes |
| `forge agent` | No | Yes | No |
| `agent` (stateless) | No | No | No |
Claw agents are the fully memory-equipped agent type. They maintain conversation sessions,
read and write workspace files, and run semantic searches against their indexed content.
Forge agents have workspace access for reading and writing build artifacts, plans, and
iteration logs, but they do not maintain session history (each iteration starts fresh) and
do not use semantic search. Stateless agents have no memory at all -- each .ask() call
is independent.
Think of a knowledge worker at their desk. Session history is their short-term memory of the conversation happening right now -- they remember what was said in the last few minutes, but it fades quickly. Workspace files are the notebook on their desk -- they can write things down, flip back to a specific page, and the notebook persists overnight. Semantic search is the searchable filing cabinet behind them -- it holds thousands of documents they could never keep in their head or on their desk, but they can search it by topic and pull out the right folder in seconds.
26.2 Workspace Concept #
A workspace is an agent-scoped filesystem root. Every claw agent and forge agent that
declares a workspace field gets its own isolated directory on disk where it can read,
write, and append files. The workspace is the foundation of persistent memory in Neam.
Declaring a Workspace #
The workspace field specifies the path to the agent's workspace directory:
claw agent Librarian {
provider: "openai"
model: "gpt-4o"
system: "You are a librarian who remembers everything users tell you."
workspace: "./data/librarian"
channels: [
{ type: "cli" }
]
}
When this agent starts, the runtime creates the directory ./data/librarian if it does
not already exist. All workspace operations are rooted at this path.
For forge agents, the workspace serves a similar role:
forge agent Builder {
provider: "openai"
model: "gpt-4o"
system: "You are a code builder."
workspace: "./data/builder"
plan: "Build a REST API"
verify: "cargo test"
}
Path Security #
Workspace paths are strictly sandboxed. All paths passed to workspace functions are resolved relative to the workspace root, and any attempt to escape the sandbox is rejected at runtime.
The following rules are enforced:
- All paths are relative. Absolute paths (starting with `/`) are rejected.
- Parent traversal is rejected. Any path containing `..` is rejected.
- Null bytes are rejected. Paths containing `\0` are rejected.
- Symbolic links are not followed outside the workspace root.
// These work:
workspace_write("notes/meeting.md", content); // OK
workspace_read("data/users.json"); // OK
workspace_append("logs/activity.log", entry); // OK
// These are rejected at runtime:
workspace_read("../../../etc/passwd"); // ERROR: path traversal
workspace_read("/etc/passwd"); // ERROR: absolute path
workspace_write("notes/../../../secrets", data); // ERROR: path traversal
When a path violation is detected, the runtime returns nil for read operations and
false for write/append operations, and logs a security warning.
Do not use absolute paths in workspace functions. Even if the absolute path points
inside the workspace directory, it will be rejected. Always use relative paths. If you
need to construct paths dynamically, use string concatenation and ensure the result
never starts with / or contains ...
Workspace Directory Structure #
A typical workspace directory grows organically as the agent operates. Here is a common layout:
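A sketch of such a layout (the specific file names here are illustrative; only the `sessions/` and `index/` directories have the conventional roles described below):

```
./data/librarian/
├── notes/
│   ├── meeting.md
│   └── preferences.md
├── logs/
│   └── activity.log
├── sessions/
│   └── summary_2025-01-15.md
└── index/
    └── (managed by the semantic memory subsystem)
```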
The sessions/ directory is commonly used to persist conversation summaries. The index/
directory is managed by the semantic memory subsystem (if enabled). The agent can create
any directory structure it needs -- the runtime creates parent directories automatically
when writing files.
26.3 Native Workspace Functions #
Neam provides three native functions for workspace I/O. These functions are available
inside any agent that has a workspace field declared. They operate relative to the
agent's workspace root and enforce path security automatically.
workspace_read(path) #
Reads the contents of a file from the workspace. Returns the file content as a string
on success, or nil if the file does not exist or the path is invalid.
Signature:
workspace_read(path: string) → string | nil
Example:
claw agent NoteTaker {
provider: "openai"
model: "gpt-4o"
system: "You are a note-taking assistant."
workspace: "./data/notes"
channels: [{ type: "cli" }]
}
impl NoteTaker {
skill read_note(name: string) -> string {
description: "Read a saved note by name"
let content = workspace_read("notes/" + name + ".md");
if (content == nil) {
return "No note found with name: " + name;
}
return content;
}
}
workspace_write(path, content) #
Writes content to a file in the workspace. If the file already exists, it is
truncated (overwritten). If the file does not exist, it is created. If the parent
directories do not exist, they are created automatically. Returns true on success,
false on failure.
Signature:
workspace_write(path: string, content: string) → bool
Example:
impl NoteTaker {
skill save_note(name: string, content: string) -> string {
description: "Save a note with the given name and content"
let path = "notes/" + name + ".md";
let ok = workspace_write(path, content);
if (ok) {
return "Note saved: " + name;
}
return "Failed to save note: " + name;
}
}
When workspace_write is called, the runtime performs these steps:
- Validate the path (no `..`, no absolute paths, no null bytes).
- Resolve the full path: `workspace_root + "/" + path`.
- Create any missing parent directories.
- Open the file in write mode (truncate if exists, create if not).
- Write the content.
- Return `true` on success.
workspace_append(path, content) #
Appends content to the end of a file in the workspace. If the file does not exist, it is
created (equivalent to workspace_write for the first call). Returns true on success,
false on failure.
Signature:
workspace_append(path: string, content: string) → bool
Example:
impl NoteTaker {
skill log_interaction(summary: string) -> string {
description: "Append an interaction summary to the activity log"
let entry = "[" + timestamp() + "] " + summary + "\n";
let ok = workspace_append("logs/activity.log", entry);
if (ok) {
return "Logged.";
}
return "Failed to log interaction.";
}
}
workspace_append is ideal for log files, accumulating data over time, and any scenario
where you want to add content without overwriting existing data.
Error Handling Patterns #
All three workspace functions handle errors gracefully. They do not throw exceptions --
instead, they return sentinel values (nil for reads, false for writes/appends). This
design encourages defensive programming:
// Pattern 1: Check-and-proceed
let data = workspace_read("config.json");
if (data == nil) {
// File does not exist or path is invalid
// Use defaults or create the file
workspace_write("config.json", default_config);
data = default_config;
}
// Pattern 2: Write-with-verification
let success = workspace_write("output/report.md", report);
if (!success) {
emit "WARNING: Failed to write report to workspace.";
// Fall back to emitting the report directly
emit report;
}
// Pattern 3: Append-or-create log
fn log_event(event: string) -> bool {
return workspace_append("events.log", event + "\n");
}
Use workspace_write for data that should be replaced entirely on each update (user
preferences, current state, generated reports). Use workspace_append for data that
accumulates over time (logs, conversation summaries, event streams). A common mistake
is using workspace_write for log files, which causes you to lose all previous entries.
Security Summary #
| Operation | Path Traversal | Absolute Path | Null Bytes | Missing Parents |
|---|---|---|---|---|
| `workspace_read` | Rejected → `nil` | Rejected → `nil` | Rejected → `nil` | Returns `nil` |
| `workspace_write` | Rejected → `false` | Rejected → `false` | Rejected → `false` | Auto-created |
| `workspace_append` | Rejected → `false` | Rejected → `false` | Rejected → `false` | Auto-created |
26.4 Semantic Memory Configuration #
While workspace files give agents persistent storage, semantic memory gives agents the ability to search that storage by meaning. Instead of reading files by path, the agent can ask a question and get back the most relevant chunks from across all its workspace files.
Semantic memory is configured with the semantic_memory block on a claw agent
declaration:
claw agent ResearchAssistant {
provider: "openai"
model: "gpt-4o"
system: "You are a research assistant with long-term memory."
workspace: "./data/research"
semantic_memory: {
backend: "sqlite"
search: "hybrid"
flush_on_compact: true
}
channels: [{ type: "cli" }]
}
Configuration Fields #
backend #
The storage backend for the semantic index. This determines where chunked, embedded content is stored.
| Value | Description | Best For |
|---|---|---|
| `"sqlite"` | SQLite database file in workspace | Most use cases, single-agent |
| `"local"` | In-memory with periodic flush to disk | High-throughput, ephemeral workloads |
| `"none"` | Semantic memory disabled | Agents that only need workspace files |
The "sqlite" backend stores the BM25 index and vector embeddings in a SQLite database
inside the workspace's index/ directory. It is persistent, crash-safe, and efficient
for workspaces with up to hundreds of thousands of chunks.
The "local" backend keeps everything in memory for maximum speed. It flushes the index
to disk periodically and on shutdown. If the process crashes, recently indexed content
may be lost.
search #
The search mode determines how queries are matched against indexed content.
| Value | Description | Ranking Method |
|---|---|---|
| `"hybrid"` | Combined vector and keyword search | 70% cosine + 30% BM25 |
| `"vector"` | Semantic similarity only | Cosine similarity |
| `"keyword"` | Lexical matching only | BM25 scoring |
| `"none"` | Indexing enabled, search disabled | N/A |
The "hybrid" mode is the default and recommended setting. It combines the strengths of
both vector search (understanding meaning and synonyms) and keyword search (exact term
matching and rare word retrieval).
flush_on_compact #
When set to true, the agent writes important facts from the current session to the
workspace before compaction occurs. This ensures that critical information survives the
summarization process that compaction performs. See Section 26.9 for details.
| Value | Behavior |
|---|---|
| `true` | Flush facts to workspace before compaction |
| `false` | No pre-compaction flush (default) |
Defaults Table #
If you omit the semantic_memory block entirely, no semantic memory is configured. If
you include the block, the following defaults apply for any omitted field:
| Field | Default Value |
|---|---|
| `backend` | `"sqlite"` |
| `search` | `"hybrid"` |
| `flush_on_compact` | `false` |
So the minimal semantic memory configuration is:
semantic_memory: {}
This gives you a SQLite-backed, hybrid-search semantic memory with no pre-compaction flush -- which is a reasonable starting point for most applications.
Semantic memory requires a workspace to be declared. If you specify a
semantic_memory block without a workspace field, the compiler will produce an
error. The semantic index is stored inside the workspace directory and indexes the
workspace files.
26.5 How Semantic Search Works #
Understanding the internals of semantic search helps you tune it for your use case. This section explains the indexing pipeline, chunking parameters, and the three search modes in detail.
MemoryIndex Architecture #
The MemoryIndex is the runtime component that manages the semantic index. It sits
between the workspace filesystem and the search functions:
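Conceptually, its position looks like this (an illustrative sketch, not runtime output):

```
workspace_write / workspace_append
        │ file content
        ▼
  ┌─────────────┐  chunks + embeddings  ┌───────────────────────┐
  │ MemoryIndex │ ────────────────────▶ │ BM25 + vector indexes │
  └─────────────┘                       └───────────────────────┘
        ▲
        │ queries
memory_search() / automatic memory injection
```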
Chunking #
When a file is added to the workspace (via workspace_write or workspace_append), the
MemoryIndex automatically chunks the file's content for indexing. The chunking
parameters are:
| Parameter | Value | Description |
|---|---|---|
| Chunk size | 400 characters | Maximum length of each chunk |
| Chunk overlap | 80 characters | Characters shared between adjacent chunks |
The overlap ensures that concepts spanning a chunk boundary are captured in at least one chunk. For example, if a paragraph about "quarterly revenue targets" straddles the boundary between chunk 3 and chunk 4, the 80-character overlap ensures that the full context appears in at least one of those chunks.
Example of chunking a 1000-character document:
Characters: 0 ─────────────────────────────────────────── 1000
Chunk 1:    ├──── 0–399 ────┤
Chunk 2:           ├──── 320–719 ────┤
Chunk 3:                       ├──── 640–999 ────┤
Chunk 1: characters 0-399 (400 chars)
Chunk 2: characters 320-719 (400 chars, overlaps 80 with chunk 1)
Chunk 3: characters 640-999 (360 chars, overlaps 80 with chunk 2)
Each chunk becomes a MemoryChunk record with the following fields:
- `file_path` -- the workspace-relative path of the source file
- `chunk_text` -- the chunk content
- `chunk_index` -- the ordinal position of this chunk within the file
- `embedding` -- the vector embedding of the chunk text
Three Search Modes #
Hybrid Search (Default) #
Hybrid search combines vector similarity and keyword relevance. The final score for each chunk is computed as:
score = 0.7 * cosine_similarity(query_embedding, chunk_embedding)
+ 0.3 * bm25_score(query_terms, chunk_terms)
This weighting favors semantic understanding (70%) while still rewarding exact keyword matches (30%). Hybrid search excels when users mix natural-language questions with specific terms:
- "What did we discuss about the Q3 revenue targets?" -- The vector component matches chunks about financial goals and quarterly planning, while the BM25 component boosts chunks containing the exact terms "Q3" and "revenue."
Vector Search #
Vector search uses cosine similarity exclusively:
score = cosine_similarity(query_embedding, chunk_embedding)
Vector search understands synonyms, paraphrases, and conceptual similarity. It finds chunks about "quarterly financial goals" even if the query says "Q3 targets." However, it can miss chunks that contain rare or specific terms (product codes, acronyms, proper nouns) that are not well-represented in the embedding space.
Best for: Broad, conceptual queries where exact wording does not matter.
Keyword Search #
Keyword search uses the BM25 algorithm exclusively:
score = bm25(query_terms, chunk_terms)
BM25 (Best Matching 25) is a probabilistic information retrieval function that ranks documents based on term frequency, inverse document frequency, and document length normalization. It excels at finding exact matches and rare terms.
Best for: Queries with specific identifiers, product codes, error messages, or proper nouns that must match exactly.
Indexing Flow #
The complete indexing flow when a file is written to the workspace:
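Assembled from the chunking and search machinery described above, the flow can be sketched as follows (a reconstruction from this section, not an official runtime trace):

```
1. workspace_write("notes/x.md", content) validates the path and writes the file.
2. MemoryIndex splits the content into 400-character chunks with 80-character overlap.
3. Each chunk is embedded as a vector.
4. Chunk terms are added to the BM25 index.
5. Chunk embeddings are added to the vector index.
6. The chunks are now reachable via memory_search() and automatic injection.
```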
When a file is overwritten, the old chunks are removed from both indexes before the new chunks are inserted. When a file is appended to, the entire file is re-chunked and re-indexed (since chunk boundaries shift when content is added).
If you are appending many small entries to a single file (like a log), consider
writing each entry to a separate file instead. This avoids re-indexing the entire file
on every append. For example, write to logs/2025-01-15-001.txt,
logs/2025-01-15-002.txt, and so on. Each file is indexed independently, so only the
new file is processed.
26.6 memory_search() Function #
The memory_search() function is the primary interface for querying the semantic index.
It takes a natural-language query and returns the most relevant chunks from the agent's
indexed workspace files.
Signature #
memory_search(query: string, top_k?: int) → list of {file_path, chunk, score}
| Parameter | Type | Default | Description |
|---|---|---|---|
| `query` | `string` | (required) | The search query in natural language |
| `top_k` | `int` | `5` | Maximum number of results to return |
Return Value #
The function returns a list of maps, each containing:
| Field | Type | Description |
|---|---|---|
| `file_path` | `string` | Workspace-relative path of the source file |
| `chunk` | `string` | The matched chunk text |
| `score` | `float` | Relevance score (0.0 to 1.0, higher is better) |
Results are sorted by score in descending order (most relevant first).
Code Examples #
Basic search:
impl ResearchAssistant {
skill recall(query: string) -> string {
description: "Search memory for information related to a query"
let results = memory_search(query);
if (len(results) == 0) {
return "I do not have any information about that topic.";
}
let response = "Here is what I found:\n\n";
for (result in results) {
response = response + "**" + result["file_path"] + "** ";
response = response + "(score: " + str(result["score"]) + ")\n";
response = response + result["chunk"] + "\n\n";
}
return response;
}
}
Search with custom top_k:
// Get the single most relevant chunk
let best = memory_search("deployment procedure", 1);
if (len(best) > 0) {
let answer = best[0]["chunk"];
emit "Most relevant: " + answer;
}
// Get a broad set of results for comprehensive answers
let broad = memory_search("user feedback on onboarding", 10);
emit "Found " + str(len(broad)) + " relevant chunks.";
Combining search with workspace read for full context:
impl ResearchAssistant {
skill deep_recall(query: string) -> string {
description: "Search memory and return the full source file"
let results = memory_search(query, 3);
if (len(results) == 0) {
return "No relevant information found.";
}
// Read the full file that contains the best match
let best_file = results[0]["file_path"];
let full_content = workspace_read(best_file);
if (full_content == nil) {
return "Found match in " + best_file + " but could not read file.";
}
return "Source: " + best_file + "\n\n" + full_content;
}
}
Interpreting Scores #
The score returned by memory_search() ranges from 0.0 to 1.0, but the interpretation
depends on the search mode:
| Score Range | Hybrid | Vector | Keyword |
|---|---|---|---|
| 0.9 -- 1.0 | Near-exact match on both meaning and terms | Very high semantic similarity | Near-exact term match |
| 0.7 -- 0.9 | Strong match, related content | Related content, correct topic | Many shared terms |
| 0.5 -- 0.7 | Moderate relevance, partially related | Same general domain | Some shared terms |
| 0.3 -- 0.5 | Weak match, tangentially related | Loosely related | Few shared terms |
| 0.0 -- 0.3 | Likely not relevant | Different topic | Almost no shared terms |
When building skills that surface memory search results to users, consider filtering results below a threshold. A score of 0.5 is a reasonable cutoff for hybrid search -- results below this are often noise rather than signal. For vector-only search, a cutoff of 0.6 is more appropriate because vector scores tend to be lower for truly unrelated content.
let results = memory_search(query, 10);
let filtered = [];
for (result in results) {
if (result["score"] >= 0.5) {
filtered = filtered + [result];
}
}
Choosing the Right Search Mode #
The search mode is set globally on the semantic_memory block, but understanding when
each mode excels helps you choose correctly:
| Scenario | Recommended Mode | Why |
|---|---|---|
| General-purpose assistant | `"hybrid"` | Balances meaning and exact terms |
| Creative writing assistant | `"vector"` | Prioritizes thematic similarity |
| Code assistant searching by function name | `"keyword"` | Exact identifiers matter |
| Log analysis with error codes | `"keyword"` | Specific codes must match exactly |
| Research assistant with varied queries | `"hybrid"` | Handles both conceptual and specific queries |
26.7 session_history() Function #
While memory_search() queries persistent workspace content, session_history() queries
the volatile conversation history of the current (or a named) session. This is useful for
summarization, context extraction, and building agents that reference earlier parts of the
conversation.
Signature #
session_history(session_key?: string, limit?: int) → list of {role, content}
| Parameter | Type | Default | Description |
|---|---|---|---|
| `session_key` | `string` | `"default"` | The session identifier to query |
| `limit` | `int` | All messages | Maximum number of messages to return (most recent first) |
Return Value #
The function returns a list of maps, each containing:
| Field | Type | Description |
|---|---|---|
| `role` | `string` | `"user"`, `"assistant"`, or `"system"` |
| `content` | `string` | The message content |
Messages are returned in chronological order (oldest first), but the limit parameter
selects the N most recent messages.
Code Examples #
Get the entire conversation history:
impl Librarian {
skill show_history() -> string {
description: "Show the conversation history for this session"
let history = session_history();
if (len(history) == 0) {
return "No conversation history yet.";
}
let output = "Conversation so far:\n\n";
for (msg in history) {
output = output + "**" + msg["role"] + "**: " + msg["content"] + "\n\n";
}
return output;
}
}
Get the last 5 messages:
let recent = session_history("default", 5);
for (msg in recent) {
emit msg["role"] + ": " + msg["content"];
}
Access a named session:
Claw agents can maintain multiple named sessions (for example, one per user or one per channel). You can query any session by key:
// Get history from a specific user's session
let user_session = session_history("user_42", 10);
// Get history from a group channel session
let group_session = session_history("group_engineering", 20);
Use Cases #
Summarization: Build a skill that summarizes the conversation so far, useful for handoff to another agent or for saving a session summary to the workspace.
impl Librarian {
skill summarize_session() -> string {
description: "Create a summary of the current conversation"
let history = session_history();
if (len(history) < 3) {
return "Not enough conversation to summarize.";
}
// Build a summary prompt from the history
let transcript = "";
for (msg in history) {
transcript = transcript + msg["role"] + ": " + msg["content"] + "\n";
}
// Ask the agent itself to summarize
let summary = self.ask("Summarize this conversation in 3 bullet points:\n\n" + transcript);
// Save the summary to workspace for persistence
let filename = "sessions/summary_" + timestamp() + ".md";
workspace_write(filename, summary);
return summary;
}
}
Context extraction: Pull specific facts from the conversation to store in the workspace for long-term recall.
impl Librarian {
skill extract_facts() -> string {
description: "Extract key facts from the conversation and save them"
let history = session_history("default", 20);
let transcript = "";
for (msg in history) {
if (msg["role"] == "user") {
transcript = transcript + msg["content"] + "\n";
}
}
let facts = self.ask(
"Extract all factual statements from this text. " +
"Return each fact on its own line:\n\n" + transcript
);
workspace_append("facts/extracted.md", "\n## Session " + timestamp() + "\n" + facts);
return "Extracted and saved facts:\n" + facts;
}
}
session_history() is only available in claw agent declarations. Calling it from a
forge agent or a stateless agent will return an empty list, because those agent
types do not maintain session state.
26.8 Automatic Memory Injection #
One of the most powerful features of Neam's semantic memory system is automatic memory
injection. When a claw agent has semantic memory enabled (with search set to anything
other than "none"), the runtime automatically injects relevant memory context into the
system prompt before each .ask() call.
How It Works #
When a user sends a message to a claw agent with semantic memory enabled, the runtime performs these steps before the LLM call:
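In outline, the sequence looks like this (a sketch assembled from the behavior described in this section):

```
1. Receive the user's message.
2. Use the message text as a query against the semantic index.
3. Retrieve the top-ranked chunks with their scores and source paths.
4. Format them into a [MEMORY CONTEXT] block.
5. Prepend the block to the system prompt.
6. Send the augmented prompt to the LLM.
```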
What Gets Injected #
The runtime retrieves the top 3 most relevant chunks from the semantic index using the user's message as the query. These are formatted into a structured block prepended to the system prompt:
[MEMORY CONTEXT]
The following information was retrieved from your memory. Use it to inform
your response if relevant, but do not reference it unless the user asks
about these topics.
Source: notes/project-alpha.md (score: 0.87)
> Project Alpha is scheduled for Q2 launch. The team consists of 4 engineers
> and 1 designer. Primary risk: API integration with legacy systems.
Source: meetings/standup-jan-15.md (score: 0.74)
> Discussed blockers: CI pipeline failing on integration tests. John to
> investigate by end of week.
Source: notes/team-roster.md (score: 0.62)
> Engineering team: Alice (lead), Bob, Carol, Dave. Designer: Eve.
[END MEMORY CONTEXT]
When Injection Occurs #
Automatic injection occurs on every .ask() call when all of the following conditions
are met:
- The agent is a `claw agent`.
- The `semantic_memory` block is present.
- The `search` field is not `"none"`.
- The semantic index contains at least one chunk.
If the semantic index is empty (no workspace files have been written yet), injection is skipped silently.
Controlling Injection Behavior #
You can control automatic injection in several ways:
Disable injection entirely by setting search to "none":
semantic_memory: {
backend: "sqlite"
search: "none"
}
This still enables workspace file operations and manual memory_search() calls, but
the runtime will not automatically inject context into prompts.
Use manual search instead when you want full control over what context is injected.
Set search to "none" and build the context yourself in a skill:
claw agent CustomMemory {
provider: "openai"
model: "gpt-4o"
system: "You are an assistant with manual memory control."
workspace: "./data/custom"
semantic_memory: {
backend: "sqlite"
search: "none"
}
channels: [{ type: "cli" }]
}
impl CustomMemory {
skill answer_with_context(question: string) -> string {
description: "Answer a question using manually retrieved memory context"
// Search with custom top_k and threshold
let results = memory_search(question, 8);
let context = "";
for (result in results) {
if (result["score"] >= 0.6) {
context = context + result["chunk"] + "\n\n";
}
}
if (context == "") {
return self.ask(question);
}
let augmented = "Use this context to help answer:\n\n" + context +
"\n\nQuestion: " + question;
return self.ask(augmented);
}
}
Do not confuse automatic injection with RAG (Chapter 15). Automatic memory injection
searches the agent's own workspace files -- content the agent itself has written or
that was placed in its workspace directory. RAG searches an externally defined
knowledge block that you populate with documents. They serve different purposes:
memory injection is for the agent's personal accumulated knowledge, while RAG is for
external reference material.
26.9 Pre-Compaction Flush #
Long-running claw agent sessions accumulate conversation history that grows without bound. At some point, the runtime compacts the session by summarizing older messages and discarding the originals. This keeps the context window manageable but risks losing important details buried in the summarized portion.
Pre-compaction flush addresses this by writing critical facts to the workspace before
compaction occurs. When flush_on_compact is set to true, the agent gets a chance to
preserve important information in persistent storage before the volatile session history
is trimmed.
How Compaction Works #
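The sequence below is a sketch assembled from the behavior described in this section (the exact trigger thresholds are runtime-defined):

```
1. Session history grows until it approaches the context-window limit.
2. If flush_on_compact is true, key facts are extracted and written to the
   workspace before anything is discarded.
3. Older messages are summarized into a compact digest.
4. The original messages are dropped; the digest takes their place.
5. The session continues with the digest plus the recent, uncompacted messages.
```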
Enabling Pre-Compaction Flush #
claw agent LongTermAssistant {
provider: "openai"
model: "gpt-4o"
system: "You are a long-running assistant that remembers everything important."
workspace: "./data/long-term"
semantic_memory: {
backend: "sqlite"
search: "hybrid"
flush_on_compact: true
}
channels: [{ type: "cli" }]
}
What Gets Flushed #
When compaction triggers and flush_on_compact is true, the runtime performs an
additional LLM call before summarization. This call asks the agent to extract key facts,
decisions, action items, and user preferences from the conversation segment that is about
to be compacted. The extracted content is appended to a flush file in the workspace.
Each flush file contains timestamped facts:
[Flush at 2025-01-15T14:30:00Z]
- User prefers responses in bullet point format
- Project Alpha deadline moved to March 15
- Budget approved for 3 additional engineers
- User's timezone is PST
- Blocker: CI pipeline integration tests failing since Monday
Because these files are in the workspace, they are automatically indexed by the semantic
memory system. Future memory_search() calls and automatic injection will find these
facts even though the original conversation messages have been compacted away.
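As a sketch of what that recall looks like in practice, a later lookup against the flushed facts shown above could call memory_search() directly. The query string and top_k value here are illustrative, not prescribed by the runtime:

```
// Hypothetical lookup against previously flushed facts.
// The flush file is ordinary workspace content, so it is chunked
// and indexed like any other file.
let results = memory_search("Project Alpha deadline", 3);
for (result in results) {
    // file_path would point at the flush file in the workspace
    emit result["file_path"] + " (score: " + str(result["score"]) + ")";
    emit result["chunk"];
}
```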
Why This Matters for Long-Running Conversations #
Without pre-compaction flush, a claw agent that runs for hours or days loses detail progressively. The compaction summary captures the gist of the conversation, but specific numbers, dates, names, and action items are often lost in summarization.
With flush_on_compact: true, those details are extracted and persisted to the workspace
before they are summarized away. The agent's semantic memory grows richer over time,
and it can answer specific questions about past conversations even after the session
history has been compacted multiple times.
Imagine taking meeting notes. As the day goes on, your short-term memory of earlier meetings fades (compaction). Without notes, you would forget the specific action items and deadlines discussed at 9 AM by the time it is 4 PM. But if you jot down the key takeaways before each meeting fades from memory (flush), you can always refer back to your notes. Pre-compaction flush is the agent writing meeting notes before its short-term memory is summarized.
26.10 Memory Isolation Rules #
Memory isolation determines which agents can see which data. Neam enforces strict isolation boundaries to prevent agents from accidentally (or maliciously) accessing each other's memory.
Isolation by Context #
The following table shows what memory each agent has access to, depending on the execution context:
| Context | Session History | Workspace | Semantic Index |
|---|---|---|---|
| Private DM | Per-session | Per-agent | Per-agent |
| Group channel | Per-group | Per-agent | Per-agent |
| Cron job | No session | Per-agent | Per-agent |
| Forge agent | No session | Per-agent | No |
| Subagent (spawn) | No session | Inherited | No |
Let us examine each context in detail.
Private DM #
When a user interacts with a claw agent through a private channel (CLI, direct message), the session history is scoped to that specific session. The workspace and semantic index are scoped to the agent -- all sessions share the same workspace and index.
claw agent PersonalBot {
    provider: "openai"
    model: "gpt-4o"
    system: "You are a personal assistant."
    workspace: "./data/personal"
    semantic_memory: { search: "hybrid" }
    channels: [{ type: "cli" }]
}
If two users both interact with PersonalBot, they have separate session histories but
share the same workspace and semantic index. Notes written by one user are visible to
the other through memory_search().
If you need per-user isolation, use the user identifier in workspace paths:
workspace_write("users/" + user_id + "/notes.md", content). This partitions the
workspace logically even though the filesystem root is shared.
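To make the partitioning idea concrete, here is a minimal sketch of a skill that namespaces writes by user. The skill name and the convention of passing user_id as a parameter are assumptions, not part of the runtime; workspace_append is used so repeated notes accumulate:

```
impl PersonalBot {
    skill save_user_note(user_id: string, content: string) -> string {
        description: "Append a note under a per-user subdirectory"
        // Namespace the path by user so reads can be scoped per user
        let path = "users/" + user_id + "/notes.md";
        let ok = workspace_append(path, content + "\n");
        if (!ok) {
            return "Failed to save note.";
        }
        return "Saved under " + path;
    }
}
```

Keep in mind this partitions only direct file reads and writes. The semantic index still spans the whole workspace, so memory_search() results can cross user boundaries.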
Group Channel #
In a group channel (Slack, Discord, or similar), the session history is shared across all participants in that group. The workspace and semantic index remain per-agent.
Cron Job #
When a claw agent runs on a cron schedule (no user-initiated message), there is no session history. The agent can still access its workspace and semantic index, which makes cron jobs ideal for periodic maintenance tasks like summarizing accumulated notes or cleaning up old files.
Forge Agent #
Forge agents have their own workspace but no session history (each iteration starts fresh) and no semantic index. They read and write files in their workspace for plan tracking, build artifacts, and iteration logs.
Subagent (Spawn) #
When a claw agent spawns a subagent using spawn(), the subagent inherits the
parent's workspace. This means the subagent can read files the parent wrote and write
files the parent can later read. However, the subagent has no session history of its own
and no semantic index access.
impl PersonalBot {
    skill delegate_research(topic: string) -> string {
        description: "Spawn a subagent to research a topic and save findings"
        let researcher = spawn({
            provider: "openai",
            model: "gpt-4o",
            system: "You are a research specialist. Save your findings to the workspace."
        });
        // The subagent inherits PersonalBot's workspace
        // It can write files that PersonalBot can later read and search
        let result = researcher.ask("Research " + topic + " and save a summary.");
        return result;
    }
}
Isolation Guarantees #
The following boundaries are never crossed:
- Agent A cannot read Agent B's workspace. Even if Agent A knows the filesystem path to Agent B's workspace, the runtime prevents cross-agent file access.
- Agent A cannot search Agent B's semantic index. Each agent's index is isolated.
- Session history from one session is not visible in another. Even for the same agent, different sessions have independent history.
- Forge agents cannot access claw agent memory, and vice versa. Each agent type operates in its own memory space.
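A quick way to observe the workspace boundary: paths that try to escape the workspace root simply fail at runtime. The relative path below is hypothetical, chosen to point toward another agent's directory:

```
// Parent traversal ("..") and absolute paths are rejected by the
// runtime's path sandbox, so this write never reaches the disk
let ok = workspace_write("../agent-b/stolen.md", "test");
if (!ok) {
    emit "Write rejected: path escapes the workspace sandbox.";
}
```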
Do not assume that spawning a subagent gives it access to the parent's semantic index. The subagent inherits the workspace filesystem but not the semantic index. If the subagent needs to search the parent's indexed content, have the parent perform the search and pass the results to the subagent as part of the prompt.
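Following that advice, a sketch of the search-then-delegate pattern might look like this. The skill name and prompt wording are illustrative:

```
impl PersonalBot {
    skill delegate_with_context(topic: string) -> string {
        description: "Search memory first, then hand the findings to a subagent"
        // The subagent cannot query this agent's semantic index,
        // so run the search here and inline the results
        let results = memory_search(topic, 3);
        let context = "";
        for (result in results) {
            context = context + result["chunk"] + "\n\n";
        }
        let helper = spawn({
            provider: "openai",
            model: "gpt-4o",
            system: "You are a research assistant."
        });
        return helper.ask("Relevant context from my memory:\n\n" + context +
            "\n\nTask: investigate " + topic + " and report back.");
    }
}
```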
26.11 Real-World Example: Knowledge-Aware Personal Assistant #
Let us bring together everything from this chapter into a complete, working example. We will build a claw agent that functions as a knowledge-aware personal assistant. It can save notes to its workspace, recall notes using semantic search, and maintain continuity across sessions thanks to persistent storage and automatic memory injection.
The Complete Program #
// knowledge_assistant.neam
// A personal assistant with persistent memory and semantic search
claw agent Sage {
    provider: "openai"
    model: "gpt-4o"
    temperature: 0.4
    system: "You are Sage, a personal knowledge assistant. You help users
        store, organize, and recall information. When users share facts,
        preferences, or notes, save them. When users ask questions, search
        your memory first. Be concise and helpful. If you recall relevant
        information from memory, reference it naturally in your response."
    workspace: "./data/sage"
    semantic_memory: {
        backend: "sqlite"
        search: "hybrid"
        flush_on_compact: true
    }
    channels: [
        { type: "cli" }
    ]
}
impl Sage {
    skill save_note(title: string, content: string) -> string {
        description: "Save a note with a title and content to persistent memory"
        // Use the title as the filename; the workspace sandbox rejects
        // parent traversal and absolute paths at runtime
        let safe_title = title;
        let path = "notes/" + safe_title + ".md";
        // Build the note with metadata
        let note = "# " + title + "\n\n";
        note = note + "Saved: " + timestamp() + "\n\n";
        note = note + content + "\n";
        let ok = workspace_write(path, note);
        if (!ok) {
            return "Failed to save note: " + title;
        }
        // Record the note in the index so list_notes can find it
        workspace_append("notes/_index.txt", title + " (" + path + ")\n");
        return "Saved note: " + title + " (stored at " + path + ")";
    }

    skill recall_notes(query: string) -> string {
        description: "Search memory for notes related to a query"
        let results = memory_search(query, 5);
        if (len(results) == 0) {
            return "I could not find any notes matching your query.";
        }
        let response = "I found " + str(len(results)) + " relevant results:\n\n";
        let i = 1;
        for (result in results) {
            response = response + "### Result " + str(i) + "\n";
            response = response + "**Source:** " + result["file_path"] + "\n";
            response = response + "**Relevance:** " + str(result["score"]) + "\n";
            response = response + result["chunk"] + "\n\n";
            i = i + 1;
        }
        return response;
    }

    skill list_notes() -> string {
        description: "List all saved notes"
        let index = workspace_read("notes/_index.txt");
        if (index == nil) {
            return "No notes saved yet.";
        }
        return "Saved notes:\n\n" + index;
    }

    skill save_preference(key: string, value: string) -> string {
        description: "Save a user preference"
        let prefs_raw = workspace_read("preferences.json");
        let prefs = {};
        if (prefs_raw != nil) {
            prefs = parse_json(prefs_raw);
        }
        prefs[key] = value;
        let ok = workspace_write("preferences.json", to_json(prefs));
        if (ok) {
            return "Preference saved: " + key + " = " + value;
        }
        return "Failed to save preference.";
    }

    skill get_session_summary() -> string {
        description: "Summarize the current conversation session"
        let history = session_history("default", 20);
        if (len(history) < 2) {
            return "Not enough conversation to summarize.";
        }
        let transcript = "";
        for (msg in history) {
            transcript = transcript + msg["role"] + ": " + msg["content"] + "\n";
        }
        let summary = self.ask(
            "Summarize this conversation in 3-5 bullet points. " +
            "Focus on key facts and decisions:\n\n" + transcript
        );
        // Persist the summary so it gets indexed for future recall
        let filename = "sessions/summary_" + timestamp() + ".md";
        workspace_write(filename, summary);
        return summary;
    }
}
// Entry point
{
    emit "Sage is ready. Type your messages below.";
    emit "Skills: save_note, recall_notes, list_notes, save_preference, get_session_summary";
    Sage.listen();
}
Usage Example #
Here is a sample interaction showing how the assistant stores and recalls information across sessions.
Session 1:
$ neam run knowledge_assistant.neam
Sage is ready. Type your messages below.
Skills: save_note, recall_notes, list_notes, save_preference, get_session_summary
> My project is called Helios. It is a solar panel monitoring system.
[Sage calls save_note("helios-project", "Project called Helios...")]
Saved note: helios-project. I have recorded that Helios is a solar panel
monitoring system.
> The tech stack is Rust for the backend, React for the frontend, and
PostgreSQL for the database.
[Sage calls save_note("helios-tech-stack", "Tech stack: Rust backend...")]
Saved note: helios-tech-stack. I have recorded the Helios tech stack.
> The launch deadline is March 15.
[Sage calls save_note("helios-deadline", "Launch deadline: March 15")]
Saved note: helios-deadline. Got it, March 15 launch deadline for Helios.
> /quit
Session 2 (next day):
$ neam run knowledge_assistant.neam
Sage is ready. Type your messages below.
> What do you know about my project?
[Sage receives automatic memory injection with top 3 relevant chunks]
[MEMORY CONTEXT includes helios-project, helios-tech-stack, helios-deadline]
Your project is Helios, a solar panel monitoring system. The tech stack is
Rust for the backend, React for the frontend, and PostgreSQL for the
database. The launch deadline is March 15.
> What database are we using?
[Sage calls recall_notes("database")]
Based on my records, Helios uses PostgreSQL for the database.
Notice that in Session 2, the agent has no session history from Session 1. But because
the notes were saved to the workspace and indexed semantically, the agent can recall
the information through automatic memory injection and explicit recall_notes calls.
Create the knowledge_assistant.neam file above and run it. Save several notes about
a topic you care about. Quit the program, restart it, and ask questions about those
notes. Observe how the semantic search retrieves relevant information even when your
question uses different wording than the original notes.
Summary #
In this chapter you learned:
- Three-tier memory architecture: Session history (volatile, in-memory), workspace files (persistent, on-disk), and semantic search (indexed, queryable by meaning). Each tier serves a different recall pattern.
- Workspace concept: Agent-scoped filesystem roots declared with the `workspace` field. All paths are relative and sandboxed -- parent traversal (`..`) and absolute paths are rejected at runtime.
- Native workspace functions: `workspace_read(path)` returns file content or `nil`, `workspace_write(path, content)` creates or overwrites files (returning `bool`), and `workspace_append(path, content)` adds to files without overwriting (returning `bool`). All three enforce path security and auto-create parent directories.
- Semantic memory configuration: The `semantic_memory` block on claw agents with three fields -- `backend` (`"sqlite"`, `"local"`, `"none"`), `search` (`"hybrid"`, `"vector"`, `"keyword"`, `"none"`), and `flush_on_compact` (`true`/`false`).
- How semantic search works: Files are chunked (400 chars, 80 overlap), embedded as vectors, and indexed in both a vector index (cosine similarity) and a BM25 index (keyword relevance). Hybrid search combines both with 70/30 weighting.
- `memory_search()`: Queries the semantic index with a natural-language string and returns ranked results with `file_path`, `chunk`, and `score`. Default `top_k` is 5.
- `session_history()`: Retrieves conversation messages from the current or a named session, useful for summarization and context extraction.
- Automatic memory injection: When semantic memory is enabled, the runtime automatically prepends the top 3 relevant chunks to the system prompt before each `.ask()` call, giving the agent contextual awareness without manual search.
- Pre-compaction flush: Setting `flush_on_compact: true` causes the agent to extract and persist key facts to the workspace before session compaction, preventing detail loss in long-running conversations.
- Memory isolation rules: Session history is per-session, workspace is per-agent, semantic index is per-agent. Subagents inherit workspace but not semantic index. Forge agents have workspace only. Cross-agent memory access is never permitted.
Exercises #
Exercise 26.1: Workspace Basics #
Create a claw agent with a workspace. Write three skills: one that saves a key-value pair to a JSON file, one that reads a value by key, and one that deletes a key. Test all three operations and verify that data persists across program restarts.
Exercise 26.2: Append-Only Log #
Build a claw agent that uses workspace_append to maintain an activity log. Every time
the user sends a message, the agent should append a timestamped entry to activity.log.
Write a show_log skill that reads and displays the log. Run 10 interactions and verify
the log contains all 10 entries.
Exercise 26.3: Search Mode Comparison #
Create three claw agents, each with a different search mode ("hybrid", "vector",
"keyword"). Write the same 5 notes to each agent's workspace. Then run the same 5
queries against each agent using memory_search(). Compare the results and scores across
search modes. Which mode performed best for factual queries? Which performed best for
conceptual queries?
Exercise 26.4: Session Summarization #
Build a claw agent that automatically summarizes the conversation every 10 messages using
session_history(). The summary should be saved to the workspace with workspace_write.
After 20 messages, verify that two summary files exist and that the semantic index can
find relevant content from both summaries via memory_search().
Exercise 26.5: Pre-Compaction Flush #
Create a claw agent with flush_on_compact: true. Have a long conversation (at least 30
messages) to trigger compaction. After compaction, verify that the compaction/ directory
contains a flush file. Then ask the agent about specific details from the early part of
the conversation. Does it recall them through the flushed content?
Exercise 26.6: Memory Isolation #
Create two claw agents (AgentA and AgentB) with separate workspaces. Write a note
using AgentA and attempt to read it from AgentB. Verify that AgentB cannot access
AgentA's workspace or semantic index. Then spawn a subagent from AgentA and verify
that the subagent can read files from AgentA's workspace.
Exercise 26.7: Knowledge Assistant Extension #
Extend the Sage example from Section 26.11 with the following features:
- A `tag_note` skill that adds tags to notes (stored as metadata in the file).
- A `search_by_tag` skill that searches only notes with a specific tag.
- A `delete_note` skill that removes a note and triggers re-indexing.
Test the complete system with at least 10 notes across 3 different tags.
Exercise 26.8: Multi-Session Memory #
Build a claw agent that maintains separate named sessions for different "projects." The
agent should have skills to switch_project(name), save_to_project(content), and
recall_from_project(query). Each project uses a different workspace subdirectory and
session key. Verify that searching within one project does not return results from
another project's notes.