Chapter 1: What Is Neam? #
"The limits of my language mean the limits of my world." -- Ludwig Wittgenstein
1.1 The AI Agent Landscape Today #
The year is 2026. AI agents are no longer a research curiosity -- they are shipping in production systems at companies of every size. Customer service bots route tickets. Research assistants summarize papers. Code generators write and debug software. Voice assistants handle phone calls. Autonomous monitors watch infrastructure around the clock.
And yet, the tools we use to build these agents have barely evolved past their proof-of-concept origins.
The dominant approach today is to write AI agent logic in Python using one of several competing frameworks. Each framework has its own abstractions, its own execution model, and its own opinions about how agents should be structured. The result is a fragmented landscape.
Every framework sits on top of the same Python runtime, which provides no awareness of the AI agent domain. Python does not know what an "agent" is. It does not know what a "handoff" means. It cannot validate at compile time that a tool's parameter schema matches the function signature that implements it. It cannot check that a handoff target actually exists, that a knowledge base's embedding model is compatible with its vector store, or that a voice pipeline has both an STT and TTS provider configured.
All of these checks happen at runtime -- if they happen at all.
1.1.1 The Cost of Runtime Discovery #
Consider what happens when you deploy a LangChain agent that references a tool named
search_web, but you have misspelled it as search_Web in the agent configuration. In
Python, this is not a compile error. It is not even a startup error. It is an error that
occurs when a user sends a query that triggers tool use -- possibly hours or days after
deployment, possibly affecting a paying customer.
Now multiply this by every configuration surface in a production agent system: provider endpoints, API key environment variable names, model identifiers, temperature ranges, output schemas, handoff targets, guardrail definitions, voice pipeline configurations, RAG retrieval strategies, memory backends, and evaluation datasets. Each of these is a string or dictionary in Python, validated only by convention.
Neam eliminates this entire class of errors by moving validation to compile time. It also provides security-first programming through native guard, guardchain, policy, and budget declarations aligned with the OWASP Top 10 for Agentic Applications. And it pairs these with a comprehensive data type system -- Tuple, Set, Range, TypedArray, Record, and Table types, for ... in loops, the pipe operator, f-strings, destructuring, comprehensions, and an iterator protocol with lazy evaluation -- making Neam a data-processing-capable DSL alongside its agent-native foundations.
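These data-processing constructs are easiest to appreciate in a fragment. The sketch below is illustrative only -- it assumes the comprehension, f-string, range, and for ... in syntax follows the conventions named above, and later chapters give the definitive forms:

```neam
// Illustrative sketch -- exact syntax is defined in later chapters
let scores = [72, 88, 95, 61];

// Comprehension plus f-string (assumed forms)
let passing = [s for s in scores if s >= 70];
emit f"passing scores: {passing}";

// Range with a for ... in loop (assumed form)
for i in 0..3 {
    emit f"iteration {i}";
}
```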
1.1.2 The Framework Lock-In Problem #
Each Python framework defines its own universe of abstractions. If you build a multi-agent system in LangChain and later want to switch to AutoGen's conversation model -- perhaps because it better fits your use case -- you are looking at a rewrite, not a refactor. The agent definitions, tool bindings, memory configurations, and orchestration logic are all framework-specific.
Neam addresses this by being the framework. When you write agent logic in Neam, you are writing in a language that understands agents natively. There is no underlying framework to switch. The language is the execution model.
1.1.3 The 2026 Agentic AI Market #
To understand why a purpose-built language matters, consider the market forces at play in 2026. The agentic AI landscape is not a niche research area -- it is a rapidly scaling industrial sector with real economic consequences for getting the tooling wrong.
Market Scale:
| Market Segment | 2025 Value | 2026 Projected | 2034 Projected | CAGR |
|---|---|---|---|---|
| Agentic AI (total) | $7.3--7.8B | $9.1--11.8B | $139--199B | 40.5--46.3% |
| Voice AI agents | $2.4B | ~$3.5B | $47.5B | 34.8% |
| RAG market | $2.33B | $3.33B | $67.4B | 35.3--42.7% |
| Conversational AI | $11.58B | ~$14.3B | $41.4B | 23.7% |
Sources: Fortune Business Insights, Precedence Research, MarketsandMarkets, Market.us
AI agent startups raised $3.8 billion in 2024 alone -- nearly tripling the previous year's investments. 88% of surveyed senior executives plan to increase their AI budgets within 12 months due to agentic AI (PwC, 2025).
Enterprise Adoption -- The Steepest Curve in History:
Gartner projects that by the end of 2026, 40% of enterprise applications will include task-specific AI agents, up from less than 5% in 2025 -- one of the steepest adoption curves ever observed in enterprise technology. Consider these statistics:
- 79% of organizations have adopted AI agents to some extent (2025)
- Adoption jumped from 11% to 42% in just six months
- Companies using AI agents report 55% higher operational efficiency and 35% cost reductions
- McKinsey estimates generative AI could add $2.6--4.4 trillion annually to global GDP
- Gartner projects agentic AI could drive $450 billion in enterprise software revenue by 2035
However, Gartner also warns that over 40% of agentic AI projects may be cancelled by 2027 due to escalating costs, unclear business value, or inadequate risk controls. Only about 130 of the thousands of agentic AI vendors are considered "real" -- the rest engage in "agent washing" (rebranding chatbots and RPA as agents).
The high cancellation rate is driven by the same factors Neam directly addresses: cost (Neam's 10x smaller deployment footprint reduces infrastructure spend), unclear value (compile-time validation catches errors before production), and weak controls (first-class guardrails, structured output types, and provider validation reduce operational risk).
1.1.4 Protocol Standardization: MCP and A2A #
Two protocols have emerged as industry standards for agentic interoperability, both now governed by the Linux Foundation:
Model Context Protocol (MCP):
- Released by Anthropic in November 2024 as an open standard
- OpenAI adopted MCP across the Agents SDK, Responses API, and ChatGPT desktop in March 2025
- By December 2025: 97M+ monthly SDK downloads, 10,000+ active MCP servers in production
- Donated to the Agentic AI Foundation (AAIF) under the Linux Foundation in December 2025, with founding members including Block, OpenAI, AWS, Google, and Microsoft
- Direct competitors Anthropic and OpenAI jointly released the MCP Apps Extension for interactive UI capabilities in early 2026
Agent-to-Agent Protocol (A2A):
- Unveiled by Google at Cloud Next in April 2025 with 50+ launch partners including Salesforce, SAP, Atlassian, PayPal, and Workday
- Linux Foundation launched the A2A project in June 2025 with AWS, Cisco, Google, Microsoft, Salesforce, SAP, and ServiceNow as founding members
- Ecosystem grew to 150+ organizations by mid-2025
- Key adoptions: Salesforce Agentforce, SAP Joule, Adobe, Zoom, and UiPath
Neam implements both protocols at the runtime level -- MCP as a client (connecting to 10,000+ available tool servers) and A2A as a server (agent cards, task lifecycle, SSE streaming). Both are compiled to dedicated bytecode opcodes, not library wrappers, enabling compile-time validation of protocol configurations. No other compiled language supports either protocol.
1.1.5 The Python Bottleneck #
Python commands over 80% of AI/ML development, yet its architectural limitations create compounding problems at scale:
| Limitation | Impact on AI Agent Workloads | Quantitative Measure |
|---|---|---|
| Global Interpreter Lock (GIL) | Cannot parallelize agent orchestration | 1 thread effective concurrency |
| Dynamic typing overhead | 10--100x per-operation overhead | 28 bytes per int vs. 4 in C |
| Interpretation speed | Control flow at Python speed | 50--100x slower than C for loops |
| Deployment packaging | Docker images 5--10 GB | No native binary compilation |
| Memory inefficiency | GC latency spikes | ~3x memory overhead for bookkeeping |
| No compile-time checks | Agent errors found only at runtime | 0/12 agentic errors caught |
PEP 703 (Python 3.13) introduced an experimental free-threaded build, but it carries a 5--10% single-threaded penalty and remains opt-in. The fundamental issue -- that Python is an interpreted, dynamically typed language -- cannot be resolved without breaking backward compatibility.
1.1.6 The Multi-Agent Framework Landscape #
The current ecosystem in 2026 is dominated by Python libraries, each with structural limitations:
| Framework | Architecture | Monthly Downloads | Limitation |
|---|---|---|---|
| LangGraph | Graph-based state machines | ~6.17M | No compile-time validation |
| CrewAI | Role-based collaboration | ~1.38M | YAML configs at runtime |
| OpenAI Agents SDK | Routine-based, prompt-driven | N/A | OpenAI model lock-in |
| AutoGen (Microsoft) | Multi-agent conversations | N/A | No compilation step |
| Google ADK | A2A-native development kit | N/A | Requires Google Cloud |
All existing frameworks share the same structural limitation: they are libraries on top of interpreted languages. This means agent topologies are validated only at runtime, deployment requires shipping Python environments (5--10 GB Docker images), the GIL prevents true parallel agent execution, and there are no static guarantees for handoff correctness, tool schema validity, or provider compatibility.
1.1.7 The Regulatory Imperative #
The EU AI Act reaches full enforcement on August 2, 2026, with penalties of up to EUR 35 million or 7% of global revenue. Enterprise compliance requirements include:
- Full data lineage tracking for every AI output
- Human-in-the-loop checkpoints for safety-critical workflows
- Risk classification tags on every model
- Operational evidence (not just documentation) for audit purposes
Gartner predicts that by 2026, 70% of enterprises will integrate compliance-as-code into DevOps toolchains. Gartner also predicts that by end of 2026, "death by AI" legal claims will exceed 2,000 due to insufficient AI risk guardrails.
Neam's architecture directly addresses regulatory requirements: guard and
guardchain blocks provide declarative input/output validation, policy declarations
enforce capability control (allow/deny/confirm), budget declarations cap resource
consumption, memory blocks with SQLite persistence enable complete audit trails,
compile-time agent validation catches topology errors before deployment, and structured
audit logging with per-call token usage provides operational evidence. These security
features are aligned with all 10 OWASP agentic security domains and the MAESTRO
seven-layer defense model. This compliance is architectural, not bolt-on -- the
safety features compile to bytecode and are enforced by the VM.
1.1.8 The Cognitive Agent Paradigm #
The 2025--2026 period has seen a decisive shift from stateless request-response agents toward cognitive agents capable of structured reasoning, self-evaluation, and learning from experience. This paradigm is driven by several converging research threads:
- Chain-of-Thought (CoT) prompting (Wei et al., NeurIPS 2022) demonstrated that intermediate reasoning steps dramatically improve LLM performance
- Tree of Thoughts (Yao et al., NeurIPS 2023) explores multiple reasoning paths using search algorithms
- Reflexion (Shinn et al., NeurIPS 2023) showed agents can improve by reflecting on failures and maintaining episodic memory
- Self-Refine (Madaan et al., NeurIPS 2023) demonstrated iterative self-improvement through feedback loops, achieving 10--30% accuracy improvements
- DSPy (Khattab et al., 2024) introduced systematic LLM prompt optimization through compilation
- EvoAgentX (2025) demonstrated evolutionary prompt optimization where agents autonomously refine their instructions
McKinsey estimates that cognitive agent capabilities could unlock $1.2--1.8 trillion in additional enterprise value by 2030. Gartner identifies "self-improving AI agents" as a top-10 strategic technology trend for 2026.
Neam is the first compiled language to implement the full cognitive agent stack as language-level constructs: structured reasoning (4 strategies), self-reflection, experience-based learning, prompt evolution, and autonomous execution. What requires combining five separate Python libraries (DSPy, Reflexion, Self-Refine, EvoAgentX, Agent0) -- approximately 420 lines of integration code -- is expressed in 35 lines of declarative Neam agent properties with compile-time validation. Neam builds on this foundation with OWASP-aligned agentic security across ten security domains and a comprehensive data type system for processing structured LLM output.
1.2 What Makes Neam Different #
Neam differs from Python-based AI agent frameworks in several fundamental ways:
1.2.1 Compilation, Not Interpretation #
Neam source code is compiled to bytecode before execution. The compiler (neamc)
performs static analysis on your agent declarations, validates tool definitions, resolves
module imports, checks handoff targets, and produces a binary .neamb file that the VM
can execute efficiently.
This means errors are caught early. If you reference an agent that does not exist, the compiler tells you. If your structured output type is malformed, the compiler tells you. If your module import path is wrong, the compiler tells you. Before any LLM call is made, before any API credit is spent.
1.2.2 Domain-Specific Constructs #
In Neam, "agent," "skill," "knowledge," "voice," "runner," "guard," "guardchain,"
"policy," "budget," and "memory" are language keywords, not library classes. They have
dedicated syntax, dedicated validation rules, and dedicated runtime behavior. Skills
(the evolution of the earlier "tool" keyword) support impl blocks for inline logic
and extern skill declarations for binding to HTTP APIs, MCP servers, and Claude
built-in tools. This is not syntactic sugar over Python -- it is a fundamentally
different approach to expressing agent logic.
Consider the difference between defining an agent in Python (LangChain) and in Neam:
Python (LangChain):
from langchain.agents import AgentExecutor, create_openai_tools_agent
from langchain_openai import ChatOpenAI
from langchain.tools import Tool
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0.7)
tools = [Tool(name="search", func=search_fn, description="Search the web")]
agent = create_openai_tools_agent(llm, tools, prompt)
executor = AgentExecutor(agent=agent, tools=tools, verbose=True)
result = executor.invoke({"input": "What is Neam?"})
Neam:
skill Search {
    description: "Search the web",
    params: [
        { name: "query", schema: { "type": "string", "description": "Search query" } }
    ],
    impl: fun(query) {
        return http_get("https://api.search.com/?q=" + query);
    }
}

agent Assistant {
    provider: "openai",
    model: "gpt-4o-mini",
    temperature: 0.7,
    system: "You are a helpful assistant.",
    skills: [Search]
}

{
    let result = Assistant.ask("What is Neam?");
    emit result;
}
The Neam version is not just shorter. It is validated at compile time. The compiler verifies that Search is a valid skill, that Assistant references a known provider, that the temperature is within range, and that the skills list contains only skill declarations.
1.2.3 A Complete Toolchain #
Neam does not stop at the language. It ships with nine executables that form a complete development and deployment toolchain:
| Tool | Purpose |
|---|---|
| neamc | Compiler: .neam source to .neamb bytecode |
| neam | VM: executes .neamb bytecode |
| neam-cli | Interactive CLI, REPL, and watch mode |
| neam-api | HTTP API server with A2A protocol support |
| neam-pkg | Package manager for sharing agent components (registry at registry.neam.dev) |
| neam-lsp | Language Server Protocol server for IDE integration |
| neam-dap | Debug Adapter Protocol server for step-through debugging |
| neam-gym | Evaluation harness for benchmarking agents |
| libneam | Shared library (.dylib / .dll / .so) for embedding |
The neam-pkg package manager deserves special mention. Inspired by Cargo (Rust),
pip (Python), and npm (Node.js), it provides project scaffolding (neam-pkg init),
dependency resolution with a lockfile (neam.lock), version-constrained installs,
feature flags, and publishing to the central registry.neam.dev registry. Packages are
distributed as .neampkg archives containing source, checksums, and optional
cryptographic signatures. This gives Neam a governed ecosystem where packages must pass
authentication and evaluation before publishing -- the "One Hundred Percent" rule
requires a Gym certificate with pass_rate: 1.0.
Every tool in this chain understands the Neam domain. Section 1.4 explores each in detail.
1.2.4 Native Cognitive Features #
Neam supports cognitive capabilities as language-level features: reasoning strategies (chain-of-thought, plan-and-execute, tree-of-thought, self-consistency), self-reflection with quality thresholds, learning loops, prompt evolution, and autonomous goal-driven execution with budget controls. These are not third-party plugins -- they are built into the VM and controlled through agent declaration syntax. Neam also provides OWASP-aligned agentic security across ten domains and a comprehensive data type system for structured LLM output processing.
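As a rough illustration of the declarative style, a cognitive configuration might read as a handful of extra agent properties. The property names below (reasoning, reflection, min_quality) are assumptions made for this sketch, not the documented syntax, which later chapters define:

```neam
// Hypothetical property names -- shown only to convey the declarative style
agent Analyst {
    provider: "openai",
    model: "gpt-4o",
    system: "You analyze quarterly earnings reports.",
    reasoning: "chain-of-thought",          // one of the four strategies (assumed name)
    reflection: { min_quality: 0.8 }        // self-reflection quality threshold (assumed shape)
}
```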
1.2.5 Multi-Provider by Design #
Neam supports seven LLM providers natively: OpenAI, Anthropic, Google Gemini, Ollama
(for local models), Azure OpenAI, AWS Bedrock, and GCP Vertex AI. You can use different
providers for different agents in the same program, and the VM handles the protocol
differences transparently. Switching providers requires changing only the provider and
model fields in an agent declaration.
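For example, two agents in one program can target different providers; only the provider and model fields differ between them (the Anthropic and Ollama model identifiers below are illustrative placeholders):

```neam
agent Drafter {
    provider: "anthropic",
    model: "claude-sonnet",        // illustrative model id
    system: "You draft customer replies."
}

agent Summarizer {
    provider: "ollama",
    model: "llama3",               // illustrative local model id
    system: "You summarize long threads."
}
```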
1.2.6 Cloud-Native Deployment #
Neam extends into production cloud infrastructure without changing the parser, compiler, or bytecode format; the same .neam source runs identically across all deployment targets. Cloud-native features are configured through neam.toml and enabled via build flags:
- Distributed state backends -- PostgreSQL, Redis, DynamoDB, CosmosDB, and Firestore replace SQLite for production persistence, with distributed locking for multi-instance deployments.
- LLM gateway -- An in-process gateway with circuit breaking, retry with exponential backoff, provider fallback chains, response caching, and rate limiting.
- OpenTelemetry export -- Structured tracing and metrics exported via OTLP, with privacy modes and diagnostic triage for production observability.
- Multi-cloud deployment -- Terraform generation for AWS, GCP, and Azure with provider-native services (ECS/Fargate, Cloud Run, Azure Container Apps).
- Kubernetes production manifests -- StatefulSets, ConfigMaps, KEDA autoscaling, health checks (/health, /ready, /startup), network policies, and disruption budgets.
- Secrets management -- Integration with AWS Secrets Manager, GCP Secret Manager, Azure Key Vault, and HashiCorp Vault.
These features are opt-in: cloud capabilities activate only when you add the corresponding configuration to neam.toml and enable the appropriate build flags.
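To give a feel for the opt-in model, here is a sketch of what such a neam.toml fragment might look like. Every key name below is an illustrative assumption, not the documented schema:

```toml
# Hypothetical neam.toml fragment -- key names are illustrative
[state]
backend = "postgres"                  # replaces SQLite for production persistence
url_env = "NEAM_STATE_URL"            # connection string read from an env var

[gateway]
retry_max = 3                         # retries with exponential backoff
fallback = ["openai", "anthropic"]    # provider fallback chain

[telemetry]
otlp_endpoint = "http://collector:4317"
privacy_mode = "redact"
```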
Neam also provides native tool calling, extern skill bindings (HTTP and MCP), OWASP-aligned
agentic security across ten security domains, and a comprehensive data type system with
six new types (Tuple, Set, Range, TypedArray, Record, Table), for ... in loops, the
pipe operator, f-strings, destructuring, comprehensions, an iterator protocol with lazy
evaluation, broadcasting, and window functions. These additions transform Neam into a
data-processing-capable DSL alongside its agent-native foundations.
1.2.7 Security-First Programming #
Neam implements a comprehensive agentic security framework covering 10 security domains aligned with the OWASP Top 10 for Agentic Applications (2026) and the MAESTRO seven-layer defense model. Security in Neam is not an afterthought or a middleware layer -- it is built into the language syntax and enforced at compile time.
Real-world attacks against agentic AI systems are no longer theoretical. In 2025 alone:
| Date | Incident | Impact |
|---|---|---|
| Jul 2025 | Amazon Q supply chain -- malicious PR with aws s3 rm, terminate-instances | 1M developers exposed |
| Sep 2025 | Malicious Postmark MCP server -- BCC'd all emails to attacker | Full email exfiltration |
| Oct 2025 | Backdoored MCP packages -- dual reverse shells, 86K downloads | Remote code execution |
| Nov 2025 | Claude Desktop extensions RCE -- AppleScript injection (CVSS 8.9) | SSH keys, AWS creds, passwords |
Every one of these attacks maps directly to a Neam attack surface (MCP servers, external skills, bash execution, tool chaining). Neam's v0.6.9 security domains address each.
The ten domains are:
- Structured Audit Logging (D1) -- Every tool call, guard decision, and policy enforcement is logged as a structured JSON event with trace IDs for full forensic reconstruction.
- Tool Permission Model (D2) -- policy declarations specify allow/deny/confirm lists for skills, with default_deny: true as the recommended default.
- Prompt Injection Defense (D3) -- guard blocks filter inputs and outputs to detect and block goal-hijack attacks before they reach the LLM.
- Network and SSRF Protection (D4) -- URL allowlists and network sandboxing prevent agents from accessing internal infrastructure.
- Rate Limiting and Throttling (D5) -- budget declarations cap token usage, API calls, and cost per agent or per session.
- MCP and Supply Chain Hardening (D6) -- External MCP servers run in sandboxed profiles with capability restrictions.
- Credential Isolation (D7) -- Secrets are managed through provider integrations (AWS Secrets Manager, GCP Secret Manager, Azure Key Vault, HashiCorp Vault) and never appear in logs or child process environments.
- Input Validation and Boundaries (D8) -- Size limits, type constraints, and boundary checks are enforced on all agent inputs.
- Behavioral Monitoring (D9) -- Runtime anomaly detection identifies agents that deviate from expected behavior patterns.
- Human-in-the-Loop Controls (D10) -- The sensitive: true marker on skills requires explicit human approval before execution.
Here is a minimal example showing policy, guard, and budget working together:
guard InputFilter {
    description: "Block prompt injection attempts"
    on_observation(text) {
        if (text.contains("ignore previous instructions")) {
            return false;
        }
        return true;
    }
}

policy StrictPolicy {
    allow: [search_docs]
    confirm: [send_email]
    default_deny: true
}

budget SessionBudget {
    api_calls: 100
    tokens: 50000
    cost_usd: 1.00
}

agent SecureBot {
    provider: "openai"
    model: "gpt-4o"
    system: "You are a secure assistant."
    skills: [search_docs, send_email]
    policy: StrictPolicy
    input_guardrails: [InputFilter]
}
All of these declarations -- guard, policy, and budget -- compile to bytecode
and are enforced by the VM. A policy violation does not produce a Python exception that
can be caught and ignored; it produces a VM-level block that cannot be bypassed by
application code.
1.2.8 External Skill Bindings #
Neam supports extern skill declarations for binding to external tools
without writing inline implementation code. Three binding types are supported:
- HTTP APIs -- Bind any REST endpoint as a skill with compile-time schema validation.
- MCP servers -- Connect to any of the 10,000+ MCP tool servers with sandboxed execution profiles and supply chain verification.
- Claude built-in tools -- Access Anthropic's native tool capabilities (web search, code execution, file handling) as first-class Neam skills.
External skill bindings are validated at compile time: the compiler checks that endpoint URLs are well-formed, that MCP server configurations are complete, and that parameter schemas match the declared types.
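Following the shape shown in the program-anatomy diagram in Section 1.3, an HTTP binding might look roughly like this. The field names inside the binding block (url, method) are assumptions for illustration:

```neam
// HTTP binding sketch -- inner field names are assumptions
extern skill get_weather {
    description: "Fetch current weather for a city",
    params: [
        { name: "city", schema: { "type": "string" } }
    ],
    binding: http {
        url: "https://api.example.com/weather",   // placeholder endpoint
        method: "GET"
    }
}
```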
1.3 Neam's Architecture #
Neam follows a classical language implementation architecture adapted for the AI agent domain. Source code flows through four stages: parsing, AST construction, compilation, and execution.
1.3.0a Neam Program Anatomy #
Before examining each stage in detail, here is how a complete Neam program is structured. Every construct -- skills, knowledge bases, guards, agents, and the main block -- has a defined place:
┌═══════════════════════════════════════════════════════════════════════════┐
│ NEAM PROGRAM ANATOMY │
│═══════════════════════════════════════════════════════════════════════════│
│ │
│ ┌── Skills (Local Tools) ───────┐ ┌── External Skills ──────────────┐ │
│ │ skill calculate { │ │ extern skill get_weather { │ │
│ │ description: "..." │ │ binding: http { ... } │ │
│ │ params: { expr: string } │ │ } │ │
│ │ impl(expr) { return ...; } │ │ │ │
│ │ } │ │ extern skill bash_tool { │ │
│ └────────────────────────────────┘ │ binding: claude_builtin { │ │
│ │ type: "bash_20241022" │ │
│ ┌── Knowledge (RAG) ────────────┐ │ } │ │
│ │ knowledge Docs { │ │ } │ │
│ │ vector_store: "usearch" │ └────────────────────────────────┘ │
│ │ sources: [{ type: "file", │ │
│ │ path: "./docs.md" }] │ ┌── MCP Servers ────────────────┐ │
│ │ retrieval_strategy: "basic"│ │ mcp_server fs { │ │
│ │ } │ │ command: "npx" │ │
│ └────────────────────────────────┘ │ args: ["-y", "..."] │ │
│ │ } │ │
│ ┌── Guards & Budgets ───────────┐ │ adopt fs.* │ │
│ │ guard Safety { ... } │ └────────────────────────────────┘ │
│ │ guardchain Chain = [Safety]; │ │
│ │ budget Limit { │ ┌── Agents ─────────────────────┐ │
│ │ api_calls: 30 │ │ agent Coder { │ │
│ │ tokens: 200000 │ │ provider: "openai" │ │
│ │ } │ │ model: "gpt-4o" │ │
│ └────────────────────────────────┘ │ system: "You are ..." │ │
│ │ skills: [calculate, ...] │ │
│ ┌── Main Program ───────────────┐ │ connected_knowledge: [Docs] │ │
│ │ { │ │ guards: [Chain] │ │
│ │ let r = Coder.ask("..."); │ │ budget: Limit │ │
│ │ emit r; │ │ } │ │
│ │ } │ └────────────────────────────────┘ │
│ └────────────────────────────────┘ │
│ │
└══════════════════════════════════════════════════════════════════════════┘
1.3.0b Request Lifecycle -- From HTTP to Agent Response #
When Neam runs as an API server (via neam-api), every incoming request follows a
well-defined lifecycle. Understanding this flow is essential for production deployment:
Client neam-api Neam VM LLM Provider
│ │ │ │
│ POST /api/v1/agent/ask │ │ │
│ { agent: "Coder", │ │ │
│ message: "..." } │ │ │
│────────────────────────>│ │ │
│ │ │ │
│ │ pipeline.compile() │ │
│ │ (JIT compile .neam) │ │
│ │────────────────────────>│ │
│ │ │ │
│ │ │ POST /v1/chat/completions
│ │ │ (with tools array) │
│ │ │─────────────────────────>│
│ │ │ │
│ │ │ tool_call: calculate() │
│ │ │<─────────────────────────│
│ │ │ │
│ │ │ [execute skill locally] │
│ │ │ return tool result │
│ │ │─────────────────────────>│
│ │ │ │
│ │ │ final text response │
│ │ VM result │<─────────────────────────│
│ │<────────────────────────│ │
│ │ │ │
│ { response: "..." } │ │ │
│<────────────────────────│ │ │
Key insight: your .neam source files ship inside the container. neam-api compiles
them at request time (JIT). Update agent logic = update source files = rebuild container.
No separate build artifact is needed for API deployments.
This JIT model means the neamc compiler binary must be present in
the Docker image alongside neam-api. Chapter 20 covers the multi-stage Docker build
that achieves this while keeping images at ~32 MB.
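In rough outline, such a build has two stages: one that produces the binaries and one that ships them alongside the agent sources. The base images and paths below are illustrative; Chapter 20 gives the actual recipe:

```dockerfile
# Illustrative sketch only -- see Chapter 20 for the real multi-stage build
FROM debian:bookworm AS build
COPY . /src
# (build steps for neamc and neam-api would go here)

FROM debian:bookworm-slim AS runtime
# Both binaries ship together: neam-api serves requests, neamc JIT-compiles sources
COPY --from=build /src/build/neamc /src/build/neam-api /usr/local/bin/
# .neam sources ship as source and are compiled at request time
COPY agents/ /app/agents/
WORKDIR /app
EXPOSE 8080
ENTRYPOINT ["neam-api", "--port", "8080"]
```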
1.3.1 The Parser #
Neam uses tree-sitter as its parsing foundation. Tree-sitter is an incremental parsing library originally developed for code editors. It provides error recovery (so the parser can produce a partial AST even when the source contains syntax errors), incremental reparsing (so only changed regions are re-parsed), and a well-defined grammar format.
The tree-sitter grammar defines Neam's syntax: agent declarations, skill definitions, extern skill bindings, knowledge bases, voice pipelines, memory configurations, guard/guardchain/policy/budget declarations, cognitive features, module declarations, imports, functions, control flow, expressions, and all the syntactic details that make Neam's source readable.
1.3.2 The Compiler #
The compiler (neamc) walks the AST produced by the parser and emits bytecode. During
compilation, it performs several critical tasks:
- Module resolution: Resolves import statements, builds a dependency graph, and checks for circular imports.
- Declaration validation: Verifies that agent declarations reference valid providers, that handoff targets exist, that skill parameters have valid schemas, that knowledge base configurations are complete, that policy declarations reference only defined skills, and that budget limits are well-formed.
- Type checking: Performs type inference and checks type constraints.
- Constant folding: Evaluates constant expressions at compile time.
- Bytecode generation: Produces a sequence of bytecode instructions that the VM can execute.
The compiler is invoked from the command line:
./neamc hello.neam -o hello.neamb
1.3.3 The Bytecode Format #
Compiled Neam programs are stored in .neamb files with the following binary layout:
| Section | Description |
|---|---|
| Magic | NEAM (4 bytes) -- identifies the file as Neam bytecode |
| Version | Bytecode format version number |
| Manifest | JSON metadata (agent names, tool names, entry point) |
| Constants | Constant pool: strings, numbers, and other literals |
| Code | Bytecode instructions (84 opcodes) |
| Source map | Line number mappings for debugging and error reporting |
The bytecode format is designed for fast loading and compact representation. The manifest
section enables tools like neam-api and neam-gym to inspect a compiled bundle without
executing it -- for example, to discover which agents are defined in a file.
1.3.4 The Virtual Machine #
The Neam VM is a stack-based virtual machine implemented in C++20. It executes bytecode instructions and provides native implementations for all domain-specific operations:
- LLM calls: The VM integrates HTTP clients for OpenAI, Anthropic, Gemini, and Ollama APIs. When an agent's .ask() method is called, the VM constructs the appropriate API request, handles streaming, parses the response, and manages conversation history.
- RAG: The VM implements seven retrieval strategies using the uSearch HNSW index for vector similarity search and embedding providers for document vectorization.
- Skills: When an LLM requests a skill call, the VM dispatches to the skill's impl block (for inline skills) or to the external binding (for extern skill declarations targeting HTTP APIs, MCP servers, or Claude built-in tools), executes the operation, and returns the result to the LLM.
- Security enforcement: The VM enforces guard blocks (input/output filtering), policy declarations (skill access control), and budget declarations (resource limits) at the bytecode level. Policy violations and budget exhaustion produce VM-level blocks that cannot be bypassed by application code.
- Cognitive features: The VM implements reasoning strategy injection, reflection scoring, learning review, prompt evolution, and autonomous scheduling.
- Voice: The VM manages STT and TTS provider calls for voice pipelines.
- Memory: The VM provides SQLite-backed persistent memory with checkpoints and rewind.
- Tracing: Every LLM call, skill call, guard decision, policy enforcement, and significant event is logged to JSONL trace files with structured audit events.
1.4 The Toolchain #
Neam ships with nine executables and a shared library. This section describes each tool's purpose, primary use cases, and key command-line options.
1.4.1 neamc -- The Compiler #
The compiler reads .neam source files and produces .neamb bytecode:
./neamc program.neam -o program.neamb
Key behaviors:
- Resolves all import statements and compiles the full module graph
- Validates all agent, skill, knowledge, voice, memory, runner, guard, guardchain, policy, and budget declarations
- Validates extern skill bindings (HTTP endpoint URLs, MCP server configurations, Claude built-in tool references)
- Reports errors with file, line, and column information
- Produces a self-contained .neamb file that includes all compiled modules
1.4.2 neam -- The VM Runner #
The VM runner loads and executes compiled bytecode:
./neam program.neamb
The VM handles all runtime operations: LLM API calls, file I/O, HTTP requests, voice transcription and synthesis, RAG retrieval, cognitive reasoning, and autonomous scheduling.
1.4.3 neam-cli -- Interactive CLI #
The CLI provides a convenient interface for development:
./neam-cli program.neam # Compile and run in one step
./neam-cli --watch program.neam # Watch mode: recompile on file changes
Watch mode is particularly useful during development. When you save a .neam file, the
CLI automatically recompiles and re-executes, providing rapid feedback.
1.4.4 neam-api -- API Server #
The API server exposes agents as HTTP endpoints:
./neam-api --port 8080 # Standard REST API
./neam-api --port 9090 --agent-file agent.neamb --a2a # A2A protocol mode
Standard endpoints:
| Endpoint | Method | Description |
|---|---|---|
| /api/v1/health | GET | Health check |
| /api/v1/agents | GET | List available agents |
| /api/v1/agent/ask | POST | Query an agent |
A2A endpoints (when --a2a is enabled):
| Endpoint | Method | Description |
|---|---|---|
| /.well-known/agent.json | GET | Agent card discovery |
| /a2a | POST | JSON-RPC 2.0 task dispatch |
| /a2a/tasks/{id}/stream | GET | SSE streaming |
1.4.5 neam-pkg -- Package Manager #
The package manager handles project initialization, dependency management, and package
publishing. It connects to the central registry at registry.neam.dev for package
discovery and distribution:
neam-pkg init my-project # Create a new project with neam.toml
neam-pkg install agent-utils # Install a dependency
neam-pkg install agent-utils@1.2.0 # Install a specific version
neam-pkg install --dev test-lib # Install a dev dependency
neam-pkg list # List installed packages
neam-pkg outdated # Show outdated packages
neam-pkg update # Update all dependencies
neam-pkg remove agent-utils # Remove a dependency
neam-pkg search "agent" # Search the registry
neam-pkg info agent-utils # Show package info
neam-pkg link # Link local package for development
neam-pkg login # Login to registry
neam-pkg publish # Publish your package
neam-pkg yank 1.0.0 # Yank (deprecate) a version
Packages are distributed as .neampkg archives containing source files, a
checksums.sha256 manifest, and an optional signature.sig for cryptographic
verification. The registry enforces the "One Hundred Percent" rule: a package
cannot be published unless it passes authentication and includes a Gym certificate
with pass_rate: 1.0, ensuring that every published component has been fully
evaluated.
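The verification flow is easy to picture. As an illustration only (not the actual neam-pkg implementation, whose manifest format may differ), here is a minimal Python sketch of building and checking a checksums.sha256-style manifest:

```python
import hashlib

def build_manifest(files: dict) -> str:
    """Produce a checksums.sha256-style manifest: one 'digest  path' line per file."""
    lines = []
    for path in sorted(files):
        digest = hashlib.sha256(files[path]).hexdigest()
        lines.append(f"{digest}  {path}")
    return "\n".join(lines)

def verify_manifest(manifest: str, files: dict) -> bool:
    """Recompute every digest and compare it against the manifest."""
    for line in manifest.splitlines():
        digest, path = line.split("  ", 1)
        if path not in files:
            return False
        if hashlib.sha256(files[path]).hexdigest() != digest:
            return False
    return True

pkg = {"src/main.neam": b'emit "hi";', "neam.toml": b'[project]\nname = "demo"'}
manifest = build_manifest(pkg)
print(verify_manifest(manifest, pkg))           # True
pkg["src/main.neam"] = b'emit "tampered";'      # simulate tampering
print(verify_manifest(manifest, pkg))           # False
```

Any modification to a packaged file changes its digest, so verification fails before installation.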
Project configuration is stored in neam.toml:
neam_version = "1.0"
[project]
name = "my-project"
version = "0.1.0"
description = "My Neam project"
type = "binary"
authors = ["Developer <dev@example.com>"]
license = "MIT"
[project.entry_points]
main = "src/main.neam"
[dependencies]
utils = "^1.0.0"
json-parser = { version = "2.0", features = ["streaming"] }
[dev-dependencies]
test-framework = "0.1.0"
[features]
default = ["basic"]
basic = []
advanced = ["rag-support"]
1.4.6 neam-lsp -- Language Server #
The LSP server provides IDE integration for any editor that supports the Language Server Protocol (VS Code, Neovim, Emacs, Sublime Text, etc.):
- Syntax highlighting
- Error diagnostics as you type
- Go-to-definition for agents, tools, functions, and modules
- Hover information for declarations
- Autocompletion for keywords, agent properties, and imported symbols
1.4.7 neam-dap -- Debug Adapter #
The DAP server enables step-through debugging in VS Code and other DAP-compatible editors:
- Set breakpoints in .neam source files
- Step through execution one statement at a time
- Inspect variables, agent state, and the call stack
- Watch expressions
1.4.8 neam-gym -- Evaluation Harness #
The evaluation harness runs agents against labeled datasets and produces statistical performance reports:
./neam-gym --agent agent.neamb --dataset questions.jsonl \
--output report.json --runs 5
Dataset format (JSONL, one test case per line):
{"id": "q1", "input": "What is 2+2?", "expected": "4", "grader": "exact_match"}
{"id": "q2", "input": "Explain gravity", "expected": "force", "grader": "contains"}
{"id": "q3", "input": "Is AI safe?", "expected": "nuanced", "grader": "llm_judge"}
Available graders: exact_match, contains, regex, llm_judge, semantic_match.
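The three string-based graders are simple to model. The following Python sketch is illustrative only, not neam-gym's actual code (the llm_judge and semantic_match graders require a model and are omitted):

```python
import re

def exact_match(output: str, expected: str) -> bool:
    # Pass only when the trimmed output equals the expected string exactly.
    return output.strip() == expected.strip()

def contains(output: str, expected: str) -> bool:
    # Pass when the expected substring appears anywhere, case-insensitively.
    return expected.lower() in output.lower()

def regex(output: str, expected: str) -> bool:
    # Pass when the expected pattern matches somewhere in the output.
    return re.search(expected, output) is not None

print(exact_match("4", "4"))                      # True
print(contains("Gravity is a force.", "force"))   # True
print(regex("Order #12345", r"#\d+"))             # True
```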
The report includes pass rate (mean and standard deviation across runs), latency percentiles (p50, p95, p99), total tokens, and estimated cost.
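The report statistics can be sketched as well. A minimal Python illustration, assuming a nearest-rank percentile definition (neam-gym's exact method may differ) and invented sample data:

```python
from statistics import mean, pstdev

def percentile(samples, p):
    """Nearest-rank percentile: the sample at rank ceil-ish round(p% of n)."""
    ordered = sorted(samples)
    k = max(0, min(len(ordered) - 1, round(p / 100 * len(ordered)) - 1))
    return ordered[k]

latencies_ms = [120, 95, 130, 210, 101, 99, 340, 115, 98, 105]
pass_rates = [0.96, 1.0, 0.98, 0.96, 1.0]   # one pass rate per run (--runs 5)

report = {
    "pass_rate_mean": round(mean(pass_rates), 3),
    "pass_rate_std": round(pstdev(pass_rates), 3),
    "p50_ms": percentile(latencies_ms, 50),
    "p95_ms": percentile(latencies_ms, 95),
    "p99_ms": percentile(latencies_ms, 99),
}
print(report)
```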
1.4.9 libneam -- Shared Library #
The shared library (libneam.dylib on macOS, libneam.so on Linux, neam.dll on
Windows) exposes a C API for embedding the Neam runtime in other languages. This enables
calling Neam agents from Python, Go, Rust, Swift, or any language with C FFI support.
The public C API header ships with pre-built binary distributions and is included in
the bin/ directory when building from source.
1.5 Key Concepts #
Before diving into Neam code, it helps to understand the core concepts that the language is built around. Each of these is a first-class language construct with dedicated syntax and compile-time validation.
1.5.1 Agents #
An agent is the fundamental unit of intelligence in Neam. It wraps an LLM provider with configuration (model, temperature, system prompt) and optional capabilities (tools, handoffs, knowledge bases, cognitive features).
agent MyAgent {
provider: "openai"
model: "gpt-4o-mini"
temperature: 0.7
system: "You are a helpful assistant."
}
Agents expose the .ask() method for sending queries and receiving responses.
Neam provides three agent types, each designed for a distinct execution model:
| Agent Type | Keyword | Execution Model | Best For |
|---|---|---|---|
| Stateless Agent | agent | Single .ask() call, no lifecycle | Simple Q&A, RAG queries, one-shot tasks |
| Claw Agent | claw agent | Persistent sessions, multi-turn, compaction | Assistants, chatbots, workflow automation |
| Forge Agent | forge agent | Iterative build-verify loops, fresh context | TDD coding, document generation, research |
The stateless agent is the foundation covered in this chapter and Chapters 10--14.
Claw agents add session persistence, multi-channel I/O, lane-based concurrency, and
semantic memory for long-running conversational agents (Chapter 24). Forge agents add
verification-driven iteration loops with git checkpoints, plan tracking, and fresh context
per iteration for autonomous build workflows (Chapter 25). The compiler enforces type
safety across all three: a forge agent cannot declare channels, and a claw agent cannot
declare a verify callback. Invalid combinations are rejected at compile time.
1.5.2 Skills (Tools) #
A skill is a capability that an agent can invoke during response generation. Skills
(the evolution of the earlier tool keyword, which remains supported for backward
compatibility) are defined with a description, parameter schema, and implementation.
Skills support impl blocks for inline logic:
skill Calculator {
description: "Perform arithmetic calculations"
params: { expression: string }
impl(expression) {
// Implementation in Neam code
return eval_math(expression);
}
}
Skills can also be marked as sensitive: true to require human approval before
execution, integrating with the policy and human-in-the-loop systems described in
Section 1.2.7.
Neam also supports extern skill declarations for binding to external services without writing inline implementation code:
// HTTP API binding
extern skill weather_api {
description: "Get current weather"
type: "http"
endpoint: "https://api.weather.com/v1/current"
method: "GET"
}
// MCP server binding
extern skill code_runner {
description: "Execute code in a sandbox"
type: "mcp"
server: "code-sandbox-server"
}
// Claude built-in tool binding
extern skill web_search {
description: "Search the web"
type: "claude_builtin"
tool: "web_search"
}
When an LLM decides to use a skill, the VM intercepts the tool call, executes the
impl block (for inline skills) or dispatches to the external binding (for extern
skills), and returns the result to the LLM for incorporation into its response.
1.5.3 Handoffs #
A handoff transfers control from one agent to another. This is the mechanism for building multi-agent systems where specialized agents handle different aspects of a task:
agent Triage {
provider: "openai"
model: "gpt-4o-mini"
system: "Route requests to the right specialist."
handoffs: [RefundAgent, BillingAgent]
}
1.5.4 Runners #
A runner defines a controlled execution loop for multi-agent systems. It specifies an entry agent, maximum turns, tracing configuration, and guardrails:
runner CustomerFlow {
entry_agent: Triage
max_turns: 10
tracing: TraceLogger
input_guardrails: [ContentFilter]
output_guardrails: [SafetyFilter]
}
1.5.5 Guardrails, Policies, and Budgets #
A guard is a filter applied to agent inputs or outputs. Guards implement an
on_observation callback that returns true (allow) or false (block):
guard SafetyFilter {
description: "Block unsafe content"
on_observation(text) {
if (text.contains("harmful")) {
return false;
}
return true;
}
}
Multiple guards can be composed into a guardchain for sequential filtering.
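The composition semantics are easy to model: each guard sees the observation in turn, and the first guard to return false blocks it. A minimal Python sketch (illustrative only; the guard names and the length limit are hypothetical):

```python
def safety_filter(text: str) -> bool:
    # Block anything containing a flagged keyword.
    return "harmful" not in text

def length_filter(text: str) -> bool:
    # Block oversized inputs (hypothetical 1000-character limit).
    return len(text) <= 1000

def guardchain(guards, text: str) -> bool:
    """Run guards in order; short-circuits on the first False (block)."""
    return all(guard(text) for guard in guards)

chain = [safety_filter, length_filter]
print(guardchain(chain, "What is your refund policy?"))  # True (allowed)
print(guardchain(chain, "something harmful"))            # False (blocked)
```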
A policy declaration controls which skills an agent is permitted to use, with
explicit allow/deny/confirm lists and a default_deny option:
policy StrictPolicy {
allow: [search_docs]
confirm: [send_email]
default_deny: true
}
A budget declaration enforces resource limits per agent or session, preventing runaway costs:
budget SessionBudget {
api_calls: 100
tokens: 50000
cost_usd: 1.00
}
Guards, policies, and budgets all compile to bytecode and are enforced by the VM. Together they form Neam's security-first approach to agentic AI, aligned with the 10 OWASP agentic security domains (see Section 1.2.7).
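The budget semantics can be illustrated with a small model. The following Python sketch is an assumption about how enforcement might work, not the VM's actual implementation: each resource is charged before the operation runs, and crossing a limit raises an error:

```python
class BudgetExceeded(Exception):
    pass

class Budget:
    """Track per-session resource usage against declared limits."""
    def __init__(self, api_calls: int, tokens: int, cost_usd: float):
        self.limits = {"api_calls": api_calls, "tokens": tokens, "cost_usd": cost_usd}
        self.used = {"api_calls": 0, "tokens": 0, "cost_usd": 0.0}

    def charge(self, resource: str, amount) -> None:
        # Reject the operation before it runs if it would cross the limit.
        if self.used[resource] + amount > self.limits[resource]:
            raise BudgetExceeded(f"{resource} budget exhausted")
        self.used[resource] += amount

budget = Budget(api_calls=100, tokens=50_000, cost_usd=1.00)
budget.charge("api_calls", 1)
budget.charge("tokens", 1_200)
try:
    budget.charge("cost_usd", 1.50)   # would exceed the $1.00 limit
except BudgetExceeded as err:
    print(err)                        # cost_usd budget exhausted
```

Because the check happens before the charge is committed, a session can never overshoot its declared limits.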
1.5.6 Knowledge Bases #
A knowledge base ingests documents, chunks them, computes embeddings, and provides relevant context to agents via retrieval-augmented generation (RAG):
knowledge Docs {
vector_store: "usearch"
embedding_model: "nomic-embed-text"
chunk_size: 200
chunk_overlap: 50
sources: [{ type: "file", path: "./docs/*.md" }]
retrieval_strategy: "hybrid"
top_k: 5
}
Seven retrieval strategies are available as first-class language features: basic, MMR, hybrid, HyDE, self-RAG, corrective RAG, and agentic RAG. Each strategy is implemented natively in the VM using the uSearch HNSW index for vector similarity search, not as library wrappers. The compiler validates that the chosen strategy is compatible with the configured embedding model and vector store.
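To make one of these strategies concrete, here is a minimal Python sketch of MMR (maximal marginal relevance) re-ranking, which balances query relevance against redundancy among already-selected chunks. This illustrates the general algorithm, not the VM's implementation; the vectors and the lambda value are invented for the example:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def mmr(query, docs, k=2, lam=0.5):
    """Greedily pick k docs, trading relevance (weight lam) against redundancy."""
    selected, candidates = [], list(range(len(docs)))
    while candidates and len(selected) < k:
        def score(i):
            relevance = cosine(query, docs[i])
            redundancy = max((cosine(docs[i], docs[j]) for j in selected), default=0.0)
            return lam * relevance - (1 - lam) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected

query = [1.0, 0.0]
docs = [[0.99, 0.1], [0.98, 0.12], [0.5, 0.8]]   # first two are near-duplicates
# A low lambda weights diversity heavily, so the near-duplicate is skipped.
print(mmr(query, docs, k=2, lam=0.3))   # [0, 2]
```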
1.5.7 Voice Pipelines #
A voice pipeline chains speech-to-text, agent processing, and text-to-speech:
voice VoicePipeline {
agent: MyAgent
stt_provider: "whisper"
stt_model: "whisper-1"
tts_provider: "openai"
tts_model: "tts-1"
tts_voice: "alloy"
}
1.5.8 Cognitive Features #
Cognitive features are opt-in agent properties that enable advanced behaviors:
- Reasoning (reasoning: chain_of_thought) -- structured thinking before answering
- Reflection (reflect: { ... }) -- self-evaluation and revision
- Learning (learning: { ... }) -- experience replay and pattern extraction
- Evolution (evolve: { ... }) -- automatic prompt refinement
- Autonomy (goals: [...], triggers: { ... }) -- goal-driven scheduling
These features compose. An agent can use reasoning, reflection, learning, evolution, and autonomy simultaneously, and the VM manages the interactions between them.
1.6 A Taste of Neam #
Let us walk through a complete Neam program to see how these concepts fit together. This example builds a simple customer service system with a triage agent that routes requests to specialists.
agent Triage {
provider: "openai"
model: "gpt-4o-mini"
temperature: 0.3
system: "You are a triage agent. Analyze the customer's request and route it.
Reply with ROUTE:REFUND, ROUTE:BILLING, or ROUTE:GENERAL."
}
agent RefundSpecialist {
provider: "openai"
model: "gpt-4o-mini"
temperature: 0.5
system: "You are a refund specialist. Process refund requests professionally
and empathetically. Ask for order numbers if not provided."
}
agent BillingSpecialist {
provider: "openai"
model: "gpt-4o-mini"
temperature: 0.5
system: "You are a billing specialist. Help customers with invoices,
payment methods, and account charges."
}
fun route(triage_response) {
if (triage_response.contains("ROUTE:REFUND")) {
return "refund";
}
if (triage_response.contains("ROUTE:BILLING")) {
return "billing";
}
return "general";
}
{
let query = "I was charged twice for order #12345 and I want my money back.";
// Step 1: Triage the request
let classification = Triage.ask(query);
let destination = route(classification);
emit "Routed to: " + destination;
// Step 2: Handle with the appropriate specialist
if (destination == "refund") {
let response = RefundSpecialist.ask(query);
emit response;
}
if (destination == "billing") {
let response = BillingSpecialist.ask(query);
emit response;
}
}
To compile and run:
./neamc customer_service.neam -o customer_service.neamb
./neam customer_service.neamb
Notice several things about this program:
- Agent declarations are declarative. You specify what each agent is (provider, model, system prompt), not how it calls the API. The VM handles the HTTP mechanics.
- The routing logic is plain code. The route function is ordinary Neam code that inspects the triage agent's response. There is no special "router" abstraction to learn.
- The main block reads top to bottom. The execution flow is clear: triage, route, handle. There is no callback soup, no chain configuration, no framework-specific execution model.
- Everything is validated at compile time. If you misspell RefundSpecialist as RefundsSpecialist in the if block, the compiler catches it.
1.7 Comparison: Neam vs. the Field #
The following table compares Neam with the major Python-based AI agent frameworks across key dimensions. This is not a value judgment -- each tool has its strengths. The table is intended to help you understand where Neam fits in the landscape.
| Dimension | Neam | LangChain | AutoGen | CrewAI | DSPy |
|---|---|---|---|---|---|
| Language | Neam (dedicated DSL) | Python | Python | Python | Python |
| Compilation | Yes (.neam to .neamb) | No | No | No | No |
| Execution model | Bytecode VM (84 opcodes) | Python interpreter | Python interpreter | Python interpreter | Python interpreter |
| Agent definition | Language keyword | Class instantiation | Class instantiation | Class instantiation | Module composition |
| Tool definition | Language keyword with schema validation | Function decorator | Function/class | Function decorator | N/A (signatures) |
| Handoffs | First-class handoffs field | Custom chain logic | Conversation routing | Task delegation | N/A |
| RAG strategies | 7 built-in (basic to agentic) | Via retrievers | External | External | Built-in retrieval |
| Voice pipelines | Built-in voice keyword | External (ElevenLabs etc.) | External | External | N/A |
| Cognitive features | Built-in (reasoning, reflection, learning, evolution, autonomy) | Via prompt engineering | Via conversation patterns | Via process flows | Optimizers |
| Package manager | neam-pkg | pip | pip | pip | pip |
| LSP/IDE support | neam-lsp (purpose-built) | Python LSP (generic) | Python LSP (generic) | Python LSP (generic) | Python LSP (generic) |
| Debugger | neam-dap (purpose-built) | Python debugger | Python debugger | Python debugger | Python debugger |
| Eval framework | neam-gym (built-in) | LangSmith (external) | External | External | Built-in evaluation |
| API server | neam-api with A2A | FastAPI (manual) | External | External | External |
| Multi-provider | 7 native (OpenAI, Anthropic, Gemini, Ollama, Azure, Bedrock, Vertex) | Via adapters | Via adapters | Via adapters | Via LM classes |
| Security guardrails | Native (guard/guardchain/policy/budget) | Manual implementation | Manual implementation | Manual implementation | Manual implementation |
| Package ecosystem | neam-pkg (built-in, governed registry) | pip (general-purpose) | pip (general-purpose) | pip (general-purpose) | pip (general-purpose) |
| External tool binding | extern skill (HTTP/MCP/Claude built-in) | Custom code per integration | Custom code per integration | Custom code per integration | Custom code per integration |
| Deployment | Built-in (Docker, K8s, Helm, AWS, GCP, Azure) | Manual | Manual | Manual | Manual |
1.7.1 Quantitative Benchmarks: Neam vs. Python #
The following benchmarks are drawn from the scientific comparison paper "Neam: A Compiled Domain-Specific Language for AI-Agentic Workloads" (Govindaraj et al., 2026). They compare Neam against Python+LangChain across seven dimensions.
Lines of Code Reduction:
| Benchmark | Neam (LOC) | Python+LangChain (LOC) | Reduction |
|---|---|---|---|
| B1: Single agent | 8 | 22 | 2.3--2.8x |
| B2: Pipeline (3 agents) | 25 | 85 | 2.9--3.4x |
| B3: Customer service (4 agents) | 35 | 145 | 3.4--4.1x |
| B4: RAG agent | 20 | 95 | 4.8x |
| B5: Full system (6 agents + RAG + voice + MCP) | 85 | 520 | 4.9--6.1x |
| B6: Cognitive agent (reasoning + reflection + learning) | 35 | 420 | 12.0x |
The cognitive agent benchmark (B6) achieves the most dramatic reduction: 35 lines of Neam versus 420 lines of Python combining DSPy, Reflexion, and custom persistence code. The reduction is not just about typing less code -- it reflects the elimination of integration glue, error handling boilerplate, and manual persistence management that dominates the Python implementation.
Startup Latency:
| Runtime | Compilation | Startup to First Agent Call | Total |
|---|---|---|---|
| Neam (compile + run) | 14ms | 18ms | 32ms |
| Neam (cached bytecode) | 0ms | 9ms | 9ms |
| Python + LangChain | 0ms | 2,800ms | 2,800ms |
| Python + OpenAI SDK | 0ms | 850ms | 850ms |
Neam's total compile-and-start time (32ms) is 87x faster than Python+LangChain. Pre-compiled bytecode reduces cold start to 9ms -- critical for serverless and edge deployments where cold start costs directly impact per-invocation pricing.
Memory Footprint (Resident Set Size):
| Runtime | 1 Agent | 4 Agents | Full System (6 agents + RAG + voice) |
|---|---|---|---|
| Neam VM | 4.5 MB | 6.2 MB | 14.8 MB |
| Python + LangChain | 89 MB | 112 MB | 210 MB |
| Python + OpenAI SDK | 45 MB | 52 MB | 95 MB |
Neam's memory footprint is 10--14x smaller than Python alternatives. The full system benchmark at 14.8 MB includes vector indexing, voice buffers, and MCP connections -- compared to 210 MB for the equivalent Python+LangChain deployment.
Per-Turn Orchestration Overhead (Non-LLM Operations):
| Operation | Neam (microseconds) | Python+LangChain (microseconds) | Ratio |
|---|---|---|---|
| Handoff routing decision | 0.8 | 45 | 56x |
| Context serialization (1KB) | 3.5 | 85 | 24x |
| Trace entry creation | 0.5 | 35 | 70x |
| MCP tool discovery | 2.1 | 95 | 45x |
| Total per-turn overhead | 6.9 | 260 | 38x |
At scale -- 50 iterations across 100 concurrent agents -- this translates to Neam completing all orchestration overhead in 34.5ms versus Python's 1,300ms. The LLM call itself dominates total latency in both cases, but the orchestration overhead determines how many agents can run concurrently on a single container.
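The scale-up arithmetic follows directly from the table:

```python
# Per-turn orchestration overhead from the table above, in microseconds.
neam_us = 0.8 + 3.5 + 0.5 + 2.1          # handoff + serialization + trace + MCP discovery
python_us = 45 + 85 + 35 + 95

turns = 50 * 100                          # 50 iterations across 100 concurrent agents

neam_total_ms = neam_us * turns / 1000
python_total_ms = python_us * turns / 1000
print(neam_total_ms, python_total_ms)     # 34.5 vs 1300.0 milliseconds
print(round(python_us / neam_us))         # ~38x per-turn ratio
```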
Deployment Artifact Sizes:
| Deployment Target | Neam | Python+LangChain | Ratio |
|---|---|---|---|
| Native binary | 9.1 MB | N/A | -- |
| Shared library | 3.8 MB | N/A | -- |
| Docker image (minimal) | 32 MB | 1,400 MB | 43.75x smaller |
| AWS Lambda package | 14 MB | 280 MB | 20x smaller |
| WASM module | 2.4 MB | N/A | -- |
A Neam Docker image is 43x smaller than the equivalent Python+LangChain image. This directly reduces container registry costs, pull times, cold start latency, and the attack surface of production deployments.
Compile-Time Error Detection:
| Error Category | Neam | Python | Python + mypy |
|---|---|---|---|
| Undefined agent references | Compile | Runtime | Runtime |
| Missing required agent fields | Compile | Runtime | Runtime |
| Handoff to non-existent agent | Compile | Runtime | Runtime |
| Invalid temperature range | Compile | Runtime | N/A |
| Voice pipeline missing STT/TTS | Compile | Runtime | N/A |
| Tool schema validation | Compile | Runtime | N/A |
| Total catchable errors | 12/12 | 0/12 | 2/12 |
Neam catches all 12 categories of agentic configuration errors at compile time. Python catches zero. Even Python with mypy type checking catches only 2 of 12, because the remaining errors involve domain-specific constructs that mypy has no awareness of.
AI-Agentic Language Fitness Score:
A weighted fitness score across 13 dimensions (agent abstraction, orchestration, tool integration, RAG, voice, protocol support, type safety, runtime performance, memory efficiency, deployment, interoperability, ecosystem, and cognitive capabilities) produces the following rankings:
| Language | Weighted Fitness Score (F(L)) |
|---|---|
| Neam | 0.86 |
| Python | 0.34 |
| Rust | 0.27 |
| C++ | 0.26 |
| Mojo | 0.21 |
| Julia | 0.17 |
Neam's 0.86 fitness score represents a 2.53x advantage over Python (0.34), driven by dominance in the six AI-specific dimensions (agent abstraction, orchestration, tools, RAG, voice, protocols) and the cognitive dimension, which together carry 64% of the total weight.
Total Cost of Ownership (B5 System: 6 agents, 10K daily interactions):
| Cost Component | Python | Neam | Savings |
|---|---|---|---|
| Development (one-time) | $26,000 | $4,250 | 84% |
| Infrastructure (monthly) | $450 | $50 | 89% |
| Cold start penalty (serverless/month) | $120 | $8 | 93% |
| Debugging (runtime errors/month) | $2,000 | $200 | 90% |
| Total monthly (after dev) | $2,570 | $258 | 90% |
These benchmarks measure non-LLM operations. The LLM inference call itself (which accounts for 95--99% of wall-clock time in a typical agent interaction) is identical regardless of the host language. Neam's advantages are in everything around the LLM call: startup, orchestration, memory, deployment, and error detection.
1.7.2 When to Use Neam #
Neam is the right choice when:
- You are building a multi-agent system with complex orchestration (handoffs, routing, pipelines)
- You want compile-time validation of your agent configurations
- You need production tooling (API server, debugger, evaluation) out of the box
- You are using cognitive features (reasoning, reflection, learning, autonomy)
- You need multi-provider support within the same application
- You want a reproducible, inspectable platform for agent research
- You require security-first agentic AI with native guardrails, policies, and budget controls aligned with OWASP standards
- You need to integrate external tools (HTTP APIs, MCP servers, Claude built-in tools) with compile-time validation via extern skill
- You want a governed package ecosystem where published components are verified and evaluated
1.7.3 When to Use Something Else #
Neam may not be the right choice when:
- You need to integrate deeply with an existing Python codebase (though libneam provides a C API for embedding)
- Your use case is a simple single-agent chatbot with no orchestration (Python may be faster to prototype)
- You need a large ecosystem of pre-built integrations (LangChain's ecosystem is currently larger, though Neam's neam-pkg registry and extern skill MCP bindings are closing this gap)
- Your team has no interest in learning a new language syntax
1.8 Under the Hood: How a Neam Program Executes #
To build intuition for what happens when you run a Neam program, let us trace the execution of the simplest possible program:
{
emit "Hello, Neam!";
}
Step 1: Parsing #
The parser reads the source and produces an AST. For this program, the AST contains a
single block with a single emit statement whose argument is the string literal
"Hello, Neam!".
Step 2: Compilation #
The compiler walks the AST and emits bytecode:
CONSTANT 0 ; Push string "Hello, Neam!" from constant pool
EMIT ; Pop value and send to output stream
HALT ; End execution
The constant pool contains one entry: "Hello, Neam!" at index 0. The bytecode section
contains three instructions.
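The execution model can be illustrated with a toy stack machine. The following Python sketch models only these three instructions and is a teaching aid, not the real VM (which implements 84 opcodes):

```python
def run(constants, bytecode):
    """Toy stack machine for the three-instruction trace above."""
    stack, output, pc = [], [], 0
    while pc < len(bytecode):
        op = bytecode[pc]
        if op == "CONSTANT":       # push constants[operand] onto the stack
            pc += 1                # advance to the operand (the pool index)
            stack.append(constants[bytecode[pc]])
        elif op == "EMIT":         # pop the top of stack, write to the output stream
            output.append(stack.pop())
        elif op == "HALT":         # end execution
            break
        pc += 1
    return output

constants = ["Hello, Neam!"]
bytecode = ["CONSTANT", 0, "EMIT", "HALT"]
print(run(constants, bytecode))    # ['Hello, Neam!']
```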
Step 3: Serialization #
The compiler serializes the constant pool and the bytecode instructions into a single, self-contained .neamb file.
Step 4: VM Execution #
The VM loads the .neamb file, initializes the constant pool, and begins executing
instructions:
- CONSTANT 0 -- pushes the string "Hello, Neam!" onto the stack
- EMIT -- pops the value and writes it to the output stream
- HALT -- execution complete
Output: Hello, Neam!
When the program includes agent declarations and .ask() calls, the VM additionally:
- Initializes HTTP clients for the declared providers
- Constructs API request bodies with the agent's configuration
- Sends HTTP requests to LLM provider endpoints
- Parses streaming or batch responses
- Handles tool call dispatch if the LLM requests tools
- Manages conversation history for multi-turn interactions
- Logs trace events to .neam/traces/
1.9 The Road Ahead #
This chapter has given you the big picture: what Neam is, why it exists, how it differs from existing tools, and how its architecture works at a high level. Neam delivers a security-first compiled language for AI agents with native guardrails aligned to the OWASP Top 10, a governed package ecosystem, external skill bindings, and a comprehensive data type system with rich collection types, an iterator protocol, broadcasting, and a Table type for columnar data processing.
To get started quickly, visit https://neam.lovable.app/ for a guided installation and setup experience. The next chapter covers installing Neam on your machine, building it from source, configuring your editor, and running your first program.
From Chapter 3 onward, the book dives into the language itself -- variables, types, functions, control flow, and the core constructs that make Neam programs work. By the end of Part I, you will be writing multi-agent systems with confidence.
Exercises #
Exercise 1.1: Conceptual Mapping #
For each of the following Python/LangChain concepts, identify the corresponding Neam construct:
| Python / LangChain | Neam |
|---|---|
| ChatOpenAI(model="gpt-4o") | ? |
| Tool(name="search", ...) | ? |
| AgentExecutor(agent, tools) | ? |
| ConversationBufferMemory() | ? |
| FAISS.from_documents(docs) | ? |
| Manual guardrail middleware | ? |
| pip install package | ? |
| Custom API integration code | ? |
Exercise 1.2: Architecture Trace #
Consider the following Neam program:
agent Bot {
provider: "ollama"
model: "llama3"
system: "You are helpful."
}
{
let x = Bot.ask("Hi");
emit x;
}
Trace the execution through all four stages (parsing, compilation, VM execution, output). At the VM execution stage, describe what HTTP request the VM makes to the Ollama endpoint and what the response flow looks like.
Exercise 1.3: Toolchain Identification #
For each of the following tasks, identify which Neam toolchain component you would use:
- You want to check your .neam file for errors without running it.
- You want to expose your agent as an HTTP endpoint.
- You want to benchmark your agent's accuracy on 100 test questions.
- You want to install a third-party Neam package.
- You want to get autocomplete in VS Code while editing .neam files.
- You want to step through your program line by line to debug a handoff issue.
- You want to call a Neam agent from a Python script.
- You want to search the Neam registry for an existing RAG utility package.
- You want to publish a package and need to ensure it passes evaluation.
Exercise 1.4: Framework Comparison #
Choose one of the following frameworks: LangChain, AutoGen, CrewAI, or DSPy. Build the customer service triage example from Section 1.6 in that framework. Then answer:
- How many lines of code did you write compared to the Neam version?
- At what point are configuration errors discovered (compile time vs. runtime)?
- How would you add a voice pipeline to your implementation?
- How would you evaluate your agent's accuracy on a labeled dataset?
Exercise 1.5: Design Thinking #
Imagine you are designing a new first-class construct for Neam (beyond agent, skill, knowledge, voice, runner, guard, guardchain, policy, budget, and memory). What would it be? Write a brief design document (1-2 paragraphs) that includes:
- The name of the construct
- The problem it solves
- An example of its syntax
- How the compiler would validate it
- How the VM would execute it