3.0.0

BoxLang AI Module v3.0.0 — a major update introducing the new AI Skills system, MCP server seeding, tool registry, provider capability system, agent hierarchy, middleware support, and more.

Released: 2026

BoxLang AI 3.0 is the biggest release since the module launched. It's a ground-up rethink of how AI agents, models, and tools are structured — and it ships ten major features at once.

The headline is the AI Skills system: a first-class implementation of Anthropic's Agent Skills open standard that lets you define reusable knowledge blocks — coding styles, domain rules, tone policies, API guidelines — once in a SKILL.md file and inject them into any number of agents and models at runtime. No more copy-pasting the same system-prompt boilerplate everywhere. Skills are versioned, composable, and come in two modes: always-on (full content in every call) and lazy (only a name + description until the LLM asks for more).

But that's just the start. MCP server seeding means agents can now discover and use tools from any MCP server automatically — point an agent at an MCP URL and it figures out what's available. The Global Tool Registry lets you register tools by name once and reference them as plain strings anywhere in your codebase. The provider capability system brings real type safety to multi-provider workflows, so you get a clear UnsupportedCapability error instead of a cryptic runtime crash when a provider doesn't support embeddings or streaming. A new parent-child agent hierarchy enables proper multi-agent orchestration trees with cycle detection, depth tracking, and ancestor traversal. Middleware support on both models and agents finally gives you a clean place to plug in logging, retries, and guardrails without touching provider code.

Under the hood, the architecture is significantly cleaner: BaseService has been properly split into a provider-agnostic base and OpenAIService, the IAiService interface has been trimmed to essentials, and all runnable objects have been moved to a dedicated runnables/ folder. AiAgent is now fully stateless, making it safe to run across concurrent requests without shared-state bugs.

3.0 has no breaking changes to the public BIF signatures — your existing aiChat(), aiEmbed(), and aiAgent() calls work exactly as before.


🎁 New Features

🎯 AI Skills System

Composable, reusable knowledge blocks — following the Claude Agent Skills open standard — that can be injected into any model or agent system message at runtime.

Think of a skill as a portable unit of expertise: a coding style guide, a tone-of-voice policy, domain-specific rules, API cheat sheets, or anything else your AI should "know" before answering. Skills are defined once, stored in files or inline, and shared across any number of agents and models — no copy-paste, no drift.

Without Skills                      With Skills
─────────────────────────────────── ───────────────────────────────────
Agent A system message:             ✅ Define once → inject everywhere
  "Always use ISO dates.
   Follow our SQL style guide.      skills/
   Prefer short column names.           sql-style/      SKILL.md
   Never use SELECT *."                 date-format/    SKILL.md
                                        api-guidelines/ SKILL.md
Agent B system message:
  "Always use ISO dates.            agent = aiAgent( skills: sqlStyleSkill )
   Follow our SQL style guide.      model = aiModel( skills: apiGuidelinesSkill )
   ..."  ← copy-paste drift starts

📦 Defining Skills

Inline — for short, self-contained guidance that lives in your code:
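A minimal inline skill might look like this. The named arguments follow the `aiSkill( path | name, description, content, recurse )` signature from the API reference; the skill body itself is illustrative:

```
// Inline skill: short, self-contained guidance defined in code
sqlStyleSkill = aiSkill(
	name       : "sql-style",
	description: "House SQL formatting rules",
	content    : "Always use ISO dates. Prefer short column names. Never use SELECT *."
);

// Inject it at construction time via the skills param
agent = aiAgent( skills: [ sqlStyleSkill ] );
```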

From a SKILL.md file — for longer guidance that benefits from version control and editor tooling:
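Loading from disk is a one-liner; the path here is illustrative. As noted below, the directory name becomes the skill's default name:

```
// Discover a file-based skill; "sql-style" becomes the skill name
sqlStyleSkill = aiSkill( path: ".ai/skills/sql-style" );

model = aiModel( skills: [ sqlStyleSkill ] );
```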

The SKILL.md file format is simple Markdown with an optional YAML frontmatter description field. If frontmatter is absent, the first paragraph of the body is used as the description automatically.
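A sketch of what such a file could contain (the frontmatter `description` field is optional, as noted above; all content here is illustrative):

```markdown
---
description: House SQL formatting rules for all generated queries.
---

# SQL Style

- Always use ISO 8601 dates.
- Prefer short, snake_case column names.
- Never use `SELECT *`; list columns explicitly.
```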

Store skill files under .ai/skills/<skill-name>/SKILL.md in your project. The directory name becomes the skill's default name when loading from a path.


⚡ Two Injection Modes

Different situations call for different strategies. Skills support two modes that you can mix freely within the same agent.

Always-on skills ( withSkills() / addSkill() ): Full skill content is injected into the system message on every single call. The LLM always has this knowledge in context, with zero latency. Best for short, universally relevant guidance like tone, format rules, or domain vocabulary.

Lazy / available skills ( withAvailableSkills() / addAvailableSkill() ): Only a compact index — just the skill name and one-line description — is included in the system message. When the LLM determines it needs a skill, it calls the auto-registered loadSkill( name ) tool to fetch the full content on demand. Best for large or specialized skill libraries where most skills are irrelevant to most queries — keeps token usage low.
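Mixing the two modes on one agent might look like this. The skill variables are assumed to have been created with `aiSkill()` beforehand:

```
agent = aiAgent( name: "analyst" )
	// Always-on: full content injected on every call
	.withSkills( [ toneSkill, dateFormatSkill ] )
	// Lazy: only name + description until the LLM calls loadSkill( name )
	.withAvailableSkills( [ sqlStyleSkill, apiGuidelinesSkill ] );
```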

You can promote a lazy skill to always-on at any point during a session:
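For example, assuming a lazy skill registered under the name `sql-style`:

```
// Promote the lazy skill to always-on for the rest of the session
agent.activateSkill( "sql-style" );
```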


🌍 Global Skills Pool

Register skills once at the application or module level and have them automatically available to every new agent — no need to explicitly pass them:
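The static route uses `settings.globalSkills` in ModuleConfig.bx; whether the array takes skill objects, paths, or both is an assumption in this sketch. At runtime, `aiGlobalSkills()` returns the shared pool:

```
// ModuleConfig.bx — statically seed global skills (shape assumed)
settings = {
	globalSkills: [
		aiSkill( path: ".ai/skills/tone" ),
		aiSkill( path: ".ai/skills/date-format" )
	]
};

// Elsewhere: access the shared pool for inspection
pool = aiGlobalSkills();
```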

Global skills are prepended to every new agent's availableSkills pool. You can also configure them statically in ModuleConfig.bx via settings.globalSkills.


🔍 Introspection
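A quick way to see exactly what an agent will inject is `buildSkillsContent()`, which renders the combined system-message block:

```
// Render the combined skills block exactly as it will appear
// in the system message
println( agent.buildSkillsContent() );
```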


📋 Full API Reference

| Method / BIF | Where | Description |
| --- | --- | --- |
| `aiSkill( path \| name, description, content, recurse )` | Global BIF | Create inline or discover file-based skills |
| `aiGlobalSkills()` | Global BIF | Access the globally shared skills pool |
| `withSkills( skills )` | AiModel, AiAgent | Replace all always-on skills |
| `addSkill( skill )` | AiModel, AiAgent | Add a single always-on skill |
| `withAvailableSkills( skills )` | AiModel, AiAgent | Replace all lazy skills |
| `addAvailableSkill( skill )` | AiModel, AiAgent | Add a single lazy skill |
| `activateSkill( name )` | AiModel, AiAgent | Promote a lazy skill to always-on |
| `buildSkillsContent()` | AiModel, AiAgent | Render the combined system-message block |
| `skills: []` param | `aiAgent()`, `aiModel()` | Seed always-on skills at construction time |
| `availableSkills: []` param | `aiAgent()` | Seed lazy skills at construction time |

📖 Full Guide: AI Skills


🔌 MCP Server Seeding for Agents and Models

Agents and models can now be seeded directly with one or more MCP servers. All tools exposed by those servers are automatically discovered via listTools() and registered as MCPTool instances — no manual Tool construction required.

New APIs:

  • withMCPServer( server, config ) — Attach a single MCP server (URL string or pre-configured MCPClient). Config supports token, timeout, headers, user, password.

  • withMCPServers( servers ) — Attach multiple servers in one call.

  • listMcpServers() — Returns currently connected MCP servers and their exposed tool names.

  • listTools() on AiAgent returns [{ name, description }] for all registered tools.

  • getConfig() now includes tools (full name/description list) and mcpServers (server URL + tool-name list).

  • New MCPTool class proxies a single MCP server tool and converts its inputSchema to OpenAI function-calling format automatically.

  • MCP servers are auto-injected into the system prompt so the LLM knows which tools came from which server.
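Putting it together, seeding an agent from an MCP server might look like this. The URL and config values are illustrative; the config keys (`token`, `timeout`, etc.) are from the list above:

```
// Attach an MCP server; its tools are discovered via listTools()
// and registered as MCPTool instances automatically
agent = aiAgent( name: "researcher" )
	.withMCPServer( "http://localhost:3001/mcp", { token: "my-token" } );

// Inspect what was discovered
println( agent.listMcpServers() ); // servers + their exposed tool names
println( agent.listTools() );      // [ { name, description }, ... ]
```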


🗄️ Global AI Tool Registry

New singleton AIToolRegistry (accessible via aiToolRegistry() BIF) provides a module-scoped registry for AI tools. Tools registered by name can be referenced in params.tools arrays as plain strings and resolved automatically before requests — no live object references required.

  • Supports module namespacing (e.g. now@bxai) to avoid collisions across modules

  • resolveTools() converts string keys → ITool instances before LLM requests

  • Two new interception points: onAIToolRegistryRegister and onAIToolRegistryUnregister
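A sketch of the registry flow. The `register()` method name is inferred from the `onAIToolRegistryRegister` interception point and should be treated as an assumption; `myWeatherTool` is a hypothetical tool instance:

```
// Register a tool once by name (namespacing like "weather@myModule" avoids collisions)
aiToolRegistry().register( "weather", myWeatherTool );

// Reference it anywhere as a plain string; resolveTools() converts
// string keys to ITool instances before the request is sent
result = aiChat( "What's the weather in Boston?", { tools: [ "weather", "now@bxai" ] } );
```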


🔧 Tool System Overhaul

A major rethink of how tools are defined, registered, and invoked.

BaseTool abstract base class: All tool implementations now extend BaseTool, which provides the shared invocation lifecycle (firing beforeAIToolExecute/afterAIToolExecute events), result serialization, and the fluent describeArg() / describe[ArgName]() annotation syntax.

ClosureTool replaces the retired Tool.bx. A BaseTool subclass backed by any closure or lambda that auto-introspects the callable's parameter metadata to generate an OpenAI-compatible function schema.

CoreTools ships two built-in tools:

  • now@bxai — auto-registered on module load, returns current date/time in ISO 8601 for temporal awareness

  • httpGet — opt-in only (not auto-registered for security), fetches any URL via HTTP GET

Additional improvements:

  • Non-required tool arguments are now supported

  • Tools receive the full originating AiChatRequest as _chatRequest for context-aware logic

  • Structured output and streaming tools for Ollama


🛡️ Provider Capability System

A new type-safe capability system that prevents calling unsupported operations on providers.

  • New models/providers/capabilities/ package with IAiChatService and IAiEmbeddingsService scoped interfaces

  • Every provider exposes getCapabilities() (returns ["chat", "stream", "embeddings", ...]) and hasCapability( "chat" ) for clean runtime introspection

  • aiChat(), aiChatStream(), and aiEmbed() BIFs now check the provider implements the required capability before calling and throw a clear UnsupportedCapability exception instead of a cryptic provider error
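Guarding a call site looks like this; how you obtain the `provider` instance depends on your setup and is assumed here:

```
// Clean runtime introspection before attempting an operation
println( provider.getCapabilities() ); // e.g. [ "chat", "stream", "embeddings" ]

if ( provider.hasCapability( "embeddings" ) ) {
	embeddings = aiEmbed( "some text to embed" );
} else {
	// Calling aiEmbed() against this provider would throw
	// a clear UnsupportedCapability exception
}
```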


🌲 AiAgent Parent-Child Hierarchy

AiAgent now tracks its position in a multi-agent tree, enabling sophisticated orchestration patterns with full introspection and cycle-detection.

New methods on AiAgent:

| Method | Description |
| --- | --- |
| `setParentAgent( parent )` | Assign a parent with self-reference and cycle-detection guards |
| `clearParentAgent()` | Detach from a parent |
| `hasParentAgent()` | Returns `true` if the agent has a parent |
| `isRootAgent()` | Returns `true` for top-level agents |
| `getRootAgent()` | Walks up the tree and returns the root agent |
| `getAgentDepth()` | Returns nesting depth (0 = root, 1 = direct child, …) |
| `getAgentPath()` | Returns a slash-delimited path string, e.g. `/coordinator/researcher` |
| `getAncestors()` | Returns ordered array `[ immediateParent, …, root ]` |

addSubAgent() now automatically calls setParentAgent(this) on the sub-agent. getConfig() includes parentAgent, agentDepth, and agentPath.
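A small orchestration tree built with the methods above (the path and depth values follow the documented format):

```
coordinator = aiAgent( name: "coordinator" );
researcher  = aiAgent( name: "researcher" );

// addSubAgent() wires setParentAgent( this ) automatically
coordinator.addSubAgent( researcher );

coordinator.isRootAgent();   // true
researcher.getAgentDepth();  // 1 (direct child)
researcher.getAgentPath();   // "/coordinator/researcher"
researcher.getRootAgent();   // returns coordinator
```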


🧵 Middleware Support

AiModel and AiAgent both support composable middleware for cross-cutting concerns. Agent middleware is prepended ahead of model middleware in the execution chain. Middleware can be passed at construction time or added fluently via .use().

Provider lifecycle hooks: preRequest() and postResponse() on providers let you customize the request/response shape without overriding sendChatRequest(). Useful for custom headers, response normalization, or provider-specific logging.
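Wiring up middleware fluently might look like this. The class names and option keys are from the tables in this section; passing the option struct to the constructor is an assumption of this sketch:

```
// Middleware can be seeded at construction or added fluently via .use()
agent = aiAgent( name: "support" )
	.use( new bxModules.bxai.models.middleware.core.LoggingMiddleware( { logToConsole: true } ) )
	.use( new bxModules.bxai.models.middleware.core.RetryMiddleware( { maxRetries: 3 } ) )
	.use( new bxModules.bxai.models.middleware.core.MaxToolCallsMiddleware( { maxCalls: 10 } ) );
```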

📦 Shipped Core Middleware

BoxLang AI ships with six battle-tested middleware classes covering the most common cross-cutting concerns. All live in bxModules.bxai.models.middleware.core.

| Middleware | When to Use It |
| --- | --- |
| `LoggingMiddleware` | Audit every LLM call and tool invocation — write to console, file, or both with a configurable log level |
| `RetryMiddleware` | Automatically retry failed LLM calls with exponential back-off; essential for flaky or rate-limited providers |
| `GuardrailMiddleware` | Block dangerous tools entirely and enforce regex-based argument validation before any tool runs |
| `MaxToolCallsMiddleware` | Prevent runaway agents by capping the total number of tool invocations per run |
| `HumanInTheLoopMiddleware` | Require explicit human approval (CLI prompt or custom callback) before sensitive tools execute |
| `FlightRecorderMiddleware` | Record live LLM/tool interactions to a JSON fixture and replay them offline — ideal for testing and debugging |


📋 LoggingMiddleware

Logs agent lifecycle events — run start/end, LLM calls, tool calls, and errors — to BoxLang's ai log file and optionally to the console. Drop it in to gain instant observability with no code changes.

| Option | Default | Description |
| --- | --- | --- |
| `logToFile` | `true` | Write to the BoxLang `ai` log file |
| `logToConsole` | `false` | Also print to stdout |
| `logLevel` | `"info"` | BoxLang log level (`info`, `debug`, `warning`, `error`) |
| `prefix` | `"[AI Middleware]"` | Prefix prepended to every log message |


🔁 RetryMiddleware

Wraps both LLM calls and tool calls with exponential back-off retry logic. Transient provider errors (rate limits, timeouts, 5xx) are retried automatically; non-retryable errors (bad input, max interactions) are surfaced immediately.

| Option | Default | Description |
| --- | --- | --- |
| `maxRetries` | `3` | Maximum retry attempts after first failure |
| `initialDelay` | `1000` | Initial delay in ms before first retry |
| `backoffMultiplier` | `2` | Multiplier applied to delay after each failure |
| `maxDelay` | `30000` | Hard cap on delay in ms |
| `nonRetryableTypes` | `"InvalidInput,MaxInteractionsExceeded"` | Comma-separated exception types to skip retrying |


🛡️ GuardrailMiddleware

Blocks tool calls by name or rejects calls whose arguments match configured regex patterns. Use this to prevent dangerous operations (e.g., destructive SQL) from ever reaching a tool.

| Option | Default | Description |
| --- | --- | --- |
| `blockedTools` | `[]` | Array of tool names never allowed (case-insensitive) |
| `argPatterns` | `{}` | `{ toolName: [ { paramName: "regexPattern" } ] }` — rejects on any match |


🙋 HumanInTheLoopMiddleware

Intercepts specified tool calls and requires a human to approve, reject, or edit the arguments before execution proceeds. Supports two modes:

  • cli — Blocks and prompts the user interactively on stdin/stdout. Great for local scripts and development.

  • web — Suspends the run and returns an AiMiddlewareResult.suspend() so the caller can checkpoint the state and present the approval request asynchronously (e.g., via a web UI or notification).

| Option | Default | Description |
| --- | --- | --- |
| `toolsRequiringApproval` | `[]` | Tool names that require human sign-off (case-insensitive) |
| `mode` | `"cli"` | `"cli"` for blocking prompt, `"web"` for async suspension |
| `showArguments` | `true` | Whether to display tool arguments in the CLI prompt |
| `approvalCallback` | (none) | Optional closure `function( context )` for custom approval logic |


🎙️ FlightRecorderMiddleware

Records every LLM round-trip and tool invocation to a JSON fixture file, then replays that fixture deterministically — with no live provider or tool calls — for agent behaviour regression testing in CI.

| Option | Default | Description |
| --- | --- | --- |
| `mode` | `"passthrough"` | `"passthrough"`, `"record"`, or `"replay"` |
| `fixturePath` | `""` | Path to fixture file (required in replay mode, auto-generated in record) |
| `fixtureDir` | `".ai/flight-recorder"` | Base directory for auto-generated fixture files |
| `recordTools` | `true` | Whether to capture tool call inputs and outputs |
| `strict` | `true` | In replay mode, throw on type mismatch; when `false`, skip ahead to next matching interaction |
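A record-then-replay workflow might look like this; the fixture path is illustrative, and the options mirror the table above:

```
// 1) Record a live run to a JSON fixture
recorder  = new bxModules.bxai.models.middleware.core.FlightRecorderMiddleware( {
	mode      : "record",
	fixtureDir: ".ai/flight-recorder"
} );
liveAgent = aiAgent( name: "qa" ).use( recorder );

// 2) Replay deterministically in CI: no live provider or tool calls
replay    = new bxModules.bxai.models.middleware.core.FlightRecorderMiddleware( {
	mode       : "replay",
	fixturePath: ".ai/flight-recorder/qa-run.json", // illustrative path
	strict     : true
} );
testAgent = aiAgent( name: "qa" ).use( replay );
```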


🔢 MaxToolCallsMiddleware

Enforces a hard cap on the total number of tool invocations per agent run, cancelling any additional calls once the limit is reached. The counter resets automatically at the start of each new run.

| Option | Default | Description |
| --- | --- | --- |
| `maxCalls` | `10` | Maximum tool invocations allowed per run |


🏢 Per-Call Identity Routing on All Memory Types

Following the Spring AI ChatMemory pattern, a single memory instance can now safely serve multiple tenants. add(), getAll(), clear(), trim(), seed(), and related methods on every IAiMemory and IVectorMemory implementation now accept optional userId and conversationId arguments. Construction-time values remain as fallbacks.
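In practice, a shared memory instance can be scoped per call like this; the `memory` and `message` variables are assumed to exist, and the identity values are illustrative:

```
// One memory instance, many tenants: pass identity per call
memory.add( message, userId: "user-42", conversationId: "conv-7" );
history = memory.getAll( userId: "user-42", conversationId: "conv-7" );
memory.clear( userId: "user-42", conversationId: "conv-7" );
```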

AiAgent is now fully stateless: userId and conversationId are resolved per-call from the options argument, eliminating shared-state concurrency bugs in multi-user deployments.


🤗 HuggingFace Embeddings Support

New huggingface provider for generating embeddings via the HuggingFace Inference API.


🔀 Custom Service URLs

All senders in BaseService now accept a custom URL override, enabling easy integration with proxies, self-hosted endpoints, and OpenAI-compatible APIs.


🔧 Enhancements

  • BaseService / OpenAIService split: OpenAI-specific logic has been moved to a dedicated OpenAIService. BaseService is now truly provider-agnostic, making it much easier to implement custom providers that differ from the OpenAI standard.

  • IAiService contract trimmed: The base interface now declares only identity/configuration/capability-discovery methods (getName(), configure(), getCapabilities(), hasCapability()). The operation methods have moved to their respective capability interfaces.

  • VoyageService corrected: Now extends BaseService directly and implements only IAiEmbeddingsService — eliminating the old stubbed-out chat methods that would throw at runtime. The type system now enforces the embeddings-only constraint.

  • Runnable objects refactored to runnables/ folder (AiModel, AiAgent, AiMessage) for a cleaner separation between service logic and runnable wrappers.

  • resume() / resumeStream() now require explicit threadId: Both methods now require threadId as an explicit required string argument instead of defaulting to an instance property. This eliminates ambiguous state in multi-user deployments.

  • Renamed BaseService.sendRequest() → sendChatRequest() for clarity.

  • Reduced duplicate payload fields in onAITokenCount events.


🐛 Bug Fixes

  • Streaming events: Model and Agent streaming was not announcing the global beforeAIModelInvoke / afterAIModelInvoke pre/post events. Fixed.

  • MCP requestId crash: A null-scope crash occurred on JSON-RPC notifications for MCP servers (notifications intentionally omit id). Fixed.

  • MiniMax error surfacing: base_resp.status_code != 0 errors were silently swallowed. Now they surface correctly with status code and message.

  • OllamaService stale embedding hook: The old postEmbeddingResponse() hook was never wired to the current BaseService lifecycle and silently did nothing. Replaced with the proper postResponse( aiRequest, dataPacket, result, operation ) override.


📖 Full Documentation

Explore the full documentation for all new features:
