3.0.0
BoxLang AI Module v3.0.0 — a major update introducing the new AI Skills system, MCP server seeding, tool registry, provider capability system, agent hierarchy, middleware support, and more.
Released: 2026
BoxLang AI 3.0 is the biggest release since the module launched. It's a ground-up rethink of how AI agents, models, and tools are structured — and it ships ten major features at once.
The headline is the AI Skills system: a first-class implementation of Anthropic's Agent Skills open standard that lets you define reusable knowledge blocks — coding styles, domain rules, tone policies, API guidelines — once in a SKILL.md file and inject them into any number of agents and models at runtime. No more copy-pasting the same system-prompt boilerplate everywhere. Skills are versioned, composable, and come in two modes: always-on (full content in every call) and lazy (only a name + description until the LLM asks for more).
But that's just the start. MCP server seeding means agents can now discover and use tools from any MCP server automatically — point an agent at an MCP URL and it figures out what's available. The Global Tool Registry lets you register tools by name once and reference them as plain strings anywhere in your codebase. The provider capability system brings real type safety to multi-provider workflows, so you get a clear UnsupportedCapability error instead of a cryptic runtime crash when a provider doesn't support embeddings or streaming. A new parent-child agent hierarchy enables proper multi-agent orchestration trees with cycle detection, depth tracking, and ancestor traversal. Middleware support on both models and agents finally gives you a clean place to plug in logging, retries, and guardrails without touching provider code.
Under the hood, the architecture is significantly cleaner: BaseService has been properly split into a provider-agnostic base and OpenAIService, the IAiService interface has been trimmed to essentials, and all runnable objects have been moved to a dedicated runnables/ folder. AiAgent is now fully stateless, making it safe to run across concurrent requests without shared-state bugs.
3.0 has no breaking changes to the public BIF signatures — your existing aiChat(), aiEmbed(), and aiAgent() calls work exactly as before.
🎁 New Features
🎯 AI Skills System
Composable, reusable knowledge blocks — following the Claude Agent Skills open standard — that can be injected into any model or agent system message at runtime.
Think of a skill as a portable unit of expertise: a coding style guide, a tone-of-voice policy, domain-specific rules, API cheat sheets, or anything else your AI should "know" before answering. Skills are defined once, stored in files or inline, and shared across any number of agents and models — no copy-paste, no drift.
```
Without Skills                               With Skills
───────────────────────────────────          ───────────────────────────────────
Agent A system message:                      ✅ Define once → inject everywhere
  "Always use ISO dates.
   Follow our SQL style guide.               skills/
   Prefer short column names.                  sql-style/        SKILL.md
   Never use SELECT *."                        date-format/      SKILL.md
                                               api-guidelines/   SKILL.md
Agent B system message:
  "Always use ISO dates.                     agent = aiAgent( skills: sqlStyleSkill )
   Follow our SQL style guide.               model = aiModel( skills: apiGuidelinesSkill )
   ..."  ← copy-paste drift starts
```

📦 Defining Skills
Inline — for short, self-contained guidance that lives in your code:
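A minimal sketch using the documented `aiSkill( name, description, content )` signature; the skill content and agent wiring shown here are illustrative:

```boxlang
// Define a short, self-contained skill inline
sqlStyle = aiSkill(
	name       : "sql-style",
	description: "Company SQL conventions",
	content    : "Always use ISO dates. Never use SELECT *. Prefer short column names."
)

// Inject it into an agent as an always-on skill
agent = aiAgent( skills: [ sqlStyle ] )
```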
From a SKILL.md file — for longer guidance that benefits from version control and editor tooling:
The SKILL.md file format is simple Markdown with an optional YAML frontmatter description field. If frontmatter is absent, the first paragraph of the body is used as the description automatically.
Store skill files under .ai/skills/<skill-name>/SKILL.md in your project. The directory name becomes the skill's default name when loading from a path.
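For example, a `SKILL.md` file might look like this (contents illustrative):

```markdown
---
description: Company SQL conventions for all generated queries
---

# SQL Style Guide

- Always use ISO 8601 dates.
- Never use `SELECT *`; list columns explicitly.
- Prefer short, snake_case column names.
```

Loading it by path — e.g. `aiSkill( ".ai/skills/sql-style" )` — picks up `sql-style` as the default skill name from the directory.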
⚡ Two Injection Modes
Different situations call for different strategies. Skills support two modes that you can mix freely within the same agent.
Always-on skills ( withSkills() / addSkill() ): Full skill content is injected into the system message on every single call. The LLM always has this knowledge in context, with zero latency. Best for short, universally relevant guidance like tone, format rules, or domain vocabulary.
Lazy / available skills ( withAvailableSkills() / addAvailableSkill() ): Only a compact index — just the skill name and one-line description — is included in the system message. When the LLM determines it needs a skill, it calls the auto-registered loadSkill( name ) tool to fetch the full content on demand. Best for large or specialized skill libraries where most skills are irrelevant to most queries — keeps token usage low.
You can promote a lazy skill to always-on at any point during a session:
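A sketch of promoting a skill mid-session, assuming an agent seeded with a file-based lazy skill (the path is illustrative):

```boxlang
// Start with a lazy skill: only its name + description are in context
agent = aiAgent( availableSkills: [ aiSkill( ".ai/skills/sql-style" ) ] )

// Once it's clearly relevant to this session, promote it:
// its full content is now injected on every subsequent call
agent.activateSkill( "sql-style" )
```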
🌍 Global Skills Pool
Register skills once at the application or module level and have them automatically available to every new agent — no need to explicitly pass them:
Global skills are prepended to every new agent's availableSkills pool. You can also configure them statically in ModuleConfig.bx via settings.globalSkills.
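A static configuration sketch; whether `globalSkills` entries are skill paths or skill instances depends on your setup, so the values below are illustrative:

```boxlang
// In ModuleConfig.bx (sketch)
settings = {
	globalSkills: [
		".ai/skills/tone-of-voice",
		".ai/skills/api-guidelines"
	]
};
```

At runtime, `aiGlobalSkills()` returns the shared pool for inspection.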
🔍 Introspection
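For example, you can inspect exactly what an agent will send, using the documented `buildSkillsContent()` and `getConfig()` helpers (the skill path is illustrative):

```boxlang
agent = aiAgent( skills: [ aiSkill( ".ai/skills/sql-style" ) ] )

// Render the combined skills block exactly as it will appear
// in the system message
println( agent.buildSkillsContent() )

// getConfig() also exposes the agent's full current configuration
println( agent.getConfig() )
```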
📋 Full API Reference
| API | Available On | Purpose |
| --- | --- | --- |
| `aiSkill( path \| name, description, content, recurse )` | Global BIF | Create inline or discover file-based skills |
| `aiGlobalSkills()` | Global BIF | Access the globally shared skills pool |
| `withSkills( skills )` | `AiModel`, `AiAgent` | Replace all always-on skills |
| `addSkill( skill )` | `AiModel`, `AiAgent` | Add a single always-on skill |
| `withAvailableSkills( skills )` | `AiModel`, `AiAgent` | Replace all lazy skills |
| `addAvailableSkill( skill )` | `AiModel`, `AiAgent` | Add a single lazy skill |
| `activateSkill( name )` | `AiModel`, `AiAgent` | Promote a lazy skill to always-on |
| `buildSkillsContent()` | `AiModel`, `AiAgent` | Render the combined system-message block |
| `skills: []` param | `aiAgent()`, `aiModel()` | Seed always-on skills at construction time |
| `availableSkills: []` param | `aiAgent()` | Seed lazy skills at construction time |
📖 Full Guide: AI Skills
🔌 MCP Server Seeding for Agents and Models
Agents and models can now be seeded directly with one or more MCP servers. All tools exposed by those servers are automatically discovered via listTools() and registered as MCPTool instances — no manual Tool construction required.
New APIs:
- `withMCPServer( server, config )` — Attach a single MCP server (URL string or pre-configured `MCPClient`). Config supports `token`, `timeout`, `headers`, `user`, `password`.
- `withMCPServers( servers )` — Attach multiple servers in one call.
- `listMcpServers()` — Returns currently connected MCP servers and their exposed tool names.
- `listTools()` on `AiAgent` returns `[{ name, description }]` for all registered tools.
- `getConfig()` now includes `tools` (full name/description list) and `mcpServers` (server URL + tool-name list).
- New `MCPTool` class proxies a single MCP server tool and converts its `inputSchema` to OpenAI function-calling format automatically.
- MCP servers are auto-injected into the system prompt so the LLM knows which tools came from which server.
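A sketch of seeding an agent from an MCP server; the URL, token, and timeout are placeholders:

```boxlang
// All tools exposed by the server are auto-discovered via listTools()
// and registered as MCPTool instances — no manual Tool construction
agent = aiAgent()
	.withMCPServer( "https://mcp.example.com/sse", { token: "my-token", timeout: 30 } )

// Introspect what came back
println( agent.listMcpServers() )  // server URLs + exposed tool names
println( agent.listTools() )       // [ { name, description } ] for all tools
```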
🗄️ Global AI Tool Registry
New singleton AIToolRegistry (accessible via aiToolRegistry() BIF) provides a module-scoped registry for AI tools. Tools registered by name can be referenced in params.tools arrays as plain strings and resolved automatically before requests — no live object references required.
- Supports module namespacing (e.g. `now@bxai`) to avoid collisions across modules
- `resolveTools()` converts string keys → `ITool` instances before LLM requests
- Two new interception points: `onAIToolRegistryRegister` and `onAIToolRegistryUnregister`
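A sketch of string-based tool resolution, using the `now@bxai` tool that the module auto-registers on load:

```boxlang
// The registry is a module-scoped singleton
registry = aiToolRegistry()

// Reference a registered tool by its plain string key in params.tools;
// the registry resolves keys into ITool instances before the request
answer = aiChat( "What is today's date?", { tools: [ "now@bxai" ] } )
```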
🔧 Tool System Overhaul
A major rethink of how tools are defined, registered, and invoked.
BaseTool abstract base class: All tool implementations now extend BaseTool, which provides the shared invocation lifecycle (firing beforeAIToolExecute/afterAIToolExecute events), result serialization, and the fluent describeArg() / describe[ArgName]() annotation syntax.
ClosureTool replaces the retired Tool.bx: a BaseTool subclass backed by any closure or lambda, which auto-introspects the callable's parameter metadata to generate an OpenAI-compatible function schema.
`CoreTools` ships two built-in tools:

- `now@bxai` — auto-registered on module load, returns the current date/time in ISO 8601 for temporal awareness
- `httpGet` — opt-in only (not auto-registered, for security), fetches any URL via HTTP GET
Additional improvements:
- Non-required tool arguments are now supported
- Tools receive the full originating `AiChatRequest` as `_chatRequest` for context-aware logic
- Structured output and streaming tools for Ollama
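As a sketch, a closure-backed tool with the fluent `describeArg()` annotation might look like this. The `aiTool()` helper, its signature, and the weather logic are assumptions for illustration; the point is that the closure's parameter metadata is auto-introspected into the function schema:

```boxlang
// A ClosureTool backed by a lambda; parameter metadata is
// auto-introspected into an OpenAI-compatible function schema
weather = aiTool( "get_weather", ( city, units = "metric" ) => {
	// ... call a real weather API here ...
	return "22°C and sunny in #city#"
} )
	.describeArg( "city", "The city to look up" )
	.describeArg( "units", "metric or imperial (optional, non-required)" )

response = aiChat( "What's the weather in Boston?", { tools: [ weather ] } )
```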
🛡️ Provider Capability System
A new type-safe capability system that prevents calling unsupported operations on providers.
- New `models/providers/capabilities/` package with `IAiChatService` and `IAiEmbeddingsService` scoped interfaces
- Every provider exposes `getCapabilities()` (returns `["chat", "stream", "embeddings", ...]`) and `hasCapability( "chat" )` for clean runtime introspection
- `aiChat()`, `aiChatStream()`, and `aiEmbed()` BIFs now check that the provider implements the required capability before calling, and throw a clear `UnsupportedCapability` exception instead of a cryptic provider error
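A sketch of guarding a multi-provider workflow; `aiService()` is assumed here to return a configured provider instance, and the provider name is illustrative:

```boxlang
service = aiService( "openai" )
println( service.getCapabilities() )  // e.g. [ "chat", "stream", "embeddings", ... ]

if ( service.hasCapability( "embeddings" ) ) {
	vector = aiEmbed( "Hello world" )
} else {
	// aiEmbed() would throw UnsupportedCapability for this provider,
	// so fall back or surface a friendly error instead
}
```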
🌲 AiAgent Parent-Child Hierarchy
AiAgent now tracks its position in a multi-agent tree, enabling sophisticated orchestration patterns with full introspection and cycle-detection.
New methods on AiAgent:
| Method | Description |
| --- | --- |
| `setParentAgent( parent )` | Assign a parent, with self-reference and cycle-detection guards |
| `clearParentAgent()` | Detach from a parent |
| `hasParentAgent()` | Returns `true` if the agent has a parent |
| `isRootAgent()` | Returns `true` for top-level agents |
| `getRootAgent()` | Walks up the tree and returns the root agent |
| `getAgentDepth()` | Returns nesting depth (0 = root, 1 = direct child, …) |
| `getAgentPath()` | Returns a slash-delimited path string, e.g. `/coordinator/researcher` |
| `getAncestors()` | Returns an ordered array `[ immediateParent, …, root ]` |
addSubAgent() now automatically calls setParentAgent(this) on the sub-agent. getConfig() includes parentAgent, agentDepth, and agentPath.
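A sketch of a two-level tree; the `name:` constructor argument is assumed for illustration:

```boxlang
coordinator = aiAgent( name: "coordinator" )
researcher  = aiAgent( name: "researcher" )

// addSubAgent() now wires the parent automatically
coordinator.addSubAgent( researcher )

println( researcher.hasParentAgent() )  // true
println( researcher.getAgentDepth() )   // 1
println( researcher.getAgentPath() )    // /coordinator/researcher
```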
🧵 Middleware Support
AiModel and AiAgent both support composable middleware for cross-cutting concerns. Agent middleware is prepended ahead of model middleware in the execution chain. Middleware can be passed at construction time or added fluently via .use().
Provider lifecycle hooks: preRequest() and postResponse() on providers let you customize the request/response shape without overriding sendChatRequest(). Useful for custom headers, response normalization, or provider-specific logging.
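A sketch of fluent registration using the documented core middleware classes; the constructor options structs are assumptions based on each middleware's documented properties:

```boxlang
// Agent middleware runs ahead of model middleware in the chain
agent = aiAgent()
	.use( new bxModules.bxai.models.middleware.core.LoggingMiddleware( { logToConsole: true } ) )
	.use( new bxModules.bxai.models.middleware.core.RetryMiddleware( { maxRetries: 2 } ) )
```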
📦 Shipped Core Middleware
BoxLang AI ships with six battle-tested middleware classes covering the most common cross-cutting concerns. All live in bxModules.bxai.models.middleware.core.
| Middleware | Purpose |
| --- | --- |
| `LoggingMiddleware` | Audit every LLM call and tool invocation — write to console, file, or both with a configurable log level |
| `RetryMiddleware` | Automatically retry failed LLM calls with exponential back-off; essential for flaky or rate-limited providers |
| `GuardrailMiddleware` | Block dangerous tools entirely and enforce regex-based argument validation before any tool runs |
| `MaxToolCallsMiddleware` | Prevent runaway agents by capping the total number of tool invocations per run |
| `HumanInTheLoopMiddleware` | Require explicit human approval (CLI prompt or custom callback) before sensitive tools execute |
| `FlightRecorderMiddleware` | Record live LLM/tool interactions to a JSON fixture and replay them offline — ideal for testing and debugging |
📋 LoggingMiddleware
Logs agent lifecycle events — run start/end, LLM calls, tool calls, and errors — to BoxLang's ai log file and optionally to the console. Drop it in to gain instant observability with no code changes.
| Property | Default | Description |
| --- | --- | --- |
| `logToFile` | `true` | Write to the BoxLang `ai` log file |
| `logToConsole` | `false` | Also print to stdout |
| `logLevel` | `"info"` | BoxLang log level (`info`, `debug`, `warning`, `error`) |
| `prefix` | `"[AI Middleware]"` | Prefix prepended to every log message |
🔁 RetryMiddleware
Wraps both LLM calls and tool calls with exponential back-off retry logic. Transient provider errors (rate limits, timeouts, 5xx) are retried automatically; non-retryable errors (bad input, max interactions) are surfaced immediately.
| Property | Default | Description |
| --- | --- | --- |
| `maxRetries` | `3` | Maximum retry attempts after the first failure |
| `initialDelay` | `1000` | Initial delay in ms before the first retry |
| `backoffMultiplier` | `2` | Multiplier applied to the delay after each failure |
| `maxDelay` | `30000` | Hard cap on the delay in ms |
| `nonRetryableTypes` | `"InvalidInput,MaxInteractionsExceeded"` | Comma-separated exception types that skip retrying |
🛡️ GuardrailMiddleware
Blocks tool calls by name or rejects calls whose arguments match configured regex patterns. Use this to prevent dangerous operations (e.g., destructive SQL) from ever reaching a tool.
| Property | Default | Description |
| --- | --- | --- |
| `blockedTools` | `[]` | Array of tool names never allowed (case-insensitive) |
| `argPatterns` | `{}` | `{ toolName: [ { paramName: "regexPattern" } ] }` — rejects on any match |
🙋 HumanInTheLoopMiddleware
Intercepts specified tool calls and requires a human to approve, reject, or edit the arguments before execution proceeds. Supports two modes:
- `cli` — Blocks and prompts the user interactively on stdin/stdout. Great for local scripts and development.
- `web` — Suspends the run and returns an `AiMiddlewareResult.suspend()` so the caller can checkpoint the state and present the approval request asynchronously (e.g., via a web UI or notification).
| Property | Default | Description |
| --- | --- | --- |
| `toolsRequiringApproval` | `[]` | Tool names that require human sign-off (case-insensitive) |
| `mode` | `"cli"` | `"cli"` for a blocking prompt, `"web"` for async suspension |
| `showArguments` | `true` | Whether to display tool arguments in the CLI prompt |
| `approvalCallback` | — | Optional closure `function( context )` for custom approval logic |
🎙️ FlightRecorderMiddleware
Records every LLM round-trip and tool invocation to a JSON fixture file, then replays that fixture deterministically — with no live provider or tool calls — for agent behaviour regression testing in CI.
| Property | Default | Description |
| --- | --- | --- |
| `mode` | `"passthrough"` | `"passthrough"`, `"record"`, or `"replay"` |
| `fixturePath` | `""` | Path to the fixture file (required in replay mode, auto-generated in record mode) |
| `fixtureDir` | `".ai/flight-recorder"` | Base directory for auto-generated fixture files |
| `recordTools` | `true` | Whether to capture tool call inputs and outputs |
| `strict` | `true` | In replay mode, throw on type mismatch; when `false`, skip ahead to the next matching interaction |
🔢 MaxToolCallsMiddleware
Enforces a hard cap on the total number of tool invocations per agent run, cancelling any additional calls once the limit is reached. The counter resets automatically at the start of each new run.
| Property | Default | Description |
| --- | --- | --- |
| `maxCalls` | `10` | Maximum tool invocations allowed per run |
🏢 Per-Call Identity Routing on All Memory Types
Following the Spring AI ChatMemory pattern, a single memory instance can now safely serve multiple tenants. add(), getAll(), clear(), trim(), seed(), and related methods on every IAiMemory and IVectorMemory implementation now accept optional userId and conversationId arguments. Construction-time values remain as fallbacks.
AiAgent is now fully stateless: userId and conversationId are resolved per-call from the options argument, eliminating shared-state concurrency bugs in multi-user deployments.
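A sketch of per-call identity; `memory` stands in for whichever `IAiMemory` implementation you've configured, and `message` for a previously built chat message — both are assumptions here:

```boxlang
// One memory instance, many tenants: identity is passed per call
// instead of being baked into the instance at construction
memory.add( message, userId: "user-42", conversationId: "support-1" )
history = memory.getAll( userId: "user-42", conversationId: "support-1" )

// Wipe just one tenant's conversation, not the whole store
memory.clear( userId: "user-42", conversationId: "support-1" )
```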
🤗 HuggingFace Embeddings Support
New huggingface provider for generating embeddings via the HuggingFace Inference API.
🔀 Custom Service URLs
All senders in BaseService now accept a custom URL override, enabling easy integration with proxies, self-hosted endpoints, and OpenAI-compatible APIs.
🔧 Enhancements
- `BaseService` → `OpenAIService` split: OpenAI-specific logic has moved to a dedicated `OpenAIService`. `BaseService` is now truly provider-agnostic, making it much easier to implement custom providers that differ from the OpenAI standard.
- `IAiService` contract trimmed: The base interface now declares only identity/configuration/capability-discovery methods (`getName()`, `configure()`, `getCapabilities()`, `hasCapability()`). The operation methods have moved to their respective capability interfaces.
- `VoyageService` corrected: Now extends `BaseService` directly and implements only `IAiEmbeddingsService` — eliminating the old stubbed-out chat methods that would throw at runtime. The type system now enforces the embeddings-only constraint.
- Runnable objects refactored into a `runnables/` folder (`AiModel`, `AiAgent`, `AiMessage`) for a cleaner separation between service logic and runnable wrappers.
- `resume()` / `resumeStream()` now require an explicit `threadId`: Both methods now take `threadId` as a `required string` argument instead of defaulting to an instance property. This eliminates ambiguous state in multi-user deployments.
- Renamed `BaseService.sendRequest()` → `sendChatRequest()` for clarity.
- Reduced duplicate payload fields in `onAITokenCount` events.
🐛 Bug Fixes
- Streaming events: Model and agent streaming was not announcing the global `beforeAIModelInvoke` / `afterAIModelInvoke` pre/post events. Fixed.
- MCP `requestId` crash: A null-scope crash occurred on JSON-RPC notifications from MCP servers (notifications intentionally omit `id`). Fixed.
- MiniMax error surfacing: `base_resp.status_code != 0` errors were silently swallowed. Now they surface correctly with status code and message.
- `OllamaService` stale embedding hook: The old `postEmbeddingResponse()` hook was never wired into the current `BaseService` lifecycle and silently did nothing. Replaced with the proper `postResponse( aiRequest, dataPacket, result, operation )` override.
📖 Full Documentation
Explore the full documentation for all new features: