Building Custom Document Loaders
Learn how to build custom document loaders to integrate any data source into BoxLang AI workflows.
This guide shows how to build loaders that work with memory systems, RAG pipelines, and the aiDocuments() BIF.
🎯 Why Custom Loaders?
Build custom loaders when you need to:
Load proprietary formats - Parse custom file formats or data structures
Integrate APIs - Pull data from external services or APIs
Transform data - Apply domain-specific logic during loading
Add metadata - Extract and preserve rich metadata from sources
Optimize performance - Implement specialized caching or batching
🏗️ Loader Architecture
📝 IDocumentLoader Interface
All document loaders must implement the IDocumentLoader interface:
interface {

    /**
     * Load documents from the source
     * @return Array of Document objects
     */
    public array function load();

    /**
     * Load documents asynchronously
     * @return BoxLang Future resolving to array of Documents
     */
    public any function loadAsync();

    /**
     * Get the source being loaded
     * @return string
     */
    public string function getSource();

    /**
     * Configure the loader
     * @param config Configuration struct
     * @return this (for fluent API)
     */
    public any function configure( required struct config );

}
🚀 Quick Start: Simple Custom Loader
Here's a minimal custom loader that loads data from an API:
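A minimal sketch of such a loader, under stated assumptions rather than as the module's reference implementation. It implements the four IDocumentLoader methods directly. The class name `ApiDocumentLoader`, the `id`/`title`/`body` fields of the JSON payload, and the `toDocument()` helper are hypothetical. The helper returns a plain struct as a stand-in for a Document object, because the Document constructor is not shown in this guide. The sketch also assumes the interface resolves as `IDocumentLoader` from wherever the class lives and that `asyncRun()` is available in your BoxLang runtime.

```boxlang
// ApiDocumentLoader.bx: hypothetical class name, sketch only
class implements="IDocumentLoader" {

    /**
     * @source URL of a JSON endpoint returning an array of records (assumed shape)
     * @config Optional settings, e.g. { timeout: 10 }
     */
    function init( required string source, struct config = {} ) {
        variables.source = arguments.source;
        variables.config = { timeout: 30 };
        return configure( arguments.config );
    }

    public array function load() {
        // Fetch the raw payload; the response struct is stored in local.apiResult
        bx:http url="#variables.source#" method="GET" timeout="#variables.config.timeout#" result="local.apiResult";

        // Assumed payload: [ { "id": 1, "title": "...", "body": "..." }, ... ]
        var records = jsonDeserialize( local.apiResult.fileContent );
        return records.map( ( record ) => toDocument( record ) );
    }

    public any function loadAsync() {
        // Runs load() on another thread and returns a future (asyncRun is assumed available)
        return asyncRun( () => load() );
    }

    public string function getSource() {
        return variables.source;
    }

    public any function configure( required struct config ) {
        // Merge new settings over the existing ones, then return this for chaining
        variables.config.append( arguments.config, true );
        return this;
    }

    // Stand-in for building a Document: swap in the module's Document class here
    private struct function toDocument( required struct record ) {
        return {
            content  : record.body,
            metadata : { source: variables.source, id: record.id, title: record.title }
        };
    }

}
```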
Usage:
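A usage sketch, assuming a loader class named `ApiDocumentLoader` (hypothetical) that implements the IDocumentLoader interface above and takes its source as the first constructor argument. The endpoint URL is a placeholder.

```boxlang
// Create the loader, then adjust its configuration
loader = new ApiDocumentLoader( "https://api.example.com/articles" );
loader.configure( { timeout: 10 } );

// Synchronous load: returns an array of documents
docs = loader.load();
println( "Loaded #docs.len()# documents from #loader.getSource()#" );

// Asynchronous load: returns a future; get() blocks until the result is ready
future = loader.loadAsync();
asyncDocs = future.get();
```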
🎨 Extending BaseDocumentLoader
The BaseDocumentLoader provides common functionality:
Inherited Methods
Fluent API Pattern
Implement fluent methods for your custom configuration:
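One way to sketch the pattern: each setter stores a value and returns `this`, delegating to the `configure()` method from the interface. The class below is kept standalone so it makes no assumptions about BaseDocumentLoader internals; in a real loader these methods would sit in your subclass. The method names `withPageSize()` and `withAuthToken()` and the config keys are hypothetical.

```boxlang
// Fluent configuration sketch: every setter returns this so calls can be chained
class {

    function init() {
        // Defaults for the hypothetical settings used below
        variables.config = { pageSize: 100, authToken: "" };
        return this;
    }

    // Generic entry point, mirroring IDocumentLoader.configure()
    public any function configure( required struct config ) {
        variables.config.append( arguments.config, true );
        return this;
    }

    // Fluent setter: number of records to request per page
    public any function withPageSize( required numeric size ) {
        return configure( { pageSize: arguments.size } );
    }

    // Fluent setter: token sent with each request
    public any function withAuthToken( required string token ) {
        return configure( { authToken: arguments.token } );
    }

    // Accessor so callers (and tests) can inspect the merged settings
    public struct function getConfig() {
        return variables.config;
    }

}
```

Routing every setter through `configure()` keeps validation and merging in one place.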
Usage:
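A chained call might look like the following, assuming a loader class `MyApiLoader` (hypothetical) that exposes fluent setters named `withPageSize()` and `withAuthToken()` (also hypothetical) alongside the interface's `load()`.

```boxlang
loader = new MyApiLoader( "https://api.example.com/articles" );

// Each setter returns the loader itself, so configuration reads as one chain
docs = loader
    .withPageSize( 50 )
    .withAuthToken( "my-token" )
    .load();
```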
💡 Advanced Example: Database Loader
Here's a more complex example that loads data from a database with pagination:
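A sketch of offset-based pagination, not the module's own implementation. It assumes a datasource is already configured, a hypothetical `articles` table with `id`, `title`, and `body` columns, and a SQL dialect that accepts `LIMIT ... OFFSET ...` with bound parameters (for example MySQL or PostgreSQL). Plain structs stand in for Document objects, and only `load()` and `getSource()` are shown; a full loader would also implement `loadAsync()` and `configure()`.

```boxlang
// DatabaseLoader.bx: hypothetical class name, sketch only
class {

    function init( required string datasource, numeric pageSize = 500 ) {
        variables.datasource = arguments.datasource;
        // Guard against a zero or negative page size, which would never terminate
        variables.pageSize   = max( 1, arguments.pageSize );
        return this;
    }

    public string function getSource() {
        return variables.datasource;
    }

    public array function load() {
        var documents = [];
        var offset    = 0;

        while ( true ) {
            // Fetch one page, ordered by id so pages do not overlap
            var page = queryExecute(
                "SELECT id, title, body FROM articles ORDER BY id LIMIT :limit OFFSET :offset",
                {
                    limit  : { value: variables.pageSize, sqltype: "integer" },
                    offset : { value: offset, sqltype: "integer" }
                },
                { datasource: variables.datasource }
            );

            // Convert each row into a document-shaped struct
            for ( var row in page ) {
                documents.append( {
                    content  : row.body,
                    metadata : { source: variables.datasource, id: row.id, title: row.title }
                } );
            }

            // A short (or empty) page means the table is exhausted
            if ( page.recordCount < variables.pageSize ) {
                break;
            }
            offset += variables.pageSize;
        }

        return documents;
    }

}
```

For very large tables, keyset pagination (`WHERE id > :lastId`) usually scales better than a growing `OFFSET`.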
Usage:
🔌 Registering Custom Loaders
Make your custom loader available via aiDocuments():
Module Registration
In your ModuleConfig.bx:
Usage After Registration
✅ Best Practices
1. Error Handling
Always wrap external calls with try/catch:
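A minimal sketch of this practice, assuming only the `load()` and `getSource()` methods from the interface. The function name `loadSafely()`, the `failFast` flag, and the exception type `DocumentLoader.LoadFailed` are hypothetical.

```boxlang
// Wrap a loader call so failures are logged with context instead of surfacing raw
function loadSafely( required any loader, boolean failFast = true ) {
    try {
        return arguments.loader.load();
    } catch ( any e ) {
        var source = arguments.loader.getSource();
        writeLog( text = "Loader failed for #source#: #e.message#", type = "error" );

        if ( arguments.failFast ) {
            // Re-throw with a domain-specific type so callers can catch it selectively
            throw(
                type    = "DocumentLoader.LoadFailed",
                message = "Could not load documents from #source#",
                detail  = e.message
            );
        }

        // Lenient mode: report nothing loaded rather than aborting the pipeline
        return [];
    }
}
```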
2. Resource Cleanup
Clean up resources in finally blocks:
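For example, a loader that reads a file line by line can close the handle in `finally`, so it is released even when parsing throws. This is a sketch: the function name `loadLines()` is hypothetical and plain structs stand in for Document objects.

```boxlang
// Read a text file line by line, producing one document-shaped struct per non-empty line
function loadLines( required string path ) {
    var documents  = [];
    var lineNumber = 0;
    var file       = fileOpen( arguments.path, "read" );

    try {
        while ( !fileIsEOF( file ) ) {
            var line = fileReadLine( file );
            lineNumber++;
            if ( line.trim().len() ) {
                documents.append( {
                    content  : line,
                    metadata : { source: arguments.path, line: lineNumber }
                } );
            }
        }
    } finally {
        // Runs whether or not the loop threw, so the handle is never leaked
        fileClose( file );
    }

    return documents;
}
```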
3. Metadata Enrichment
Add rich metadata for better retrieval:
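A sketch of a helper that builds a metadata struct for each document. The function name `buildMetadata()` and the chosen keys (`source`, `loadedAt`, `charCount`, `contentHash`) are illustrative assumptions, not fields the module requires.

```boxlang
// Build a metadata struct: common fields first, then caller-supplied extras
function buildMetadata( required string content, required string source, struct extra = {} ) {
    // The content hash lets downstream code detect duplicate or unchanged documents
    var metadata = {
        source      : arguments.source,
        loadedAt    : now(),
        charCount   : arguments.content.len(),
        contentHash : hash( arguments.content, "SHA-256" )
    };

    // Domain-specific fields (author, category, ...) override the defaults on conflict
    metadata.append( arguments.extra, true );
    return metadata;
}

meta = buildMetadata(
    "Quarterly revenue rose 12%.",
    "reports/q3.txt",
    { author: "Finance", category: "report" }
);
```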
4. Performance Optimization
Implement batching and caching:
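Both ideas can be sketched in a few lines. `toBatches()` splits an array into fixed-size chunks, and `loadCached()` keeps an in-memory cache keyed by the loader's source. Both function names are hypothetical, and the cache here lives for the duration of the script only, with no expiry.

```boxlang
// Split an array into batches of at most `size` items
// (five items with size 2 yield batches of 2, 2, and 1)
function toBatches( required array items, numeric size = 100 ) {
    var batches   = [];
    var batchSize = max( 1, arguments.size );
    var total     = arguments.items.len();

    for ( var start = 1; start <= total; start += batchSize ) {
        // The last batch may be shorter than batchSize
        var length = min( batchSize, total - start + 1 );
        batches.append( arraySlice( arguments.items, start, length ) );
    }
    return batches;
}

// Simple in-memory cache: repeated loads of the same source skip the external system
cache = {};

function loadCached( required any loader ) {
    var key = arguments.loader.getSource();
    if ( !cache.keyExists( key ) ) {
        cache[ key ] = arguments.loader.load();
    }
    return cache[ key ];
}
```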
📚 Next Steps
📖 Document Loaders: Complete loader documentation
🧬 RAG Workflows: Implementing RAG
🔧 Custom Transformers: Building transformers
💻 Examples: Check examples/loaders/ for more examples
🎓 Summary
Custom document loaders enable you to:
✅ Integrate any data source into BoxLang AI workflows
✅ Preserve rich metadata for better retrieval
✅ Implement domain-specific logic and transformations
✅ Work seamlessly with memory systems and RAG pipelines
✅ Provide fluent APIs for easy configuration
Start with BaseDocumentLoader, implement the load() method, and you're ready to integrate your custom data sources!