Provider Setup & Configuration
Comprehensive guide to configuring AI providers in BoxLang AI - from API keys to local deployment with Ollama.
This guide covers detailed setup instructions for all supported AI providers, helping you choose the right provider and configure it properly for your use case.
🎯 Quick Provider Comparison
| Provider | Type | Best For | Cost | Speed | Context |
|---|---|---|---|---|---|
| OpenAI | Cloud | General purpose, GPT-5 | $$$ | Fast | 128K |
| Claude | Cloud | Long context, analysis | $$$ | Fast | 200K |
| Gemini | Cloud | Google integration, multimodal | $$ | Fast | 1M |
| Ollama | Local | Privacy, offline, free | Free | Medium | Varies |
| Groq | Cloud | Ultra-fast inference | $$ | Fastest | 32K |
| DeepSeek | Cloud | Code, reasoning | $ | Fast | 64K |
| HuggingFace | Cloud | Open-source models | $ | Medium | Varies |
| OpenRouter | Gateway | Access multiple models | Varies | Fast | Varies |
| Perplexity | Cloud | Research, citations | $$ | Fast | 8K |
| Cohere | Cloud | Embeddings, multilingual | $$ | Fast | 128K |
| Voyage | Cloud | State-of-the-art embeddings | $$ | Fast | N/A |
💡 Recommendations by Use Case
- **General Chatbot**: OpenAI (GPT-4), Claude (Sonnet)
- **Long Documents**: Claude (200K context), Gemini (1M context)
- **Code Generation**: DeepSeek, OpenAI (GPT-4)
- **Fast Responses**: Groq, Gemini
- **Privacy/Offline**: Ollama (local)
- **Embeddings/RAG**: Voyage, Cohere, OpenAI
- **Research**: Perplexity (citations)
- **Cost-Effective**: Ollama (free), DeepSeek, Gemini
- **Multimodal**: Gemini, OpenAI (GPT-4)
🔧 Configuration Basics
All providers are configured in your boxlang.json file:
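A minimal sketch, assuming the module registers its settings under `modules.bxai.settings` (verify the exact module key for your installed version):

```json
{
    "modules": {
        "bxai": {
            "settings": {
                "provider": "openai",
                "apiKey": "YOUR_API_KEY",
                "timeout": 30,
                "returnFormat": "single",
                "defaultParams": {}
            }
        }
    }
}
```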
Configuration Options Reference
| Option | Type | Default | Description |
|---|---|---|---|
| `provider` | string | `"openai"` | Default AI provider |
| `apiKey` | string | `""` | API key for the provider |
| `chatURL` | string | Auto | Custom API endpoint URL |
| `defaultParams` | struct | `{}` | Default parameters for requests |
| `timeout` | number | `30` | Request timeout in seconds |
| `returnFormat` | string | `"single"` | Default return format |
☁️ Cloud Providers
🟢 OpenAI (ChatGPT)
Best for: General purpose AI, content generation, code assistance
Get API Key: https://platform.openai.com/api-keys
Configuration:
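A minimal settings sketch (the `openai` provider slug is an assumption; place these under your module settings as shown above):

```json
{
    "provider": "openai",
    "apiKey": "YOUR_OPENAI_API_KEY",
    "defaultParams": { "model": "gpt-4o" }
}
```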
Available Models:
| Model | Description | Context | Best For |
|---|---|---|---|
| `gpt-5` | Latest, most advanced | 128K | Everything |
| `gpt-4` | Most capable | 128K | Complex tasks, reasoning |
| `gpt-4-turbo` | Faster, cheaper | 128K | Production apps |
| `gpt-3.5-turbo` | Fast, affordable | 16K | Simple tasks, high volume |
| `gpt-4o` | Optimized for chat | 128K | Conversational AI |
Pricing (as of Dec 2024):
- GPT-5: ~$30/1M tokens input, ~$60/1M tokens output
- GPT-4: ~$10/1M tokens input, ~$30/1M tokens output
- GPT-4-Turbo: ~$5/1M tokens input, ~$15/1M tokens output
- GPT-3.5-Turbo: ~$0.50/1M tokens
Usage Example:
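A short BoxLang sketch, assuming the module's `aiChat()` BIF accepts a message plus an optional params struct:

```
// One-shot chat using the configured default provider and model
answer = aiChat( "Summarize BoxLang in one sentence" )
println( answer )

// Override the model for a single request (param name assumed)
answer = aiChat( "Explain BoxLang closures with an example", { model: "gpt-4o" } )
```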
🟣 Claude (Anthropic)
Best for: Long context analysis, detailed reasoning, safety-focused applications
Get API Key: https://console.anthropic.com/
Configuration:
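Minimal settings sketch (the `claude` provider slug is an assumption):

```json
{
    "provider": "claude",
    "apiKey": "YOUR_CLAUDE_API_KEY",
    "defaultParams": { "model": "claude-3-5-sonnet-20241022" }
}
```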
Available Models:
| Model | Description | Context | Best For |
|---|---|---|---|
| `claude-3-opus-20240229` | Most capable | 200K | Complex analysis |
| `claude-3-5-sonnet-20241022` | Balanced (recommended) | 200K | General use |
| `claude-3-5-haiku-20241022` | Fastest, cheapest | 200K | High volume |
Pricing:
- Opus: ~$15/1M input, ~$75/1M output
- Sonnet: ~$3/1M input, ~$15/1M output
- Haiku: ~$0.25/1M input, ~$1.25/1M output
Special Features:
- **200K context window** - Entire books in one request
- **Constitutional AI** - Enhanced safety and helpfulness
- **Vision support** - Image analysis capabilities
🔵 Gemini (Google)
Best for: Google integration, multimodal content, massive context windows
Get API Key: https://makersuite.google.com/app/apikey
Configuration:
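Minimal settings sketch (the `gemini` provider slug is an assumption):

```json
{
    "provider": "gemini",
    "apiKey": "YOUR_GEMINI_API_KEY",
    "defaultParams": { "model": "gemini-2.0-flash" }
}
```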
Available Models:
| Model | Description | Context | Best For |
|---|---|---|---|
| `gemini-2.0-flash` | Fast, efficient (recommended) | 1M | General use |
| `gemini-1.5-pro` | Most capable | 2M | Complex tasks |
| `gemini-1.5-flash` | Fast, affordable | 1M | High volume |
Special Features:
- **1-2M context window** - Process entire codebases
- **Multimodal native** - Text, images, audio, video
- **Free tier available** - Great for development
🔸 Grok (xAI)
Best for: Real-time data, Twitter/X integration, conversational AI
Get API Key: https://console.x.ai/
Configuration:
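Minimal settings sketch (the `grok` provider slug is an assumption; no default model is suggested here):

```json
{
    "provider": "grok",
    "apiKey": "YOUR_XAI_API_KEY"
}
```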
🤗 HuggingFace
Best for: Open-source models, community-driven, flexibility
Get API Key: https://huggingface.co/settings/tokens
Configuration:
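Minimal settings sketch (the `huggingface` provider slug is an assumption; the default model below is one of the popular models listed next):

```json
{
    "provider": "huggingface",
    "apiKey": "YOUR_HF_TOKEN",
    "defaultParams": { "model": "Qwen/Qwen2.5-72B-Instruct" }
}
```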
Popular Models:
- `Qwen/Qwen2.5-72B-Instruct` - Powerful general-purpose model
- `meta-llama/Llama-3.1-8B-Instruct` - Meta's Llama model
- `mistralai/Mistral-7B-Instruct-v0.3` - Fast and efficient
⚡ Groq
Best for: Ultra-fast inference with LPU architecture
Get API Key: https://console.groq.com/
Configuration:
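Minimal settings sketch (the `groq` provider slug is an assumption):

```json
{
    "provider": "groq",
    "apiKey": "YOUR_GROQ_API_KEY"
}
```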
Special Features:
- **Fastest inference** - Up to 500 tokens/second
- **LPU architecture** - Hardware-optimized for AI
- **Free tier** - Generous limits for testing
🔷 DeepSeek
Best for: Code generation, reasoning tasks, cost-effective
Get API Key: https://platform.deepseek.com/
Configuration:
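Minimal settings sketch (the `deepseek` slug and the `deepseek-chat` default model are assumptions):

```json
{
    "provider": "deepseek",
    "apiKey": "YOUR_DEEPSEEK_API_KEY",
    "defaultParams": { "model": "deepseek-chat" }
}
```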
🟠 Mistral
Best for: European data residency, balanced performance/cost
Get API Key: https://console.mistral.ai/
Configuration:
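Minimal settings sketch (the `mistral` provider slug is an assumption; available models are listed below):

```json
{
    "provider": "mistral",
    "apiKey": "YOUR_MISTRAL_API_KEY",
    "defaultParams": { "model": "mistral-small-latest" }
}
```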
Available Models:
- `mistral-large-latest` - Most capable
- `mistral-medium-latest` - Balanced
- `mistral-small-latest` - Fast, cost-effective
🌐 OpenRouter (Multi-Model Gateway)
Best for: Access multiple models through one API, cost optimization
Get API Key: https://openrouter.ai/keys
Configuration:
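Minimal settings sketch (the `openrouter` slug is an assumption; OpenRouter model IDs are vendor-prefixed):

```json
{
    "provider": "openrouter",
    "apiKey": "YOUR_OPENROUTER_API_KEY",
    "defaultParams": { "model": "openai/gpt-4o" }
}
```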
Special Features:
- Access 100+ models through one API
- Automatic fallback if a model is unavailable
- Cost tracking across providers
- Free models available
🔎 Perplexity
Best for: Research, factual accuracy, citations
Get API Key: https://www.perplexity.ai/settings/api
Configuration:
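Minimal settings sketch (the `perplexity` provider slug is an assumption):

```json
{
    "provider": "perplexity",
    "apiKey": "YOUR_PERPLEXITY_API_KEY"
}
```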
Special Features:
- Real-time web search integration
- Automatic source citations
- Fact-checked responses
🧡 Cohere
Best for: Embeddings, multilingual support, RAG applications
Get API Key: https://dashboard.cohere.com/api-keys
Configuration:
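Minimal settings sketch (the `cohere` provider slug is an assumption):

```json
{
    "provider": "cohere",
    "apiKey": "YOUR_COHERE_API_KEY"
}
```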
Special Features:
- Best-in-class embeddings for RAG
- Native multilingual support (100+ languages)
- Tool use and structured output
🚀 Voyage
Best for: State-of-the-art embeddings optimized for RAG
Get API Key: https://dash.voyageai.com/
Configuration:
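Minimal settings sketch (the `voyage` provider slug is an assumption):

```json
{
    "provider": "voyage",
    "apiKey": "YOUR_VOYAGE_API_KEY"
}
```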
Note: Voyage is embedding-only (use with aiEmbed() or vector memory)
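For example, a sketch assuming `aiEmbed()` accepts the text to embed and returns the embedding vector:

```
// Generate an embedding vector for RAG or vector memory
vector = aiEmbed( "BoxLang is a dynamic JVM language" )
```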
🦙 Local AI with Ollama
Perfect for privacy, offline use, and zero API costs!
Why Ollama?
✅ 100% Free - No API costs ever
✅ Privacy - Data never leaves your machine
✅ Offline - Works without internet
✅ No Rate Limits - Use as much as you want
✅ Fast - Low latency on local hardware
Installation Methods
Option 1: Native Installation
macOS:
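```bash
# Install via Homebrew (or download the app from https://ollama.ai)
brew install ollama
```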
Linux:
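```bash
# Official install script
curl -fsSL https://ollama.ai/install.sh | sh
```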
Windows: Download installer from https://ollama.ai
Option 2: Docker (Recommended for Production)
See Running Ollama with Docker in the installation guide.
Pull and Configure Models
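For example, to download models locally (see the selection guide below):

```bash
ollama pull llama3.2
ollama pull codellama
```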
BoxLang Configuration
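A sketch of the relevant settings; the `ollama` slug is an assumption, and the URL below is Ollama's default local port (the exact endpoint format may vary by module version):

```json
{
    "provider": "ollama",
    "chatURL": "http://localhost:11434",
    "defaultParams": { "model": "llama3.2" }
}
```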
Note: Ollama doesn't require an API key for local use.
Verify Installation
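For example:

```bash
# List installed models
ollama list

# Smoke-test a model from the CLI
ollama run llama3.2 "Say hello"
```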
Model Selection Guide
| Model | Size | Speed | Quality | Best For |
|---|---|---|---|---|
| `llama3.2:1b` | 1GB | ⚡⚡⚡ | ⭐⭐ | Quick responses, testing |
| `llama3.2` | 3GB | ⚡⚡ | ⭐⭐⭐ | General use (recommended) |
| `phi3` | 2GB | ⚡⚡⚡ | ⭐⭐⭐ | Balanced quality/speed |
| `mistral` | 4GB | ⚡⚡ | ⭐⭐⭐⭐ | High quality responses |
| `codellama` | 4GB | ⚡⚡ | ⭐⭐⭐⭐ | Code generation |
| `qwen2.5:7b` | 5GB | ⚡ | ⭐⭐⭐⭐⭐ | Best quality (slower) |
Hardware Requirements
- **Minimum**: 8GB RAM, 4GB disk space
- **Recommended**: 16GB RAM, 10GB disk space
- **Optimal**: 32GB RAM, GPU (NVIDIA/AMD)
🔐 Environment Variables
Use environment variables to keep API keys out of config files:
In boxlang.json
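Assuming your BoxLang version supports `${...}` placeholder substitution in config files (verify this for your install), reference the variable instead of hard-coding the key:

```json
{
    "provider": "openai",
    "apiKey": "${OPENAI_API_KEY}"
}
```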
Set Environment Variables
macOS/Linux:
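```bash
# Add to ~/.zshrc or ~/.bashrc to persist across sessions
export OPENAI_API_KEY="sk-..."
export CLAUDE_API_KEY="sk-ant-..."
```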
Windows:
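```powershell
# Persists for future sessions; open a new terminal afterwards
setx OPENAI_API_KEY "sk-..."
```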
Auto-Detection
Convention: By default, BoxLang AI automatically detects environment variables following the pattern {PROVIDER_NAME}_API_KEY. This means you don't need to explicitly configure API keys in boxlang.json if you set the appropriate environment variable.
For example:
- Setting `OPENAI_API_KEY` allows you to use OpenAI without configuration
- Setting `CLAUDE_API_KEY` allows you to use Claude without configuration
- And so on for all providers
Automatically detected environment variables:
- `OPENAI_API_KEY`
- `CLAUDE_API_KEY`
- `ANTHROPIC_API_KEY` (alternative for Claude)
- `GEMINI_API_KEY`
- `GOOGLE_API_KEY` (alternative for Gemini)
- `GROQ_API_KEY`
- `DEEPSEEK_API_KEY`
- `HUGGINGFACE_API_KEY`
- `HF_TOKEN` (alternative for HuggingFace)
- `MISTRAL_API_KEY`
- `PERPLEXITY_API_KEY`
- `COHERE_API_KEY`
- `VOYAGE_API_KEY`
🔄 Multiple Providers
Use different providers for different tasks:
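A sketch assuming `aiChat()` accepts a per-call options struct with a `provider` override (the argument shape is an assumption; check the module docs):

```
// Default provider for general chat
email = aiChat( "Draft a short welcome email" )

// Route code generation to DeepSeek for one call
code = aiChat( "Write a BoxLang function that reverses a string", {}, { provider: "deepseek" } )

// Keep sensitive content local via Ollama
summary = aiChat( "Summarize this internal memo: ...", { model: "llama3.2" }, { provider: "ollama" } )
```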
Provider Services
Create reusable service instances:
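A rough sketch, assuming an `aiService()` BIF returns a configurable service object; the method names below are assumptions, so verify against the module's API docs:

```
// One pre-configured service per provider, created once and reused
openai = aiService( "openai" ).configure( "YOUR_OPENAI_API_KEY" )
local  = aiService( "ollama" ).configure( "", "http://localhost:11434" )
```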
🔧 Troubleshooting
❌ "No API key provided"
Solution: Set API key in config or pass directly
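The quickest fix is the auto-detected environment variable convention described above:

```bash
export OPENAI_API_KEY="sk-..."
```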
⏱️ "Connection timeout"
Solution: Increase timeout setting
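Raise the `timeout` option (in seconds) in your module settings, for example:

```json
{ "timeout": 120 }
```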
🔌 "Connection refused" (Ollama)
Check if Ollama is running:
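```bash
# The server answers "Ollama is running" on its default port
curl http://localhost:11434

# Start it if needed
ollama serve
```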
🚫 "Model not found"
Solution: Pull the model first (Ollama)
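For example:

```bash
ollama pull llama3.2
```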
💰 "Rate limit exceeded"
Solutions:
- Upgrade to paid tier
- Implement request caching
- Use different provider for high-volume tasks
- Switch to Ollama (no limits)
🔑 "Invalid API key"
Verify:
- Key is complete and not truncated
- Key is for correct provider
- Key has not expired
- Account has credits/subscription
🚀 Next Steps
- **Installation Guide** - Install the BoxLang AI module
- **Quick Start** - Your first AI conversation
- **Basic Chatting** - Learn the fundamentals
- **Advanced Features** - Tools, streaming, multimodal
💡 Tips for Production
- **Use environment variables** for API keys (never commit them to git)
- **Set appropriate timeouts** based on your use case
- **Implement retry logic** for transient errors
- **Monitor costs** with provider dashboards
- **Use Ollama for development/testing** to save costs
- **Cache responses** when possible to reduce API calls
- **Choose the right model** for each task (don't always use the most expensive)
- **Rate limit your application** to avoid provider rate limits