Provider Setup & Configuration

Comprehensive guide to configuring AI providers in BoxLang AI - from API keys to local deployment with Ollama.

This guide covers detailed setup instructions for all supported AI providers, helping you choose the right provider and configure it properly for your use case.

🎯 Quick Provider Comparison

| Provider | Type | Best For | Cost | Speed | Context |
| --- | --- | --- | --- | --- | --- |
| OpenAI | Cloud | General purpose, GPT-5 | $$$ | Fast | 128K |
| Claude | Cloud | Long context, analysis | $$$ | Fast | 200K |
| Gemini | Cloud | Google integration, multimodal | $$ | Fast | 1M |
| Ollama | Local | Privacy, offline, free | Free | Medium | Varies |
| Groq | Cloud | Ultra-fast inference | $$ | Fastest | 32K |
| DeepSeek | Cloud | Code, reasoning | $ | Fast | 64K |
| HuggingFace | Cloud | Open-source models | $ | Medium | Varies |
| OpenRouter | Gateway | Access multiple models | Varies | Fast | Varies |
| Perplexity | Cloud | Research, citations | $$ | Fast | 8K |
| Cohere | Cloud | Embeddings, multilingual | $$ | Fast | 128K |
| Voyage | Cloud | State-of-the-art embeddings | $$ | Fast | N/A |

💡 Recommendations by Use Case

  • General Chatbot: OpenAI (GPT-4), Claude (Sonnet)

  • Long Documents: Claude (200K context), Gemini (1M context)

  • Code Generation: DeepSeek, OpenAI (GPT-4)

  • Fast Responses: Groq, Gemini

  • Privacy/Offline: Ollama (local)

  • Embeddings/RAG: Voyage, Cohere, OpenAI

  • Research: Perplexity (citations)

  • Cost-Effective: Ollama (free), DeepSeek, Gemini

  • Multimodal: Gemini, OpenAI (GPT-4)


🔧 Configuration Basics

All providers are configured in your boxlang.json file:
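
For example, a minimal setup could look like the sketch below. The nesting under `modules.bxai.settings` is an assumption about how the bx-ai module registers itself; verify the exact key against the module's README.

```json
{
  "modules": {
    "bxai": {
      "settings": {
        "provider": "openai",
        "apiKey": "your-api-key-here",
        "defaultParams": { "model": "gpt-4o" },
        "timeout": 30,
        "returnFormat": "single"
      }
    }
  }
}
```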

Configuration Options Reference

| Setting | Type | Default | Description |
| --- | --- | --- | --- |
| `provider` | string | `"openai"` | Default AI provider |
| `apiKey` | string | `""` | API key for the provider |
| `chatURL` | string | Auto | Custom API endpoint URL |
| `defaultParams` | struct | `{}` | Default parameters for requests |
| `timeout` | number | `30` | Request timeout in seconds |
| `returnFormat` | string | `"single"` | Default return format |


☁️ Cloud Providers

🟢 OpenAI (ChatGPT)

Best for: General purpose AI, content generation, code assistance

Get API Key: https://platform.openai.com/api-keys

Configuration:
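
A hedged sketch, reusing the assumed `modules.bxai.settings` layout from Configuration Basics:

```json
{
  "modules": {
    "bxai": {
      "settings": {
        "provider": "openai",
        "apiKey": "sk-your-openai-key",
        "defaultParams": { "model": "gpt-4o" }
      }
    }
  }
}
```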

Available Models:

| Model | Description | Context | Best For |
| --- | --- | --- | --- |
| `gpt-5` | Latest, most advanced | 128K | Everything |
| `gpt-4` | Most capable | 128K | Complex tasks, reasoning |
| `gpt-4-turbo` | Faster, cheaper | 128K | Production apps |
| `gpt-3.5-turbo` | Fast, affordable | 16K | Simple tasks, high volume |
| `gpt-4o` | Optimized for chat | 128K | Conversational AI |

Pricing (as of Dec 2024):

  • GPT-5: ~$30/1M tokens input, ~$60/1M tokens output

  • GPT-4: ~$10/1M tokens input, ~$30/1M tokens output

  • GPT-4-Turbo: ~$5/1M tokens input, ~$15/1M tokens output

  • GPT-3.5-Turbo: ~$0.50/1M tokens

Usage Example:
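
A minimal sketch using the module's `aiChat()` function. The exact signature may vary by module version, and the params struct here is illustrative:

```
// One-shot chat using the configured defaults
answer = aiChat( "Summarize BoxLang in one sentence" );
println( answer );

// Override parameters for a single request
haiku = aiChat( "Write a haiku about servers", {
	model      : "gpt-4o",
	temperature: 0.7
} );
println( haiku );
```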


🟣 Claude (Anthropic)

Best for: Long context analysis, detailed reasoning, safety-focused applications

Get API Key: https://console.anthropic.com/

Configuration:
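
Same assumed `settings` layout as the OpenAI example, switching the provider slug and model (only the relevant keys shown):

```json
"settings": {
  "provider": "claude",
  "apiKey": "sk-ant-your-key",
  "defaultParams": { "model": "claude-3-5-sonnet-20241022" }
}
```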

Available Models:

| Model | Description | Context | Best For |
| --- | --- | --- | --- |
| `claude-3-5-opus-20241022` | Most capable | 200K | Complex analysis |
| `claude-3-5-sonnet-20241022` | Balanced (recommended) | 200K | General use |
| `claude-3-5-haiku-20241022` | Fastest, cheapest | 200K | High volume |

Pricing:

  • Opus: ~$15/1M input, ~$75/1M output

  • Sonnet: ~$3/1M input, ~$15/1M output

  • Haiku: ~$0.25/1M input, ~$1.25/1M output

Special Features:

  • 200K context window - Entire books in one request

  • Constitutional AI - Enhanced safety and helpfulness

  • Vision support - Image analysis capabilities


🔵 Gemini (Google)

Best for: Google integration, multimodal content, massive context windows

Get API Key: https://makersuite.google.com/app/apikey

Configuration:
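
Same assumed `settings` layout (only the relevant keys shown):

```json
"settings": {
  "provider": "gemini",
  "apiKey": "your-gemini-key",
  "defaultParams": { "model": "gemini-2.0-flash" }
}
```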

Available Models:

| Model | Description | Context | Best For |
| --- | --- | --- | --- |
| `gemini-2.0-flash` | Fast, efficient (recommended) | 1M | General use |
| `gemini-1.5-pro` | Most capable | 2M | Complex tasks |
| `gemini-1.5-flash` | Fast, affordable | 1M | High volume |

Special Features:

  • 1-2M context window - Process entire codebases

  • Multimodal native - Text, images, audio, video

  • Free tier available - Great for development


🔸 Grok (xAI)

Best for: Real-time data, Twitter/X integration, conversational AI

Get API Key: https://console.x.ai/

Configuration:
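
Same assumed `settings` layout (provider slug assumed; only the relevant keys shown):

```json
"settings": {
  "provider": "grok",
  "apiKey": "xai-your-key"
}
```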


🤗 HuggingFace

Best for: Open-source models, community-driven, flexibility

Get API Key: https://huggingface.co/settings/tokens

Configuration:
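
Same assumed `settings` layout, using one of the popular models listed below:

```json
"settings": {
  "provider": "huggingface",
  "apiKey": "hf_your-token",
  "defaultParams": { "model": "Qwen/Qwen2.5-72B-Instruct" }
}
```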

Popular Models:

  • Qwen/Qwen2.5-72B-Instruct - Powerful general-purpose model

  • meta-llama/Llama-3.1-8B-Instruct - Meta's Llama model

  • mistralai/Mistral-7B-Instruct-v0.3 - Fast and efficient


⚡ Groq

Best for: Ultra-fast inference with LPU architecture

Get API Key: https://console.groq.com/

Configuration:
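
Same assumed `settings` layout (only the relevant keys shown):

```json
"settings": {
  "provider": "groq",
  "apiKey": "gsk_your-key"
}
```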

Special Features:

  • Fastest inference - Up to 500 tokens/second

  • LPU architecture - Hardware-optimized for AI

  • Free tier - Generous limits for testing


🔷 DeepSeek

Best for: Code generation, reasoning tasks, cost-effective

Get API Key: https://platform.deepseek.com/

Configuration:
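
Same assumed `settings` layout; `deepseek-chat` is DeepSeek's general chat model:

```json
"settings": {
  "provider": "deepseek",
  "apiKey": "sk-your-deepseek-key",
  "defaultParams": { "model": "deepseek-chat" }
}
```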


🟠 Mistral

Best for: European data residency, balanced performance/cost

Get API Key: https://console.mistral.ai/

Configuration:
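
Same assumed `settings` layout, using one of the models listed below:

```json
"settings": {
  "provider": "mistral",
  "apiKey": "your-mistral-key",
  "defaultParams": { "model": "mistral-small-latest" }
}
```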

Available Models:

  • mistral-large-latest - Most capable

  • mistral-medium-latest - Balanced

  • mistral-small-latest - Fast, cost-effective


🌐 OpenRouter (Multi-Model Gateway)

Best for: Access multiple models through one API, cost optimization

Get API Key: https://openrouter.ai/keys

Configuration:
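
Same assumed `settings` layout; note that OpenRouter model IDs use a `vendor/model` format:

```json
"settings": {
  "provider": "openrouter",
  "apiKey": "sk-or-your-key",
  "defaultParams": { "model": "openai/gpt-4o" }
}
```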

Special Features:

  • Access 100+ models through one API

  • Automatic fallback if model unavailable

  • Cost tracking across providers

  • Free models available


🔎 Perplexity

Best for: Research, factual accuracy, citations

Get API Key: https://www.perplexity.ai/settings/api

Configuration:
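
Same assumed `settings` layout (only the relevant keys shown):

```json
"settings": {
  "provider": "perplexity",
  "apiKey": "pplx-your-key"
}
```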

Special Features:

  • Real-time web search integration

  • Automatic source citations

  • Fact-checked responses


🧡 Cohere

Best for: Embeddings, multilingual support, RAG applications

Get API Key: https://dashboard.cohere.com/api-keys

Configuration:
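
Same assumed `settings` layout (only the relevant keys shown):

```json
"settings": {
  "provider": "cohere",
  "apiKey": "your-cohere-key"
}
```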

Special Features:

  • Best-in-class embeddings for RAG

  • Native multilingual support (100+ languages)

  • Tool use and structured output


🚀 Voyage

Best for: State-of-the-art embeddings optimized for RAG

Get API Key: https://dash.voyageai.com/

Configuration:
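
Same assumed `settings` layout (only the relevant keys shown):

```json
"settings": {
  "provider": "voyage",
  "apiKey": "your-voyage-key"
}
```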

Note: Voyage is embedding-only (use with aiEmbed() or vector memory)


🦙 Local AI with Ollama

Perfect for privacy, offline use, and zero API costs!

Why Ollama?

  • 100% Free - No API costs ever

  • Privacy - Data never leaves your machine

  • Offline - Works without internet

  • No Rate Limits - Use as much as you want

  • Fast - Low latency on local hardware

Installation Methods

Option 1: Native Installation

macOS:
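
Via Homebrew (the app download from https://ollama.ai also works):

```bash
brew install ollama
```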

Linux:
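
Using the official one-line installer:

```bash
curl -fsSL https://ollama.com/install.sh | sh
```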

Windows: Download installer from https://ollama.ai

Option 2: Docker

See Running Ollama with Docker in the installation guide.

Pull and Configure Models
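
Pull each model you plan to use before calling it, using names from the selection guide below:

```bash
ollama pull llama3.2     # general use
ollama pull codellama    # code generation
```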

BoxLang Configuration
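
A sketch in the same assumed `settings` layout; Ollama's default local endpoint is `http://localhost:11434`, though whether `chatURL` is needed at all may depend on the module version:

```json
"settings": {
  "provider": "ollama",
  "chatURL": "http://localhost:11434",
  "defaultParams": { "model": "llama3.2" }
}
```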

Note: Ollama doesn't require an API key for local use.

Verify Installation
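
A quick smoke test from the shell:

```bash
curl http://localhost:11434   # should answer "Ollama is running"
ollama list                   # shows the models you have pulled
```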

Model Selection Guide

| Model | Size | Speed | Quality | Use Case |
| --- | --- | --- | --- | --- |
| `llama3.2:1b` | 1GB | ⚡⚡⚡ | ⭐⭐ | Quick responses, testing |
| `llama3.2` | 3GB | ⚡⚡ | ⭐⭐⭐ | General use (recommended) |
| `phi3` | 2GB | ⚡⚡⚡ | ⭐⭐⭐ | Balanced quality/speed |
| `mistral` | 4GB | ⚡⚡ | ⭐⭐⭐⭐ | High quality responses |
| `codellama` | 4GB | ⚡⚡ | ⭐⭐⭐⭐ | Code generation |
| `qwen2.5:7b` | 5GB | ⚡ | ⭐⭐⭐⭐⭐ | Best quality (slower) |

Hardware Requirements

  • Minimum: 8GB RAM, 4GB disk space

  • Recommended: 16GB RAM, 10GB disk space

  • Optimal: 32GB RAM, GPU (NVIDIA/AMD)


🔐 Environment Variables

Use environment variables to keep API keys out of config files:

In boxlang.json
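
A sketch assuming `boxlang.json` supports `${...}` placeholder substitution from the environment (verify against the BoxLang configuration docs):

```json
"settings": {
  "provider": "openai",
  "apiKey": "${OPENAI_API_KEY}"
}
```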

Set Environment Variables

macOS/Linux:
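
For example:

```bash
# Current session; add to ~/.zshrc or ~/.bashrc to persist
export OPENAI_API_KEY="sk-your-key-here"
```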

Windows:
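
For example:

```batch
:: Current cmd session only
set OPENAI_API_KEY=sk-your-key-here

:: Persisted for the user (takes effect in new sessions)
setx OPENAI_API_KEY "sk-your-key-here"
```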

Auto-Detection

Convention: By default, BoxLang AI automatically detects environment variables following the pattern {PROVIDER_NAME}_API_KEY. This means you don't need to explicitly configure API keys in boxlang.json if you set the appropriate environment variable.

For example:

  • Setting OPENAI_API_KEY allows you to use OpenAI without configuration

  • Setting CLAUDE_API_KEY allows you to use Claude without configuration

  • And so on for all providers

Automatically detected environment variables:

  • OPENAI_API_KEY

  • CLAUDE_API_KEY

  • ANTHROPIC_API_KEY (alternative for Claude)

  • GEMINI_API_KEY

  • GOOGLE_API_KEY (alternative for Gemini)

  • GROQ_API_KEY

  • DEEPSEEK_API_KEY

  • HUGGINGFACE_API_KEY

  • HF_TOKEN (alternative for HuggingFace)

  • MISTRAL_API_KEY

  • PERPLEXITY_API_KEY

  • COHERE_API_KEY

  • VOYAGE_API_KEY


🔄 Multiple Providers

Use different providers for different tasks:
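
One possible pattern, assuming `aiChat()` accepts a per-request options struct that can override the provider (argument position and key name may differ in your module version):

```
// Fast, cheap call routed to Groq (hypothetical provider override)
quick = aiChat( "Give me a one-line status summary", {}, { provider: "groq" } );

// Long-form analysis routed to Claude
deep = aiChat( "Analyze the key risks in this contract...", {}, { provider: "claude" } );
```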

Provider Services

Create reusable service instances:
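
A sketch assuming an `aiService()`-style factory exists; the function name and usage below are illustrative, not confirmed API:

```
// Hypothetical: build pre-configured service objects once, reuse them everywhere
chatAI = aiService( "openai" );
codeAI = aiService( "deepseek" );
```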


🔧 Troubleshooting

❌ "No API key provided"

Solution: Set API key in config or pass directly
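
For example, reading the key from the environment at call time (the options key and `server.system.environment` path are assumptions):

```
// Hypothetical per-call API key override
answer = aiChat( "Hello", {}, {
	apiKey: server.system.environment[ "OPENAI_API_KEY" ]
} );
```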

⏱️ "Connection timeout"

Solution: Increase timeout setting
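
In the assumed `settings` block (value is in seconds):

```json
"settings": {
  "timeout": 120
}
```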

🔌 "Connection refused" (Ollama)

Check if Ollama is running:
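
For example:

```bash
curl http://localhost:11434   # should answer "Ollama is running"
ollama serve                  # start the server if it is not
```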

🚫 "Model not found"

Solution: Pull the model first (Ollama)
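
For example:

```bash
ollama pull llama3.2
ollama list   # confirm the model is installed
```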

💰 "Rate limit exceeded"

Solutions:

  • Upgrade to paid tier

  • Implement request caching

  • Use different provider for high-volume tasks

  • Switch to Ollama (no limits)

🔑 "Invalid API key"

Verify:

  • Key is complete and not truncated

  • Key is for correct provider

  • Key has not expired

  • Account has credits/subscription


💡 Tips for Production

  1. Use environment variables for API keys (never commit to git)

  2. Set appropriate timeouts based on your use case

  3. Implement retry logic for transient errors

  4. Monitor costs with provider dashboards

  5. Use Ollama for development/testing to save costs

  6. Cache responses when possible to reduce API calls

  7. Choose the right model for each task (don't always use the most expensive)

  8. Rate limit your application to avoid provider rate limits
