Provider Setup & Configuration

Comprehensive guide to configuring AI providers in BoxLang AI - from API keys to local deployment with Ollama.

This guide covers detailed setup instructions for all supported AI providers, helping you choose the right provider and configure it properly for your use case.

🎯 Quick Provider Comparison

| Provider | Type | Best For | Cost | Speed | Context |
| --- | --- | --- | --- | --- | --- |
| OpenAI | Cloud | General purpose, GPT-5 | $$$ | Fast | 128K |
| Claude | Cloud | Long context, analysis | $$$ | Fast | 200K |
| Gemini | Cloud | Google integration, multimodal | $$ | Fast | 1M |
| Ollama | Local | Privacy, offline, free | Free | Medium | Varies |
| Groq | Cloud | Ultra-fast inference | $$ | Fastest | 32K |
| DeepSeek | Cloud | Code, reasoning | $ | Fast | 64K |
| HuggingFace | Cloud | Open-source models | $ | Medium | Varies |
| OpenRouter | Gateway | Access multiple models | Varies | Fast | Varies |
| Perplexity | Cloud | Research, citations | $$ | Fast | 8K |
| Cohere | Cloud | Embeddings, multilingual | $$ | Fast | 128K |
| Voyage | Cloud | State-of-the-art embeddings | $$ | Fast | N/A |

💡 Recommendations by Use Case

  • General Chatbot: OpenAI (GPT-4), Claude (Sonnet)

  • Long Documents: Claude (200K context), Gemini (1M context)

  • Code Generation: DeepSeek, OpenAI (GPT-4)

  • Fast Responses: Groq, Gemini

  • Privacy/Offline: Ollama (local)

  • Embeddings/RAG: Voyage, Cohere, OpenAI

  • Research: Perplexity (citations)

  • Cost-Effective: Ollama (free), DeepSeek, Gemini

  • Multimodal: Gemini, OpenAI (GPT-4)


🔧 Configuration Basics

All providers are configured in your boxlang.json file:
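
For example, a minimal setup could look like the sketch below. The nesting under `modules.bxai.settings` is an assumption about how the bx-ai module registers itself; verify the exact key against the module's README.

```json
{
  "modules": {
    "bxai": {
      "settings": {
        "provider": "openai",
        "apiKey": "your-api-key-here",
        "defaultParams": { "model": "gpt-4o" },
        "timeout": 30,
        "returnFormat": "single"
      }
    }
  }
}
```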

Configuration Options Reference

| Setting | Type | Default | Description |
| --- | --- | --- | --- |
| `provider` | string | `"openai"` | Default AI provider |
| `apiKey` | string | `""` | API key for the provider |
| `chatURL` | string | Auto | Custom API endpoint URL |
| `defaultParams` | struct | `{}` | Default parameters for requests |
| `timeout` | number | `30` | Request timeout in seconds |
| `returnFormat` | string | `"single"` | Default return format |


☁️ Cloud Providers

🟢 OpenAI (ChatGPT)

Best for: General purpose AI, content generation, code assistance

Get API Key: https://platform.openai.com/api-keys

Configuration:
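
A hedged sketch, reusing the assumed `modules.bxai.settings` layout from Configuration Basics:

```json
{
  "modules": {
    "bxai": {
      "settings": {
        "provider": "openai",
        "apiKey": "sk-your-openai-key",
        "defaultParams": { "model": "gpt-4o" }
      }
    }
  }
}
```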

Available Models:

| Model | Description | Context | Best For |
| --- | --- | --- | --- |
| `gpt-5` | Latest, most advanced | 128K | Everything |
| `gpt-4` | Most capable | 128K | Complex tasks, reasoning |
| `gpt-4-turbo` | Faster, cheaper | 128K | Production apps |
| `gpt-3.5-turbo` | Fast, affordable | 16K | Simple tasks, high volume |
| `gpt-4o` | Optimized for chat | 128K | Conversational AI |

Pricing (as of Dec 2024):

  • GPT-5: ~$30/1M tokens input, ~$60/1M tokens output

  • GPT-4: ~$10/1M tokens input, ~$30/1M tokens output

  • GPT-4-Turbo: ~$5/1M tokens input, ~$15/1M tokens output

  • GPT-3.5-Turbo: ~$0.50/1M tokens

Usage Example:
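
A minimal sketch using the module's `aiChat()` function. The exact signature may vary by module version, and the params struct here is illustrative:

```
// One-shot chat using the configured defaults
answer = aiChat( "Summarize BoxLang in one sentence" );
println( answer );

// Override parameters for a single request
haiku = aiChat( "Write a haiku about servers", {
	model      : "gpt-4o",
	temperature: 0.7
} );
println( haiku );
```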


🟣 Claude (Anthropic)

Best for: Long context analysis, detailed reasoning, safety-focused applications

Get API Key: https://console.anthropic.com/

Configuration:
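
Same assumed `settings` layout as the OpenAI example, switching the provider slug and model (only the relevant keys shown):

```json
"settings": {
  "provider": "claude",
  "apiKey": "sk-ant-your-key",
  "defaultParams": { "model": "claude-3-5-sonnet-20241022" }
}
```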

Available Models:

| Model | Description | Context | Best For |
| --- | --- | --- | --- |
| `claude-3-5-opus-20241022` | Most capable | 200K | Complex analysis |
| `claude-3-5-sonnet-20241022` | Balanced (recommended) | 200K | General use |
| `claude-3-5-haiku-20241022` | Fastest, cheapest | 200K | High volume |

Pricing:

  • Opus: ~$15/1M input, ~$75/1M output

  • Sonnet: ~$3/1M input, ~$15/1M output

  • Haiku: ~$0.25/1M input, ~$1.25/1M output

Special Features:

  • 200K context window - Entire books in one request

  • Constitutional AI - Enhanced safety and helpfulness

  • Vision support - Image analysis capabilities


🔵 Gemini (Google)

Best for: Google integration, multimodal content, massive context windows

Get API Key: https://makersuite.google.com/app/apikey

Configuration:
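
Same assumed `settings` layout (only the relevant keys shown):

```json
"settings": {
  "provider": "gemini",
  "apiKey": "your-gemini-key",
  "defaultParams": { "model": "gemini-2.0-flash" }
}
```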

Available Models:

| Model | Description | Context | Best For |
| --- | --- | --- | --- |
| `gemini-2.0-flash` | Fast, efficient (recommended) | 1M | General use |
| `gemini-1.5-pro` | Most capable | 2M | Complex tasks |
| `gemini-1.5-flash` | Fast, affordable | 1M | High volume |

Special Features:

  • 1-2M context window - Process entire codebases

  • Multimodal native - Text, images, audio, video

  • Free tier available - Great for development


🔸 Grok (xAI)

Best for: Real-time data, Twitter/X integration, conversational AI

Get API Key: https://console.x.ai/

Configuration:
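
Same assumed `settings` layout (provider slug assumed; only the relevant keys shown):

```json
"settings": {
  "provider": "grok",
  "apiKey": "xai-your-key"
}
```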


🤗 HuggingFace

Best for: Open-source models, community-driven, flexibility

Get API Key: https://huggingface.co/settings/tokens

Configuration:
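
Same assumed `settings` layout, using one of the popular models listed below:

```json
"settings": {
  "provider": "huggingface",
  "apiKey": "hf_your-token",
  "defaultParams": { "model": "Qwen/Qwen2.5-72B-Instruct" }
}
```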

Popular Models:

  • Qwen/Qwen2.5-72B-Instruct - Powerful general-purpose model

  • meta-llama/Llama-3.1-8B-Instruct - Meta's Llama model

  • mistralai/Mistral-7B-Instruct-v0.3 - Fast and efficient


⚡ Groq

Best for: Ultra-fast inference with LPU architecture

Get API Key: https://console.groq.com/

Configuration:
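
Same assumed `settings` layout (only the relevant keys shown):

```json
"settings": {
  "provider": "groq",
  "apiKey": "gsk_your-key"
}
```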

Special Features:

  • Fastest inference - Up to 500 tokens/second

  • LPU architecture - Hardware-optimized for AI

  • Free tier - Generous limits for testing


🔷 DeepSeek

Best for: Code generation, reasoning tasks, cost-effective

Get API Key: https://platform.deepseek.com/

Configuration:
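
Same assumed `settings` layout; `deepseek-chat` is DeepSeek's general chat model:

```json
"settings": {
  "provider": "deepseek",
  "apiKey": "sk-your-deepseek-key",
  "defaultParams": { "model": "deepseek-chat" }
}
```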


🟠 Mistral

Best for: European data residency, balanced performance/cost

Get API Key: https://console.mistral.ai/

Configuration:
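
Same assumed `settings` layout, using one of the models listed below:

```json
"settings": {
  "provider": "mistral",
  "apiKey": "your-mistral-key",
  "defaultParams": { "model": "mistral-small-latest" }
}
```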

Available Models:

  • mistral-large-latest - Most capable

  • mistral-medium-latest - Balanced

  • mistral-small-latest - Fast, cost-effective


🌐 OpenRouter (Multi-Model Gateway)

Best for: Access multiple models through one API, cost optimization

Get API Key: https://openrouter.ai/keys

Configuration:
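
Same assumed `settings` layout; note that OpenRouter model IDs use a `vendor/model` format:

```json
"settings": {
  "provider": "openrouter",
  "apiKey": "sk-or-your-key",
  "defaultParams": { "model": "openai/gpt-4o" }
}
```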

Special Features:

  • Access 100+ models through one API

  • Automatic fallback if model unavailable

  • Cost tracking across providers

  • Free models available


🔎 Perplexity

Best for: Research, factual accuracy, citations

Get API Key: https://www.perplexity.ai/settings/api

Configuration:
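
Same assumed `settings` layout (only the relevant keys shown):

```json
"settings": {
  "provider": "perplexity",
  "apiKey": "pplx-your-key"
}
```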

Special Features:

  • Real-time web search integration

  • Automatic source citations

  • Fact-checked responses


🧡 Cohere

Best for: Embeddings, multilingual support, RAG applications

Get API Key: https://dashboard.cohere.com/api-keys

Configuration:
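
Same assumed `settings` layout (only the relevant keys shown):

```json
"settings": {
  "provider": "cohere",
  "apiKey": "your-cohere-key"
}
```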

Special Features:

  • Best-in-class embeddings for RAG

  • Native multilingual support (100+ languages)

  • Tool use and structured output


🚀 Voyage

Best for: State-of-the-art embeddings optimized for RAG

Get API Key: https://dash.voyageai.com/

Configuration:
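
Same assumed `settings` layout (only the relevant keys shown):

```json
"settings": {
  "provider": "voyage",
  "apiKey": "your-voyage-key"
}
```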

Note: Voyage is embedding-only (use with aiEmbed() or vector memory)


🦙 Local AI with Ollama

Perfect for privacy, offline use, and zero API costs!

Why Ollama?

  • 100% Free - No API costs ever

  • Privacy - Data never leaves your machine

  • Offline - Works without internet

  • No Rate Limits - Use as much as you want

  • Fast - Low latency on local hardware

Installation Methods

Option 1: Native Installation

macOS:
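
Via Homebrew (the app download from https://ollama.ai also works):

```bash
brew install ollama
```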

Linux:
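
Using the official one-line installer:

```bash
curl -fsSL https://ollama.com/install.sh | sh
```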

Windows: Download installer from https://ollama.ai

Option 2: Docker

See Running Ollama with Docker in the installation guide.

Pull and Configure Models
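
Pull each model you plan to use before calling it, using names from the selection guide below:

```bash
ollama pull llama3.2     # general use
ollama pull codellama    # code generation
```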

BoxLang Configuration
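
A sketch in the same assumed `settings` layout; Ollama's default local endpoint is `http://localhost:11434`, though whether `chatURL` is needed at all may depend on the module version:

```json
"settings": {
  "provider": "ollama",
  "chatURL": "http://localhost:11434",
  "defaultParams": { "model": "llama3.2" }
}
```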

Note: Ollama doesn't require an API key for local use.

Verify Installation
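
A quick smoke test from the shell:

```bash
curl http://localhost:11434   # should answer "Ollama is running"
ollama list                   # shows the models you have pulled
```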

Model Selection Guide

| Model | Size | Speed | Quality | Use Case |
| --- | --- | --- | --- | --- |
| `llama3.2:1b` | 1GB | ⚡⚡⚡ | ⭐⭐ | Quick responses, testing |
| `llama3.2` | 3GB | ⚡⚡ | ⭐⭐⭐ | General use (recommended) |
| `phi3` | 2GB | ⚡⚡⚡ | ⭐⭐⭐ | Balanced quality/speed |
| `mistral` | 4GB | ⚡⚡ | ⭐⭐⭐⭐ | High quality responses |
| `codellama` | 4GB | ⚡⚡ | ⭐⭐⭐⭐ | Code generation |
| `qwen2.5:7b` | 5GB | ⚡ | ⭐⭐⭐⭐⭐ | Best quality (slower) |

Hardware Requirements

  • Minimum: 8GB RAM, 4GB disk space

  • Recommended: 16GB RAM, 10GB disk space

  • Optimal: 32GB RAM, GPU (NVIDIA/AMD)


🔐 Environment Variables

Use environment variables to keep API keys out of config files:

In boxlang.json
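
A sketch assuming `boxlang.json` supports `${...}` placeholder substitution from the environment (verify against the BoxLang configuration docs):

```json
"settings": {
  "provider": "openai",
  "apiKey": "${OPENAI_API_KEY}"
}
```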

Set Environment Variables

macOS/Linux:
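
For example:

```bash
# Current session; add to ~/.zshrc or ~/.bashrc to persist
export OPENAI_API_KEY="sk-your-key-here"
```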

Windows:
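
For example:

```batch
:: Current cmd session only
set OPENAI_API_KEY=sk-your-key-here

:: Persisted for the user (takes effect in new sessions)
setx OPENAI_API_KEY "sk-your-key-here"
```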

Auto-Detection

Convention: By default, BoxLang AI automatically detects environment variables following the pattern {PROVIDER_NAME}_API_KEY. This means you don't need to explicitly configure API keys in boxlang.json if you set the appropriate environment variable.

For example:

  • Setting OPENAI_API_KEY allows you to use OpenAI without configuration

  • Setting CLAUDE_API_KEY allows you to use Claude without configuration

  • And so on for all providers

Automatically detected environment variables:

  • OPENAI_API_KEY

  • CLAUDE_API_KEY

  • ANTHROPIC_API_KEY (alternative for Claude)

  • GEMINI_API_KEY

  • GOOGLE_API_KEY (alternative for Gemini)

  • GROQ_API_KEY

  • DEEPSEEK_API_KEY

  • HUGGINGFACE_API_KEY

  • HF_TOKEN (alternative for HuggingFace)

  • MISTRAL_API_KEY

  • PERPLEXITY_API_KEY

  • COHERE_API_KEY

  • VOYAGE_API_KEY


🔄 Multiple Providers

Use different providers for different tasks:
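
One possible pattern, assuming `aiChat()` accepts a per-request options struct that can override the provider (argument position and key name may differ in your module version):

```
// Fast, cheap call routed to Groq (hypothetical provider override)
quick = aiChat( "Give me a one-line status summary", {}, { provider: "groq" } );

// Long-form analysis routed to Claude
deep = aiChat( "Analyze the key risks in this contract...", {}, { provider: "claude" } );
```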

Provider Services

Create reusable service instances:
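
A sketch assuming an `aiService()`-style factory exists; the function name and usage below are illustrative, not confirmed API:

```
// Hypothetical: build pre-configured service objects once, reuse them everywhere
chatAI = aiService( "openai" );
codeAI = aiService( "deepseek" );
```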


🔧 Troubleshooting

❌ "No API key provided"

Solution: Set API key in config or pass directly
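
For example, reading the key from the environment at call time (the options key and `server.system.environment` path are assumptions):

```
// Hypothetical per-call API key override
answer = aiChat( "Hello", {}, {
	apiKey: server.system.environment[ "OPENAI_API_KEY" ]
} );
```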

⏱️ "Connection timeout"

Solution: Increase timeout setting
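
In the assumed `settings` block (value is in seconds):

```json
"settings": {
  "timeout": 120
}
```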

🔌 "Connection refused" (Ollama)

Check if Ollama is running:
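
For example:

```bash
curl http://localhost:11434   # should answer "Ollama is running"
ollama serve                  # start the server if it is not
```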

🚫 "Model not found"

Solution: Pull the model first (Ollama)
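
For example:

```bash
ollama pull llama3.2
ollama list   # confirm the model is installed
```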

💰 "Rate limit exceeded"

Solutions:

  • Upgrade to paid tier

  • Implement request caching

  • Use different provider for high-volume tasks

  • Switch to Ollama (no limits)

🔑 "Invalid API key"

Verify:

  • Key is complete and not truncated

  • Key is for correct provider

  • Key has not expired

  • Account has credits/subscription


💡 Tips for Production

  1. Use environment variables for API keys (never commit to git)

  2. Set appropriate timeouts based on your use case

  3. Implement retry logic for transient errors

  4. Monitor costs with provider dashboards

  5. Use Ollama for development/testing to save costs

  6. Cache responses when possible to reduce API calls

  7. Choose the right model for each task (don't always use the most expensive)

  8. Rate limit your application to avoid provider rate limits
