🧬 Embeddings
Generate numerical vector representations of text that capture semantic meaning. Embeddings power semantic search, recommendations, clustering, and similarity detection.
🎯 What are Embeddings?
Embeddings convert text into high-dimensional vectors (arrays of numbers) where semantically similar texts have similar vector representations.
🏗️ Embedding Architecture
Example:
"cat" → [0.2, -0.5, 0.8, 0.1, ...] (1536 dimensions)
"kitten" → [0.3, -0.4, 0.7, 0.2, ...] (similar vector)
"car" → [-0.6, 0.3, -0.2, 0.9, ...] (different vector)Key Properties:
Similar meanings = Close vectors
Different meanings = Distant vectors
Math operations preserve semantic relationships
Dimension count varies by model (typically 768-3072)
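"Closeness" between vectors is usually measured with cosine similarity, which ranges from -1 (opposite meaning) to 1 (identical direction). A self-contained sketch using toy 4-dimensional vectors in place of real embeddings:

```javascript
// Cosine similarity: dot product divided by the product of magnitudes.
// Returns a value in [-1, 1]; higher means more semantically similar.
function cosineSimilarity(a, b) {
  let dot = 0, magA = 0, magB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    magA += a[i] * a[i];
    magB += b[i] * b[i];
  }
  return dot / (Math.sqrt(magA) * Math.sqrt(magB));
}

// Toy 4-d vectors (real embeddings have hundreds or thousands of dimensions)
const cat = [0.2, -0.5, 0.8, 0.1];
const kitten = [0.3, -0.4, 0.7, 0.2];
const car = [-0.6, 0.3, -0.2, 0.9];

console.log(cosineSimilarity(cat, kitten) > cosineSimilarity(cat, car)); // true
```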
🔧 The aiEmbed() Function
Generate embeddings for single texts or batches.
🔄 Embedding Generation Flow
Basic Usage
Single Text
Batch Processing
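The original code samples are not included here, so the following is an illustrative sketch only — the exact `aiEmbed()` signature should be checked against the API reference:

```javascript
// Hypothetical usage — verify against the aiEmbed() reference.

// Single text: pass one string
const single = await aiEmbed("What is machine learning?");

// Batch: pass an array of strings in one call (fewer round trips, lower cost)
const batch = await aiEmbed([
  "First document",
  "Second document",
  "Third document",
]);
```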
Configuration Options
Provider Selection
Model Selection
Return Formats
Control what data is returned:
Raw Response (Default)
Full API response with metadata:
Embeddings Array
Get just the vectors:
First Vector
Get single vector for single input:
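As a hedged sketch of the three formats described above — the option name `returnFormat` is an assumption for illustration, not the library's confirmed parameter:

```javascript
// Hypothetical — the "returnFormat" option name is assumed, not confirmed.
const raw = await aiEmbed("hello"); // full API response with metadata (default)
const vectors = await aiEmbed(["a", "b"], { returnFormat: "embeddings" }); // just the arrays
const vector = await aiEmbed("hello", { returnFormat: "first" }); // one vector for one input
```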
💡 Use Cases
🔍 Semantic Search
Find documents similar to a query using vector similarity:
Semantic Search Flow
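The ranking step can be sketched self-contained, with toy 3-dimensional vectors standing in for real embeddings (which you would generate once per document and once per query):

```javascript
// Rank documents by cosine similarity between the query vector and each
// precomputed document vector, returning the top-k matches.
function cosineSimilarity(a, b) {
  let dot = 0, magA = 0, magB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    magA += a[i] * a[i];
    magB += b[i] * b[i];
  }
  return dot / (Math.sqrt(magA) * Math.sqrt(magB));
}

function search(queryVector, docs, topK = 3) {
  return docs
    .map((d) => ({ ...d, score: cosineSimilarity(queryVector, d.vector) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}

const docs = [
  { text: "How to train a neural network", vector: [0.9, 0.1, 0.0] },
  { text: "Best pasta recipes",            vector: [0.0, 0.9, 0.1] },
  { text: "Intro to deep learning",        vector: [0.8, 0.2, 0.1] },
];

const results = search([1, 0, 0], docs, 2);
console.log(results.map((r) => r.text));
// the two machine-learning documents rank above the recipe
```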
Text Clustering
Group similar texts together:
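One simple approach, assuming vectors are already generated, is greedy threshold clustering: put each item in the first cluster it is similar enough to, otherwise start a new one. (Real pipelines often use k-means instead.)

```javascript
// Greedy threshold clustering over embedding vectors.
function cosineSimilarity(a, b) {
  let dot = 0, magA = 0, magB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    magA += a[i] * a[i];
    magB += b[i] * b[i];
  }
  return dot / (Math.sqrt(magA) * Math.sqrt(magB));
}

function cluster(items, threshold = 0.8) {
  const clusters = [];
  for (const item of items) {
    // Compare against the first member of each cluster as a cheap centroid
    const home = clusters.find(
      (c) => cosineSimilarity(c[0].vector, item.vector) >= threshold
    );
    if (home) home.push(item);
    else clusters.push([item]);
  }
  return clusters;
}

const items = [
  { text: "dog",          vector: [1, 0] },
  { text: "puppy",        vector: [0.95, 0.05] },
  { text: "stock market", vector: [0, 1] },
];

const groups = cluster(items);
console.log(groups.length); // 2
```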
Recommendations
Recommend items based on similarity:
Duplicate Detection
Find duplicate or near-duplicate content:
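With precomputed vectors, near-duplicates are simply pairs whose similarity exceeds a high threshold. Note the pairwise scan below is O(n²); large corpora need an approximate-nearest-neighbor index.

```javascript
// Flag pairs of texts whose embedding vectors are nearly identical.
function cosineSimilarity(a, b) {
  let dot = 0, magA = 0, magB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    magA += a[i] * a[i];
    magB += b[i] * b[i];
  }
  return dot / (Math.sqrt(magA) * Math.sqrt(magB));
}

function findNearDuplicates(items, threshold = 0.95) {
  const pairs = [];
  for (let i = 0; i < items.length; i++) {
    for (let j = i + 1; j < items.length; j++) {
      if (cosineSimilarity(items[i].vector, items[j].vector) >= threshold) {
        pairs.push([items[i].text, items[j].text]);
      }
    }
  }
  return pairs;
}

const items = [
  { text: "Free shipping on all orders", vector: [0.7, 0.7, 0.1] },
  { text: "All orders ship for free",    vector: [0.69, 0.71, 0.12] },
  { text: "Quarterly earnings report",   vector: [0.1, 0.2, 0.97] },
];

console.log(findNearDuplicates(items).length); // 1
```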
RAG (Retrieval Augmented Generation)
Combine embeddings with AI chat for intelligent Q&A:
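The sample code for this flow is not shown here, so the sketch below is illustrative only — `aiChat()` and `rankBySimilarity()` are hypothetical stand-ins for a chat call and a cosine-similarity retrieval step over pre-embedded documents:

```javascript
// Hypothetical RAG outline — function names are illustrative, not confirmed API.
async function answerWithRag(question, docs) {
  const queryVector = await aiEmbed(question);        // 1. embed the question
  const top = rankBySimilarity(queryVector, docs, 3); // 2. retrieve similar docs
  const context = top.map((d) => d.text).join("\n\n");
  return aiChat(                                      // 3. answer grounded in context
    `Answer using only the context below.\n\nContext:\n${context}\n\nQuestion: ${question}`
  );
}
```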
Advanced Techniques
Dimension Reduction
Some models support dimension reduction for faster processing:
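As a hedged sketch — whether `aiEmbed()` forwards a `dimensions` option is an assumption here, though OpenAI's text-embedding-3 models do accept such a parameter:

```javascript
// Hypothetical — the "dimensions" option name is assumed.
// Truncating vectors (e.g. 1536 → 256) gives faster similarity math and
// smaller storage, at some cost in quality.
const compact = await aiEmbed("some text", {
  model: "text-embedding-3-small",
  dimensions: 256,
});
```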
Caching Embeddings
Embeddings are expensive - cache them:
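A minimal in-memory cache wrapper, demonstrated with a stub embedder (a real call would go through the embedding API). For persistence across runs, swap the Map for a file or database keyed by a hash of the text.

```javascript
// Wrap any async embed function so the same text is only embedded once.
function withCache(embedFn) {
  const cache = new Map();
  return async (text) => {
    if (!cache.has(text)) cache.set(text, await embedFn(text));
    return cache.get(text);
  };
}

// Demo with a stub embedder that counts how often it is actually called
let calls = 0;
const embed = withCache(async (text) => {
  calls += 1;
  return [text.length, 0]; // fake vector
});

embed("hello")
  .then(() => embed("hello")) // second call is served from the cache
  .then(() => console.log(calls)); // 1
```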
Batch Optimization
Process large datasets efficiently:
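The core of batch optimization is splitting the dataset into fixed-size batches so each request stays under the provider's size limits; process batches sequentially (adding a delay if needed) to respect rate limits. A self-contained helper:

```javascript
// Split a large list into fixed-size batches for batched embedding calls.
function toBatches(items, size) {
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

console.log(toBatches([1, 2, 3, 4, 5], 2)); // [[1, 2], [3, 4], [5]]
```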
Chunked Document Embeddings
Embed large documents by chunks:
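A simple character-based splitter with overlap; the overlap keeps sentences that straddle a chunk boundary retrievable from both chunks. Production systems often split on sentences or tokens instead.

```javascript
// Split a long document into overlapping character chunks, each of which
// is embedded separately.
function chunkText(text, chunkSize = 500, overlap = 50) {
  const chunks = [];
  let start = 0;
  while (start < text.length) {
    chunks.push(text.slice(start, start + chunkSize));
    start += chunkSize - overlap;
  }
  return chunks;
}

const chunks = chunkText("a".repeat(1200), 500, 50);
console.log(chunks.length); // 3
```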
Provider Comparison
OpenAI
Models:
text-embedding-3-small (1536 dimensions) - Default, balanced
text-embedding-3-large (3072 dimensions) - Highest quality
text-embedding-ada-002 (1536 dimensions) - Legacy
Pros:
High quality embeddings
Good for English text
Supports dimension reduction
Cons:
Requires API key
Costs money
Data sent to OpenAI servers
Usage:
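The usage sample is not included here; as an illustrative sketch only (the `provider`/`model` option names are assumptions, not the confirmed signature):

```javascript
// Hypothetical — verify option names against the aiEmbed() reference.
const res = await aiEmbed("some text", {
  provider: "openai",
  model: "text-embedding-3-small", // 1536 dimensions
});
```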
Ollama
Models:
nomic-embed-text (768 dimensions) - Recommended
mxbai-embed-large (1024 dimensions) - High quality
Many others available
Pros:
Completely free
Runs locally
Private - data stays on your machine
No API key needed
Cons:
Requires Ollama installation
Slightly lower quality than OpenAI
Slower than API calls
Setup:
Usage:
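After installing Ollama and pulling an embedding model (e.g. `ollama pull nomic-embed-text`), usage might look like the following — an illustrative sketch only, with assumed option names:

```javascript
// Hypothetical — requires Ollama running locally; option names are assumed.
const res = await aiEmbed("some text", {
  provider: "ollama",
  model: "nomic-embed-text", // 768 dimensions, data stays on your machine
});
```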
Gemini
Models:
text-embedding-004 (768 dimensions)
embedding-001 (768 dimensions) - Legacy
Pros:
Good quality
Google infrastructure
Competitive pricing
Cons:
Requires API key
Data sent to Google
Usage:
Voyage AI
Models:
voyage-3 (1024 dimensions) - Latest, highest quality
voyage-3-lite (512 dimensions) - Faster, more efficient
voyage-code-3 (1024 dimensions) - Optimized for code
voyage-finance-2 (1024 dimensions) - Financial documents
voyage-law-2 (1024 dimensions) - Legal documents
Pros:
State-of-the-art quality for RAG and semantic search
Specialized models for specific domains
input_type parameter optimizes for queries vs documents
Excellent performance on retrieval benchmarks
Cons:
Embeddings only (no chat support)
Requires API key
Free tier has 3 RPM rate limit
Data sent to Voyage servers
Setup:
Usage:
When to Use Voyage:
Building RAG (Retrieval Augmented Generation) systems
Semantic search requiring highest accuracy
Domain-specific applications (code, finance, legal)
When you can optimize queries vs documents separately
Note: Voyage specializes in embeddings only. For chat completions, use OpenAI, Claude, or another provider.
Cohere
Models:
embed-english-v3.0 (1024 dimensions) - Latest English model, best quality
embed-multilingual-v3.0 (1024 dimensions) - Supports 100+ languages
embed-english-light-v3.0 (384 dimensions) - Faster, lighter version
embed-english-v2.0 (4096 dimensions) - Legacy, larger model
Pros:
Excellent multilingual support (100+ languages)
input_type parameter optimizes for different use cases
Multiple model sizes for speed/quality tradeoffs
Also offers chat capabilities
Good documentation and examples
Competitive pricing
Cons:
Requires API key
Data sent to Cohere servers
Rate limits on free tier
Setup:
Usage:
Input Types:
search_query - Optimize for search queries
search_document - Optimize for documents being searched
clustering - Optimize for clustering tasks
classification - Optimize for classification tasks
When to Use Cohere:
Need multilingual embeddings (100+ languages)
Want to optimize separately for queries vs documents
Building search, clustering, or classification systems
Need both embeddings and chat in one provider
Want multiple model size options
Best Practices
1. Choose the Right Model
2. Batch When Possible
3. Cache Embeddings
4. Normalize Text
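Normalizing text before embedding keeps trivially different strings ("Hello  World " vs "hello world") from producing different cache keys or slightly different vectors. A minimal normalizer:

```javascript
// Trim, collapse internal whitespace, and lowercase before embedding.
function normalize(text) {
  return text.trim().replace(/\s+/g, " ").toLowerCase();
}

console.log(normalize("  Hello   World \n")); // "hello world"
```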
5. Handle Errors Gracefully
6. Monitor Costs
Troubleshooting
Empty or Invalid Embeddings
Dimension Mismatches
Similarity Scores
Summary
Embeddings enable powerful semantic understanding:
Generate: Use aiEmbed() for single or batch processing
Search: Compare vectors with cosine similarity
Optimize: Cache embeddings, batch requests, choose right model
Apply: Semantic search, clustering, recommendations, RAG
Start with OpenAI for quality, try Ollama for privacy and cost savings!