Vector Memory Systems

Comprehensive guide to vector memory systems for semantic search and retrieval in BoxLang AI applications.

Vector memory enables semantic search and retrieval using embeddings to find contextually relevant information based on meaning rather than just recency. This guide covers all vector memory implementations and how to choose the right one for your needs.

📖 Looking for Standard Memory? For conversation history management, see the Memory Systems Guide.


🔒 Multi-Tenant Isolation

All vector memory providers support multi-tenant isolation through userId and conversationId parameters. This enables secure, isolated vector storage for:

  • Per-user isolation: Separate vector collections per user

  • Per-conversation isolation: Multiple conversations for same user

  • Combined isolation: Complete conversation isolation in shared collections

How Multi-Tenant Isolation Works

Vector memories automatically filter searches and retrievals by userId/conversationId:
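Conceptually, each provider restricts the candidate set to the requesting tenant before ranking by similarity. The sketch below is illustrative Python, not the BoxLang API; the record fields and `search` helper are hypothetical:

```python
# Illustrative sketch (not the BoxLang API): tenant filtering is applied
# before similarity ranking, as the metadata-based providers do it.

def search(store, query_vector, user_id=None, conversation_id=None, limit=5):
    """Return the closest vectors, restricted to one tenant's records."""
    def visible(record):
        if user_id is not None and record["userId"] != user_id:
            return False
        if conversation_id is not None and record["conversationId"] != conversation_id:
            return False
        return True

    candidates = [r for r in store if visible(r)]  # isolation happens here
    # Rank by dot-product similarity (a stand-in for cosine similarity).
    candidates.sort(key=lambda r: -sum(a * b for a, b in zip(r["vector"], query_vector)))
    return candidates[:limit]

store = [
    {"userId": "alice", "conversationId": "c1", "vector": [1.0, 0.0], "text": "likes tea"},
    {"userId": "alice", "conversationId": "c2", "vector": [0.9, 0.1], "text": "likes coffee"},
    {"userId": "bob",   "conversationId": "c9", "vector": [1.0, 0.0], "text": "likes cocoa"},
]

# Per-user isolation: bob's record never leaks into alice's results.
results = search(store, [1.0, 0.0], user_id="alice")
# Combined isolation: only alice's conversation c2 is searched.
scoped = search(store, [1.0, 0.0], user_id="alice", conversation_id="c2")
```

Whatever the underlying storage (metadata, dedicated columns, or payload fields), the effect is the same: a tenant's query can only ever see that tenant's vectors.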

Multi-Conversation Support

Isolate multiple conversations for the same user:

Storage Strategy by Provider

| Provider  | Storage Method               | Filter Type       |
| --------- | ---------------------------- | ----------------- |
| BoxVector | Metadata                     | In-memory filter  |
| Chroma    | Metadata                     | $and operator     |
| Milvus    | Metadata                     | Filter expressions |
| MySQL     | Dedicated columns            | SQL WHERE         |
| Postgres  | Dedicated columns            | SQL WHERE         |
| Pinecone  | Metadata                     | $eq operators     |
| Qdrant    | Payload root                 | Match filters     |
| TypeSense | Root fields                  | := filters        |
| Weaviate  | Properties root              | GraphQL Equal     |
| Hybrid    | Delegates to vector provider | Provider-specific |

All providers support getAllDocuments(), getRelevant(), and findSimilar() with automatic tenant filtering.

For enterprise patterns, security considerations, and advanced multi-tenancy, see the Multi-Tenant Memory Guide.


📖 Overview

Vector memory systems store conversation messages as embeddings (numerical vector representations) and enable semantic similarity search. Unlike standard memory that retrieves messages chronologically, vector memory finds the most relevant messages based on meaning.

πŸ—οΈ Vector Memory Architecture

Key Benefits

  • Semantic Understanding: Find relevant context based on meaning, not just keywords

  • Long-term Context: Search across thousands of past messages efficiently

  • Intelligent Retrieval: Get the most relevant history, even if discussed long ago

  • Scalable: Handle large conversation datasets with specialized vector databases

  • Flexible: Choose from local (in-memory), cloud, or self-hosted solutions

Use Cases

  • Customer Support: Retrieve relevant past support cases

  • Knowledge Bases: Find similar questions and answers

  • Long Conversations: Maintain context across lengthy interactions

  • Multi-session: Remember user preferences across sessions

  • RAG Applications: Combine document retrieval with AI responses


🔄 How Vector Memory Works

🔄 Vector Search Process

1. Embedding Generation

When you add a message, it's converted to a vector embedding:
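A real system calls the configured embedding provider for this step; the toy function below is an illustration only (not a real model) that turns a message into a fixed-length vector of letter frequencies, just to show the shape of the data a vector memory stores:

```python
# Toy embedding (illustration only): real systems call an embedding model,
# e.g. one returning 1536 floats. Here, a message becomes a 26-dimension
# vector of normalized letter frequencies so the pipeline is runnable.
import string

def toy_embed(text: str) -> list[float]:
    counts = [0.0] * 26
    for ch in text.lower():
        if ch in string.ascii_lowercase:
            counts[ord(ch) - ord("a")] += 1.0
    total = sum(counts) or 1.0
    return [c / total for c in counts]

vector = toy_embed("What is my refund status?")
# Every message maps to a numeric vector of the same length.
```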

2. Semantic Search

When retrieving context, vector memory finds similar messages:
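The retrieval step, sketched in Python (again illustrative, not the BoxLang API): embed the query, rank stored message vectors by cosine similarity, and return the top matches. The three-dimensional vectors here are made up for the example:

```python
# Sketch of semantic retrieval: rank stored vectors by cosine similarity
# against the query vector and return the top-k messages.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

messages = {
    "Order #42 shipped yesterday": [0.9, 0.1, 0.0],
    "I prefer dark roast coffee":  [0.0, 0.2, 0.9],
    "Tracking number is XYZ123":   [0.8, 0.2, 0.1],
}

query_vector = [1.0, 0.0, 0.0]  # e.g. the embedding of "where is my order?"
ranked = sorted(messages, key=lambda m: cosine(messages[m], query_vector), reverse=True)
top_2 = ranked[:2]  # the two most relevant past messages
```

Note that the coffee-preference message scores near zero and is never returned, even though it might be the most recent one; this is the difference from chronological retrieval.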

3. Integration with Agents

Agents automatically use vector memory for context:


Choosing a Vector Provider

Quick Decision Matrix

| Provider   | Best For                                | Setup       | Cost      | Performance | Multi-Tenant |
| ---------- | --------------------------------------- | ----------- | --------- | ----------- | ------------ |
| BoxVector  | Development, testing, small datasets    | ✅ Instant  | Free      | Good        | ✅           |
| Hybrid     | Balanced recent + semantic              | ✅ Easy     | Low       | Excellent   | ✅           |
| ChromaDB   | Python integration, local dev           | ⚙️ Moderate | Free      | Good        | ✅           |
| PostgreSQL | Existing Postgres infrastructure        | ⚙️ Moderate | Low       | Good        | ✅           |
| MySQL      | Existing MySQL 9+ infrastructure        | ⚙️ Moderate | Low       | Good        | ✅           |
| TypeSense  | Fast typo-tolerant search, autocomplete | ⚙️ Easy     | Free/Paid | Excellent   | ✅           |
| Pinecone   | Production, cloud-first                 | ⚙️ Easy     | Paid      | Excellent   | ✅           |
| Qdrant     | Self-hosted, high performance           | ⚙️ Complex  | Free/Paid | Excellent   | ✅           |
| Weaviate   | GraphQL, knowledge graphs               | ⚙️ Complex  | Free/Paid | Excellent   | ✅           |
| Milvus     | Enterprise, massive scale               | ⚙️ Complex  | Free/Paid | Outstanding | ✅           |

Detailed Recommendations

Start Development:

  • Use BoxVector for immediate prototyping

  • Use Hybrid when you need both recent and semantic context

Production (Cloud):

  • Pinecone: Best for cloud-native, managed service

  • Qdrant Cloud: Excellent performance, generous free tier

Production (Self-Hosted):

  • PostgreSQL: If you already use Postgres

  • MySQL: If you already use MySQL 9+

  • TypeSense: Fast typo-tolerant search with low latency

  • Qdrant: Best performance for self-hosted

  • Milvus: Enterprise-grade, handles billions of vectors

Special Use Cases:

  • ChromaDB: Python ML infrastructure

  • Weaviate: Complex queries, GraphQL API

  • Hybrid: Best of both worlds (recent + semantic)


Vector Memory Types

BoxVectorMemory

In-memory vector storage perfect for development and testing.

Features:

  • No external dependencies

  • Instant setup

  • Full feature support

  • Cosine similarity search

Configuration:

Multi-Tenant Configuration:

Best For:

  • Local development

  • Testing

  • Small datasets (< 10,000 messages)

  • Proof of concepts

Limitations:

  • Data lost on restart

  • Limited to single instance

  • Memory usage grows with dataset


ChromaVectorMemory

ChromaDB integration for local vector storage.

Features:

  • Local persistence

  • Python ecosystem integration

  • Easy Docker deployment

  • Metadata filtering

Setup:

Configuration:

Multi-Tenant Configuration:

Best For:

  • Python-based infrastructure

  • Local development with persistence

  • Medium datasets (< 1M vectors)


PostgresVectorMemory

PostgreSQL with pgvector extension.

Features:

  • Use existing Postgres infrastructure

  • ACID compliance

  • Familiar SQL queries

  • Mature ecosystem

Setup:

Configuration:

Multi-Tenant Configuration:

Best For:

  • Existing PostgreSQL deployments

  • Applications requiring SQL access

  • Strong consistency requirements

  • Medium-large datasets


MysqlVectorMemory

MySQL 9+ with native VECTOR data type support.

Features:

  • Native vector storage (MySQL 9+)

  • Use existing MySQL infrastructure

  • ACID compliance

  • Familiar SQL ecosystem

  • Application-layer distance calculations (MySQL Community Edition compatible)

Requirements:

  • MySQL 9.0 or later (Community or Enterprise Edition)

  • Configured BoxLang datasource

  • VECTOR data type support

Setup:

MySQL 9 Community Edition includes native VECTOR data type support. No extensions are needed; tables are created automatically:

Configuration:

BoxLang Datasource Setup:

Distance Functions:

  • COSINE: Cosine distance (1 - cosine similarity), best for semantic search

  • L2: Euclidean distance (L2 norm), good for spatial data

  • DOT: Dot product similarity, efficient for normalized vectors
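Since Community Edition computes distances in the application layer, the three functions reduce to simple arithmetic. A sketch of each (illustrative Python, not MySQL syntax):

```python
# The three distance options, computed in application code as the MySQL
# Community Edition path does. Lower value = more similar for COSINE and L2;
# higher value = more similar for DOT.
import math

def cosine_distance(a, b):   # COSINE: 1 - cosine similarity
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - (dot / norm if norm else 0.0)

def l2_distance(a, b):       # L2: Euclidean distance
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def dot_product(a, b):       # DOT: dot-product similarity
    return sum(x * y for x, y in zip(a, b))

a, b = [1.0, 0.0], [0.0, 1.0]  # orthogonal vectors
# cosine_distance(a, b) == 1.0, l2_distance(a, b) == sqrt(2), dot_product(a, b) == 0.0
```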

Usage Example:

Multi-Tenant Configuration:

Best For:

  • Existing MySQL 9+ deployments

  • Organizations standardized on MySQL

  • Applications requiring SQL access

  • ACID compliance requirements

  • Medium-large datasets (millions of vectors)

Performance Notes:

  • Distance calculations performed in application layer (MySQL Community Edition compatible)

  • MySQL HeatWave (Oracle Cloud) provides native DISTANCE() function for optimal performance

  • Suitable for production use with proper indexing

  • Table is automatically created with collection-based indexing

MySQL Community vs HeatWave:

  • Community Edition (Free): VECTOR data type, app-layer distance calculations

  • HeatWave (Oracle Cloud): Native DISTANCE() function, VECTOR INDEX, GPU acceleration


TypesenseVectorMemory

TypeSense is a fast, typo-tolerant search engine optimized for instant search experiences and vector similarity search.

Features:

  • Lightning-fast search with typo tolerance

  • Native vector search support

  • Easy Docker deployment

  • RESTful API

  • Built-in relevance tuning

  • Excellent for autocomplete and instant search

Requirements:

  • TypeSense Server 0.23.0+ (vector search support)

  • HTTP/HTTPS access to TypeSense instance

  • API key for authentication

Setup:

Configuration:

TypeSense Cloud Configuration:

Usage Example:

Multi-Tenant Configuration:

Best For:

  • Applications requiring fast, low-latency search

  • Autocomplete and instant search features

  • Typo-tolerant semantic search

  • E-commerce product search

  • Documentation search

  • Customer support systems

  • Small to medium datasets (< 10M vectors)

TypeSense Advantages:

  • Speed: Sub-50ms search latency

  • Typo Tolerance: Built-in fuzzy search

  • Simple Setup: Single binary, easy Docker deployment

  • RESTful API: Simple HTTP API, easy integration

  • Relevance Tuning: Fine-grained control over ranking

Pricing:

  • Self-Hosted: Free (open source)

  • TypeSense Cloud:

    • Free tier: Development clusters

    • Paid: Production clusters from $0.03/hour

When to Choose TypeSense:

  • Need instant search with typo tolerance

  • Want simple deployment and management

  • Require low-latency semantic search

  • Building search-heavy applications

  • Need both keyword and vector search

Performance Notes:

  • Optimized for low-latency queries (< 50ms)

  • In-memory index for fast access

  • Horizontal scaling support

  • Efficient resource usage


PineconeVectorMemory

Pinecone managed cloud vector database.

Features:

  • Fully managed, no ops

  • Excellent performance

  • Auto-scaling

  • Built-in metadata filtering

Setup:

  1. Sign up at pinecone.io

  2. Create an index

  3. Get API key

Configuration:

Multi-Tenant Configuration:

Best For:

  • Production cloud deployments

  • Teams without ML ops expertise

  • Rapid scaling requirements

  • Global deployments

Pricing:

  • Free tier: 1GB storage, 100K operations/month

  • Paid: Scales with usage


QdrantVectorMemory

Qdrant high-performance vector search engine.

Features:

  • Rust-based (excellent performance)

  • Rich filtering capabilities

  • Payload support

  • Self-hosted or cloud

Setup:

Configuration:

Multi-Tenant Configuration:

Best For:

  • High-performance requirements

  • Self-hosted production

  • Complex filtering needs

  • Large datasets (millions of vectors)

Qdrant Cloud:

  • Free tier: 1GB cluster

  • Excellent developer experience


WeaviateVectorMemory

Weaviate GraphQL vector database with knowledge graph capabilities.

Features:

  • GraphQL API

  • Automatic vectorization (optional)

  • Knowledge graph functionality

  • Rich schema support

Setup:

Configuration:

Multi-Tenant Configuration:

Best For:

  • Complex entity relationships

  • Knowledge graph requirements

  • GraphQL preferences

  • Multi-modal applications


MilvusVectorMemory

Milvus enterprise-grade distributed vector database.

Features:

  • Massive scalability (billions of vectors)

  • Distributed architecture

  • GPU acceleration support

  • Enterprise features

Setup:

Configuration:

Multi-Tenant Configuration:

Best For:

  • Enterprise deployments

  • Massive datasets (> 10M vectors)

  • High throughput requirements

  • GPU-accelerated search


Hybrid Memory

HybridMemory combines the benefits of both standard memory (recency) and vector memory (relevance).

How It Works

  1. Maintains recent messages in a window

  2. Stores all messages in vector database

  3. Returns combination of recent + semantically relevant messages

  4. Automatically deduplicates
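Steps 3 and 4 above can be sketched as follows (illustrative Python, not the HybridMemory implementation): recent messages come first, semantic hits are appended, and duplicates are dropped while order is preserved:

```python
# Sketch of how a hybrid memory can merge its two sources: the recency
# window fills slots first, semantic matches fill in afterwards, and
# duplicates are removed so a recent message is never repeated.

def hybrid_context(recent, relevant, limit=5):
    merged, seen = [], set()
    for message in recent + relevant:   # recency first, then relevance
        if message not in seen:
            seen.add(message)
            merged.append(message)
    return merged[:limit]

recent = ["msg-9", "msg-10"]             # latest messages in the window
relevant = ["msg-2", "msg-10", "msg-5"]  # semantic hits (msg-10 repeats)
context = hybrid_context(recent, relevant)
# -> ["msg-9", "msg-10", "msg-2", "msg-5"]
```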

Configuration

Multi-Tenant Configuration:

Benefits

  • Recent Context: Always includes latest messages

  • Semantic Relevance: Finds related past conversations

  • Balanced: Best of both approaches

  • Automatic: No manual context management

Use Cases


Configuration Examples

Development Setup

Production (Cloud)

Production (Self-Hosted)

Embedding Provider Options

With Caching


Best Practices

1. Choose Appropriate Embedding Models

2. Use Metadata for Filtering

3. Optimize Collection Size

4. Monitor Performance

5. Use Hybrid for User-Facing Apps

6. Dimension Matching

Ensure embedding dimensions match across your application:
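A defensive check like the following (illustrative Python, not the BoxLang API; the constant and helper are hypothetical) catches a mismatch before it surfaces as a provider error:

```python
# Guard against dimension mismatch: the vector produced by the embedding
# model must have exactly the dimension the collection was created with.

COLLECTION_DIMENSION = 1536  # hypothetical: the collection's configured size

def validate_dimension(vector, expected=COLLECTION_DIMENSION):
    if len(vector) != expected:
        raise ValueError(
            f"Embedding dimension {len(vector)} does not match "
            f"collection dimension {expected}"
        )
    return vector

validate_dimension([0.0] * 1536)        # matches: accepted
try:
    validate_dimension([0.0] * 768)     # a 768-dim model vs a 1536-dim collection
except ValueError as e:
    problem = str(e)
```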

7. Use Multi-Tenant Isolation

Securely isolate user and conversation data in shared collections:


Advanced Usage

Custom Similarity Thresholds
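The idea, sketched in Python (not the BoxLang API; the result shape is hypothetical): discard matches whose similarity score falls below a cutoff, so weak matches never reach the prompt:

```python
# Sketch of threshold-based filtering: keep only results whose similarity
# score clears a minimum, rather than always taking the top-k.

def filter_by_threshold(results, min_score=0.75):
    return [r for r in results if r["score"] >= min_score]

results = [
    {"text": "refund policy details", "score": 0.92},
    {"text": "shipping times",        "score": 0.77},
    {"text": "unrelated chit-chat",   "score": 0.31},
]

strong = filter_by_threshold(results)       # drops the 0.31 match
strict = filter_by_threshold(results, 0.9)  # keeps only the 0.92 match
```

Raising the threshold trades recall for precision: fewer, but more reliable, context messages.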

Multi-Collection Strategy

Cross-Session Continuity

Batch Operations
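As a sketch (the `embed_batch` function is hypothetical, standing in for a real embedding-provider call), batching amortizes per-request overhead by embedding many messages per call instead of one at a time:

```python
# Sketch of batched ingestion: group messages into fixed-size batches so
# each embedding request carries many texts instead of one.

def embed_batch(texts):
    # Placeholder: a real provider returns one vector per input text.
    return [[float(len(t))] for t in texts]

def ingest(messages, batch_size=100):
    vectors = []
    for start in range(0, len(messages), batch_size):
        batch = messages[start:start + batch_size]
        vectors.extend(embed_batch(batch))  # one request per batch
    return vectors

messages = [f"message {i}" for i in range(250)]
vectors = ingest(messages)  # 3 requests (100 + 100 + 50), 250 vectors total
```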


Troubleshooting

Common Issues

1. Dimension Mismatch

Solution: Ensure embedding model dimensions match collection configuration

2. Connection Errors

Solution: Verify host, port, and network accessibility. Check firewall rules.

3. API Key Issues

Solution: Verify API keys for both embedding provider and vector database

4. Slow Performance

Solution:

  • Enable caching for embeddings

  • Use appropriate index type (Milvus, Qdrant)

  • Reduce limit parameter

  • Consider smaller embedding model

5. Out of Memory

Solution: Switch to persistent vector database (Chroma, Postgres, etc.)


See Also


Next Steps: Try the Vector Memory Examples or learn about building custom vector memory providers.
