Pinecone Vector Database Guide
Pinecone is our semantic memory powerhouse, storing knowledge, concepts, and facts as vector embeddings. This guide explains how vector databases work, why we chose Pinecone, and how to use it effectively for semantic memory.
🎯 What are Vector Embeddings?
The Problem with Text Search
Traditional text search has limitations (a toy sketch of the failure mode follows this list):
- Exact Match Only: "machine learning" won't find "ML" or "artificial intelligence"
- No Semantic Understanding: Can't understand that "car" and "automobile" are related
- Language Barriers: "hello" and "hola" are treated as completely different
- Context Loss: "bank" could mean financial institution or river edge
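To make the first limitation concrete, here is a toy sketch of naive keyword matching missing a semantically equivalent document (the sample strings are illustrative):
// Naive substring search: "ML" and "machine learning" never match each other
const docs = ['Intro to ML', 'Machine learning basics', 'Cooking pasta'];
const hits = docs.filter(d => d.toLowerCase().includes('machine learning'));
console.log(hits); // ['Machine learning basics'] -- 'Intro to ML' is missed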
The Vector Solution
Vector embeddings solve these problems by converting text into numerical representations that capture meaning:
// Text input
const text = "machine learning algorithms";
// Vector embedding (simplified - actual vectors have 768 dimensions)
const embedding = [0.1, 0.3, 0.7, 0.2, 0.9, 0.4, 0.6, 0.8, ...];
Key Properties of Vectors:
- Similarity: Similar concepts have similar vectors
- Distance: Vector distance represents semantic similarity
- Dimensionality: Higher-dimensional vectors can capture more nuance (at greater storage and compute cost)
- Language Agnostic: Same concept in different languages has similar vectors
🧠 How Vector Similarity Works
Cosine Similarity
The most common way to measure vector similarity:
function cosineSimilarity(vecA: number[], vecB: number[]): number {
const dotProduct = vecA.reduce((sum, a, i) => sum + a * vecB[i], 0);
const magnitudeA = Math.sqrt(vecA.reduce((sum, a) => sum + a * a, 0));
const magnitudeB = Math.sqrt(vecB.reduce((sum, b) => sum + b * b, 0));
return dotProduct / (magnitudeA * magnitudeB);
}
// Example
const vector1 = [0.1, 0.2, 0.3]; // "machine learning"
const vector2 = [0.15, 0.25, 0.35]; // "artificial intelligence"
const similarity = cosineSimilarity(vector1, vector2); // ~0.99 (very similar)
Similarity Thresholds
const SIMILARITY_THRESHOLDS = {
EXACT_MATCH: 0.95, // Nearly identical
VERY_SIMILAR: 0.85, // Highly related
SIMILAR: 0.70, // Related concepts
SOMEWHAT_SIMILAR: 0.50, // Loosely related
DIFFERENT: 0.30 // Unrelated
};
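These bands can drive behavior directly. A small helper sketch that maps a raw score to its band (classifySimilarity is illustrative, not part of the codebase above):
type SimilarityBand = keyof typeof SIMILARITY_THRESHOLDS;

function classifySimilarity(score: number): SimilarityBand {
  // Walk the bands from strictest to loosest; below 0.5 counts as DIFFERENT
  if (score >= SIMILARITY_THRESHOLDS.EXACT_MATCH) return 'EXACT_MATCH';
  if (score >= SIMILARITY_THRESHOLDS.VERY_SIMILAR) return 'VERY_SIMILAR';
  if (score >= SIMILARITY_THRESHOLDS.SIMILAR) return 'SIMILAR';
  if (score >= SIMILARITY_THRESHOLDS.SOMEWHAT_SIMILAR) return 'SOMEWHAT_SIMILAR';
  return 'DIFFERENT';
}

classifySimilarity(0.99); // 'EXACT_MATCH'
classifySimilarity(0.72); // 'SIMILAR'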
🏗️ Pinecone Architecture
Why Pinecone?
- Managed Service: No infrastructure management
- High Performance: Sub-second search across millions of vectors
- Scalability: Handles growing data automatically
- Developer Friendly: Simple API and excellent documentation
- Cost Effective: Pay only for what you use
Data Structure
Index Structure
interface PineconeVector {
id: string; // Unique identifier
values: number[]; // 768-dimensional vector
metadata: {
content: string; // Original text
concept: string; // Main concept
category: string; // Knowledge category
confidence: number; // Confidence score
source: string; // Where it came from
timestamp: string; // When it was created
tags: string[]; // Searchable tags
userId: string; // User who created it
};
}
🚀 Setting Up Pinecone
1. Create Pinecone Account
# Visit https://www.pinecone.io/
# Sign up for free account
# Get your API key from the console
2. Environment Configuration
# packages/server/.env
PINECONE_API_KEY=your_pinecone_api_key
# PINECONE_ENVIRONMENT is only needed by legacy pod-based SDK versions;
# serverless indexes (SDK v2+) need only the API key
PINECONE_INDEX_NAME=clear-ai-memories
3. Initialize Pinecone
import { Pinecone } from '@pinecone-database/pinecone';

// SDK v2+ (required for the serverless spec used below) takes only an API
// key; the environment option belonged to the legacy pod-based client
const pinecone = new Pinecone({
  apiKey: process.env.PINECONE_API_KEY!
});

const index = pinecone.index(process.env.PINECONE_INDEX_NAME!);
4. Create Index
async function createPineconeIndex() {
try {
await pinecone.createIndex({
name: 'clear-ai-memories',
dimension: 768, // nomic-embed-text dimensions
metric: 'cosine',
spec: {
serverless: {
cloud: 'aws',
region: 'us-east-1'
}
}
});
console.log('Index created successfully');
} catch (error) {
console.error('Error creating index:', error);
}
}
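Serverless index creation is asynchronous, so the index may not be queryable the moment createIndex returns. A minimal readiness poll, sketched with describeIndex (the 2-second interval is an arbitrary choice):
async function waitForIndexReady(name: string): Promise<void> {
  // Poll until Pinecone reports the index as ready to accept requests
  while (true) {
    const description = await pinecone.describeIndex(name);
    if (description.status?.ready) return;
    await new Promise(resolve => setTimeout(resolve, 2000));
  }
}

await waitForIndexReady('clear-ai-memories');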
🔍 Vector Operations
Storing Semantic Memory
async function storeSemanticMemory(memoryData: SemanticMemoryData) {
try {
// Generate embedding using Ollama
const embedding = await generateEmbedding(memoryData.content);
// Prepare vector for Pinecone
const vector: PineconeVector = {
id: memoryData.id,
values: embedding,
metadata: {
content: memoryData.content,
concept: memoryData.concept,
category: memoryData.category,
confidence: memoryData.confidence,
source: memoryData.source,
timestamp: memoryData.timestamp,
tags: memoryData.tags,
userId: memoryData.userId
}
};
// Upsert to Pinecone
await index.upsert([vector]);
return { success: true, id: memoryData.id };
} catch (error) {
console.error('Error storing semantic memory:', error);
throw error;
}
}
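The store and search functions rely on a generateEmbedding helper that isn't shown in this guide. A minimal sketch, assuming a local Ollama server running the nomic-embed-text model (covered in the Ollama guide linked at the end):
// Assumes Ollama is listening on its default port with nomic-embed-text pulled
async function generateEmbedding(text: string): Promise<number[]> {
  const response = await fetch('http://localhost:11434/api/embeddings', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ model: 'nomic-embed-text', prompt: text })
  });
  if (!response.ok) {
    throw new Error(`Embedding request failed: ${response.status}`);
  }
  const data = await response.json();
  return data.embedding; // 768-dimensional vector
}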
Searching Semantic Memory
async function searchSemanticMemory(
query: string,
userId: string,
limit: number = 10,
threshold: number = 0.7
) {
try {
// Generate query embedding
const queryEmbedding = await generateEmbedding(query);
// Search Pinecone
const searchResponse = await index.query({
vector: queryEmbedding,
topK: limit,
includeMetadata: true,
filter: {
userId: { $eq: userId }
}
});
// Filter by similarity threshold
const results = searchResponse.matches
?.filter(match => match.score && match.score >= threshold)
.map(match => ({
id: match.id,
content: match.metadata?.content,
concept: match.metadata?.concept,
category: match.metadata?.category,
confidence: match.metadata?.confidence,
similarity: match.score,
tags: match.metadata?.tags
})) || [];
return results;
} catch (error) {
console.error('Error searching semantic memory:', error);
throw error;
}
}
Advanced Search with Filters
async function searchWithFilters(
query: string,
filters: {
userId: string;
category?: string;
tags?: string[];
minConfidence?: number;
dateRange?: { start: string; end: string };
},
limit: number = 10
) {
try {
const queryEmbedding = await generateEmbedding(query);
// Build filter object
const pineconeFilter: any = {
userId: { $eq: filters.userId }
};
if (filters.category) {
pineconeFilter.category = { $eq: filters.category };
}
if (filters.tags && filters.tags.length > 0) {
pineconeFilter.tags = { $in: filters.tags };
}
if (filters.minConfidence !== undefined) {
pineconeFilter.confidence = { $gte: filters.minConfidence };
}
if (filters.dateRange) {
// Note: Pinecone's $gte/$lte operators compare numbers, so range filtering
// on timestamps requires storing them as numeric values (e.g. epoch ms)
// rather than ISO strings
pineconeFilter.timestamp = {
$gte: filters.dateRange.start,
$lte: filters.dateRange.end
};
}
const searchResponse = await index.query({
vector: queryEmbedding,
topK: limit,
includeMetadata: true,
filter: pineconeFilter
});
return searchResponse.matches?.map(match => ({
id: match.id,
content: match.metadata?.content,
concept: match.metadata?.concept,
category: match.metadata?.category,
confidence: match.metadata?.confidence,
similarity: match.score,
tags: match.metadata?.tags,
timestamp: match.metadata?.timestamp
})) || [];
} catch (error) {
console.error('Error searching with filters:', error);
throw error;
}
}
📊 Vector Management
Batch Operations
async function batchUpsertMemories(memories: SemanticMemoryData[]) {
try {
// Generate embeddings for all memories
const vectors = await Promise.all(
memories.map(async (memory) => {
const embedding = await generateEmbedding(memory.content);
return {
id: memory.id,
values: embedding,
metadata: {
content: memory.content,
concept: memory.concept,
category: memory.category,
confidence: memory.confidence,
source: memory.source,
timestamp: memory.timestamp,
tags: memory.tags,
userId: memory.userId
}
};
})
);
// Upsert in batches of 100 (a conservative size; Pinecone caps upsert
// requests by record count and payload size)
const batchSize = 100;
for (let i = 0; i < vectors.length; i += batchSize) {
const batch = vectors.slice(i, i + batchSize);
await index.upsert(batch);
}
return { success: true, count: vectors.length };
} catch (error) {
console.error('Error batch upserting memories:', error);
throw error;
}
}
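Note that the Promise.all above generates every embedding concurrently, which can overwhelm a local embedding server. One defensive option is a simple concurrency cap; a sketch (mapWithConcurrency is illustrative, and the limit of 5 is arbitrary):
async function mapWithConcurrency<T, R>(
  items: T[],
  limit: number,
  fn: (item: T) => Promise<R>
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;
  // Spin up `limit` workers that each pull the next unprocessed item
  const workers = Array.from({ length: Math.min(limit, items.length) }, async () => {
    while (next < items.length) {
      const i = next++;
      results[i] = await fn(items[i]);
    }
  });
  await Promise.all(workers);
  return results;
}

// Usage: const embeddings = await mapWithConcurrency(memories, 5, m => generateEmbedding(m.content));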
Memory Updates
async function updateSemanticMemory(
id: string,
updates: Partial<SemanticMemoryData>
) {
try {
// Fetch the existing vector (the v2 SDK returns fetched vectors under records)
const fetchResponse = await index.fetch([id]);
const existingVector = fetchResponse.records?.[id];
if (!existingVector) {
throw new Error('Memory not found');
}
// Update metadata
const updatedMetadata = {
...existingVector.metadata,
...updates,
updatedAt: new Date().toISOString()
};
// If content changed, regenerate embedding
let values = existingVector.values;
if (updates.content) {
values = await generateEmbedding(updates.content);
}
// Upsert updated vector
await index.upsert([{
id,
values,
metadata: updatedMetadata
}]);
return { success: true, id };
} catch (error) {
console.error('Error updating semantic memory:', error);
throw error;
}
}
Memory Deletion
async function deleteSemanticMemory(id: string) {
try {
await index.deleteOne(id);
return { success: true, id };
} catch (error) {
console.error('Error deleting semantic memory:', error);
throw error;
}
}
async function deleteUserMemories(userId: string) {
try {
// Delete all vectors for a user. deleteMany takes the metadata filter
// directly (not wrapped in a filter key); filtered deletes are only
// supported on pod-based indexes (a serverless workaround is sketched below)
await index.deleteMany({
userId: { $eq: userId }
});
return { success: true, userId };
} catch (error) {
console.error('Error deleting user memories:', error);
throw error;
}
}
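Because metadata-filtered deletes aren't available on serverless indexes, one workaround is to encode the user in the vector id and delete by id prefix. A sketch, assuming ids follow a `${userId}#...` convention (see the id scheme under Best Practices below):
async function deleteUserMemoriesServerless(userId: string) {
  let paginationToken: string | undefined;
  do {
    // List a page of ids sharing the user's prefix
    const page = await index.listPaginated({
      prefix: `${userId}#`,
      paginationToken
    });
    const ids = (page.vectors ?? [])
      .map(v => v.id)
      .filter((id): id is string => Boolean(id));
    if (ids.length > 0) {
      await index.deleteMany(ids); // deleteMany also accepts an array of ids
    }
    paginationToken = page.pagination?.next;
  } while (paginationToken);
}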
🎯 Real-World Examples
Example 1: Knowledge Base Search
// Store knowledge about React
await storeSemanticMemory({
id: 'react-hooks-knowledge',
content: 'React hooks are functions that let you use state and lifecycle features in functional components',
concept: 'React Hooks',
category: 'frontend',
confidence: 0.9,
source: 'documentation',
timestamp: new Date().toISOString(),
tags: ['React', 'hooks', 'functional-components', 'state'],
userId: 'user-123'
});
// Search for similar knowledge
const results = await searchSemanticMemory(
'How do I use state in React components?',
'user-123',
5,
0.7
);
// Results will include the React hooks knowledge even though
// the query uses different wording
Example 2: Concept Learning
// Store learning progress
await storeSemanticMemory({
id: 'user-learned-promises',
content: 'User successfully implemented async/await with Promises in JavaScript',
concept: 'JavaScript Promises',
category: 'learning-progress',
confidence: 0.8,
source: 'code-review',
timestamp: new Date().toISOString(),
tags: ['JavaScript', 'Promises', 'async-await', 'learning'],
userId: 'user-123'
});
// Later, when user asks about async programming
const results = await searchSemanticMemory(
'I need help with asynchronous programming',
'user-123',
10,
0.6
);
// System knows user already learned Promises and can provide
// more advanced async programming concepts
Example 3: Contextual Recommendations
// Store user preferences
await storeSemanticMemory({
id: 'user-prefers-typescript',
content: 'User prefers TypeScript over JavaScript for type safety',
concept: 'Programming Language Preference',
category: 'user-preferences',
confidence: 0.9,
source: 'conversation',
timestamp: new Date().toISOString(),
tags: ['TypeScript', 'JavaScript', 'preferences', 'type-safety'],
userId: 'user-123'
});
// When suggesting solutions, system will prioritize TypeScript examples
const results = await searchSemanticMemory(
'How do I handle errors in my code?',
'user-123',
5,
0.7
);
// Results will be filtered to show TypeScript error handling patterns
🔧 Performance Optimization
Index Configuration
// Optimize for your use case
const indexConfig = {
name: 'clear-ai-memories',
dimension: 768,
metric: 'cosine', // Best for text similarity
spec: {
serverless: {
cloud: 'aws',
region: 'us-east-1'
}
}
};
Query Optimization
// Use appropriate topK values
const searchConfig = {
topK: 10, // Don't fetch more than needed
includeMetadata: true, // Only if you need metadata
filter: { // Use filters to reduce search space
userId: { $eq: userId }
}
};
Caching Strategy
// Cache frequently accessed vectors
const vectorCache = new Map<string, number[]>();
async function getCachedEmbedding(text: string): Promise<number[]> {
if (vectorCache.has(text)) {
return vectorCache.get(text)!;
}
const embedding = await generateEmbedding(text);
vectorCache.set(text, embedding);
// Evict the oldest entry once the cache grows too large (simple FIFO)
if (vectorCache.size > 1000) {
const firstKey = vectorCache.keys().next().value;
if (firstKey !== undefined) vectorCache.delete(firstKey);
}
return embedding;
}
🚨 Troubleshooting
Common Issues
1. Index Not Found
// Check if the index exists (listIndexes returns index descriptions,
// not a plain array of names)
const { indexes } = await pinecone.listIndexes();
console.log('Available indexes:', indexes);
// Create the index if it doesn't exist
if (!indexes?.some(idx => idx.name === 'clear-ai-memories')) {
  await createPineconeIndex();
}
2. Dimension Mismatch
// Ensure embedding dimensions match index
const embedding = await generateEmbedding(text);
console.log('Embedding dimensions:', embedding.length); // Should be 768
// Check index configuration
const indexStats = await index.describeIndexStats();
console.log('Index dimensions:', indexStats.dimension);
3. Query Timeout
// Pinecone queries return plain Promises (there is no .timeout() method),
// so race the query against a timer instead
const searchResponse = await Promise.race([
  index.query({
    vector: queryEmbedding,
    topK: 10,
    includeMetadata: true,
    filter: { userId: { $eq: userId } }
  }),
  new Promise<never>((_, reject) =>
    setTimeout(() => reject(new Error('Query timed out after 5s')), 5000)
  )
]);
4. Rate Limiting
// Implement rate limiting
class RateLimitedPinecone {
private lastRequest = 0;
private minInterval = 100; // 100ms between requests
async query(params: any) {
const now = Date.now();
const timeSinceLastRequest = now - this.lastRequest;
if (timeSinceLastRequest < this.minInterval) {
await new Promise(resolve =>
setTimeout(resolve, this.minInterval - timeSinceLastRequest)
);
}
this.lastRequest = Date.now();
return index.query(params);
}
}
🎯 Best Practices
1. Data Organization
- Use meaningful IDs: Include user and concept information (a sketch follows this list)
- Rich metadata: Store all relevant information in metadata
- Consistent tags: Use standardized tag naming conventions
- User isolation: Always filter by userId for security
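For the first point, one possible id scheme puts the user first so ids double as a delete-by-prefix key (as in the serverless deletion sketch earlier). buildMemoryId and the `${userId}#${slug}` convention are illustrative assumptions:
function buildMemoryId(userId: string, concept: string): string {
  // Slugify the concept so ids stay URL- and log-friendly
  const slug = concept.toLowerCase().trim().replace(/[^a-z0-9]+/g, '-');
  return `${userId}#${slug}`;
}

buildMemoryId('user-123', 'React Hooks'); // 'user-123#react-hooks'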
2. Query Strategy
- Appropriate thresholds: Start with 0.7, adjust based on results
- Limit results: Don't fetch more than needed
- Use filters: Reduce search space with metadata filters
- Cache embeddings: Avoid regenerating same embeddings
3. Performance
- Batch operations: Group multiple upserts
- Monitor usage: Track API calls and costs
- Optimize dimensions: Use appropriate embedding model
- Regular cleanup: Remove outdated or low-quality vectors
4. Security
- User isolation: Never mix user data
- Input validation: Sanitize all inputs (see the sketch after this list)
- Access control: Implement proper authentication
- Data encryption: Use HTTPS for all API calls
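A minimal validation sketch for the input-validation point (the field limits here are illustrative assumptions, not enforced by Pinecone):
function validateMemoryInput(data: SemanticMemoryData): void {
  if (!data.userId) {
    throw new Error('userId is required');
  }
  if (!data.content || data.content.length > 10_000) {
    throw new Error('content must be non-empty and under 10,000 characters');
  }
  if (data.confidence < 0 || data.confidence > 1) {
    throw new Error('confidence must be between 0 and 1');
  }
}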
🚀 Next Steps
Now that you understand Pinecone vectors, explore:
- Ollama Embeddings - Local text embedding generation
- Memory Examples - Practical usage scenarios
- Troubleshooting Guide - Common issues and solutions
- Memory System Overview - Complete system understanding
Ready to learn about local embeddings? Check out the Ollama Integration Guide!