Core Concepts
This page explains the key ideas behind Clear AI v2 in plain language. Whether you're a developer, product manager, or just curious about conversational AI, you'll understand how everything works.
The Problem: Why Traditional AI Agents Fall Short
Imagine asking an assistant: "Show me sales data"
Traditional AI might:
- Guess which time period you mean (this week? this month?)
- Return all data even if it's millions of records
- Give confident answers even with incomplete data
- Fail if the task takes multiple steps
Clear AI v2 instead:
- Asks: "Which time period: today, this week, or this month?"
- Shows progress: "Step 2/4: Analyzing 1,543 records..."
- Expresses uncertainty: "Based on limited data (23%), I'm 65% confident..."
- Remembers context when you ask "What about rejected ones?"
Core Concepts
1. Conversational Intelligence
Traditional AI gives one-shot responses. Clear AI v2 enables natural, multi-turn conversations.
What It Means
Your AI can:
- Ask clarifying questions when information is ambiguous
- Show progress for long-running tasks
- Express uncertainty when confidence is low
- Understand follow-ups like "what about those?" or "and the rejected ones?"
Why It Matters
Real users don't speak in perfect queries. They say things like:
- "Show me data" (what data? from when?)
- "What about the other ones?" (refers to previous conversation)
- "Yes" (confirming an action)
Clear AI v2 handles all of this naturally.
How It Works
Response Types: The AI can respond in different ways
- Answer: Direct response with data
- Question: Ask for clarification
- Progress: "Step 2 of 5: Processing..."
- Acknowledgment: "Got it, working on that"
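As a sketch, these response kinds could be modeled as a discriminated union; the type and field names below are illustrative, not the library's actual API:

```typescript
// Illustrative sketch: one union type covering the four response kinds.
// Names are hypothetical; consult the actual Clear AI v2 types.
type AgentResponse =
  | { kind: 'answer'; text: string; data?: unknown; confidence: number }
  | { kind: 'question'; text: string; options?: string[] } // clarification
  | { kind: 'progress'; step: number; totalSteps: number; text: string }
  | { kind: 'acknowledgment'; text: string };

function render(response: AgentResponse): string {
  switch (response.kind) {
    case 'answer':
      return response.text;
    case 'question':
      // Offer the choices inline, e.g. "Which time period? (today, this week)"
      return response.options
        ? `${response.text} (${response.options.join(', ')})`
        : response.text;
    case 'progress':
      return `Step ${response.step}/${response.totalSteps}: ${response.text}`;
    case 'acknowledgment':
      return response.text;
  }
}
```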
Intent Detection: Understands what users mean
- "Show me shipments" → Query (fetch data)
- "What is a shipment?" → Question (explain concept)
- "yes" or "no" → Confirmation (approve/deny)
- "What about FacilityB?" → Follow-up (referring to previous context)
Confidence Scoring: Expresses certainty appropriately
- High confidence (90%): Give direct answer
- Low confidence (60%): Add disclaimer "I'm not completely certain..."
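In code, that thresholding might be as simple as the sketch below (the 90% cutoff mirrors the example above and would be configurable):

```typescript
// Prefix a disclaimer when confidence falls below a chosen threshold.
function phraseWithConfidence(answer: string, confidence: number): string {
  if (confidence >= 0.9) return answer;                // direct answer
  return `I'm not completely certain, but: ${answer}`; // hedged answer
}
```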
2. Context Management
AI models have limited memory (context windows). A conversation might start with 4,000 tokens of headroom, but after 20 messages you're out of space.
What It Means
Clear AI v2 manages conversation context intelligently:
- Tracks all messages and their token counts
- Compresses old messages when the context window fills up
- Preserves important information (recent messages, key entities, system instructions)
- Saves ~70-80% of tokens when compressing
Why It Matters
Without context management:
- Long conversations fail with "context limit exceeded"
- Important information gets lost
- Every message costs more and more
With Clear AI v2:
- Conversations can go on indefinitely
- Key information is always available
- Costs stay reasonable
How It Works
Compression Strategies:
- Sliding Window: Keep only the N most recent messages
- Prioritization: Keep important messages, drop less important ones
- Summarization: Use AI to condense old messages into summaries
The system automatically chooses the best strategy based on the situation.
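As an illustration, here is roughly what the simplest strategy, the sliding window, could look like. This is a sketch, not the library's implementation:

```typescript
interface Message {
  role: 'system' | 'user' | 'assistant';
  content: string;
  tokens: number;
}

// Sliding window: always keep system instructions, then keep the most
// recent messages that still fit inside the token budget.
function slidingWindow(messages: Message[], maxTokens: number): Message[] {
  const system = messages.filter((m) => m.role === 'system');
  let used = system.reduce((sum, m) => sum + m.tokens, 0);

  const recent: Message[] = [];
  for (let i = messages.length - 1; i >= 0; i--) {
    const m = messages[i];
    if (m.role === 'system') continue;
    if (used + m.tokens > maxTokens) break; // older messages get dropped
    used += m.tokens;
    recent.unshift(m); // keep chronological order
  }
  return [...system, ...recent];
}
```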
3. Memory Systems
There are two types of memory:
Episodic Memory (Neo4j)
What: Remembers the flow of conversations
Like: Your personal journal of interactions
Example:
- User asked about shipments at 2pm
- Then asked about facilities at 2:15pm
- Then asked "What about rejected ones?" (referring to shipments from earlier)
Why: Understand conversation history and relationships
Semantic Memory (Pinecone)
What: Remembers facts and knowledge that can be searched
Like: A searchable database of everything the AI has learned
Example:
- "FacilityA specializes in plastic recycling"
- "Normal contamination rate is 2-5%"
- "User prefers weekly reports"
Why: Retrieve relevant information based on meaning, not just keywords
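To make the split concrete, here is a hypothetical TypeScript shape for the two stores. The interfaces and method names are illustrative, not the actual Clear AI v2 API:

```typescript
// Hypothetical wrappers; the real Clear AI v2 APIs may differ.
interface EpisodicMemory {
  // Record an interaction as a graph node in Neo4j, linked to the previous one.
  recordEvent(sessionId: string, event: { text: string; at: Date }): Promise<void>;
  // Walk the conversation graph to resolve references like "rejected ones".
  recentEvents(sessionId: string, limit: number): Promise<string[]>;
}

interface SemanticMemory {
  // Embed the fact and upsert the vector into Pinecone.
  storeFact(fact: string): Promise<void>;
  // Embed the query and return the nearest facts by meaning, not keywords.
  search(query: string, topK: number): Promise<string[]>;
}

// Usage sketch: a query about "contamination levels" can match the stored
// fact "Normal contamination rate is 2-5%" by embedding similarity,
// even though the exact words differ.
async function answerWithMemory(semantic: SemanticMemory, query: string) {
  const facts = await semantic.search(query, 3);
  return facts;
}
```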
4. Workflows
Complex tasks require multiple steps with conditional logic. Workflows let you define these step-by-step processes.
What It Means
Instead of writing complex if/else chains, you define a state graph:
```
Start → Fetch Data → Analyze → Valid?
                                 ↓ No  → Ask User
                                 ↓ Yes → Generate Report → End
```
Why It Matters
Real business logic is complex:
- If analysis confidence > 70%, proceed; otherwise ask for confirmation
- If data is incomplete, fetch more; if that fails, notify user
- If task is interrupted, save state and resume later
How It Works
Graph Builder: Define nodes (steps) and edges (transitions)
Conditional Branching: Next step depends on current state
Checkpointing: Save progress, resume after interruptions
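Here is a minimal sketch of such a state graph, with nodes as async functions and conditional edges chosen by return value. The names and the run loop are illustrative, not the real graph builder:

```typescript
// Minimal state-graph sketch: nodes are async functions, edges are
// chosen by each node's return value.
type State = { records?: number[]; valid?: boolean; report?: string };
type WorkflowNode = (s: State) => Promise<string>; // returns the next node's name

const nodes: Record<string, WorkflowNode> = {
  fetchData: async (s) => {
    s.records = [1, 2, 3]; // stand-in for a database call
    return 'analyze';
  },
  analyze: async (s) => {
    s.valid = (s.records?.length ?? 0) > 0;
    return s.valid ? 'generateReport' : 'askUser'; // conditional branch
  },
  askUser: async () => 'end', // would emit a clarifying question
  generateReport: async (s) => {
    s.report = `Analyzed ${s.records?.length} records`;
    return 'end';
  },
};

async function run(start: string, state: State): Promise<State> {
  let current = start;
  while (current !== 'end') {
    // A real engine would checkpoint `state` here to resume after interruptions.
    current = await nodes[current](state);
  }
  return state;
}

run('fetchData', {}).then(console.log);
```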
5. Token Management
AI providers charge by the token (roughly three-quarters of an English word each). Longer conversations = more money.
What It Means
Clear AI v2 helps you:
- Count tokens accurately for any text or conversation
- Set budgets per operation or per user
- Estimate costs before executing
- Prevent overruns automatically
Why It Matters
Without token management:
- Costs spiral out of control
- One long conversation could cost $10+
- No visibility into spending
With Clear AI v2:
- Set budget: "Max 5,000 tokens per query"
- Get warnings: "90% of budget used"
- See costs: "This query cost $0.012"
How It Works
Token Counter: Uses tiktoken for accurate counting across all models
Token Budget: Reserve tokens, track usage, prevent overruns
Cost Estimator: Calculate $ cost based on provider pricing
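A minimal budget tracker could look like the sketch below; the class and thresholds are illustrative, following the track/warn/estimate behavior described above:

```typescript
// Illustrative token budget: check affordability before a call, record
// actual usage, and warn as the budget fills up.
class TokenBudget {
  private used = 0;
  constructor(private readonly maxTokens: number) {}

  canAfford(tokens: number): boolean {
    return this.used + tokens <= this.maxTokens;
  }

  record(tokens: number): void {
    this.used += tokens;
    if (this.used / this.maxTokens >= 0.9) {
      console.warn(`${Math.round((this.used / this.maxTokens) * 100)}% of budget used`);
    }
  }

  // Cost estimate given a per-1K-token price (illustrative pricing).
  estimateCost(pricePer1K: number): number {
    return (this.used / 1000) * pricePer1K;
  }
}

// Usage sketch: "Max 5,000 tokens per query"
const budget = new TokenBudget(5_000);
if (budget.canAfford(1_200)) {
  budget.record(1_200);
  console.log(`This query cost $${budget.estimateCost(0.002).toFixed(4)}`);
}
```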
6. Multi-Provider LLM Support
Don't lock yourself into one AI provider.
What It Means
Clear AI v2 works with:
- OpenAI (GPT-3.5, GPT-4) - Industry standard
- Groq (Llama, Mixtral) - Fast, free tier
- Ollama (Local models) - Privacy, no costs
Automatic fallback: If OpenAI is down, try Groq, then Ollama.
Why It Matters
- Reliability: If one provider is down, others work
- Cost optimization: Use cheaper providers when appropriate
- Privacy: Use local models for sensitive data
- Flexibility: Switch providers without code changes
How It Works
One unified interface for all providers:
```typescript
const llm = new LLMProvider(); // Automatically picks best available
const response = await llm.chat(messages);
```
Configuration handles everything:
```bash
DEFAULT_LLM_PROVIDER=openai
FALLBACK_PROVIDERS=groq,ollama
```
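Under the hood, the fallback can be as simple as trying providers in order. This is a sketch with a hypothetical provider interface, not the library's internals:

```typescript
// Illustrative fallback loop; each real client sits behind a shared interface.
interface ChatProvider {
  name: string;
  chat(messages: { role: string; content: string }[]): Promise<string>;
}

async function chatWithFallback(
  providers: ChatProvider[], // e.g. [openai, groq, ollama]
  messages: { role: string; content: string }[],
): Promise<string> {
  let lastError: unknown;
  for (const provider of providers) {
    try {
      return await provider.chat(messages);
    } catch (err) {
      lastError = err; // provider down or rate-limited; try the next one
    }
  }
  throw lastError;
}
```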
7. Observability
In production, you need to see what's happening when things go wrong.
What It Means
Clear AI v2 integrates with Langfuse to track:
- Every LLM call (prompt, response, tokens, cost)
- Every workflow step
- Performance metrics
- Errors and failures
Why It Matters
When a user reports "the AI gave a wrong answer":
- See exactly what prompt was sent
- Check what the LLM returned
- Review token usage and costs
- Trace through the entire workflow
How It Works
Automatic tracing with minimal code:
```typescript
const tracer = new LangfuseTracer(config);
const trace = tracer.startTrace('user_query');
// ... execute your logic
tracer.endTrace(trace.id);
```
View everything in the Langfuse dashboard.
How It All Fits Together
Here's a typical flow:
1. User sends message: "Show me contaminated shipments"
2. Intent Detection: Classifies as 'query'
3. Context Check: Do we have enough tokens? Yes.
4. Workflow Starts:
   - Step 1: Fetch data from database
   - Step 2: Analyze results
   - Step 3: Calculate confidence (85%)
   - Step 4: Format response
5. Progress Updates: User sees "Step 2/4: Analyzing data..."
6. Response Built: "Found 23 contaminated shipments from the past week. [confidence: 85%]"
7. Context Updated: Message and response added to conversation history
8. Memory Stored:
   - Episodic: "User queried contaminated shipments at 3:45pm"
   - Semantic: "Normal contamination rate is 2-5% based on data"
9. Observability: Everything logged to Langfuse
10. User Follow-up: "What about from FacilityA?"
11. Intent Detection: Classifies as 'follow-up'
12. Context Retrieval: Knows the follow-up refers to the contaminated shipments from step 1
13. Entity Extraction: Extracts "FacilityA"
14. Filtered Response: Returns only FacilityA's contaminated shipments
Key Benefits
For Users
- ✅ Natural conversations (ask follow-ups, get clarifications)
- ✅ Transparency (see progress, understand confidence)
- ✅ Reliability (works even with ambiguous requests)
For Developers
- ✅ Production-ready (tested, documented, robust)
- ✅ Modular (use only what you need)
- ✅ Type-safe (TypeScript throughout)
- ✅ Well-tested (724 unit + 45 integration tests)
For Business
- ✅ Cost control (token budgets, usage tracking)
- ✅ Observability (debug production issues)
- ✅ Flexibility (switch providers, no lock-in)
- ✅ Scalability (handles long conversations efficiently)
What's Next?
Now that you understand the concepts:
- 🏗️ Architecture - See the technical design
- 💬 Conversational AI - Dive into response types
- 🧠 Context Management - Learn about compression
- 📊 Workflows - Build multi-step processes
- 💰 Token Management - Control costs
Questions? Check the guides or specific module documentation.