Core Concepts
This page explains the key ideas behind Clear AI v2 in plain language. Whether you're a developer, product manager, or just curious about conversational AI, you'll understand how everything works.
The Problem: Why Traditional AI Agents Fall Short
Imagine asking an assistant: "Show me sales data"
Traditional AI might:
- Guess which time period you mean (this week? this month?)
- Return all data even if it's millions of records
- Give confident answers even with incomplete data
- Fail if the task takes multiple steps
Clear AI v2 instead:
- Asks: "Which time period: today, this week, or this month?"
- Shows progress: "Step 2/4: Analyzing 1,543 records..."
- Expresses uncertainty: "Based on limited data (23%), I'm 65% confident..."
- Remembers context when you ask "What about rejected ones?"
Core Concepts
1. Conversational Intelligence
Traditional AI gives one-shot responses. Clear AI v2 enables natural, multi-turn conversations.
What It Means
Your AI can:
- Ask clarifying questions when information is ambiguous
- Show progress for long-running tasks
- Express uncertainty when confidence is low
- Understand follow-ups like "what about those?" or "and the rejected ones?"
Why It Matters
Real users don't speak in perfect queries. They say things like:
- "Show me data" (what data? from when?)
- "What about the other ones?" (refers to previous conversation)
- "Yes" (confirming an action)
Clear AI v2 handles all of this naturally.
How It Works
Response Types: The AI can respond in different ways
- Answer: Direct response with data
- Question: Ask for clarification
- Progress: "Step 2 of 5: Processing..."
- Acknowledgment: "Got it, working on that"
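As a sketch, these response kinds could be modeled as a discriminated union; the type and field names below are illustrative, not the library's actual API:

```typescript
// Illustrative sketch: one union type covering the four response kinds.
// Names are hypothetical; consult the actual Clear AI v2 types.
type AgentResponse =
  | { kind: 'answer'; text: string; data?: unknown; confidence: number }
  | { kind: 'question'; text: string; options?: string[] } // clarification
  | { kind: 'progress'; step: number; totalSteps: number; text: string }
  | { kind: 'acknowledgment'; text: string };

function render(response: AgentResponse): string {
  switch (response.kind) {
    case 'answer':
      return response.text;
    case 'question':
      // Offer the choices inline, e.g. "Which time period? (today, this week)"
      return response.options
        ? `${response.text} (${response.options.join(', ')})`
        : response.text;
    case 'progress':
      return `Step ${response.step}/${response.totalSteps}: ${response.text}`;
    case 'acknowledgment':
      return response.text;
  }
}
```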
Intent Detection: Understands what users mean
- "Show me shipments" → Query (fetch data)
- "What is a shipment?" → Question (explain concept)
- "yes" or "no" → Confirmation (approve/deny)
- "What about FacilityB?" → Follow-up (referring to previous context)
Confidence Scoring: Expresses certainty appropriately
- High confidence (90%): Give direct answer
- Low confidence (60%): Add disclaimer "I'm not completely certain..."
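In code, that thresholding might be as simple as the sketch below (the 90% cutoff mirrors the example above and would be configurable):

```typescript
// Prefix a disclaimer when confidence falls below a chosen threshold.
function phraseWithConfidence(answer: string, confidence: number): string {
  if (confidence >= 0.9) return answer;                // direct answer
  return `I'm not completely certain, but: ${answer}`; // hedged answer
}
```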
2. Context Management
AI models have limited memory (context windows). A conversation might start with 4,000 tokens of headroom, but after 20 messages you're out of space.
What It Means
Clear AI v2 manages conversation context intelligently:
- Tracks all messages and their token counts
- Compresses old messages when the context window fills up
- Preserves important information (recent messages, key entities, system instructions)
- Saves ~70-80% of tokens when compressing
Why It Matters
Without context management:
- Long conversations fail with "context limit exceeded"
- Important information gets lost
- Every message costs more and more
With Clear AI v2:
- Conversations can go on indefinitely
- Key information is always available
- Costs stay reasonable
How It Works
Compression Strategies:
- Sliding Window: Keep only the N most recent messages
- Prioritization: Keep important messages, drop less important ones
- Summarization: Use AI to condense old messages into summaries
The system automatically chooses the best strategy based on the situation.
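As an illustration, here is roughly what the simplest strategy, the sliding window, could look like. This is a sketch, not the library's implementation:

```typescript
interface Message {
  role: 'system' | 'user' | 'assistant';
  content: string;
  tokens: number;
}

// Sliding window: always keep system instructions, then keep the most
// recent messages that still fit inside the token budget.
function slidingWindow(messages: Message[], maxTokens: number): Message[] {
  const system = messages.filter((m) => m.role === 'system');
  let used = system.reduce((sum, m) => sum + m.tokens, 0);

  const recent: Message[] = [];
  for (let i = messages.length - 1; i >= 0; i--) {
    const m = messages[i];
    if (m.role === 'system') continue;
    if (used + m.tokens > maxTokens) break; // older messages get dropped
    used += m.tokens;
    recent.unshift(m); // keep chronological order
  }
  return [...system, ...recent];
}
```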
3. Memory Systems
There are two types of memory:
Episodic Memory (Neo4j)
What: Remembers the flow of conversations
Like: Your personal journal of interactions
Example:
- User asked about shipments at 2pm
- Then asked about facilities at 2:15pm
- Then asked "What about rejected ones?" (referring to shipments from earlier)
Why: Understand conversation history and relationships
Semantic Memory (Pinecone)
What: Remembers facts and knowledge that can be searched
Like: A searchable database of everything the AI has learned
Example:
- "FacilityA specializes in plastic recycling"
- "Normal contamination rate is 2-5%"
- "User prefers weekly reports"
Why: Retrieve relevant information based on meaning, not just keywords
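To make the split concrete, here is a hypothetical TypeScript shape for the two stores. The interfaces and method names are illustrative, not the actual Clear AI v2 API:

```typescript
// Hypothetical wrappers; the real Clear AI v2 APIs may differ.
interface EpisodicMemory {
  // Record an interaction as a graph node in Neo4j, linked to the previous one.
  recordEvent(sessionId: string, event: { text: string; at: Date }): Promise<void>;
  // Walk the conversation graph to resolve references like "rejected ones".
  recentEvents(sessionId: string, limit: number): Promise<string[]>;
}

interface SemanticMemory {
  // Embed the fact and upsert the vector into Pinecone.
  storeFact(fact: string): Promise<void>;
  // Embed the query and return the nearest facts by meaning, not keywords.
  search(query: string, topK: number): Promise<string[]>;
}

// Usage sketch: a query about "contamination levels" can match the stored
// fact "Normal contamination rate is 2-5%" by embedding similarity,
// even though the exact words differ.
async function answerWithMemory(semantic: SemanticMemory, query: string) {
  const facts = await semantic.search(query, 3);
  return facts;
}
```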
4. Workflows
Complex tasks require multiple steps with conditional logic. Workflows let you define these step-by-step processes.
What It Means
Instead of writing complex if/else chains, you define a state graph:
```
Start → Fetch Data → Analyze → Valid?
                                 ↓ No  → Ask User
                                 ↓ Yes → Generate Report → End
```
Why It Matters
Real business logic is complex:
- If analysis confidence > 70%, proceed; otherwise ask for confirmation
- If data is incomplete, fetch more; if that fails, notify user
- If task is interrupted, save state and resume later
How It Works
Graph Builder: Define nodes (steps) and edges (transitions)
Conditional Branching: Next step depends on current state
Checkpointing: Save progress, resume after interruptions
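Here is a minimal sketch of such a state graph, with nodes as async functions and conditional edges chosen by return value. The names and the run loop are illustrative, not the real graph builder:

```typescript
// Minimal state-graph sketch: nodes are async functions, edges are
// chosen by each node's return value.
type State = { records?: number[]; valid?: boolean; report?: string };
type WorkflowNode = (s: State) => Promise<string>; // returns the next node's name

const nodes: Record<string, WorkflowNode> = {
  fetchData: async (s) => {
    s.records = [1, 2, 3]; // stand-in for a database call
    return 'analyze';
  },
  analyze: async (s) => {
    s.valid = (s.records?.length ?? 0) > 0;
    return s.valid ? 'generateReport' : 'askUser'; // conditional branch
  },
  askUser: async () => 'end', // would emit a clarifying question
  generateReport: async (s) => {
    s.report = `Analyzed ${s.records?.length} records`;
    return 'end';
  },
};

async function run(start: string, state: State): Promise<State> {
  let current = start;
  while (current !== 'end') {
    // A real engine would checkpoint `state` here to resume after interruptions.
    current = await nodes[current](state);
  }
  return state;
}

run('fetchData', {}).then(console.log);
```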
5. Token Management
AI providers charge by the token (roughly three-quarters of an English word each). Longer conversations = more money.
What It Means
Clear AI v2 helps you:
- Count tokens accurately for any text or conversation
- Set budgets per operation or per user
- Estimate costs before executing
- Prevent overruns automatically
Why It Matters
Without token management:
- Costs spiral out of control
- One long conversation could cost $10+
- No visibility into spending
With Clear AI v2:
- Set budget: "Max 5,000 tokens per query"
- Get warnings: "90% of budget used"
- See costs: "This query cost $0.012"
How It Works
Token Counter: Uses tiktoken for accurate counting across all models
Token Budget: Reserve tokens, track usage, prevent overruns
Cost Estimator: Calculate $ cost based on provider pricing
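A minimal budget tracker could look like the sketch below; the class and thresholds are illustrative, following the track/warn/estimate behavior described above:

```typescript
// Illustrative token budget: check affordability before a call, record
// actual usage, and warn as the budget fills up.
class TokenBudget {
  private used = 0;
  constructor(private readonly maxTokens: number) {}

  canAfford(tokens: number): boolean {
    return this.used + tokens <= this.maxTokens;
  }

  record(tokens: number): void {
    this.used += tokens;
    if (this.used / this.maxTokens >= 0.9) {
      console.warn(`${Math.round((this.used / this.maxTokens) * 100)}% of budget used`);
    }
  }

  // Cost estimate given a per-1K-token price (illustrative pricing).
  estimateCost(pricePer1K: number): number {
    return (this.used / 1000) * pricePer1K;
  }
}

// Usage sketch: "Max 5,000 tokens per query"
const budget = new TokenBudget(5_000);
if (budget.canAfford(1_200)) {
  budget.record(1_200);
  console.log(`This query cost $${budget.estimateCost(0.002).toFixed(4)}`);
}
```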
6. Multi-Provider LLM Support
Don't lock yourself into one AI provider.
What It Means
Clear AI v2 works with:
- OpenAI (GPT-3.5, GPT-4) - Industry standard
- Groq (Llama, Mixtral) - Fast, free tier
- Ollama (Local models) - Privacy, no costs
Automatic fallback: If OpenAI is down, try Groq, then Ollama.
Why It Matters
- Reliability: If one provider is down, others work
- Cost optimization: Use cheaper providers when appropriate
- Privacy: Use local models for sensitive data
- Flexibility: Switch providers without code changes
How It Works
One unified interface for all providers:
```typescript
const llm = new LLMProvider(); // Automatically picks best available
const response = await llm.chat(messages);
```
Configuration handles everything:
```bash
DEFAULT_LLM_PROVIDER=openai
FALLBACK_PROVIDERS=groq,ollama
```
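Under the hood, the fallback can be as simple as trying providers in order. This is a sketch with a hypothetical provider interface, not the library's internals:

```typescript
// Illustrative fallback loop; each real client sits behind a shared interface.
interface ChatProvider {
  name: string;
  chat(messages: { role: string; content: string }[]): Promise<string>;
}

async function chatWithFallback(
  providers: ChatProvider[], // e.g. [openai, groq, ollama]
  messages: { role: string; content: string }[],
): Promise<string> {
  let lastError: unknown;
  for (const provider of providers) {
    try {
      return await provider.chat(messages);
    } catch (err) {
      lastError = err; // provider down or rate-limited; try the next one
    }
  }
  throw lastError;
}
```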
7. Observability
In production, you need to see what's happening when things go wrong.
What It Means
Clear AI v2 integrates with Langfuse to track:
- Every LLM call (prompt, response, tokens, cost)
- Every workflow step
- Performance metrics
- Errors and failures
Why It Matters
When a user reports "the AI gave a wrong answer":
- See exactly what prompt was sent
- Check what the LLM returned
- Review token usage and costs
- Trace through the entire workflow
How It Works
Automatic tracing with minimal code:
```typescript
const tracer = new LangfuseTracer(config);
const trace = tracer.startTrace('user_query');
// ... execute your logic
tracer.endTrace(trace.id);
```
View everything in the Langfuse dashboard.
How It All Fits Together
Here's a typical flow:
1. User sends message: "Show me contaminated shipments"
2. Intent Detection: Classifies as 'query'
3. Context Check: Do we have enough tokens? Yes.
4. Workflow Starts:
   - Step 1: Fetch data from database
   - Step 2: Analyze results
   - Step 3: Calculate confidence (85%)
   - Step 4: Format response
5. Progress Updates: User sees "Step 2/4: Analyzing data..."
6. Response Built: "Found 23 contaminated shipments from the past week. [confidence: 85%]"
7. Context Updated: Message and response added to conversation history
8. Memory Stored:
   - Episodic: "User queried contaminated shipments at 3:45pm"
   - Semantic: "Normal contamination rate is 2-5% based on data"
9. Observability: Everything logged to Langfuse
10. User Follow-up: "What about from FacilityA?"
11. Intent Detection: Classifies as 'follow-up'
12. Context Retrieval: Knows the follow-up refers to the contaminated shipments from step 1
13. Entity Extraction: Extracts "FacilityA"
14. Filtered Response: Returns only FacilityA's contaminated shipments
Key Benefits
For Users
- ✅ Natural conversations (ask follow-ups, get clarifications)
- ✅ Transparency (see progress, understand confidence)
- ✅ Reliability (works even with ambiguous requests)
For Developers
- ✅ Production-ready (tested, documented, robust)
- ✅ Modular (use only what you need)
- ✅ Type-safe (TypeScript throughout)
- ✅ Well-tested (724 unit + 45 integration tests)
For Business
- ✅ Cost control (token budgets, usage tracking)
- ✅ Observability (debug production issues)
- ✅ Flexibility (switch providers, no lock-in)
- ✅ Scalability (handles long conversations efficiently)
What's Next?
Now that you understand the concepts:
- 🏗️ Architecture - See the technical design
- 💬 Conversational AI - Dive into response types
- 🧠 Context Management - Learn about compression
- 📊 Workflows - Build multi-step processes
- 💰 Token Management - Control costs
Questions? Check the guides or specific module documentation.