Blog · March 4, 2025

Mohamed Elbarry
Architecting Context-Aware Conversation Flows for E-commerce Chatbots
E-commerce chatbots fail when they treat every message in isolation. A user says "I'm looking for running shoes," then "something under $100," then "in blue"—and the bot asks which category they want. At Lumin AI, we built context-aware flows so the assistant remembers the conversation and can recommend products and answer follow-ups without starting over. Here's how we structured it and what we'd keep. Our stack:
  • Next.js + React for the chat UI and real-time updates
  • Nest.js for the backend and orchestration
  • MongoDB for conversation history and user context
  • Redis for caching conversation state and recent context windows
  • Supabase for real-time channels and auth where needed
  • TypeScript end-to-end for type-safe message and context shapes
  • LLM / NLU layer for intent and entity extraction and response generation
Two things made this hard. First, context across turns: we had to keep a compact representation of what the user had already said (product type, budget, preferences) and feed it into each new turn so the model didn't hallucinate or ignore prior answers. Dumping full conversation history blows up token count and latency; reports on context window size show latency growing several-fold once you push past a few thousand tokens. Second, latency itself: if the bot takes more than a couple of seconds to reply, the conversation feels broken. We had to fetch context, call the model, optionally hit the product API, and stream the reply without blowing the 2s budget. Doing that in 15+ languages added another constraint: we couldn't rely on one monolithic prompt; we needed structured context that worked across locales.

We modeled the current "state" of the conversation so the backend and the LLM always had a consistent view of what the user wanted. That made it easy to pass only the relevant slice into the context window instead of dumping full history.
```typescript
// types/conversation.ts
export interface ConversationContext {
  sessionId: string;
  locale: string;
  intent: 'browse' | 'support' | 'recommendation' | 'unknown';
  entities: {
    category?: string;
    priceMax?: number;
    attributes?: Record<string, string>;
  };
  lastProductIds: string[];
  turnCount: number;
  updatedAt: string;
}
```
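Each turn, only this compact context (plus the rolling summary) goes to the model. A minimal sketch of serializing it into a prompt fragment — the helper name, field selection, and wording here are illustrative, not our production prompt:

```typescript
// Hypothetical helper: render the compact context into a prompt fragment.
// Only fields that are actually set get included, keeping tokens predictable.
function contextToPrompt(ctx: {
  locale: string;
  intent: string;
  entities: { category?: string; priceMax?: number; attributes?: Record<string, string> };
  lastProductIds: string[];
}): string {
  const lines = [`Locale: ${ctx.locale}`, `Intent: ${ctx.intent}`];
  if (ctx.entities.category) lines.push(`Category: ${ctx.entities.category}`);
  if (ctx.entities.priceMax !== undefined) lines.push(`Max price: ${ctx.entities.priceMax}`);
  for (const [key, value] of Object.entries(ctx.entities.attributes ?? {})) {
    lines.push(`${key}: ${value}`);
  }
  if (ctx.lastProductIds.length) lines.push(`Recently shown: ${ctx.lastProductIds.join(', ')}`);
  return lines.join('\n');
}
```

A fragment like this goes into the system message, so "in blue" on turn three still lands next to the budget and category from turns one and two.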
We didn't send the entire thread to the model every time. We kept a rolling summary plus the last N messages in Redis keyed by session, and only loaded that for each request. That pattern—durable session state plus a small context window—matches what production LLM apps need for conversation memory and context management. Token usage stayed predictable and response time under control.
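The trimming logic for that window can be sketched like this — in-memory here for clarity; in production the window lives in a Redis list (RPUSH + LTRIM) next to the summary key, and the window size of 8 is illustrative:

```typescript
interface ChatMessage {
  role: 'user' | 'assistant';
  content: string;
}

const WINDOW_SIZE = 8; // last N messages that accompany the summary each turn

// Append the newest message and trim to the window size. Older turns are
// assumed to be folded into the rolling summary rather than resent verbatim.
function appendToWindow(window: ChatMessage[], msg: ChatMessage, size = WINDOW_SIZE): ChatMessage[] {
  const next = [...window, msg];
  return next.length > size ? next.slice(next.length - size) : next;
}
```

Keeping the cap small is what makes token usage and latency predictable regardless of how long the conversation runs.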
```typescript
// Simplified: load or build context for this turn
async function getContextForTurn(sessionId: string, newMessage: string): Promise<ConversationContext> {
  const cached = await redis.get(`ctx:${sessionId}`); // stored as a JSON string
  const base: ConversationContext = cached
    ? JSON.parse(cached)
    : { sessionId, locale: 'en', intent: 'unknown', entities: {}, lastProductIds: [], turnCount: 0, updatedAt: new Date().toISOString() };

  const updated = await extractAndMergeEntities(base, newMessage);
  await redis.setex(`ctx:${sessionId}`, 3600, JSON.stringify(updated)); // 1h TTL

  return updated;
}
```
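`extractAndMergeEntities` is where the LLM (or smaller NLU model) comes in. A heuristic stand-in to show the merge semantics — the regexes below are illustrative only; the real extractor is a model call, and the point is that new slots are merged into the existing context without discarding earlier answers:

```typescript
interface ConversationContext {
  sessionId: string;
  locale: string;
  intent: 'browse' | 'support' | 'recommendation' | 'unknown';
  entities: { category?: string; priceMax?: number; attributes?: Record<string, string> };
  lastProductIds: string[];
  turnCount: number;
  updatedAt: string;
}

// Stand-in for the LLM/NLU extraction call: pull new slots out of the
// message and merge them into the existing context, never dropping old ones.
async function extractAndMergeEntities(ctx: ConversationContext, message: string): Promise<ConversationContext> {
  const entities = { ...ctx.entities, attributes: { ...ctx.entities.attributes } };

  // "under $100" -> priceMax (illustrative regex, not the real extractor)
  const price = message.match(/under \$?(\d+)/i);
  if (price) entities.priceMax = Number(price[1]);

  // naive color pickup as a generic attribute
  const color = message.match(/\b(blue|red|black|white)\b/i);
  if (color) entities.attributes.color = color[1].toLowerCase();

  return {
    ...ctx,
    entities,
    intent: entities.category || entities.priceMax !== undefined ? 'recommendation' : ctx.intent,
    turnCount: ctx.turnCount + 1,
    updatedAt: new Date().toISOString(),
  };
}
```

The merge-don't-replace behavior is what lets "something under $100" on turn two and "in blue" on turn three accumulate onto the category from turn one.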
We used the LLM (or a smaller NLU step) to update intent and entities from each message, then branched logic on that instead of on free-form text. For "show me blue ones under $100" we'd set category/color and priceMax, then the recommendation path could call the product API with those filters and inject the results into the reply. That gave us consistent behavior and made it possible to tune prompts per intent and locale.

At Lumin AI this got us average response times under 2 seconds, a 60% reduction in support response time, and 25% higher customer satisfaction. Multilingual support (15+ languages) worked because the context structure was locale-agnostic: we only swapped prompts and copy per language. The analytics we built on top showed which intents and flows drove the most conversions, so we could double down on what worked. Context-aware flows weren't a gimmick; they were what made the chatbot actually useful.

I'm currently looking for new challenges in the AI and Full Stack space. If you're building something interesting, let's chat.