Slack Assistant V2: Hybrid AI Pipeline Architecture

Context

The Brainforge Slack Assistant V2 solves a common enterprise problem: decision-makers need answers from multiple systems (CRM, codebase, knowledge docs, web, meeting transcripts) without leaving their chat workflow. Building a bot that queries one system is straightforward. Building one that intelligently queries six, formats results coherently, and responds in seconds requires a specific architecture pattern.

The V1 assistant used a single LLM call with no tool integration. V2 adds six parallel integrations with keyword-routed intent detection, streaming status updates, and identity-aware responses.

Guidance

Architecture Pattern: Intent-Gated Parallel Pipeline

The core pattern is keyword-gated tool dispatch followed by parallel data gathering and LLM synthesis:

Slack Event
    │
    ▼
Regex Intent Gates
    │
    ├── shouldUseHubSpot ─────► HubSpot ──┐
    ├── shouldSearchWeb ──────► Exa ──────┤
    ├── shouldUseRepoContext ─► Plt ──────┤
    ├── shouldTriggerSkill ───► Lin ──────┤
    │                                     │
    ▼                                     ▼
Streaming Status Updates            Promise.all()
    │                                     │
    ▼                                     ▼
LLM Response Generation ◄────────── Context Assembly
    │
    ▼
Slack Reply (chat.update)

Key Design Decisions & Rationale

1. Keyword Gates over LLM Tool Selection

// Fast regex approach (avg 0.1ms)
const shouldSearchWeb = (text) => /\b(search|find|web|exa)\b/.test(text);
 
// vs. LLM approach (300ms + cost per query)
// "Which of these tools does this query need?"

Why: LLM-based tool selection on every message adds latency and token cost. For a first-pass router, regex is ~3000x faster and deterministic. The tradeoff: semantic queries that don’t match keywords get no data — the answer is still contextually sound, just less rich.
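
For example, a semantic query can miss every gate even though web data would help:

// The gate above returns false when no trigger keyword appears, so Exa
// is skipped and the reply leans on the LLM plus whatever else matched.
shouldSearchWeb('what are analysts saying about our market?'); // → false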

2. All Tools Called, Every Time

// Process flow
const [hubspot, exa, repo, github, transcripts] = await Promise.all([
  searchHubSpot(query),                           // may return empty
  shouldSearchWeb(query) ? exaSearch(query) : '', // gated at the caller
  buildRepoContext(query),                        // may return empty (internal gate)
  searchGitHubRepo(query),                        // may return empty
  searchTranscripts(query),                       // may return empty
]);

Why call tools that might not match? Because gating happens at two levels: shouldSearchWeb gates Exa at the caller, while buildRepoContext applies its own shouldUseRepoContext gate internally and returns empty when it does not fire. This redundant gating guarantees the LLM sees every piece of data that might be relevant, at the cost of always calling integrations like the GitHub API. For most queries, 4 of the 6 integrations return empty results, an acceptable waste in exchange for reliability.
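
A minimal sketch of the internal gate, assuming buildRepoContext returns a plain context string; the keyword list and the fetchRepoFiles helper are illustrative assumptions, not the actual implementation:

// Hypothetical internal gate; keyword list is illustrative only
const shouldUseRepoContext = (text: string): boolean =>
  /\b(repo|code|function|file|implementation)\b/i.test(text);

type RepoHit = { path: string; snippet: string };
declare function fetchRepoFiles(query: string): Promise<RepoHit[]>; // assumed helper

export const buildRepoContext = async (query: string): Promise<string> => {
  if (!shouldUseRepoContext(query)) return ''; // empty string drops out at assembly
  const hits = await fetchRepoFiles(query);
  return hits.map((h) => `${h.path}:\n${h.snippet}`).join('\n\n');
};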

3. Streaming Status Updates as UX Feedback

await updateStatus('Thinking...');                  // immediate feedback
// ... data gathering ...
await updateStatus('Generating response...');       // nudge before the slow step
// ... LLM generation ...
await client.chat.update({ channel, ts, text: fullReply }); // final answer replaces the status

Multi-second API calls without feedback feel broken. The three-phase status update (“Thinking…” → “Generating…” → answer) is a simple pattern with high UX impact. The updateStatus function deduplicates to avoid Slack rate limits.
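
A sketch of that deduplication, assuming the bot edits a placeholder status message in place via chat.update; makeStatusUpdater is an assumed name, not the actual helper:

import { WebClient } from '@slack/web-api';

// Returns an updateStatus function that skips repeat text, since editing
// the same message with identical content only burns Slack rate limit.
const makeStatusUpdater = (client: WebClient, channel: string, ts: string) => {
  let last = '';
  return async (text: string): Promise<void> => {
    if (text === last) return; // dedupe before hitting chat.update
    last = text;
    await client.chat.update({ channel, ts, text });
  };
};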

4. Identity Injection for Personalization

const identityContext = await resolveUserIdentity(slackUserId);
// "The user asking is Jane Doe (VP Sales) at Acme Corp."
const finalSystem = identityContext
  ? `${systemPrompt}\n\nIdentity: ${identityContext}`
  : systemPrompt;

Injecting identity into the system prompt is cheap and powerful. The LLM uses this to contextualize results — “show my deals” becomes “show Jane Doe’s deals.” Identity resolution fails silently to maintain graceful degradation.
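
A sketch of the silent-failure behavior, assuming a Slack users.info lookup as the identity source; the explicit client parameter is added here for self-containment and may differ from the real resolver:

import { WebClient } from '@slack/web-api';

// Any failure (missing scopes, deleted user, network) yields '' so the
// reply proceeds without personalization instead of erroring out.
const resolveUserIdentity = async (
  client: WebClient,
  slackUserId: string,
): Promise<string> => {
  try {
    const { user } = await client.users.info({ user: slackUserId });
    const name = user?.real_name ?? user?.name ?? '';
    const title = user?.profile?.title ?? '';
    return name ? `The user asking is ${name}${title ? ` (${title})` : ''}.` : '';
  } catch {
    return ''; // fail silently: identity is a bonus, never a blocker
  }
};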

Integration Signatures

Every integration follows this contract:

type IntegrationResult = {
  context: string;        // Formatted text for LLM context window
  citations: Array<{      // Source URLs for transparency
    url: string;
    title: string;
  }>;
};

This uniform output lets the orchestrator join results with '\n\n---\n\n' without caring which integration produced them. Citations surface in the context for the LLM to reference in its response.
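
A minimal sketch of that join, using the IntegrationResult type above; assembleContext is an assumed name, not necessarily the orchestrator's:

// Merge any number of integration results into one LLM-ready context
// string plus a flat citation list; order follows the input array.
const assembleContext = (results: IntegrationResult[]) => ({
  context: results
    .map((r) => r.context)
    .filter(Boolean)
    .join('\n\n---\n\n'),
  citations: results.flatMap((r) => r.citations),
});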

Config-Driven Integration Enablement

Integrations are optional per the AssistantConfig type:

type AssistantConfig = {
  // Required
  botToken: string;
  signingSecret: string;
  azureApiKey: string;
  azureDeployment: string;
 
  // Optional — each enables its integration
  hubspotAccessToken?: string;
  exaApiKey?: string;
  platformApiUrl?: string;
  githubToken?: string;
};

An integration is skipped when its credential is missing. This makes it safe to deploy with partial configuration — the bot works, just without that data source.
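
A sketch of that skip logic, using the AssistantConfig and IntegrationResult types above; the searchHubSpot signature and gatherHubSpot name are assumptions:

declare function searchHubSpot(args: {
  accessToken: string;
  query: string;
}): Promise<IntegrationResult>; // assumed signature

const emptyResult: IntegrationResult = { context: '', citations: [] };

// Missing credential resolves to an empty result instead of throwing,
// so a partially configured deployment still answers from other sources.
const gatherHubSpot = (config: AssistantConfig, query: string): Promise<IntegrationResult> =>
  config.hubspotAccessToken
    ? searchHubSpot({ accessToken: config.hubspotAccessToken, query })
    : Promise.resolve(emptyResult);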

Why This Matters

This architecture pattern solves the tension between response speed and data breadth in AI assistants:

Approach                               Latency   Coverage   Reliability
Single LLM call (no tools)             ~1s       Low        High
Sequential tool calling                3-8s      High       Medium
Parallel tool calling (this pattern)   2-4s      High       High
LLM-routed tool selection              3-6s      Medium     Medium

The pattern is replicable: Adding a new integration requires:

  1. A source file with a searchXxx() function returning { context, citations }
  2. A shouldUseXxx() keyword gate
  3. Wiring in assistant.ts: add a Promise.all branch and include in the context assembly

No LLM prompt changes needed. No routing logic updates.

Compound value: Each new integration multiplies the assistant’s usefulness (adds new answer categories) while adding zero complexity to the core flow. The 7th integration costs the same as the 2nd.

When to Apply

  • Chat-based AI assistants that need to query multiple enterprise systems
  • Multi-source research bots where breadth of data matters more than query efficiency
  • Gradually-scoped assistants starting with 1-2 integrations and growing
  • Slack/Teams bot development where streaming status updates are available

Avoid when:

  • Single data source is sufficient
  • Sub-second responses are required (the pipeline takes 2-4s even in the best case)
  • Every query genuinely needs every integration (intent gating buys nothing when there is no empty-result waste to skip)

Examples

Adding a New Integration

Step-by-step for adding a Notion search:

// 1. notion.ts
export const searchNotion = async ({ apiKey, query }) => {
  // ... Notion API call ...
  return { context: formattedResults, citations };
};
 
// 2. assistant.ts — add Promise.all branch; map to .context so the
// string-based join below keeps working
const [hubspotCtx, exaCtx, repoCtx, githubCtx, transcriptCtx, notionCtx] = await Promise.all([
  // ... existing ...
  config.notionApiKey
    ? searchNotion({ apiKey: config.notionApiKey, query }).then((r) => r.context)
    : '',
]);
const parts = [hubspotCtx, exaCtx, repoCtx, githubCtx, transcriptCtx, notionCtx].filter(Boolean);

Customizing a Keyword Gate

// Adding "top" and "trending" as Exa triggers
const shouldSearchWeb = (text: string): boolean => {
  const lower = text.toLowerCase();
  return /\b(search|find|lookup|look up|web|internet|online|news|latest|exa|who is|compare)\b/.test(lower)
    || /\b(top|trending)\b/.test(lower); // new triggers
};

References

  • apps/slack-apps/brainforge-assistant/ARCHITECTURE.md — Comprehensive developer guide
  • knowledge/engineering/brainforge-slack-assistant/ROADMAP.md — Vision and phased roadmap
  • knowledge/engineering/brainforge-slack-assistant/cloud-agent-runbook.md — Cloud Agent execution
  • apps/slack-apps/brainforge-assistant/TESTING.md — Testing guide
  • apps/slack-apps/brainforge-assistant/e2e/README.md — Browser E2E details