Slack Assistant GitHub Repository Search Integration

Context

The GitHub integration enables live code and documentation search within the brainforge-ai/brainforge-platform repository. Unlike the Platform repo integration (which uses vector search for semantic matching on indexed content), this integration calls the GitHub code search API directly, so results always reflect the current state of the repository. It powers queries like “find where we handle OAuth” or “show me the HubSpot API config.”

Guidance

Two-Pass Search Strategy

The integration runs two searches per query to maximize coverage:

  1. Code search: GET /search/code?q={query}+repo:brainforge-ai/brainforge-platform&per_page=10

    • Returns code file paths and URLs
    • Raw content fetched for top 5 matches via GET /repos/{repo}/contents/{path} with Accept: application/vnd.github.v3.raw
  2. Markdown search: Same query appended with +extension:md

    • Captures documentation and knowledge base entries
    • Results appended to citation list (no separate content fetch)
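
The two request paths described above can be sketched as follows. This is a minimal illustration, not the code in github.ts: buildSearchPaths is a hypothetical helper name, and reusing per_page=10 for the markdown pass is an assumption.

```typescript
// Hypothetical helper: builds the request paths for the two-pass search.
const REPO = 'brainforge-ai/brainforge-platform';

function buildSearchPaths(
  query: string,
  repo: string = REPO,
): { code: string; docs: string } {
  // Encode the user's text; qualifiers are joined with '+', which the
  // query string treats as a space.
  const q = encodeURIComponent(query.trim());
  const base = `/search/code?q=${q}+repo:${repo}`;
  return {
    // Pass 1: all code matches in the repo.
    code: `${base}&per_page=10`,
    // Pass 2: same query restricted to markdown files (per_page assumed).
    docs: `${base}+extension:md&per_page=10`,
  };
}
```

Both paths hit the same /search/code endpoint, so each user query costs two search requests before any file content is fetched.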

Raw Content Fetching Pitfalls

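// Third argument: extra request headers. The raw media type makes the
// contents endpoint return the file text instead of a JSON envelope.
const content = await makeGitHubRequest(key, `/repos/${repo}/contents/${item.path}`, {
  Accept: 'application/vnd.github.v3.raw',
});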

The Accept: application/vnd.github.v3.raw header is essential. Without it, the API returns a JSON object with file metadata and a base64-encoded content field; with it, the response body is the raw file text. The header is passed as the third (headers) argument of makeGitHubRequest.
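
To make the header override concrete, here is a minimal sketch of how a headers argument might be merged with defaults. The internals of makeGitHubRequest are not shown in this entry, so buildGitHubHeaders, the default JSON media type, and the Bearer authorization scheme are all assumptions for illustration.

```typescript
type HeaderMap = Record<string, string>;

// Hypothetical helper: merges caller-supplied headers over defaults.
function buildGitHubHeaders(token: string, extra: HeaderMap = {}): HeaderMap {
  return {
    // Assumed default: GitHub's JSON media type (metadata + base64 content).
    Accept: 'application/vnd.github+json',
    Authorization: `Bearer ${token}`,
    // Caller headers win, so passing the raw media type replaces Accept.
    ...extra,
  };
}

const jsonHeaders = buildGitHubHeaders('example-token');
const rawHeaders = buildGitHubHeaders('example-token', {
  Accept: 'application/vnd.github.v3.raw',
});
```

Spreading the caller's headers last means a missing third argument falls back to the JSON response, which is the failure mode this section warns about.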

Rate Limit Awareness

const filePromises = items.slice(0, 5).map(async (item) => { /* ... */ });

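Only 5 file contents are fetched per query to limit API usage. Each content fetch counts against GitHub’s core REST limit (5,000 req/hr with a token, 60 req/hr unauthenticated). The two search calls count against a separate, stricter limit: the code search endpoint requires authentication and allows 10 requests per minute. Each failed file fetch is caught silently and excluded from results.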
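
The cap and the silent-failure behavior can be sketched together. This is an illustration under assumptions, not the body elided in the snippet above: fetchTopFiles and the injected fetchFile callback are hypothetical names, with fetchFile standing in for the raw-content request.

```typescript
interface SearchItem { path: string }
interface FetchedFile { path: string; content: string }

// Hypothetical sketch: fetch at most `limit` files in parallel and
// drop any that fail, so one bad path does not fail the whole query.
async function fetchTopFiles(
  items: SearchItem[],
  fetchFile: (path: string) => Promise<string>,
  limit = 5,
): Promise<FetchedFile[]> {
  const results = await Promise.all(
    items.slice(0, limit).map(async (item): Promise<FetchedFile | null> => {
      try {
        return { path: item.path, content: await fetchFile(item.path) };
      } catch {
        return null; // failed fetch: excluded from results, no error surfaced
      }
    }),
  );
  // Keep only successful fetches, preserving search-result order.
  return results.filter((r): r is FetchedFile => r !== null);
}
```

Catching inside each mapped promise, rather than around Promise.all, is what lets the remaining files survive a single failure.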

Dual-Use Token

The GitHub token (GITHUB_TOKEN) is shared between github.ts and transcripts.ts. This is intentional: transcript files are stored in the same repo and accessed through the same GitHub API, so one environment variable serves both integrations.

Result Formatting

Results include a file list and file contents for top matches:

GitHub code search in brainforge-ai/brainforge-platform: "HubSpot config"
Found 12 code matches and 3 doc matches.

Relevant files:
- apps/slack-apps/brainforge-assistant/src/hubspot.ts
- apps/slack-apps/brainforge-assistant/src/config.ts
- ...

File contents:
[1] apps/slack-apps/brainforge-assistant/src/hubspot.ts
const HUBSPOT_API_HOST = 'api.hubapi.com';
...

The LLM uses the file contents (truncated to 1000 chars per file) to answer the user’s question.
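
A formatter producing the layout shown above might look like the following sketch. The function name, parameter list, and MAX_CHARS constant are hypothetical; only the output shape and the 1000-character truncation come from this entry.

```typescript
interface FileResult { path: string; content: string }

const MAX_CHARS = 1000; // per-file truncation limit stated above

// Hypothetical formatter for the text block handed to the LLM.
function formatResults(
  repo: string,
  query: string,
  codeTotal: number,
  docTotal: number,
  paths: string[],
  files: FileResult[],
): string {
  // One numbered section per fetched file, truncated to MAX_CHARS.
  const sections = files.map(
    (file, i) => `[${i + 1}] ${file.path}\n${file.content.slice(0, MAX_CHARS)}`,
  );
  return [
    `GitHub code search in ${repo}: "${query}"`,
    `Found ${codeTotal} code matches and ${docTotal} doc matches.`,
    '',
    'Relevant files:',
    ...paths.map((p) => `- ${p}`),
    '',
    'File contents:',
    sections.join('\n\n'),
  ].join('\n');
}
```

Truncating per file keeps the prompt size bounded at roughly 5,000 characters of file content when 5 files are fetched.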

Why This Matters

This integration provides live code search with zero indexing latency. Unlike the Platform repo integration (which relies on pre-indexed vector embeddings that may be stale), GitHub API search always reflects the current state of the repository. The two-pass strategy (code + markdown) ensures both implementation files and documentation files are found.

When to Apply

  • Any assistant that needs to answer questions about a codebase
  • Development tools that search for implementations, configs, or patterns
  • Repository-aware research assistants

Examples

Basic Query

User: "Where do we handle OAuth in the platform?"
→ searchGitHubRepo({ token, query: "OAuth" })
→ Results include: src/utils/auth/oauth.ts, knowledge/standards/02-patterns/auth-oauth.md

Query Sanitization

The query is cleaned before sending: Slack mentions are stripped (<@U12345>), tool invocation phrases like “search the repo” are removed, and whitespace is normalized.
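
The three cleaning steps can be sketched as a small pure function. This is an assumption-laden illustration rather than the logic in assistant.ts: cleanQuery and TOOL_PHRASES are hypothetical names, and the phrase list contains only the one example given above.

```typescript
// Hypothetical phrase list; only "search the repo" is named in this entry.
const TOOL_PHRASES: RegExp[] = [/\bsearch the repo\b/gi];

function cleanQuery(raw: string): string {
  // 1. Strip Slack user mentions such as <@U12345>.
  let q = raw.replace(/<@[A-Z0-9]+>/g, ' ');
  // 2. Remove tool invocation phrases.
  for (const phrase of TOOL_PHRASES) {
    q = q.replace(phrase, ' ');
  }
  // 3. Collapse runs of whitespace and trim the ends.
  return q.replace(/\s+/g, ' ').trim();
}

const cleaned = cleanQuery('<@U12345>  search the repo   for OAuth handlers');
```

Replacing each match with a space and normalizing at the end avoids accidentally joining the words on either side of a removed mention or phrase.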

Related Files

  • apps/slack-apps/brainforge-assistant/src/github.ts
  • apps/slack-apps/brainforge-assistant/src/transcripts.ts (shares the token)
  • apps/slack-apps/brainforge-assistant/src/assistant.ts (query cleaning logic)
  • docs/solutions/architecture-patterns/slack-assistant-v2-hybrid-ai-pipeline-2026-04-28.md