OpenCode as a Cursor Cloud Alternative (Slack-Triggered) — Investigation
Date: 2026-03-12
Owner: Platform Engineering
Status: Proposed MVP path
1) Executive recommendation
Use a hybrid architecture:
- Keep the Slack-triggered MVP independent of OpenWork platform readiness. The MVP does not require the OpenWork web app or hosted Labs (labs.brainforge.ai) to be production-ready. A background worker can run OpenCode (CLI or minimal runner) in an isolated environment without the OpenWork UI.
- Use the existing Brainforge Slack Assistant as the control plane for user triggers.
- Start with a small background worker that runs OpenCode jobs asynchronously and posts results back to Slack.
Optional later: when the OpenWork platform is up, the same worker can be pointed at a hosted OpenWork runtime for richer session/proxy features. That is a follow-on integration, not an MVP gate.
2) Why this is viable now
From current repo state:
- Slack trigger surfaces already exist in apps/slack-apps/brainforge-assistant and support slash-command expansion — no dependency on OpenWork.
- OpenCode can run headlessly (CLI / non-interactive job) in a worker; we do not need the OpenWork web app for the MVP.
- OpenWork exists in-repo (apps/openwork) with hosted runtime docs; Platform has OpenWork routing/rollout scaffolding (the /openwork route + rollout flag). These are optional for the Slack-triggered MVP and become relevant when we want in-app UX or Labs-backed runs.
The missing piece is mostly orchestration and safety controls for unattended coding runs (queue, worker, time/token caps), not “wait for OpenWork platform.”
3) Priority requirements and parity snapshot
Must-have (your stated priority)
- Feature parity for common asks (basic code edits, tests, summaries, small refactors).
- Slack-triggered runs (command + status updates + result links).
- Azure/Gemini model support with easy default switching.
- Lower cost envelope than current 10–30 daily cloud sessions per engineer.
Current parity (high-level)
- Slack trigger path: partial (existing Slack bot is productionized; add OpenCode job command).
- OpenCode runtime for MVP: viable via worker + OpenCode (CLI or minimal runner in a container/VM). Hosted OpenWork (Labs) is not required for Phase A; it can be a later integration.
- Model support: viable via OpenCode providers (Azure + Gemini are supported provider paths).
- Ops/runbooks: strong baseline exists where applicable (e.g. token handling, rollback); worker-specific runbook can be added without blocking on OpenWork platform.
4) Recommended MVP architecture (small start)
Control plane (Slack)
- Add a new command pattern in the Slack assistant.
- Proposed command format:
  /brainforge code <repo> <prompt> [--strong] [--branch <name>]
  Example: /brainforge code brainforge-platform "add input validation to the contact form"
- Command payload includes:
- repo/workspace target,
- task prompt,
- model tier (cheap default, strong optional),
- optional branch naming hint.
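A minimal parsing sketch (TypeScript) for the command text above. The OpenCodeJobRequest shape and parseCodeCommand name are illustrative, not existing types in the repo; it assumes Slack delivers everything after /brainforge as one text string.

```typescript
// Illustrative shape only — not an existing type in the repo.
interface OpenCodeJobRequest {
  repo: string;
  prompt: string;
  tier: "cheap" | "strong";
  branchHint?: string;
}

// Parses `code <repo> <prompt> [--strong] [--branch <name>]` from the slash-command text.
export function parseCodeCommand(text: string): OpenCodeJobRequest {
  // Tokenize, keeping quoted prompts together.
  const tokens = text.trim().match(/"[^"]*"|\S+/g) ?? [];
  const [subcommand, repo, ...rest] = tokens;
  if (subcommand !== "code" || !repo) {
    throw new Error("Usage: /brainforge code <repo> <prompt> [--strong] [--branch <name>]");
  }

  let tier: "cheap" | "strong" = "cheap";
  let branchHint: string | undefined;
  const promptParts: string[] = [];

  for (let i = 0; i < rest.length; i++) {
    const token = rest[i];
    if (token === "--strong") {
      tier = "strong";
    } else if (token === "--branch") {
      branchHint = rest[++i]; // next token is the branch naming hint
    } else {
      promptParts.push(token.replace(/^"|"$/g, "")); // strip surrounding quotes
    }
  }

  return { repo, prompt: promptParts.join(" "), tier, branchHint };
}
```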
Execution plane (background worker) — decoupled from OpenWork platform
- Queue: At 50–100 requests/day, use a jobs table in existing Supabase (or Postgres). No new queue service (e.g. Redis) required for MVP. Slack app inserts a row when a command is received; worker polls the table for pending jobs (or uses Postgres LISTEN/NOTIFY if you want). Add Redis only if volume or latency requirements grow later.
- Worker runs in an isolated environment (e.g. container or VM):
- Creates isolated run directory,
- Executes OpenCode non-interactive job (CLI or minimal runner),
- Captures logs/artifacts.
- No dependency on OpenWork web app or Labs for this path. When Labs is ready, the worker can optionally call hosted OpenWork APIs instead of (or in addition to) local OpenCode runs.
- Worker posts lifecycle updates to Slack thread:
- queued → running → tests → completed/failed.
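A sketch of the worker loop under the assumptions above: the jobs table lives in the existing Supabase project under a placeholder name (opencode_jobs), and the OpenCode CLI is invoked non-interactively (the exact subcommand/flags should be confirmed against the OpenCode docs). Posting lifecycle updates to the Slack thread is omitted for brevity.

```typescript
import { createClient } from "@supabase/supabase-js";
import { execFile } from "node:child_process";
import { mkdir } from "node:fs/promises";
import { promisify } from "node:util";

const exec = promisify(execFile);
const supabase = createClient(process.env.SUPABASE_URL!, process.env.SUPABASE_SERVICE_KEY!);

// Placeholder table name and columns — align with whatever schema Phase A defines.
const JOBS_TABLE = "opencode_jobs";
const JOB_TIMEOUT_MS = 15 * 60 * 1000; // 15 min cap from §5.2

async function claimNextJob() {
  // Naive claim: fetch the oldest pending job, then mark it running.
  const { data: job } = await supabase
    .from(JOBS_TABLE)
    .select("*")
    .eq("status", "pending")
    .order("created_at", { ascending: true })
    .limit(1)
    .maybeSingle();
  if (!job) return null;
  await supabase.from(JOBS_TABLE).update({ status: "running" }).eq("id", job.id);
  return job;
}

async function runJob(job: { id: string; repo: string; prompt: string }) {
  const workspace = `/tmp/opencode/${job.id}`; // isolated run directory per job
  try {
    await mkdir(workspace, { recursive: true });
    // Placeholder invocation — confirm the actual OpenCode non-interactive entry point.
    const { stdout } = await exec("opencode", ["run", job.prompt], {
      cwd: workspace,
      timeout: JOB_TIMEOUT_MS,
    });
    await supabase.from(JOBS_TABLE).update({ status: "completed", output: stdout }).eq("id", job.id);
  } catch (err) {
    await supabase.from(JOBS_TABLE).update({ status: "failed", output: String(err) }).eq("id", job.id);
  }
}

// Persistent loop: poll → run → report, then sleep briefly.
export async function workerLoop() {
  for (;;) {
    const job = await claimNextJob();
    if (job) await runJob(job);
    else await new Promise((r) => setTimeout(r, 5_000));
  }
}
```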
Result surface
- Return in Slack:
- short summary,
- changed files,
- test output snippet,
- branch/commit reference,
- rerun/review buttons (later phase).
5) Cost strategy and model routing
5.1 Model tier routing (default: cheap, escalate intentionally)
| Tier | Models | When used |
|---|---|---|
| Cheap (default) | gemini-1.5-flash, azure/gpt-4o-mini | First pass for all tasks |
| Strong | azure/gpt-4o, gemini-1.5-pro | Escalation triggers only |
Escalation triggers (automatic):
- Task prompt contains keywords: “refactor”, “architecture”, “redesign”, “complex”
- First attempt (cheap) fails tests or produces invalid output
- User explicitly requests:
/brainforge code --strong <repo> <prompt>
Cost visibility: Every Slack response includes: model used, estimated cost, token count.
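A sketch of the tier-selection logic. The escalation keywords and model names mirror the table and triggers above; the function and mapping names are illustrative, and the actual provider/model identifiers live in OpenCode provider config.

```typescript
type Tier = "cheap" | "strong";

// Keyword triggers from §5.1; extend as real usage data comes in.
const ESCALATION_KEYWORDS = ["refactor", "architecture", "redesign", "complex"];

// Placeholder mapping — actual IDs come from the OpenCode provider configuration.
export const MODELS: Record<Tier, string[]> = {
  cheap: ["gemini-1.5-flash", "azure/gpt-4o-mini"],
  strong: ["azure/gpt-4o", "gemini-1.5-pro"],
};

export function selectTier(opts: {
  prompt: string;
  userRequestedStrong: boolean; // --strong flag on the command
  previousCheapAttemptFailed: boolean; // tests failed or invalid output on first pass
}): Tier {
  if (opts.userRequestedStrong) return "strong";
  if (opts.previousCheapAttemptFailed) return "strong";
  const lower = opts.prompt.toLowerCase();
  if (ESCALATION_KEYWORDS.some((k) => lower.includes(k))) return "strong";
  return "cheap";
}

// Example: a plain edit defaults to cheap; a retry after failure escalates.
// selectTier({ prompt: "add input validation", userRequestedStrong: false, previousCheapAttemptFailed: false }) === "cheap"
```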
5.2 Guardrails
- Runtime timeout: 15 min max per job (aligns with Railway HTTP edge limit; jobs are not HTTP requests but we keep the same ceiling for predictability)
- Token budget: 100K tokens per run (cheap) / 500K tokens (strong escalation)
- Retries: Max 1 retry on failure (cheap→strong, or strong→abort)
- Concurrency: 1 run per user, 3 runs per channel (queuing beyond that)
- Daily caps: 10 runs/user/day initially (configurable via env)
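A sketch of the concurrency and daily-cap checks run before enqueueing, assuming the hypothetical opencode_jobs table from the worker sketch and env-configurable limits (env var names are placeholders). Whether over-cap jobs are rejected or simply held in the queue is a Phase A decision; this only surfaces the reason.

```typescript
import { createClient } from "@supabase/supabase-js";

const supabase = createClient(process.env.SUPABASE_URL!, process.env.SUPABASE_SERVICE_KEY!);

// Configurable caps (defaults mirror §5.2).
const MAX_ACTIVE_PER_USER = Number(process.env.OPENCODE_MAX_ACTIVE_PER_USER ?? 1);
const MAX_ACTIVE_PER_CHANNEL = Number(process.env.OPENCODE_MAX_ACTIVE_PER_CHANNEL ?? 3);
const MAX_RUNS_PER_USER_PER_DAY = Number(process.env.OPENCODE_DAILY_CAP ?? 10);

// Returns a rejection/hold reason, or null if the job may be enqueued immediately.
export async function checkCaps(userId: string, channelId: string): Promise<string | null> {
  const active = ["pending", "running"];
  const startOfDay = new Date();
  startOfDay.setUTCHours(0, 0, 0, 0);

  const { count: userActive } = await supabase
    .from("opencode_jobs")
    .select("id", { count: "exact", head: true })
    .eq("user_id", userId)
    .in("status", active);
  if ((userActive ?? 0) >= MAX_ACTIVE_PER_USER) return "You already have a run in progress.";

  const { count: channelActive } = await supabase
    .from("opencode_jobs")
    .select("id", { count: "exact", head: true })
    .eq("channel_id", channelId)
    .in("status", active);
  if ((channelActive ?? 0) >= MAX_ACTIVE_PER_CHANNEL) return "Channel run limit reached; the job will be queued.";

  const { count: todays } = await supabase
    .from("opencode_jobs")
    .select("id", { count: "exact", head: true })
    .eq("user_id", userId)
    .gte("created_at", startOfDay.toISOString());
  if ((todays ?? 0) >= MAX_RUNS_PER_USER_PER_DAY) return "Daily run cap reached.";

  return null;
}
```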
6) MVP scope (2 weeks)
Phase A — Basic functionality trial (week 1)
Goal: Prove the loop works end-to-end for one repo, one task type, with manual review.
- Slack command to submit code task.
- Background job executes OpenCode task in one repo/workspace (worker path; no OpenWork platform required).
- Post completion/failure result back to Slack.
- Manual review remains required before merge.
Success criteria:
- At least 30 real internal runs.
- ≥70% of basic asks complete without manual code changes (tests pass; human only reviews and clicks merge).
- Clear per-run cost and latency metrics.
- Security review of worker env/secrets complete.
Phase boundary → Phase B: Proceed if success criteria met; otherwise iterate on blockers before expanding scope.
Phase B — Reliability and parity hardening (week 2)
Goal: Add templates, model routing, safety controls, and broader task types.
- Add test-command templates by task type (edit, test, summary, refactor).
- Add structured run metadata and searchable logs.
- Model escalation fallback: Implement cheap→strong routing (see §5.1).
- Add queue prioritization (user/channel caps).
- Add safety controls: sensitive file warnings, branch protection checks.
Phase C — Scale and polish (weeks 3–4, contingent on B success)
Goal: Reduce friction, expand adoption, integrate with hosted OpenWork if valuable.
- Expand task types (multi-file support, complex refactors).
- Add auto-PR creation option (with required reviewers).
- Reduce manual review requirements for low-risk task types.
- Optional: integrate worker with hosted OpenWork Labs when ready (session/proxy features).
What we’re NOT building in MVP (scope fence):
- Web UI for job status (Slack-only for now)
- Automatic PR creation without human review (Phase C only)
- Multi-file complex refactors (start with single-file edits; expand in Phase C)
- Integration with hosted OpenWork Labs (Phase C option)
- Self-healing retries beyond 1 retry
- Real-time streaming of OpenCode output to Slack (post full result only)
7) Risks and mitigations
- OpenCode execution reliability in worker
- Risk: OpenCode run (CLI/headless) may fail or hang in the worker environment.
- Mitigation: gate MVP on a worker + OpenCode smoke pass (single run, logs, timeout). This is independent of hosted OpenWork. Any “hosted OpenWork proxy/session” validation is a separate track and does not block the Slack-triggered MVP.
- Security/secrets
- Risk: Slack-triggered automation increases blast radius.
- Mitigation: token-scoped service account, restricted filesystem roots, strict env-secret handling via 1Password → runtime env.
- Runaway costs from long jobs
- Mitigation: token/time caps, cheap-default routing, bounded retries. Auto-throttle if the error rate exceeds 30% over 10 minutes (pause the queue, post to the Slack alerts channel).
- User trust
- Mitigation: always include test evidence + changed files + model used in Slack response.
- OpenCode CLI dependency
- Risk: OpenCode releases breaking changes; worker breaks on deploy.
- Mitigation: OpenCode is open source — vendor discontinuation risk is minimal (community can fork/maintain). Still: pin OpenCode version in Dockerfile; test before upgrading; monitor release notes; fork if needed for stability.
- Worker isolation and repo access
- Worker isolation: Each job runs in a fresh temp directory with restricted permissions. Container-based isolation preferred; temp-dir isolation acceptable for Phase A.
- Network egress: Allow only: GitHub (repo clone), Azure/Gemini APIs, Slack webhooks. Block all other outbound. Enforcement mechanism: Use Railway’s VPC with NAT gateway + egress firewall rules, or run a sidecar proxy (e.g. tinyproxy with allowlist) that controls outbound from the worker container. Decision needed in Phase A implementation.
- Filesystem: No access to Platform DB (except jobs table), no access to other workers’ workspaces. Secrets via 1Password Environment → Railway (existing pattern: create an “OpenCode Worker - Railway” 1Password Environment, sync to Railway service env vars).
- Secrets rotation: GitHub App tokens: use short-lived installation tokens (1-hour TTL, refreshed as needed). Model API keys: rotate via 1Password on standard cadence (30-90 days). Audit all token usage via GitHub App logs and Azure/Gemini API dashboards.
- Reference: Use 1Password Environments (docs) to organize and manage the worker’s env vars separately from other projects. Share the Environment with the team for easier onboarding.
- Cleanup: Delete temp workspace after job completes (success or failure). Max workspace lifetime = job timeout + 2 min cleanup buffer.
- Repo authentication: GitHub App token scoped to specific repos only. Shallow clone (--depth 1) of the target branch. Worker pushes to a new branch named opencode-{user}-{timestamp}; never force-push to existing branches.
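A sketch of the clone-and-push flow under the constraints above, assuming a GitHub App installation token obtained via @octokit/auth-app and that git committer identity is configured in the worker image. The function names, env var names, and org prefix are placeholders.

```typescript
import { createAppAuth } from "@octokit/auth-app";
import { execFile } from "node:child_process";
import { promisify } from "node:util";

const exec = promisify(execFile);

// Short-lived installation token (≈1 h TTL), scoped to the repos the App is installed on.
async function getInstallationToken(): Promise<string> {
  const auth = createAppAuth({
    appId: process.env.GITHUB_APP_ID!,
    privateKey: process.env.GITHUB_APP_PRIVATE_KEY!,
    installationId: Number(process.env.GITHUB_APP_INSTALLATION_ID!),
  });
  const { token } = await auth({ type: "installation" });
  return token;
}

// Shallow-clones the target branch into the per-job workspace and pushes results
// to a fresh opencode-{user}-{timestamp} branch; never touches existing branches.
export async function cloneAndPush(opts: {
  repo: string;        // e.g. "your-org/brainforge-platform" — placeholder org name
  baseBranch: string;
  workspace: string;   // isolated temp dir created for this job
  user: string;
}): Promise<string> {
  const token = await getInstallationToken();
  const remote = `https://x-access-token:${token}@github.com/${opts.repo}.git`;
  const branch = `opencode-${opts.user}-${Date.now()}`;

  await exec("git", ["clone", "--depth", "1", "--branch", opts.baseBranch, remote, opts.workspace]);
  await exec("git", ["-C", opts.workspace, "checkout", "-b", branch]);
  // ... OpenCode edits files in the workspace here, then:
  await exec("git", ["-C", opts.workspace, "add", "-A"]);
  await exec("git", ["-C", opts.workspace, "commit", "-m", "OpenCode automated change"]);
  await exec("git", ["-C", opts.workspace, "push", "origin", branch]); // plain push, no --force
  return branch;
}
```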
8) Testing and observability
8.1 Testing strategy (from day one)
| Level | What | When |
|---|---|---|
| Unit | Worker job state machine (pending → running → completed/failed) | Every commit |
| Integration | One canned task runs via queue → worker → Slack | Every deploy |
| E2E / smoke | Manual: /brainforge code test "add a comment to README" in test Slack channel | Before merge to main, after deploy |
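For the unit level, a sketch of the state machine the tests would exercise; the transition map and assertions are illustrative and framework-agnostic.

```typescript
// Allowed transitions for the job state machine covered by unit tests.
type JobStatus = "pending" | "running" | "completed" | "failed";

const TRANSITIONS: Record<JobStatus, JobStatus[]> = {
  pending: ["running"],
  running: ["completed", "failed"],
  completed: [],
  failed: [],
};

export function canTransition(from: JobStatus, to: JobStatus): boolean {
  return TRANSITIONS[from].includes(to);
}

// Example assertions (swap for the repo's test runner):
console.assert(canTransition("pending", "running"));
console.assert(!canTransition("completed", "running")); // terminal states never restart
```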
8.2 Observability (metrics to capture)
From day one, log and optionally alert on:
- Queue depth: How many jobs waiting
- Job duration: queued → completed (p50, p95, p99)
- Success/failure rate: By task type and model tier
- Cost per run: Based on model + token count (Azure/Gemini pricing)
- Worker health: Crash/restart count
- Cost baseline vs actual: Track current Cursor Cloud spend (baseline) and OpenCode worker spend weekly
Alerts: Slack webhook to the alerts channel if queue depth >10, the worker crashes 3x in 10 min, or the error rate exceeds 30%.
Cost baseline requirement: Before Phase A starts, document current monthly Cursor Cloud spend (target: 30-50% reduction by end of Phase C).
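A sketch of the alert thresholds as code, assuming the metrics are already computed from the jobs table and that an incoming-webhook URL for the alerts channel is available via env (the env var name is a placeholder).

```typescript
// Thresholds from §8.2; adjust from real Phase A data.
const MAX_QUEUE_DEPTH = 10;
const MAX_ERROR_RATE = 0.3; // 30% over the measurement window
const MAX_WORKER_CRASHES = 3; // within 10 minutes

interface MetricsSnapshot {
  queueDepth: number;
  errorRate: number; // failed / total over the last 10 minutes
  workerCrashesLast10Min: number;
}

export async function maybeAlert(metrics: MetricsSnapshot): Promise<void> {
  const problems: string[] = [];
  if (metrics.queueDepth > MAX_QUEUE_DEPTH) problems.push(`queue depth ${metrics.queueDepth}`);
  if (metrics.errorRate > MAX_ERROR_RATE) problems.push(`error rate ${(metrics.errorRate * 100).toFixed(0)}%`);
  if (metrics.workerCrashesLast10Min >= MAX_WORKER_CRASHES) {
    problems.push(`${metrics.workerCrashesLast10Min} worker crashes in 10 min`);
  }
  if (problems.length === 0) return;

  // Plain incoming-webhook post to the alerts channel.
  await fetch(process.env.SLACK_ALERTS_WEBHOOK_URL!, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ text: `OpenCode worker alert: ${problems.join(", ")}` }),
  });
}
```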
8.3 Audit logging
- Audit table fields: user, repo, branch, task type, model used, cost, status, changed files, timestamp
- Retention: 90 days for audit logs; 7 days for full job logs (or store in cheap object storage)
- Sensitive file blocking: Changes to .env, secrets/, *.key, or credentials* files require explicit human approval before the worker pushes the branch. Block auto-push; the Slack message includes: “Changes detected to sensitive files. Awaiting approval: [Approve] [Reject].”
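A sketch of the sensitive-file gate, assuming the list of changed file paths is available (e.g. from git diff --name-only); the patterns mirror the list above and should stay in sync with the approval policy.

```typescript
// Patterns from §8.3.
const SENSITIVE_PATTERNS: RegExp[] = [
  /(^|\/)\.env(\..*)?$/,      // .env, .env.local, nested .env files
  /(^|\/)secrets\//,          // anything under a secrets/ directory
  /\.key$/,                   // private key material
  /(^|\/)credentials[^/]*$/,  // credentials, credentials.json, etc.
];

export function findSensitiveChanges(changedFiles: string[]): string[] {
  return changedFiles.filter((file) => SENSITIVE_PATTERNS.some((p) => p.test(file)));
}

// Worker policy: if this returns a non-empty list, skip the push and post the
// “Awaiting approval: [Approve] [Reject]” message to the Slack thread instead.
```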
9) Concrete next actions
- Implement the Slack → queue → worker thin slice in apps/slack-apps/brainforge-assistant (no dependency on the OpenWork platform).
- Validate OpenCode execution in the worker (smoke: one task, logs, timeout); use the CLI or minimal runner path.
- Instrument cost + latency telemetry from day one.
- Run internal pilot for one team before wider rollout.
Optional, when OpenWork platform is ready: validate runtime gate in hosted OpenWork (session create, async prompt, message/event retrieval) and document integration path for worker → Labs if desired.
10) Hosting: all on Railway
Yes — the full MVP can run entirely on Railway. No need for a separate cloud or VM.
| Piece | How it runs on Railway |
|---|---|
| Slack Assistant | Already Railway-ready (RAILWAY_GIT_BRANCH, RAILWAY_ENVIRONMENT in apps/slack-apps/brainforge-assistant). Deploy as a Railway service; receives slash commands and enqueues jobs. |
| Platform | Already on Railway (Next.js). No change for this MVP. |
| Queue | No new service. Use a jobs table in existing Supabase (or Postgres). At 50–100 requests/day, polling or LISTEN/NOTIFY is sufficient. Worker and Slack app both use the same DB. Add Railway Redis only if volume or latency demands it later. |
| OpenCode worker | New Railway service: Dockerfile (or Nixpacks) that installs Node + OpenCode CLI (or minimal runner), clones repo into a short-lived workspace, runs the OpenCode job, captures logs/artifacts, posts result to Slack. Use a single worker process to start; scale later if needed. |
Notes:
- The worker is not the full OpenWork stack. It does not run openwork serve or the OpenWork orchestrator. It only needs a process that can execute OpenCode in non-interactive/headless mode (CLI or SDK) inside an isolated run directory. That keeps the worker image and runtime simple and avoids the OpenCode proxy/session issues documented in openwork-railway-sidecar-feasibility-2026-03-07.md.
- Secrets: Use 1Password Environment → Railway (existing workflow). Create an “OpenCode Worker - Railway” 1Password Environment to organize the worker’s env vars. Use op run --environment or sync to Railway. See the 1Password Environments docs.
- One Railway project can contain: the Platform service, the Slack Assistant service, the Worker service, and the existing Supabase for the jobs table. No Redis or other new data store is required for MVP at 50–100 req/day.
Will Railway handle this load?
Yes, with the architecture above (queue + background worker, not long HTTP requests).
| Concern | Railway behavior | Our design |
|---|---|---|
| Long-running jobs (15–20 min) | Railway’s edge proxy limits HTTP requests to 15 minutes. A single request cannot run longer. | Job execution happens inside the worker process, not as an HTTP request. Slack command → enqueue → return 200 immediately. Worker polls queue and runs OpenCode in-process/subprocess; that work is not an HTTP request, so the 15 min limit does not apply. Cap each job at ~15 min in app logic to stay under if we ever expose a “run and wait” API. |
| Worker process | Containers run until they exit or are redeployed. No platform-level max runtime for a process. | Worker is a persistent service (loop: poll queue → run job → post to Slack). No “request = run job” pattern. |
| Queue / DB | MVP uses a jobs table in Supabase (no new queue service). Worker and Slack app share the same DB. | At 50–100 req/day, polling or LISTEN/NOTIFY is fine. Add Redis only if you outgrow this. |
| Concurrency | You control replica count and resource limits per service. | Start with one worker; add replicas and per-user/concurrency caps in app logic if needed. |
| CPU/memory | Configurable per service (Settings → Resource Limits). | Size the worker for OpenCode (enough RAM for context, CPU for tool runs). Platform and Slack app are unchanged. |
Bottom line: Railway can run this load. Keep “enqueue fast, work in background” and avoid tying job duration to any HTTP request.
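A sketch of the “enqueue fast, work in background” pattern on the Slack side, assuming @slack/bolt (adjust to whatever handler style the existing assistant uses) and the hypothetical opencode_jobs table from the earlier sketches.

```typescript
import { App } from "@slack/bolt";
import { createClient } from "@supabase/supabase-js";

const app = new App({
  token: process.env.SLACK_BOT_TOKEN,
  signingSecret: process.env.SLACK_SIGNING_SECRET,
});
const supabase = createClient(process.env.SUPABASE_URL!, process.env.SUPABASE_SERVICE_KEY!);

app.command("/brainforge", async ({ command, ack, respond }) => {
  // Acknowledge within Slack's 3-second window; never run the job in this request.
  await ack();

  // parseCodeCommand is the hypothetical parser sketched in §4; parsing can also
  // happen later in the worker — only the raw text is stored here.
  const { error } = await supabase.from("opencode_jobs").insert({
    user_id: command.user_id,
    channel_id: command.channel_id,
    raw_text: command.text,
    status: "pending",
  });

  await respond(
    error
      ? `Could not queue the job: ${error.message}`
      : "Queued. Status updates will be posted in this channel as the run progresses."
  );
});

(async () => {
  await app.start(Number(process.env.PORT ?? 3000));
})();
```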
11) External references reviewed
- OpenCode docs (models/providers/sdk/config): https://opencode.ai/docs
- Goose (Block): https://github.com/block/goose
- Stripe Minions (parts 1/2): https://stripe.dev/blog/minions-stripes-one-shot-end-to-end-coding-agents and related follow-up
- Ramp background agent writeups:
- Background-agents reference:
12) Repo references
- knowledge/engineering/openwork-platform-integration/openwork-platform-integration-plan.md
- knowledge/engineering/openwork-platform-integration/openwork-railway-sidecar-feasibility-2026-03-07.md
- standards/03-knowledge/engineering/setup/openwork-hosted-runtime-contract.md
- standards/03-knowledge/engineering/setup/openwork-hosted-ops-runbook.md
- apps/slack-apps/brainforge-assistant/src/index.js
- apps/platform/src/lib/openwork/entryRoute.ts