Alternative Architecture: CLI-Native Agent (Shell Exec)

Date: 2026-03-25
Author: Sam (Brainforge)
Status: Draft, comparison against custom-tools approach
Related: spike-command-center-data-access.md, linear-tickets-data-access-chat-integration.md

The question

The current plan (custom-tools approach) builds 9 individual Mastra tool functions that each wrap specific GWS CLI commands and Slack API calls. That works, but the GWS CLI was already designed for agent use — structured JSON output, auto-pagination, schema introspection, 40+ built-in skills. Instead of writing search_drive(), get_drive_activity(), search_gmail(), etc. by hand, we could give the agent a generic shell-exec tool that runs gws commands directly and skip the wrapper layer entirely.

This document maps out what that architecture looks like, what changes, and what the effort comparison is.

MCP server mode — removed

GWS CLI had an MCP server mode (gws mcp -s drive,gmail,calendar) through v0.7.0. It was removed in v0.8.0 via PR #275 (merged March 6, 2026). The removal was a breaking change with no replacement — the maintainers cited context window bloat, tool-name parsing bugs, state management issues, and security concerns. The current version (0.22.1) does not have gws mcp.

Third-party projects that depended on gws mcp have migrated to a subprocess-per-call bridge pattern — spawning a short-lived gws CLI process for each tool call instead. That’s the pattern Architecture B uses.

The Slack MCP server (mcp.slack.com) is still live and maintained by Slack.

Architecture A: Custom Tools (current plan)

┌─────────────────────────────────────────────────────┐
│  Next.js + Mastra (Cloud Run, Eden GCP)             │
│                                                      │
│  ┌──────────────┐    ┌───────────────────────────┐  │
│  │  Chat UI     │───▶│  Mastra Agent              │  │
│  │  (React)     │    │  (TypeScript)              │  │
│  └──────────────┘    │                            │  │
│                      │  Tools (hand-built):       │  │
│                      │  - search_drive()          │  │
│                      │  - get_drive_activity()    │  │
│                      │  - get_file_comments()     │  │
│                      │  - search_gmail()          │  │
│                      │  - search_calendar()       │  │
│                      │  - get_user_directory()    │  │
│                      │  - search_slack()          │  │
│                      │  - read_slack_thread()     │  │
│                      │  - get_slack_channel_stats()│  │
│                      └───────────┬────────────────┘  │
│                                  │                    │
│                      ┌───────────▼────────────────┐  │
│                      │  PII Redaction Middleware   │  │
│                      └───────────┬────────────────┘  │
│                                  │                    │
│              ┌───────────────────┼──────────────┐    │
│              ▼                   ▼              ▼    │
│         GWS CLI            Slack API     Vertex AI  │
│        (shell exec)         (HTTP)      Gemini API  │
└─────────────────────────────────────────────────────┘

Each tool function is ~50–150 lines of TypeScript that:

Constructs the right gws CLI command or Slack API call
Parses the JSON response
Runs it through PII redaction
Returns structured data to the agent

Observation: Both the custom tool functions and the raw gws CLI commands do the same thing — call a Google API and return JSON. The custom tools add a translation layer that’s mostly mechanical: build command string → exec → parse JSON → redact → return. The GWS CLI already handles the API call, auth, pagination, and JSON formatting.

Architecture B: CLI-Native Agent (shell exec)

┌──────────────────────────────────────────────────────┐
│  Next.js (Cloud Run, Eden GCP)                       │
│                                                       │
│  ┌──────────────┐    ┌────────────────────────────┐  │
│  │  Chat UI     │───▶│  Mastra Agent               │  │
│  │  (React)     │    │  (TypeScript)               │  │
│  └──────────────┘    │                             │  │
│                      │  Tools:                     │  │
│                      │  - run_gws(command)          │  │
│                      │  - gws_schema(method)        │  │
│                      │  - slack_mcp (MCP client)    │  │
│                      │  - pii_redact()              │  │
│                      │  - project_registry()        │  │
│                      └──────┬──────────┬───────────┘  │
│                             │          │              │
│                    ┌────────▼──┐  ┌────▼───────────┐  │
│                    │ GWS CLI   │  │ Slack MCP      │  │
│                    │ (child    │  │ Server         │  │
│                    │ process)  │  │ (mcp.slack.com)│  │
│                    └────┬──┬──┘  └────────┬───────┘  │
│                         │  │              │           │
│              Google Workspace APIs    Slack APIs      │
│                                                       │
│  PII redaction runs on ALL tool outputs before they   │
│  enter the LLM context window (Mastra processor)      │
│                                                       │
│  LLM: Vertex AI Gemini API (BAA-covered)              │
└──────────────────────────────────────────────────────┘

The agent has two generic tools for Google Workspace instead of 6+ specific ones:

`run_gws(command, params?)`

A single Mastra tool that:

Receives a gws CLI command string from the agent (e.g. drive files list, gmail users messages list)
Spawns a short-lived child process: gws <command> --params '<json>' --format json
Captures stdout (structured JSON)
Runs the output through redact() (PII anonymization)
Returns the cleaned JSON to the agent

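import { execFile } from "node:child_process";
import { promisify } from "node:util";
import { createTool } from "@mastra/core/tools";
import { z } from "zod";

// execFile is callback-based; promisify it so the result can be awaited.
const execFileAsync = promisify(execFile);

const runGws = createTool({
  id: "run_gws",
  description: "Run a Google Workspace CLI command. Returns JSON.",
  inputSchema: z.object({
    command: z.string().describe("gws CLI command (e.g. 'drive files list')"),
    params: z.record(z.any()).optional().describe("API parameters as key-value pairs"),
  }),
  execute: async ({ command, params }) => {
    // Split on any run of whitespace so stray spaces do not become empty arguments.
    const args = command.trim().split(/\s+/);
    if (params) args.push("--params", JSON.stringify(params));
    args.push("--format", "json");

    // Arguments go straight to gws without a shell; kill the child after 10 seconds.
    const result = await execFileAsync("gws", args, {
      timeout: 10_000,
      env: {
        ...process.env,
        GOOGLE_WORKSPACE_CLI_CREDENTIALS_FILE: "/secrets/sa-key.json",
        GOOGLE_WORKSPACE_CLI_IMPERSONATED_USER: "admin@eden.com",
      },
    });

    // redact() is the PII redaction function; it runs before anything reaches the agent.
    const raw = JSON.parse(result.stdout);
    return redact(raw);
  },
});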
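
The five steps above (build arguments, spawn, capture stdout, redact, return) can also be factored into a plain function that takes the process runner and the redactor as parameters, which makes the pipeline testable without a real gws binary. The following is a minimal sketch under that assumption; ExecFn, Redactor, and runGwsCore are hypothetical names, not part of the existing codebase or of Mastra.

```typescript
// Hypothetical names throughout (ExecFn, Redactor, runGwsCore); illustrative only.
type ExecFn = (file: string, args: string[]) => Promise<{ stdout: string }>;
type Redactor = (value: unknown) => unknown;

// Build the argument list, run the CLI, parse its JSON, and redact before returning.
async function runGwsCore(
  command: string,
  params: Record<string, unknown> | undefined,
  exec: ExecFn,
  redact: Redactor,
): Promise<unknown> {
  const args = command.trim().split(/\s+/);
  if (params) args.push("--params", JSON.stringify(params));
  args.push("--format", "json");

  const { stdout } = await exec("gws", args);
  return redact(JSON.parse(stdout));
}
```

In production the exec argument would be a promisified execFile; in tests it can be a stub that returns a fixture string, so no subprocess is spawned.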

`gws_schema(method)`

Lets the agent discover the shape of any API method at runtime:

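import { execFile } from "node:child_process";
import { promisify } from "node:util";

// execFile is callback-based; promisify it so the result can be awaited.
const execFileAsync = promisify(execFile);

const gwsSchema = createTool({
  id: "gws_schema",
  description: "Get the request/response schema for a gws API method.",
  inputSchema: z.object({
    method: z.string().describe("API method (e.g. 'drive.files.list')"),
  }),
  execute: async ({ method }) => {
    // Schema output describes the API, not user data, so it is returned as-is.
    const result = await execFileAsync("gws", ["schema", method], { timeout: 10_000 });
    return result.stdout;
  },
});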

The agent can call gws_schema("gmail.users.messages.list") to learn what parameters are available, then construct the right run_gws call. This is self-discovery — no hardcoded tool schemas needed.

Slack: MCP client

The Slack MCP server (mcp.slack.com) is still live. Mastra connects to it as an MCP client. The agent gets search, history, thread reading, and user profile tools from Slack’s server directly.

If the Slack MCP server proves unreliable, we fall back to 3 custom Slack tool functions (same as Architecture A).

What stays the same

Component	Notes
Cloud Run in Eden’s GCP	Same. BAA-covered.
Vertex AI Gemini API	Same. BAA-covered LLM endpoint.
Next.js 15 + shadcn/ui	Same chat UI and dashboard.
PII redaction layer	Still required. Identity mapping, `redact()`, test suite.
Identity mapping table	Same schema, same GCP Secret Manager storage.
Service account + DWD	Same. GWS CLI inherits this auth via env vars.
Slack app + OAuth token	Same. Slack MCP server needs the token.
Google OAuth for Danny	Same. COO authenticates to the web app.
Project registry	Same. Maps project names to channels/folders.

What changes

Component	Custom Tools (A)	CLI-Native (B)
GWS data access	6 hand-built TypeScript tool functions, each wrapping a specific `gws` command	1 generic `run_gws` tool + 1 `gws_schema` tool; agent constructs commands dynamically
Slack data access	3 hand-built TypeScript tool functions wrapping Slack API calls	Slack MCP server (hosted by Slack) provides tools directly; fallback to custom tools if needed
Tool count we build	9 data access tools + PII + registry = 11 custom tools	2 GWS tools + PII + registry = 4 custom tools (+ Slack MCP or 3 custom Slack tools)
Agent’s GWS knowledge	Hardcoded in each tool’s description and input schema	Agent uses `gws_schema` to self-discover; system prompt provides a command reference
PII enforcement	In each tool function (pre-return)	In `run_gws` tool (post-exec, pre-return) + Mastra processor on Slack MCP outputs
Child processes	One `gws` exec per tool call (same as now, just hidden inside each tool)	One `gws` exec per `run_gws` call (identical — the child process pattern is the same)
New GWS API coverage	New tool function per API = new ticket	Agent already has access via `run_gws` — just needs the command in its system prompt
Cloud Run config	Single process	Single process (no sidecar — `gws` is spawned per-call, not persistent)

PII redaction in Architecture B

PII redaction is simpler than the (now-removed) MCP approach because run_gws is a tool we control:

GWS: run_gws calls redact() on the CLI output before returning it to the agent. Same enforcement point as Architecture A — tool-level, pre-return. The difference is one redact() call in one tool vs. the same call copy-pasted into 6 tools.
Slack MCP: Mastra processor intercepts all MCP tool outputs and runs redact() before they enter the LLM context. Or, if we use custom Slack tools instead of MCP, same per-tool enforcement as Architecture A.
The redact() function itself is identical in both architectures. It needs to handle arbitrary JSON shapes either way — the GWS CLI returns different schemas for Drive vs Gmail vs Calendar regardless of whether we wrap them in custom tools or pass them through run_gws.
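
To illustrate what "handle arbitrary JSON shapes" can mean in practice, here is a minimal sketch of a recursive walker that redacts every string leaf regardless of nesting. It is not the project's redact() implementation: the type names, the email-only pattern, the "[REDACTED_EMAIL]" placeholder, and the shape of the identity map (lowercased email to pseudonymous token) are all assumptions made for the example.

```typescript
// Hypothetical sketch: Json, IdentityMap, and redactJson are illustrative names.
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

// Assumed shape: lowercased email address -> stable pseudonymous token.
type IdentityMap = Map<string, string>;

const EMAIL_PATTERN = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;

// Replace each email with its mapped token, or a generic placeholder if unmapped.
function redactString(text: string, identities: IdentityMap): string {
  return text.replace(
    EMAIL_PATTERN,
    (email) => identities.get(email.toLowerCase()) ?? "[REDACTED_EMAIL]",
  );
}

// Walk any JSON value and redact string leaves; object keys are left untouched.
function redactJson(value: Json, identities: IdentityMap): Json {
  if (typeof value === "string") return redactString(value, identities);
  if (Array.isArray(value)) return value.map((item) => redactJson(item, identities));
  if (value !== null && typeof value === "object") {
    const out: { [key: string]: Json } = {};
    for (const [key, child] of Object.entries(value)) {
      out[key] = redactJson(child, identities);
    }
    return out;
  }
  return value; // numbers, booleans, and null pass through unchanged
}
```

Because the walker never looks at field names, the same function applies to Drive, Gmail, and Calendar responses alike. Names and phone numbers would need additional patterns or lookups layered into redactString.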

Effort comparison

Architecture A: Custom Tools (current tickets)

Step	Tickets	Points
1. Source auth	1, 2, 3	7
2. Identity + PII	4a, 4b, 5a, 5b	14
3. Slack tools	6, 7, 8a, 8b	14
4. Agent + UI	9a, 9b	8
5. GWS tools	10-15, 16	23
6. Orchestration + deploy	17, 18	9
Total	18 tickets	75 pts

Architecture B: CLI-Native (shell exec)

Step	Work	Points
1. Source auth (same)	GCP project, service account, DWD request, Slack app	7
2. Identity + PII (same)	Mapping schema, `resolve_identity`, `redact()`, test suite	14
3. `run_gws` + `gws_schema` tools	Build the 2 generic tools, test against Drive/Gmail/Calendar/Admin, validate PII redaction on each response shape	5
4. Slack data access	Either: connect Slack MCP (3 pts) or build 3 custom tools (14 pts)	3–14
5. Agent + chat UI	Mastra agent with `run_gws` + Slack tools + project registry, system prompt with GWS command reference, Next.js chat interface, Cloud Run deploy	10
6. Cross-platform orchestration	System prompt for multi-source reasoning, project registry, parallel query execution	5
7. Validation + deploy	End-to-end testing across all GWS + Slack surfaces, anonymization audit, production deploy	4
Total (with Slack MCP)	~12 tickets	~48 pts
Total (with custom Slack)	~15 tickets	~59 pts

Delta

	Arch A (custom)	Arch B (Slack MCP)	Arch B (custom Slack)
Tickets	18	~12	~15
Points	75	~48	~59
Savings	—	~27 pts (36%)	~16 pts (21%)
Custom GWS tool code	~600-900 lines (6 tools)	~80 lines (2 generic tools)	~80 lines
Custom Slack tool code	~300-450 lines (3 tools)	~0 (MCP)	~300-450 lines

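The savings come mainly from replacing the 6 specific GWS tool functions (Tickets 10-16, 23 pts) with 2 generic tools (5 pts), which saves 18 pts. Using the Slack MCP server instead of custom Slack tools saves a further 11 pts (14 vs 3). Against that, agent, orchestration, validation, and deployment work rises by about 2 pts (19 vs 17), because the system prompt needs a GWS command reference instead of relying on typed tool schemas. The auth and PII work is the same in both architectures.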
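
As a check on the figures in the Delta table, the totals and savings can be recomputed from the per-step points listed in the two effort tables. The object keys below are shorthand labels chosen for this sketch, not ticket names.

```typescript
// Per-step points copied from the two effort tables; keys are shorthand labels.
const archA: Record<string, number> = {
  auth: 7, pii: 14, slackTools: 14, agentUi: 8, gwsTools: 23, orchestrationDeploy: 9,
};
// Architecture B steps that do not depend on the Slack choice.
const archBCommon: Record<string, number> = {
  auth: 7, pii: 14, genericGwsTools: 5, agentUi: 10, orchestration: 5, validationDeploy: 4,
};
const slackMcpPoints = 3;
const slackCustomPoints = 14;

const sum = (steps: Record<string, number>): number =>
  Object.values(steps).reduce((total, pts) => total + pts, 0);

const totalA = sum(archA);                                  // 75
const totalBMcp = sum(archBCommon) + slackMcpPoints;        // 48
const totalBCustom = sum(archBCommon) + slackCustomPoints;  // 59

const savingsMcp = totalA - totalBMcp;                      // 27
const savingsCustom = totalA - totalBCustom;                // 16
const pctMcp = Math.round((savingsMcp / totalA) * 100);     // 36
const pctCustom = Math.round((savingsCustom / totalA) * 100); // 21
```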

What we gain

~16-27 points less work. No hand-built tool functions for Drive, Gmail, Calendar, Admin SDK, etc.
Near-instant coverage of the full GWS API surface. The agent can run any gws command the whitelist permits (Drive Activity, Comments, Sheets, Docs) without new tool functions. If the COO asks about Google Tasks tomorrow, enabling it means adding the service to the whitelist and to the system prompt's command reference. With custom tools, each new API surface is a new ticket.
The CLI evolves; we don’t maintain wrappers. When gws adds new features, the agent can use them immediately. Our custom tools would need manual updates per API change.
Self-discovery via gws_schema. The agent can inspect any API method’s schema at runtime and construct the right command. No hardcoded input schemas to keep in sync.
Simpler codebase. Two tool functions instead of nine. Less code to test, review, and maintain.
No sidecar process. Unlike the (removed) MCP server mode, run_gws spawns a child process per call — same pattern as Architecture A already uses inside each custom tool. No persistent sidecar to manage.

What we lose / risk

Less structured tool interface. Custom tools have typed input schemas (search_drive(query, folder_id?, owner_token?)) that guide the LLM. run_gws accepts a freeform command string — the agent must know the right gws syntax. Mitigation: system prompt includes a command reference with examples for each API.
Prompt engineering replaces code. Instead of encoding knowledge in typed tool functions, we encode it in the agent’s system prompt. This is less testable and more brittle — a prompt change could break tool routing. Mitigation: comprehensive integration tests that validate the agent calls the right gws commands for each query type.
PII redaction must handle arbitrary shapes. With custom tools, we know exactly which fields to redact in each response. With run_gws, the redaction function sees whatever the CLI returns. Mitigation: redact() already needs to handle nested JSON generically — the same email/name/phone regex patterns work regardless of response shape.
Token budget risk. The agent may request more data than needed (e.g. drive files list without --params '{"fields":"files(id,name,modifiedTime)"}'). Custom tools request only the fields they need. Mitigation: system prompt instructs the agent to use fields parameters; run_gws could enforce a default fields mask.
Command injection surface. The agent constructs shell commands. If the LLM hallucinates a malicious or destructive command, run_gws would execute it. Mitigation: whitelist allowed gws services (drive, gmail, calendar, admin, driveactivity:v2, sheets, docs) and reject anything else. Never pass raw shell strings; use execFile (not exec) so shell metacharacters are never interpreted.
Harder to unit test. Custom tools are pure functions: input → output. run_gws requires mocking the CLI subprocess. Mitigation: mock execFile in tests; test the redact() layer independently with fixtures.
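
The "default fields mask" mitigation for the token budget risk could look roughly like the sketch below. The mask table, the GwsParams type, and the withDefaultFields helper are hypothetical; only the drive files list mask comes from the example above, and any other entries would have to be chosen per command.

```typescript
// Hypothetical default masks keyed by normalized command; extend per command as needed.
const DEFAULT_FIELDS: Record<string, string> = {
  "drive files list": "files(id,name,modifiedTime)",
};

type GwsParams = Record<string, unknown>;

// Add a default `fields` mask when the agent did not request specific fields.
function withDefaultFields(command: string, params: GwsParams = {}): GwsParams {
  const key = command.trim().split(/\s+/).join(" ");
  const mask = DEFAULT_FIELDS[key];
  // Leave params alone if no default is known or the agent already chose a mask.
  if (mask === undefined || "fields" in params) return params;
  return { ...params, fields: mask };
}
```

run_gws would call this before serializing params, so an explicit fields value from the agent always wins and commands without a configured mask pass through unchanged.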

Security: command whitelist

run_gws must NOT be a general shell-exec tool. It should enforce:

const ALLOWED_SERVICES = [
  "drive", "gmail", "calendar", "admin",
  "driveactivity:v2", "sheets", "docs",
];
 
const command = input.command.split(" ");
const service = command[0];
if (!ALLOWED_SERVICES.includes(service)) {
  throw new Error(`Service '${service}' not allowed`);
}

Additionally:

Use execFile (not exec) — prevents shell metacharacter injection
Read-only operations only — no delete, update, send, insert subcommands unless explicitly allowed
Timeout on child process (10 seconds) to prevent hangs
Log every command for audit trail
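
The service whitelist and the read-only rule can be combined into one validation step that runs before any process is spawned. This is a sketch under assumptions: validateGwsCommand is a hypothetical helper, the blocked verb list simply mirrors the subcommands named above, and matching verbs as whole tokens is a simplification of how gws commands are actually structured.

```typescript
// Services mirror the whitelist above; verbs mirror the read-only rule.
const ALLOWED_SERVICES = new Set([
  "drive", "gmail", "calendar", "admin", "driveactivity:v2", "sheets", "docs",
]);
const BLOCKED_VERBS = new Set(["delete", "update", "send", "insert"]);

// Returns the argument list for execFile, or throws if the command is not permitted.
function validateGwsCommand(command: string): string[] {
  const tokens = command.trim().split(/\s+/).filter((token) => token.length > 0);
  const service = tokens[0] ?? "";
  if (service === "") throw new Error("Empty gws command");
  if (!ALLOWED_SERVICES.has(service)) {
    throw new Error(`Service '${service}' not allowed`);
  }
  const blocked = tokens.find((token) => BLOCKED_VERBS.has(token.toLowerCase()));
  if (blocked !== undefined) {
    throw new Error(`Subcommand '${blocked}' is not read-only`);
  }
  return tokens;
}
```

A denylist of verbs is the weaker option; an allowlist of read verbs (for example list and get) would fail closed when the CLI adds new mutating subcommands, at the cost of more upkeep.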

Deployment

Identical to Architecture A — single Cloud Run service, no sidecar:

FROM node:22-slim AS base
RUN npm install -g @googleworkspace/cli
 
# Service account key mounted from GCP Secret Manager at runtime
# GWS CLI reads GOOGLE_WORKSPACE_CLI_CREDENTIALS_FILE env var
 
# Single process: next start (port 8080)
# gws is spawned per-call by run_gws tool, not a persistent process

Cloud Run config (same as Architecture A):

Memory: 512MB–1GB
CPU: 1 vCPU
Min instances: 1 (avoid cold start)
Secrets: Service account key, Slack OAuth token, identity mapping — all from GCP Secret Manager

Decision matrix

Criterion	A: Custom Tools	B: CLI-Native (shell exec)
Total effort	75 pts	~48-59 pts
GWS API coverage	6 specific tools	Full GWS surface (any `gws` command)
Slack control	Full (custom tools)	Depends on MCP / can fall back to custom
PII enforcement	Per-tool (tight)	Per-tool in `run_gws` (same enforcement point)
Tool interface quality	Typed schemas, clear inputs	Freeform command string + system prompt
Maintenance burden	High (6+ GWS tools to keep in sync)	Low (2 generic tools + prompt updates)
Token efficiency	High (curated fields)	Medium (agent must learn to request minimal fields)
Testing	Unit tests per tool	Integration tests + redaction unit tests
Operational complexity	Simple (one process)	Simple (same — no sidecar)
Future extensibility	New ticket per API	Update system prompt
Security surface	Minimal (hardcoded commands)	Command whitelist required
Time to M3	~4 weeks	~3–3.5 weeks

Recommendation

Architecture B is worth considering but the savings are more modest than originally estimated (~16-27 pts vs the incorrect ~33 pts). The tradeoff is clear:

Pick A if you want typed tool interfaces, straightforward unit testing, and minimal prompt engineering risk. The extra ~16-27 pts is mostly mechanical work (build and test each GWS tool function). It’s tedious but safe.
Pick B if you want fewer tickets, instant full-API coverage, and less code to maintain long-term. The tradeoff is more reliance on prompt engineering and integration testing. The security whitelist and PII redaction on arbitrary shapes add some complexity, but both are solvable.
Hybrid (B for GWS, A for Slack) is the pragmatic middle ground. The 6 GWS custom tools are the most mechanical to build — they’re all the same pattern (construct gws command → exec → parse → redact). Replacing those with run_gws saves the most effort with the least risk. Keeping custom Slack tools preserves rate-limit control and caching.

Next step: If we go with B or hybrid, update the ticket file to reflect the new architecture and re-estimate.
