Repository Safety Checks and Gitleaks Setup

Purpose: Prevent committing secrets and sensitive transcript material to the repository.
Related rule: .cursor/rules/no-secrets-in-repo.mdc


1. Purpose

The repo uses two local hook systems plus CI checks:

  • Husky git hooks run repo-owned JavaScript checks from .husky/.
  • pre-commit framework hooks run third-party checks from .pre-commit-config.yaml.
  • GitHub Actions rerun sensitive transcript and repository hygiene checks on pull requests.

Together they catch secrets, sensitive transcript files, Cursor asset issues, and PR hygiene problems before they land on main.


2. One-Time Setup

Install repo-managed Husky hooks

From the repo root:

npm install

The root prepare script installs Husky hooks and initializes Cursor skill submodules. After this, a normal git commit runs .husky/pre-commit.

Install pre-commit

# Option A: via pip
pip install pre-commit
 
# Option B: via Homebrew (macOS)
brew install pre-commit

Install the pre-commit hook

From the repo root:

pre-commit install

This installs hooks from .pre-commit-config.yaml, including Gitleaks. On the first run, pre-commit fetches and sets up Gitleaks automatically, so that run can take longer than later ones.


3. Usage

  • Automatic on commit: Husky runs scripts/scan-sensitive-transcripts.mjs --staged, scripts/pr-hygiene-check.mjs --staged, and scripts/lint-cursor-assets.mjs --staged.
  • Automatic through pre-commit: If you installed pre-commit, Gitleaks and the configured local pre-commit hooks also run on git commit.
  • Automatic on PR: .github/workflows/sensitive-transcript-check.yml scans pull requests that touch knowledge/**; .github/workflows/cursor-hygiene.yml runs repository hygiene checks and script unit tests for relevant paths.
  • Manual Gitleaks scan: pre-commit run gitleaks --all-files
  • Manual sensitive transcript staged scan: node scripts/scan-sensitive-transcripts.mjs --staged
  • Manual sensitive transcript CI-style scan: node scripts/scan-sensitive-transcripts.mjs --ci

4. Sensitive Transcript Guardrail

scripts/scan-sensitive-transcripts.mjs blocks sensitive transcript markdown from being committed or merged.

What It Scans

The scanner only checks Markdown files whose normalized path contains /transcripts/ and ends with .md, such as:

knowledge/clients/acme/transcripts/2026-05-01-team-sync.md

It does not scan non-transcript notes under resources/, non-Markdown files, or files outside a transcripts folder.
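The path rule above can be sketched as a small predicate. This is a minimal illustration, not the code in scripts/scan-sensitive-transcripts.mjs: the function name isTranscriptCandidate is hypothetical, and "normalized" is assumed to mean converting backslashes to forward slashes.

```javascript
// Hypothetical helper mirroring the documented rule; the real script may differ.
function isTranscriptCandidate(filePath) {
  // Assumption: normalization converts Windows separators to forward slashes.
  const normalized = filePath.replace(/\\/g, "/");
  // A candidate must sit inside a transcripts folder and be a Markdown file.
  return normalized.includes("/transcripts/") && normalized.endsWith(".md");
}
```

Under these assumptions, knowledge/clients/acme/transcripts/2026-05-01-team-sync.md is a candidate, while a note under resources/ or an audio file inside transcripts/ is not.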

What It Flags

The guardrail flags two types of matches:

  • Sensitive filenames: examples include 1-on-1, one-on-one, compensation, salary, stipend, paternity, w-2, visa planning/sponsorship, performance review, role transition, and role discussion patterns.
  • Sensitive content: examples include visa sponsorship, H-1B, green card process, compensation or salary review, paternity leave/stipend, benefits and visa, performance improvement plans, underperformance, and “not meeting expectations.”

Content scanning is intentionally limited to the first 8,000 characters of each candidate transcript to keep hooks fast while catching the usual title and summary sections.
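A rough sketch of the two match types and the 8,000-character limit follows. The pattern lists are small illustrative subsets chosen from the examples above, and the names FILENAME_PATTERNS, CONTENT_PATTERNS, CONTENT_SCAN_LIMIT, and findSensitiveMatches are hypothetical; the actual patterns and structure live in the script.

```javascript
// Illustrative subsets only; the real pattern lists are defined in the scanner script.
const FILENAME_PATTERNS = [/1-on-1/i, /one-on-one/i, /compensation/i, /salary/i];
const CONTENT_PATTERNS = [/visa sponsorship/i, /\bH-1B\b/i, /paternity leave/i, /not meeting expectations/i];

// Only the start of each transcript is inspected, to keep hooks fast.
const CONTENT_SCAN_LIMIT = 8000;

// Hypothetical helper: returns a list of human-readable reasons, empty when clean.
function findSensitiveMatches(filePath, content) {
  const fileName = filePath.split("/").pop();
  const head = content.slice(0, CONTENT_SCAN_LIMIT);
  const reasons = [];
  for (const pattern of FILENAME_PATTERNS) {
    if (pattern.test(fileName)) reasons.push(`filename matches ${pattern}`);
  }
  for (const pattern of CONTENT_PATTERNS) {
    if (pattern.test(head)) reasons.push(`content matches ${pattern}`);
  }
  return reasons;
}
```

One consequence of the limit, visible in this sketch, is that a sensitive phrase appearing only after the first 8,000 characters would not be flagged by content scanning.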

Local vs. CI Behavior

  • --staged checks git diff --cached --name-only --diff-filter=ACMR and reads staged file content from the git index with git show :path. Editing the working tree after staging does not bypass the check.
  • --ci checks changed files across a PR diff. It resolves GITHUB_BASE_REF and GITHUB_HEAD_REF to available refs, preferring origin/<branch> when GitHub provides branch names and falling back to HEAD for pull request merge checkouts.
  • --verbose or -v prints a success message when no sensitive transcripts are found.
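The two modes can be sketched as pure helpers. The git arguments are the ones named above; everything else is an assumption for illustration: the function names are hypothetical, availableRefs stands in for however the script discovers refs, and the fallback to a bare local branch name is a guess rather than documented behavior.

```javascript
// Arguments for listing staged files (added, copied, modified, renamed).
function stagedFileListArgs() {
  return ["diff", "--cached", "--name-only", "--diff-filter=ACMR"];
}

// Arguments for reading the staged version of a file; ":path" addresses the git index,
// so later working-tree edits are not what gets scanned.
function stagedContentArgs(filePath) {
  return ["show", `:${filePath}`];
}

// Hypothetical ref resolution for --ci: prefer origin/<branch>, fall back to HEAD.
function resolveRef(branchName, availableRefs) {
  if (branchName) {
    const remoteRef = `origin/${branchName}`;
    if (availableRefs.includes(remoteRef)) return remoteRef;
    // Assumption: a local branch of the same name is an acceptable second choice.
    if (availableRefs.includes(branchName)) return branchName;
  }
  // No usable branch name, as in a pull request merge checkout.
  return "HEAD";
}
```

For example, resolveRef("main", ["origin/main", "main"]) yields "origin/main", and an empty branch name yields "HEAD".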

If It Fails

  1. Unstage the flagged transcript with git restore --staged path/to/file.md, then move or delete the file.
  2. Store sensitive notes outside this repository in an approved private location.
  3. If the content is a non-sensitive operational note, move it out of transcripts/ to the right knowledge/clients/{client}/resources/ or meeting-notes path.
  4. If sensitive content was already committed or pushed, treat it as a repository hygiene incident: remove it from the branch and ask an owner whether history cleanup is needed.

Avoid git commit --no-verify for this check. The CI workflow will still block PRs that add sensitive transcript material under knowledge/**.

Tests

Run the scanner tests from the repo root:

node --test tests/scripts/scan-sensitive-transcripts.test.mjs

The tests cover staged-index reads, CI filename matches, and ignored non-transcript paths.


5. False Positives

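If Gitleaks flags a file that does not contain real secrets (e.g. test fixtures, example configs), add the fingerprint of each finding, as shown in the Gitleaks report, to .gitleaksignore at the repo root. A comment line can record which file the entry covers:

# path/to/file-with-fake-secrets.txt
<fingerprint copied from the Gitleaks report>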

The sensitive transcript scanner does not have an allowlist. Prefer moving non-transcript material to the right knowledge/ location or renaming an incorrectly named transcript. Ask a repository owner before bypassing the hook.


6. Credentials

Do not commit secrets. Use environment variables or 1Password CLI (op) for credentials. See 1password-cli-setup.md and README.md.
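As a minimal sketch of the environment-variable approach, a script can read a credential at runtime and fail early when it is missing. The helper name requireEnv and the variable name EXAMPLE_API_TOKEN are hypothetical placeholders, not names used by this repo.

```javascript
// Hypothetical helper: read a credential from the environment instead of a committed file.
// The env parameter defaults to process.env and exists so the function is easy to test.
function requireEnv(name, env = process.env) {
  const value = env[name];
  if (!value) {
    // Fail loudly without printing any secret value.
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Usage (placeholder name): const token = requireEnv("EXAMPLE_API_TOKEN");
```

The value never appears in the repository, so neither Gitleaks nor the transcript scanner has anything to flag.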