Running Ingestion

CLI command reference, dry-run workflow, environment setup, and troubleshooting.

The transcript pipeline runs as a CLI script. Run every command on this page from the project root.

CLI Command

npx tsx apps/web/scripts/ingest-transcript.ts \
  --file /path/to/transcript_refined.md \
  --week 1 \
  --day tue \
  --date 2026-03-31 \
  --project-id <uuid> \
  --account-id <uuid>

Arguments

Required Arguments

| Argument | Type | Description | Example |
| --- | --- | --- | --- |
| --file | string | Path to the Transcribo Markdown transcript | /path/to/.transcribo/output/transcript_refined.md |
| --week | number | Week number in the course | 1 |
| --day | string | Day of the week (tue or wed) | tue |
| --date | string | Session date in ISO format (YYYY-MM-DD) | 2026-03-31 |
| --project-id | UUID | PrivateLanguage project UUID | a1b2c3d4-... |
| --account-id | UUID | Team account UUID | f9e8d7c6-... |

Optional Arguments

| Argument | Type | Default | Description |
| --- | --- | --- | --- |
| --instructor | string | "Chris Moore" | Instructor's name as it appears in the transcript |
| --topic | string | (none) | Course topic for this session (e.g., "Visual Thinking") |
| --output-dir | string | .transcribo/output | Directory for the Markdown session log |
| --dry-run | boolean | false | Parse the transcript without writing to PrivateLanguage |
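
The constraints in the argument lists above can be checked before the script is invoked. The following is a minimal sketch, not the script's own validation: validateArgs and the IngestArgs shape are hypothetical names, and the rules (a positive integer week, a tue or wed day, a YYYY-MM-DD date) are read off the descriptions above.

```typescript
// Hypothetical pre-flight check; the real script's validation may differ.
interface IngestArgs {
  file?: string;
  week?: string;
  day?: string;
  date?: string;
  projectId?: string;
  accountId?: string;
}

function validateArgs(args: IngestArgs): string[] {
  const problems: string[] = [];

  // All six required arguments must be present.
  const required: Array<[keyof IngestArgs, string]> = [
    ["file", "--file"],
    ["week", "--week"],
    ["day", "--day"],
    ["date", "--date"],
    ["projectId", "--project-id"],
    ["accountId", "--account-id"],
  ];
  for (const [key, flag] of required) {
    if (!args[key]) problems.push(`Missing required argument: ${flag}`);
  }

  // --week is a number; a positive integer is assumed here.
  if (args.week && !/^[1-9]\d*$/.test(args.week)) {
    problems.push(`--week must be a positive integer, got "${args.week}"`);
  }

  // --day is documented as tue or wed.
  if (args.day && args.day !== "tue" && args.day !== "wed") {
    problems.push(`--day must be "tue" or "wed", got "${args.day}"`);
  }

  // --date is ISO format; also reject impossible dates such as 2026-02-30.
  if (args.date) {
    const parsed = new Date(`${args.date}T00:00:00Z`);
    const valid =
      /^\d{4}-\d{2}-\d{2}$/.test(args.date) &&
      !Number.isNaN(parsed.getTime()) &&
      parsed.toISOString().slice(0, 10) === args.date;
    if (!valid) {
      problems.push(`--date must be a real date in YYYY-MM-DD form, got "${args.date}"`);
    }
  }

  return problems;
}
```

An empty result means the arguments look plausible. It does not confirm that the file exists or that the UUIDs are known to the database.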

Dry-Run Workflow: Always Test First

Before running the full pipeline, always do a dry run to verify the transcript parses correctly:

npx tsx apps/web/scripts/ingest-transcript.ts \
  --file /path/to/transcript_refined.md \
  --week 1 \
  --day tue \
  --date 2026-03-31 \
  --project-id placeholder \
  --account-id placeholder \
  --dry-run

The dry run prints output like this:

=== Transcript Pipeline: brown-w1-tue-2026-03-31 ===

DRY RUN — parsing transcript only, no PL writes
Parsed 47 speaker turns
Speakers: Chris Moore, Student A, Student B, Student C

Check that:

  1. The number of speaker turns looks reasonable for your session
  2. All expected speakers appear in the list
  3. No speaker appears under more than one name (e.g., "Chris" and "Chris Moore" as separate entries)

If no speaker turns are parsed, the transcript likely has formatting issues. See Preparing Transcripts.
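
With a long speaker list, the name-variant check can be done mechanically. A minimal sketch, assuming a variant is a name whose words all occur in another, longer name from the same list; findNameVariants is a hypothetical helper, not part of the pipeline.

```typescript
// Hypothetical helper: flags pairs such as "Chris" / "Chris Moore".
function findNameVariants(speakers: string[]): Array<[string, string]> {
  const words = (name: string): string[] =>
    name.toLowerCase().split(/\s+/).filter(Boolean);

  const pairs: Array<[string, string]> = [];
  for (const shorter of speakers) {
    for (const longer of speakers) {
      const a = words(shorter);
      const b = words(longer);
      // A variant has fewer words, and every one of them appears in the longer name.
      if (a.length < b.length && a.every((w) => b.includes(w))) {
        pairs.push([shorter, longer]);
      }
    }
  }
  return pairs;
}

// Speaker list copied by hand from a dry-run "Speakers:" line.
const dryRunSpeakers = ["Chris Moore", "Chris", "Student A", "Student B"];
console.log(findNameVariants(dryRunSpeakers));
```

Names with the same number of words, such as "Student A" and "Student B", are never paired, so distinct students are not flagged.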

Environment Variables

The full pipeline (non-dry-run) requires four environment variables:

| Variable | Purpose |
| --- | --- |
| SUPABASE_URL | Your PrivateLanguage Supabase instance URL |
| SUPABASE_SERVICE_ROLE_KEY | Service role key for database writes (bypasses RLS) |
| OPENAI_API_KEY | OpenAI API key for generating embeddings |
| ANTHROPIC_API_KEY | Anthropic API key for idea extraction via Claude |

Set them in your shell before running:

export SUPABASE_URL="https://your-instance.supabase.co"
export SUPABASE_SERVICE_ROLE_KEY="eyJ..."
export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."

Or use a .env file with a tool like dotenv.
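
To catch a missing variable before a full run starts, the four names can be checked in one pass. A minimal sketch: missingEnvVars is a hypothetical helper, and treating an empty or whitespace-only value as missing is an assumption, not documented behavior of the script.

```typescript
// The four variables the full pipeline requires.
const REQUIRED_ENV_VARS = [
  "SUPABASE_URL",
  "SUPABASE_SERVICE_ROLE_KEY",
  "OPENAI_API_KEY",
  "ANTHROPIC_API_KEY",
] as const;

// Hypothetical helper: returns the names that are unset or blank.
function missingEnvVars(env: Record<string, string | undefined>): string[] {
  return REQUIRED_ENV_VARS.filter((name) => !env[name]?.trim());
}
```

In a Node script, calling missingEnvVars(process.env) returns an empty array when everything is set.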

Full Pipeline Example

npx tsx apps/web/scripts/ingest-transcript.ts \
  --file .transcribo/output/transcript_refined.md \
  --week 2 \
  --day wed \
  --date 2026-04-02 \
  --instructor "Chris Moore" \
  --topic "Visual Thinking" \
  --project-id a1b2c3d4-e5f6-7890-abcd-ef1234567890 \
  --account-id f9e8d7c6-b5a4-3210-fedc-ba9876543210

Reading the Output

A successful run produces output like this:

=== Transcript Pipeline: brown-w2-wed-2026-04-02 ===

Extracted 23 ideas from 47 speaker turns
Detected 15 relationships
Ideas log written to .transcribo/output/brown-w2-wed-2026-04-02-ideas.md
Ingested 23 atoms (capture event: ce_abc123)
Wrote 15 relationships to atom_relationships

=== Pipeline Complete ===
Ideas extracted: 23
Atoms created: 23
Capture event: ce_abc123
Markdown log: .transcribo/output/brown-w2-wed-2026-04-02-ideas.md

The Markdown log file is a human-readable record of every extracted idea. Open it to review what the AI captured before sharing with students.
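
When runs are scripted, the closing summary can be turned into a record for logging. A minimal sketch, assuming the "Label: value" layout in the sample output above stays stable; parseSummary is a hypothetical helper, not something the pipeline provides.

```typescript
// Hypothetical helper: reads the lines after "=== Pipeline Complete ===".
function parseSummary(output: string): Record<string, string> {
  const marker = "=== Pipeline Complete ===";
  const start = output.indexOf(marker);
  if (start === -1) {
    throw new Error("No completed-run summary found");
  }

  const summary: Record<string, string> = {};
  for (const line of output.slice(start + marker.length).split("\n")) {
    // Split on the first ": " only, so values may contain colons.
    const sep = line.indexOf(": ");
    if (sep > 0) {
      summary[line.slice(0, sep).trim()] = line.slice(sep + 2).trim();
    }
  }
  return summary;
}
```

Throwing when the marker is absent makes a failed or interrupted run hard to mistake for a successful one.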

Session ID Format

The pipeline automatically generates a session ID from your arguments:

brown-w{week}-{day}-{date}

For example: brown-w2-wed-2026-04-02
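
The pattern is simple enough to reproduce when you need to predict a session ID or locate its log file. A minimal sketch: buildSessionId and ideasLogPath are hypothetical names, and the log file name is inferred from the sample output on this page, not taken from the script.

```typescript
type Day = "tue" | "wed";

// Mirrors the documented pattern brown-w{week}-{day}-{date}.
function buildSessionId(week: number, day: Day, date: string): string {
  return `brown-w${week}-${day}-${date}`;
}

// Inferred from the sample output: {output-dir}/{session-id}-ideas.md
function ideasLogPath(sessionId: string, outputDir = ".transcribo/output"): string {
  return `${outputDir}/${sessionId}-ideas.md`;
}

const sessionId = buildSessionId(2, "wed", "2026-04-02");
console.log(sessionId);
console.log(ideasLogPath(sessionId));
```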

Troubleshooting

"Missing required argument: --file"

A required argument is missing (in this case, --file). All six required arguments must be provided: --file, --week, --day, --date, --project-id, and --account-id.

"No speaker turns found in ..."

The transcript file was read but no lines matched the expected format. Common causes:

  • ASCII arrows (->) instead of Unicode arrows (→)
  • Missing bold markers on speaker names
  • File is empty or contains only headers/metadata

See Preparing Transcripts for formatting fixes.

"Missing SUPABASE_URL or SUPABASE_SERVICE_ROLE_KEY"

One or both of these environment variables are not set. This only applies to full runs (not dry runs). Export the required variables or check your .env file.

"Missing OPENAI_API_KEY (for embeddings)"

The OpenAI API key is needed to generate vector embeddings for search. Set OPENAI_API_KEY in your environment.

"Missing ANTHROPIC_API_KEY (for idea extraction)"

The Anthropic API key is needed for Claude to extract ideas from the transcript. Set ANTHROPIC_API_KEY in your environment.

"Pipeline failed: ..."

An unexpected error occurred. The error message will contain details. Common causes:

  • Network issues — API calls to OpenAI or Anthropic failed
  • Database issues — Supabase is unreachable or the service role key is invalid
  • Invalid UUIDs — the project-id or account-id doesn't exist in the database

Check the error message and verify your credentials and IDs.
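
A malformed ID can be ruled out locally before a full run. A minimal sketch, assuming the IDs use the standard 8-4-4-4-12 hexadecimal UUID layout; isWellFormedUuid is a hypothetical helper and cannot tell whether an ID actually exists in the database.

```typescript
// Standard 8-4-4-4-12 hexadecimal UUID layout, any version, either letter case.
const UUID_PATTERN =
  /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

// Hypothetical helper: a format check only, not an existence check.
function isWellFormedUuid(value: string): boolean {
  return UUID_PATTERN.test(value);
}

console.log(isWellFormedUuid("a1b2c3d4-e5f6-7890-abcd-ef1234567890"));
console.log(isWellFormedUuid("placeholder"));
```

The dry-run value placeholder fails this check, so it is only useful ahead of a full run.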