Running Ingestion

CLI command reference, dry-run workflow, environment setup, and troubleshooting.

The transcript pipeline runs as a CLI script from the project root. This page covers the full command reference, the recommended dry-run workflow, and troubleshooting.

CLI Command

npx tsx apps/web/scripts/ingest-transcript.ts \
  --file /path/to/transcript_refined.md \
  --week 1 \
  --day tue \
  --date 2026-03-31 \
  --project-id <uuid> \
  --account-id <uuid>

Arguments

Required Arguments

| Argument | Type | Description | Example |
| --- | --- | --- | --- |
| `--file` | string | Path to the Transcribo Markdown transcript | `/path/to/.transcribo/output/transcript_refined.md` |
| `--week` | number | Week number in the course | `1` |
| `--day` | string | Day of the week (`tue` or `wed`) | `tue` |
| `--date` | string | Session date in ISO format (`YYYY-MM-DD`) | `2026-03-31` |
| `--project-id` | UUID | PrivateLanguage project UUID | `a1b2c3d4-...` |
| `--account-id` | UUID | Team account UUID | `f9e8d7c6-...` |

Optional Arguments

| Argument | Type | Default | Description |
| --- | --- | --- | --- |
| `--instructor` | string | `"Chris Moore"` | Instructor's name as it appears in the transcript |
| `--topic` | string | (none) | Course topic for this session (e.g., "Visual Thinking") |
| `--output-dir` | string | `.transcribo/output` | Directory for the Markdown session log |
| `--dry-run` | boolean | `false` | Parse the transcript without writing to PrivateLanguage |
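
The argument tables above can be read as a small validation contract. The following is a minimal sketch of that contract, not the script's actual parser: `IngestArgs`, `collectFlags`, and `parseIngestArgs` are hypothetical names, and the validation messages other than "Missing required argument" are assumptions made for illustration.

```typescript
// Hypothetical shape for the parsed arguments; field names are illustrative.
interface IngestArgs {
  file: string;
  week: number;
  day: "tue" | "wed";
  date: string;
  projectId: string;
  accountId: string;
  instructor: string;
  topic?: string;
  outputDir: string;
  dryRun: boolean;
}

// Flags that take no value (from the optional-arguments table).
const BOOLEAN_FLAGS = new Set(["dry-run"]);

// Collect `--name value` pairs from an argv-style array.
function collectFlags(argv: string[]): Map<string, string | true> {
  const flags = new Map<string, string | true>();
  for (let i = 0; i < argv.length; i++) {
    const token = argv[i];
    if (!token.startsWith("--")) continue;
    const name = token.slice(2);
    if (BOOLEAN_FLAGS.has(name)) {
      flags.set(name, true);
    } else if (i + 1 < argv.length) {
      flags.set(name, argv[++i]);
    }
  }
  return flags;
}

function parseIngestArgs(argv: string[]): IngestArgs {
  const flags = collectFlags(argv);
  const need = (name: string): string => {
    const value = flags.get(name);
    if (typeof value !== "string") {
      throw new Error(`Missing required argument: --${name}`);
    }
    return value;
  };
  const optional = (name: string, fallback: string): string => {
    const value = flags.get(name);
    return typeof value === "string" ? value : fallback;
  };

  // The six required arguments, in the order the table lists them.
  const file = need("file");
  const week = Number(need("week"));
  const day = need("day");
  const date = need("date");
  const projectId = need("project-id");
  const accountId = need("account-id");

  // Assumed validation rules, derived from the Type and Description columns.
  if (!Number.isInteger(week) || week < 1) throw new Error("--week must be a positive integer");
  if (day !== "tue" && day !== "wed") throw new Error("--day must be tue or wed");
  if (!/^\d{4}-\d{2}-\d{2}$/.test(date)) throw new Error("--date must be YYYY-MM-DD");

  const topic = flags.get("topic");
  return {
    file,
    week,
    day,
    date,
    projectId,
    accountId,
    instructor: optional("instructor", "Chris Moore"),
    topic: typeof topic === "string" ? topic : undefined,
    outputDir: optional("output-dir", ".transcribo/output"),
    dryRun: flags.get("dry-run") === true,
  };
}
```

In a Node script this would typically be called as `parseIngestArgs(process.argv.slice(2))`. The sketch applies the documented defaults only when the optional flag is absent.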

Dry-Run Workflow: Always Test First

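Before running the full pipeline, always do a dry run to verify that the transcript parses correctly. A dry run writes nothing to PrivateLanguage, so the example below passes placeholder values for `--project-id` and `--account-id`: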

npx tsx apps/web/scripts/ingest-transcript.ts \
  --file /path/to/transcript_refined.md \
  --week 1 \
  --day tue \
  --date 2026-03-31 \
  --project-id placeholder \
  --account-id placeholder \
  --dry-run

A dry run prints output like this:

=== Transcript Pipeline: brown-w1-tue-2026-03-31 ===

DRY RUN — parsing transcript only, no PL writes
Parsed 47 speaker turns
Speakers: Chris Moore, Student A, Student B, Student C

Check that:

- The turn count and the number of speakers look reasonable for your session
- All expected speakers appear in the list
- No speaker appears under more than one name (e.g., "Chris" and "Chris Moore" as separate entries)

If no speaker turns are parsed, the transcript likely has formatting issues. See Preparing Transcripts.
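
Spotting name variants by eye gets harder as the speaker list grows. As a rough aid, the sketch below flags pairs of speakers where one name is a word-for-word prefix of the other, ignoring case. It is not part of the pipeline: `parseSpeakersLine` and `findNameVariants` are hypothetical helpers, and the prefix rule is only one heuristic among several possible ones.

```typescript
// Split the "Speakers: ..." line from the dry-run output into names.
function parseSpeakersLine(line: string): string[] {
  return line
    .replace(/^Speakers:\s*/, "")
    .split(",")
    .map((name) => name.trim())
    .filter((name) => name !== "");
}

// Return pairs where one name's words are a prefix of the other's
// (case-insensitive), e.g. "Chris" and "Chris Moore".
function findNameVariants(speakers: string[]): Array<[string, string]> {
  const words = (name: string): string[] => name.trim().toLowerCase().split(/\s+/);
  const pairs: Array<[string, string]> = [];
  for (let i = 0; i < speakers.length; i++) {
    for (let j = i + 1; j < speakers.length; j++) {
      const a = words(speakers[i]);
      const b = words(speakers[j]);
      const [shorter, longer] = a.length <= b.length ? [a, b] : [b, a];
      if (shorter.every((word, k) => word === longer[k])) {
        pairs.push([speakers[i], speakers[j]]);
      }
    }
  }
  return pairs;
}
```

Names such as "Student A" and "Student B" are not flagged, because neither is a prefix of the other. Any flagged pair still needs a human decision about which spelling to keep.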

Environment Variables

The full pipeline (non-dry-run) requires four environment variables:

| Variable | Purpose |
| --- | --- |
| `SUPABASE_URL` | Your PrivateLanguage Supabase instance URL |
| `SUPABASE_SERVICE_ROLE_KEY` | Service role key for database writes (bypasses RLS) |
| `OPENAI_API_KEY` | OpenAI API key for generating embeddings |
| `ANTHROPIC_API_KEY` | Anthropic API key for idea extraction via Claude |

Set them in your shell before running:

export SUPABASE_URL="https://your-instance.supabase.co"
export SUPABASE_SERVICE_ROLE_KEY="eyJ..."
export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."

Alternatively, keep them in a `.env` file and load it with a tool such as dotenv.
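
Checking all four variables up front reports every gap at once instead of one failure per run. A minimal sketch of such a preflight check follows. `missingEnvVars` is a hypothetical helper, not something the script is known to export, and treating blank values as missing is an assumption.

```typescript
// The four variables listed in the table above.
const REQUIRED_ENV = [
  "SUPABASE_URL",
  "SUPABASE_SERVICE_ROLE_KEY",
  "OPENAI_API_KEY",
  "ANTHROPIC_API_KEY",
] as const;

// Return the names of required variables that are unset or blank.
function missingEnvVars(env: Record<string, string | undefined>): string[] {
  return REQUIRED_ENV.filter((name) => !env[name]?.trim());
}

// Typical use in a Node script (shown as a comment to keep the sketch standalone):
//   const missing = missingEnvVars(process.env);
//   if (missing.length > 0) console.error(`Missing: ${missing.join(", ")}`);
```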

Full Pipeline Example

npx tsx apps/web/scripts/ingest-transcript.ts \
  --file .transcribo/output/transcript_refined.md \
  --week 2 \
  --day wed \
  --date 2026-04-02 \
  --instructor "Chris Moore" \
  --topic "Visual Thinking" \
  --project-id a1b2c3d4-e5f6-7890-abcd-ef1234567890 \
  --account-id f9e8d7c6-b5a4-3210-fedc-ba9876543210

Reading the Output

A successful run produces output like this:

=== Transcript Pipeline: brown-w2-wed-2026-04-02 ===

Extracted 23 ideas from 47 speaker turns
Detected 15 relationships
Ideas log written to .transcribo/output/brown-w2-wed-2026-04-02-ideas.md
Ingested 23 atoms (capture event: ce_abc123)
Wrote 15 relationships to atom_relationships

=== Pipeline Complete ===
Ideas extracted: 23
Atoms created: 23
Capture event: ce_abc123
Markdown log: .transcribo/output/brown-w2-wed-2026-04-02-ideas.md

The Markdown log file is a human-readable record of every extracted idea. Open it to review what the AI captured before sharing with students.

Session ID Format

The pipeline automatically generates a session ID from your arguments:

brown-w{week}-{day}-{date}

For example: brown-w2-wed-2026-04-02
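
The format is a plain string template, so the ID for a planned run can be worked out before running anything. The sketch below reproduces the documented pattern. `buildSessionId` and `ideasLogPath` are hypothetical names, and the log path pattern is inferred from the sample output on this page rather than from the script's source.

```typescript
// Build the session ID using the documented pattern brown-w{week}-{day}-{date}.
function buildSessionId(week: number, day: "tue" | "wed", date: string): string {
  return `brown-w${week}-${day}-${date}`;
}

// Assumed from the sample output: the ideas log is named {sessionId}-ideas.md
// inside the output directory.
function ideasLogPath(outputDir: string, sessionId: string): string {
  return `${outputDir}/${sessionId}-ideas.md`;
}

const sessionId = buildSessionId(2, "wed", "2026-04-02");
console.log(sessionId); // brown-w2-wed-2026-04-02
console.log(ideasLogPath(".transcribo/output", sessionId));
```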

Troubleshooting

"Missing required argument: --file"

A required argument is missing; the message names it (`--file` in this example). All six required arguments must be provided.

"No speaker turns found in ..."

The transcript file was read but no lines matched the expected format. Common causes:

- ASCII arrows (->) instead of Unicode arrows (→)
- Missing bold markers on speaker names
- File is empty or contains only headers/metadata

See Preparing Transcripts for formatting fixes.
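
A quick count over the file can show which of these causes is the likely one. The sketch below is a rough diagnostic, not the pipeline's parser: `diagnoseTranscript` is a hypothetical helper, and it assumes that bold markers are Markdown `**...**` pairs. It does not attempt to match the full turn format.

```typescript
interface TranscriptDiagnostics {
  nonEmptyLines: number;     // 0 suggests an empty file
  asciiArrowLines: number;   // lines containing "->"
  unicodeArrowLines: number; // lines containing "→"
  boldLines: number;         // lines containing a **bold** span (assumed marker style)
}

// Count the features named in the troubleshooting list above.
function diagnoseTranscript(text: string): TranscriptDiagnostics {
  const lines = text.split(/\r?\n/).filter((line) => line.trim() !== "");
  return {
    nonEmptyLines: lines.length,
    asciiArrowLines: lines.filter((line) => line.includes("->")).length,
    unicodeArrowLines: lines.filter((line) => line.includes("→")).length,
    boldLines: lines.filter((line) => /\*\*[^*]+\*\*/.test(line)).length,
  };
}
```

A file with many ASCII-arrow lines and no Unicode-arrow lines points at the first cause; a file with arrows but no bold spans points at the second.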

"Missing SUPABASE_URL or SUPABASE_SERVICE_ROLE_KEY"

One or both of these environment variables are not set. The check applies only to full runs, not dry runs. Export the required variables or check your `.env` file.

"Missing OPENAI_API_KEY (for embeddings)"

The OpenAI API key is needed to generate vector embeddings for search. Set OPENAI_API_KEY in your environment.

"Missing ANTHROPIC_API_KEY (for idea extraction)"

The Anthropic API key is needed for Claude to extract ideas from the transcript. Set ANTHROPIC_API_KEY in your environment.

"Pipeline failed: ..."

An unexpected error occurred; the error message contains the details. Common causes:

- Network issues: API calls to OpenAI or Anthropic failed
- Database issues: Supabase is unreachable or the service role key is invalid
- Invalid UUIDs: the `--project-id` or `--account-id` value doesn't exist in the database

Check the error message and verify your credentials and IDs.
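
One cheap check before a full run is whether the two IDs are at least well-formed UUIDs. The sketch below tests only the textual format. `looksLikeUuid` is a hypothetical helper, and it cannot tell whether the ID exists in the database.

```typescript
// Loose format check: 8-4-4-4-12 hexadecimal digits, case-insensitive.
const UUID_PATTERN = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

function looksLikeUuid(value: string): boolean {
  return UUID_PATTERN.test(value.trim());
}

console.log(looksLikeUuid("a1b2c3d4-e5f6-7890-abcd-ef1234567890")); // true
console.log(looksLikeUuid("placeholder")); // false
```

A value that passes this check can still be rejected by the database, so treat a pass as necessary rather than sufficient.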