Running Ingestion
CLI command reference, dry-run workflow, environment setup, and troubleshooting.
The transcript pipeline runs as a CLI script from the project root.
CLI Command
    npx tsx apps/web/scripts/ingest-transcript.ts \
      --file /path/to/transcript_refined.md \
      --week 1 \
      --day tue \
      --date 2026-03-31 \
      --project-id <uuid> \
      --account-id <uuid>
Arguments
Required Arguments
| Argument | Type | Description | Example |
|---|---|---|---|
| --file | string | Path to the Transcribo Markdown transcript | /path/to/.transcribo/output/transcript_refined.md |
| --week | number | Week number in the course | 1 |
| --day | string | Day of the week (tue or wed) | tue |
| --date | string | Session date in ISO 8601 format (YYYY-MM-DD) | 2026-03-31 |
| --project-id | UUID | PrivateLanguage project UUID | a1b2c3d4-... |
| --account-id | UUID | Team account UUID | f9e8d7c6-... |
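The constraints in the table can be checked before invoking the script. The following is a minimal sketch, not the script's own validation logic (which this page does not show). `RequiredArgs`, `isIsoDate`, and `validateRequiredArgs` are hypothetical names, and treating the IDs as canonical 8-4-4-4-12 hexadecimal UUIDs is an assumption.

```typescript
// Hypothetical pre-flight check; the real script's validation may differ.
interface RequiredArgs {
  file: string;
  week: number;
  day: string;
  date: string;
  projectId: string;
  accountId: string;
}

// Assumption: IDs use the canonical 8-4-4-4-12 hexadecimal UUID layout.
const UUID_RE = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

// True only for a real calendar date written as YYYY-MM-DD.
function isIsoDate(value: string): boolean {
  const m = /^(\d{4})-(\d{2})-(\d{2})$/.exec(value);
  if (!m) return false;
  const [y, mo, d] = [Number(m[1]), Number(m[2]), Number(m[3])];
  // Round-trip through Date.UTC: out-of-range parts roll over and no longer match.
  const dt = new Date(Date.UTC(y, mo - 1, d));
  return dt.getUTCFullYear() === y && dt.getUTCMonth() === mo - 1 && dt.getUTCDate() === d;
}

// Returns a list of problems; an empty list means the arguments look plausible.
function validateRequiredArgs(args: RequiredArgs): string[] {
  const problems: string[] = [];
  if (args.file.trim() === "") problems.push("--file must not be empty");
  if (!Number.isInteger(args.week) || args.week < 1) problems.push("--week must be a positive integer");
  if (args.day !== "tue" && args.day !== "wed") problems.push("--day must be tue or wed");
  if (!isIsoDate(args.date)) problems.push("--date must be a valid YYYY-MM-DD date");
  if (!UUID_RE.test(args.projectId)) problems.push("--project-id must be a UUID");
  if (!UUID_RE.test(args.accountId)) problems.push("--account-id must be a UUID");
  return problems;
}
```

Dry runs on this page pass `placeholder` for both IDs, so the UUID checks only make sense ahead of a full run.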
Optional Arguments
| Argument | Type | Default | Description |
|---|---|---|---|
| --instructor | string | "Chris Moore" | Instructor's name as it appears in the transcript |
| --topic | string | (none) | Course topic for this session (e.g., "Visual Thinking") |
| --output-dir | string | .transcribo/output | Directory for the Markdown session log |
| --dry-run | boolean | false | Parse the transcript without writing to PrivateLanguage |
Dry-Run Workflow: Always Test First
Before running the full pipeline, always do a dry run to verify the transcript parses correctly:
    npx tsx apps/web/scripts/ingest-transcript.ts \
      --file /path/to/transcript_refined.md \
      --week 1 \
      --day tue \
      --date 2026-03-31 \
      --project-id placeholder \
      --account-id placeholder \
      --dry-run
The dry run will output:
    === Transcript Pipeline: brown-w1-tue-2026-03-31 ===
    DRY RUN — parsing transcript only, no PL writes
    Parsed 47 speaker turns
    Speakers: Chris Moore, Student A, Student B, Student C
Check that:
- The number of speaker turns looks reasonable for your session
- All expected speakers appear in the list
- No speaker is listed under more than one name variant (e.g., "Chris" and "Chris Moore" as separate entries)
If the dry run reports zero speaker turns, the transcript likely has formatting issues. See Preparing Transcripts.
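The name-variant check in the list above can be automated for long speaker lists. This is a heuristic sketch only: `findSpeakerVariants` is a hypothetical helper, not part of the pipeline, and it assumes you have copied the names from the dry run's `Speakers:` line into an array.

```typescript
// Flags pairs of speaker names where one looks like a variant of the other,
// e.g. "Chris" and "Chris Moore". Heuristic: compares whitespace-separated tokens.
function findSpeakerVariants(speakers: string[]): Array<[string, string]> {
  const tokens = (name: string): string[] =>
    name.toLowerCase().split(/\s+/).filter(Boolean);
  const pairs: Array<[string, string]> = [];

  speakers.forEach((first, i) => {
    for (const second of speakers.slice(i + 1)) {
      const a = tokens(first);
      const b = tokens(second);
      const [shorter, longer] = a.length <= b.length ? [a, b] : [b, a];
      // Case 1: every token of the shorter name appears in the longer one.
      const isSubset =
        shorter.length > 0 &&
        shorter.length < longer.length &&
        shorter.every((t) => longer.includes(t));
      // Case 2: same tokens, differing only in case or spacing.
      const sameIgnoringCase = a.join(" ") === b.join(" ");
      if (isSubset || sameIgnoringCase) pairs.push([first, second]);
    }
  });
  return pairs;
}
```

For the sample dry-run output above, the list `["Chris Moore", "Student A", "Student B", "Student C"]` yields no pairs, while adding `"Chris"` to it would flag the pair with `"Chris Moore"`. Any flagged pair still needs a human decision, since two different people can share a first name.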
Environment Variables
The full pipeline (non-dry-run) requires four environment variables:
| Variable | Purpose |
|---|---|
| SUPABASE_URL | Your PrivateLanguage Supabase instance URL |
| SUPABASE_SERVICE_ROLE_KEY | Service role key for database writes (bypasses RLS) |
| OPENAI_API_KEY | OpenAI API key for generating embeddings |
| ANTHROPIC_API_KEY | Anthropic API key for idea extraction via Claude |
Set them in your shell before running:
    export SUPABASE_URL="https://your-instance.supabase.co"
    export SUPABASE_SERVICE_ROLE_KEY="eyJ..."
    export OPENAI_API_KEY="sk-..."
    export ANTHROPIC_API_KEY="sk-ant-..."
Or use a .env file with a tool like dotenv.
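A quick way to confirm all four variables are present before a full run is sketched below. `missingEnvVars` is a hypothetical helper, not something the pipeline exports, and the script's own checks may behave differently.

```typescript
// The four variables this page lists as required for a full (non-dry-run) pipeline run.
const REQUIRED_ENV_VARS = [
  "SUPABASE_URL",
  "SUPABASE_SERVICE_ROLE_KEY",
  "OPENAI_API_KEY",
  "ANTHROPIC_API_KEY",
] as const;

// Returns the names of required variables that are unset or blank in the given environment.
function missingEnvVars(env: Record<string, string | undefined>): string[] {
  return REQUIRED_ENV_VARS.filter((name) => (env[name] ?? "").trim() === "");
}
```

In a Node script this would typically be called as `missingEnvVars(process.env)`. Taking the environment as a parameter keeps the function easy to test without touching the real shell environment.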
Full Pipeline Example
    npx tsx apps/web/scripts/ingest-transcript.ts \
      --file .transcribo/output/transcript_refined.md \
      --week 2 \
      --day wed \
      --date 2026-04-02 \
      --instructor "Chris Moore" \
      --topic "Visual Thinking" \
      --project-id a1b2c3d4-e5f6-7890-abcd-ef1234567890 \
      --account-id f9e8d7c6-b5a4-3210-fedc-ba9876543210
Reading the Output
A successful run produces output like this:
    === Transcript Pipeline: brown-w2-wed-2026-04-02 ===
    Extracted 23 ideas from 47 speaker turns
    Detected 15 relationships
    Ideas log written to .transcribo/output/brown-w2-wed-2026-04-02-ideas.md
    Ingested 23 atoms (capture event: ce_abc123)
    Wrote 15 relationships to atom_relationships
    === Pipeline Complete ===
    Ideas extracted: 23
    Atoms created: 23
    Capture event: ce_abc123
    Markdown log: .transcribo/output/brown-w2-wed-2026-04-02-ideas.md
The Markdown log file is a human-readable record of every extracted idea. Open it to review what the AI captured before sharing with students.
Session ID Format
The pipeline automatically generates a session ID from your arguments:
brown-w{week}-{day}-{date}
For example: brown-w2-wed-2026-04-02
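The pattern can be reproduced in a few lines, which is handy for predicting the session ID and log file name before a run. This is a sketch of the documented format, not the pipeline's actual code; `buildSessionId` and `ideasLogName` are hypothetical names, and the `-ideas.md` suffix is inferred from the sample output on this page.

```typescript
type Day = "tue" | "wed";

// Mirrors the documented pattern brown-w{week}-{day}-{date}.
function buildSessionId(week: number, day: Day, date: string): string {
  return `brown-w${week}-${day}-${date}`;
}

// The sample output names the Markdown log "<session-id>-ideas.md".
function ideasLogName(sessionId: string): string {
  return `${sessionId}-ideas.md`;
}

const sessionId = buildSessionId(2, "wed", "2026-04-02");
// sessionId is "brown-w2-wed-2026-04-02"
const logName = ideasLogName(sessionId);
// logName is "brown-w2-wed-2026-04-02-ideas.md"
```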
Troubleshooting
"Missing required argument: --file"
A required argument was omitted. All six required arguments (--file, --week, --day, --date, --project-id, --account-id) must be provided.
"No speaker turns found in ..."
The transcript file was read but no lines matched the expected format. Common causes:
- ASCII arrows (->) instead of Unicode arrows (→)
- Missing bold markers on speaker names
- File is empty or contains only headers/metadata
See Preparing Transcripts for formatting fixes.
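To narrow down which cause applies, it can help to count the symptoms in the file. The sketch below deliberately does not assume the exact line format the parser expects (that is defined in Preparing Transcripts). It only counts non-empty lines, lines containing each arrow style, and lines containing a Markdown bold span. `diagnoseTranscript` and `TranscriptDiagnostics` are hypothetical names.

```typescript
interface TranscriptDiagnostics {
  totalLines: number;        // non-empty lines in the file
  asciiArrowLines: number;   // lines containing "->"
  unicodeArrowLines: number; // lines containing "→"
  boldMarkerLines: number;   // lines containing a **bold** span
}

// Counts formatting symptoms in the transcript text; it does not parse speaker turns.
function diagnoseTranscript(text: string): TranscriptDiagnostics {
  const lines = text.split(/\r?\n/).filter((line) => line.trim() !== "");
  return {
    totalLines: lines.length,
    asciiArrowLines: lines.filter((line) => line.includes("->")).length,
    unicodeArrowLines: lines.filter((line) => line.includes("→")).length,
    boldMarkerLines: lines.filter((line) => /\*\*[^*]+\*\*/.test(line)).length,
  };
}
```

A result with many ASCII-arrow lines and no Unicode-arrow lines points at the first cause, zero bold-marker lines points at the second, and a very small `totalLines` suggests the file is empty or contains only metadata.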
"Missing SUPABASE_URL or SUPABASE_SERVICE_ROLE_KEY"
One or both of these environment variables are not set. This check only applies to full runs, not dry runs. Export the missing variables or check your .env file.
"Missing OPENAI_API_KEY (for embeddings)"
The OpenAI API key is needed to generate vector embeddings for search. Set OPENAI_API_KEY in your environment.
"Missing ANTHROPIC_API_KEY (for idea extraction)"
The Anthropic API key is needed for Claude to extract ideas from the transcript. Set ANTHROPIC_API_KEY in your environment.
"Pipeline failed: ..."
An unexpected error occurred. The error message will contain details. Common causes:
- Network issues — API calls to OpenAI or Anthropic failed
- Database issues — Supabase is unreachable or the service role key is invalid
- Invalid UUIDs — the project-id or account-id doesn't exist in the database
Check the error message and verify your credentials and IDs.