---
name: ai-shortform-video-editor
description: Edit or prepare AI Business Boomer / short-form videos from Sam's raw talking-head clips, recording docs, and three vertical images. Use when the user gives a video/audio clip and asks Nancy to edit it, make a rough cut, create a DaVinci Resolve edit plan, sync the clip to a teleprompter, add image cutaways, captions, zoom notes, or produce an editor-ready package/MP4 for TikTok/Reels/Shorts.
---

# AI Shortform Video Editor

Use this skill when Sam provides a raw clip and wants Nancy to do the editing work normally described in the recording docs.

## Core outcome

Produce one of these, depending on the ask and available assets:

1. **Editor-ready package**: transcript, cut list, image timing, caption chunks, and DaVinci instructions.
2. **Rough edited MP4**: vertical 9:16 cut using ffmpeg when feasible.
3. **DaVinci Resolve handoff**: exact timeline notes / EDL-style cut plan for manual finishing.

Do not claim a polished final edit unless an actual MP4 was produced and inspected.

## Default inputs

- Sam's raw talking-head clip, audio, or voice memo
- Google recording doc or pasted teleprompter/editor timeline
- Three vertical images/slides when available; if missing or weak, create/source a visual pack
- Target platform: TikTok / Reels / Shorts, default 9:16 under 60 seconds

## Workflow

1. **Create an edit package folder** under `automation-outputs/video-edits/<slug>-<timestamp>/`.
2. **Copy/link inputs** into the package: raw clip, images, doc export/transcript.
3. **Probe media** with `ffprobe` and generate thumbnails/contact sheet.
4. **Transcribe** the clip:
   - Prefer the `openai-whisper` skill / local Whisper CLI if available.
   - If transcription fails, extract audio and escalate with the exact blocker.
5. **Find the first audible word** and cut dead air before it. Do not hide opening silence under a chart/image.
6. **Compare transcript to teleprompter**:
   - Keep Sam's best natural wording if it is stronger than the doc.
   - Remove restarts, long pauses, filler, and duplicated lines.
7. **Create/source visuals when needed**:
   - First try existing recording-doc charts/images.
   - If visuals are missing, weak, or too repetitive, generate or source a visual pack.
   - For AI generation, prefer vertical phone-first visuals with no readable text/logos/robots; add exact text later if needed.
   - Normalize final visual assets to 1080x1920 and keep originals as backup.
   - Include a `visual_pack_notes.md` with recommended order and editor usage.
8. **Build the retention edit**:
   - First audible word should start at ~0:00.
   - Use chart/image cutaways as short visual proof, not long wallpaper.
   - Keep Sam on screen more than the images unless a chart is actively being explained.
   - For opening chart sections, prefer chart full-screen with Sam picture-in-picture bottom-right.
   - Opinions, stakes, belief correction, and CTA stay on Sam's face.
   - Add subtle push-ins on Sam's full-face shots and on face PiP when possible.
9. **Caption package**:
   - Always create a captions handoff package; do not depend on DaVinci auto-subtitles.
   - For text-heavy/chart visuals, do not place captions over the image.
   - For full-face sections, use Sam's exact words in narrow 1–2 line chunks.
10. **Produce output**:
    - If an MP4 can be safely rendered, create it and verify duration/resolution.
    - Otherwise create an editor handoff doc/markdown with exact timings.
11. **Verify**:
    - Check that the output file exists.
    - Use `ffprobe` for duration/resolution/audio streams.
    - Extract a few frames to verify images/crops are not broken.
    - When Drive is used, upload user-facing outputs and share them with `shmonac@gmail.com` as a writer.
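
Steps 1 and 5 above can be sketched roughly as follows. This is a minimal sketch, not the bundled scripts: the helper names (`slugify`, `package_dir`, `first_audible_offset`), the timestamp format, and the silence thresholds (`-35dB`, `0.3` s) are assumptions chosen for illustration and should be tuned per clip.

```shell
# Hypothetical helpers for illustration; not part of the bundled scripts.

# Turn a working title into a lowercase, dash-separated slug.
slugify() {
  printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | tr -cs 'a-z0-9' '-' | sed 's/^-//; s/-$//'
}

# Print the package path automation-outputs/video-edits/<slug>-<timestamp>.
# The timestamp format is an assumption.
package_dir() {
  printf 'automation-outputs/video-edits/%s-%s\n' "$(slugify "$1")" "$(date +%Y%m%d-%H%M%S)"
}

# Read an ffmpeg silencedetect log on stdin and print the time (seconds) where
# the opening silence ends. Prints 0.000 when the clip does not open on silence.
first_audible_offset() {
  awk '
    /silence_start:/ && !got_s {
      for (i = 1; i < NF; i++) { if ($i == "silence_start:") { s = $(i + 1) } }
      got_s = 1
    }
    /silence_end:/ && got_s && !got_e {
      for (i = 1; i < NF; i++) { if ($i == "silence_end:") { e = $(i + 1) } }
      got_e = 1
    }
    END {
      if (got_s && got_e && s + 0 <= 0.05) { printf "%.3f\n", e }
      else { print "0.000" }
    }
  '
}

# Example usage (raw.mp4 is a placeholder path):
#   dir=$(package_dir "AI agents hype") && mkdir -p "$dir"
#   ffmpeg -hide_banner -nostats -i raw.mp4 \
#     -af silencedetect=noise=-35dB:d=0.3 -f null - 2>&1 | first_audible_offset
```

The printed offset is only a candidate trim point. Breaths or room noise can register as sound, so check it against the transcript before cutting.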

## AI Business Boomer edit rules

- Style: clean documentary/business explainer.
- Avoid cheesy robot effects, spinning transitions, random stock footage.
- Use images as proof/mechanism/action visuals, not decoration.
- Talk to the image: if a chart is on screen, Sam's line should explain the chart.
- Cut back to Sam frequently; if an image stays up too long, the video feels like a slide deck.
- Use bottom-right face PiP over a chart when both Sam and the chart need to be visible.
- Default cut order:
  1. Start on Sam or chart+Sam PiP at the first audible word
  2. Short image/chart proof hit
  3. Sam explanation / belief correction
  4. Short proof/projection image
  5. Sam business implication
  6. Short practical examples image
  7. Sam CTA
- Default export: MP4, H.264, 1080x1920, 30fps, under 60 seconds.
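
The default export spec can be checked mechanically before reporting a result. The sketch below is one possible way to do that with `ffprobe`; the function names are hypothetical, and it checks codec, resolution, and duration only (frame rate is left out for brevity).

```shell
# Hypothetical helpers for illustration; not part of the bundled scripts.

# Print one ffprobe field for the first video stream or the container.
# $1 = file, $2 = entry spec such as stream=width or format=duration
probe_field() {
  ffprobe -v error -select_streams v:0 -show_entries "$2" \
    -of default=noprint_wrappers=1:nokey=1 "$1"
}

# Compare probed values with the default export spec.
# $1 codec name, $2 width, $3 height, $4 duration in seconds
check_export_spec() {
  local status=0
  [ "$1" = "h264" ] || { echo "codec is $1, expected h264"; status=1; }
  [ "$2" = "1080" ] && [ "$3" = "1920" ] || {
    echo "resolution is ${2}x${3}, expected 1080x1920"; status=1
  }
  awk -v d="$4" 'BEGIN { exit !(d + 0 > 0 && d + 0 < 60) }' || {
    echo "duration ${4}s is not under 60s"; status=1
  }
  return "$status"
}

# Probe a rendered MP4 and run the checks against it.
verify_mp4() {
  [ -f "$1" ] || { echo "missing file: $1"; return 1; }
  check_export_spec \
    "$(probe_field "$1" stream=codec_name)" \
    "$(probe_field "$1" stream=width)" \
    "$(probe_field "$1" stream=height)" \
    "$(probe_field "$1" format=duration)"
}
```

`verify_mp4 final.mp4` (placeholder path) returns non-zero and prints each mismatch, which can be copied into the final report.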

## When making an actual MP4

Use ffmpeg for reliable mechanical operations:

- Trim dead air, especially the opening lead-in and silent tail
- Normalize audio levels, or copy the audio stream unchanged
- Scale/crop to 1080x1920
- Insert still-image cutaways
- Add chart/image + face picture-in-picture overlays
- Add subtle push-ins on Sam/full-face shots and PiP shots
- Create caption files/packages, but do not assume this ffmpeg build can burn captions cleanly

If the edit needs detailed manual judgment, create a DaVinci handoff instead of pretending ffmpeg can do everything.
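
As an illustration of the chart + face picture-in-picture operation, a single segment could be rendered along these lines. This is a sketch under assumptions: the function names, the 360 px PiP width, and the 40 px margin are illustrative choices, not values from the recording docs.

```shell
# Hypothetical helpers for illustration; not part of the bundled scripts.

# Build a filter graph: input 1 (chart image) fills the 1080x1920 frame,
# input 0 (Sam) is scaled down and overlaid bottom-right.
# $1 = PiP width in px (default 360), $2 = margin in px (default 40)
pip_filter() {
  local pip_w="${1:-360}" margin="${2:-40}"
  printf '%s' "[1:v]scale=1080:1920:force_original_aspect_ratio=increase,crop=1080:1920,setsar=1[bg];"
  printf '%s' "[0:v]scale=${pip_w}:-2[pip];"
  printf '%s\n' "[bg][pip]overlay=W-w-${margin}:H-h-${margin}[v]"
}

# Render one chart + PiP segment, keeping Sam's audio.
# $1 raw clip, $2 chart image, $3 start (s), $4 duration (s), $5 output mp4
render_pip_segment() {
  ffmpeg -hide_banner -y \
    -ss "$3" -t "$4" -i "$1" \
    -framerate 30 -loop 1 -t "$4" -i "$2" \
    -filter_complex "$(pip_filter)" \
    -map "[v]" -map 0:a \
    -c:v libx264 -pix_fmt yuv420p -r 30 -c:a aac \
    "$5"
}
```

For example, `render_pip_segment raw.mp4 chart.png 0.84 4 seg01.mp4` (placeholder paths and times) would produce a four-second opening segment starting at the first audible word. Captions and push-ins are deliberately not handled here.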

## Visual generation / sourcing rules

- For Business Boomer, prefer clean business-documentary visuals: real operators, workflow boards, dashboards, charts, business systems.
- Avoid cheesy robots, sci-fi clichés, generic stock footage vibes, and fake readable UI text.
- Generated visuals should usually have no text; if text matters, create it as an overlay or programmatic chart later.
- Create 3–6 options per video when visuals are needed.
- Final visual assets should be 1080x1920 PNG/JPG for vertical editing.
- Save prompts/source notes and a contact sheet when practical.
- If using sourced B-roll, include licensing/source notes; do not use unverified copyrighted clips.
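
Normalizing a visual to 1080x1920 while keeping the original could look roughly like this. The function name, the `originals/` folder, and the `_1080x1920` suffix are assumptions for illustration.

```shell
# Hypothetical helper for illustration; not part of the bundled scripts.

# Back up the source image, then write a 1080x1920 PNG next to the backup folder.
# $1 = source image, $2 = visual pack directory
normalize_visual() {
  local src="$1" pack="$2" name stem out
  name=$(basename "$src")
  stem="${name%.*}"
  out="$pack/${stem}_1080x1920.png"
  mkdir -p "$pack/originals" || return 1
  cp "$src" "$pack/originals/$name" || return 1
  # Scale up to cover the frame, then center-crop the overflow.
  ffmpeg -hide_banner -loglevel error -y -i "$src" \
    -vf "scale=1080:1920:force_original_aspect_ratio=increase,crop=1080:1920" \
    -frames:v 1 "$out" || return 1
  printf '%s\n' "$out"
}
```

Scale-then-crop fills the frame but trims the edges, so it suits photos and generated art. For charts with content near the borders, padding instead of cropping may be the safer choice.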

## Caption rules

- Always create a captions package for each vertical short.
- Required package files: `captions.srt`, `captions_clean.txt`, `caption_style_notes.md`, `README_DAVINCI_HANDOFF.md`.
- If the main visual already contains text/chart content, omit captions there.
- When Sam's face is full-screen, captions should use his exact spoken words.
- Keep caption lines narrow for vertical video, ideally under ~20 characters per line.
- Use 1–2 lines max per block.
- If the SRT is intended to start at the first full-face section, say so clearly; do not tell Sam to place it at 0:00 unless the timestamps are timeline-absolute.
- Keep full transcript captions as backup when creating selective/face-only captions.
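
The line-width and block-size rules above can be applied to a transcript mechanically. The sketch below is one way to do it, with hypothetical helper names. It only wraps text and formats timestamps; assigning start and end times to each block still has to come from the Whisper output or the cut list.

```shell
# Hypothetical helpers for illustration; not part of the bundled scripts.

# Read transcript text on stdin and print caption blocks separated by blank
# lines: at most two lines per block, each line at most $1 characters
# (default 20). Words are never changed; a single word longer than the
# limit is kept whole on its own line.
chunk_captions() {
  local max="${1:-20}"
  tr -s '[:space:]' ' ' | awk -v max="$max" '
    function emit(l) {
      if (pending == "") { pending = l }
      else { print pending "\n" l "\n"; pending = "" }
    }
    {
      for (i = 1; i <= NF; i++) {
        if (line == "") { line = $i }
        else if (length(line) + 1 + length($i) <= max) { line = line " " $i }
        else { emit(line); line = $i }
      }
    }
    END {
      if (line != "") { emit(line) }
      if (pending != "") { print pending "\n" }
    }
  '
}

# Format seconds (may be fractional) as an SRT timestamp HH:MM:SS,mmm.
srt_time() {
  awk -v t="$1" 'BEGIN {
    ms = int(t * 1000 + 0.5)
    h = int(ms / 3600000); ms -= h * 3600000
    m = int(ms / 60000);   ms -= m * 60000
    s = int(ms / 1000);    ms -= s * 1000
    printf "%02d:%02d:%02d,%03d\n", h, m, s, ms
  }'
}
```

For timeline-absolute captions, add the segment's timeline start to each time before calling `srt_time`; for a face-only SRT, keep times relative and say so in the handoff notes.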

## Useful bundled resources

- `scripts/prepare_edit_package.sh <raw-video> <output-dir> [image1 image2 image3...]` creates media metadata, extracted audio, thumbnails, and a starter checklist.
- `scripts/make_vertical_short.sh <input-video> <output-mp4> [image1 image2 image3]` creates a rough 1080x1920 vertical short.
- `scripts/create_captions_package.sh <final-vertical-mp4> <output-dir> [transcript.txt]` creates a standard captions handoff package using Whisper when needed.

## Final response format

Report:

- What was created: MP4 / edit package / DaVinci plan
- Output path or Google Doc link
- Verification: duration, resolution, file exists, images/captions checked
- Any blockers: missing images, transcription failure, unusable audio, unclear target doc