fal-whisper
Fast, asynchronous Whisper transcription for audio and video files with SRT subtitle export.
skill install https://www.promptspace.in/skills/fal-whisper
High-Speed Whisper Transcription
Integrate professional-grade audio transcription into your agent's workflow using the fal.ai Fast Whisper model. This skill provides a robust Python-based interface to transcribe audio or video files into accurate, searchable text with optional word-level timestamps.
What it does
- Async Processing: Handles transcription through an asynchronous queue (submit, poll, retrieve) designed for stability with large files.
- Local & Remote Support: Transcribe files directly from your local machine using base64 encoding or provide a public URL for cloud-hosted files.
- SRT Generation: Automatically generates industry-standard SubRip (.srt) subtitle files with precision timestamps.
- Broad Format Support: Works with MP3, MP4, M4A, WAV, FLAC, and more.
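The submit, poll, retrieve flow and the base64 handling of local files can be sketched roughly as follows. This is a minimal illustration, not the skill's actual code: `to_audio_input` and `run_queued_job` are hypothetical helper names, the three callables stand in for whatever client calls the skill really makes, and the status strings `COMPLETED` and `FAILED` are assumptions.

```python
import base64
import mimetypes
import time
from pathlib import Path
from typing import Callable


def to_audio_input(source: str) -> str:
    """Return a public URL unchanged, or encode a local file as a base64 data URI."""
    if source.startswith(("http://", "https://")):
        return source
    path = Path(source)
    # Fall back to a generic type when the extension is unknown.
    mime = mimetypes.guess_type(path.name)[0] or "application/octet-stream"
    encoded = base64.b64encode(path.read_bytes()).decode("ascii")
    return f"data:{mime};base64,{encoded}"


def run_queued_job(
    submit: Callable[[], str],
    get_status: Callable[[str], str],
    get_result: Callable[[str], dict],
    poll_interval: float = 2.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Submit a job, poll its status until it finishes, then fetch the result.

    The callables are placeholders (assumptions) for the real queue client.
    """
    request_id = submit()
    deadline = time.monotonic() + timeout
    while True:
        status = get_status(request_id)
        if status == "COMPLETED":  # assumed status name
            return get_result(request_id)
        if status == "FAILED":  # assumed status name
            raise RuntimeError(f"transcription job {request_id} failed")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {request_id} did not finish in {timeout} s")
        sleep(poll_interval)
```

Passing the queue operations in as callables keeps the loop independent of any particular client library, and injecting `sleep` makes the polling logic easy to test without real delays.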
Why use this skill?
Transcription is computationally expensive and is not something an agent can do reliably through prompting alone. This skill offloads the work to fal.ai's optimized hardware and returns structured data, including text chunks and segment timestamps. Its asynchronous queue is designed to cope with large files, and transcripts are stored persistently on your machine in a dedicated directory (~/.fal-whisper/).
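As a rough illustration of the local storage step, the sketch below writes the transcript text from a result into ~/.fal-whisper/. The helper name `save_transcript`, the `name` argument, the `.txt` file naming, and the assumption that the result is a dict with a `text` key are all illustrative and not taken from the skill itself.

```python
from pathlib import Path
from typing import Optional


def save_transcript(result: dict, name: str, base_dir: Optional[Path] = None) -> Path:
    """Write the transcript text to <base_dir>/<name>.txt and return the path.

    Assumes the result carries the full transcript under a "text" key.
    """
    out_dir = base_dir if base_dir is not None else Path.home() / ".fal-whisper"
    out_dir.mkdir(parents=True, exist_ok=True)  # create the directory on first use
    out_path = out_dir / f"{name}.txt"
    out_path.write_text(result.get("text", "").strip() + "\n", encoding="utf-8")
    return out_path
```

The `base_dir` parameter exists only so the function can be pointed at a scratch directory when experimenting.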
Output Format
The skill produces two primary outputs: a clean .txt file containing the full transcript and an optional .srt file ready for use in video editors like Premiere Pro or DaVinci Resolve.
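To show what the .srt output involves, here is a minimal sketch that turns timestamped chunks into SubRip text. The chunk shape (a dict with a `timestamp` pair in seconds and a `text` string) is an assumption about the returned data, and `format_srt_time` and `chunks_to_srt` are hypothetical names. The cue layout itself (sequence number, `HH:MM:SS,mmm --> HH:MM:SS,mmm`, text, blank line) is the standard SubRip format.

```python
from typing import Iterable


def format_srt_time(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def chunks_to_srt(chunks: Iterable[dict]) -> str:
    """Build SRT text from chunks shaped like {"timestamp": [start, end], "text": str}."""
    blocks = []
    index = 0
    for chunk in chunks:
        start, end = chunk["timestamp"]
        text = chunk["text"].strip()
        if not text or start is None:
            continue  # skip empty or untimed chunks
        if end is None:
            end = start  # tolerate a missing end time
        index += 1
        blocks.append(
            f"{index}\n{format_srt_time(start)} --> {format_srt_time(end)}\n{text}\n"
        )
    # Cues are separated by a blank line.
    return "\n".join(blocks)
```

For example, `format_srt_time(3661.5)` returns `01:01:01,500`.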
Use cases
- Convert podcast recordings or interviews into searchable text documents.
- Generate professional SRT captions for YouTube or social media videos.
- Extract minutes and actionable notes from recorded team meetings.
- Create accessibility-compliant transcripts for educational video content.