This Python script automates speech-to-text transcription for multiple audio files using OpenAI Whisper.
It scans an `audio/` folder for supported files, transcribes them, and saves all results into a single CSV file.
- Automatically detects all audio files in the `audio/` folder
- Uses OpenAI's Whisper model for accurate transcription
- Supports multiple audio formats (`.mp3`, `.wav`, `.m4a`, `.flac`, `.ogg`, `.webm`)
- Saves transcripts neatly in a `transcripts.csv` file
- Flushes results to disk after every file to prevent data loss (see the sketch below)
- Displays progress and elapsed time
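To make the feature list concrete, here is a minimal sketch of such a batch loop with per-file flushing. The structure and names (`AUDIO_DIR`, the CSV columns) are illustrative assumptions, not the actual contents of `transcribe.py`:

```python
import csv
import time
import whisper
from pathlib import Path

AUDIO_DIR = Path("audio")
SUPPORTED = {".mp3", ".wav", ".m4a", ".flac", ".ogg", ".webm"}

model = whisper.load_model("medium")  # the script's default model

with open("transcripts.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["file", "transcript"])
    files = sorted(p for p in AUDIO_DIR.iterdir() if p.suffix.lower() in SUPPORTED)
    start = time.time()
    for i, path in enumerate(files, 1):
        result = model.transcribe(str(path), language="en")
        writer.writerow([path.name, result["text"].strip()])
        f.flush()  # flush after every file so a crash loses at most one transcript
        print(f"[{i}/{len(files)}] {path.name} done ({time.time() - start:.0f}s elapsed)")
```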
- Python 3.8+
- Dependencies:

```bash
pip install openai-whisper
pip install torch  # required backend for Whisper
```

(If using a GPU, install the CUDA-enabled build of PyTorch.)
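To confirm that the CUDA build is active, you can check from Python:

```python
import torch

# True means Whisper can run on the GPU; otherwise it falls back to CPU
print(torch.cuda.is_available())
```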
If you want to install Whisper from source manually, follow these steps:

1. Download Whisper from GitHub: 🔗 https://github.com/openai/whisper
2. Extract the downloaded ZIP and navigate to the folder `whisper-main`.
3. Install locally using PowerShell or the VS Code terminal:

```powershell
cd C:\Users\User\Downloads\whisper-main
pip install -e .
```

Alternatively, install dependencies manually:

```bash
pip install -r requirements.txt
```

OR

```bash
pip install openai-whisper
```

✅ This automatically installs all required dependencies, including:
- `numba`
- `numpy`
- `torch`
- `tqdm`
- `more-itertools`
- `tiktoken`
You do not need to install these manually unless building from source.
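Once installed, you can sanity-check the package from Python by listing the bundled model names with `whisper.available_models()`:

```python
import whisper

# Prints names such as "tiny", "base", "small", "medium", "large"
print(whisper.available_models())
```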
Download FFmpeg from: 🔗 https://ffmpeg.org/download.html#build-windows

1. Extract the files and move the folder to `C:\ffmpeg`.
2. Add FFmpeg to your system environment variables:
   - Go to: Control Panel → System → Advanced System Settings → Environment Variables
   - Under User variables, find `Path`
   - Add: `C:\ffmpeg\bin`
3. Verify the installation:

```bash
ffmpeg -version
```
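You can also confirm from Python that FFmpeg is discoverable on `PATH` (Whisper shells out to it when loading audio):

```python
import shutil

# Prints the resolved ffmpeg path, or a warning if it is not on PATH
print(shutil.which("ffmpeg") or "ffmpeg not found on PATH")
```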
The project layout looks like this:

```
project/
│
├── audio/              # Folder containing your audio files
│   ├── example1.mp3
│   ├── example2.wav
│
├── transcripts.csv     # Generated output (after running the script)
│
└── transcribe.py       # The main script
```
1. Clone the repository:

```bash
git clone https://github.com/monojitbgit/whisper-batch-transcriber.git
cd whisper-batch-transcriber
```

2. Place your audio files inside the `audio/` folder.
3. Run the script:

```bash
python transcribe.py
```

By default, the script loads the medium Whisper model:
```python
model = whisper.load_model("medium")
```

To use a different model:

```python
model = whisper.load_model("small")
```

Change the transcription language by adjusting:

```python
transcribe_file(file_path, language="en")
```

(Replace `"en"` with your desired ISO language code.)
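For context, `transcribe_file` is the script's own helper; assuming it simply forwards to Whisper's `model.transcribe` using a module-level `model`, a minimal equivalent would be:

```python
def transcribe_file(file_path, language="en"):
    """Transcribe one audio file and return its plain-text transcript."""
    result = model.transcribe(file_path, language=language)  # language=None auto-detects
    return result["text"].strip()
```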
- Whisper runs locally; no API key required
- For faster performance, use a CUDA-compatible GPU
- Larger models need more memory; if you hit a CUDA out-of-memory error, switch to `small` or `base` (see the sketch below)
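As referenced above, a minimal device-aware load looks like this (`device` is a real parameter of `whisper.load_model`; the model choice is illustrative):

```python
import torch
import whisper

# Use the GPU when one is available, otherwise fall back to CPU
device = "cuda" if torch.cuda.is_available() else "cpu"

# Smaller checkpoints ("small", "base") fit comfortably on low-VRAM cards
model = whisper.load_model("small", device=device)
```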
| Option | Speed | Accuracy | Model | ⏱️ CPU (Intel i7 / Ryzen 7) | ⚙️ GPU (RTX 3060/3090/4090) | Size | Speed (Relative) | Accuracy (Relative) | 🌐 Multilingual Support |
|---|---|---|---|---|---|---|---|---|---|
| Use `"tiny"` | Fast | Low accuracy | tiny | ~30 sec | ~5–10 sec | 75 MB | ⭐⭐⭐⭐ (Fastest) | ⭐ (Basic) | ✅ Yes |
| Use `"base"` | Fast | Low accuracy | base | ~1 min | ~10–20 sec | 142 MB | ⭐⭐⭐ (Fast) | ⭐⭐ (Good) | ✅ Yes |
| Use `"small"` with CPU optimizations | Faster | Good | small | ~2–3 min | ~20–30 sec | 466 MB | ⭐⭐ (Medium) | ⭐⭐⭐ (Better) | ✅ Yes |
| Use `"medium"` after reducing file size | Slow | Better | medium | ~5–6 min | ~40–60 sec | 1.5 GB | ⭐ (Slower) | ⭐⭐⭐⭐ (Very Good) | ✅ Yes |
| Use Google Colab GPU (medium / large) | Super Fast | Best | large | ~10–15 min | ~1.5–2 min | 3.0 GB | ⭐ (Slowest) | ⭐⭐⭐⭐⭐ (Best) | ✅ Yes |
- 🧠 For quick tests: use `tiny` or `base`
- ⚖️ For a balance of speed and accuracy: use `small`
- 🗣️ For high-quality transcription: use `medium`
- 🚀 For best performance: use `large` on a GPU or Google Colab
This project is licensed under the MIT License.
You’re free to use, modify, and distribute it with attribution.