Whisper is an automatic speech recognition (ASR) system developed by OpenAI, designed to transcribe and translate audio efficiently and accurately. It supports multiple languages and use cases, ranging from real-time transcription to audio-based translation.
The Soenneker.Libraries.Whisper.CTranslate package provides a Windows executable version of Whisper using the Faster-Whisper implementation with CTranslate2. This implementation is up to 4 times faster than OpenAI’s original Whisper system, maintaining the same accuracy while requiring less memory.
- No Python installation is needed: Simple to deploy and use for Windows users.
- Daily Updates: This package is updated daily, including updates from its entire dependency chain and the source repository.
- CUDA Support: GPU acceleration via CUDA is fully supported for faster processing on compatible systems.
You can obtain the executable in the following ways:
Download the latest release directly from the Releases section on GitHub.
Alternatively, install the package from NuGet:
dotnet add package Soenneker.Libraries.Whisper.CTranslate
After installing via NuGet, the executable will be located at /Resources/whisper_ctranslate2.exe within your project.
The following examples demonstrate how to use the executable:
Use the following command in PowerShell to transcribe an input audio file:
./whisper_ctranslate2.exe input.mp3 --model medium
This command automatically downloads the required model from Hugging Face if it is not already available locally.
Generate subtitles (in SRT format) from an input audio file:
./whisper_ctranslate2.exe input.mp3 --model medium --output_format srt
If you know the language of the input audio, specify it to improve accuracy and speed:
./whisper_ctranslate2.exe input.mp3 --model medium --language en
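To apply the same model and language options to several files, a small wrapper script can loop over a folder. The following is a minimal sketch, assuming a POSIX-style shell such as Git Bash or WSL rather than PowerShell. WHISPER_EXE, AUDIO_DIR, and DRY_RUN are hypothetical names introduced for this example and are not part of the package. By default it performs a dry run that only prints the commands it would execute.

```shell
#!/bin/sh
# Sketch: transcribe every .mp3 in a folder with the same model and language.
# WHISPER_EXE, AUDIO_DIR and DRY_RUN are hypothetical names used only here.

WHISPER_EXE="${WHISPER_EXE:-./whisper_ctranslate2.exe}"  # path to the executable
AUDIO_DIR="${AUDIO_DIR:-.}"                              # folder holding the audio files
DRY_RUN="${DRY_RUN:-1}"                                  # 1 = print commands, 0 = run them

transcribe_all() {
    for file in "$AUDIO_DIR"/*.mp3; do
        # Skip the unexpanded pattern when the folder holds no .mp3 files.
        [ -e "$file" ] || continue
        if [ "$DRY_RUN" = "1" ]; then
            echo "$WHISPER_EXE $file --model medium --language en"
        else
            "$WHISPER_EXE" "$file" --model medium --language en
        fi
    done
}

transcribe_all
```

Setting DRY_RUN=0 runs the transcriptions instead of printing them. Adjust the model and language to match your audio.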
To see all available options and arguments:
./whisper_ctranslate2.exe --help
The source code and instructions for building the executable yourself can be found in the Soenneker.Runners.Whisper.CTranslate repository.