CONTRIBUTION: A bash shell script to run on a given input media file to generate subtitles files with the same base name #2703

Open
vijinho opened this issue Jan 4, 2025 · 0 comments

whisper-wrapper.sh

I am contributing a script I wrote that makes whisper-cli easy to use once the source has been built. It is provided as-is and is free to use.

TL;DR: Generates subtitle/text output files (.txt, .vtt, .srt) in the same folder as the given input media file, using the same base name.

Examples

Example 1: Basic Usage with Default Settings

./whisper-wrapper.sh -i /path/to/input_video.mp4

Explanation:

  • This command specifies an input file (input_video.mp4) and uses the default temporary WAV path (a timestamped whisper-YYYYmmddHHMMSS.wav file in the temporary directory).
  • The script will convert input_video.mp4 to WAV format and then run whisper-cli with the default arguments (-m <model> -l en -t 12 -pp -pc -otxt -ovtt -osrt) to transcribe it.
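
To make the "same base name" behaviour concrete, here is a minimal sketch of how the output names are derived. It mirrors the `${INPUT_FILE%.*}` expansion that the script passes to whisper-cli's -of option. The helper name output_paths is hypothetical and exists only for this illustration, and it assumes the -otxt -ovtt -osrt output flags are in effect.

```bash
#!/bin/bash
# Hypothetical helper (not part of the script): list the files whisper-cli
# is expected to write for a given input, assuming -otxt -ovtt -osrt are set.
output_paths() {
    base="${1%.*}"    # strip the last extension, keep the directory part
    for ext in txt vtt srt; do
        echo "$base.$ext"
    done
}

output_paths "/path/to/input_video.mp4"
```

For the input above this prints /path/to/input_video.txt, /path/to/input_video.vtt and /path/to/input_video.srt, one per line.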

Example 2: Custom Whisper-CLI Arguments

./whisper-wrapper.sh -i /path/to/input_video.mp4 -w "-t 16 -pp -pc -l fr"

Explanation:

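  • This command specifies an input file (input_video.mp4) and custom arguments for whisper-cli (-t 16 -pp -pc -l fr).
  • The script will convert input_video.mp4 to WAV format and then use the specified whisper-cli arguments to transcribe it, with the language set to French (-l fr). Note that -w replaces the default arguments entirely rather than adding to them, so the default model (-m) and the output-format flags (-otxt -ovtt -osrt) are not passed unless you include them in the -w string.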

bash shell-script: whisper-wrapper.sh

#!/bin/bash

# Default values
MODEL="$HOME/src/whisper.cpp/models/ggml-large-v3-turbo-q5_0.bin"
INPUT_FILE=""
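# Use TMPDIR or TEMP if set, otherwise fall back to /tmp (TEMP is usually unset on Linux/macOS)
OUTPUT_FILE="${TMPDIR:-${TEMP:-/tmp}}/whisper-$(date "+%Y%m%d%H%M%S").wav"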
WHISPER_ARGS="-m $MODEL -l en -t 12 -pp -pc -otxt -ovtt -osrt"

# Function to display usage information
usage() {
    echo "Usage: $0 -i <input_file> [-o <output_file>] [-w <whisper-args>]"
    echo "  -i, --input-file     The path to the input media file"
    echo "  -o, --output-file    Optional path to the temporary output audio file (should have .wav extension, defaults to a timestamped whisper-*.wav in the temp directory)"
    echo "  -w, --whisper-args   Arguments for whisper-cli (defaults to '$WHISPER_ARGS')"
}


# Parse named arguments
while [[ "$#" -gt 0 ]]; do
    case $1 in
        -i|--input-file) INPUT_FILE="$2"; shift ;;
        -o|--output-file) OUTPUT_FILE="$2"; shift ;;
        -w|--whisper-args) WHISPER_ARGS="$2"; shift ;;
        *) echo "Unknown parameter passed: $1"; usage; exit 1 ;;
    esac
    shift
done

# Check if the input file is provided
if [ -z "$INPUT_FILE" ]; then
    echo "Error: Input file is required."
    usage
    exit 1
fi

# Check if FFmpeg is installed
if ! command -v ffmpeg &> /dev/null; then
    echo "Error: FFmpeg is not installed. Please install it first."
    exit 1
fi

# Check if the input file exists
if [ ! -f "$INPUT_FILE" ]; then
    echo "Error: Input file '$INPUT_FILE' does not exist."
    exit 1
fi

# Ensure the output file has a .wav extension
if [[ "$OUTPUT_FILE" != *.wav ]]; then
    OUTPUT_FILE="${OUTPUT_FILE}.wav"
    echo "Note: Output file extension changed to .wav ($OUTPUT_FILE)"
fi

# Check if the output file already exists and delete it if so
if [ -f "$OUTPUT_FILE" ]; then
    echo "Warning: Output file '$OUTPUT_FILE' already exists. Deleting the existing file..."
    rm -f "$OUTPUT_FILE"
fi

# Convert the audio file using FFmpeg
echo "Converting '$INPUT_FILE' to '$OUTPUT_FILE'..."

ffmpeg -i "$INPUT_FILE" -ar 16000 -ac 1 -c:a pcm_s16le "$OUTPUT_FILE"

# Check if the conversion was successful
if [ $? -eq 0 ]; then
    echo "Conversion successful."
else
    echo "Error: FFmpeg encountered an issue during conversion."
    exit 1
fi

# Execute whisper-cli on the output WAV file
echo "Executing whisper-cli with arguments '$WHISPER_ARGS' on '$OUTPUT_FILE'..."

# $WHISPER_ARGS is intentionally unquoted so that it splits into separate arguments
"$HOME/src/whisper.cpp/build/bin/whisper-cli" $WHISPER_ARGS -of "${INPUT_FILE%.*}" "$OUTPUT_FILE"

# Check if whisper-cli execution was successful
if [ $? -eq 0 ]; then
    echo "whisper-cli execution successful."
    rm "$OUTPUT_FILE"
else
    echo "Error: whisper-cli encountered an issue during execution."
    exit 1
fi
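
One limitation worth noting: because the whisper-cli arguments are kept in a single string and split on whitespace, a model path containing spaces would be split into several arguments. Below is a minimal sketch of the difference, using a bash array as one possible alternative. The count_args helper and the model path are hypothetical and exist only for this illustration; the script itself does not use an array.

```bash
#!/bin/bash
# Hypothetical helper: report how many arguments a command would receive.
count_args() { echo "$#"; }

MODEL="/tmp/my models/ggml.bin"    # made-up path containing a space

ARGS_STRING="-m $MODEL -l en"      # single string, split on whitespace when unquoted
ARGS_ARRAY=(-m "$MODEL" -l en)     # array, one element per argument

count_args $ARGS_STRING            # prints 5: the model path is split in two
count_args "${ARGS_ARRAY[@]}"      # prints 4: the model path stays intact
```

With the string form, keeping the model in a path without spaces (as the default does) avoids the problem.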

Script Explanation

This script is designed to automate the process of converting a media file into an audio WAV format and then transcribing that audio using the whisper-cli tool. Here's a step-by-step summary:

  1. Usage Information: The script starts by defining a usage function that explains how to use the script, including which arguments are required and optional.

  2. Argument Parsing: It uses a while loop with a case statement (not getopts) to parse the named arguments -i, -o, and -w (or their long forms) for the input file, output file, and whisper-cli arguments respectively. If -o or -w is not provided, a default is used (e.g., the default output file is a timestamped whisper-*.wav in the temporary directory).

  3. Input Validation: The script checks that an input file was given, that FFmpeg is installed, and that the input file exists. It then ensures the output file has a .wav extension and removes any existing file with the same name.

  4. Audio Conversion: Using FFmpeg, the script converts the input media file into a 16 kHz, mono, 16-bit PCM WAV file (-ar 16000 -ac 1 -c:a pcm_s16le), which is suitable for transcription by whisper-cli.

  5. Transcription: The script then runs whisper-cli with the specified arguments to transcribe the audio file. The output format and other settings can be customized through the -w argument.

  6. Final Steps: After transcription, the script checks if the whisper-cli execution was successful and removes the temporary WAV file used for transcription if everything goes well.
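
The argument-parsing step can be sketched in isolation as follows. This is a minimal illustration of the same while/case pattern the script uses, reduced to -i and -w, with one addition that the script does not have: a guard that reports an option given without a value. The function name parse_args is hypothetical.

```bash
#!/bin/bash
# Hypothetical function (not part of the script): same while/case pattern,
# plus a guard so that an option given without a value is reported.
parse_args() {
    INPUT_FILE=""
    WHISPER_ARGS=""
    while [ "$#" -gt 0 ]; do
        # Both known options take a value; fail early if it is missing
        case $1 in
            -i|--input-file|-w|--whisper-args)
                if [ "$#" -lt 2 ]; then
                    echo "Error: $1 requires a value." >&2
                    return 1
                fi
                ;;
        esac
        case $1 in
            -i|--input-file) INPUT_FILE="$2"; shift ;;
            -w|--whisper-args) WHISPER_ARGS="$2"; shift ;;
            *) echo "Unknown parameter passed: $1" >&2; return 1 ;;
        esac
        shift
    done
}

parse_args -i movie.mp4 -w "-l fr" && echo "input=$INPUT_FILE args=$WHISPER_ARGS"
```

Calling parse_args with a trailing -i and no value returns a non-zero status instead of silently leaving the input file empty.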

In essence, this script streamlines the process of converting any media file into a format suitable for transcription and then transcribing it using a specified tool, with user options to customize various aspects of both steps.
