changelog.txt

Log for Standalone Faster-Whisper & Faster-Whisper-XXL

### Note: It's not a comprehensive log.

## r239.1 [XXL]:
New features/args:
`--batched`
`--unmerged`
`--batch_size`
`--multilingual`
`--hotwords`
`--rehot`
`--ignore_dupe_prompt`
New `vad_method`'s:
`silero_v5`
`silero_v4_fw`
`silero_v5_fw`
Removed not useful experimental options:
no_speech_strict_lvl
nullify_non_speech
prompt_max
reprompt's first option

## r194.5 [XXL]:
Change: Enable auto/default prompt presets for translation.

## r194.4 [XXL]:
Fix: --sentence was splitting digits containing dot.
Fix: auditok
Fix: Error with empty audio and pyannote_v3 vad
Fix: Disable prompt_reset_on_no_end when HST is triggered
Removed experimental condition in HST algo

## r194.2 [XXL]:
Bugfix: Broken outputs when diarize/sentence and batch input [194.1].
Fix: --sentence was splitting words on hyphens.
Fix: --sentence was splitting digits containing comma.
Change: --sentence affects all output formats except json [previously only srt & vtt].
Improvements and bugfixes for json input.
Added "One Click Transcribe" tool: https://github.com/Purfview/whisper-standalone-win/discussions/337

## r194.1 [XXL]:
New feature: CUDA support for `pyannote_v3` and `pyannote_onnx_v3` VADs
New feature: Selective output formats, for example: `--output_format srt json`
New feature: `--diarize` will auto-activate `--sentence` and all output formats will be affected except json
New `--diarize` options: `pyannote_v3.0`, `pyannote_v3.1`, `reverb_v1`, `reverb_v2`
New diarization args: `--num_speakers`, `--min_speakers`, `--max_speakers` `--diarize_dump`
Bugfix: Bug in alternative subtitle writer with max_line_width [r189.1].
Bugfix: Bug in --sentence subtitle writer if "word" is space.
Change: .cache dir is bound to exe dir instead of cwd.
pyinstaller updated to 6.11.1
CTranslate2 updated to 4.4.0 [+ Compute Capability 5.0 support]
onnxruntime_gpu updated to 1.18.0 [CUDA 12.x cuDNN 8.x]
Python updated to 3.10.11

## r193.1 [XXL]:
New feature: Speaker Diarization with `--diarize`. 
New feature: `large-v3-turbo` model autodownload.
New feature: JSON output has additional key: 'language'.
New feature: `--postfix` works without --language.
Bigfix: Rare bug in alternative subtitle writer if "words" is empty [r189.1].
Change: `--vad_alt_method` shorted to `--vad_method`.
Change: Windows 7 non supported anymore.

## r192.3.4 [XXL]:
New feature: Write intermediate files to temp folder. [ignored on dump]
New feature: Can take JSON files as input to generate subtitles from it according to the settings.
New alternative VAD method : `pyannote_v3` [previous same named option renamed to "pyannote_onnx_v3"]
New arg: `--nullify_non_speech` [for WIP experiments]

## r192.3.3 [XXL]:
CTranslate2 updated to v4.2.0
Transcription on CPU is up to ~26% faster.

## r192.3.2 [XXL]:
Various improvements and bugfixes to `--vad_alt_method`

## r192.3:
Bugfix: 'one_word' was broken in r192.2.

## r192.2:
Allowed to use 'max_line_width' and 'max_line_count' without '--sentence'.

## r192.1.1 [XXL]:
`--ff_mdx_kim2`: Preprocess audio with MDX23 Kim vocal v2 model.
`--vad_alt_method`: Alternative VAD methods - "silero_v3", "silero_v4", "pyannote_v3", "auditok", "webrtc".

## r192.1:
Bugfix: It was unnecessarily going through ffmpeg routines. [introduced in r189.1]

## r189.1:
Supports `distil-large-v3` model.
Switched auto prompt for Serbian to the latin alphabet.
Auto-disable VAD when --clip_timestamps is in use.
JSON output has additional 'text' key with all text aggregated.
Fixed --task=translate and --postfix bug.
--highlight_words now supports --max_line_width and --max_line_count
Changed patience default.
Adjusted "--hallucination_silence_threshold"s score algo, and new `--hallucination_silence_th_temp`
Auto pseudo-vad th offsets for "large-v3", can be disabled with --v3_offsets_off.
`--language_detection_threshold`
`--language_detection_segments`
`--ff_track` - Audio track selection. 
`--ff_fc` - Front-Center channel selection.
`--ff_dump` - Additionally prevents deletion of some intermediate audio files.
`--vad_dump` - Writes VAD timings to srt file.
Updated PyInstaller to v6.5.0
Fix Linux exec issue with forward compatibility. v2

## r186.1:
Fix quirks with "The last progress update is passed to the title bar." [r160.11]
Fix Linux exec issue with forward compatibility.

## r182.2:
Bugfix: PR705 was incorporated a bit incorrectly.
Bugfix: Bug with the custom prompt presets and task "translate".

## r182.1:
Includes PR705 & PR706
Excludes PR694 [aka CUDA12]
Skip micro chunks.
New args:
`--hallucination_silence_threshold`
`--clip_timestamps`
Updated PyAV to v11.0.0 [ffmpeg 6.0]
Updated PyInstaller to v6.4.0
Updated CTranslate2 to v3.24.0

## r172.4:
Bugfix: With some CPUs rnndn filters were not working.
Improved: `--sentence` doesn't treat ellipses (...) as the end of sentence, unless ellipses are at the end of a segment.
Changed default of `--initial_prompt` to `auto`.

## r172.3:
Better error handling for the audio filters.

## r172.2:
Increased "recursion" limit from 1000 to 10000.

## r172.1:
Supports "Distil-Whisper" models:
`distil-large-v2`
`distil-medium.en`
`distil-small.en`
New args:
`--max_new_tokens`
`--chunk_length`

## r167.5:
Additional option for `--prompt_reset_on_no_end`
Various audio filters:
`--ff_dump`     
`--ff_mp3`     
`--ff_sync`      
`--ff_rnndn_sh` 
`--ff_rnndn_xiph`
`--ff_fftdn`    
`--ff_tempo`     
`--ff_gate`     
`--ff_speechnorm`
`--ff_loudnorm` 
`--ff_silence_suppress` 
`--ff_lowhighpass`

## r167.4:
Bugfix: Bug if couldn't detect the number of CPU's cores.
Updated psutil to v5.9.7

## r167.3:
Bugfix: Illogical "Avoid computing higher temperatures on no_speech". [It could cause bad hallucination loops]
https://github.com/openai/whisper/pull/1903
https://github.com/SYSTRAN/faster-whisper/pull/625

## r167.2:
Fix: Commit #165 was missing in `r167.1` release.

## r167.1:
`large-v3` now is using the official model.
Updated PyInstaller to v6.3.0
Updated CTranslate2 to v3.23.0

## r160.12:
Doesn't add to prompt segments triggered by the common hallucinations in hallucinations_list.

New parameters :
`--max_comma_cent`
`--min_dist_to_end`

## r160.11:
Bugfix:  `--prompt_reset_on_temperature` was broken. Regression since r139.
Improved `--initial_prompt` default preset, there is alternative experimental `auto` preset.
Improved `--sentence` with the ignore words list, words like `Mr.` and ect..
The last progress update is passed to the title bar.

New parameters :
`--prompt_max`
`--reprompt` [enabled by default]
`--prompt_reset_on_no_end` [enabled by default]

## r160.10:
Bugfix: The final line disappeared if the final word wasn't punctuated when using  `--sentence`.

## r160.9:
Bugfix: The final word could disappear in some combination of the new parameters from r160.8.
New parameter : --standard_asia

## r160.8:
Improved --skip to work with all output folders.

New parameters :
`--sentence`
`--standard`
`--max_comma`
`--max_gap`
`--max_line_width`
`--max_line_count`

## r160.7:
Bugfix: Wildcard input could fail if brackets were in a folder's name.
Improved `--one_word` with the second option.
Improved `--skip` to work with all output formats. If output is "all" - checks only "srt".
New `--check_files` argument.
Added a catch for invalid `audio` input.
Turns off word_timestamps if output format is "text".

Exposed options :
`--prefix`
`--suppress_blank`
`--without_timestamps`
`--max_initial_timestamp`

## r160.6:
Bugfix: Autodownload wasn't working for `large-v3` [ regression in r160.5 ]

## r160.5:
large-v3 use fp16 model by default. [Removed other v3 options]
New switch: --one_word
Updated Pyinstaller to v6.2.0

## r160.4:
Bugfix: -prompt=None wasn't working.
UTF-8 chars are now supported in SE's "console".

## r160.3:
'large-v3' model.
'Cantonese' language.

## r160.2:
Failed release with a code typo.
Was online for ~15 mins...

## r160.1:
Bugfix: Fixed bug on macOS -> https://github.com/Purfview/whisper-standalone-win/issues/86