Windows process crashes when the GPU model is unloaded #71
Comments
The same error is reported in #64 and is related to the cuDNN installation. Can you check that? |
Really? But I have already installed cuDNN according to the NVIDIA documentation, and if I use medium-ct2 instead of large-v2-ct2, switch languages, or process other files, this problem does not occur. |
How much VRAM does your GPU have? |
About 6000 MB in total, with 4000-5000 MB in use when running. |
Possibly you are running out of memory for this specific file. Can you try using |
It still didn't work. I also converted the audio file from AAC to MP3, but that didn't help either. |
Is it possible for you to share this audio file? |
the video file |
I don't reproduce the error on Windows 11 with CUDA 11.8.0 and cuDNN 8.8.1. Are you using the latest version of faster-whisper? |
The first thing I did when this error occurred was to update faster-whisper. |
It seems difficult to reproduce the error, since it only occurs with this particular file under certain conditions (I have processed various other files since then, and they all ran normally). Perhaps this issue can be put on hold? |
Does it also crash when you manually unload the model with |
It does when processing certain files (five out of approximately ninety). If the error occurs for a file, it occurs every time that file is processed. |
I found a way to run the transcription in a separate process so that even though it exits that child process, it doesn't exit your main script. Here is a working example:
|
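The example from the comment above is not preserved in this copy of the thread. As a rough illustration of the same idea, here is a minimal sketch that runs the transcription in a child Python interpreter through `subprocess`. The helper names `run_child` and `transcribe_isolated`, the `CHILD_SCRIPT` constant, and the default model name are assumptions made for this sketch, not the commenter's code. The child prints its result as JSON and flushes it before the model is unloaded, so a crash at exit only costs the child process.

```python
import json
import subprocess
import sys

# Script executed in the child interpreter. faster_whisper is imported only
# there, so a crash while unloading the model kills the child, not the caller.
CHILD_SCRIPT = r"""
import json, sys
from faster_whisper import WhisperModel

audio_path, model_name = sys.argv[1], sys.argv[2]
model = WhisperModel(model_name, device="cuda", compute_type="float16")
segments, _ = model.transcribe(audio_path)
# json.dumps escapes non-ASCII text by default, so stdout stays plain ASCII.
# Flush before interpreter shutdown, which is where the model gets unloaded.
print(json.dumps([[s.start, s.end, s.text] for s in segments]), flush=True)
"""


def run_child(script, *args):
    """Run `script` in a fresh Python process; return (exit code, stdout)."""
    proc = subprocess.run(
        [sys.executable, "-c", script, *args],
        capture_output=True,
        text=True,
    )
    return proc.returncode, proc.stdout


def transcribe_isolated(audio_path, model_name="large-v2"):
    """Transcribe in a child process and return [[start, end, text], ...].

    `model_name` is a placeholder default (assumption); pass your own model
    name or converted model directory.
    """
    code, out = run_child(CHILD_SCRIPT, audio_path, model_name)
    lines = out.strip().splitlines()
    if not lines:
        raise RuntimeError(f"child exited with code {code} and produced no output")
    # The exit code may be non-zero if the child crashed during unload, but
    # the JSON line was already written at that point.
    return json.loads(lines[-1])
```

Calling `transcribe_isolated("audio.wav")` (with a hypothetical file name) from the main script would return the segments as plain lists, and the main script keeps running even if the child exits abnormally. Later comments in this thread report crashes even with one process per task, so this may not help in every setup.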
same issues |
Same issue, and running the transcription in a separate Process works for me. |
@ProjectEGU @yslion @DoodleBears Are you all using the library on Windows? |
I can now reproduce the issue on Windows. It is somehow related to the temperature fallback. Can you try setting |
Glad to know that the cause has been found. With this setting, the program runs normally. |
Yes, I am using the library on Windows 10, I will try |
I have a possible fix for this issue in OpenNMT/CTranslate2#1201, but I can't test on my Windows machine today. Can you help testing?
|
Yes, I will try it now, by the way I try |
I installed the wheel.
|
when using Sorry for not keeping the log. I remember it mentioned:
def transcribe_speeches(self):
    log.init_logging(debug=True)
    # NOTE: read the audio files
    logger.info("Starting speech-to-text")
    whisper = WhisperModel(WHISPER_MODEL, device="cuda", compute_type="float16")
    speeches_num = len(self.speeches)
    for index, speech in enumerate(self.speeches):
        logger.debug(f"Start recognizing {speech.audio_path}")
        speech_text = ''
        # NOTE: recognize the audio file
        segments, _ = whisper.transcribe(
            audio=speech.audio_path,
            language='zh',
            vad_filter=False,
            temperature=0,
            # Prompt text means: "The following are sentences in Mandarin."
            initial_prompt='以下是普通话的句子。'
        )
        segments = list(segments)
        if len(segments) == 0:
            logger.warning(f"Recognition result is empty: {speech.audio_path}")
        else:
            speech_text = ','.join([segment.text for segment in segments])
        logger.info(f"Recognition result ({index+1}/{speeches_num}): {speech_text}")
        self.speeches[index].text = speech_text
    logger.info(f"Finished speech-to-text: {self.speeches}")
    # queue.put(self.speeches)
    # FIXME: unloading the model causes the program to terminate
    del whisper |
Same issue with Windows 11. |
I was able to avoid that error with the temperature=0 setting. Will this setting adversely affect the transcription results? I searched the whisper repo, but couldn't find a satisfactory answer. |
Yes, disabling the temperature fallback can affect the results. The fallback is mostly useful to recover from cases where the model generates the same token in a loop. |
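To make the trade-off concrete, here is a simplified sketch of how a temperature fallback can recover from a looping output. It is not the library's implementation: `decode` is a hypothetical callable standing in for the model, only the compression-ratio check is shown (the real fallback also looks at other signals such as average log probability), and the temperature list and the 2.4 threshold are assumed to mirror the library defaults.

```python
import zlib


def compression_ratio(text: str) -> float:
    """Ratio of raw size to zlib-compressed size; repetitive text scores high."""
    data = text.encode("utf-8")
    return len(data) / len(zlib.compress(data))


def decode_with_fallback(
    decode,
    temperatures=(0.0, 0.2, 0.4, 0.6, 0.8, 1.0),
    max_ratio=2.4,
):
    """Call decode(t) at increasing temperatures until the text stops looking like a loop.

    Returns the last decoded text and the temperature that produced it.
    """
    text, used = "", None
    for used in temperatures:
        text = decode(used)
        if compression_ratio(text) <= max_ratio:
            break  # output is not suspiciously repetitive, accept it
    return text, used
```

With `temperatures=(0.0,)`, which corresponds to `temperature=0`, there is only one attempt, so a looping result is returned as-is. That matches the looping reported later in this thread when the fallback is disabled.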
Thank you. My test results were the same as you said. |
My runtime environment is Python 3.11.4, CUDA 11.8.0, graphics card driver 522.06, and cudnn-windows-x86_64-8.9.3.28. I am using the faster-whisper project, and when I try to load the model on the GPU, Python exits with error -1073740791 (0xC0000409). When I use the CPU, the error does not occur. I have tried various solutions, including the ones mentioned above, such as installing the CUDA environment, adding system variables, and setting the temperature to 0. None of them worked. Whenever I iterate over the segments, CUDA crashes and the program terminates. Finally, when I add print(torch.cuda.is_available()) to check whether the CUDA device is recognized, it prints True and the program runs without any issues. My guess is that there might be an issue with the initialization and release of CUDA in CT2. |
Check whether you have installed zlib; see https://docs.nvidia.com/deeplearning/cudnn/install-guide/index.html#install-zlib-windows |
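For readers comparing the two forms of the exit code quoted in this thread: Python reports the process exit status as a signed 32-bit integer, while Windows documents it as an unsigned NTSTATUS value. The small sketch below shows the conversion; the helper name `to_ntstatus` is made up for illustration. As far as I know, 0xC0000409 is listed by Windows as STATUS_STACK_BUFFER_OVERRUN, a fail-fast code that does not identify the cause by itself.

```python
def to_ntstatus(exit_code: int) -> str:
    """Reinterpret a signed 32-bit exit code as an unsigned NTSTATUS hex string."""
    return f"0x{exit_code & 0xFFFFFFFF:08X}"


print(to_ntstatus(-1073740791))  # 0xC0000409
```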
@JamePeng This is a different issue. The issue described in this thread is a crash when unloading the model. The error you get generally means that the program cannot locate the cuDNN and/or Zlib libraries. There are already several discussions about this. |
@guillaumekln ok, it worked now, thanks for your help |
Do we have any updates on resolving this issue? Currently, using the workaround of setting |
I compiled my app with Nuitka and ran it as an Administrator user; it does not crash when unloading the model. |
I have the same problem. My configuration: Python 3.10.7, CUDA Toolkit 11.8, and cuDNN 8.9.6, added to PATH. If I set temperature=0, I get looping. |
I had the same problem, and I think I fixed it in my case by moving the faster-whisper import inside the function that uses it. Keep in mind that I am using faster-whisper through stable whisper, and I need to import some things from the faster-whisper library. I previously imported it globally at the top and found that my app would sometimes crash after loading and reloading different models. After moving the import into the function that uses it, the crash is somehow gone. |
I have the same issue. I reinstall pytorch with this command |
I can consistently reproduce it with the latest master, Python 3.11.1 and CUDA 12.5 on Windows 10, 3 minutes of audio and the tiny model, with the following simple code:
from faster_whisper import WhisperModel
model = WhisperModel("tiny", device="cuda", compute_type="auto")
segments, info = model.transcribe("js.wav")
for segment in segments:
    print("[%.2fs -> %.2fs] %s" % (segment.start, segment.end, segment.text))
I only installed CUDA Toolkit 12.5, not cuDNN. At no point does the system max out on CPU, GPU or memory. If device="cpu" is forced, the issue does not occur, nor does it with temperature=0.0 as stated above. Curiously, it also does not occur if I don't iterate to the end of If I put |
Windows 10, CPU only, compute type int8 or float32: both may crash. Same observations as above. Possible reproduction: several larger audio files (10+ minutes, 16 kHz, mono, WAV) recognized one after another in a for loop. The crash happens in the last few files, at the end of the segments iteration, regardless of whether the model is deleted with del model, and regardless of whether the tasks share a single model or each task creates its own. Temperature is set to 0, condition_on_previous_text to False, and beam_size and best_of to 1. A single task, even with large audio, rarely crashes; crashes mostly occur when multiple tasks run in a row. I tried creating a process for each task, and when one process finishes and the next one starts, it still crashes! Looking at the Windows crash dump info.
Running multiple tasks in a row does not always crash; it happens with some probability. Sometimes more than a dozen tasks in a row run without error, and sometimes it crashes after three to five tasks! |
Did anyone find out yet if this is a bug in faster-whisper or in ctranslate2? |
This workaround worked for me as well. Windows 11, Python 3.13.0, CUDA 12.4 |
#71 (comment) |
I'm currently hitting this as well -- looking forward to the fix. |
Any update on this? Seems to occur for me when trying to transcribe multiple files in sequence. This happens while launching as a subprocess (mentioned in #71) and with Temperature set to 0. |
This happens to me every time I use a faster_whisper model; I can run it only once per process. |
First of all, thanks for your work. It's useful.
However, there's still something wrong. Python returns -1073740791 (0xC0000409) when processing an audio file in Chinese. I have defined a function in which the variable 'result' receives the 'segments' returned by faster-whisper. Everything is normal inside this function, but not after the function returns.
The line 'print(result)' works.
But after the result is returned, Python exits with -1073740791 (0xC0000409) and terminates.
When I change the model or the language, it works properly.
Confused.