
bug: can enable GPU acceleration with cuda not installed - model fails to start #3762

Closed
2 of 4 tasks
Tracked by #1165
johnhaire89 opened this issue Aug 13, 2024 · 6 comments
Assignees
Labels
category: cortex.cpp Related to cortex.cpp category: providers Local & remote inference providers move to Cortex type: bug Something isn't working

Comments

@johnhaire89

  • I have searched the existing issues

Current behavior

I was playing with Jan for the first time and realised that GPU acceleration wasn't enabled.
I toggled the "GPU Acceleration" switch to enable it for my NVIDIA RTX A2000 with no error.

When I next typed into the chat window, Jan wasn't able to start the model.

The problem was that I didn't have the CUDA toolkit installed.
Per the SO answer at https://stackoverflow.com/a/55717476, nvidia-smi shows the highest CUDA version the driver supports, but nvcc --version should be used to check the installed toolkit version.
I installed the CUDA Toolkit and it's back to working like magic.
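In practice the two commands answer different questions; a quick way to see the distinction from a terminal (plain shell, nothing Jan-specific):

```shell
# nvidia-smi reports the highest CUDA version the installed *driver* supports;
# it says nothing about whether the CUDA toolkit itself is installed.
nvidia-smi | grep "CUDA Version" || true

# nvcc ships with the CUDA toolkit, so its presence (and version) reflects
# the actual toolkit installation:
if command -v nvcc >/dev/null 2>&1; then
  nvcc --version | grep "release"
else
  echo "CUDA toolkit not installed"
fi
```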

This is probably more a feature request than a bug, but that toggle should show an error if I try to enable GPU acceleration for an NVIDIA card when the CUDA toolkit isn't installed.

Minimum reproduction step

Start with a Windows PC with an NVIDIA GPU and without the CUDA Toolkit installed (verify with nvcc --version)

  1. Under Settings > Advanced Settings, select the GPU and toggle the switch - toast says "Successfully turned on GPU acceleration"
  2. Try to start Mistral Instruct 7B Q4 - model fails to start

Expected behavior

When I try to enable GPU Acceleration for an NVIDIA GPU in an environment where the CUDA Toolkit isn't installed, I should get a helpful error.
Maybe a warning could be displayed next to the GPU in the dropdown?
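A pre-flight check along these lines could gate the toggle. A minimal shell sketch (the function name and its probe arguments are illustrative, not Jan's actual implementation):

```shell
# Hypothetical pre-flight check before enabling GPU acceleration.
# The two arguments let the driver/toolkit probes be overridden for testing;
# in real use they would be nvidia-smi and nvcc.
check_cuda_prereqs() {
  driver_cmd="$1"   # normally: nvidia-smi
  toolkit_cmd="$2"  # normally: nvcc
  if ! command -v "$driver_cmd" >/dev/null 2>&1; then
    echo "driver-missing"
    return 1
  fi
  if ! command -v "$toolkit_cmd" >/dev/null 2>&1; then
    echo "toolkit-missing"
    return 1
  fi
  echo "ok"
}

# Usage with the real commands:
# check_cuda_prereqs nvidia-smi nvcc
```

The app could then surface a "toolkit-missing" result as an actionable error (e.g. a link to the CUDA installer) instead of silently flipping the switch on.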

Screenshots / Logs

2024-08-13T02:27:29.268Z [CORTEX]::Debug: Spawn cortex at path: C:\Users\username\jan\extensions\@janhq\inference-cortex-extension\dist\bin\win-cuda-12-0\cortex-cpp.exe, and args: 1,127.0.0.1,3928
2024-08-13T02:27:29.268Z [CORTEX]::Debug: Spawning cortex subprocess...
2024-08-13T02:27:29.268Z [APP]::C:\Users\username\jan\extensions\@janhq\inference-cortex-extension\dist\bin\win-cuda-12-0
2024-08-13T02:27:29.380Z [CORTEX]::Debug: cortex is ready
2024-08-13T02:27:29.380Z [CORTEX]::Debug: Loading model with params {"cpu_threads":15,"ctx_len":2048,"prompt_template":"{system_message} [INST] {prompt} [/INST]","llama_model_path":"C:\\Users\\username\\jan\\models\\mistral-ins-7b-q4\\Mistral-7B-Instruct-v0.3-Q4_K_M.gguf","ngl":33,"system_prompt":"","user_prompt":" [INST] ","ai_prompt":" [/INST]","model":"mistral-ins-7b-q4"}
2024-08-13T02:27:29.391Z [CORTEX]::Debug: 20240813 02:27:29.291000 UTC 34396 INFO  cortex-cpp version: default_version - main.cc:73
20240813 02:27:29.292000 UTC 34396 INFO  cortex.llamacpp version: 0.1.20-30.06.24 - main.cc:78
20240813 02:27:29.292000 UTC 34396 INFO  Server started, listening at: 127.0.0.1:3928 - main.cc:81
20240813 02:27:29.292000 UTC 34396 INFO  Please load your model - main.cc:82
20240813 02:27:29.292000 UTC 34396 INFO  Number of thread is:20 - main.cc:89
20240813 02:27:29.383000 UTC 25336 INFO  CPU instruction set: fpu = 1| mmx = 1| sse = 1| sse2 = 1| sse3 = 1| ssse3 = 1| sse4_1 = 1| sse4_2 = 1| pclmulqdq = 1| avx = 1| avx2 = 1| avx512_f = 0| avx512_dq = 0| avx512_ifma = 0| avx512_pf = 0| avx512_er = 0| avx512_cd = 0| avx512_bw = 0| has_avx512_vl = 0| has_avx512_vbmi = 0| has_avx512_vbmi2 = 0| avx512_vnni = 0| avx512_bitalg = 0| avx512_vpopcntdq = 0| avx512_4vnniw = 0| avx512_4fmaps = 0| avx512_vp2intersect = 0| aes = 1| f16c = 1| - server.cc:277
20240813 02:27:29.392000 UTC 25336 ERROR Could not load engine: Could not load library "C:\Users\username\jan\extensions\@janhq\inference-cortex-extension\dist\bin\win-cuda-12-0/engines/cortex.llamacpp/engine.dll"
The specified module could not be found.

 - server.cc:290

2024-08-13T02:27:29.392Z [CORTEX]::Debug: Load model success with response {}
2024-08-13T02:27:29.398Z [CORTEX]::Debug: Validate model state failed with response "Conflict"
2024-08-13T02:27:29.398Z [CORTEX]::Error: Validate model status failed
2024-08-13T02:27:29.397Z [CORTEX]::Debug: Validate model state with response 409
2024-08-13T02:28:29.958Z [CORTEX]::Debug: Request to kill cortex
2024-08-13T02:28:29.958Z [CORTEX]::Debug: Killing PID 21376

Jan version

0.5.2

In which operating systems have you tested?

  • macOS
  • Windows
  • Linux

Environment details

Windows 11
NVIDIA RTX A2000 8GB Laptop GPU (8192 MB VRAM)
CUDA toolkit not installed

@johnhaire89 johnhaire89 added the type: bug Something isn't working label Aug 13, 2024
@louis-jan
Contributor

@Van-QA @imtuyethan I think we implemented error handling for this in the past, which directs the user to the additional CUDA installation page?

@dan-menlo
Contributor

@johnhaire89 FYI, Jan is in the process of overhauling how we deal with llama.cpp binaries and GPU dependencies.

@Van-QA I will keep this bug open. Once we clean up PM systems, let's link the 2 epics that would solve this bug. My style is to only close bugs once the corresponding feature is shipped.

  • Jan should embed llama.cpp through Cortex + cortex.llamacpp
  • cortex engines llama.cpp install should also pull CUDA dependencies, cc @namchuai (FYI)

@freelerobot freelerobot transferred this issue from janhq/jan Sep 6, 2024
@dan-menlo
Contributor

dan-menlo commented Sep 8, 2024

Handling this bug as part of janhq/cortex.cpp#1165

@louis-jan
Contributor

louis-jan commented Sep 17, 2024

Hi @dan-homebrew @imtuyethan. This is a known issue; there is a fix in 0.5.4: #3552.

  1. Show a corresponding error message.
  2. Allow users to install dependencies.

We have a step that lets users install additional dependencies right in the app (without redirecting them out of the app).
[screenshot: in-app dependency installation prompt]

In the next update, which integrates the cortex-cpp engine pull, there should be no extra request to install these dependencies, BUT this error message would still really help in case a driver/CUDA update does not work with the pulled engine and its dependencies.

@github-project-automation github-project-automation bot moved this to Investigating in Jan & Cortex Oct 3, 2024
@gabrielle-ong gabrielle-ong transferred this issue from janhq/cortex.cpp Oct 3, 2024
@imtuyethan imtuyethan modified the milestone: v0.5.7 Oct 14, 2024
@imtuyethan
Contributor

imtuyethan commented Oct 14, 2024

The fix is included in Jan's path to cortex.cpp: #3690

@imtuyethan
Contributor

This issue is now obsolete: as of 0.5.7, users are prompted to install the CUDA toolkit when it's not available.

@github-project-automation github-project-automation bot moved this from Investigating to Review + QA in Jan & Cortex Nov 4, 2024
@imtuyethan imtuyethan moved this from Review + QA to Completed in Jan & Cortex Nov 4, 2024
7 participants