Commit
Avoid activating GPU when we can't use the GPU
Loading the GPU module imposes about a second of latency. If the user
isn't passing a GPU-layers flag (e.g. -ngl 35) and the machine isn't
Apple Silicon, skip the GPU-related initialization routines entirely.
jart committed Jan 4, 2024
1 parent 50bdf69 commit 01b9aaf
Showing 2 changed files with 8 additions and 1 deletion.
2 changes: 1 addition & 1 deletion llama.cpp/common.cpp
@@ -562,7 +562,7 @@ bool gpt_params_parse_ex(int argc, char ** argv, gpt_params & params) {
break;
}
params.n_gpu_layers = std::stoi(argv[i]);
- if (params.n_gpu_layers == 0) {
+ if (params.n_gpu_layers <= 0) {
FLAG_gpu = LLAMAFILE_GPU_DISABLE;
}
} else if (arg == "--gpu-layers-draft" || arg == "-ngld" || arg == "--n-gpu-layers-draft") {
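The effect of the `common.cpp` change can be sketched in isolation. This is a minimal illustration, not the llamafile code: `GpuSetting`, `kGpuAuto`, `kGpuDisabled`, and `gpu_setting_for_layers` are hypothetical names standing in for `FLAG_gpu` and `LLAMAFILE_GPU_DISABLE`, whose actual definitions are not shown in this commit.

```cpp
#include <string>

// Hypothetical stand-in for the FLAG_gpu values; the real constants are
// defined elsewhere in llamafile and are not shown in this diff.
enum GpuSetting { kGpuAuto, kGpuDisabled };

// Maps the value given to -ngl onto a GPU setting. After this commit,
// negative layer counts disable the GPU too, not just an exact zero.
// std::stoi throws std::invalid_argument or std::out_of_range on bad input.
GpuSetting gpu_setting_for_layers(const std::string &value, GpuSetting current) {
    int n_gpu_layers = std::stoi(value);
    if (n_gpu_layers <= 0) {
        return kGpuDisabled;
    }
    return current;  // a positive count leaves the existing setting untouched
}
```

Under these assumptions, `gpu_setting_for_layers("-1", kGpuAuto)` yields `kGpuDisabled`, whereas the old `== 0` comparison would have left the setting unchanged for a negative value.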
7 changes: 7 additions & 0 deletions llama.cpp/main/main.cpp
@@ -124,6 +124,13 @@ int main(int argc, char ** argv) {
__builtin_unreachable();
}

+ if (!IsXnuSilicon() &&
+ (!has_argument(argc, argv, "-ngl") &&
+ !has_argument(argc, argv, "--gpu-layers") &&
+ !has_argument(argc, argv, "--n-gpu-layers"))) {
+ FLAG_gpu = LLAMAFILE_GPU_DISABLE;
+ }

if (!has_argument(argc, argv, "--cli") &&
(has_argument(argc, argv, "--server") ||
(!has_argument(argc, argv, "-p") &&
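The early check added to `main.cpp` amounts to a scan of `argv` before any GPU code runs. The sketch below is a self-contained approximation under stated assumptions: `has_flag` is a hypothetical stand-in for `has_argument` (whose implementation is not part of this diff, so the exact-match and skip-`argv[0]` behavior here is assumed), `is_apple_silicon` replaces the call to `IsXnuSilicon()`, and `GpuChoice` replaces `FLAG_gpu`.

```cpp
#include <cstring>

// Hypothetical stand-in for the FLAG_gpu values used in llamafile.
enum GpuChoice { kChoiceAuto, kChoiceDisabled };

// Assumed behavior: true if any argument after the program name matches
// `flag` exactly. The real has_argument() may differ.
static bool has_flag(int argc, const char *const *argv, const char *flag) {
    for (int i = 1; i < argc; ++i) {
        if (std::strcmp(argv[i], flag) == 0) {
            return true;
        }
    }
    return false;
}

// Disables the GPU up front unless the host is Apple Silicon or the user
// passed one of the three spellings of the GPU-layers flag.
GpuChoice choose_gpu(int argc, const char *const *argv, bool is_apple_silicon) {
    bool asked_for_layers = has_flag(argc, argv, "-ngl") ||
                            has_flag(argc, argv, "--gpu-layers") ||
                            has_flag(argc, argv, "--n-gpu-layers");
    if (!is_apple_silicon && !asked_for_layers) {
        return kChoiceDisabled;
    }
    return kChoiceAuto;
}
```

Note that this check only looks for the presence of the flag. A value such as `-ngl 0` passes it and is then handled by the `<= 0` comparison in `gpt_params_parse_ex`, so the two changes work together.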
