Commit
Correct CUDA worker invocation.
wilson committed Feb 2, 2023
1 parent 5ffc306 commit eb38aa0
Showing 2 changed files with 4 additions and 6 deletions.
8 changes: 3 additions & 5 deletions dask_cloudprovider/aws/ecs.py
@@ -437,8 +437,8 @@ def __init__(
        self._nthreads = nthreads
        _command = [
            "dask",
+           "cuda" if self._gpu else None,
            "worker",
-           "--resources GPU={}".format(self._gpu) if self._gpu else None,
            self.scheduler,
            "--name",
            str(self.name),
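The command list in this hunk mixes literal tokens with conditional entries that evaluate to ``None`` when GPUs are disabled; the ``None`` placeholders are filtered out before the worker is launched (as the second hunk's comprehension shows). A minimal sketch of that pattern, with illustrative stand-in names rather than the actual attributes:

```python
# Illustrative sketch: build a worker command with optional tokens,
# then drop the None placeholders, yielding "dask cuda worker" when GPUs
# are enabled and "dask worker" otherwise.
gpu = 1  # stand-in for self._gpu; any truthy value enables the "cuda" token
command = [
    "dask",
    "cuda" if gpu else None,
    "worker",
    "tcp://scheduler:8786",  # placeholder scheduler address
]
command = [token for token in command if token is not None]
print(" ".join(command))  # prints "dask cuda worker tcp://scheduler:8786"
```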
@@ -700,7 +700,7 @@ class ECSCluster(SpecCluster, ConfigMixin):
    ...                      worker_gpu=1)

    By setting the ``worker_gpu`` option to something other than ``None`` will cause the cluster
-   to run ``dask worker --resources GPU=n`` as the worker startup command. Setting this option will also change
+   to run ``dask cuda worker`` as the worker startup command. Setting this option will also change
    the default Docker image to ``rapidsai/rapidsai:latest``, if you're using a custom image
    you must ensure the NVIDIA CUDA toolkit is installed with a version that matches the host machine
    along with ``dask-cuda``.
@@ -1265,10 +1265,8 @@ async def _create_worker_task_definition_arn(self):
                    e
                    for e in [
                        "dask",
+                       "cuda" if self._worker_gpu else None,
                        "worker",
-                       "--resources GPU={}".format(self._worker_gpu)
-                       if self._worker_gpu
-                       else None,
                        "--nthreads",
                        "{}".format(
                            max(int(self._worker_cpu / 1024), 1)
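The ``--nthreads`` value in this hunk divides the ECS CPU allocation (expressed in CPU units, where 1024 units equals one vCPU) by 1024, flooring at one thread. A hedged sketch of that arithmetic; the helper name is hypothetical:

```python
def worker_nthreads(worker_cpu_units: int) -> int:
    """One worker thread per whole vCPU; ECS counts 1024 CPU units per vCPU.

    Illustrative helper mirroring max(int(self._worker_cpu / 1024), 1)
    from the diff above.
    """
    return max(int(worker_cpu_units / 1024), 1)

print(worker_nthreads(4096))  # 4 vCPUs -> prints 4
print(worker_nthreads(512))   # half a vCPU still gets 1 thread -> prints 1
```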
2 changes: 1 addition & 1 deletion doc/source/gpus.rst
@@ -10,7 +10,7 @@ Each cluster manager handles this differently but generally you will need to con

 - Configure the hardware to include GPUs. This may be by changing the hardware type or adding accelerators.
 - Ensure the OS/Docker image has the NVIDIA drivers. For Docker images it is recommended to use the [RAPIDS images](https://hub.docker.com/r/rapidsai/rapidsai/).
-- Set the ``worker_module`` config option to ``dask_cuda.cli.dask_cuda_worker`` or set ``resources`` to include ``GPU=n`` where ``n`` is the number of GPUs you require.
+- Set the ``worker_module`` config option to ``dask_cuda.cli.dask_cuda_worker`` or set ``resources`` to include ``GPU=n`` where ``n`` is the number of GPUs you require. This will cause ``dask cuda worker`` to be used in place of ``dask worker``.

 In the following AWS :class:`dask_cloudprovider.aws.EC2Cluster` example we set the ``ami`` to be a Deep Learning AMI with NVIDIA drivers, the ``docker_image`` to RAPIDS, the ``instance_type``
 to ``p3.2xlarge`` which has one NVIDIA Tesla V100 and the ``worker_module`` to ``dask_cuda.cli.dask_cuda_worker``.
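The ``EC2Cluster`` setup this docs hunk describes could be sketched roughly as below. This is a config fragment, not runnable without AWS credentials; the AMI id is a placeholder since the real Deep Learning AMI id varies by region:

```python
from dask_cloudprovider.aws import EC2Cluster

# Hypothetical GPU cluster configuration following the checklist above.
# "ami-xxxxxxxx" stands in for a Deep Learning AMI with NVIDIA drivers.
cluster = EC2Cluster(
    ami="ami-xxxxxxxx",
    docker_image="rapidsai/rapidsai:latest",
    instance_type="p3.2xlarge",  # one NVIDIA Tesla V100
    worker_module="dask_cuda.cli.dask_cuda_worker",
)
```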
Expand Down
