Containerfile #5141

Merged: merged 6 commits on Jan 15, 2025
Changes from 2 commits
Tools/machines/perlmutter-nersc/Containerfile (151 additions, 0 deletions)
@@ -0,0 +1,151 @@
# Base System and Essential Tools Installation
FROM nvidia/cuda:12.6.0-devel-ubuntu22.04 AS base
Review comment (Member):
To test: we could use -devel to build all the dependencies, but then copy their artifacts into a -runtime image, to check whether the resulting images are much smaller or not.
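A minimal sketch of that idea, assuming a matching nvidia/cuda 12.6.0 -runtime tag exists for Ubuntu 22.04 and that the dependencies are installed under a dedicated prefix such as /opt/software; the stage name and prefix here are illustrative and not part of this PR:

# build stage: full CUDA toolchain and compilers
FROM nvidia/cuda:12.6.0-devel-ubuntu22.04 AS build
# ... build c-blosc, ADIOS2, BLAS++/LAPACK++ with -DCMAKE_INSTALL_PREFIX=/opt/software ...

# runtime stage: CUDA runtime libraries only, no compilers or headers
FROM nvidia/cuda:12.6.0-runtime-ubuntu22.04
# copy only the built artifacts, not the build toolchain
COPY --from=build /opt/software /opt/software
ENV LD_LIBRARY_PATH=/opt/software/lib:${LD_LIBRARY_PATH}

Comparing the final image sizes of the two variants (e.g. with podman image ls) would answer the open question of how much smaller the -runtime-based image ends up.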


# Set up environment variables
ENV DEBIAN_FRONTEND=noninteractive \
SW_DIR=/opt/software \
FORCE_UNSAFE_CONFIGURE=1

# Install essential system dependencies including MPI libraries
RUN apt-get update && apt-get install -y --no-install-recommends \
autoconf \
build-essential \
ca-certificates \
coreutils \
curl \
environment-modules \
gfortran \
git \
openssh-server \
python3 \
python3-pip \
python3-dev \
python3-venv \
unzip \
vim \
libmpich-dev \
Review comment (@ax3l, Aug 15, 2024):
Do we need to do anything to get GPU-aware MPI on Perlmutter to run? Vanilla, without containers, I have to set this to tell the Cray compiler wrappers to get it right:

# necessary to use CUDA-aware MPI and run a job
export CRAY_ACCEL_TARGET=nvidia80

Since we compile with GCC here, I am not sure how to do it.

(A hedged runtime sketch related to this question follows after the install block below.)

cmake \
libblas-dev \
liblapack-dev \
g++ \
cuda-toolkit-12-2 \
pkg-config \
libbz2-dev \
zlib1g-dev \
libpng-dev \
libzstd-dev \
&& rm -rf /var/lib/apt/lists/*
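On the GPU-aware MPI question above: a hedged sketch of the host-side environment that would likely be exported in a Perlmutter job script when running against Cray MPICH. Whether anything equivalent is needed inside this container, which installs Ubuntu's libmpich-dev rather than Cray MPICH, is exactly the open question and is untested here.

# Perlmutter job script / login shell (host side), not part of the Containerfile:
export CRAY_ACCEL_TARGET=nvidia80        # tells the Cray compiler wrappers to target A100 GPUs (from the comment above)
export MPICH_GPU_SUPPORT_ENABLED=1       # requests GPU-aware (CUDA-aware) MPI from Cray MPICH at run time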

# Install c-blosc from source
FROM base AS c-blosc

RUN git clone -b v1.21.1 https://github.com/Blosc/c-blosc.git /tmp/c-blosc && \
cd /tmp/c-blosc && \
mkdir build && cd build && \
cmake -DCMAKE_INSTALL_PREFIX=/usr .. && \
make -j$(nproc) && \
make install && \
rm -rf /tmp/c-blosc

# Install ADIOS2 from source
FROM base AS adios2

# Ensure c-blosc is installed before ADIOS2
COPY --from=c-blosc /usr /usr

# Verify the location of Blosc library
RUN find /usr -name 'libblosc*'

# Ensure Blosc library paths are correctly configured
ENV BLOSC_LIBRARY=/usr/lib/libblosc.so.1.21.1
ENV BLOSC_INCLUDE_DIR=/usr/include

# Install ADIOS2
RUN git clone -b v2.8.3 https://github.com/ornladios/ADIOS2.git /tmp/adios2 && \
cd /tmp/adios2 && \
mkdir build && cd build && \
cmake -DADIOS2_USE_Blosc=ON \
-DBLOSC_LIBRARY=${BLOSC_LIBRARY} \
-DBLOSC_INCLUDE_DIR=${BLOSC_INCLUDE_DIR} \
-DADIOS2_USE_Fortran=OFF \
-DADIOS2_USE_Python=OFF \
-DADIOS2_USE_ZeroMQ=OFF \
-DADIOS2_USE_BZip2=ON \
-DADIOS2_USE_ZFP=OFF \
-DADIOS2_USE_SZ=OFF \
-DADIOS2_USE_MGARD=OFF \
-DADIOS2_USE_PNG=ON \
-DCMAKE_INSTALL_PREFIX=/usr .. && \
make -j$(nproc) && \
make install && \
rm -rf /tmp/adios2

# Install BLAS++ and LAPACK++
FROM base AS blaspp_lapackpp

RUN git clone -b v2024.05.31 https://github.com/icl-utk-edu/blaspp.git /tmp/blaspp && \
cd /tmp/blaspp && \
mkdir build && cd build && \
cmake -Duse_openmp=OFF -Dgpu_backend=cuda -DCMAKE_CXX_STANDARD=17 -DCMAKE_INSTALL_PREFIX=/usr -DBLAS_LIBRARIES=/usr/lib/x86_64-linux-gnu/libblas.so -DLAPACK_LIBRARIES=/usr/lib/x86_64-linux-gnu/liblapack.so .. && \
make -j$(nproc) && \
make install && \
rm -rf /tmp/blaspp

RUN git clone -b v2024.05.31 https://github.com/icl-utk-edu/lapackpp.git /tmp/lapackpp && \
cd /tmp/lapackpp && \
mkdir build && cd build && \
cmake -DCMAKE_CXX_STANDARD=17 -Dbuild_tests=OFF -DCMAKE_INSTALL_PREFIX=/usr -DLAPACK_LIBRARIES=/usr/lib/x86_64-linux-gnu/liblapack.so .. && \
make -j$(nproc) && \
make install && \
rm -rf /tmp/lapackpp

# Install heFFTe
FROM base AS heffte

RUN git clone -b v2.4.0 https://github.com/icl-utk-edu/heffte.git /tmp/heffte # && \
# cd /tmp/heffte && \
# mkdir build && cd build && \
# Disable LTO for CUDA builds
# CXXFLAGS="-fno-lto" CFLAGS="-fno-lto"
# cmake -DBUILD_SHARED_LIBS=ON -DCMAKE_BUILD_TYPE=Release -DCMAKE_CXX_STANDARD=17 -DHeffte_ENABLE_CUDA=ON -DCMAKE_INSTALL_PREFIX=/usr .. && \
# make -j$(nproc) && \
# make install && \
# rm -rf /tmp/heffte

# Final Image
FROM base AS final

# Copy installed software from previous stages
COPY --from=c-blosc /usr /usr
COPY --from=adios2 /usr /usr
COPY --from=blaspp_lapackpp /usr /usr
#COPY --from=heffte /usr /usr

# Create and activate Python virtual environment
RUN python3 -m venv /opt/venv && \
/opt/venv/bin/pip install --no-cache-dir \
wheel \
numpy \
pandas \
scipy \
matplotlib \
jupyter \
scikit-learn \
openpmd-api \
yt \
cupy-cuda12x \
torch \
optimas[all] \
cython \
packaging \
build \
setuptools

# Set up the environment for the virtual environment
ENV PATH="/opt/venv/bin:${PATH}"

# Set up entrypoint
ENTRYPOINT ["/bin/bash", "-c"]

# Default command
CMD ["/bin/bash"]