Argonne Leadership Computing Facility (ALCF)#

Polaris#

Polaris is a 44 petaflops system based on the HPE Apollo Gen10+ platform. It is composed of heterogeneous nodes, each with 1 AMD EPYC “Milan” processor and 4 NVIDIA A100 GPUs.

Already installed module#

This installation procedure shows you how to access the installed DeepHyper module on Polaris. After logging in to Polaris, run the following commands to access DeepHyper:

$ module load conda/2022-09-08
$ conda activate base

Then to verify the installation do:

$ python
>>> import deephyper
>>> deephyper.__version__
'0.4.2'

Warning

The deephyper installation provided in the conda module is not always up to date. If you need a more recent version of DeepHyper, please refer to the conda-environment installation procedure.


Installation from source#

This installation procedure shows you how to build DeepHyper from source on Polaris. It provides DeepHyper’s default set of features, with the MPI backend for the Evaluator and the Redis backend for the Storage. After logging in to Polaris, the following script can be executed from a build directory:

file: install/alcf/polaris.sh#
#!/bin/bash

# Generic installation script for DeepHyper on ALCF's Polaris.
# This script is meant to be run on the login node of the machine.
# It will install DeepHyper and its dependencies in the current directory.
# A good practice is to create a `build` folder and launch the script from there,
# e.g. from the root of the DeepHyper repository:
# $ mkdir build && cd build && ../install/alcf/polaris.sh
# The script will also create a file named `activate-dhenv.sh` that will
# set up the environment each time it is sourced: `source activate-dhenv.sh`.

set -xe

# Load modules available on the current system
module load PrgEnv-gnu/8.3.3
module load llvm/release-15.0.0
module load conda/2022-09-08

# Clone the base conda environment
conda create -p dhenv --clone base -y
conda activate dhenv/
pip install --upgrade pip

# Install RedisJSON with Spack
# Install Spack
git clone -c feature.manyFiles=true https://github.com/spack/spack.git
. ./spack/share/spack/setup-env.sh

git clone https://github.com/deephyper/deephyper-spack-packages.git

# Create and activate the `redisjson` environment
spack env create redisjson
spack env activate redisjson

# Add the DeepHyper Spack packages to the environment
spack repo add deephyper-spack-packages

# Add the `redisjson` Spack package to the environment
spack add redisjson

# Build the environment
spack install

# Install DeepHyper's Python package
git clone -b master https://github.com/deephyper/deephyper.git
pip install -e "deephyper/[default,mpi,redis-hiredis]"

# Create the activation script
touch activate-dhenv.sh
echo "#!/bin/bash" >> activate-dhenv.sh

# Append module loading and conda activation
echo "" >> activate-dhenv.sh
echo "module load PrgEnv-gnu/8.3.3" >> activate-dhenv.sh
echo "module load llvm/release-15.0.0" >> activate-dhenv.sh
echo "module load conda/2022-09-08" >> activate-dhenv.sh
echo "conda activate $PWD/dhenv/" >> activate-dhenv.sh

# Append Spack activation
echo "" >> activate-dhenv.sh
echo ". $PWD/spack/share/spack/setup-env.sh" >> activate-dhenv.sh
echo "spack env activate redisjson" >> activate-dhenv.sh

# Create the Redis configuration
touch redis.conf

# Accept connections from any network interface
echo "bind 0.0.0.0" >> redis.conf

# Append the RedisJSON module settings to the configuration file
cat $(spack find --path redisjson | grep -o "/.*/redisjson.*")/redis.conf >> redis.conf

# Disable protected mode (i.e., no password required when connecting to Redis)
echo "protected-mode no" >> redis.conf

This script creates a conda environment activation script activate-dhenv.sh in the build directory, which can be sourced to activate the created environment, and a redis.conf file, which should be referenced when starting a Redis storage server.
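As an illustration, a job script can source the activation script and launch the Redis server with the generated configuration before running a search. The sketch below is a hypothetical example, not part of the installation procedure: the queue, walltime, filesystems, PROJECT_NAME, and myscript.py are placeholders to adapt to your allocation.

```shell
#!/bin/bash
#PBS -l select=1:system=polaris
#PBS -l walltime=00:30:00
#PBS -q debug
#PBS -A PROJECT_NAME
#PBS -l filesystems=home:grand

# Move to the build directory containing activate-dhenv.sh and redis.conf.
cd ${PBS_O_WORKDIR}

# Set up modules, the conda environment, and the Spack environment.
source activate-dhenv.sh

# Start the Redis storage server in the background with the generated configuration.
redis-server redis.conf &

# Run your search script (placeholder name).
python myscript.py
```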

Theta#

Theta is an 11.69 petaflops system based on the second-generation Intel Xeon Phi processor at Argonne Leadership Computing Facility (ALCF). It serves as a stepping stone to the ALCF’s next leadership-class supercomputer, Aurora. Theta is a massively parallel, many-core system based on Intel processors and interconnect technology, a new memory space, and a Lustre-based parallel file system, all integrated by Cray’s HPC software stack.

Already installed module#

This installation procedure shows you how to access the installed DeepHyper module on Theta. After logging in to Theta, run the following commands to access DeepHyper:

$ module load conda/2021-09-22
$ conda activate base

Then to verify the installation do:

$ python
>>> import deephyper
>>> deephyper.__version__
'0.3.0'

Conda environment#

This installation procedure shows you how to create your own Conda virtual environment and install DeepHyper in it.

After logging in Theta, go to your project folder (replace PROJECTNAME by your own project name):

$ cd /lus/theta-fs0/projects/PROJECTNAME

Then create the dhknl environment:

$ module load miniconda-3
$ conda create -p dhknl python=3.8 -y
$ conda activate dhknl/

It is then required to install the following additional dependencies:

$ conda install gxx_linux-64 gcc_linux-64 -y

Finally install DeepHyper in the previously created dhknl environment:

$ pip install pip --upgrade
$ # DeepHyper + Analytics Tools (Parsing logs, Plots, Notebooks)
$ pip install deephyper[analytics]
$ conda install tensorflow -c intel -y

Note

Horovod can be installed to use data parallelism during DeepHyper’s evaluations. To do so, use pip install deephyper[analytics,hvd] during or after the installation.

Jupyter Notebooks#

To use Jupyter notebooks on Theta, go to Theta Jupyter and use your regular authentication method. The Jupyter Hub tutorial from Argonne Leadership Computing Facility might help you in case of trouble.

To create a custom Jupyter kernel run the following from your activated Conda environment:

$ python -m ipykernel install --user --name deephyper --display-name "Python (deephyper)"

Now, when opening a notebook from Jupyter Hub at ALCF, make sure to select the Python (deephyper) kernel before executing; otherwise you will not have all the required dependencies.
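To double-check which environment a running kernel uses, you can execute the following in a notebook cell; the interpreter path should point inside the environment in which the kernel was installed:

```python
import sys

# Print the path of the Python interpreter backing this kernel; it should
# point into the conda environment where the "deephyper" kernel was created.
print(sys.executable)
```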

ThetaGPU#

ThetaGPU is an extension of Theta comprising 24 NVIDIA DGX A100 nodes at Argonne Leadership Computing Facility (ALCF). See the ThetaGPU documentation from the Datascience group at Argonne National Laboratory for more information. The system documentation from the ALCF can be accessed here.

Already installed module#

This installation procedure shows you how to access the installed DeepHyper module on ThetaGPU. It may be useful to wrap these commands in an activate-dhenv.sh script:

file: activate-dhenv.sh#
#!/bin/bash

. /etc/profile

module load conda/2022-07-01
conda activate base

To call this activation script from your own scripts, use source activate-dhenv.sh. Here is an example job that tests that the conda environment is correctly activated (replace $PROJECT_NAME with your project name, e.g., #COBALT -A datascience):

file: job-test-activation.sh#
#!/bin/bash
#COBALT -q single-gpu
#COBALT -n 1
#COBALT -t 20
#COBALT -A $PROJECT_NAME
#COBALT --attrs filesystems=home,theta-fs0,grand,eagle

source activate-dhenv.sh
python -c "import deephyper; print(f'DeepHyper version: {deephyper.__version__}')"

You should see DeepHyper version: x.x.x in the job’s output cobaltlog file after submitting it with:

$ qsub-gpu job-test-activation.sh

Conda environment#

This installation procedure shows you how to create your own Conda virtual environment and install DeepHyper in it.

As this procedure needs to be performed on a ThetaGPU node, we will execute it directly in the job-install-dhenv.sh submission script below (replace $PROJECT_NAME with the name of your project allocation, e.g., #COBALT -A datascience):

file: job-install-dhenv.sh#
#!/bin/bash
#COBALT -q single-gpu
#COBALT -n 1
#COBALT -t 60
#COBALT -A $PROJECT_NAME
#COBALT --attrs filesystems=home,theta-fs0,grand

. /etc/profile

# create the dhgpu environment:
module load conda/2022-07-01

conda create -p dhenv --clone base -y
conda activate dhenv/

# install DeepHyper in the previously created dhgpu environment:
pip install pip --upgrade
pip install deephyper["analytics"]

Then submit this job by executing the following command:

$ qsub-gpu job-install-dhenv.sh

Once this job has finished, you can verify the installation by creating the following activate-dhenv.sh script and submitting the job-test-activation.sh job from Already installed module:

file: activate-dhenv.sh#
#!/bin/bash

. /etc/profile

module load conda/2022-07-01
conda activate dhenv/

mpi4py installation#

You might additionally need to install mpi4py in your environment in order to use functionalities such as the "mpicomm" evaluator. Simply add the following after pip install deephyper["analytics"]:

$ git clone https://github.com/mpi4py/mpi4py.git
$ cd mpi4py/
$ MPICC=mpicc python setup.py install
$ cd ..
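As a quick sanity check of the build, mpi4py ships a hello-world benchmark that can be launched on a node; this is a sketch assuming an MPI launcher such as mpirun is available in your environment:

```shell
# Run mpi4py's bundled hello-world across two ranks; each rank reports
# its rank, the world size, and the host name.
mpirun -n 2 python -m mpi4py.bench helloworld
```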

Internet Access#

If the node you are on does not have outbound network connectivity, set the following environment variables to route traffic through the proxy host:

$ export http_proxy=http://proxy.tmi.alcf.anl.gov:3128
$ export https_proxy=http://proxy.tmi.alcf.anl.gov:3128

Cooley#

Warning

This page is outdated and refers to the last known installation procedure for Cooley.

Cooley is a GPU cluster at Argonne Leadership Computing Facility (ALCF). It has a total of 126 compute nodes; each node has 12 CPU cores and one NVIDIA Tesla K80 dual-GPU card.

Before installing DeepHyper, go to your project folder:

cd /lus/theta-fs0/projects/PROJECTNAME
mkdir cooley && cd cooley/

DeepHyper can be installed on Cooley by following these commands:

git clone https://github.com/deephyper/deephyper.git --depth 1
./deephyper/install/cooley.sh

Then, restart your session.

Warning

You will note that a new file ~/.bashrc_cooley was created and sourced in the ~/.bashrc. This is to avoid conflicting installations between the different systems available at the ALCF.

Note

To test your installation run:

./deephyper/tests/system/test_cooley.sh

A manual installation can also be performed with the following set of commands:

# Install Miniconda
wget https://repo.anaconda.com/miniconda/Miniconda3-py39_4.9.2-Linux-x86_64.sh -O miniconda.sh
bash $PWD/miniconda.sh -b -p $PWD/miniconda
rm -f miniconda.sh

# Install Postgresql
wget http://get.enterprisedb.com/postgresql/postgresql-9.6.13-4-linux-x64-binaries.tar.gz -O postgresql.tar.gz
tar -xf postgresql.tar.gz
rm -f postgresql.tar.gz

# Add CUDA
echo "+cuda-10.2" >> ~/.soft.cooley
resoft

source $PWD/miniconda/bin/activate

# Create conda env for DeepHyper
conda create -p dh-cooley python=3.8 -y
conda activate dh-cooley/
conda install gxx_linux-64 gcc_linux-64 -y
# DeepHyper + Analytics Tools (Parsing logs, Plots, Notebooks)
pip install deephyper[analytics,balsam]
conda install tensorflow-gpu

Warning

The same .bashrc is used on both Theta and Cooley, so adding a module load instruction to the .bashrc will not work on Cooley. To solve this issue, create separate bashrc files for Theta and Cooley and source them conditionally from your ~/.bashrc as follows:

# Theta Specific
if [[ $HOSTNAME = *"theta"* ]];
then
    source ~/.bashrc_theta
# Cooley Specific
else
    source ~/.bashrc_cooley
fi