Zeus GPGPU:Basics

From Komputery Dużej Mocy (High-Performance Computers) at ACK CYFRONET AGH

Name of the access machine

The rules are the same as for the rest of the Zeus cluster; see: Name of the access machine

Disk resources

The rules are the same as for the rest of the Zeus cluster; see: Disk resources

Available software

A description of the available software is here.

CUDA / OpenCL

To gain access to the tools, and in particular to set the $CUDADIR, $PATH and $LD_LIBRARY_PATH variables correctly, first load the CUDA module with the command module add cuda.
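A quick way to confirm that the module did its job is to inspect the environment afterwards. The sketch below is only an illustration: check_cuda_env is a hypothetical helper name, and it assumes that the CUDA module places the nvcc compiler driver on $PATH, which this page does not state explicitly.

```shell
#!/bin/sh
# check_cuda_env is a hypothetical helper, not part of the cluster software.
# It reports whether the variables that the CUDA module should set are present.
check_cuda_env() {
    status=0
    if [ -z "${CUDADIR:-}" ]; then
        echo "CUDADIR is not set; run 'module add cuda' first"
        status=1
    else
        echo "CUDADIR=$CUDADIR"
    fi
    # Assumption: the module puts the nvcc compiler driver on PATH.
    if command -v nvcc >/dev/null 2>&1; then
        echo "nvcc found at $(command -v nvcc)"
    else
        echo "nvcc not found in PATH"
        status=1
    fi
    return "$status"
}
```

In an interactive session you would run module add cuda and then check_cuda_env; a non-zero return status means that at least one of the checks failed.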

Running jobs

Batch system: Torque

Resource manager: Moab

Running jobs requires knowledge of the basic commands of the batch system.

Besides the typical parameters required by the batch system, you must specify the gpus parameter, which indicates how many graphics cards are requested on a single compute node. Software often requires a graphics card to be available to many processes. In that case, set its working mode to shared by specifying gpus=1:shared or gpus=2:shared. The other possible modes are exclusive_thread and exclusive_process.

If necessary, you can check the identifiers of the graphics cards allocated by the batch system by viewing the contents of the file indicated by $PBS_GPUFILE.
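As an illustration of the gpus parameter and of $PBS_GPUFILE, here is a minimal job script sketch. It requests one shared card and prints the allocated GPUs. The job name sample_gpu_list and the function list_allocated_gpus are hypothetical, and the parsing assumes that each line of the file has the form <hostname>-gpu<index>, which is the usual Torque layout but is not confirmed on this page.

```shell
#!/bin/sh
#PBS -q gpgpu
#PBS -l nodes=1:ppn=1:gpus=1:shared
#PBS -N sample_gpu_list

# Print one "host=... gpu=..." line per GPU listed in the given file.
# Assumption: each line looks like "<hostname>-gpu<index>".
list_allocated_gpus() {
    gpufile=$1
    while IFS= read -r line || [ -n "$line" ]; do
        [ -n "$line" ] || continue
        host=${line%-gpu*}      # strip the trailing "-gpu<index>"
        index=${line##*-gpu}    # keep only what follows the last "-gpu"
        echo "host=$host gpu=$index"
    done < "$gpufile"
}

# Inside a job the batch system sets PBS_GPUFILE; outside a job nothing is printed.
if [ -n "${PBS_GPUFILE:-}" ] && [ -r "$PBS_GPUFILE" ]; then
    list_allocated_gpus "$PBS_GPUFILE"
fi
```

If the file format on Zeus differs from the assumed one, cat "$PBS_GPUFILE" shows the raw content.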

Queue description

Queue name | Maximum number of jobs per user | Maximum job duration | Additional information
gpgpu | 16 | no limit | two GPGPU cards in every node

Description of compute node properties

Property | Description
mhz2933 | processor speed
mem74gb | total amount of RAM on the compute server
n2-2 | location, not necessary for computing
gpgpu | indicates that the node has a GPGPU card

Example scripts for batch system

TeraChem (parallelization for two graphics cards)

#!/bin/sh

# TeraChem can run on a single node only
#PBS -l nodes=1:ppn=2:terachem:gpus=2:exclusive_process

#PBS -N sample_terachem
#PBS -q gpgpu

cd $PBS_O_WORKDIR

# initializing proper environment for TeraChem
module add gpu/terachem

# actual job
$TERACHEMRUN ch.inp > ch.log

TeraChem (parallelization for eight graphics cards)

#!/bin/sh

# TeraChem can run on a single node only
#PBS -l nodes=1:ppn=8:terachem:gpus=8:exclusive_process

#PBS -N sample_terachem
#PBS -q gpgpu

cd $PBS_O_WORKDIR

# initializing proper environment for TeraChem
module add gpu/terachem

# actual job
$TERACHEMRUN ch.inp > ch.log

NAMD

#!/bin/sh

#PBS -l nodes=3:ppn=12:gpus=2:shared
#PBS -N sample_namd
#PBS -q gpgpu

cd $PBS_O_WORKDIR

# initializing proper environment for NAMD with GPU support
module add gpu/namd

# actual job
runnamd stmv.namd > stmv2_2x2.log 

GAMESS

#!/bin/sh

# the number of GPUs requested
# at the moment it must be set to 2 per node
#PBS -l nodes=2:ppn=4:gpus=2:exclusive_process

#PBS -N sample_gamess 
#PBS -q gpgpu

# changing directory to the one from which the job is submitted
cd $PBS_O_WORKDIR

# initializing proper environment for GAMESS with GPU support
module add gpu/gamess

# actual job (stdout and stderr both go to the log; portable across sh implementations)
rungms noq15 > noq15.log 2>&1


GAMESS (older version)

#!/bin/sh

# this version of GAMESS is single-node only
#PBS -l nodes=1:ppn=4:gpus=2:exclusive_process

#PBS -N sample_gamess 
#PBS -q gpgpu

# changing directory to the one from which the job is submitted
cd $PBS_O_WORKDIR

# initializing proper environment for GAMESS 2010.R1 with GPU support
module add gpu/gamess/2010.R1

# actual job (stdout and stderr both go to the log; portable across sh implementations)
$GMSRUN noq15 > noq15.log 2>&1

Rules on the GPGPU part of Zeus

  • Only jobs that use GPUs may be started.
  • Performing computations on the access server is strictly prohibited. Administrators will terminate such jobs without warning.
  • To compile a program, use the following command: qsub -I -q gpgpu -l nodes=1:ppn=1:gpus=1
    This command logs the user in to a compute node, where the compilation should be done.
  • To run computations, you must specify a grant identifier using the "-A" parameter.

A detailed description of grants can be found here.
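The grant rule can be combined with the queue name in a small wrapper. This is a sketch only: submit_gpu_job and the DRY_RUN variable are hypothetical names introduced for illustration, and the grant identifier and script name are placeholders that you supply yourself. By default the function only prints the command it would run.

```shell
#!/bin/sh
# submit_gpu_job and DRY_RUN are hypothetical names used for illustration.
# Usage: submit_gpu_job GRANT_ID JOB_SCRIPT
submit_gpu_job() {
    grant=${1:-}
    script=${2:-}
    if [ -z "$grant" ] || [ -z "$script" ]; then
        echo "usage: submit_gpu_job GRANT_ID JOB_SCRIPT" >&2
        return 2
    fi
    if [ "${DRY_RUN:-1}" = "1" ]; then
        # Dry run (the default): only show the command line.
        echo "qsub -A $grant -q gpgpu $script"
    else
        # Submit to the gpgpu queue, charging the given grant.
        qsub -A "$grant" -q gpgpu "$script"
    fi
}
```

Setting DRY_RUN=0 before the call submits the job for real. The same -A option can also be added to the interactive compilation command given in the rules.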