Zeus GPGPU:Basics: Difference between revisions

From High Performance Computers at ACK CYFRONET AGH
Revision as of 10:44, 13 March 2013


Name of the access machine

The rules are the same as for the rest of the Zeus cluster; see "Name of the access machine".

Disk resources

The rules are the same as for the rest of the Zeus cluster; see "Disk resources".

Available software

A description of the available software can be found here.

CUDA / OpenCL

To use the CUDA tools, and in particular to set the $CUDADIR, $PATH and $LD_LIBRARY_PATH variables correctly, first load the CUDA module with the command module add cuda.
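After loading the module, a quick sanity check can confirm that the environment is usable. The sketch below is only an illustration: the helper name check_cuda_env is hypothetical, and it assumes that the cuda module puts the nvcc compiler on $PATH, which the page does not state explicitly.

```shell
#!/bin/sh
# Hypothetical helper: verify the environment set up by "module add cuda".
# Assumption: the module exports CUDADIR and adds nvcc to PATH.
check_cuda_env() {
    if [ -z "${CUDADIR:-}" ]; then
        echo "CUDADIR is not set; run: module add cuda" >&2
        return 1
    fi
    if ! command -v nvcc >/dev/null 2>&1; then
        echo "nvcc not found in PATH" >&2
        return 1
    fi
    echo "CUDA found in $CUDADIR"
}
```

A typical use would be to run module add cuda followed by check_cuda_env at the start of an interactive session on a compute node.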

Running jobs

Batch system: Torque

Resource manager: Moab

Running jobs requires knowledge of the basic commands of the batch system.

In addition to the usual batch system parameters, you must specify the gpus parameter, which states how many graphics cards are requested on a single compute node. Software often requires a graphics card to be available to multiple processes. In that case, set its working mode to shared by specifying gpus=1:shared or gpus=2:shared. The other possible modes are exclusive_thread and exclusive_process.
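The resource string passed to qsub -l follows the pattern used in the example scripts on this page (nodes, ppn, gpus, then an optional mode). As a rough sketch, a small helper can assemble it and reject unknown modes; the function name gpu_resource is hypothetical and not part of Torque.

```shell
#!/bin/sh
# Hypothetical helper: build the value for "qsub -l" from the node count,
# cores per node, GPUs per node and an optional GPU working mode.
gpu_resource() {
    nodes=$1
    ppn=$2
    gpus=$3
    mode=${4:-}
    # Only the modes named on this page are accepted.
    case "$mode" in
        ""|shared|exclusive_thread|exclusive_process) ;;
        *) echo "unknown GPU mode: $mode" >&2; return 1 ;;
    esac
    spec="nodes=$nodes:ppn=$ppn:gpus=$gpus"
    if [ -n "$mode" ]; then
        spec="$spec:$mode"
    fi
    echo "$spec"
}
```

For example, qsub -q gpgpu -l "$(gpu_resource 1 2 2 shared)" job.sh would request two shared cards on one node, where job.sh stands for your own job script.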

If necessary, you can check the identifiers of the graphics cards allocated by the batch system by viewing the contents of the file indicated by $PBS_GPUFILE.
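Inside a job, the allocation can be inspected with a few lines of shell. This is a minimal sketch: the function name show_gpus is hypothetical, and it assumes that the file lists one allocated card per line (Torque typically writes entries of the form hostname-gpuN, but check the file on Zeus to be sure).

```shell
#!/bin/sh
# Hypothetical helper: print the GPU entries allocated to the job and count them.
# Assumption: the GPU file contains one allocated card per line.
show_gpus() {
    gpufile="${1:-${PBS_GPUFILE:-}}"
    if [ -z "$gpufile" ] || [ ! -r "$gpufile" ]; then
        echo "no readable GPU file" >&2
        return 1
    fi
    cat "$gpufile"
    # Count non-empty lines only.
    echo "allocated GPUs: $(grep -c . "$gpufile")"
}

# Run automatically only when the batch system has set PBS_GPUFILE.
if [ -n "${PBS_GPUFILE:-}" ]; then
    show_gpus
fi
```

Outside a job the variable is not set, so the snippet does nothing unless show_gpus is given a file name explicitly.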

Queues description

Queue name | Maximum number of jobs per user | Maximum job duration | Additional information
gpgpu | 16 | no limit | two GPGPU cards in every node

Description of compute nodes properties

Property | Description
mhz2933 | processor speed
mem74gb | total amount of RAM on the compute server
n2-2 | location, not relevant for computing
gpgpu | indicates that the node has a GPGPU card

Example scripts for batch system

TeraChem (parallelization for two graphic cards)

#!/bin/sh

# TeraChem can run on a single node only
#PBS -l nodes=1:ppn=2:terachem:gpus=2:exclusive_process

#PBS -N sample_terachem
#PBS -q gpgpu

cd $PBS_O_WORKDIR

# initializing proper environment for TeraChem
module add gpu/terachem

# actual job
$TERACHEMRUN ch.inp > ch.log

TeraChem (parallelization for eight graphic cards)

#!/bin/sh

# TeraChem can run on a single node only
#PBS -l nodes=1:ppn=8:terachem:gpus=8:exclusive_process

#PBS -N sample_terachem
#PBS -q gpgpu

cd $PBS_O_WORKDIR

# initializing proper environment for TeraChem
module add gpu/terachem

# actual job
$TERACHEMRUN ch.inp > ch.log

NAMD

#!/bin/sh

#PBS -l nodes=3:ppn=12:gpus=2:shared
#PBS -N sample_namd
#PBS -q gpgpu

cd $PBS_O_WORKDIR

# initializing proper environment for NAMD with GPU support
module add gpu/namd

# actual job
runnamd stmv.namd > stmv2_2x2.log 

GAMESS

#!/bin/sh

# the number of GPUs requested
# at the moment it must be set to 2 per node
#PBS -l nodes=2:ppn=4:gpus=2:exclusive_process

#PBS -N sample_gamess 
#PBS -q gpgpu

# changing directory to the one from which the job is submitted
cd $PBS_O_WORKDIR

# initializing proper environment for GAMESS with GPU support
module add gpu/gamess

# actual job
rungms noq15 > noq15.log 2>&1


GAMESS (older version)

#!/bin/sh

# this version of GAMESS is single-node only
#PBS -l nodes=1:ppn=4:gpus=2:exclusive_process

#PBS -N sample_gamess 
#PBS -q gpgpu

# changing directory to the one from which the job is submitted
cd $PBS_O_WORKDIR

# initializing proper environment for GAMESS 2010.R1 with GPU support
module add gpu/gamess/2010.R1

# actual job
$GMSRUN noq15 > noq15.log 2>&1

Rules on GPGPU part of Zeus

  • Only jobs using GPUs can be started
  • Performing computations on the access server is strictly prohibited. Administrators will terminate such jobs without warning.
  • To compile a program, use the following command: qsub -I -q gpgpu -l nodes=1:ppn=1:gpus=1
    This command logs the user in to a compute node, where the compilation should be done.
  • To run computations, you must specify a grant identifier using the "-A" parameter.

A detailed description of grants can be found here.
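The rules above can be combined into a job script skeleton. The following is a sketch only: make_job_script is a hypothetical helper, and the grant name passed to it is a placeholder for your own grant identifier. It relies on the fact that qsub options such as -A can also be given as #PBS directives inside the script.

```shell
#!/bin/sh
# Hypothetical helper: print a minimal job script for the gpgpu queue.
# The grant identifier is supplied by the caller; it is not a real default.
make_job_script() {
    grant=$1
    cat <<EOF
#!/bin/sh
#PBS -q gpgpu
#PBS -A $grant
#PBS -l nodes=1:ppn=1:gpus=1

cd \$PBS_O_WORKDIR
cat \$PBS_GPUFILE
EOF
}
```

For example, make_job_script mygrant > job.sh writes the skeleton to a file, which can then be submitted with qsub job.sh.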