DPLACE(1) DPLACE(1)
NAME
dplace - a NUMA memory placement tool
SYNOPSIS
dplace [-place placement_file]
[-datapagesize n-bytes]
[-stackpagesize n-bytes]
[-textpagesize n-bytes]
[-migration threshold]
[-propagate]
[-mustrun]
[-v[erbose]]
program [program-arguments]
DESCRIPTION
The given program is executed after placement policies are set up
according to command line arguments and the specifications described in
placement_file.
OPTIONS
-place placement_file
Placement information is read from placement_file. If this argument
is omitted, no input file is read. See dplace(5) for the correct
placement file format.
-datapagesize n-bytes
Data and heap page sizes will be of size n-bytes. Valid page sizes
are 16k multiplied by a non-negative integer power of 4, up to a
maximum size of 16m: that is, 16k, 64k, 256k, 1m, 4m, and 16m.
-stackpagesize n-bytes
Stack page sizes will be of size n-bytes. Valid page sizes are 16k
multiplied by a non-negative integer power of 4, up to a maximum
size of 16m: that is, 16k, 64k, 256k, 1m, 4m, and 16m.
-textpagesize n-bytes
Text page sizes will be of size n-bytes. Valid page sizes are 16k
multiplied by a non-negative integer power of 4, up to a maximum
size of 16m: that is, 16k, 64k, 256k, 1m, 4m, and 16m.
-migration threshold
Page migration threshold is set to threshold. This value specifies
the maximum percentage difference between the number of remote
memory accesses and local memory accesses (relative to maximum
counter values) for a given page before a migration request event
occurs. A special argument of 0 will turn page migration off.
-propagate
Migration and page size information will be inherited by descendants
that are exec'ed.
-mustrun
When threads are attached to memories or cpus, they are required to
run there.
-verbose or -v
Detailed diagnostic information is written to standard error.
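The following shell sketch illustrates one reading of the -migration
description above: a migration request occurs when the difference between
remote and local accesses, taken as a percentage of the maximum counter
value, exceeds the threshold. The function name should_migrate, its
arguments, and the sample counter values are assumptions made for
illustration; the actual comparison is performed by the system, not by
dplace or the shell.

```shell
# should_migrate REMOTE LOCAL COUNTER_MAX THRESHOLD
# Succeeds (exit status 0) when the remote/local difference, as a
# percentage of COUNTER_MAX, exceeds THRESHOLD. A THRESHOLD of 0 means
# migration is off. All names here are illustrative assumptions.
should_migrate() {
    remote=$1; local_count=$2; counter_max=$3; threshold=$4
    [ "$threshold" -ne 0 ] || return 1              # 0 turns migration off
    diff_pct=$(( (remote - local_count) * 100 / counter_max ))
    [ "$diff_pct" -gt "$threshold" ]
}

# Sample values only: 900 remote and 100 local accesses, counter maximum 1000.
if should_migrate 900 100 1000 50; then
    echo "migration requested"
else
    echo "no migration"
fi
```

Under this reading, a lower non-zero threshold makes migration requests
more frequent, and a threshold of 0 never requests one.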
EXAMPLE
To place data according to the file placement_file for the executable
a.out that would normally be run by:
% a.out < in > out
one would simply run:
% dplace -place placement_file a.out < in > out
An example placement file placement_file, when a.out is two-threaded,
might look like:
# placementfile
memories 2 in topology cube # set up 2 memories which are close
threads 2 # number of threads
run thread 0 on memory 1 # run the first thread on the 2nd memory
run thread 1 on memory 0 # run the 2nd thread on the first memory
This specification requests 2 nearby memories from the operating
system. At creation, the threads are requested to run on an available cpu
that is local to the specified memory. As data and stack space are
touched or faulted in, physical memory is allocated from the memory that
is local to the thread that initiated the fault.
This can be written in a scalable way for a variable number of threads
using the environment variable NP as follows:
# scalable placementfile
memories $NP in topology cube # set up memories which are close
threads $NP # number of threads
# run the last thread on the first memory etc.
distribute threads $NP-1:0:-1 across memories
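To see which threads a range such as $NP-1:0:-1 names, the following
shell sketch expands a start:stop:stride triple and pairs each thread with
a memory in order. It assumes the range is inclusive at both ends and that
threads are assigned to memories 0, 1, 2, ... in the order listed, as the
comment in the placement file suggests; see dplace(5) for the
authoritative semantics. The function name expand_range is hypothetical.

```shell
# expand_range START STOP STRIDE
# Print "thread T -> memory M" for each thread in the inclusive range,
# pairing threads with memories 0, 1, 2, ... in listed order.
# expand_range is an illustrative helper, not part of dplace.
expand_range() {
    t=$1; stop=$2; stride=$3; m=0
    [ "$stride" -ne 0 ] || return 1      # a zero stride would never end
    while :; do
        if [ "$stride" -gt 0 ]; then
            [ "$t" -le "$stop" ] || break
        else
            [ "$t" -ge "$stop" ] || break
        fi
        echo "thread $t -> memory $m"
        t=$((t + stride)); m=$((m + 1))
    done
}

NP=4                                     # example value only
expand_range $((NP - 1)) 0 -1
```

With NP set to 4, this lists thread 3 on memory 0 down to thread 0 on
memory 3, matching "run the last thread on the first memory".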
USING MPI
Most MPI implementations use $MPI_NP+1 threads, where the first
thread is mainly inactive, so one might use the placement file:
# scalable placementfile for MPI
memories ($MPINP + 1)/2 in topology cube # set up memories which are close
threads $MPINP + 1 # number of threads
# ignore the lazy thread
distribute threads 1:$MPINP across memories
When using MPI with dplace, users should set MPI_NP to the appropriate
number of threads and run their dynamic executable directly from dplace;
do not use mpirun.
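As a quick check of the expressions in the MPI placement file above, the
following shell sketch evaluates them for one value. The variable is
spelled MPINP as in that file. The sketch assumes that "/" in a placement
file is integer division, as in shell arithmetic, which this page does not
state; the value 7 is only an example.

```shell
# Evaluate the expressions from the MPI placement file for one example value.
# Assumption: "/" is integer division, as in shell arithmetic.
MPINP=7                                  # example value only
memories=$(( (MPINP + 1) / 2 ))          # memories ($MPINP + 1)/2
threads=$(( MPINP + 1 ))                 # threads $MPINP + 1
echo "memories $memories"
echo "threads $threads"
echo "distributed threads: 1 through $MPINP"
```

For this value the file would request 4 memories for 8 threads, with
thread 0 (the mainly inactive one) left out of the distribution.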
LARGE PAGES
Some applications run more efficiently using large pages. To run a
program a.out utilizing 64k pages for both stack and data, a placement
file is not necessary. One need only invoke the command:
dplace -datapagesize 64k -stackpagesize 64k a.out
from the shell.
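The sizes accepted by -datapagesize, -stackpagesize, and -textpagesize
follow the rule given under OPTIONS: 16k multiplied by a power of 4, up to
16m. The following shell sketch enumerates them in kilobytes. The function
name valid_page_sizes_kb is hypothetical and is not part of dplace.

```shell
# Print every page size allowed by the rule "16k * 4^n, up to 16m",
# expressed in kilobytes, one per line.
# valid_page_sizes_kb is an illustrative helper, not a dplace feature.
valid_page_sizes_kb() {
    size_kb=16                      # smallest page size: 16k
    max_kb=$((16 * 1024))           # largest page size: 16m = 16384k
    while [ "$size_kb" -le "$max_kb" ]; do
        echo "$size_kb"
        size_kb=$((size_kb * 4))    # next power of 4
    done
}

valid_page_sizes_kb
```

The six values printed (16, 64, 256, 1024, 4096, 16384) correspond to
16k, 64k, 256k, 1m, 4m, and 16m.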
PHYSICAL PLACEMENT
Physical placement can also be accomplished using dplace. The following
placement file:
# physical placementfile for 3 specific memories and 6 threads
memories 3 in topology physical near \
/hw/module/2/slot/n4/node \
/hw/module/3/slot/n2/node \
/hw/module/4/slot/n3/node
threads 6
# the first two threads (0 & 1) will run on /hw/module/2/slot/n4/node
# the second two threads (2 & 3) will run on /hw/module/3/slot/n2/node
# the last two threads (4 & 5) will run on /hw/module/4/slot/n3/node
distribute threads across memories
specifies three physical nodes using the proper /hw path. To find out the
names of the memory nodes on the machine you are using, type "find /hw
-name node -print" at the shell command prompt.
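The comments in the placement file above describe a block distribution:
consecutive threads share a memory. The following shell sketch reproduces
that mapping for 6 threads and 3 memories. It assumes the thread count is
a multiple of the memory count; how dplace handles other cases is not
covered here. The function name block_distribution is hypothetical.

```shell
# block_distribution THREADS MEMORIES
# Show a block distribution of threads across memories:
# consecutive threads share a memory, as in the placement file comments.
# Assumption: the thread count is a multiple of the memory count.
block_distribution() {
    threads=$1; memories=$2
    [ "$memories" -gt 0 ] && [ $((threads % memories)) -eq 0 ] || return 1
    per_memory=$((threads / memories))
    t=0
    while [ "$t" -lt "$threads" ]; do
        echo "thread $t -> memory $((t / per_memory))"
        t=$((t + 1))
    done
}

block_distribution 6 3
```

Memory 0 here stands for the first node listed in the placement file,
memory 1 for the second, and memory 2 for the third.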
DEFAULTS
If command line arguments are omitted, dplace chooses the following set
of defaults:
place /dev/null
datapagesize 16k
stackpagesize 16k
textpagesize 16k
migration off
propagate off
mustrun off
verbose off
RESTRICTIONS
Programs must be dynamic executables; the behavior of non-shared
executables is unaffected by dplace. Placement files only affect direct
descendants of dplace. Parallel applications must be based on the
sproc(2) or fork(2) mechanism. Page sizes for regions that are not
stack, text, or data cannot be specified with dplace (e.g., SYSV shared
memory). Regions shared by multiple processes (e.g., DSO text) are faulted
in with the pagesize settings of the faulting process. Dplace sets the
environment variable _DSM_OFF, which disables libmp's own DSM
directives and environment variables.
ENVIRONMENT
Dplace recognizes and uses the environment variables PAGESIZE_DATA,
PAGESIZE_STACK, and PAGESIZE_TEXT. When using these variables, note
that the units are kilobytes. A command line option overrides the
corresponding environment variable setting.
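Because the environment variables take kilobytes while the command line
options are written as 64k or 1m, a conversion is easy to get wrong. The
following shell sketch converts a command-line style size to kilobytes
before exporting it. It assumes the variables accept a bare number such as
64; the function name to_kilobytes is hypothetical.

```shell
# to_kilobytes SIZE
# Convert a page size written as on the command line (e.g. 64k, 1m)
# to the kilobyte count used by the PAGESIZE_* variables.
# to_kilobytes is an illustrative helper, not part of dplace.
to_kilobytes() {
    case $1 in
        *k) echo "${1%k}" ;;                      # already in kilobytes
        *m) echo $(( ${1%m} * 1024 )) ;;          # 1m = 1024k
        *)  echo "unrecognized size: $1" >&2; return 1 ;;
    esac
}

PAGESIZE_DATA=$(to_kilobytes 64k)     # 64
PAGESIZE_STACK=$(to_kilobytes 1m)     # 1024
export PAGESIZE_DATA PAGESIZE_STACK
echo "PAGESIZE_DATA=$PAGESIZE_DATA PAGESIZE_STACK=$PAGESIZE_STACK"
```

A -datapagesize or -stackpagesize option given on the command line would
still take precedence over these settings.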
ERRORS
If errors are encountered in the placement file, dplace will print a
diagnostic message to standard error specifying where the error occurred
in the placement file and abort execution.
SEE ALSO
dplace(3), dplace(5), dprof(1), numa(5), mmci(5), dlook(1)