William Stonewall Monroe / horovod-environment
Commit 5710770e, authored Jan 23, 2019 by William Stonewall Monroe
Add new job file for running horovod benchmarks
parent 74f55c69
Changes: 1
horovod-benchmark.job (new file, 0 → 100644)
#!/bin/bash
#SBATCH --share
#SBATCH --partition=pascalnodes
#SBATCH --exclusive
#
# Name your job to make it easier for you to track
#
#SBATCH --job-name=keras_mpi
#
# Set your error and output files
#
#SBATCH --error=keras_resnet_3node_mpi.err
#SBATCH --output=keras_resnet_3node_mpi.out
#SBATCH --ntasks=12
#SBATCH --gres=gpu:4
#SBATCH -N3
#
# Tell the scheduler you only need 12 hours
#
#SBATCH --time=12:00:00
#SBATCH --cpus-per-task=2
#SBATCH --mem-per-cpu=16000
#
# Set your email address and request notification if your job fails
# Note: sbatch does not expand $USER in #SBATCH lines; use a literal address
# or pass --mail-user on the sbatch command line.
#
#SBATCH --mail-type=FAIL
#SBATCH --mail-user=$USER@uab.edu
module load Anaconda3/5.2.0
module load cuda91/toolkit
module load OpenMPI/3.1.2-gcccuda-2018b
source activate distributedLearning
time mpirun -np $SLURM_NTASKS -bind-to none -map-by slot -mca pml ob1 -mca btl_tcp_if_include ib0 python /data/user/$USER/benchmarks/scripts/tf_cnn_benchmarks/tf_cnn_benchmarks.py --model resnet101 --batch_size 64 --variable_update horovod
\ No newline at end of file
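The job file requests 3 nodes (`-N3`), 4 GPUs per node (`--gres=gpu:4`), and 12 tasks (`--ntasks=12`), which works out to one MPI rank per GPU. A minimal sketch of a pre-flight check for that arithmetic is below; it could run just before the `mpirun` line. The helper name `check_rank_layout` is hypothetical and not part of this commit, and the GPU count is passed by hand, on the assumption that it matches the `--gres` request.

```shell
#!/bin/bash
# Hypothetical pre-flight helper (not part of the original job file).
# Succeeds only when the task count equals nodes * GPUs per node,
# i.e. one MPI rank per requested GPU.
check_rank_layout() {
    local ntasks=$1 nodes=$2 gpus_per_node=$3
    local expected=$(( nodes * gpus_per_node ))
    if [ "$ntasks" -ne "$expected" ]; then
        echo "expected $expected tasks ($nodes nodes x $gpus_per_node GPUs), got $ntasks" >&2
        return 1
    fi
    echo "layout ok: $ntasks ranks on $nodes nodes"
}

# Inside a job, Slurm sets SLURM_NTASKS and SLURM_JOB_NUM_NODES; the
# fallbacks mirror the values requested in the job file (12 tasks, 3 nodes).
# The GPU count (4) is copied by hand from --gres=gpu:4.
check_rank_layout "${SLURM_NTASKS:-12}" "${SLURM_JOB_NUM_NODES:-3}" 4
```

If the node count or the `--gres` value changes later, a check like this fails early instead of letting `mpirun` start a rank layout that does not match the GPUs actually allocated.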