Commit 35977806 authored by William Stonewall Monroe

Update README.md to use more standard markup for command line instructions

parent bd04a2e8
@@ -13,46 +13,64 @@ cd horovod-environment
# request GPU resources (one way of doing it); this needs to be done every time
`sinteractive --ntasks=8 --time=08:00:00 --exclusive --partition=pascalnodes -N2 --gres=gpu:4`
```
sinteractive --ntasks=8 --time=08:00:00 --exclusive --partition=pascalnodes -N2 --gres=gpu:4
```
# load modules; this needs to be done every time
`module load Anaconda3/5.2.0`
```
module load Anaconda3/5.2.0
`module load cuda91`
module load cuda91
`module load OpenMPI/3.1.2-gcccuda-2018b`
module load OpenMPI/3.1.2-gcccuda-2018b
```
# create anaconda environment
Download distribLearn2.yml from this repo
`conda env create -f distribLearn2.yml --name distributedLearning`
```
conda env create -f distribLearn2.yml --name distributedLearning
```
## source activate env needs to be done every time
`source activate distributedLearning`
```
source activate distributedLearning
```
These next 3 steps only need to be done once, to set up the env
`conda update automat`
```
conda update automat
`pip uninstall horovod`
pip uninstall horovod
`pip install --no-cache-dir horovod`
pip install --no-cache-dir horovod
```
# navigate to an example
`This can be downloaded from https://github.com/uber/horovod`
This can be downloaded from https://github.com/uber/horovod
`cd /data/user/blazerid/horovod-master/examples/`
```
cd /data/user/blazerid/horovod-master/examples/
`mpirun -np 8 -bind-to none -map-by slot -mca pml ob1 -mca btl_tcp_if_include ib0 python keras_mnist.py`
mpirun -np 8 -bind-to none -map-by slot -mca pml ob1 -mca btl_tcp_if_include ib0 python keras_mnist.py
```
# or run benchmarks
`git clone -b cnn_tf_v1.10_compatible https://github.com/tensorflow/benchmarks`
```
git clone -b cnn_tf_v1.10_compatible https://github.com/tensorflow/benchmarks
cd benchmarks/
`cd benchmarks/`
mpirun -np 8 -bind-to none -map-by slot -mca pml ob1 -mca btl_tcp_if_include ib0 python scripts/tf_cnn_benchmarks/tf_cnn_benchmarks.py --model resnet101 --batch_size 64 --variable_update horovod
```
`mpirun -np 8 -bind-to none -map-by slot -mca pml ob1 -mca btl_tcp_if_include ib0 python scripts/tf_cnn_benchmarks/tf_cnn_benchmarks.py --model resnet101 --batch_size 64 --variable_update horovod`
For the resnet101 benchmark test,
running using 4 GPUs across 1 node gives: total images/sec: 491.34
running using 8 GPUs across 2 nodes gives: total images/sec: 915.31
running using 12 GPUs across 3 nodes gives: total images/sec: 1450.00
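
To compare these runs, it can help to express each one as a scaling efficiency relative to the 4-GPU single-node run. The snippet below is a minimal sketch and not part of the original README: the helper name `scaling_efficiency` is hypothetical, and it assumes efficiency is defined as per-GPU throughput divided by the baseline per-GPU throughput.

```shell
# Hypothetical helper (not from the original README).
# Usage: scaling_efficiency GPUS IMAGES_PER_SEC BASE_GPUS BASE_IMAGES_PER_SEC
# Prints per-GPU throughput as a percentage of the baseline per-GPU throughput.
scaling_efficiency() {
  awk -v g="$1" -v r="$2" -v bg="$3" -v br="$4" \
    'BEGIN { printf "%.1f\n", 100 * (r / g) / (br / bg) }'
}

# Figures taken from the resnet101 results above, baseline = 4 GPUs on 1 node.
scaling_efficiency 8 915.31 4 491.34    # prints 93.1
scaling_efficiency 12 1450.00 4 491.34  # prints 98.4
```

Under that definition, the 8-GPU run reaches roughly 93% of ideal linear scaling and the 12-GPU run roughly 98%.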