Scope
Scope may be downloaded from https://github.com/c3sr/scope/releases
A benchmark framework developed by the IBM-ILLINOIS Center for Cognitive Computing Systems Research (C3SR) in collaboration with the IMPACT group at the University of Illinois.
Primary maintainers:
Project Advisors:
- Prof. Wen-mei Hwu (UofI)
- Dr. Jinjun Xiong (IBM Research)
Various benchmark suites using Scope are under development:
- Comm|Scope - CUDA/NUMA data transfer performance (Carl Pearson, UIUC)
- NCCL|Scope - GPU collective communication performance (Sarah Hashash, UIUC)
- Histo|Scope - CUDA histogram techniques (Carl Pearson, UIUC)
- DDL|Scope - IBM Distributed Deep Learning Library benchmarks (Vandana Kulkarni, UIUC)
- TCU|Scope - CUDA/TCU performance primitives (Abdul Dakkak, UIUC)
- FrameworkLayer|Scope - Evaluation of neural network layers across frameworks (Cheng Li and Abdul Dakkak, UIUC)
- CUDNN|Scope - Evaluation of neural network layers using CuDNN (Cheng Li and Abdul Dakkak, UIUC)
- Misc|Scope - experimental or miscellaneous benchmarks
Quick Start
- Install CMake 3.12+
- Clone, check out the latest release, update submodules to match, and build:
git clone https://github.com/c3sr/scope.git --recursive
cd scope
git checkout v1.3.2 # or the latest, `git tag --list`
git submodule update # match benchmark versions
mkdir build && cd build
cmake .. -DENABLE_COMM=ON # or other scopes
make -j`nproc`
./scope --benchmark_list_tests=true # list the available benchmarks
Install CMake >= 3.12
User install of CMake 3.12 (preferred)
If your system has CMake < 3.12, we suggest installing CMake 3.12+ in the user's $HOME directory.
On x86-64, the following will download CMake 3.12.0 and install it in $HOME/software/cmake-3.12.0.
cd /tmp
wget https://cmake.org/files/v3.12/cmake-3.12.0-Linux-x86_64.sh
mkdir -p $HOME/software/cmake-3.12.0
sh cmake-3.12.0-Linux-x86_64.sh --prefix=$HOME/software/cmake-3.12.0 --exclude-subdir
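You will then need to add the CMake bin directory to your PATH, for example by adding the following line to $HOME/.bashrc. Putting it first ensures it is found before any older system CMake:
export PATH="$HOME/software/cmake-3.12.0/bin:$PATH"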
On ppc64le, you will need to download the CMake source from the CMake website and build it.
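After updating your PATH, it can be worth confirming that the shell now finds a new enough CMake. The following is a minimal sketch, assuming a POSIX shell and GNU `sort` (for its `-V` flag); `version_ge` and `check_cmake` are hypothetical helper names and are not part of Scope.

```shell
# version_ge A B: succeed if version A is at least version B.
# Assumes GNU sort, whose -V flag orders version strings.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# check_cmake: report whether the cmake found on PATH is 3.12 or newer.
check_cmake() {
    # `cmake --version` prints "cmake version X.Y.Z" on its first line.
    cmake_found="$(cmake --version 2>/dev/null | head -n 1 | awk '{print $3}')"
    if [ -n "$cmake_found" ] && version_ge "$cmake_found" 3.12; then
        echo "cmake $cmake_found is new enough"
    else
        echo "cmake ${cmake_found:-missing} does not satisfy >= 3.12" >&2
        return 1
    fi
}
```

Run `check_cmake` in a new shell after editing $HOME/.bashrc. If it still reports an old version, an older system CMake is probably earlier in your PATH.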
System install of CMake 3.12
If you do not already know how to do this, it is probably not the right option for you.
First, uninstall any existing system install of CMake.
Then, follow the User install instructions above, but choose a system prefix for the installation.
Compile
To compile the project, make sure nvcc is in your $PATH (it is typically at /usr/local/cuda/bin/nvcc), then run the following commands:
git clone https://github.com/c3sr/scope.git --recursive
cd scope
mkdir -p build
cd build
cmake -DCMAKE_BUILD_TYPE=Release ..
make
The build system uses Hunter to download all dependencies.
If you have trouble downloading dependencies, check to make sure Hunter/CMake can use SSL.
Alternatively, you can forgo Hunter entirely and provide your own dependencies.
You will need to enable the particular scopes that provide the benchmarks you want to run:
Scope | CMake Option |
---|---|
CuDNN | -DENABLE_CUDNN=1 |
NCCL | -DENABLE_NCCL=1 |
Comm | -DENABLE_COMM=1 |
Example | -DENABLE_EXAMPLE=1 (default) |
If you get errors about nvcc not supporting your gcc compiler, you may want to point CMake at a supported host compiler:
cmake -DCMAKE_BUILD_TYPE=Release -DCMAKE_CUDA_HOST_COMPILER=`which gcc-6` ..
You can optionally choose your own CUDA archs that you would like to be compiled:
cmake -DNVCC_ARCH_FLAGS="2.0 2.1 3.0 3.2 3.5 3.7 5.0 5.2 5.3" ..
The accepted syntax is the same as the CUDA_SELECT_NVCC_ARCH_FLAGS syntax in the FindCUDA module.
You can disable or enable individual scopes by setting the corresponding option to 0 or 1:
cmake -DENABLE_MISC=0 ...
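When several scopes are enabled at once, the option list gets long. The sketch below builds it from scope names. It assumes every scope follows the `-DENABLE_<NAME>=1` pattern seen in the table above, which may not hold for scopes not listed there; `scope_flags` is a hypothetical helper, not something Scope provides.

```shell
# scope_flags NAME...: print a -DENABLE_<NAME>=1 option for each scope name.
# The ENABLE_<NAME> pattern is inferred from the options documented above;
# verify the option name for any scope that is not listed there.
scope_flags() {
    for scope_name in "$@"; do
        scope_upper="$(printf '%s' "$scope_name" | tr '[:lower:]' '[:upper:]')"
        printf '%s%s=1 ' '-DENABLE_' "$scope_upper"
    done
}

# Usage (run from the build directory):
#   cmake -DCMAKE_BUILD_TYPE=Release $(scope_flags comm nccl) ..
```

The command substitution is left unquoted on purpose so that each option becomes a separate argument to cmake.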
The submodules should automatically be checked out.
If not, try checking them out yourself:
git submodule update --init --recursive
or, to update the modules to the proper versions:
git submodule update --recursive --remote
Available Benchmarks
The available benchmarks and descriptions are listed here. You can list all the benchmarks with
./scope --benchmark_list_tests=true
You can filter the benchmarks that are run with a regular expression passed to --benchmark_filter.
./scope --benchmark_filter=[regex]
For example, to run only the benchmarks whose names contain SGEMM:
./scope --benchmark_filter=SGEMM
Further controls over the benchmarks are explained by the --help option.
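Before a long run, it can help to preview which benchmark names a regular expression would select. The sketch below filters the output of `--benchmark_list_tests` with `grep -E` as a stand-in. This is only an approximation: the regex dialect used by the benchmark binary may differ from `grep -E` in corner cases, and `preview_filter` is a hypothetical helper name.

```shell
# preview_filter REGEX: keep only the benchmark names on stdin that match REGEX.
# Uses grep -E as an approximation of the matching done by --benchmark_filter.
preview_filter() {
    grep -E -- "$1"
}

# Usage:
#   ./scope --benchmark_list_tests=true | preview_filter '^SGEMM/50/'
#   ./scope --benchmark_filter='^SGEMM/50/'
```

Quoting the expression in single quotes keeps the shell from interpreting characters such as `*`, `|`, or `$` before the binary sees them.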
Run all the benchmarks
This is not generally recommended, as it will take quite some time.
./scope
The above will output to stdout something like
------------------------------------------------------------------------------
Benchmark Time CPU Iterations UserCounters...
------------------------------------------------------------------------------
SGEMM/1000/1/1/-1/1 5 us 5 us 126475 K=1 M=1000 N=1 alpha=-1 beta=1
SGEMM/128/169/1728/1/0 539 us 534 us 1314 K=1.728k M=128 N=169 alpha=1 beta=0
SGEMM/128/729/1200/1/0 1042 us 1035 us 689 K=1.2k M=128 N=729 alpha=1 beta=0
SGEMM/192/169/1728/1/0 729 us 724 us 869 K=1.728k M=192 N=169 alpha=1 beta=0
SGEMM/256/169/1/1/1 9 us 9 us 75928 K=1 M=256 N=169 alpha=1 beta=1
SGEMM/256/729/1/1/1 35 us 35 us 20285 K=1 M=256 N=729 alpha=1 beta=1
SGEMM/384/169/1/1/1 18 us 18 us 45886 K=1 M=384 N=169 alpha=1 beta=1
SGEMM/384/169/2304/1/0 2475 us 2412 us 327 K=2.304k M=384 N=169 alpha=1 beta=0
SGEMM/50/1000/1/1/1 10 us 10 us 73312 K=1 M=50 N=1000 alpha=1 beta=1
SGEMM/50/1000/4096/1/0 6364 us 5803 us 100 K=4.096k M=50 N=1000 alpha=1 beta=0
SGEMM/50/4096/1/1/1 46 us 45 us 13491 K=1 M=50 N=4.096k alpha=1 beta=1
SGEMM/50/4096/4096/1/0 29223 us 26913 us 20 K=4.096k M=50 N=4.096k alpha=1 beta=0
SGEMM/50/4096/9216/1/0 55410 us 55181 us 10 K=9.216k M=50 N=4.096k alpha=1 beta=0
SGEMM/96/3025/1/1/1 55 us 51 us 14408 K=1 M=96 N=3.025k alpha=1 beta=1
SGEMM/96/3025/363/1/0 1313 us 1295 us 570 K=363 M=96 N=3.025k alpha=1 beta=0
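If you only have console output captured in a log, the name and wall-clock time can still be pulled out of each row. This is a minimal sketch that assumes the column layout shown above (name, time, unit, then CPU time); `console_to_csv` is a hypothetical helper, and the JSON output is the more robust route for new runs.

```shell
# console_to_csv: read the console table on stdin and print "name,time,unit"
# for each result row. Rows are recognised by a time unit (ns, us, ms or s)
# in the third column, so the header and separator lines are skipped.
console_to_csv() {
    awk '$3 ~ /^(ns|us|ms|s)$/ { print $1 "," $2 "," $3 }'
}

# Usage:
#   ./scope --benchmark_filter=SGEMM | console_to_csv > sgemm.csv
```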
Output as JSON using
./scope --benchmark_out_format=json --benchmark_out=test.json
or preferably
./scope --benchmark_out_format=json --benchmark_out=`hostname`.json
Repeat benchmark runs with
./scope --benchmark_repetitions=5
Plot Benchmark JSON files
Try the ScopePlot Python package:
pip install scope_plot
On Minsky With PowerAI
cd build && rm -fr * && OpenBLAS=/opt/DL/openblas cmake -DCMAKE_BUILD_TYPE=Release .. -DOpenBLAS=/opt/DL/openblas
Disable CPU frequency scaling
If you see this warning:
***WARNING*** CPU scaling is enabled, the benchmark real time measurements may be noisy and will incur extra overhead.
you might want to disable the CPU frequency scaling while running the benchmark.
On Ubuntu, install the linux-tools package that matches your kernel
apt install linux-tools-$(uname -r)
then
sudo cpupower frequency-set --governor performance
./scope
sudo cpupower frequency-set --governor powersave
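The three commands above can be wrapped so that the governor is restored even when the benchmark exits with an error. The following is a sketch under a few assumptions: a Linux cpufreq sysfs entry for cpu0 is used to read the current governor, all CPUs are assumed to share that governor, and `run_with_performance_governor` is a hypothetical helper name.

```shell
# run_with_performance_governor CMD [ARGS...]: switch to the performance
# governor, run CMD, then restore the governor that was active before.
run_with_performance_governor() {
    gov_file=/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
    # Remember the current governor; fall back to powersave if unreadable.
    if [ -r "$gov_file" ]; then
        previous_gov="$(cat "$gov_file")"
    else
        previous_gov=powersave
    fi
    sudo cpupower frequency-set --governor performance || return 1
    run_status=0
    "$@" || run_status=$?
    sudo cpupower frequency-set --governor "$previous_gov"
    # Report the exit status of CMD rather than that of cpupower.
    return "$run_status"
}

# Usage:
#   run_with_performance_governor ./scope --benchmark_filter=SGEMM
```

The wrapper does not handle interruption: if the run is killed with Ctrl-C, the governor stays at performance until you reset it by hand.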
Run with Docker
Install nvidia-docker, then list the available benchmarks.
nvidia-docker run --rm raiproject/microbench:amd64-latest bench --benchmark_list_tests
You can run benchmarks in the following way (probably with the --benchmark_filter flag).
nvidia-docker run --privileged --rm -v `readlink -f .`:/data -u `id -u`:`id -g` raiproject/microbench:amd64-latest ./numa-separate-process.sh dgx bench /data/sync2
- --privileged is needed to set the NUMA policy if NUMA benchmarks are to be run.
- -v `readlink -f .`:/data maps the current directory into the container as /data.
- --benchmark_out=/data/`hostname`.json tells the bench binary to write the JSON output files to /data in the container, which is mapped to the current directory.
- -u `id -u`:`id -g` tells docker to run as user `id -u` and group `id -g`, which is the current user and group. This means that files that docker produces will be modifiable from the host system without root permission.
Hunter Toolchain File
If some of the third-party code compiled by Hunter needs a different compiler, you can create a CMake toolchain file that sets the CMake variables to be used globally when building that code. You can then pass this file to CMake:
cmake -DCMAKE_TOOLCHAIN_FILE=toolchain.cmake ...
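As an illustration, a toolchain file that only selects the C and C++ compilers could be generated as below. This is a sketch, not a recommended configuration: `write_toolchain` is a hypothetical helper, the compiler paths in the usage line are examples, and `CMAKE_C_COMPILER` and `CMAKE_CXX_COMPILER` are standard CMake variables. Your third-party code may need others.

```shell
# write_toolchain FILE CC CXX: write a small CMake toolchain file to FILE
# that selects CC as the C compiler and CXX as the C++ compiler.
write_toolchain() {
    cat > "$1" <<EOF
# Compilers used for third-party code built through this toolchain file.
set(CMAKE_C_COMPILER "$2")
set(CMAKE_CXX_COMPILER "$3")
EOF
}

# Usage (example compiler names, adjust to what is installed):
#   write_toolchain toolchain.cmake "$(which gcc-6)" "$(which g++-6)"
#   cmake -DCMAKE_TOOLCHAIN_FILE=toolchain.cmake ..
```

Changing the toolchain file usually calls for a fresh build directory, since CMake caches the compiler choice at first configure.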
Adding a new benchmark
If you would like to develop a benchmark suite, read here for more information.
Also, check out Example|Scope for a template to get started.