forked from facebookresearch/param
-
Notifications
You must be signed in to change notification settings - Fork 3
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
fbshipit-source-id: 8e11243d48d1abf2f035216bf1d75e5563c86bcb
- Loading branch information
0 parents
commit 7671b66
Showing
17 changed files
with
3,419 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
[submodule "train/workloads/dlrm"] | ||
path = train/workloads/dlrm | ||
url = https://github.com/facebookresearch/dlrm.git | ||
branch = dist_exp |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,76 @@ | ||
# Code of Conduct | ||
|
||
## Our Pledge | ||
|
||
In the interest of fostering an open and welcoming environment, we as | ||
contributors and maintainers pledge to make participation in our project and | ||
our community a harassment-free experience for everyone, regardless of age, body | ||
size, disability, ethnicity, sex characteristics, gender identity and expression, | ||
level of experience, education, socio-economic status, nationality, personal | ||
appearance, race, religion, or sexual identity and orientation. | ||
|
||
## Our Standards | ||
|
||
Examples of behavior that contributes to creating a positive environment | ||
include: | ||
|
||
* Using welcoming and inclusive language | ||
* Being respectful of differing viewpoints and experiences | ||
* Gracefully accepting constructive criticism | ||
* Focusing on what is best for the community | ||
* Showing empathy towards other community members | ||
|
||
Examples of unacceptable behavior by participants include: | ||
|
||
* The use of sexualized language or imagery and unwelcome sexual attention or | ||
advances | ||
* Trolling, insulting/derogatory comments, and personal or political attacks | ||
* Public or private harassment | ||
* Publishing others' private information, such as a physical or electronic | ||
address, without explicit permission | ||
* Other conduct which could reasonably be considered inappropriate in a | ||
professional setting | ||
|
||
## Our Responsibilities | ||
|
||
Project maintainers are responsible for clarifying the standards of acceptable | ||
behavior and are expected to take appropriate and fair corrective action in | ||
response to any instances of unacceptable behavior. | ||
|
||
Project maintainers have the right and responsibility to remove, edit, or | ||
reject comments, commits, code, wiki edits, issues, and other contributions | ||
that are not aligned to this Code of Conduct, or to ban temporarily or | ||
permanently any contributor for other behaviors that they deem inappropriate, | ||
threatening, offensive, or harmful. | ||
|
||
## Scope | ||
|
||
This Code of Conduct applies within all project spaces, and it also applies when | ||
an individual is representing the project or its community in public spaces. | ||
Examples of representing a project or community include using an official | ||
project e-mail address, posting via an official social media account, or acting | ||
as an appointed representative at an online or offline event. Representation of | ||
a project may be further defined and clarified by project maintainers. | ||
|
||
## Enforcement | ||
|
||
Instances of abusive, harassing, or otherwise unacceptable behavior may be | ||
reported by contacting the project team at <[email protected]>. All | ||
complaints will be reviewed and investigated and will result in a response that | ||
is deemed necessary and appropriate to the circumstances. The project team is | ||
obligated to maintain confidentiality with regard to the reporter of an incident. | ||
Further details of specific enforcement policies may be posted separately. | ||
|
||
Project maintainers who do not follow or enforce the Code of Conduct in good | ||
faith may face temporary or permanent repercussions as determined by other | ||
members of the project's leadership. | ||
|
||
## Attribution | ||
|
||
This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4, | ||
available at https://www.contributor-covenant.org/version/1/4/code-of-conduct.html | ||
|
||
[homepage]: https://www.contributor-covenant.org | ||
|
||
For answers to common questions about this code of conduct, see | ||
https://www.contributor-covenant.org/faq |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,31 @@ | ||
# Contributing to PARAM_Bench | ||
We want to make contributing to this project as easy and transparent as | ||
possible. | ||
|
||
## Pull Requests | ||
We actively welcome your pull requests. | ||
|
||
1. Fork the repo and create your branch from `master`. | ||
2. If you've added code that should be tested, add tests. | ||
3. If you've changed APIs, update the documentation. | ||
4. Ensure the test suite passes. | ||
5. Make sure your code lints. | ||
6. If you haven't already, complete the Contributor License Agreement ("CLA"). | ||
|
||
## Contributor License Agreement ("CLA") | ||
In order to accept your pull request, we need you to submit a CLA. You only need | ||
to do this once to work on any of Facebook's open source projects. | ||
|
||
Complete your CLA here: <https://code.facebook.com/cla> | ||
|
||
## Issues | ||
We use GitHub issues to track public bugs. Please ensure your description is | ||
clear and has sufficient instructions to be able to reproduce the issue. | ||
|
||
Facebook has a [bounty program](https://www.facebook.com/whitehat/) for the safe | ||
disclosure of security bugs. In those cases, please go through the process | ||
outlined on that page and do not file a public issue. | ||
|
||
## License | ||
By contributing to PARAM-Bench, you agree that your contributions will be licensed | ||
under the LICENSE file in the root directory of this source tree. |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,21 @@ | ||
MIT License | ||
|
||
Copyright (c) Facebook, Inc. and its affiliates. | ||
|
||
Permission is hereby granted, free of charge, to any person obtaining a copy | ||
of this software and associated documentation files (the "Software"), to deal | ||
in the Software without restriction, including without limitation the rights | ||
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell | ||
copies of the Software, and to permit persons to whom the Software is | ||
furnished to do so, subject to the following conditions: | ||
|
||
The above copyright notice and this permission notice shall be included in all | ||
copies or substantial portions of the Software. | ||
|
||
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR | ||
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, | ||
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE | ||
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER | ||
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, | ||
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE | ||
SOFTWARE. |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,32 @@ | ||
# PARAM | ||
|
||
PARAM Benchmarks is a repository of communication and compute micro-benchmarks as well as full workloads for evaluating training and inference platforms. | ||
|
||
PARAM complements two broad categories of commonly used benchmarks: | ||
1. C++ based stand-alone compute and communication benchmarks using cuDNN, MKL, NCCL, MPI libraries - e.g. NCCL tests (https://github.com/NVIDIA/nccl-tests), OSU MPI benchmarks (https://mvapich.cse.ohio-state.edu/benchmarks/), and DeepBench (https://github.com/baidu-research/DeepBench). | ||
2. Application benchmarks such as Deep Learning Recommendation Model (DLRM) and the broader MLPerf benchmarks. MLPerf is a de-facto industry standard benchmark covering a wide range of AI workloads including computer vision and natural language processing. Recent addition of DLRM to MLPerf 0.7 is a great step towards making it more representative of FB’s AI workloads. | ||
|
||
Our inital release of PARAM benchmarks focuses on AI training and comprises of: | ||
1. Communication: PyTorch based collective benchmarks across arbitrary message sizes, effectiveness of compute-communication overlap, and DLRM communication patterns in fwd/bwd pass | ||
2. Compute: PyTorch based GEMM, embedding lookup, and linear layer | ||
3. DLRM: tracks the `ext_dist` branch of DRLM benchmark use Facebook's DLRM benchmark (https://github.com/facebookresearch/dlrm). In short, PARAM fully relies on DLRM benchmark for end-to-end workload evaluation; with additional extensions as required for scale-out AI training platforms. | ||
|
||
In essence, PARAM bridges the gap between stand-alone C++ benchmarks and PyTorch/Tensorflow based application benchmarks. This enables us gain deep insights into the inner workings of the system architecture as well as identify framework-level overheads by stressing all subcomponents of a system. | ||
|
||
## Version | ||
|
||
0.1 : Initial release | ||
|
||
## Requirements | ||
|
||
pytorch | ||
future | ||
numpy | ||
|
||
## License | ||
|
||
PARAM benchmarks is released under the MIT license. Please see the [`LICENSE`](LICENSE) file for more information. | ||
|
||
## Contributing | ||
|
||
We actively welcome your pull requests! Please see [`CONTRIBUTING.md`](CONTRIBUTING.md) and [`CODE_OF_CONDUCT.md`](CODE_OF_CONDUCT.md) for more info. |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
torch | ||
future | ||
numpy |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,53 @@ | ||
# PARAM benchmark - Communication benchmarks | ||
|
||
PARAM-Comms is an effort to develop unified benchmarking framework to | ||
characterize training platform backends. | ||
|
||
The PARAM-Comms benchmark offers a single point solution to perform both top-down | ||
(DLRM application) and bottoms-up (collectives) operations for any given | ||
communication backend. | ||
|
||
Currently the benchmark supports Pytorch-NCCL backend, and PyTorch-XLA backend. | ||
|
||
The bottoms-up benchmark (`comms.py`) is designed similar to nccl-tests, and the | ||
top-down benchmark (`dlrm.py`) is similar to opensource dlrm benchmark except it | ||
only implements communication primitives. | ||
|
||
## Usage: | ||
|
||
### The bottoms-up benchmark (`comms.py`) | ||
```bash | ||
mpirun -np <num-processes> -N <processes per node> --hostfile <file contains host list> ./comms.py \ | ||
--master-ip 127.0.0.1 | ||
--b <begin-size> \ | ||
--e <end-size> \ | ||
--n <num-iters> \ | ||
--f <step-factor> \ | ||
--z <blocking/non-blocking> \ | ||
--collective <collective-to-test> | ||
``` | ||
Example: | ||
```bash | ||
mpirun -np 16 -N 8 --hostfile ./hfile ./comms.py --master-ip 127.0.0.1 --b 8 --e 256M --n 100 \ | ||
--f 2 --z 1 --collective all_to_all | ||
``` | ||
|
||
### The top-down benchmark (`dlrm.py`) | ||
```bash | ||
mpirun -np <num-processes> -N <processes per node> --hostfile <file contains host list> ./dlrm.py \ | ||
--master-ip <master-node-ip-address> | ||
--arch-sparse-feature-size <spare-feature-size> \ | ||
--arch-embedding-size <embedding-table-sizes> \ | ||
--arch-mlp-bot <layer-dimensions of bottom layers> \ | ||
--arch-mlp-top <layer-dimensions of top layers> \ | ||
--mini-batch-size <mini-batch-sizes> \ | ||
--num-batches <number-of-batches> | ||
``` | ||
Example: | ||
```bash | ||
mpirun -np 16 -N 8 --hostfile ./hfile ./dlrm.py --master-ip <node-1-ip> --mini-batch-size 32 \ | ||
--num-batches 100 \ | ||
--arch-mlp-bot 1024-256 \ | ||
--arch-sparse-feature-size 64 \ | ||
--arch-embedding-size "10000-10000" | ||
``` |
Oops, something went wrong.