[Documentation] sagemaker-debugger open source documentation pre-launch #506

Status: Open. Wants to merge 49 commits into base: master.

Commits (49)
- aa789c7 update TF 2.2 smdebug features (mchoi8739, Aug 10, 2020)
- df74588 add details (mchoi8739, Aug 10, 2020)
- 2fa0fdb Update code samples/notes for new pySDK and smdebug/add and fix links (mchoi8739, Aug 10, 2020)
- 6857d6c add 'New features' note (mchoi8739, Aug 10, 2020)
- 8be632a minor fix (mchoi8739, Aug 10, 2020)
- d787f4b minor fix (mchoi8739, Aug 10, 2020)
- 6c00d2a fix formatting (mchoi8739, Aug 10, 2020)
- 4b6e0de minor fix (mchoi8739, Aug 10, 2020)
- 54c12ce lint (mchoi8739, Aug 10, 2020)
- 9e079dd lint (mchoi8739, Aug 13, 2020)
- 4afb5fc minor structure change (mchoi8739, Aug 13, 2020)
- 9c20ef2 minor fix (mchoi8739, Aug 13, 2020)
- 293f770 minor fix (mchoi8739, Aug 13, 2020)
- 4996feb incorporate comments (mchoi8739, Aug 13, 2020)
- 782e8c6 incorporate comments / lift limitation note (mchoi8739, Aug 13, 2020)
- aa7fcc5 incorporate comments (mchoi8739, Aug 13, 2020)
- 83ad970 include pypi links (mchoi8739, Aug 13, 2020)
- 3f2beff minor fix (mchoi8739, Aug 13, 2020)
- fd1b1c2 incorporate comments (mchoi8739, Aug 13, 2020)
- 463f0b4 incorporate comments (mchoi8739, Aug 13, 2020)
- 72e48df incorporate comments (mchoi8739, Aug 13, 2020)
- 557eae1 version addition (mchoi8739, Aug 13, 2020)
- 1eee9c6 version addition (mchoi8739, Aug 13, 2020)
- 70a594b Merge branch 'master' of https://github.com/awslabs/sagemaker-debugger (mchoi8739, Aug 31, 2020)
- 19754a1 add details (mchoi8739, Aug 31, 2020)
- dd13c6c add footnote (mchoi8739, Aug 31, 2020)
- cefd9df Merge branch 'master' of https://github.com/awslabs/sagemaker-debugger (mchoi8739, Sep 4, 2020)
- f10a3a1 Merge branch 'master' of https://github.com/awslabs/sagemaker-debugger (mchoi8739, Sep 19, 2020)
- fcc0236 Merge branch 'master' of https://github.com/awslabs/sagemaker-debugger (Jun 7, 2021)
- 7778131 Merge branch 'master' of https://github.com/awslabs/sagemaker-debugger (Jun 23, 2021)
- 9edd714 pre-launch smdebug readthedocs website (Jun 23, 2021)
- 437d9d7 add readthedocs yml (Jun 23, 2021)
- 84edbae rm all warnings (Jun 23, 2021)
- c6a94ea rm pip protobuf (Jun 23, 2021)
- 5d32864 test: try /usr/local dir (mchoi8739, Jun 25, 2021)
- b481d94 Merge branch 'master' of https://github.com/awslabs/sagemaker-debugge… (mchoi8739, Jun 29, 2021)
- 248de9e incorp comments (mchoi8739, Jun 30, 2021)
- f5051b5 Merge branch 'master' of https://github.com/awslabs/sagemaker-debugge… (mchoi8739, Jun 30, 2021)
- d11b76e Trigger Build for testing RTD PR builder (mchoi8739, Jul 1, 2021)
- 2fe16db sync (mchoi8739, Jul 26, 2021)
- d847ffc rm smdebug from env.yml (mchoi8739, Jul 26, 2021)
- 90d84a5 Merge branch 'master' of github.com:awslabs/sagemaker-debugger into w… (mchoi8739, Jul 26, 2021)
- 9e8fac5 Trigger Build (mchoi8739, Jul 26, 2021)
- 7bed697 Trigger Build (mchoi8739, Jul 28, 2021)
- 98b8153 Merge branch 'master' of github.com:awslabs/sagemaker-debugger into w… (mchoi8739, Jul 28, 2021)
- b170fd6 Merge branch 'master' into website (mchoi8739, Feb 7, 2022)
- ccc802c Add unified RTD search to RTD website (#610) (atqy, Aug 16, 2022)
- 4a2746e add licensing information (#612) (atqy, Aug 16, 2022)
- 9cb753d Add RTD Search Filters (#618) (atqy, Sep 1, 2022)
29 changes: 29 additions & 0 deletions .readthedocs.yml
@@ -0,0 +1,29 @@
# .readthedocs.yml
# Read the Docs configuration file
# See https://docs.readthedocs.io/en/stable/config-file/v2.html for details

# Required
version: 2

# Build documentation in the docs/ directory with Sphinx
sphinx:
  configuration: docs/conf.py
  fail_on_warning: false

# Build documentation with MkDocs
#mkdocs:
# configuration: mkdocs.yml

# Optionally build your docs in additional formats such as PDF
#formats:
# - pdf

conda:
  environment: docs/environment.yml

# Optionally set the version of Python and requirements required to build your docs
python:
  version: 3.6
Contributor comment: nit: why Python 3.6? can we use Python 3.9?

  install:
    - method: setuptools
      path: .
19 changes: 19 additions & 0 deletions LICENSE
@@ -173,3 +173,22 @@
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.


======================================================================================
Amazon SageMaker Debugger Subcomponents:

The Amazon SageMaker Debugger Examples project contains subcomponents with separate
copyright notices and license terms. Your use of the source code for
these subcomponents is subject to the terms and conditions of the following
licenses. See licenses/ for text of these licenses.

If a folder hierarchy is listed as subcomponent, separate listings of
further subcomponents (files or folder hierarchies) part of the hierarchy
take precedence.

=======================================================================================
2-clause BSD license
=======================================================================================
docs/_static/kendrasearchtools.js
docs/_templates/search.html
74 changes: 43 additions & 31 deletions README.md
@@ -57,33 +57,45 @@ pip install smdebug
For a complete overview of Amazon SageMaker Debugger to learn how it works, go to the [Use Debugger in AWS Containers](https://docs.aws.amazon.com/sagemaker/latest/dg/debugger-container.html) developer guide.

### AWS Deep Learning Containers with zero code change
Debugger is installed by default in AWS Deep Learning Containers with TensorFlow, PyTorch, MXNet, and XGBoost. The following framework containers enable you to use Debugger with no changes to your training script, by automatically adding [SageMaker Debugger's Hook](docs/api.md#glossary).

The following frameworks are available AWS Deep Learning Containers with the deep learning frameworks for the zero script change experience.
Debugger is installed by default in AWS Deep Learning Containers
(TensorFlow, PyTorch, MXNet) and the SageMaker XGBoost containers. The
training containers are bundled and tested for integration with the
SMDebug library and the entire SageMaker platform.

| Framework | Version |
| --- | --- |
| [TensorFlow](docs/tensorflow.md) | 1.15, 2.1.0, 2.2.0, 2.3.0, 2.3.1 |
Contributor comment: 2.4 and 2.5 are also supported

Author reply: incorporated

| [MXNet](docs/mxnet.md) | 1.6, 1.7 |
| [PyTorch](docs/pytorch.md) | 1.4, 1.5, 1.6 |
| [XGBoost](docs/xgboost.md) | 0.90-2, 1.0-1 ([As a built-in algorithm](docs/xgboost.md#use-xgboost-as-a-built-in-algorithm))|
Comment on lines -67 to -69
Contributor comment: Smdebug is supported on the latest versions of all available DLCs. See page.

Author reply: incorporated

To find a complete list of available Deep Learning Containers, see
[General Framework Containers](https://github.com/aws/deep-learning-containers/blob/master/available_images.md#general-framework-containers) in the AWS Deep Learning Containers
repository.

**Note**: Debugger with zero script change is partially available for TensorFlow v2.1.0. The `inputs`, `outputs`, `gradients`, and `layers` built-in collections are currently not available for these TensorFlow versions.
This enables you to use Debugger with no changes to your training
script; the Debugger hook (`hook-api`) is added automatically.

### AWS training containers with script mode
The following framework versions are available in AWS Deep Learning Containers
for the zero script change experience.

Contributor comment: Are we explaining what the 'zero script change experience' is in the doc? If yes, can we link it here? In other locations, I am seeing lines such as 'no changes to your training script'.

The `smdebug` library supports frameworks other than the ones listed above while using AWS containers with script mode. If you want to use SageMaker Debugger with one of the following framework versions, you need to make minimal changes to your training script.
### Frameworks supported by the SMDebug library

| Framework | Versions |
| --- | --- |
| [TensorFlow](docs/tensorflow.md) | 1.13, 1.14, 1.15, 2.1.0, 2.2.0, 2.3.0, 2.3.1 |
| Keras (with TensorFlow backend) | 2.3 |
| [MXNet](docs/mxnet.md) | 1.4, 1.5, 1.6, 1.7 |
| [PyTorch](docs/pytorch.md) | 1.2, 1.3, 1.4, 1.5, 1.6 |
| [XGBoost](docs/xgboost.md) | 0.90-2, 1.0-1 (As a framework)|
The SMDebug library supports machine learning frameworks for SageMaker
training jobs with script mode and custom training containers. If you
want to use SageMaker Debugger with one of the following framework
versions, you need to make minimal changes to your training script using
the SMDebug library.

| Framework | Versions |
|---------------------------------|------------------------------------------------------------|
| `tensorflow` | 1.13, 1.14, 1.15, 2.1.0, 2.2.0, 2.3.0, 2.3.1, 2.4.1, 2.5.0 |
| Keras (with TensorFlow backend) | 2.3 |
| `mxnet` | 1.4, 1.5, 1.6, 1.7, 1.8 |
| `pytorch` | 1.2, 1.3, 1.4, 1.5, 1.6, 1.8, 1.9 |
| `xgboost` | 0.90-2, 1.0-1, 1.2-1 (As a framework) |
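
For script mode with one of the framework versions above, the change usually amounts to creating an SMDebug hook in the training script and registering the model and loss with it. The following is a minimal, illustrative PyTorch sketch; the output path, save interval, and model are placeholders, and the exact hook options are described in the framework documentation pages.

```python
# Illustrative sketch only: out_dir, save_interval, and the model are placeholders.
import torch.nn as nn
import smdebug.pytorch as smd

model = nn.Linear(10, 1)   # placeholder model
loss_fn = nn.MSELoss()

# Create a hook that saves tensors every 100 steps to a local or S3 output directory.
hook = smd.Hook(
    out_dir="/opt/ml/output/tensors",
    save_config=smd.SaveConfig(save_interval=100),
)
hook.register_module(model)   # capture module parameters and outputs
hook.register_loss(loss_fn)   # capture loss values
```

When the script runs as a SageMaker training job configured through the SageMaker Python SDK, the hook can instead be created from the job configuration with `smd.Hook.create_from_json_file()`, as described in the API documentation.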

### Debugger on custom containers or local machines
You can also fully use the Debugger features in custom containers with the SageMaker Python SDK. Furthermore, `smdebug` is an open source library, so you can install it on your local machine for any advanced use cases that cannot be run in the SageMaker environment and for constructing `smdebug` custom hooks and rules.

You can also fully use the Debugger features in custom containers with
the SageMaker Python SDK. Furthermore, `smdebug` is an open source
library, so you can install it on your local machine for any advanced
use cases that cannot be run in the SageMaker environment and for
constructing `smdebug` custom hooks and rules.
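
As an illustration of the local use case, the sketch below reads tensors that a hook saved during training by using the `smdebug` trials API; the output path and tensor name are placeholders.

```python
# Illustrative sketch only: the path and tensor name are placeholders.
from smdebug.trials import create_trial

# Point at a local out_dir or an s3:// prefix where a hook saved tensors.
trial = create_trial("/tmp/smdebug-output")

print(trial.tensor_names())                  # list everything that was saved
loss = trial.tensor("CrossEntropyLoss:0")    # assumes this tensor was saved
print(loss.steps())                          # steps at which it was captured
print(loss.value(loss.steps()[-1]))          # value at the last captured step
```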

---

@@ -110,10 +122,10 @@ To see a complete list of built-in rules and their functionalities, see [List of
You can use Debugger with your training script on your own container making only a minimal modification to your training script to add Debugger's `Hook`.
For an example template of code to use Debugger on your own container in TensorFlow 2.x frameworks, see [Run Debugger in custom container](#Run-Debugger-in-custom-container).
See the following instruction pages to set up Debugger in your preferred framework.
- [TensorFlow](docs/tensorflow.md)
- [MXNet](docs/mxnet.md)
- [PyTorch](docs/pytorch.md)
- [XGBoost](docs/xgboost.md)
- [TensorFlow](tensorflow.md)
- [MXNet](mxnet.md)
- [PyTorch](pytorch.md)
- [XGBoost](xgboost.md)
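
As a hedged illustration of the kind of change the framework pages describe for TensorFlow 2.x with Keras, the sketch below registers an SMDebug Keras hook as a callback and wraps the optimizer so gradients can be captured; the output directory, model, and data are placeholders.

```python
# Illustrative sketch only: out_dir, model, and data are placeholders.
import numpy as np
import tensorflow as tf
import smdebug.tensorflow as smd

hook = smd.KerasHook(
    out_dir="/opt/ml/output/tensors",
    save_config=smd.SaveConfig(save_interval=10),
)

model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(4,))])
optimizer = hook.wrap_optimizer(tf.keras.optimizers.Adam())  # lets the hook see gradients
model.compile(optimizer=optimizer, loss="mse")

x, y = np.random.rand(64, 4), np.random.rand(64, 1)
model.fit(x, y, epochs=1, callbacks=[hook])  # the hook saves tensors during training
```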

#### Using SageMaker Debugger on custom containers

@@ -177,7 +189,7 @@ When you run the `sagemaker_simple_estimator.fit()` API,
SageMaker automatically monitors your training job with the specified Rules and creates a `CloudWatch` event that tracks the status of each Rule,
so you can take action based on it.

If you want additional configuration and control, see [Running SageMaker jobs with Debugger](docs/sagemaker.md) for more information.
If you want additional configuration and control, see [Running SageMaker jobs with Debugger](sagemaker.md) for more information.
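
The following is a minimal sketch of the kind of configuration described above, assuming the SageMaker Python SDK v2; the entry point, role, instance type, and framework version are placeholders.

```python
# Illustrative sketch only: entry_point, instance type, and versions are placeholders.
import sagemaker
from sagemaker.debugger import Rule, rule_configs
from sagemaker.tensorflow import TensorFlow

rules = [
    Rule.sagemaker(rule_configs.loss_not_decreasing()),
    Rule.sagemaker(rule_configs.vanishing_gradient()),
]

sagemaker_simple_estimator = TensorFlow(
    entry_point="train.py",
    role=sagemaker.get_execution_role(),
    instance_count=1,
    instance_type="ml.p3.2xlarge",
    framework_version="2.3.1",
    py_version="py37",
    rules=rules,                  # Debugger evaluates these rules during training
)
sagemaker_simple_estimator.fit()  # rule statuses are also emitted as CloudWatch events
```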

#### Run Debugger in custom container

@@ -235,23 +247,23 @@ print(f"Loss values during evaluation were {trial.tensor('CrossEntropyLoss:0').v
## SageMaker Debugger in Action
- Through the model pruning process using Debugger and `smdebug`, you can iteratively identify the importance of weights and cut neurons below a threshold you define. This process allows you to train the model with significantly fewer neurons, which means a lighter, more efficient, faster, and cheaper model without compromising accuracy. The following accuracy versus number of parameters graph is produced in Studio. It shows that the model accuracy started at about 0.9 with 12 million parameters (the data point moves from right to left as pruning proceeds), improved during the first few pruning iterations, maintained accuracy until the number of parameters was cut down to 6 million, and started sacrificing accuracy afterwards.

![Debugger Iterative Model Pruning using ResNet](docs/resources/results_resnet.png?raw=true)
![Debugger Iterative Model Pruning using ResNet](resources/results_resnet.png?raw=true)
Debugger provides tools to access the training process in this way and gives you complete control over your model. See the [Using SageMaker Debugger and SageMaker Experiments for iterative model pruning](https://github.com/awslabs/amazon-sagemaker-examples/blob/master/sagemaker-debugger/pytorch_iterative_model_pruning/iterative_model_pruning_resnet.ipynb) notebook for the full example and more information.

- Use Debugger with XGBoost in SageMaker Studio to save feature importance values and plot them in a notebook during training. ![Debugger XGBoost Visualization Example](docs/resources/xgboost_feature_importance.png?raw=true)
- Use Debugger with XGBoost in SageMaker Studio to save feature importance values and plot them in a notebook during training. ![Debugger XGBoost Visualization Example](resources/xgboost_feature_importance.png?raw=true)

- Use Debugger with TensorFlow in SageMaker Studio to run built-in rules and visualize the loss. ![Debugger TensorFlow Visualization Example](docs/resources/tensorflow_rules_loss.png?raw=true)
- Use Debugger with TensorFlow in SageMaker Studio to run built-in rules and visualize the loss. ![Debugger TensorFlow Visualization Example](resources/tensorflow_rules_loss.png?raw=true)

---

## Further Documentation and References

| Section | Description |
| --- | --- |
| [SageMaker Training](docs/sagemaker.md) | SageMaker users, we recommend you start with this page on how to run SageMaker training jobs with SageMaker Debugger |
| Frameworks <ul><li>[TensorFlow](docs/tensorflow.md)</li><li>[PyTorch](docs/pytorch.md)</li><li>[MXNet](docs/mxnet.md)</li><li>[XGBoost](docs/xgboost.md)</li></ul> | See the frameworks pages for details on what's supported and how to modify your training script if applicable |
| [APIs for Saving Tensors](docs/api.md) | Full description of our APIs on saving tensors |
| [Programming Model for Analysis](docs/analysis.md) | For description of the programming model provided by the APIs that enable you to perform interactive exploration of tensors saved, as well as to write your own Rules monitoring your training jobs. |
| [SageMaker Training](sagemaker.md) | SageMaker users, we recommend you start with this page on how to run SageMaker training jobs with SageMaker Debugger |
| Frameworks <ul><li>[TensorFlow](tensorflow.md)</li><li>[PyTorch](pytorch.md)</li><li>[MXNet](mxnet.md)</li><li>[XGBoost](xgboost.md)</li></ul> | See the frameworks pages for details on what's supported and how to modify your training script if applicable |
| [APIs for Saving Tensors](api.md) | Full description of our APIs for saving tensors |
| [Programming Model for Analysis](analysis.md) | Describes the programming model provided by the APIs that enable you to interactively explore saved tensors and write your own Rules to monitor your training jobs. |


## License
20 changes: 20 additions & 0 deletions docs/Makefile
@@ -0,0 +1,20 @@
# Minimal makefile for Sphinx documentation
#

# You can set these variables from the command line, and also
# from the environment for the first two.
SPHINXOPTS ?=
SPHINXBUILD ?= sphinx-build
SOURCEDIR = .
BUILDDIR = _build

# Put it first so that "make" without argument is like "make help".
help:
	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

.PHONY: help Makefile

# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
%: Makefile
	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)