[Documentation] sagemaker-debugger open source documentation pre-launch #506

Open · wants to merge 49 commits into base: master

Changes from 34 commits

Commits (49)
aa789c7
update TF 2.2 smdebug features
mchoi8739 Aug 10, 2020
df74588
add details
mchoi8739 Aug 10, 2020
2fa0fdb
Update code samples/notes for new pySDK and smdebug/add and fix links
mchoi8739 Aug 10, 2020
6857d6c
add 'New features' note
mchoi8739 Aug 10, 2020
8be632a
minor fix
mchoi8739 Aug 10, 2020
d787f4b
minor fix
mchoi8739 Aug 10, 2020
6c00d2a
fix formatting
mchoi8739 Aug 10, 2020
4b6e0de
minor fix
mchoi8739 Aug 10, 2020
54c12ce
lint
mchoi8739 Aug 10, 2020
9e079dd
lint
mchoi8739 Aug 13, 2020
4afb5fc
minor structure change
mchoi8739 Aug 13, 2020
9c20ef2
minor fix
mchoi8739 Aug 13, 2020
293f770
minor fix
mchoi8739 Aug 13, 2020
4996feb
incorporate comments
mchoi8739 Aug 13, 2020
782e8c6
incorporate comments / lift limitation note
mchoi8739 Aug 13, 2020
aa7fcc5
incorporate comments
mchoi8739 Aug 13, 2020
83ad970
include pypi links
mchoi8739 Aug 13, 2020
3f2beff
minor fix
mchoi8739 Aug 13, 2020
fd1b1c2
incorporate comments
mchoi8739 Aug 13, 2020
463f0b4
incorporate comments
mchoi8739 Aug 13, 2020
72e48df
incorporate comments
mchoi8739 Aug 13, 2020
557eae1
version addition
mchoi8739 Aug 13, 2020
1eee9c6
version addition
mchoi8739 Aug 13, 2020
70a594b
Merge branch 'master' of https://github.com/awslabs/sagemaker-debugger
mchoi8739 Aug 31, 2020
19754a1
add details
mchoi8739 Aug 31, 2020
dd13c6c
add footnote
mchoi8739 Aug 31, 2020
cefd9df
Merge branch 'master' of https://github.com/awslabs/sagemaker-debugger
mchoi8739 Sep 4, 2020
f10a3a1
Merge branch 'master' of https://github.com/awslabs/sagemaker-debugger
mchoi8739 Sep 19, 2020
fcc0236
Merge branch 'master' of https://github.com/awslabs/sagemaker-debugger
Jun 7, 2021
7778131
Merge branch 'master' of https://github.com/awslabs/sagemaker-debugger
Jun 23, 2021
9edd714
pre-launch smdebug readthedocs website
Jun 23, 2021
437d9d7
add readthedocs yml
Jun 23, 2021
84edbae
rm all warnings
Jun 23, 2021
c6a94ea
rm pip protobuf
Jun 23, 2021
5d32864
test: try /usr/local dir
mchoi8739 Jun 25, 2021
b481d94
Merge branch 'master' of https://github.com/awslabs/sagemaker-debugge…
mchoi8739 Jun 29, 2021
248de9e
incorp comments
mchoi8739 Jun 30, 2021
f5051b5
Merge branch 'master' of https://github.com/awslabs/sagemaker-debugge…
mchoi8739 Jun 30, 2021
d11b76e
Trigger Build for testing RTD PR builder
mchoi8739 Jul 1, 2021
2fe16db
sync
mchoi8739 Jul 26, 2021
d847ffc
rm smdebug from env.yml
mchoi8739 Jul 26, 2021
90d84a5
Merge branch 'master' of github.com:awslabs/sagemaker-debugger into w…
mchoi8739 Jul 26, 2021
9e8fac5
Trigger Build
mchoi8739 Jul 26, 2021
7bed697
Trigger Build
mchoi8739 Jul 28, 2021
98b8153
Merge branch 'master' of github.com:awslabs/sagemaker-debugger into w…
mchoi8739 Jul 28, 2021
b170fd6
Merge branch 'master' into website
mchoi8739 Feb 7, 2022
ccc802c
Add unified RTD search to RTD website (#610)
atqy Aug 16, 2022
4a2746e
add licensing information (#612)
atqy Aug 16, 2022
9cb753d
Add RTD Search Filters (#618)
atqy Sep 1, 2022
29 changes: 29 additions & 0 deletions .readthedocs.yml
@@ -0,0 +1,29 @@
# .readthedocs.yml
# Read the Docs configuration file
# See https://docs.readthedocs.io/en/stable/config-file/v2.html for details

# Required
version: 2

# Build documentation in the docs/ directory with Sphinx
sphinx:
  configuration: docs/conf.py
  fail_on_warning: false

# Build documentation with MkDocs
#mkdocs:
# configuration: mkdocs.yml

# Optionally build your docs in additional formats such as PDF
#formats:
# - pdf

conda:
  environment: docs/environment.yml

# Optionally set the version of Python and requirements required to build your docs
python:
  version: 3.6
Contributor: nit: why Python 3.6? can we use Python 3.9?

  install:
    - method: setuptools
      path: .
40 changes: 20 additions & 20 deletions README.md
@@ -63,10 +63,10 @@ The following frameworks are available AWS Deep Learning Containers with the dee

| Framework | Version |
| --- | --- |
| [TensorFlow](docs/tensorflow.md) | 1.15, 2.1.0, 2.2.0, 2.3.0, 2.3.1 |
Contributor: 2.4 and 2.5 are also supported

Contributor Author: incorporated

| [MXNet](docs/mxnet.md) | 1.6, 1.7 |
| [PyTorch](docs/pytorch.md) | 1.4, 1.5, 1.6 |
| [XGBoost](docs/xgboost.md) | 0.90-2, 1.0-1 ([As a built-in algorithm](docs/xgboost.md#use-xgboost-as-a-built-in-algorithm))|
Comment on lines -67 to -69
Contributor: Smdebug is supported on the latest versions of all available DLCs. See page.

Contributor Author: incorporated

| [TensorFlow](tensorflow.md) | 1.15, 2.1.0, 2.2.0, 2.3.0, 2.3.1 |
| [MXNet](mxnet.md) | 1.6, 1.7 |
| [PyTorch](pytorch.md) | 1.4, 1.5, 1.6 |
| [XGBoost](xgboost.md) | 0.90-2, 1.0-1 ([As a built-in algorithm](docs/xgboost.md#use-xgboost-as-a-built-in-algorithm))|

**Note**: Debugger with zero script change is partially available for TensorFlow v2.1.0. The `inputs`, `outputs`, `gradients`, and `layers` built-in collections are currently not available for these TensorFlow versions.

@@ -76,11 +76,11 @@ The `smdebug` library supports frameworks other than the ones listed above while

| Framework | Versions |
| --- | --- |
| [TensorFlow](docs/tensorflow.md) | 1.13, 1.14, 1.15, 2.1.0, 2.2.0, 2.3.0, 2.3.1 |
| [TensorFlow](tensorflow.md) | 1.13, 1.14, 1.15, 2.1.0, 2.2.0, 2.3.0, 2.3.1 |
| Keras (with TensorFlow backend) | 2.3 |
| [MXNet](docs/mxnet.md) | 1.4, 1.5, 1.6, 1.7 |
| [PyTorch](docs/pytorch.md) | 1.2, 1.3, 1.4, 1.5, 1.6 |
| [XGBoost](docs/xgboost.md) | 0.90-2, 1.0-1 (As a framework)|
| [MXNet](mxnet.md) | 1.4, 1.5, 1.6, 1.7 |
| [PyTorch](pytorch.md) | 1.2, 1.3, 1.4, 1.5, 1.6 |
| [XGBoost](xgboost.md) | 0.90-2, 1.0-1 (As a framework)|
Contributor: See comment above.

Contributor Author: incorporated


### Debugger on custom containers or local machines
You can also fully use the Debugger features in custom containers with the SageMaker Python SDK. Furthermore, `smdebug` is an open source library, so you can install it on your local machine for any advanced use cases that cannot be run in the SageMaker environment and for constructing `smdebug` custom hooks and rules.
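
For example, when running outside of SageMaker, you can construct a hook directly in your training script and point it at a local output directory. The following is a minimal sketch for a PyTorch script; the output path, save interval, and collection names are illustrative assumptions, not values prescribed by this repository.

```python
# Minimal sketch: creating an smdebug hook manually in a local PyTorch script.
# The output directory, save interval, and collection names are illustrative
# assumptions, not values defined in this repository.
import torch.nn as nn
import smdebug.pytorch as smd

net = nn.Sequential(nn.Linear(10, 10), nn.ReLU(), nn.Linear(10, 2))
criterion = nn.CrossEntropyLoss()

hook = smd.Hook(
    out_dir="/tmp/smdebug-local-run",              # hypothetical local output path
    save_config=smd.SaveConfig(save_interval=10),  # save tensors every 10 steps
    include_collections=["weights", "gradients", "losses"],
)
hook.register_module(net)      # capture the model's weights and gradients
hook.register_loss(criterion)  # capture loss values
# ... run your usual training loop; saved tensors land in out_dir.
```
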
@@ -110,10 +110,10 @@ To see a complete list of built-in rules and their functionalities, see [List of
You can use Debugger with your own container by making only a minimal modification to your training script to add Debugger's `Hook`.
For an example template of code to use Debugger on your own container in TensorFlow 2.x frameworks, see [Run Debugger in custom container](#Run-Debugger-in-custom-container).
See the following instruction pages to set up Debugger in your preferred framework.
- [TensorFlow](docs/tensorflow.md)
- [MXNet](docs/mxnet.md)
- [PyTorch](docs/pytorch.md)
- [XGBoost](docs/xgboost.md)
- [TensorFlow](tensorflow.md)
- [MXNet](mxnet.md)
- [PyTorch](pytorch.md)
- [XGBoost](xgboost.md)

#### Using SageMaker Debugger on custom containers

@@ -177,7 +177,7 @@ When you run the `sagemaker_simple_estimator.fit()` API,
SageMaker will automatically monitor your training job for you with the Rules specified and create a `CloudWatch` event that tracks the status of the Rule,
so you can take any action based on them.

If you want additional configuration and control, see [Running SageMaker jobs with Debugger](docs/sagemaker.md) for more information.
If you want additional configuration and control, see [Running SageMaker jobs with Debugger](sagemaker.md) for more information.
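
As a rough sketch of what such an estimator might look like with the SageMaker Python SDK (the entry point, role, instance type, framework version, and chosen built-in rule below are placeholder assumptions, not values from this PR):

```python
# Hedged sketch: attaching a built-in Debugger rule to a SageMaker training job.
# The entry point, instance type, and framework version are placeholder assumptions.
import sagemaker
from sagemaker.debugger import Rule, rule_configs
from sagemaker.tensorflow import TensorFlow

sagemaker_simple_estimator = TensorFlow(
    entry_point="train.py",                 # hypothetical training script
    role=sagemaker.get_execution_role(),
    instance_count=1,
    instance_type="ml.p3.2xlarge",
    framework_version="2.3.1",
    py_version="py37",
    # Debugger evaluates the rule while the job runs and publishes
    # the rule status as a CloudWatch event.
    rules=[Rule.sagemaker(rule_configs.loss_not_decreasing())],
)

sagemaker_simple_estimator.fit()
```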

#### Run Debugger in custom container

@@ -235,23 +235,23 @@ print(f"Loss values during evaluation were {trial.tensor('CrossEntropyLoss:0').v
## SageMaker Debugger in Action
- Through the model pruning process using Debugger and `smdebug`, you can iteratively identify the importance of weights and cut neurons below a threshold you define. This process allows you to train the model with significantly fewer neurons, which means a lighter, more efficient, faster, and cheaper model without compromising accuracy. The following accuracy versus number of parameters graph is produced in Studio. It shows that the model accuracy started at about 0.9 with 12 million parameters (the data point moves from right to left as the pruning process proceeds), improved during the first few pruning iterations, held its accuracy until the number of parameters was cut down to 6 million, and started sacrificing accuracy afterwards.

![Debugger Iterative Model Pruning using ResNet](docs/resources/results_resnet.png?raw=true)
![Debugger Iterative Model Pruning using ResNet](resources/results_resnet.png?raw=true)
Debugger provides you with tools to access this training process and keep complete control over your model. See the [Using SageMaker Debugger and SageMaker Experiments for iterative model pruning](https://github.com/awslabs/amazon-sagemaker-examples/blob/master/sagemaker-debugger/pytorch_iterative_model_pruning/iterative_model_pruning_resnet.ipynb) notebook for the full example and more information.

- Use Debugger with XGBoost in SageMaker Studio to save feature importance values and plot them in a notebook during training. ![Debugger XGBoost Visualization Example](docs/resources/xgboost_feature_importance.png?raw=true)
- Use Debugger with XGBoost in SageMaker Studio to save feature importance values and plot them in a notebook during training. ![Debugger XGBoost Visualization Example](resources/xgboost_feature_importance.png?raw=true)

- Use Debugger with TensorFlow in SageMaker Studio to run built-in rules and visualize the loss. ![Debugger TensorFlow Visualization Example](docs/resources/tensorflow_rules_loss.png?raw=true)
- Use Debugger with TensorFlow in SageMaker Studio to run built-in rules and visualize the loss. ![Debugger TensorFlow Visualization Example](resources/tensorflow_rules_loss.png?raw=true)

---

## Further Documentation and References

| Section | Description |
| --- | --- |
| [SageMaker Training](docs/sagemaker.md) | SageMaker users, we recommend you start with this page on how to run SageMaker training jobs with SageMaker Debugger |
| Frameworks <ul><li>[TensorFlow](docs/tensorflow.md)</li><li>[PyTorch](docs/pytorch.md)</li><li>[MXNet](docs/mxnet.md)</li><li>[XGBoost](docs/xgboost.md)</li></ul> | See the frameworks pages for details on what's supported and how to modify your training script if applicable |
| [APIs for Saving Tensors](docs/api.md) | Full description of our APIs on saving tensors |
| [Programming Model for Analysis](docs/analysis.md) | For description of the programming model provided by the APIs that enable you to perform interactive exploration of tensors saved, as well as to write your own Rules monitoring your training jobs. |
| [SageMaker Training](sagemaker.md) | SageMaker users, we recommend you start with this page on how to run SageMaker training jobs with SageMaker Debugger |
| Frameworks <ul><li>[TensorFlow](tensorflow.md)</li><li>[PyTorch](pytorch.md)</li><li>[MXNet](mxnet.md)</li><li>[XGBoost](xgboost.md)</li></ul> | See the frameworks pages for details on what's supported and how to modify your training script if applicable |
| [APIs for Saving Tensors](api.md) | Full description of our APIs on saving tensors |
| [Programming Model for Analysis](analysis.md) | For description of the programming model provided by the APIs that enable you to perform interactive exploration of tensors saved, as well as to write your own Rules monitoring your training jobs. |
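
For orientation, a minimal sketch of that analysis programming model with `smdebug` trials might look like the following; the S3 output path is a placeholder assumption, and the tensor name is taken from the README example above.

```python
# Minimal sketch: interactive tensor analysis with smdebug trials.
# The S3 output path below is a placeholder assumption.
from smdebug.trials import create_trial

trial = create_trial("s3://my-bucket/debugger-output/training-job-name")

print(trial.tensor_names(collection="losses"))  # tensors saved in the "losses" collection
loss = trial.tensor("CrossEntropyLoss:0")       # name taken from the README example
last_step = loss.steps()[-1]
print(loss.value(last_step))                    # loss value at the last saved step
```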


## License
20 changes: 20 additions & 0 deletions docs/Makefile
@@ -0,0 +1,20 @@
# Minimal makefile for Sphinx documentation
#

# You can set these variables from the command line, and also
# from the environment for the first two.
SPHINXOPTS ?=
SPHINXBUILD ?= sphinx-build
SOURCEDIR = .
BUILDDIR = _build

# Put it first so that "make" without argument is like "make help".
help:
	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

.PHONY: help Makefile

# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
%: Makefile
	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)