[Documentation] sagemaker-debugger open source documentation pre-launch #506

Open. mchoi8739 wants to merge 49 commits into master.

mchoi8739
Contributor

Description of changes:

readthedocs build log: https://readthedocs.org/projects/sagemaker-debugger/builds/14082688/
pre-launched doc: https://sagemaker-debugger.readthedocs.io/en/website/

Style and formatting:

I have run pre-commit install to ensure that auto-formatting happens with every commit.

Issue number, if available

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

mchoi8739 changed the title from "[Website] sagemaker-debugger open source documentation pre-launch" to "[Documentation] sagemaker-debugger open source documentation pre-launch" on Jun 23, 2021

codecov-commenter commented Jun 23, 2021

Codecov Report

Merging #506 (b170fd6) into master (d864bb4) will decrease coverage by 24.72%.
The diff coverage is n/a.

@@             Coverage Diff             @@
##           master     #506       +/-   ##
===========================================
- Coverage   75.35%   50.62%   -24.73%     
===========================================
  Files         127      117       -10     
  Lines       11117    10590      -527     
===========================================
- Hits         8377     5361     -3016     
- Misses       2740     5229     +2489     
| Impacted Files | Coverage Δ |
| --- | --- |
| smdebug/analysis/utils.py | 23.52% <ø> (-50.99%) ⬇️ |
| smdebug/core/hook.py | 73.73% <ø> (-13.70%) ⬇️ |
| smdebug/exceptions.py | 47.61% <ø> (-16.67%) ⬇️ |
| ...mdebug/profiler/analysis/notebook_utils/heatmap.py | 0.00% <ø> (-13.01%) ⬇️ |
| ...debug/profiler/analysis/python_profile_analysis.py | 90.90% <ø> (ø) |
| ...profiler/analysis/utils/profiler_data_to_pandas.py | 36.07% <ø> (-0.92%) ⬇️ |
| ...er/analysis/utils/python_profile_analysis_utils.py | 88.40% <ø> (ø) |
| smdebug/rules/rule.py | 34.37% <ø> (-50.48%) ⬇️ |
| smdebug/rules/rule_invoker.py | 18.18% <ø> (ø) |
| smdebug/trials/trial.py | 56.82% <ø> (-37.61%) ⬇️ |
| ... and 79 more | |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update d864bb4...b170fd6.

@@ -63,10 +63,10 @@ The following frameworks are available AWS Deep Learning Containers with the dee

| Framework | Version |
| --- | --- |
| [TensorFlow](docs/tensorflow.md) | 1.15, 2.1.0, 2.2.0, 2.3.0, 2.3.1 |
Contributor

2.4 and 2.5 are also supported

Contributor Author

incorporated

Comment on lines -67 to -69
| [MXNet](docs/mxnet.md) | 1.6, 1.7 |
| [PyTorch](docs/pytorch.md) | 1.4, 1.5, 1.6 |
| [XGBoost](docs/xgboost.md) | 0.90-2, 1.0-1 ([As a built-in algorithm](docs/xgboost.md#use-xgboost-as-a-built-in-algorithm))|
Contributor

Smdebug is supported on the latest versions of all available DLCs.

See page.

Contributor Author

incorporated

README.md Outdated
Comment on lines 79 to 83
| [TensorFlow](tensorflow.md) | 1.13, 1.14, 1.15, 2.1.0, 2.2.0, 2.3.0, 2.3.1 |
| Keras (with TensorFlow backend) | 2.3 |
| [MXNet](docs/mxnet.md) | 1.4, 1.5, 1.6, 1.7 |
| [PyTorch](docs/pytorch.md) | 1.2, 1.3, 1.4, 1.5, 1.6 |
| [XGBoost](docs/xgboost.md) | 0.90-2, 1.0-1 (As a framework)|
| [MXNet](mxnet.md) | 1.4, 1.5, 1.6, 1.7 |
| [PyTorch](pytorch.md) | 1.2, 1.3, 1.4, 1.5, 1.6 |
| [XGBoost](xgboost.md) | 0.90-2, 1.0-1 (As a framework)|
Contributor

See comment above.

Contributor Author

incorporated

Contributor

ndodda-amazon left a comment

Left mostly nits, otherwise looks good.


# Optionally set the version of Python and requirements required to build your docs
python:
  version: 3.6
Contributor

nit: why Python 3.6? can we use Python 3.9?

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

You can use Debugger with your training script on your own container
making only a minimal modification to your training script to add
Contributor

nit: by making

Using SageMaker Debugger on custom containers
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Debugger is available for any deep learning models that you bring to
Contributor

nit: "any deep learning model" or "all deep learning models"

~~~~~~~~~~~~~~~~~~~~

Below is a comprehensive list of the built-in collections that are
managed by SageMaker Debugger. The Hook identifes the tensors that
Contributor

nit: identifies

``XGBoost`` METRICS
============== ===========================

If for some reason, you want to disable the saving of these collections,
Contributor

We should tell customers to set debugger_hook_config=False in the estimator; this is a simpler alternative.
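
A minimal sketch of this suggestion, assuming the SageMaker Python SDK estimator accepts `debugger_hook_config=False` as the comment states. The helper name, script name, role ARN, and instance settings are hypothetical placeholders, and the estimator call is left commented out because it needs the `sagemaker` package and AWS credentials.

```python
def without_debugger_hook(estimator_kwargs):
    """Return a copy of the estimator arguments with the Debugger hook disabled."""
    kwargs = dict(estimator_kwargs)
    # Per the review comment: False is meant to turn off saving of the collections.
    kwargs["debugger_hook_config"] = False
    return kwargs


# Hypothetical estimator arguments, for illustration only.
base_kwargs = {
    "entry_point": "train.py",
    "role": "arn:aws:iam::111122223333:role/ExampleRole",
    "instance_count": 1,
    "instance_type": "ml.m5.xlarge",
}
kwargs = without_debugger_hook(base_kwargs)

# With the SDK installed, the arguments would be passed along these lines:
# from sagemaker.tensorflow import TensorFlow
# estimator = TensorFlow(framework_version="2.3.1", py_version="py37", **kwargs)
```

The helper copies the arguments rather than mutating them, so the same base settings can still be reused for a job that keeps Debugger enabled.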

name="weights",
parameters={ "parameter": "value" })

The parameters can be one of the following. The meaning of these
Contributor

nit: The meaning of these parameters -> These parameters

- docutils==0.15.2
- bokeh
- ipython
- pandas
Contributor

Should we pin the versions of bokeh, ipython, and pandas here?
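
If pinning is wanted, one way to obtain candidate pins is to read the versions from an environment where the docs already build. The sketch below is only an illustration: `pin_installed` is a hypothetical helper, and it uses the standard-library `importlib.metadata` (Python 3.8+), so it reports whatever happens to be installed locally rather than versions endorsed by this PR.

```python
from importlib.metadata import PackageNotFoundError, version


def pin_installed(names):
    """Return 'name==version' for each installed package, or the bare name if it is missing."""
    pinned = []
    for name in names:
        try:
            pinned.append(f"{name}=={version(name)}")
        except PackageNotFoundError:
            # Not installed in this environment, so there is no version to pin.
            pinned.append(name)
    return pinned


# Print the entries in the same list style as the requirements shown above.
for line in pin_installed(["bokeh", "ipython", "pandas"]):
    print(f"- {line}")
```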

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The available ``hook_parameters`` keys are listed in the following. The meaning
of these parameters will be clear as you review the sections of
Contributor

nit: The meaning of these parameters -> These parameters

.. method:: create_from_json_file(json_file_path (str)

Takes the path of a file which holds the json configuration of the hook,
and creates hook from that configuration. This is an optional parameter.
Contributor

nit: creates a hook
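
For readers unfamiliar with the method under review, a rough sketch of how a JSON hook configuration might be written and then loaded. The key names in `hook_config` (`LocalPath`, `HookParameters`, `CollectionConfigurations`, `CollectionName`) and the output path are assumptions that this thread does not confirm, and the hook call is commented out because it needs the `smdebug` package.

```python
import json
import os
import tempfile

# Assumed layout of the hook configuration; the key names are not confirmed by this thread.
hook_config = {
    "LocalPath": "/tmp/smdebug_out",
    "HookParameters": {"save_interval": "100"},
    "CollectionConfigurations": [{"CollectionName": "weights"}],
}

# Write the configuration to a temporary file so the example leaves no stray files behind.
config_path = os.path.join(tempfile.mkdtemp(), "hook_config.json")
with open(config_path, "w") as f:
    json.dump(hook_config, f, indent=2)

# With smdebug installed, the hook would be created along these lines:
# import smdebug.pytorch as smd
# hook = smd.Hook.create_from_json_file(json_file_path=config_path)
```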


### AWS training containers with script mode
The following frameworks are available AWS Deep Learning Containers with
the deep learning frameworks for the zero script change experience.

Contributor

Are we explaining what 'zero script change experience' means in the doc? If yes, can we link it here?
In other locations, I am seeing lines such as 'no changes to your training script'.


However, for some advanced use cases where you need access to customized
tensors from targeted parts of a training script, you can manually
construct the hook object. The SMDebug library provides hook classes to
Contributor

Are we using 'SMDebug' consistently? In other locations I see it written as 'smdebug'.

Support
-------

- Zero Script Change experience where you need no modifications to your
Contributor

Same as above. If we are introducing a new term, 'Zero script change experience', it needs to be explained somewhere.

Migration to Deep Learning Containers
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

TBD
Contributor

Why TBD?

mchoi8739 and others added 4 commits February 7, 2022 14:35
* add unified search to RTD website

* configure rtd build environment to make it functional
* add licensing information

* add licensing information
* add licensing information

* add search filter

6 participants