This repository has been archived by the owner on Oct 16, 2024. It is now read-only.

Docs for extensions #179

Closed
wants to merge 6 commits into from


mike0sv
Contributor

@mike0sv mike0sv commented Sep 16, 2022

This is the second PR for #171 (after #172); it is also related to #181.
It adds a separate page for each MLEM extension. For now, most of them consist only of auto-generated content:

  1. short description taken from module docstring (see Add docstrings to extensions mlem#413)
  2. (optional) requirements
  3. Implementation reference - generated reference for each MLEM class implemented in this extension
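
As a rough illustration of the three auto-generated parts listed above, here is a minimal sketch that builds a page skeleton from a module docstring using only the standard library. The function name `render_extension_page`, the page layout, and the `requirements` argument are assumptions made for this example; this is not the bootstrap script used in the PR.

```python
import importlib
import inspect


def render_extension_page(module_name: str, requirements: list[str]) -> str:
    """Build a Markdown page skeleton for one module (hypothetical helper)."""
    module = importlib.import_module(module_name)

    # 1. Short description: first paragraph of the module docstring.
    doc = inspect.getdoc(module) or "**TODO**"
    summary = doc.split("\n\n", 1)[0].replace("\n", " ")
    lines = [f"# {module_name}", "", summary, ""]

    # 2. Optional requirements section, skipped when nothing is listed.
    if requirements:
        lines += ["## Requirements", ""]
        lines += [f"- {req}" for req in requirements]
        lines.append("")

    # 3. Implementation reference: one entry per public class that is
    #    defined in this module (classes imported from elsewhere are skipped).
    lines += ["## Implementation reference", ""]
    for name, cls in inspect.getmembers(module, inspect.isclass):
        if cls.__module__ == module.__name__ and not name.startswith("_"):
            first_line = (inspect.getdoc(cls) or "").split("\n", 1)[0]
            lines.append(f"- `{name}`: {first_line}")
    return "\n".join(lines)
```

Calling `render_extension_page("json.decoder", [])` on a standard-library module shows the shape of the output; pointing it at a real extension module would need that module's dependencies to be importable.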

There are also two sections that are mostly empty: description and examples. They are meant to be filled in manually.
However, I filled them in for the first page in each category (model/data/etc.) just as an example.

Open questions:

  • Where to put this. For now it resides in a separate sidebar section. There is also an option to merge it with the User Guide.
  • Formatting. I did my best, but we def can do better :). Please take a look at multiple pages and give your suggestions. Preferably, use the first page (models/sklearn) to leave comments and change suggestions so we have them all in one place. I will adapt the bootstrap script and re-generate all of the pages, so there is no need to leave them everywhere.
  • Actual content. As @shcheklein says, the actual work is writing those manual sections. Once we are finished with the first two questions, we can start on this one. Since it's a lot of work, I suggest we do each extension in a separate PR. Also, there is no need to wait for all of them: we can just hide unfinished ones (either only from the sidebar, or by actually removing the files).

In review app: https://mlem-ai-feature-0-3-0-e-qpg1qt.herokuapp.com/doc/extensions

@shcheklein shcheklein temporarily deployed to mlem-ai-feature-0-3-0-e-qpg1qt September 16, 2022 17:18 Inactive
@github-actions

3d346d3

Link Check Report

All 43 links passed!


@shcheklein
Member

> Actual content. As @shcheklein says, the actual work is writing those manual sections. Once we are finished with the first two questions, we can start on this one. Since it's a lot of work, I suggest we do each extension in a separate PR. Also, there is no need to wait for all of them: we can just hide unfinished ones (either only from the sidebar, or by actually removing the files).

sounds good to me @mike0sv !

@mike0sv mike0sv requested review from jorgeorpinel and removed request for 0x2b3bfa0 September 19, 2022 11:47
@aguschin aguschin mentioned this pull request Sep 19, 2022
12 tasks
@mike0sv mike0sv changed the base branch from main to release/0.3.0 September 19, 2022 12:58
@aguschin
Contributor

> Where to put this. For now it resides in a separate sidebar section. There is also an option to merge it with the User Guide.

Let's first see what extensions we have and what we want to write as docs:

  1. Data formats (to save and import) - numpy, pandas, etc (split save and import maybe?)
  2. ML frameworks (to save and import) - onnx, tensorflow, etc
  3. Ways to serve - fastapi, rabbitmq, etc
  4. Artifacts to build - docker, pip package, onnx again, etc
  5. Places to deploy to - heroku, k8s, etc
  6. Git providers to work with - GH, GL, BB

I assume that eventually we would like to cover all of these. To begin with, "deploy" and "build" probably get priority.

In docs, these items could be subpages of User Guide, e.g.: User Guide > Deployment > Heroku or User Guide > ML frameworks > Onnx.

@aguschin
Contributor

Haha, I wrote the previous comment without checking the docs deployment in the preview env.

I wonder if we can merge some pages into what you generate. E.g. "Working with data" could be moved here, under "Data". It looks like there are not many such opportunities though, because Models/Deployments/Serve/Build were already covered in GS. So we can just add a link to GS there.

@mike0sv mike0sv mentioned this pull request Sep 19, 2022
content/docs/extensions/deployment/docker.md
content/docs/extensions/model/sklearn.md
from mlem.api import save, load


data, y = load_iris(return_X_y=True, as_frame=True)
Contributor

I wish we could auto-generate this from our tests, but I'm not sure it's possible. We have asserts, fixtures, etc., so it's going to be a challenge to remove them and keep the code both working and readable.

content/docs/extensions/storage/dvc.md
@mike0sv
Contributor Author

mike0sv commented Sep 19, 2022

Had a call with @aguschin; we decided to go this way:

Target docs sections at specific user groups. There are 4 of them:

  1. Novice - first time mlemming. Their section is Get Started, of course. It should showcase mlem out-of-the-box features without additional configuration.
  2. Average user - uses only out-of-the-box features, but extensively. User Guide is for them: it should explain mlem concepts in detail, but without internals. Basically, all mlem out-of-the-box features with every possible additional configuration (e.g. how to use mlem vs sagemaker), but again without implementation details, just everything needed to use it.
  3. Advanced user - somebody who needs custom implementations and should know internal details. We need a new section for them, e.g. "Advanced Usage".
  4. Contributor - the same as 3, plus a contributing guide.

So, we will merge the Extensions section into the User Guide: each extensions/<type> page will become user-guide/how to do <feature type> in mlem (not exact wording of course) and each extensions/<type>/<subtype> page will become a subpage, how to do <type> with <subtype>. So rn extension pages are more like mlem.contrib.<ext> Python module descriptions, and they will become about "how to use mlem with ".

We will also rewrite some User Guide pages to remove internal details from them, or move some pages to "Advanced Usage" entirely (I'm looking at you, mlem abcs and extending). Also, the implementation reference sections of extension pages will be cut from them and moved into a new "Objects Reference" page (maybe it will be a top-level section like the API and command references, maybe under "Advanced Usage").

Does this make sense?

@aguschin
Contributor

@jorgeorpinel @shcheklein WDYT about what Mike said?

@shcheklein
Member

Yep, sounds very reasonable to me, folks.

@jorgeorpinel jorgeorpinel added the A: docs Area: user documentation (gatsby-theme-iterative) label Sep 21, 2022
Contributor

@jorgeorpinel jorgeorpinel left a comment

> the actual work is writing those manual sections

I'm just going to leave a couple of comments on one of the index pages for now (they probably apply to all):

Comment on lines +1 to +4
# Model extensions

Model extensions add support for new types models that MLEM can covert into MLEM
model objects in [`save` API method](/doc/api-reference/save)
Contributor

We should try to explain this at a higher level. What functionality do model extensions give to users? OK they extend MLEM model objects I think, how is that useful? Should we start with a recap of what a model object is (used for) first?

Contributor

I tried to address this in #188 in "Object Reference" section, e.g. see this page https://mlem-ai-new-docs-struct-hy95mv.herokuapp.com/doc/object-reference/model. If you have some feedback, you can post it here or do it in #188.

Contributor

Unable to leave feedback at this level in #188 atm, it's too huge.

Comment on lines +6 to +10
Typicaly they will implement [ModelType](/doc/user-guide/mlem-abcs#modeltype)
and [ModelIO](/doc/user-guide/mlem-abcs#modelio) interfaces.

Some also implement [DataType](/doc/user-guide/mlem-abcs#datatype) interface if
specific data objects are needed for model to work.
Contributor

For now just a copy edit summarizing this part but let's consider (or make an issue about) auto-linking many of these ABCs as well as any other reference-like terms (commands, extensions even, other types, etc).

Suggested change
Typicaly they will implement [ModelType](/doc/user-guide/mlem-abcs#modeltype)
and [ModelIO](/doc/user-guide/mlem-abcs#modelio) interfaces.
Some also implement [DataType](/doc/user-guide/mlem-abcs#datatype) interface if
specific data objects are needed for model to work.
Typically they will implement [ModelType] and [ModelIO] interfaces. Some also
implement the [DataType] interface if specific MLEM data objects are needed for
model to work.
[modeltype]: /doc/user-guide/mlem-abcs#modeltype
[modelio]: /doc/user-guide/mlem-abcs#modelio
[datatype]: /doc/user-guide/mlem-abcs#datatype
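
The copy edit above turns inline links into reference-style links with the definitions collected at the end. If that conversion were ever automated for the generated pages, a minimal sketch could look like the following. The function name `to_reference_links` is hypothetical, and the regex is deliberately naive: it would also rewrite image links and anything that looks like a link inside code spans.

```python
import re

# Naive pattern for inline Markdown links: [text](url)
INLINE_LINK = re.compile(r"\[([^\]]+)\]\(([^)\s]+)\)")


def to_reference_links(markdown: str) -> str:
    """Rewrite inline links as reference-style links, definitions appended."""
    definitions: dict[str, str] = {}

    def replace(match: re.Match) -> str:
        text, url = match.group(1), match.group(2)
        label = text.lower()  # link labels are matched case-insensitively
        # Keep the inline form if the same label already points elsewhere.
        if definitions.setdefault(label, url) != url:
            return match.group(0)
        return f"[{text}]"

    body = INLINE_LINK.sub(replace, markdown)
    refs = "\n".join(f"[{label}]: {url}" for label, url in definitions.items())
    return f"{body}\n\n{refs}" if refs else body
```

Lower-casing the labels mirrors the suggested change, where `[ModelType]` in the text resolves against a `[modeltype]:` definition.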

Contributor

@jorgeorpinel jorgeorpinel left a comment

> use the first page (models/sklearn) to leave comments and change suggestions

👇🏼

@@ -0,0 +1,69 @@
# Scikit-Learn Models Support
Contributor

The URLs and titles aren't very consistent: sklearn vs. Scikit-Learn Models Support. Consistency reduces confusion and can help SEO.

Contributor

On the other hand, it's concise, which is good to me. Easier to remember if you want to type it in URL or do something with the link.

Contributor

Up to you on this. Just keep in mind URL vs page titles please 🙂 If they're different it should be a reasoned, conscious decision, I think.

Comment on lines +3 to +4
[ModelType](/doc/user-guide/mlem-abcs#modeltype) implementations for any
sklearn-compatible classes as well as `Pipeline`
Contributor

Product questions (rhetorical i.e. these need links or better explanations IMO):

  • What are sklearn-compatible classes?
  • What is Pipeline?

However, this intro is a bit low-level (even for a reference). It's OK to be specific about what the extension is/does but can we try to zoom out, use more general language, etc? Something like "Enables you to connect Scikit-Learn classes to MLEM by implementing a model object..." (just an idea)

Comment on lines +6 to +8
## Description

**TODO**
Contributor

Haven't seen any descriptions yet. Any ideas for what will be in these sections? Maybe leave it for the user guide if it's unclear now.

Comment on lines +10 to +16
## Requirements

```bash
pip install mlem[sklearn]
# or
pip install scikit-learn
```
Contributor

  • Installation?
  • bash->cli?

Comment on lines +18 to +20
## Examples

### Saving and loading Scikit-Learn model
Contributor

Better to use ## Example: Title if you want them to get listed in the right-hand ToC automatically:

image

content/docs/extensions/model/sklearn.md
@jorgeorpinel
Contributor

jorgeorpinel commented Sep 21, 2022

> Target docs sections at specific user groups. There are 4 of them:

Thanks @mike0sv and @aguschin, great analysis. I've had the chance to think about and research this topic quite a bit, and my view is somewhat different, although the logic is similar. PTAL at this proposed framework. However, that conversation goes beyond the scope of this PR (mainly a class reference IMO), so I'll post a hidden answer down here, but let's move it elsewhere after that?

In the proposed docs framework, we work with conceptual levels and expected traffic (along with other properties) instead of audiences (but again, it's similar). These levels also roughly match marketing/SEO parts of the user journey (TOFU, MOFU, BOFU).

Some key comparisons:

  • An audience you're missing is explorers who are looking for solutions. They look at very high-level materials, e.g. use cases, and may or may not be technical enough or have time to look at anything else. We then hope they pass it on to others in the team to actually try, learn, adopt, etc.
  • Novice: GS is indeed the most important section for adoption. But it doesn't have to be limited to vanilla usage. Its main goal is to get the new user to the product value ASAP: this criterion tells you what non-essential info to skip.
  • Average: I think more of recurrent users (of whatever proficiency), and they tend to use CLI or API (IDE) help a lot or the technical references the most, so it's good that these are comprehensive and self-contained. They may read parts of the UG now and then (to go from novice to average or from average to advanced, for example).
  • Advanced: I don't think there's need for special advanced usage sections. All docs (esp. mid/low-level) tend to be structured from basic/overview to detailed/advanced.
  • Contributor: They usually get to the README first but having a contributors guide is def. nice.

In the end we have to decide on a view so there's no precise right or wrong, and a lot of this is biased towards DVC. So we should definitely be flexible and adapt the framework to also accommodate MLEM product and docs needs.

@jorgeorpinel
Contributor

> rn extension pages are more like mlem.contrib.<ext> Python module descriptions, and they will become about "how to use mlem with ".

I see the auto-generated references as very far from this plan. "How to" guides could be a great addition, but it seems like we still need the more technical references, and these are different tasks, so the guides should be a different PR, I think. Also, I'd make a single guide per object category and use tabs to consolidate all types into a single page.

@aguschin aguschin mentioned this pull request Sep 23, 2022
18 tasks
@mike0sv mike0sv mentioned this pull request Oct 4, 2022
16 tasks
@aguschin
Contributor

@jorgeorpinel @mike0sv, I'm closing this PR, since we decided that everything should be done in #188. @jorgeorpinel, I'll use your feedback above for #188.

@aguschin aguschin closed this Oct 11, 2022
aguschin added a commit that referenced this pull request Oct 27, 2022
Docs structure overhaul as discussed in #179 
This PR is meant to replace #179 and #182 

What was done:
- Created subpages in user-guide for each major feature (models/data/serving/building/deploy)
- Moved the hand-crafted documentation from extension docs in #179 there
- index pages are from GS
- Renamed Extension section into Object Reference
- restructured it. was: `<ext type>/<ext name>`, now: `<object group>/<ext name>`. Now multipurpose extensions (e.g. docker has build & deploy) have two pages
- Object Reference is now fully autogenerated
- Added builtin implementations there too
- Moved details about mlem abcs and mlem objects from UG
- Rewrote GS
- New User Guide pages
- New Object reference index pages written
- Use Cases cleaned up

TODO:
- [x] rewrite GS Deployment
- [x] rewrite UG Deployment (+ explain what State is)
- [x] update UG Heroku
- [x] merge UG K8s @madhur-tandon 
- [x] update UG Sagemaker - @mike0sv, [there are your TODOs left](https://mlem-ai-new-docs-struct-hy95mv.herokuapp.com/doc/user-guide/deploying/sagemaker)
- [x] write UG Docker @madhur-tandon 
- [x] update CLI reference
- [x] update API reference
- [x] merge UG export to venvs @madhur-tandon 
- [x] rewrite Project structure
- [x] update UC - remove `.mlem/` dir and `mlem.api.ls` mentions
- [x] search and remove all other mentions of `.mlem/` dir and `mlem.api.ls`
- [x] search and update all other mentions of `mlem init`
- [x] search and update all other mentions of `--conf` or `-c`
- [x] fix broken links @madhur-tandon 
- [x] add `mlem checkenv` to command reference. Why it wasn't generated I wonder?
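
The move from `<ext type>/<ext name>` to `<object group>/<ext name>` described in the commit message above means an extension that implements several object groups gets one page per group. A minimal sketch of that layout follows; the group names and the `EXTENSION_GROUPS` sample data are assumptions for illustration, and only the docker build/deploy case comes from the commit message.

```python
# Hypothetical sample data: extension name -> object groups it implements.
EXTENSION_GROUPS = {
    "sklearn": ["model"],
    "docker": ["build", "deployment"],  # multipurpose: gets two pages
    "heroku": ["deployment"],
}


def object_reference_pages(extension_groups: dict[str, list[str]]) -> list[str]:
    """Return one `<object group>/<ext name>` path per (group, extension) pair."""
    return sorted(
        f"{group}/{name}"
        for name, groups in extension_groups.items()
        for group in groups
    )
```

With the sample data, `docker` appears under both `build/` and `deployment/`, while single-purpose extensions keep a single page.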
@jorgeorpinel jorgeorpinel added the C: user-guide Content of /doc/user-guide label Oct 30, 2022
Labels
A: docs Area: user documentation (gatsby-theme-iterative) C: user-guide Content of /doc/user-guide
Projects
No open projects
Status: Done
4 participants