
Docs Round 1 #175

Merged: 49 commits merged into main from wct/rrule-docs on Jul 3, 2024
Conversation

@willtebbutt (Member) commented May 29, 2024

This is the first round of documentation. The plan is to explain Tapir.jl's rule abstraction and the tangent type system. The intended audience is those who are potentially interested in developing Tapir, rather than just using it.

This is not ready for review yet.

Note: if reading, be aware that I'm making extensive use of docstrings to create the documentation, as this reduces duplication and makes it easy to use doctests to avoid letting the documentation go out of sync with the code in the future. Therefore, you should either use the docs preview, or build it locally, to get a proper sense of what the docs look like.
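For readers unfamiliar with the pattern being described: a docstring can embed a `jldoctest` block, which Documenter.jl executes when the docs are built and compares against the shown output, so stale examples fail the build. A minimal sketch, using a hypothetical `double` function rather than anything from this PR:

````julia
"""
    double(x)

Return `2x`.

# Examples

```jldoctest
julia> double(3)
6
```
"""
double(x) = 2x
````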

edit: todo

  • improve explanation of derivatives
  • improve X^T X explanation
  • improve first example involving mutation
  • improve citations
  • provide generalised explanation of rrule interface (see the sketch after this list)
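On the last point: purely as a reference for readers who haven't met the term, this is the general shape of a reverse-mode rule, written in the ChainRules.jl convention with a hypothetical `mycube` function. It is not Tapir.jl's own rule interface, which is what the docs added here describe.

```julia
using ChainRulesCore

mycube(x::Real) = x^3

# A reverse-mode rule returns the primal output together with a pullback,
# which maps an output cotangent ȳ to cotangents for (the function, x).
function ChainRulesCore.rrule(::typeof(mycube), x::Real)
    y = mycube(x)
    mycube_pullback(ȳ) = (NoTangent(), 3 * x^2 * ȳ)
    return y, mycube_pullback
end
```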

@willtebbutt willtebbutt marked this pull request as draft May 29, 2024 16:56

github-actions bot commented May 29, 2024

Performance Ratio:
Warning: results are very approximate!

┌────────────────────────────┬────────┬─────────┬─────────────┬─────────┐
│                      Label │  Tapir │  Zygote │ ReverseDiff │  Enzyme │
│                     String │ String │  String │      String │  String │
├────────────────────────────┼────────┼─────────┼─────────────┼─────────┤
│                        sum │   42.8 │   0.463 │        3.01 │   0.638 │
│                       _sum │   6.77 │   528.0 │        26.9 │   0.118 │
│                   kron_sum │   81.7 │    3.43 │       216.0 │    26.1 │
│              kron_view_sum │   99.2 │    11.5 │       248.0 │    9.38 │
│      naive_map_sin_cos_exp │   4.14 │ missing │        8.62 │    2.82 │
│            map_sin_cos_exp │   4.79 │    1.86 │        7.53 │    3.41 │
│      broadcast_sin_cos_exp │   4.68 │    2.61 │        1.65 │    2.82 │
│                 simple_mlp │   8.82 │    3.02 │        11.2 │    3.07 │
│                     gp_lml │   15.5 │    4.34 │     missing │ missing │
│ turing_broadcast_benchmark │   8.73 │ missing │        26.2 │ missing │
└────────────────────────────┴────────┴─────────┴─────────────┴─────────┘

@mhauru (Contributor) left a comment


I read through the Intro and Algorithmic Differentiation pages, and took a peek at the Mathematical Interpretation of Functions one as well. Overall, this is very clearly written, and helped me a lot. I have essentially no previous experience with AD implementations and how they work.

I left a few comments. Related to the first comment, I think my largest confusion was with the infinitesimals/differentials \text{d} x. The text also switches between that and \dot{x} at a couple of points. They seem to be treated as elements of the same set as x itself, which I found confusing, given that they are infinitesimal and more like tangents. I'm not sure what the best thing to do here is, but given that the later section about implementation seems to have to introduce the notion of tangent types anyway, I wonder if it would be useful to do so here in mathsland already. If not that, then a note about how the reader should think of \text{d} x / \dot{x} would be good.
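For what it's worth, the identification in question can be made explicit. For a function between real vector spaces, a tangent $\dot{x}$ lives in the tangent space at $x$, which for $\mathbb{R}^n$ is canonically identified with $\mathbb{R}^n$ itself, which is why tangents end up looking like "elements of the same set as x". A sketch of the standard picture, not text from the docs:

```math
f : \mathbb{R}^n \to \mathbb{R}^m, \qquad
\dot{x} \in T_x \mathbb{R}^n \cong \mathbb{R}^n, \qquad
\dot{y} := \mathrm{D} f[x](\dot{x}) \in T_{f(x)} \mathbb{R}^m \cong \mathbb{R}^m .
```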

Review comments on docs/src/algorithmic_differentiation.md (outdated, resolved)
@torfjelde

Had a skim through it all and it looks very nice:)

I don't really have any concrete feedback beyond what @mhauru has already given. I was also a bit confused by the switch between $d x$ and $\dot{x}$, in part because it alludes to the fact that you should actually define the gradient as the time-derivative along a particular curve at that given point, but this is (AFAIK) not mentioned anywhere. Buuut those who know this can probably handle this minor thing, while those who don't won't ofc have this moment of "so is this the time derivative or is it something else?" 🙃

But all in all, I definitely think the docs achieve what they set out to do! Many of the things you've mentioned to me about Tapir.jl before now make more sense, and I understand some of the design choices much better :)
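For reference, the curve-based definition alluded to in both comments, sketched here rather than quoted from the docs: if $x(t)$ is any curve with $x(0) = x$ and velocity $\dot{x}(0) = \dot{x}$, then the forward-mode quantity is the time derivative of the output along that curve,

```math
\dot{y} \;=\; \left.\frac{\mathrm{d}}{\mathrm{d}t} f\big(x(t)\big)\right|_{t=0} \;=\; \mathrm{D} f[x](\dot{x}),
```

and in particular it does not depend on which curve is chosen, only on $x$ and $\dot{x}$.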

@willtebbutt willtebbutt changed the title WIP: Docs Round 1 Docs Round 1 Jul 2, 2024
@willtebbutt willtebbutt marked this pull request as ready for review July 2, 2024 10:34
@yebai yebai merged commit 7cc96a4 into main Jul 3, 2024
17 checks passed
@yebai yebai deleted the wct/rrule-docs branch July 3, 2024 12:04
@willtebbutt willtebbutt mentioned this pull request Jul 3, 2024
willtebbutt added a commit that referenced this pull request Jul 4, 2024
#175 bumped some stuff in src, so there should have been a release made.
@willtebbutt willtebbutt mentioned this pull request Jul 4, 2024