Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Type checking unyt ? #296

Open
neutrinoceros opened this issue Oct 15, 2022 · 16 comments
Open

Type checking unyt ? #296

neutrinoceros opened this issue Oct 15, 2022 · 16 comments

Comments

@neutrinoceros
Copy link
Member

This is mostly a question to @ngoldbaum and @jzuhone:
How would you guys feel about progressively adding type hints and a type checking stage to CI ? To be clear I'm thinking about doing it at least partially myself, because numpy is almost 100% "typed" now and IMO it would make sense to follow their lead. This is a long term goal as this could be quite an undertaking (though maybe not !), so I wanted to get your sentiment on it first.

@ngoldbaum
Copy link
Member

I’m pretty ambivalent about it, I don’t think it’s worth putting a ton of effort in for marginal user benefit.

@neutrinoceros
Copy link
Member Author

The thing is that I really don't have a clear idea how much work it would take. Hypothetically, were the change actually not that big, would you consider it ?

@ngoldbaum
Copy link
Member

I don’t see how you could do it without a lot of code churn. If you’d like to try to convince me, go ahead, but you will need to convince me before I merge PRs adding type annotations.

@neutrinoceros
Copy link
Member Author

I think I need to convince myself first 😅
if I do, I may submit a proof of concept PR, but for now NEP 18 is much higher priority to me.

@neutrinoceros
Copy link
Member Author

I'd like to try doing this in progressive steps. That is, while I don't intend to add type hints everywhere in the code base in one go, I do see value in sprinkling them as I'm working on various aspects of the code.
In order enable this, we first need to resolve (or explicitly ignore) all errors that mypy unyt can currently raise, and add type checking to CI. I have convinced myself this first step is a reasonable amount of work so I will open a PR to do just that. After that, we can add type hints wherever and whenever we want if it feels right.

@ngoldbaum
Copy link
Member

Like I said back in October, you’re going to need to convince me it’s worthwhile to add type annotations before I merge PRs adding them.

@neutrinoceros
Copy link
Member Author

Well an argument could be made that #355 is valuable in itself; because mypy in general will check consistency of one's code and its dependencies, by running type-checking in unyt's CI we can at a minimum guarantee that downstream code will not discover inconsistencies they cannot control within unyt. We've been working on very progressively adding type information in yt so I can testify that there's at least one code depending on unyt that does this.

Beyond that, I can think of a couple reasons why type annotations are useful to have for unyt's own good:

  1. modern IDEs understand type hints and integrate them via hovering and auto-completion, which improves coding experience for anyone who relies on the annotated library.
  2. it's validated documentation. Unlike types in docstrings (which we do use), that are only validated manually and always at risk of falling out of sync when code changes.
  3. type-checking doesn't just verify that type hints are valid and consistent, it can also detect actual bugs. For instance, taking a very simplified version of unyt_quantity.from_string
import re
_UNIT_PATTERN = r"([α-ωΑ-Ωa-zA-Z]+(\*\*([+/-]?[0-9]+)|[*/])?)+"
_UNIT_REGEXP = re.compile(_UNIT_PATTERN)

def unit_from_string(s:str) -> str:
    return re.match(_UNIT_REGEXP, s.strip()).group()

mypy will produce

t.py:7: error: Item "None" of "Optional[Match[str]]" has no attribute "group"  [union-attr]

because re.match may return None, in which case the code would normally raise AttributeError at runtime. In fact unyt_quantity.from_string was originally written in a typed-check code of mine, which helped me make sure the error messages were clear and intended.

Another example of classic gotcha that mypy can detect even with minimal amount of annotations

def get_letter(n:int) -> str:
    if n == 1:
        return "a"
    elif n == 2:
        return "b"
    # ... you get the idea
    elif n == 26:
       return "z"

mypy will detect that my function can return None instead of a string if n is any int that doesn't match any branch.

t.py:1: error: Missing return statement  [return]
  1. I often find myself wondering what types are things supposed to be. The process may take an inordinate amount of time in some cases. Adding type hints after I figure it out saves me time next time I work on the same region of the code.

@neutrinoceros
Copy link
Member Author

I should add: no matter the amount of type hints we hints we want to have (even 0), adding type-checking to the supply chain can help with discovering subtle bugs. The idea isn't to make a bazillion of PRs to go from 0 to 100% typed code, but rather to enable this aid (mypy) so we (I ?) can add type hints whenever working on something else. Of course we're free to add type hints without using a checker, but for anything more than trivial it's a risk of introducing errors.

@ngoldbaum
Copy link
Member

My main objection is adding any types at all (especially in a piecemeal way) will increase the cognitive overhead of reading the code. I don’t personally see the benefit for my personal usages with unyt, and I’d prefer not to have to learn the details of python typing just to support reviewing your PRs that add something I don’t want to deal with.

For now I’m not going to merge PRs that add type hints to unyt. If that means I don’t get more PRs from you I guess that’s too bad.

It leaves a bad taste in my mouth that you started doing work to add this after my last comment was that you’d need to convince me that it’s worth doing.

@neutrinoceros
Copy link
Member Author

It leaves a bad taste in my mouth that you started doing work to add this after my last comment was that you’d need to convince me that it’s worth doing.

I'm sorry you feel this way, it wasn't my intention that you felt unheard. Rather, I intended #355 as a concrete demonstration that kicking this off wasn't as big of a deal as one might have thought (it wasn't clear even to me before I just did it).

For now I’m not going to merge PRs that add type hints to unyt. If that means I don’t get more PRs from you I guess that’s too bad.

I don't think this will completely ruin my motivation to contribute, and I hope it didn't ruin your trust in me either.

My main objection is adding any types at all (especially in a piecemeal way) will increase the cognitive overhead of reading the code.

all I can say is that in my experience, given enough time, the opposite becomes true, but I understand this isn't something I can really convince you of before jumping in.

I don’t personally see the benefit for my personal usages with unyt, and I’d prefer not to have to learn the details of python typing just to support reviewing your PRs that add something I don’t want to deal with.

You seem to have stronger opinions on this matter than I previously thought, which may explain this mishap.
Again I am sorry for how I handled things, I genuinely didn't anticipate you'd receive it poorly.

@basnijholt
Copy link

basnijholt commented Jan 31, 2024

For what it is worth, I am evaluating different libraries that deal with units. Just the fact that this code is not properly typed is a reason to skip unyt IMO.

About

My main objection is adding any types at all (especially in a piecemeal way) will increase the cognitive overhead of reading the code.

I definitely agree with @neutrinoceros here, the code is much easier to understand once you have worked with type hints a bit.

In my experience, being a full-time Python developer for almost a decade, adding type hints and using mypy has uncovered bugs in basically every project I worked on. The benefit might not be immediately obvious because it might identify only a handful of bugs, however, the major advantage is identifying bugs while writing new code. Before you execute the code it already helps you a lot.

Additionally, having type-hints is also great for the users of your package because it allows editors to make meaningful suggestions.

Hope you'll reconsider at some point.

@berceanu
Copy link
Contributor

@basnijholt which unit library did you end up choosing in the end, asking out of curiosity

@basnijholt
Copy link

astropy.units because it seems to have the best supports with numerical packages. The downside is that is comes with a lot of code that I won't use.

@JonathanRayner
Copy link

@ngoldbaum I'll just add that type hinting/checking is considered standard for all serious new projects, especially in industry. If you want people to use this library (and the benefits of contribution/extension, etc.), an obtuse attitude towards keeping up with standards is not a good idea (especially when people are volunteering to help with PR's to update).

I find it bizarre that you are obviously advertising this library for others to use (writing a paper on it, documentation, lots of stylistic language about a broader benefit to science), yet veto reasonable requests with no other reason than you don't like how it will affect your personal usage. Which is it? Do you want to advance science and contribute to the community and in return receive the benefits of prestige and community contribution, or do you just want to be emperor of your own toy project? Either is of course fine, no one is forced to use your library. But you should not be disingenuous about what the purpose of this library is if it is just for your usage.

Maybe after writing some more Rust you will be less adversarial to type checking in python? Perhaps your views have changed?

@jzuhone
Copy link
Contributor

jzuhone commented May 30, 2024

@JonathanRayner FWIW, Nathan is no longer maintaining this project--that is up to me and @neutrinoceros.

I'm personally open to exploring type checking--I am unfortunately very swamped at the moment with other things.

@neutrinoceros
Copy link
Member Author

I'll add that while I don't have personal motivation to pursue this myself now, I would happily review PRs if anyone wants to give it a spin.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants