NumPy HPy migration notes: blockers and concerns

Stepan Sindelar edited this page Jul 12, 2022 · 7 revisions
  • NumPy uses the METH_FASTCALL | METH_KEYWORDS convention and has its own argument parser for that
  • metaclass support for heap types is missing in CPython (there is a GitHub issue for this)
  • tp_vectorcall is not supposed to be used for heap types
  • NumPy accesses tp_ slots directly. Edit: not an issue since NumPy is moving away from that.
    • to compare them (We could provide bool HPyType_CheckSlot(HPyContext*, HPy, HPyDef expected))
    • to read things like tp_name, tp_base, tp_dict
    • for fast paths bypassing some CPython logic (e.g.: getting attribute without raising exception if missing)

Migration path concerns:

  • NumPy API: expose second capsule and header(s) with HPy based APIs?
    • legacy NumPy APIs would eventually delegate to the HPy versions
    • opportunity to get rid of legacy NumPy APIs/do some NumPy API cleanup

Architecture/code style concerns:

  • PyArrayObject* -> HPy removes type information and type checking
    • NumPy uses the struct types in many helper/infrastructure functions. Changing all of those to HPy removes the type information, which makes the code less pleasant to work with and more error-prone
    • Sometimes it is desirable to pass an additional PyArrayObject* argument alongside the HPy handle when the struct has already been retrieved, but this is cumbersome
  • Some ideas:
    • generate an additional struct that holds both the handle and the struct pointer, plus helper methods for it, e.g., typedef struct { HPy handle; PyArrayObject *data; } HPyArray;
    • generate an additional struct that wraps just the handle to "attach" type information to it, plus generated conversion helpers
    • depending on whether we see similar patterns in other packages, this could be just infrastructure in the NumPy port codebase or provided by HPy
    • alternatively: always pass two arguments for everything, e.g.: foo(HPyContext *ctx, HPy h_arr, PyArrayObject *arr), where arr can be NULL and the callee is responsible for lazily initializing it before use; we can have some convenience macros for that
    • the "arg clinic" can pass the struct as an argument (https://github.com/hpyproject/hpy/issues/129), which would also have performance benefits, but it does not solve the question of how to pass the struct around internal NumPy helper functions
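
A minimal sketch of two of the ideas above: the combined HPyArray struct, and the two-argument convention with lazy initialization. It is illustrative only and compiles without the HPy or NumPy headers: HPy, HPyContext and PyArrayObject are stand-in definitions here, and demo_as_struct, HPyArray_FromHandle, HPyArray_Data, array_ndim, array_ndim2 and ENSURE_ARR are hypothetical names, not existing HPy or NumPy APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins so the sketch compiles on its own. In a real port these
   types would come from the HPy and NumPy headers. */
typedef struct { intptr_t _i; } HPy;
typedef struct { int unused; } HPyContext;
typedef struct { int nd; } PyArrayObject;

/* Hypothetical stand-in for the handle-to-struct lookup. It counts
   calls so the effect of caching is observable. */
static PyArrayObject demo_storage[4];
static int demo_lookups = 0;

static PyArrayObject *demo_as_struct(HPyContext *ctx, HPy h) {
    (void)ctx;
    assert(h._i >= 0 && h._i < 4);
    demo_lookups++;
    return &demo_storage[h._i];
}

/* Idea 1: the handle and the struct pointer travel together. */
typedef struct {
    HPy handle;
    PyArrayObject *data;   /* NULL until first needed */
} HPyArray;

static HPyArray HPyArray_FromHandle(HPy h) {
    HPyArray a = { h, NULL };
    return a;
}

/* Resolve the struct pointer on first use and cache it. */
static PyArrayObject *HPyArray_Data(HPyContext *ctx, HPyArray *a) {
    if (a->data == NULL) {
        a->data = demo_as_struct(ctx, a->handle);
    }
    return a->data;
}

/* A helper that takes the typed wrapper: passing a bare HPy handle
   here is a compile-time error, which restores some type checking. */
static int array_ndim(HPyContext *ctx, HPyArray *a) {
    return HPyArray_Data(ctx, a)->nd;
}

/* Idea 2: always pass both arguments; arr may be NULL and the callee
   initializes it lazily through a convenience macro. */
#define ENSURE_ARR(ctx, h_arr, arr) \
    do { if ((arr) == NULL) (arr) = demo_as_struct((ctx), (h_arr)); } while (0)

static int array_ndim2(HPyContext *ctx, HPy h_arr, PyArrayObject *arr) {
    ENSURE_ARR(ctx, h_arr, arr);
    return arr->nd;
}
```

With the wrapper, a caller that invokes array_ndim twice on the same HPyArray triggers only one lookup. With the two-argument form, the caller decides whether to pass an already retrieved pointer or NULL, at the cost of a wider signature on every helper.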