feat: add pretty nbytes repr to .show and jupyter repr #3348

pfackeldey · 2024-12-17T01:42:14Z

This is a small PR that adds a new line (optional, default is False) with a human-readable nbytes to .show(nbytes=True) and to the Jupyter repr of high-level ak.Arrays and ak.Records. It's for the lazy (like me) who don't want to do the calculation in their head...

Preview (.show()):

Preview (Jupyter repr):

What do you think?

agoose77 · 2024-12-17T11:36:46Z

Love this improvement! There's the usual caveat that nbytes changes depending on how you share an array, but that's true already!

pfackeldey · 2024-12-17T13:38:03Z

Yes 👍, also it does not take into account the Python objects/structure or whatever is in attrs, but I'd argue that this is only reasonable: the array(s) (should) make up 99% of consumed memory.

jpivarski · 2024-12-17T14:39:38Z

This is very nice! But can we make it say "nbytes:" instead of "size:" (even if it's converted into units like "MB")? That would do two things: it would provide a hint as to how to access this number programmatically and it would avoid a confusion in which NumPy uses "size" to mean the number of items (size * itemsize = nbytes).
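To illustrate the NumPy naming point, here is a small sketch using only the standard NumPy attributes `size`, `itemsize` and `nbytes`. The array length and the variable names are arbitrary choices for illustration.

```python
import numpy as np

# One float64 column; the length is arbitrary.
column = np.full((1234567,), 1.0)

n_items = column.size             # number of items, not bytes
bytes_per_item = column.itemsize  # 8 for float64
total_bytes = column.nbytes       # equals size * itemsize

print(n_items, bytes_per_item, total_bytes)
```

Labelling the row "nbytes:" therefore points at the attribute that actually holds the byte count, while "size:" would suggest an item count to NumPy users.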

Also, are you making the right distinction between "MB" (megabytes) and "MiB" (mebibytes)? One system uses factors of 1000 and the other uses factors of 1024.

pfackeldey · 2024-12-17T14:45:57Z

I initially chose size because its 4 letters align nicely with axes and type in the repr (this may not be a good argument though). I can change it to nbytes 👍

> Also, are you making the right distinction between "MB" (megabytes) and "MiB" (mebibytes)? One system uses factors of 1000 and the other uses factors of 1024.

That should be correct, I'm using factors of 1000 on ak.Array.nbytes for formatting the unit, which should then be in "MB" and not "MiB", right?
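For reference, the difference between the two unit systems can be sketched as below. This is not the code from this PR: `format_nbytes` and its `binary` flag are hypothetical names, and the byte count assumes three float64 fields of 1234567 items each.

```python
def format_nbytes(nbytes: int, binary: bool = False) -> str:
    """Hypothetical helper: render a byte count with SI (1000) or IEC (1024) units."""
    base = 1024 if binary else 1000
    if binary:
        units = ["B", "KiB", "MiB", "GiB", "TiB"]
    else:
        units = ["B", "kB", "MB", "GB", "TB"]
    value = float(nbytes)
    index = 0
    # Step up one unit at a time until the value drops below the base.
    while value >= base and index < len(units) - 1:
        value /= base
        index += 1
    if index == 0:
        return f"{nbytes} B"
    return f"{value:.1f} {units[index]}"


# Assumed layout: three float64 fields with 1234567 items each.
total = 3 * 1234567 * 8

print(format_nbytes(total))               # 29.6 MB
print(format_nbytes(total, binary=True))  # 28.3 MiB
```

With factors of 1000 the label has to be "MB"; the same number of bytes is a smaller figure when expressed in "MiB".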

pfackeldey · 2024-12-17T14:58:53Z

Oh btw, one thing that I noticed is that the typetracer backend reports nbytes as if it were a NumPy array. I can see why we want to mimic NumPy arrays in every aspect with the typetracer backend, but wouldn't it make sense to set nbytes for typetracers explicitly to 0? This is just a minor design choice...

jpivarski · 2024-12-17T14:59:51Z

https://simple.wikipedia.org/wiki/Mebibyte (I'm in a meeting)

pfackeldey · 2024-12-17T19:39:26Z

As discussed, I changed:

"KB" -> "kB"
"size:" -> "nbytes:"
added backend as an argument
sorted the rows (excluding array and type) in increasing order for .show and in decreasing order for the Jupyter repr

Example:

In [1]: import awkward as ak; import numpy as np

In [2]: array = ak.with_named_axis(
   ...:     ak.zip({
   ...:         "one": np.full((1234567,), 1.0),
   ...:         "two": np.full((1234567,), 2.0),
   ...:         "three": np.full((1234567,), 3.0),
   ...:     }),
   ...:     named_axis=("some",),
   ...: )

In [3]: array.show(type=True, named_axis=True, nbytes=True, backend=True)
type: 1234567 * {
    one: float64,
    two: float64,
    three: float64
}
axes: some:0
nbytes: 29.6 MB
backend: cpu
[{one: 1, two: 2, three: 3},
 {one: 1, two: 2, three: 3},
 {one: 1, two: 2, three: 3},
 {one: 1, two: 2, three: 3},
 {one: 1, two: 2, three: 3},
 {one: 1, two: 2, three: 3},
 {one: 1, two: 2, three: 3},
 {one: 1, two: 2, three: 3},
 {one: 1, two: 2, three: 3},
 {one: 1, two: 2, three: 3},
 ...,

For .show(...) the order is:
(1) type (2) ascending order of axes, nbytes, backend (based on the <prefix>: length) (3) array

For the Jupyter repr the order is:
(1) array (2) ------ (3) descending order of axes, nbytes, backend (based on the <prefix>: length) (4) type
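The ordering rule above can be sketched roughly as follows. This is only an illustration, not the code in src/awkward/highlevel.py: `order_rows` is a hypothetical helper, and the row values are taken from the example output.

```python
def order_rows(rows: dict[str, str], descending: bool = False) -> list[str]:
    """Hypothetical helper: sort "prefix: value" rows by the length of "prefix:"."""
    prefixes = sorted(rows, key=lambda prefix: len(prefix + ":"), reverse=descending)
    return [f"{prefix}: {rows[prefix]}" for prefix in prefixes]


# Rows other than type and the array itself, in arbitrary input order.
rows = {"backend": "cpu", "axes": "some:0", "nbytes": "29.6 MB"}

show_rows = order_rows(rows)                      # ascending, as for .show(...)
jupyter_rows = order_rows(rows, descending=True)  # descending, as for the Jupyter repr

print(show_rows)
print(jupyter_rows)
```

Sorting by prefix length gives the rows a staircase shape, widening towards the array in .show(...) and narrowing away from it in the Jupyter repr.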

src/awkward/highlevel.py

pfackeldey · 2024-12-18T15:24:22Z

@jpivarski could you have one last look at this PR? thanks 🙏

jpivarski

It looks great! I see that the code duplication is gone (now in highlevel_array_show_rows) and it has an all argument. I think the PR is ready to merge.

add pretty nbytes repr to .show and jupyter repr

1c135d1

pfackeldey temporarily deployed to docs December 17, 2024 01:49 with GitHub Actions

ak.Array.show(): add backend arg; fix sorting of rows; fix doc string…

fe58ba4

…; KB->kB

pfackeldey requested a review from jpivarski December 17, 2024 20:00

jpivarski approved these changes Dec 17, 2024

pfackeldey added 4 commits December 18, 2024 09:03

Merge branch 'main' into pfackeldey/add_bytes_repr

6f71e5b

address Jim's comments

65135dc

Merge branch 'main' into pfackeldey/add_bytes_repr

90f2d6c

address Jim's comments

09e0ae5

pfackeldey deployed to docs December 18, 2024 14:59 with GitHub Actions

jpivarski approved these changes Dec 18, 2024

pfackeldey merged commit 55f1909 into main Dec 18, 2024
39 checks passed

pfackeldey deleted the pfackeldey/add_bytes_repr branch December 18, 2024 16:05

pfackeldey mentioned this pull request Dec 18, 2024

fix: nbytes property of ak.Record #3352

Merged
