stress-tess released this 03 Oct 00:12
Bug Fixes
- Issue #3762 - Fix dataframe groupby aggregations when keys contain NaNs
- Issues #3658, #3650, #3654, #3783, #3784, #3788 and PR #3386 - Fix IO bugs including:
  - reading segarrays containing NaNs and empty segments with hdf5 and parquet
  - reading dataframes containing uint and int segarray columns
  - CSV address sanitizer "use after free" memory issues
- Issues #3648, #3676, #3682, #3679, #3687, #3666 - Fix multidimensional bugs in sorting, nonzero, repeat, flatten, and unflatten
- Issue #3367 - Fixes race condition in SegHead function
- Issue #3468 - Fixes round trip discrepancies for Index with Categorical values
- Issue #3649 - Fixes bitshift failures
- Issue #3467 - Fixes indexing error in DataFrame instantiation
Major Updates
- Issues #3628, #3703 - Drop Python 3.8 support
- Issue #3355 - Pins scipy<=1.13.1
- Issues #3332, #3334, #3351, #3360, #3417, #3419, #3504, #3613, #3695, #3769, #3767, #3711 and PRs #3363, #3368, #3379 - Parquet optimizations:
  - Added fixed length flag for string reads
  - Read strings and byte sizes in batches
  - Simplified source code
- Issues #3336, #3362, #3183, #3364, #3226, #3523, #3278, #3373, #3372, #3627 - Improve random module with a focus on numpy alignment. Adding:
  - exponential, lognormal, logistic
  - multidimensional functionality to Random module
- Issues #3294, #3639, #3665, #3709 - Improve testing and add delete function for multidimensional arrays
- Issues #3425, #3526, #3632, #3656, #3631, #3718, #3720, #3722, #3771, #3657 and PRs #3345, #3358, #3359, #3371, #3518, #3474, #3521, #3525, #3590, #3606, #3603, #3685, #3672, #3691, #3789, #3773, #3786, #3634, #3671, #3655, #3697 - Refactor and improve server side message argument handling
- PRs #3516, #3593, #3745 - Add initial implementation of sparse matrix functionality including matrix multiplication, fill_vals, and to_pdarray
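The Python 3.8 drop and the scipy<=1.13.1 pin listed above can be checked against a local environment with something like the following. This is a minimal sketch using only the standard library: the helper names are hypothetical, Python 3.9 is assumed to be the new minimum (the notes state only that 3.8 is dropped), and the upper bound on the Python version introduced for #3703 is not checked because these notes do not state it.

```python
import sys
from importlib import metadata


def parse_version(text):
    """Turn '1.13.1' into (1, 13, 1).

    Parsing stops at the first non-numeric component, so pre/post-release
    suffixes are ignored. That is a simplification, not full PEP 440 handling.
    """
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def python_supported(version_info=sys.version_info):
    # Assumption: dropping 3.8 makes 3.9 the minimum supported version.
    return tuple(version_info[:2]) >= (3, 9)


def scipy_pin_satisfied(installed):
    # The release notes pin scipy<=1.13.1.
    return parse_version(installed) <= (1, 13, 1)


if __name__ == "__main__":
    print("Python OK:", python_supported())
    try:
        print("scipy pin OK:", scipy_pin_satisfied(metadata.version("scipy")))
    except metadata.PackageNotFoundError:
        print("scipy is not installed")
```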
Minor Updates
- Issues #2978, #3702 - Strip out ArrayView (replaced by multidimensional pdarray functionality)
- Issue #3302 - Adds GroupBy.head
- Issue #3326 - Adds DataFrame.assign
- Issues #3510, #3511 - Update DataFrame.to_pandas and Series.to_pandas to handle categoricals
- Issues #3293, #3428 - Add putmask functionality
- Issue #3297 - Adds array_equal
- Issue #3742 - Move numeric module to arkouda.numpy
- Issues #3289, #3288, #3291, #3295, #3299, #3298, #3301, #3287, #3296 - Modify dtypes for better numpy alignment
  - rename bool to bool_, align with numpy scalar type, remove translate_np_dtype
- Issues #3259, #3265, #3267, #3271, #3275, #3269, #3263, #3273, #3261, #3385, #3400, #3403, #3409, #3440, #3445, #3457, #3448, #3452, #3454, #3459, #3461, #3463, #3465, #3407, #3442, #3446, #3405, #3411, #3389, #3212, #3145, #3144, #3143, #3231, #3441, #3447, #3458, #3443, #3462, #3466, #3455, #3444, #3438, #3450, #3460, #3464, #3388, #3430, #3624, #3453, #3413, #3646, #3402, #3439, #3669, #3415, #3421 - Transitions to new testing suite including updating make test
- Issues #3508, #3748, #3759, #3727, #3378 - Updates documentation including:
  - Chapel tutorial, installation docs, and documentation about memory pressure during server builds
- Issues #3793, #3798, #3797 and PR #3730 - Updates to benchmarks
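Several additions in this release (putmask, array_equal, and delete for multidimensional arrays) are described as aligning with numpy. The sketch below shows the NumPy behaviour those names refer to, using NumPy itself. It assumes the Arkouda versions follow similar semantics; their exact signatures are not given in these notes.

```python
import numpy as np

# putmask: overwrite elements in place wherever the mask is True.
a = np.arange(6)
np.putmask(a, a > 2, -1)

# array_equal: True only when the shapes and all elements match.
same = np.array_equal(a, np.array([0, 1, 2, -1, -1, -1]))

# delete on a multidimensional array: drop column 1 of a 2x3 array.
m = np.arange(6).reshape(2, 3)
trimmed = np.delete(m, 1, axis=1)

print(a, same, trimmed.shape)
```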
Auto-Generated Release Notes
- Closes #3308 Unify file permissions by @ajpotts in #3309
- Closes #3332: Split Parquet code into multiple files by @bmcdonald3 in #3333
- temporary fix for #3355: pin scipy<=1.13.1 to avoid CI failures by @ajpotts in #3356
- Closes #3334, #3351: Simplify server side string code and added fixed length by @bmcdonald3 in #3335
- Ignore new Parquet object files by @bmcdonald3 in #3363
- Closes #3336, #3362: Reuse random number generation loop structure by @stress-tess in #3352
- Closes #3259: deprecate test/scipy/scipy_test.py and special_test.py by @ajpotts in #3260
- Closes #3265 deprecate tests/numeric_test by @ajpotts in #3266
- Closes #3267 deprecate tests/dtypes_test by @ajpotts in #3268
- Closes #3271 deprecate tests/index_test by @ajpotts in #3272
- Closes #3275 deprecate tests/categorical_test by @ajpotts in #3277
- Closes #3360: Reduce code duplication in Parquet read code with templates by @bmcdonald3 in #3361
- Closes #3183, #3364: Add exponential distribution and aggregation to random generator loop by @stress-tess in #3310
- Simplify Command Map by @jeremiah-corrado in #3345
- Adds arkouda.testing module by @ajpotts in #3186
- Closes #3269 deprecate tests/datetime_test by @ajpotts in #3270
- Remove string.doFormat, replacing with string.format by @jeremiah-corrado in #3365
- Closes #3302 GroupBy.head by @ajpotts in #3324
- Refactor MessageArgs by @jeremiah-corrado in #3358
- Closes Ticket #3263: deprecate tests/dataframe_test by @ajpotts in #3264
- Closes #3273 deprecate tests/series_test by @ajpotts in #3274
- Remove number of files multiplication for IO benchmark by @bmcdonald3 in #3368
- 3231 unique unit tests by @drculhane in #3258
- Adds missing numpy dtypes by @ajpotts in #3330
- Closes #3367 racy condition in SegHead function by @ajpotts in #3369
- Closes #3261 deprecate tests/numpy by @ajpotts in #3262
- Closes #3375: Cleanup indexof1d code by @stress-tess in #3377
- Disable Parquet multi row group test until resolved by @bmcdonald3 in #3379
- Closes #3326 DataFrame.assign by @ajpotts in #3327
- Refactor SymbolTable and error handling by @jeremiah-corrado in #3359
- Resolves #3294 - Add numpy-like delete function by @jeremiah-corrado in #3321
- Closes #3281 rename bool to bool_ to match numpy by @ajpotts in #3282
- Fixes #3392: Fix mypy CI failures by @stress-tess in #3394
- Resolve CSV Asan "use after free" memory issues by @ShreyasKhandekar in #3386
- Closes #3376 more numpy imports by @ajpotts in #3381
- Closes #3385 groupby_test.py by @ajpotts in #3397
- Closes #3400 deprecate alignment_tests.py by @ajpotts in #3401
- Closes #3403 deprecate bigint_agg_test.py by @ajpotts in #3404
- Closes #3409 deprecate tests/client_dtypes_test.py by @ajpotts in #3410
- Closes #3417: Separate Parquet string read code from generic read function by @bmcdonald3 in #3418
- Closes #3419: Remove intertwined list column and string column byte calculation logic by @bmcdonald3 in #3420
- Closes #3425: Improve Msg Function Registration for Module Tracking by @bmcdonald3 in #3424
- Closes #3226: Adds parameterization to test_shuffle and test_permutation by @drculhane in #3320
- Closes #3414 deprecate compare_test.py by @ajpotts in #3416
- Closes #3293: Add putmask by @drculhane in #3370
- add-path modification for building on Horizon by @brandon-neth in #3423
- Closes #3405 deprecate tests/bitops_test.py by @ajpotts in #3406
- Closes #3411 deprecate tests/client_test.py by @ajpotts in #3412
- Automated command registration by @jeremiah-corrado in #3371
- Closes #3475 make fails when lib and lib64 directories are both present by @ajpotts in #3503
- Closes #3504: Improve Parquet Integration: Stop Using Array Views by @bmcdonald3 in #3505
- Remove support for pre-2.0 versions of Chapel by @jeremiah-corrado in #3477
- Adds skip configuration for multidimensional histogram test by @brandon-neth in #3506
- Updates PROTOs pdarray_creation_test by @drculhane in #3393
- Closes #3514-add pandas-stubs to arkouda-env-dev.yml by @ajpotts in #3515
- Stop requiring manual installation of chapel-py to register commands by @jeremiah-corrado in #3518
- Closes #3407 deprecate tests/check.py by @ajpotts in #3408
- Closes #3442 deprecate indexing_test.py by @ajpotts in #3469
- Closes #3446 deprecate logger_test.py by @ajpotts in #3471
- Remove legacy 'supported_scalar_types' section from config file by @jeremiah-corrado in #3474
- Add function to retrieve server's maximum supported pdarray rank by @jeremiah-corrado in #3521
- Resolves #3467: Fix indexing error in DataFrame instantiation by @ajpotts in #3429
- Closes #3451 deprecate parquet_test.py by @ajpotts in #3476
- Soften chapel-py requirement for building server by @jeremiah-corrado in #3525
- Closes #3468 round trip discrepancies for Index with Categorical values by @ajpotts in #3472
- Closes #3441 deprecate import_export_test.py by @ajpotts in #3470
- Closes #3519, #3522: unpin hdf5 library by @ajpotts in #3520
- Add initial implementation for Sparse Matrix Mult by @ShreyasKhandekar in #3516
- Support for Zarr IO by @brandon-neth in #3159
- Closes #3440 deprecate extrema_test.py by @ajpotts in #3582
- Closes #3445: deprecate join_test.py by @ajpotts in #3583
- Closes #3511 DataFrame.to_pandas to handle categoricals by @ajpotts in #3513
- Closes #3378: Update chpl tutorial by @stress-tess in #3507
- Closes #3510-Series.to_pandas to handle categoricals by @ajpotts in #3512
- Closes #3457: deprecate security_test.py by @ajpotts in #3587
- Closes #3448: deprecate nan_test.py by @ajpotts in #3584
- Closes #3452 deprecate random_test.py by @ajpotts in #3585
- Closes #3454 deprecate read_write_tests.py by @ajpotts in #3586
- Closes #3459: deprecate setops_test.py by @ajpotts in #3588
- Closes #3461: deprecate stats_test.py by @ajpotts in #3589
- Closes #3463 deprecate summarization_test.py by @ajpotts in #3591
- Closes #3465 deprecate util_test.py by @ajpotts in #3592
- Closes #3523: Improve random functions message handling and enable multidim functionality by @stress-tess in #3524
- Closes #3395 align to numpy scalar types by @ajpotts in #3396
- Closes #3447: deprecate message_test.py by @ajpotts in #3600
- Closes #3458: deprecate segarray_test.py by @ajpotts in #3601
- Closes #3443 deprecate io_test.py by @ajpotts in #3597
- Sparse matrix printing by @ShreyasKhandekar in #3593
- Refactor IndexingMsg by @jeremiah-corrado in #3590
- Closes #3421 testing equivalence module by @ajpotts in #3509
- Update Commands.chpl to include new procs added in #3524 by @jeremiah-corrado in #3606
- Closes #3603: Small bug in register_commands.py by @stress-tess in #3604
- Closes #3278: Add ziggurat method to standard normal by @stress-tess in #3596
- Closes #3508: Update installation docs by @stress-tess in #3594
- Fixes for Chapel 2.2 deprecation warnings by @jeremiah-corrado in #3607
- Closes #3462 deprecate string_test.py by @ajpotts in #3608
- Closes #3373: Add lognormal to random number generators by @stress-tess in #3598
- Closes #3466: deprecate where_test.py by @ajpotts in #3609
- Closes #3455: deprecate regex_test.py by @ajpotts in #3610
- Closes #3444 deprecate io_util_test.py by @ajpotts in #3611
- Closes #3438 deprecate array_view_test.py by @ajpotts in #3612
- Closes #3613: Add fixed-length flag to string read benchmark for Parquet by @bmcdonald3 in #3614
- Closes #3450 deprecate operator_tests.py by @ajpotts in #3618
- Closes #3460 deprecate sort_test.py by @ajpotts in #3619
- Closes #3628: Drop python 3.8 support by @stress-tess in #3630
- Closes #3464 deprecate symbol_table_test.py by @ajpotts in #3620
- Closes #3388: deprecate pdarray_creation_test.py by @ajpotts in #3622
- Closes #3526: refactor argsortMsg to remove registerND annotation by @ajpotts in #3602
- Closes #3430: deprecate array_api tests by @ajpotts in #3629
- Closes #3624 remove make test from CI by @ajpotts in #3625
- Closes #3616, #2321: Resolve deprecation warnings in make test-proto by @stress-tess in #3621
- Closes #3453 deprecate read all tests.py by @ajpotts in #3626
- Closes #3413 deprecate tests/coargsort_test.py by @ajpotts in #3623
- Fix [slice] indexing bug on multi-locale builds by @jeremiah-corrado in #3634
- Closes #3372: Add logistic to random number generators by @stress-tess in #3605
- Closes #3049: Remove mypy version upper bound by @stress-tess in #3635
- Fixes #3627: multilocale choice failures by @stress-tess in #3636
- Fixes #3648: Fix empty bounding box error by @stress-tess in #3651
- Fixes #3650: incorrect parquet reads with larger sizes by @stress-tess in #3653
- Fixes #3649: Fix bitshift failures by @stress-tess in #3652
- Closes #3639: Add multi-dimensional testing to the CI by @ajpotts in #3640
- Closes #3641 remove translate_np_dtype by @ajpotts in #3642
- Fixes #3658: segarray with nans and empty segments hdf5 bug by @stress-tess in #3660
- Fixes #3654: Out of bounds bug writing float segarray by @stress-tess in #3659
- Closes #3646: Config changes for make test-proto by @stress-tess in #3647
- Closes #3661 set default size to 10**2 in unit tests by @ajpotts in #3663
- Fix dtype promotion table in array api module by @jeremiah-corrado in #3655
- Closes #3402, #3439, #3669: Rename proto-tests and make commands by @stress-tess in #3670
- Support for queried dtypes in tuples with registerCommands by @jeremiah-corrado in #3672
- Fixes #3666: Compilation error with multidim and multi-locale enabled by @stress-tess in #3667
- Closes #3674: add array_api tests to pytest.ini by @ajpotts in #3675
- Array transfer perf fix by @jeremiah-corrado in #3671
- Closes #3677: Occasional failures of test_string_broadcast by @stress-tess in #3678
- Fixes #3682: repeat incorrect results with multi-locale by @stress-tess in #3683
- Closes #3679: flatten and unflatten runtime checks failure by @stress-tess in #3680
- Closes #3632: refactor LinalgMsg to remove registerND annotation by @ajpotts in #3633
- Closes #3631: refactor-utilMsg-to-remove-registerND-annotation by @ajpotts in #3681
- Closes #3664: streamline get_max_array_rank checks in unit testing by @ajpotts in #3673
- Closes #3353 reference numpy functions without alias by @ajpotts in #3662
- Fixes #3687: Out of bounds multidim sorting error with multilocale by @stress-tess in #3688
- Fix vecdot entries in Commands.chpl file by @jeremiah-corrado in #3691
- Closes #3415 update compare_test by @ajpotts in #3686
- Fixes #3676 - index order bug in nonzero by @jeremiah-corrado in #3690
- Closes #3657: remove serverConfig-multi-dim.json by @ajpotts in #3692
- Documentation/Tutorials about new function-registration framework by @jeremiah-corrado in #3685
- Fix performance regression in to_ndarray by @jeremiah-corrado in #3697
- Closes #3695: Fix Parquet Fixed-Length Code Path for Incorrect File Size Assumption by @bmcdonald3 in #3696
- Closes #3703: Upper bound for python version by @stress-tess in #3707
- Closes #3705: Add annotations to skip tests based on rank. by @ajpotts in #3706
- Closes #3702: strip ArrayView out of io.py by @ajpotts in #3704
- Closes #3699 zeros, ones, full to return Array by @ajpotts in #3701
- Closes #3656: Refactor LinalgMsg.transpose to use registerCommand by @ajpotts in #3693
- Update Commands.chpl to reflect recent changes to linalg module by @jeremiah-corrado in #3724
- Closes #3718: Remove @arkouda.registerND from MsgProcessing.chpl by @ajpotts in #3726
- Closes #3711: Add Parquet fixed length string benchmark by @bmcdonald3 in #3712
- Closes #3731: Add skip_if_max_rank_less_than markers to numeric_test… by @ajpotts in #3732
- Closes #3709: reshape to return a multi-dimensional array by @ajpotts in #3715
- Closes #3297: Adds array_equal by @drculhane in #3725
- Closes #3746 remove ForwardRef by @ajpotts in #3747
- Option for running each benchmark in its own server instance by @jeremiah-corrado in #3730
- Update Sort comparators for Chapel 2.2 by @jabraham17 in #3750
- Closes #2978: Strip out ArrayView by @ajpotts in #3752
- Closes #3742: move numeric module to arkouda.numpy by @ajpotts in #3743
- Closes #3769: Read Parquet byte sizes in batches, rather than individually by @bmcdonald3 in #3770
- Closes #3767: Add batch string read by @bmcdonald3 in #3768
- Closes #3720: Update SetMsg to use the new message framework by @stress-tess in #3774
- Closes #3771: register_commands.py to handle generic scalar type by @ajpotts in #3772
- Add scalar to .configs by @stress-tess in #3775
- Fix minor variable with misleading name in PR #3772 by @ajpotts in #3776
- Fixes #3762: Fix dataframe groupby aggregations when keys contain NaNs by @stress-tess in #3766
- Sparse matrix testing and added features by @ShreyasKhandekar in #3745
- Closes 3428 putmask optimization by @drculhane in #3749
- Closes #3665: add multi-dim support to arkouda.testing module by @ajpotts in #3751
- Improved domain and type query support for registerCommand by @jeremiah-corrado in #3786
- Closes #3722: Remove @arkouda.registerND from SortMsg.chpl by @ajpotts in #3780
- Closes #3777: Fix testing logic in tests/testing/asserters_test.py by @ajpotts in #3778
- Parallel sparseMatrixtoPdarray implementation by @jeremiah-corrado in #3787
- Closes #3757 ak.array gives unexpected results on a transposed numpy multi dimensional array by @ajpotts in #3761
- Clean up registerCommands.py script by @jeremiah-corrado in #3789
- Fixes #3783, #3784, #3788: multilocale io test failures by @stress-tess in #3790
- Refactor OperatorMsg by @jeremiah-corrado in #3773
- Closes #3748, #3759, #3727: updates documentation about memory pressure during server builds by @stress-tess in #3795
- Closes #3793: make benchmark errors by @ajpotts in #3794
- Closes #3798 benchmark v2/sort cases benchmark.py throws errors by @ajpotts in #3801
- Closes #3797 benchmark v2/str locality benchmark.py throws errors by @ajpotts in #3800
- Temporarily revert #3787 by @jeremiah-corrado in #3805
- Fix test size by @jeremiah-corrado in #3806
Full Changelog: v2024.06.21...v2024.10.02