Releases: Bears-R-Us/arkouda

Release Notes v2023.05.05

05 May 18:03
a6629af

Bug Fixes

  • Issue #2398 - Fixes parquet error on list columns containing nested lists
  • Issue #2380 - Fixes SegArray register bug
  • Issue #2396 - Fixes server crash caused by nested parquet fields
  • Issue #2300 - Improves Strings strip performance

New Features

  • Issue #2296 - Adds bitops support for bigint pdarrays
  • Issue #474 - Adds HDF5 overwrite dataset
  • Issues #2372 and #2373 - Update Categorical HDF5 format and add update_hdf method
  • Issue #2377 - Adds groupby aggregations that require min/max on bigint
  • Issue #1855 - Adds divmod support
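As a rough illustration of the divmod entry (Issue #1855), the sketch below applies Python's built-in `divmod` elementwise to plain lists. It assumes Arkouda follows Python's floor-division semantics for integers, which these notes do not confirm. The helper `elementwise_divmod` is hypothetical and is not part of the Arkouda API.

```python
def elementwise_divmod(xs, ys):
    """Return (quotients, remainders) using Python's floor-division semantics.

    Hypothetical helper for illustration only; not an Arkouda function.
    """
    if len(xs) != len(ys):
        raise ValueError("inputs must have the same length")
    pairs = [divmod(x, y) for x, y in zip(xs, ys)]
    quotients = [q for q, _ in pairs]
    remainders = [r for _, r in pairs]
    return quotients, remainders


q, r = elementwise_divmod([7, -7, 9], [2, 2, 3])
# Each pair satisfies x == q * y + r, with r taking the sign of the divisor.
```

Under these semantics a negative dividend with a positive divisor rounds the quotient toward negative infinity, which is the case most likely to differ from C-style truncating division.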

Minor Updates

  • Issue #2355 - Drops support for Chapel 1.28
  • Issue #2138 - Updates messaging overview docs
  • Issue #2368 - Cleans up Strings references in SegArray
Auto-Generated Release Notes

  • Closes #2370 - Fixes Deprecation Warnings during `make` by @Ethan-DeBandi99 in https://github.com//pull/2371
  • Closes #2368 - Cleans up Strings references in SegArray by @Ethan-DeBandi99 in https://github.com//pull/2369
  • Closes #1855: Implement `divmod` by @jaketrookman in https://github.com//pull/2356
  • Closes #2374: Pin hdf5 version to 1.12.2 by @pierce314159 in https://github.com//pull/2375
  • Closes #2355: Drop support for 1.28 by @bmcdonald3 in https://github.com//pull/2383
  • Closes #2377: Add groupby aggregations that require min/max on bigint by @pierce314159 in https://github.com//pull/2378
  • Closes #474 - HDF5 Overwrite Dataset by @Ethan-DeBandi99 in https://github.com//pull/2382
  • Closes #2300 - `segString` Strip performance issue by @joshmarshall1 in https://github.com//pull/2379
  • Fixes #2380: Segarray register bug by @pierce314159 in https://github.com//pull/2392
  • Closes #2341: Quiet 131 deprecations by @bmcdonald3 in https://github.com//pull/2342
  • Closes #2372 & #2373 - Categorical HDF5 Format Update & `update_hdf` method by @Ethan-DeBandi99 in https://github.com//pull/2394
  • Closes #2396 - Fixes Nested parquet fields causing server crash by @Ethan-DeBandi99 in https://github.com//pull/2399
  • Closes #2398 - Fixes Parquet List Columns with Nested Lists Error by @Ethan-DeBandi99 in https://github.com//pull/2401
  • Closes #2403: Update c_getDatasetNames calls to match new behavior by @bmcdonald3 in https://github.com//pull/2404
  • Add a use of the Math module to avoid the anticipated `pi` deprecation warning by @lydia-duncan in https://github.com//pull/2407
  • Deprecation updates for `BitOps.popcount` and `bigint.mod` by @jeremiah-corrado in https://github.com//pull/2409
  • Closes #2296: bitops support for bigint by @pierce314159 in https://github.com//pull/2408
  • Closes #2138 Update messaging overview docs by @jaketrookman in https://github.com//pull/2391
  • Deprecation updates for `datetime` and `fromTimestamp` by @bmcdonald3 in https://github.com//pull/2410

Full Changelog: v2023.04.07...v2023.05.05

Release Notes v2023.04.07

07 Apr 13:30
80c4fe6

Bug Fixes

  • Issue #2329 - Fixes continued SegArray read performance issues
  • Issue #2299 - Fixes file writes reporting success when directory does not exist
  • Issue #2297 - Fixes HDF5 single file write stall
  • Issue #2327 - Fixes issue loading 16-bit and 32-bit from Parquet
  • Issue #2337 - Fixes ak.DataFrame.to_parquet with IPv4 columns
  • Issue #2348 - Fixes IPv4 removal from DataFrame
  • Issue #2350 - Fixes DataFrame column subset access
  • Issues #2306, #2317 and PR #2354 - Fix Datetime component scaling
  • Issue #2328 - Fixes bug when printing DataFrames containing bigint
  • Issue #2307 - Fixes reported memory usage exceeding 100 percent
  • Issue #2309 - Fixes error in Groupby.unique on Categorical or Strings
  • Issue #2240 - Fixes ak.coargsort empty String and Categorical bug
  • Issue #2347 - Fixes broken links in README.md

New Features

  • Issue #2050 - Adds File of Origin when loading data from Parquet or HDF5. Use ak.read_tagged_data to return the file origin information.
  • Issue #2295 - Adds SegArray filters. ak.SegArray.filter() allows values to be removed from a SegArray.
  • Issue #2293 - Adds ak.where support for Strings and Categoricals
  • Issue #2324 - Adds benchmark documentation
  • Issue #2209 - Enhances Arkouda metrics
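To make the SegArray filter entry (Issue #2295) more concrete: a SegArray holds variable-length rows as one flat values array plus the start offset of each row. The sketch below shows, in plain Python lists, what removing values from that layout involves. It does not call ak.SegArray.filter(), whose signature is not given in these notes, and the helper `filter_segments` is hypothetical.

```python
def filter_segments(segments, values, unwanted):
    """Drop every value in `unwanted` and rebuild the segment offsets.

    segments[i] is the start index of row i within the flat `values` list.
    Hypothetical helper for illustration; not the Arkouda implementation.
    """
    unwanted = set(unwanted)
    bounds = list(segments) + [len(values)]
    new_segments, new_values = [], []
    for start, stop in zip(bounds[:-1], bounds[1:]):
        # A row starts wherever the filtered output currently ends.
        new_segments.append(len(new_values))
        new_values.extend(v for v in values[start:stop] if v not in unwanted)
    return new_segments, new_values


# Rows before filtering: [1, 2, 3], [2], [4, 2, 5]
segs, vals = filter_segments([0, 3, 4], [1, 2, 3, 2, 4, 2, 5], unwanted=[2])
```

In this sketch a row that loses all of its values is kept as an empty row (the second row above). Whether empty rows should be kept or discarded is a design choice the notes do not describe.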

Minor Updates

  • Issue #2015 - Updates Parquet NaN detection on float/double columns to be more efficient
  • Issue #2280 - Improves max_bits handling
  • Issues #2319, #2320, #2339 - Update the test framework to use pytest configuration

Full Changelog: v2023.03.24...v2023.04.07

Release Notes v2023.03.24

24 Mar 21:52
424e162

Bug Fixes

  • Issue #2265 - Fixes bug in ak.DataFrame.to_parquet with empty Strings Column
  • Issue #2263 - Fixes bug which caused slow reads of large SegArrays and Strings
  • Issue #2214 - Fixes bigint rotate by more than max_bits bug
  • Issue #2183 - Fixes index reset in dataframe get_head_tail
  • Issues #2179 and #2199 - Fix out-of-bounds (OOB) error when writing SegArray to HDF5 when the number of locales exceeds the number of segments

New Features

Minor Updates

  • Issue #2110 - Updates _buildReadAllJSON to use Map
  • Issue #2077 - Remove duplicated bigint logic in IndexingMsg.chpl

Full Changelog: v2023.03.01...v2023.03.24

Release Notes v2023.03.01

01 Mar 13:23
8ae978b

Bug Fixes

  • Issue #2163 - Resolves issue with SymEntry destruction for GroupBy objects.
  • Issue #2173 - Resolves SymEntry destruction bug for Strings and SegArray.
  • PR #2164 - Updates memory checks to use Chapel runtime view of allocatable memory when available
  • Issue #1987 - Resolves issues with AutoAPI documentation.
  • Issue #2129 - Resolves a periodic 403 error for integration and metrics-enabled Arkouda

New Features

  • Issues #2118, #2141, #2145, #2156 - Enable reading of Parquet files with columns containing SegArray objects.
  • PRs #2178 and #2131 (part of Issue #2088) and Issues #2139 and #1961 - Speed up BigInt pdarray creation using bigint_from_uint_arrays
  • Issue #2147 - Adds API for hashArrays
  • Issue #1835 - Updates SegArray HDF5 save format. Backwards compatibility maintained.
  • Issue #1939 - Adds % and %= for floats
  • Issue #1522 - Adds Index and MultiIndex support for key in Series.locate()

Minor Updates

  • Issue #2152 - Enhances memory management logging and metrics
  • Issue #2177 - Updates to ak.client.maxTransferBytes

Full Changelog: v2023.02.08...v2023.03.01

Release Notes v2023.02.08

08 Feb 16:40
5909302

Bug Fixes

  • Issue #2068 - Fixes dataframe groupby with categorical index bug
  • Issue #2076 - Fixes integer overflow in groupby.mean
  • Issue #2105 - Fixes bug when loading a dataframe containing a segarray with an underscore (_) in the column name
  • Issue #2099 - Fixes bug in left and right shift by >=64 bits for int/uint
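The groupby.mean entry (Issue #2076) does not describe the fix, so the following is only a sketch of the general failure mode: a mean built on a fixed-width 64-bit integer sum can wrap, while a floating-point accumulator cannot. All three helpers are hypothetical and use only the Python standard library.

```python
def wrap_int64(x):
    """Reduce a Python int to the value a signed 64-bit integer would hold."""
    return (x + 2**63) % 2**64 - 2**63


def mean_with_int64_sum(values):
    """Mean computed from a sum that wraps like a 64-bit accumulator."""
    total = 0
    for v in values:
        total = wrap_int64(total + v)
    return total / len(values)


def mean_with_float_sum(values):
    """Mean computed from a floating-point accumulator, which cannot wrap."""
    total = 0.0
    for v in values:
        total += float(v)
    return total / len(values)


group = [2**62, 2**62, 2**62]  # each value fits in int64, but the sum does not
wrapped = mean_with_int64_sum(group)
safe = mean_with_float_sum(group)
```

The wrapped mean comes out negative even though every input is positive, which is the kind of symptom an overflow in a grouped mean would produce.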

New Features

  • Issues #2047, #2117 - Add CSV Support
  • Issue #2060 - Adds pdarray data type and size to metrics
  • Issue #2042 - Adds BigInt support in SegArray
  • Issue #2111 - Enables load_all and read workflows with dataframes containing segarrays
  • Issue #1695 - Renames util Packages
  • Issue #2058 - Enables logging to a variety of channels

Minor Updates

  • Issue #1994 - Adds aggregation interface for bigint
  • Issues #1988, #2073, #2082, #2089, #2094, #2103, #2108, #2124 and PR #2057 - Update documentation
  • Issues #2066, #2070, #2092 and PR #2074 - Provide the following updates to compilation:
    • Separates setting optimization level and enabling runtime checks
    • Disables bulk transfer when using ARKOUDA_QUICK_COMPILE
    • Improves Makefile errors when dependencies aren't found
    • Updates iconv check during compilation
  • Issue #2097 - Adds save-data flag to benchmark script
  • Issue #2096 - Updates to prevent numpy overflow deprecation warnings
  • Issue #2113 - Corrects naming and bug in segarray bigint test

Full Changelog: v2023.01.11...v2023.02.08

Release Notes v2023.01.11

11 Jan 22:26
62b3266

Major updates:

  • Issues #1970, #1989, #1995, #2000, #2009, #2013, #2033, and #2041 - Add bigint pdarrays with binary operations and support for sort, in1d, search_intervals, groupby, and dataframe
  • Issue #1876 - 2.5x speed up of multi-column write for parquet
  • Issue #2019 - Fixes bug preventing reading strings formatted by older versions of Arkouda
  • Issues #1297 and #1983 - Change ak.array to prefer uint over float when containing values >2**63
  • Issue #2005 - Fixes parquet read error for columns containing NaNs
  • Issues #1991, #2021, and #2024 - Add additional parquet compression support
  • Issues #1962 and #2038 - Add parameters to ak.get_mem_* functions and add percentage of memory used to overmemlimit logs
  • Issues #1965, #1979, #1981, #1986, #1996, and #2026 - Rework and update online documentation
  • PR #1966 - Recommends Chapel 1.29.0 and updates CI to use it
  • Issue #1850 - Removes legacy HDF5 multi-dim

Minor fixes:

  • Issues #1949, #1951, #2054, and PR #2016 - Add support for conversions between IDNA and non-UTF-8 encodings
  • Issue #2011 - Updates version requirement for h5py and numpy
  • Issue #1932 - Fixes bug with binopvv between uint and bool
  • Issue #1963 - Fixes message arg failure when List contains Strings and pdarray
  • Issue #1972 - Switches to allclose for float comparison in operator test

Full Changelog: v2022.12.09...v2023.01.11

Release Notes v2022.12.09

09 Dec 16:50
f5ef366

Release Notes 2022-12-09

Major updates:

  • Issues #1914, #1917, and #1922 - Add serverInfoNoSplash and autoShutdown flags along with documentation for running arkouda from a script
  • Issue #1927 - Quiets HDF5 Errors when ObjType attribute is missing
  • Issues #1904 and #1935 - Add Chapel-native encoding/decoding functionality
  • Issue #1896 - Separates client IO in anticipation of IO rework
  • Issue #1947 - Fixes conversion error when IPv4 is Index

Minor fixes:

  • Issues #1796, #1722, and #1894 - Update SegArray to only register a single object
  • Issues #1926 and #1938 - Fix operation equals for uint arrays
  • Issue #1901 - Reduces SymEntry creation overheads
  • Issue #1941 - Fixes condition for regex edge case
  • Issue #1905 - Adds encoding libraries to conda dependencies
  • Issue #319 - Moves chpl tests to tests/server directory
  • Issue #1943 - Quiets deprecation warnings in preparation for Chapel 1.29

Full Changelog: v2022.11.17...v2022.12.09

Release Notes v2022.11.17

17 Nov 22:57
cdeab05

Release Notes 2022-11-17

Major updates:

  • Issue #1906 - Supports older HDF5 files by assuming pdarray/Strings when no ObjType attribute is set
    • Note: This removes the need to use the legacyHDF5 flag
  • Issue #1909 - Adds support for __invert__ calls on uint
  • Issues #1844 and #1912 - Add option for hierarchical behavior to search_intervals
    • Note: This behavior is the new default. To maintain existing behavior, set hierarchical=False

Minor fixes:

  • Issue #1727 - Adds where argument to sqrt and power
  • Issue #1800 - Adds Symbol Table Overview documentation

Full Changelog: v2022.11.10...v2022.11.17

Release Notes v2022.11.10

10 Nov 20:51
1e31ee7

Release Notes 2022-11-10

In Memoriam

Mike Merrill (@mhmerrill), one of the co-founders of Arkouda, recently passed away. Without his leadership and contributions, the project would not exist. It's hard to overstate Mike's impact. He was a great person and will be dearly missed. Our deepest condolences to his family.

Major updates:

  • Issues #487, #1558, #1559, #1846, #1877, #1887 and PR #1879 - Rework HDF5 structure and schema, enable writing to a single file, and add documentation of schema
    • NOTE: Files written with tag v2022.10.13 or earlier need to be read with the legacyHDF5 flag set and re-written with the new format
  • Issue #1891 - Fixes bug in IDNA decode
  • Issues #1776, #1847, #1852, and #1867 - Optimize GroupBy on small strings

Minor fixes:

  • PR #1858 and Issue #1859 - Switches to C++17 for Arrow compilation and updates Arrow version to 9.0.0
  • Issue #1801 - Reorganizes structure of the symbol table
  • Issue #1779 - Adds documentation for creating a new symbol table entry
  • Issues #1839 and #1889 - Update MessageArgs parameter for CommandMap functions
  • Issue #1837 - Resolves intermittent failures in GroupBy prod aggregate test
  • Issue #1868 - Adds name property to AbstractSymEntry
  • Issue #1854 - Takes advantage of set and generator comprehensions in client code
  • Issue #1842 - Updates COMPARISON.md

Full Changelog: v2022.10.13...v2022.11.10

Release Notes v2022.10.13

13 Oct 16:49
a52645f

Release Notes 2022-10-13

Major updates:

  • Issue #1816 - Drops support for Chapel 1.26
  • Issues #1799 and #1828 - Add encoding and decoding support for IDNA
  • Issues #1831 and #1832 - Fix bugs with strings and fancy pdarray types in DataFrame indexing
  • Issue #1825 - Fixes aggregator use in median reduction
  • PR #1822 - Fixes errors with large Parquet files
  • Issue #1795 and PR #1812 - Add Chapel Intro and Client-Server communication docs for new developers
  • Issue #1689 - Adds support for DataFrame to generic attach
  • PR #1826 - Avoids flushing aggregators in lockstep

Minor fixes:

  • Issue #1792 - Adds GroupBy Symbol Table Entry
  • PR #1818 - Removes old re2 dependency compatibility check
  • PR #1823 - Fixes Chapel version query

Full Changelog: v2022.09.30...v2022.10.13