Releases: kenfar/DataGristle

v0.2.2

20 May 03:21
939423f

V0.2.2 - 2021-05

  • Improvement: the field-names from headers can now be used instead of column offsets
    for gristle_sorter, gristle_freaker, gristle_profiler, and gristle_slicer.
  • Improvement: The use of the header now follows four simple rules:
    • It can be referred to as row 0 when it makes sense - like with gristle_slicer
      & gristle_viewer.
    • It will be passed through when it makes sense - like with gristle_sorter.
    • It will be used to translate field names to offsets for configuration.
    • But will otherwise be ignored.
  • Bug Fix: gristle_freaker was failing with 0-length files when using col-type=each
  • Bug Fix: gristle_sorter was failing with some multi-directional sorts
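
The third header rule (translating field names to offsets) can be illustrated with a minimal sketch. This is not DataGristle's actual code: the function name `resolve_columns` and the sample data are hypothetical, and it only assumes that a column spec is either a numeric offset or a name found in the header row.

```python
import csv
import io

def resolve_columns(header, specs):
    """Translate a mix of field names and numeric offsets into offsets.

    Hypothetical helper for illustration only.
    """
    offsets = []
    for spec in specs:
        if spec.isdigit():
            offsets.append(int(spec))           # already an offset
        else:
            offsets.append(header.index(spec))  # raises ValueError for an unknown name
    return offsets

sample = "id,name,state\n1,ann,co\n2,bob,ny\n"
rows = list(csv.reader(io.StringIO(sample)))
header, data = rows[0], rows[1:]   # the header is row 0; it is not treated as data

offsets = resolve_columns(header, ["state", "0"])
projected = [[row[i] for i in offsets] for row in data]
```

Here `offsets` selects the `state` column by name and the `id` column by position, so the same configuration style works with or without field names.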

Installation can be done through either pypi or building from source:

v0.2.1

22 Apr 18:20
b339130

V0.2.1 - 2021-04

Improvement: added gristle_sorter as a script to install in the system so that it is available to users.
Improvement: Now supports python versions 3.8 and 3.9.
Improvement: All csv programs now support envvars and config files for input and can generate config files.
Improvement: Programs always autodetect file csv dialect before applying user overrides - except for piped-in data. This results in a very consistent experience but also means that you may sometimes need to turn dialect options off rather than only on.
Improvement: A directory of example configurations is provided for reference - and is also used for testing: https://github.com/kenfar/DataGristle/tree/master/examples
BREAKING CHANGE: dropped support for python version 3.7
BREAKING CHANGES to all csv programs:

  • Various changes to names of options for consistency between programs, with older names caught with an error msg that provides the new name.
  • Various improvements to csv dialect handling for consistency and correct handling of escapechar, doublequoting, skipinitialspace.
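
The "autodetect first, then apply user overrides" behavior described above can be sketched with Python's standard `csv.Sniffer`. This is an assumption-laden illustration, not the DataGristle implementation: `resolve_dialect` and the override dictionary are hypothetical, and `None` is used here to mean "option not supplied", which is why an option sometimes has to be explicitly set to `False` to turn off a detected value.

```python
import csv
import io

def resolve_dialect(sample_text, overrides):
    """Sniff the dialect, then let explicitly supplied options win.

    Hypothetical helper: an override of None means "not supplied",
    while False is a real value that turns a detected option off.
    """
    sniffed = csv.Sniffer().sniff(sample_text)
    params = {
        "delimiter": sniffed.delimiter,
        "quotechar": sniffed.quotechar,
        "doublequote": sniffed.doublequote,
        "skipinitialspace": sniffed.skipinitialspace,
        "escapechar": sniffed.escapechar,
    }
    params.update({k: v for k, v in overrides.items() if v is not None})
    return params

sample = "id;name;note\n1;ann;x\n2;bob;y\n"
params = resolve_dialect(sample, {"delimiter": None, "doublequote": False})
rows = list(csv.reader(io.StringIO(sample), **params))
```

The delimiter comes from detection because no override was given, while `doublequote` is forced off by the explicit `False`.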

Installation can be done through either pypi or building from source:

* pip from pypi:  https://pypi.org/project/datagristle/0.2.1/
* pip from this release on github:  pip install -U -e git://github.com/kenfar/[email protected]#egg=datagristle
* build from source

Maintenance release with breaking changes

23 Jul 03:25

Improvement: now supports python versions 3.7 and 3.8
BREAKING CHANGE: dropped support for python version 3.6
Bumped versions on dependent modules to eliminate vulnerabilities

gristle_differ

  • BREAKING CHANGE: col_names renamed to col-names for consistency
  • Fixes --already-unix option bug with file parsing
  • Fixes --stats bug with empty files
  • Improvement: added ability to use column names from file headers
  • Improvement: if a key-col is in the ignore-cols - it will simply be ignored, and the program will continue processing.
  • Improvement: if a key-col is in the compare-cols - it will simply be ignored, and the program will continue processing.
  • Improvement: if neither compare-cols nor ignore-cols are provided it will use all cols as compare-cols and continue processing.
  • Improvement: CLI help is updated to provide more details and accurate examples of these options.
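
One possible reading of the key-col, compare-cols and ignore-cols rules above is sketched below. The function `resolve_compare_cols` is hypothetical and is not taken from gristle_differ; in particular, it assumes that a key column listed in either option is dropped from the comparison set and that compare-cols takes precedence when both options are given.

```python
def resolve_compare_cols(all_cols, key_cols, compare_cols=None, ignore_cols=None):
    """Return the columns to compare, never including key columns.

    Hypothetical sketch of the rules in the release notes.
    """
    keys = set(key_cols)
    if compare_cols:
        # A key-col listed in compare-cols is skipped rather than rejected.
        return [c for c in compare_cols if c not in keys]
    if ignore_cols:
        # A key-col listed in ignore-cols is skipped; it remains a key.
        ignored = set(ignore_cols)
        return [c for c in all_cols if c not in ignored and c not in keys]
    # Neither option given: compare every non-key column.
    return [c for c in all_cols if c not in keys]

all_cols = ["id", "name", "state"]
default_cols = resolve_compare_cols(all_cols, ["id"])
explicit_cols = resolve_compare_cols(all_cols, ["id"], compare_cols=["id", "name"])
remaining_cols = resolve_compare_cols(all_cols, ["id"], ignore_cols=["id", "state"])
```

In all three cases the program can continue instead of stopping on an overlapping option, which is the point of the improvements listed above.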

Added gristle_dir_merger

21 Sep 01:27

This release adds gristle_dir_merger - a tool for consolidating large directories of files. This tool is both fast and flexible. More info can be found in the README, or by entering gristle_dir_merger --long-help.

Upgraded gristle_validator

17 Feb 23:38

The primary feature of this release is support for JSON Schema within gristle_validator. This allows users to define a schema with data quality requirements: identify the fields in a csv, then for each field describe its type, min & max value, min & max length, whether or not blanks are allowed, and a regex validation pattern.
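
To make the kinds of per-field checks listed above concrete, here is a minimal hand-rolled sketch. It is not gristle_validator's schema format and not a JSON Schema implementation: the rule keys (`min_len`, `blank_ok`, and so on) and both function names are hypothetical, chosen only to mirror the requirements named in the release note.

```python
import re

# Hypothetical rules keyed by column offset; the key names are illustrative only.
RULES = {
    0: {"type": int, "min": 1, "max": 9999},
    1: {"type": str, "min_len": 2, "max_len": 2,
        "blank_ok": False, "pattern": r"[A-Z]{2}"},
}

def check_field(value, rule):
    """Return a list of problems found in one csv field."""
    if value == "":
        return [] if rule.get("blank_ok", False) else ["blank not allowed"]
    errors = []
    if "min_len" in rule and len(value) < rule["min_len"]:
        errors.append("too short")
    if "max_len" in rule and len(value) > rule["max_len"]:
        errors.append("too long")
    if "pattern" in rule and not re.fullmatch(rule["pattern"], value):
        errors.append("pattern mismatch")
    if rule.get("type") is int:
        try:
            number = int(value)
        except ValueError:
            return errors + ["not an integer"]
        if "min" in rule and number < rule["min"]:
            errors.append("below minimum")
        if "max" in rule and number > rule["max"]:
            errors.append("above maximum")
    return errors

def check_row(row, rules):
    """Map each failing column offset to its list of problems."""
    problems = {}
    for offset, rule in rules.items():
        errors = check_field(row[offset], rule)
        if errors:
            problems[offset] = errors
    return problems

good = check_row(["42", "CO"], RULES)
bad = check_row(["0", "co"], RULES)
```

A row that satisfies every rule yields an empty result, while a failing row reports which offsets broke which requirement.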