Releases: iterative/datachain
0.3.15
What's Changed
- Add resolve files by @EdwardLi-coder in #313
- unskip test_udf_parallel by @mattseddon in #432
- fix last modified comparison in resolve file test by @mattseddon in #436
- Refactor `Client.parse_url()` by @ilongin in #435
- Set stream for nested file signals by @dberenbaum in #443
- Read arrow files from cache by @dberenbaum in #442
- Auto-detect huggingface datasets when reading tabular data by @dberenbaum in #398
- Add `datachain.lib.tar.process_tar()` generator by @rlamy in #440
- Fix storage dependencies by @ilongin in #421
Full Changelog: 0.3.14...0.3.15
0.3.14
What's Changed
- fix dependency install instructions for examples by @mattseddon in #426
- Show progress bar for pytorch conversion by @dberenbaum in #429
- Fix calculating datasets stats size by @dreadatour in #418
- use the correct fixtures in tests by @mattseddon in #428
- Adding Complex Type Support to Signal Schema by @dtulga in #422
- tests: fix mock for subprocess stdout/stderr to return BytesIO by @skshetry in #431
- prevent tests from hanging on CI (windows) by @mattseddon in #427
- Remove Entry class and use File instead by @rlamy in #419
Full Changelog: 0.3.13...0.3.14
0.3.13
0.3.12
What's Changed
- Fixes settings by @dberenbaum in #397
- fix open file method for tar files by @dberenbaum in #412
- disable execution of last query expression by default by @skshetry in #407
New Contributors
- @yathomasi made their first contribution in #408
Full Changelog: 0.3.11...0.3.12
0.3.11
What's Changed
- query: remove use of pipe for communication by @skshetry in #393
- do not require last statement to be an expression or an instance of DatasetQuery by @skshetry in #395
- pin pydantic < 2.9 by @mattseddon in #399
- unpin pydantic, use python API for datamodel_codegen by @skshetry in #400
- Update the DataChain logo in the README and docs by @djsauble in #402
- avoid splitting script into feature files/scripts by @skshetry in #385
- allow merge on expressions by @mattseddon in #388
New Contributors
Full Changelog: 0.3.10...0.3.11
0.3.10
What's Changed
- Support for reading from huggingface hub with `hf://` filesystem by @dberenbaum in #375
- Simplify datachain.lib.listing by reusing Client.scandir() by @rlamy in #376
- Use stderr for sql debug prints by @shcheklein in #378
- Refactor `DataChain.from_storage()` to use new listing generator by @ilongin in #294
- remove unused finally block by @mattseddon in #379
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #382
- increase timeout of e2e test by @mattseddon in #383
- metrics: save metrics in realtime by @skshetry in #387
- query: remove support for saving dataset query with a given name by @skshetry in #389
- Using job class instead of hardcoded `Job` by @ilongin in #391
- cli: remove preview from `datachain query` command by @skshetry in #392
- fix issues with new version of huggingface datasets package by @mattseddon in #394
- Add `DataChain.listings()` method and use it in getting storages by @ilongin in #331
Full Changelog: 0.3.9...0.3.10
0.3.9
What's Changed
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #354
- increase timeout of datachain tests in CI (Windows) by @mattseddon in #363
- remove LaionMeta model store registration from wds example by @mattseddon in #364
- slight positioning change to deny AI abstractions by @volkfox in #356
- unstructured example - remove misleading install instructions by @mattseddon in #366
- improve datachain subtract by @EdwardLi-coder in #352
- Fixing get_file_signals for custom types by @dtulga in #371
Full Changelog: 0.3.8...0.3.9
0.3.8
What's Changed
- remove blip2 image desc example by @mattseddon in #338
- Convert 'Union[str, Literal[...]]' type to string by @dreadatour in #345
- increase timeout of datachain tests in CI by @mattseddon in #347
- reduce down to a single claude example by @mattseddon in #346
- Add ability to set row size for flushing udf results to database by @dberenbaum in #342
- Revert float64 tests from #13 by @dreadatour in #341
- Fix empty 'save()' as query last statement by @dreadatour in #357
- Remove Catalog.merge_datasets() by @EdwardLi-coder in #350
- Adding Custom Type (De)Serialization to Signal Schema by @dtulga in #348
- `DataChain.from_hf` by @dberenbaum in #311
- Add `device` parameter to convert functions and update usage model de… by @ayasyrev in #351
New Contributors
Full Changelog: 0.3.7...0.3.8
0.3.7
What's Changed
- Convert custom columns types in dataset_select_paginated by @dreadatour in #339
Full Changelog: 0.3.6...0.3.7
0.3.6
What's Changed
- add retry locks to SQLiteDatabaseEngine execute_str by @mattseddon in #333
- Mutate cannot modify existing column by @EdwardLi-coder in #306
- Mutate can rename columns by @srini047 in #312
- Handle carriage return to support progress bar in logs by @amritghimire in #326
New Contributors
Full Changelog: 0.3.5...0.3.6