Update TFDS version to 4.0.0
API changes, new features:

* Dataset-as-folder: A dataset can now be a self-contained module in a folder with checksums, dummy data, etc. This simplifies implementing datasets outside the TFDS repository.
* `tfds.load` can now load a dataset without using the generation class. So `tfds.load('my_dataset:1.0.0')` can work even if `MyDataset.VERSION == '2.0.0'` (See #2493).
* Add a new TFDS CLI (see https://www.tensorflow.org/datasets/cli for details)
* `tfds.testing.mock_data` does not require metadata files anymore!
* Add `tfds.as_dataframe(ds, ds_info)` with custom visualisation ([example](https://www.tensorflow.org/datasets/overview#tfdsas_dataframe))
* Add `tfds.even_splits` to generate subsplits (e.g. `tfds.even_splits('train', n=3) == ['train[0%:33%]', 'train[33%:67%]', ...]`)
* Add new `DatasetBuilder.RELEASE_NOTES` property
* `tfds.features.Image` now supports PNG with 4 channels
* `tfds.ImageFolder` now supports custom shape and dtype
* Downloaded URLs are available through `MyDataset.url_infos`
* Add `skip_prefetch` option to `tfds.ReadConfig`
* Add `as_supervised=True` support for `tfds.show_examples` and `tfds.as_dataframe`
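
The `tfds.even_splits` entry above can be illustrated with a small pure-Python sketch. The helper name `even_splits_sketch` and its rounding rule are assumptions made for illustration, not the library's implementation; the sketch only reproduces the percent-boundary strings shown in the example.

```python
from typing import List


def even_splits_sketch(split: str, n: int) -> List[str]:
    """Hypothetical helper: build `n` contiguous percent sub-splits of `split`."""
    if not 0 < n <= 100:
        raise ValueError(f"n must be in [1, 100], got {n}")
    # Percent boundaries from 0 to 100, rounded to whole percents
    # (the rounding rule is an assumption, not taken from TFDS).
    bounds = [round(i * 100 / n) for i in range(n + 1)]
    # Pair each boundary with the next one to form adjacent slices.
    return [f"{split}[{lo}%:{hi}%]" for lo, hi in zip(bounds, bounds[1:])]


splits = even_splits_sketch("train", n=3)
print(splits)  # ['train[0%:33%]', 'train[33%:67%]', 'train[67%:100%]']
```

Each resulting string uses the slicing syntax accepted by the `split` argument of `tfds.load`, so the sub-splits cover the full split without overlap.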

Backward-incompatible changes:

* `tfds.as_numpy()` now returns an iterable which can be iterated multiple times. To migrate, replace `next(ds)` with `next(iter(ds))`
* Rename `tfds.features.text.Xyz` -> `tfds.deprecated.text.Xyz`
* Remove `DatasetBuilder.IN_DEVELOPMENT` property
* Remove `tfds.core.disallow_positional_args` (use the Python 3 keyword-only marker `*,` instead)
* `tfds.features` can now be saved/loaded. You may have to override [FeatureConnector.from_json_content](https://www.tensorflow.org/datasets/api_docs/python/tfds/features/FeatureConnector?version=nightly#from_json_content) and `FeatureConnector.to_json_content` to support this feature.
* Stop testing against TF 1.15. Requires Python 3.6.8+.
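
The `tfds.as_numpy()` migration above follows from the difference between an iterator and a re-iterable object in Python. The sketch below uses a hypothetical stand-in class, `ReiterableSketch`, rather than TFDS itself; it is only meant to show why `next(ds)` stops working and why `next(iter(ds))` is the replacement.

```python
class ReiterableSketch:
    """Hypothetical stand-in for an iterable that can be iterated multiple times."""

    def __init__(self, items):
        self._items = list(items)

    def __iter__(self):
        # A fresh iterator on every call, so each loop restarts from the beginning.
        return iter(self._items)


ds = ReiterableSketch([{"label": 0}, {"label": 1}])

# Old style: fails, because the object is an iterable, not an iterator.
try:
    next(ds)
    old_style_works = True
except TypeError:
    old_style_works = False

# New style: create an iterator explicitly, then take the first element.
first = next(iter(ds))

# The same object can be consumed more than once.
first_pass = list(ds)
second_pass = list(ds)
```

Code that loops with `for ex in ds:` needs no change, since a `for` loop calls `iter()` implicitly.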

Other bug fixes:

* Better archive extension detection for `dl_manager.download_and_extract`
* Fix `tfds.__version__` in TFDS nightly to be PEP440 compliant
* Fix crash when GCS not available
* Add a script to detect dead URLs
* Improved open-source workflow, contributor guide, documentation
* Many other internal cleanups, bugs, dead code removal, py2->py3 cleanup, pytype annotations,...

And of course, new datasets and dataset updates.

A gigantic thanks to our community, which has helped us debug issues and implement many features, especially vijayphoenix@, who has been one of our main contributors for this release.

PiperOrigin-RevId: 335667395
Conchylicultor authored and copybara-github committed Oct 6, 2020
1 parent 5dd79ad commit 3bfcc7e
Showing 3 changed files with 165 additions and 28 deletions.