v4.0.0
API changes, new features:
- Dataset-as-folder: A dataset can now be a self-contained module in a folder with checksums, dummy data, ... This simplifies implementing datasets outside the TFDS repository.
- `tfds.load` can now load a dataset without using the generation class, so `tfds.load('my_dataset:1.0.0')` can work even if `MyDataset.VERSION == '2.0.0'` (see #2493).
- Add a new TFDS CLI (see https://www.tensorflow.org/datasets/cli for details)
- `tfds.testing.mock_data` does not require metadata files anymore!
- Add `tfds.as_dataframe(ds, ds_info)` with custom visualisation (example)
- Add `tfds.even_splits` to generate subsplits (e.g. `tfds.even_splits('train', n=3) == ['train[0%:33%]', 'train[33%:67%]', ...]`)
- Add new `DatasetBuilder.RELEASE_NOTES` property
- `tfds.features.Image` now supports PNG with 4 channels
- `tfds.ImageFolder` now supports custom shape, dtype
- Downloaded URLs are available through `MyDataset.url_infos`
- Add `skip_prefetch` option to `tfds.ReadConfig`
- `as_supervised=True` support for `tfds.show_examples`, `tfds.as_dataframe`
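
The subsplit strings shown for `tfds.even_splits` above can be reproduced with a few lines of plain Python. This is only an illustrative sketch, not the TFDS implementation: the helper name `even_split_specs` is hypothetical, and rounding each boundary to the nearest whole percent is an assumption chosen to match the `n=3` example.

```python
from typing import List


def even_split_specs(split: str, n: int) -> List[str]:
    """Hypothetical helper: build `n` contiguous percent-slice specs for `split`."""
    # Assumption: each boundary is rounded to the nearest whole percent, so the
    # slices are contiguous and together cover 0% to 100%.
    bounds = [round(i * 100 / n) for i in range(n + 1)]
    return [f"{split}[{lo}%:{hi}%]" for lo, hi in zip(bounds, bounds[1:])]


specs = even_split_specs("train", n=3)
print(specs)  # ['train[0%:33%]', 'train[33%:67%]', 'train[67%:100%]']
```

In real code, prefer calling `tfds.even_splits` directly and passing each returned element as the `split` argument of `tfds.load`.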
Breaking compatibility changes:
- `tfds.as_numpy()` now returns an iterable which can be iterated multiple times. To migrate: `next(ds)` -> `next(iter(ds))`
- Rename `tfds.features.text.Xyz` -> `tfds.deprecated.text.Xyz`
- Remove `DatasetBuilder.IN_DEVELOPMENT` property
- Remove `tfds.core.disallow_positional_args` (should use Py3 `*,` instead)
- `tfds.features` can now be saved/loaded; you may have to override `FeatureConnector.from_json_content` and `FeatureConnector.to_json_content` to support this feature.
- Stop testing against TF 1.15. Requires Python 3.6.8+.
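
Two of the migrations above rely on plain Python semantics and can be illustrated without TFDS. The sketch below uses hypothetical stand-ins: `ReusableIterable` only mimics an iterable that can be traversed several times (it is not the object `tfds.as_numpy()` returns), and `build` is an invented function showing the Py3 `*,` keyword-only syntax.

```python
class ReusableIterable:
    """Hypothetical stand-in for an iterable that is not itself an iterator."""

    def __init__(self, examples):
        self._examples = list(examples)

    def __iter__(self):
        # Return a fresh iterator on every call, so the data can be
        # traversed more than once.
        return iter(self._examples)


ds = ReusableIterable([{"label": 0}, {"label": 1}])

# Old pattern: next(ds) raises TypeError because ds has no __next__ method.
try:
    next(ds)
    old_pattern_works = True
except TypeError:
    old_pattern_works = False

first = next(iter(ds))   # new pattern: create an iterator explicitly
first_pass = list(ds)
second_pass = list(ds)   # a second full pass yields the same examples


def build(name, *, version=None):
    """Hypothetical function: everything after `*` must be passed by keyword."""
    return (name, version)


# Passing `version` positionally is rejected by the interpreter itself.
try:
    build("my_dataset", "1.0.0")
    positional_rejected = False
except TypeError:
    positional_rejected = True

result = build("my_dataset", version="1.0.0")
```

Because the `*,` marker is enforced by the language, a decorator such as `disallow_positional_args` is no longer needed for this purpose.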
Other bug fixes:
- Better archive extension detection for `dl_manager.download_and_extract`
- Fix `tfds.__version__` in TFDS nightly to be PEP440 compliant
- Fix crash when GCS not available
- Script to detect dead-urls
- Improved open-source workflow, contributor guide, documentation
- Many other internal cleanups, bugs, dead code removal, py2->py3 cleanup, pytype annotations,...
And of course, new datasets and dataset updates.
A gigantic thanks to our community, which has helped us debug issues and implement many features, especially vijayphoenix@ for being a major contributor.