Release v3.0.0 · tensorflow/datasets

Breaking changes:

Legacy mode tfds.experiment.S3 has been removed
New image_classification section. Some datasets have been moved there from images.
in_memory argument has been removed from as_dataset/tfds.load (small datasets are now auto-cached).
DownloadConfig no longer appends the dataset name (manual data should be in <manual_dir>/ instead of <manual_dir>/<dataset_name>/).
Tests now check that all dl_manager.download URLs have registered checksums. To opt out, add SKIP_CHECKSUMS = True to your DatasetBuilderTestCase.
tfds.load now always returns tf.compat.v2.data.Dataset. If you're still using tf.compat.v1:
- Use tf.compat.v1.data.make_one_shot_iterator(ds) rather than ds.make_one_shot_iterator()
- Use isinstance(ds, tf.compat.v2.data.Dataset) instead of isinstance(ds, tf.data.Dataset)
tfds.Split.ALL has been removed from the API.
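
For code still on tf.compat.v1, the two substitutions listed for tfds.load might look roughly like the sketch below. It assumes TensorFlow 2.x is installed and uses tf.data.Dataset.range(3) as a stand-in for whatever tfds.load returns, so nothing has to be downloaded. Note that the v2 class lives under the data submodule, i.e. tf.compat.v2.data.Dataset.

```python
import tensorflow as tf

# Stand-in for the dataset returned by tfds.load(...). This is an assumption
# made for illustration; any tf.data pipeline behaves the same way here.
ds = tf.data.Dataset.range(3)

# The v2 dataset class has no make_one_shot_iterator() method, so call the
# module-level compat function instead.
iterator = tf.compat.v1.data.make_one_shot_iterator(ds)

# Type checks should target the v2 class explicitly.
is_v2_dataset = isinstance(ds, tf.compat.v2.data.Dataset)

# With eager execution (the TF 2.x default) elements can be pulled directly.
# In graph mode, iterator.get_next() returns a tensor to run in a session.
first = int(iterator.get_next().numpy()) if tf.executing_eagerly() else None
```

In graph-mode code the same iterator would typically be consumed with sess.run(iterator.get_next()) inside a tf.compat.v1.Session.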

Future breaking changes:

The tfds.features.text encoding API is deprecated. Please use tensorflow_text instead.
num_shards argument of tfds.core.SplitGenerator is currently ignored and will be removed in the next version.

Features:

DownloadManager is now picklable (can be used inside Beam pipelines)
tfds.features.Audio:
- Support float as returned value
- Expose sample_rate through info.features['audio'].sample_rate
- Support for encoding audio features from file objects
Various bug fixes, better error messages, documentation improvements
More datasets

Thank you to all our contributors for helping us make TFDS better for everyone!
