ParData

ParData (homophone of partake) is a Python API that enables data consumers and distributors to easily use and share datasets, and establishes a standard for exchanging data assets. It enables:

a data scientist to have a simpler and more unified way to begin working with a wide range of datasets, and
a data distributor to have a consistent, safe, and open source way to share datasets with interested communities.

Quick Example

>>> import pardata
>>> pardata.list_all_datasets()
{'claim_sentences_search': ('1.0.2',),
 ..., 'wikitext103': ('1.0.1',)}
>>> pardata.load_dataset('wikitext103')
{...}  # Content of the dataset

Install the Package & its Dependencies

To install the latest version of ParData, run

$ pip install pardata

Alternatively, if you have downloaded the source, switch to the source directory (the directory that contains this README file, e.g., cd /path/to/pardata-source) and run

$ pip install -U .
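To confirm that either install command succeeded, one option is to ask the Python standard library for the installed version. The following is a minimal sketch and not part of ParData; the helper name `installed_version` is hypothetical.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        # The distribution is not installed in the current environment.
        return None
```

Calling `installed_version("pardata")` returns a version string once the package is installed and `None` otherwise.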

Quick Start

Import the package and load a dataset. ParData downloads the WikiText-103 dataset (version 1.0.1) if it has not already been downloaded, and then loads it.

import pardata
wikitext103_data = pardata.load_dataset('wikitext103')

View available ParData datasets and their versions.

>>> pardata.list_all_datasets()
{'claim_sentences_search': ('1.0.2',), ..., 'wikitext103': ('1.0.1',)}
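The output above maps each dataset name to a tuple of version strings. As a sketch of how such a mapping could be consumed, the snippet below picks the highest version per dataset. It works on a plain dictionary rather than calling ParData, and it assumes that versions are dot-separated integers; `parse_version` and `latest_versions` are hypothetical helper names, not part of the ParData API.

```python
from typing import Dict, Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '1.0.2' into (1, 0, 2) so that versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def latest_versions(datasets: Dict[str, Tuple[str, ...]]) -> Dict[str, str]:
    """Map each dataset name to its highest available version."""
    return {
        name: max(versions, key=parse_version)
        for name, versions in datasets.items()
    }


# Only the two entries visible in the output above; elided entries are left out.
available = {"claim_sentences_search": ("1.0.2",), "wikitext103": ("1.0.1",)}
print(latest_versions(available))
# {'claim_sentences_search': '1.0.2', 'wikitext103': '1.0.1'}
```

Comparing parsed tuples instead of raw strings avoids ranking a version such as 1.10.0 below 1.2.0.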

To view your globally set configs for ParData, such as your default data directory, use pardata.get_config.

>>> pardata.get_config()
Config(DATADIR=PosixPath('dir/to/download/load/from'), ..., DATASET_SCHEMA_FILE_URL='file/to/load/datasets/from')

By default, pardata.load_dataset downloads to and loads from ~/.pardata/data/<dataset-name>/<dataset-version>/. To change the default data directory, use pardata.init.

pardata.init(DATADIR='new/dir/to/download/load/from')

Load a previously downloaded dataset using pardata.load_dataset. With the new default data dir set, ParData now searches for the Groningen Meaning Bank dataset (version 1.0.2) in new/dir/to/download/load/from/gmb/1.0.2/.

gmb_data = pardata.load_dataset('gmb', version='1.0.2', download=False)  # assuming GMB dataset was already downloaded
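Because loading with download=False relies on the files already being on disk, it can help to check the expected location first. The sketch below rebuilds the <data-dir>/<dataset-name>/<dataset-version>/ layout described above using only pathlib; `expected_dataset_dir` is a hypothetical helper, not a ParData function, and ParData's own lookup may involve more than this path check.

```python
from pathlib import Path


def expected_dataset_dir(datadir: Path, name: str, version: str) -> Path:
    """Build <datadir>/<dataset-name>/<dataset-version>/ as described above."""
    return Path(datadir) / name / version


# Same data directory, dataset name, and version as in the example above.
gmb_dir = expected_dataset_dir(Path("new/dir/to/download/load/from"), "gmb", "1.0.2")

# Report whether the directory is present before attempting an offline load.
print(gmb_dir, "exists" if gmb_dir.is_dir() else "is missing")
```

If the directory is missing, loading without download=False lets ParData fetch the dataset first.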

To learn more about ParData, check out the documentation and the tutorial.

