A Python library for the Renku collaborative data science platform. It allows the user to create projects, manage datasets, and capture data provenance while performing analysis tasks.
NOTE: renku-python is the Python library for Renku that provides an SDK and a command-line interface (CLI). It does not start the Renku platform itself; for that, refer to the Renku docs on running the platform.
The latest release is available on PyPI and can be installed using pip:
$ pip install renku
The latest code can be installed directly from the Git repository:
$ pip install -e git+https://github.com/SwissDataScienceCenter/renku-python.git#egg=renku
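To keep the installation separate from system packages, the same pip command can be run inside a virtual environment. The following is a minimal sketch, not an official installation procedure: the function name install_renku_in_venv and the default directory ~/.venvs/renku are illustrative assumptions, and it presumes python3 with the standard venv module is available.

```shell
# Sketch: install renku into an isolated virtual environment.
# The function name and the default directory are illustrative assumptions.
install_renku_in_venv() {
    venv_dir="${1:-$HOME/.venvs/renku}"

    # Create the environment with the standard-library venv module.
    python3 -m venv "$venv_dir" || return 1

    # Activate it for the current shell session.
    . "$venv_dir/bin/activate" || return 1

    # Same install command as the PyPI one, now scoped to the environment.
    pip install renku
}
```

Calling install_renku_in_venv with no argument uses the default directory; passing a path as the first argument selects a different location.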
Initialize a renku project:
$ mkdir -p ~/temp/my-renku-project
$ cd ~/temp/my-renku-project
$ renku init
Create a dataset and add data to it:
$ renku dataset create my-dataset
$ renku dataset add my-dataset https://raw.githubusercontent.com/SwissDataScienceCenter/renku-python/master/README.rst
Run an analysis:
$ renku run wc < data/my-dataset/README.rst > wc_readme
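The command wrapped by renku run here is an ordinary shell invocation: wc reads the file on standard input and writes its counts to the redirected output file. As a rough illustration of what ends up in that file, the sketch below runs the same redirection pattern without renku on a small sample file. The file names sample.txt and wc_sample are hypothetical and not part of the project.

```shell
# Hypothetical sample input; the quick start uses data/my-dataset/README.rst instead.
tmpdir=$(mktemp -d)
printf 'hello renku\nsecond line\n' > "$tmpdir/sample.txt"

# Same redirection pattern as the renku run command, without the wrapper:
# standard input comes from the input file, standard output goes to the output file.
wc < "$tmpdir/sample.txt" > "$tmpdir/wc_sample"

# The output file holds three counts: lines, words, and bytes
# (here 2 lines, 4 words, 24 bytes; column spacing varies between systems).
cat "$tmpdir/wc_sample"
```

In the quick start, the input file and wc_readme play the roles of sample.txt and wc_sample.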
Trace the data provenance:
$ renku log wc_readme
These are the basics, but there is much more that Renku allows you to do with your data analysis workflows. The full documentation will soon be available at: https://renku-python.readthedocs.io/