Pydf: An Implementation of the DataFrame Specification in Python

This is the official implementation of the DataFrame specification provided by Raven Computing.

Getting Started

This library is available on PyPI.

Install via:

pip install raven-pydf

For more information, see the project page on pypi.org.

After installation you can use the entire DataFrame API by importing one class:

from raven.struct.dataframe import DataFrame

# read a DataFrame file into memory
df = DataFrame.read("/path/to/myFile.df")

# show the first 10 rows on stdout
print(df.head(10))

Alternatively, you can import the concrete DataFrame and Column types directly, for example:

from raven.struct.dataframe import (DefaultDataFrame,
                                    IntColumn,
                                    DoubleColumn,
                                    StringColumn)

# create a DataFrame with 3 columns and 3 rows
df = DefaultDataFrame(
        IntColumn("A", [1, 2, 3]),
        DoubleColumn("B", [4.4, 5.5, 6.6]),
        StringColumn("C", ["cat", "dog", "horse"]))

print(df)
# _| A B   C
# 0| 1 4.4 cat
# 1| 2 5.5 dog
# 2| 3 6.6 horse

Compatibility

This library requires Python 3.7 or higher.

Internally, this library uses NumPy for array operations. The minimum required version is v1.19.0.
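
The requirements above can be checked before installing. The following is a minimal sketch and not part of this library: the helper names (parse_version, meets_minimum, check_environment) are hypothetical, and it assumes NumPy exposes its version as numpy.__version__.

```python
import sys

# Minimum versions as stated in the Compatibility section
MIN_PYTHON = (3, 7)
MIN_NUMPY = (1, 19, 0)


def parse_version(text):
    """Convert a version string such as '1.19.0' into a tuple of ints.

    Parsing stops at the first component that does not start with a digit,
    so suffixes such as 'rc1' or 'dev0' are ignored.
    """
    parts = []
    for piece in text.split("."):
        digits = ""
        for char in piece:
            if not char.isdigit():
                break
            digits += char
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True if the installed version string is at least the minimum tuple."""
    version = parse_version(installed)
    # Pad with zeros so that '1.19' compares equal to (1, 19, 0)
    version += (0,) * (len(minimum) - len(version))
    return version >= minimum


def check_environment():
    """Return a list of human-readable problems; empty if all requirements are met."""
    problems = []
    if sys.version_info[:2] < MIN_PYTHON:
        problems.append("Python 3.7 or higher is required")
    try:
        import numpy  # only needed to read the installed version
    except ImportError:
        problems.append("NumPy is not installed")
    else:
        if not meets_minimum(numpy.__version__, MIN_NUMPY):
            problems.append("NumPy 1.19.0 or higher is required")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print(problem)
```

Running the script prints one line per unmet requirement and prints nothing when the environment is suitable.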

Documentation

The unified documentation is available here.

Additional features implemented in Python are documented in the Wiki.

Development

If you want to change the code of this library, or include it manually as a dependency without installing it via pip, you can do so by cloning this repository.

Setup

We use virtual environments and the virtualenvwrapper utilities for all of our Python projects. If you are running on Linux, you can set up your development environment by sourcing the setup.sh script. This creates a virtual environment named pydf and installs all dependencies:

source setup.sh

Running Tests

Execute all unit tests via:

python -m unittest

Linting

Run pylint to perform static code analysis of the source code via:

pylint raven

License

This library is licensed under the Apache License, Version 2.0. See the LICENSE file for details.
