LETSQL is a data processing library built on top of Ibis and DataFusion for writing multi-engine data workflows.
Caution
This library does not currently have a stable release. Both the API and implementation are subject to change, and future updates may not be backward compatible.
LETSQL is available as letsql on PyPI:
pip install letsql
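import urllib.request

import letsql as ls

# Download the iris dataset to the working directory
urllib.request.urlretrieve("https://raw.githubusercontent.com/mwaskom/seaborn-data/master/iris.csv", "iris.csv")

# Connect and register the CSV file as a table named "iris"
con = ls.connect()
iris_table = con.read_csv("iris.csv", table_name="iris")

# Keep rows with sepal_length > 5, then sum sepal_width per species
res = (
    iris_table.filter([iris_table.sepal_length > 5])
    .group_by("species")
    .agg(iris_table.sepal_width.sum())
    .execute()
)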
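To make clear what the query above computes, here is a minimal plain-Python sketch of the same filter, group-by, and sum. It does not use letsql and says nothing about how letsql executes the query. The helper name sum_sepal_width_by_species and the six inline rows (written in the iris.csv column layout) are assumptions introduced only for illustration.

```python
import csv
import io
from collections import defaultdict

# A handful of illustrative rows in the iris.csv column layout
# (an assumption for this sketch; the real file has many more rows).
SAMPLE_CSV = """sepal_length,sepal_width,petal_length,petal_width,species
5.1,3.5,1.4,0.2,setosa
4.9,3.0,1.4,0.2,setosa
7.0,3.2,4.7,1.4,versicolor
6.4,3.2,4.5,1.5,versicolor
6.3,3.3,6.0,2.5,virginica
5.8,2.7,5.1,1.9,virginica
"""


def sum_sepal_width_by_species(csv_text, min_sepal_length=5.0):
    """Hypothetical helper: sum sepal_width per species for rows
    whose sepal_length is strictly greater than min_sepal_length."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Equivalent of: filter(sepal_length > 5)
        if float(row["sepal_length"]) > min_sepal_length:
            # Equivalent of: group_by("species").agg(sepal_width.sum())
            totals[row["species"]] += float(row["sepal_width"])
    return dict(totals)


totals = sum_sepal_width_by_species(SAMPLE_CSV)
for species, total in sorted(totals.items()):
    print(species, round(total, 2))
```

In the letsql version, the same logic is expressed as an Ibis expression, and nothing runs until .execute() is called.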
For more examples of how to use letsql, check the examples directory. To run some of the scripts there, you need to install the library with the examples extra:
pip install 'letsql[examples]'
Contributions are welcome and highly appreciated. To get started, check out the contributing guidelines.
If you have any issues with this repository, please don't hesitate to raise them. It is actively maintained, and we will do our best to help you.
This project heavily relies on Ibis and DataFusion.
If you've found this repository helpful, why not give it a star? It's an easy way to show your appreciation and support for the project. Plus, it helps others discover it too!
This repository is licensed under the Apache License.