DOC: Update Bodo project description in ecosystem page #60846

Merged
merged 1 commit into from
Feb 4, 2025
30 changes: 21 additions & 9 deletions web/pandas/community/ecosystem.md
@@ -496,17 +496,29 @@ You can find more information about the Hugging Face Dataset Hub in the [documen

## Out-of-core

### [Bodo](https://bodo.ai/)
### [Bodo](https://github.com/bodo-ai/Bodo)

Bodo is a high-performance Python computing engine that automatically parallelizes and
optimizes your code through compilation using HPC (high-performance computing) techniques.
Designed to operate with native pandas dataframes, Bodo compiles your pandas code to execute
across multiple cores on a single machine or distributed clusters of multiple compute nodes efficiently.
Bodo also makes distributed pandas dataframes queryable with SQL.

The community edition of Bodo is free to use on up to 8 cores. Beyond that, Bodo offers a paid
enterprise edition. Free licenses of Bodo (for more than 8 cores) are available
[upon request](https://www.bodo.ai/contact) for academic and non-profit use.
Bodo is a high-performance compute engine for Python data processing.
Using an auto-parallelizing just-in-time (JIT) compiler, Bodo simplifies scaling Pandas
workloads from laptops to clusters without major code changes.
Under the hood, Bodo relies on MPI-based high-performance computing (HPC) technology—making it
both easier to use and often much faster than alternatives.
Bodo also provides a SQL engine that can query distributed pandas dataframes efficiently.

```python
import pandas as pd
import bodo

@bodo.jit
def process_data():
    df = pd.read_parquet("my_data.pq")
    df2 = pd.DataFrame({"A": df.apply(lambda r: 0 if r.A == 0 else (r.B // r.A), axis=1)})
    df2.to_parquet("out.pq")

process_data()
```
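
The row-wise expression in the example above can be checked without Bodo or pandas installed. The following is a minimal sketch in plain Python, not part of the PR: `Row` and `safe_floor_div` are hypothetical names introduced here for illustration, and the sample values are made up. It only shows what the lambda computes per row (floor division of `B` by `A`, with `0` when `A` is zero) and says nothing about how Bodo compiles or parallelizes it.

```python
from collections import namedtuple

# Hypothetical stand-in for one dataframe row with integer columns A and B.
Row = namedtuple("Row", ["A", "B"])


def safe_floor_div(r):
    """Same expression as the lambda in the example: B // A, or 0 when A is zero."""
    return 0 if r.A == 0 else (r.B // r.A)


# Made-up sample rows, including a zero divisor and a negative divisor.
rows = [Row(A=2, B=7), Row(A=0, B=5), Row(A=-3, B=7)]
results = [safe_floor_div(r) for r in rows]
print(results)  # [3, 0, -3]
```

Python's `//` rounds toward negative infinity, which is why the last row yields `-3` rather than `-2`.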


### [Cylon](https://cylondata.org/)
