pandas-udfs-to-sql

This repository provides a translator for pandas udfs into sql udf to enhance enhance performance working with pandas dataframes.

To convert the exemplary scripts in src/testqueries/ execute:

- `python3 -m src.converter.convert src/testqueries/<SCRIPT> <PERSIST_MODE>`

where SCRIPT is equal to the to be converted file (e.g. simple_pipeline.py) and PERSIST_MODE is either MATERIALIZED_VIEW or NEW_TABLE (runtimes don't seem to differ much).

Benchmarks

The benchmark requires the following environment variables (e.g. exposed in a .env file):

pg_scalefactor="0.01"
db_scalefactor='0.01'

# specifies which server you want to use: either "postgresql" or "sqlserver"
flavor="sqlserver"

# required for postgres-baseline/grizzly/translated pipelines (materialized view/new table)
pg_pw="<postgres_server_password>"
pg_host="<postgres_server_ip>"
pg_user="<postgres_user_name>"

# required for sqlserver-baseline/sqlserver translated pipelines
sqlserver_pw="<sqlserver_password>"
sqlserver_host="<sqlserver_host>" # "localhost" or ip
sqlserver_user="<sqlserver_user>" # most likely "SA"

Setup

To install all required Python dependencies run:

pip install -r requirements.txt

For visualization you further need to run:

apt install cm-super dvipng apt install msttcorefonts -qq

Name		Name	Last commit message	Last commit date
Latest commit History 114 Commits
data		data
src		src
.gitignore		.gitignore
README.md		README.md
__init__.py		__init__.py
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

pandas-udfs-to-sql

Benchmarks

Setup

About

Uh oh!

Releases

Packages

Uh oh!

Contributors 2

Uh oh!

Languages

rothaarlappen/pandas-udfs-to-sql

Folders and files

Latest commit

History

Repository files navigation

pandas-udfs-to-sql

Benchmarks

Setup

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors 2

Uh oh!

Languages

Packages