Starred repositories
Flowchart for debugging Spark applications
DeepSeek LLM: Let there be answers
An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs
A GPT-4 AI Tutor Prompt for customizable personalized learning experiences.
🧱 Databricks CLI eXtensions - aka dbx is a CLI tool for development and advanced Databricks workflows management.
Streamlit — A faster way to build and share data apps.
PawMark is a platform for developers to build, schedule and monitor data pipelines.
Practice machine learning/deep learning.
Practice and tutorial-style notebooks covering a wide variety of machine learning techniques
AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.
A curated list of awesome Machine Learning frameworks, libraries and software.
A natural language interface for computers
Open source platform for the machine learning lifecycle
Jupyter handsontable integration
The official Notion API client library, but rewritten in Python! (sync + async)
Interactive Widgets for the Jupyter Notebook
Apache Superset is a Data Visualization and Data Exploration Platform
A better notebook for Scala (and more)
Jupyter metapackage for installation, docs and chat
Upserts, Deletes And Incremental Processing on Big Data.
Code and examples of how to write and deploy Apache Spark Plugins. Spark plugins allow running custom code on the executors as they are initialized. This also allows extending the Spark metrics syst…
Examples for High Performance Spark
VIP cheatsheets for Stanford's CS 229 Machine Learning