Merge branch 'main' into 412-user-guide-and-api-documentation

probabl-ai · Oct 17, 2024 · 5f97540
2 parents 4302fd0 + 3258510
Showing 6 changed files with 375 additions and 61 deletions.
diff --git a/.github/workflows/skore.yml b/.github/workflows/skore.yml
@@ -27,7 +27,6 @@ jobs:
           path: skore-ui/dist
 
   test-skore:
-    runs-on: ubuntu-latest
     needs: build-skore-ui
     defaults:
       run:
@@ -36,7 +35,9 @@ jobs:
     strategy:
       fail-fast: true
       matrix:
+        os: ['ubuntu-latest', 'windows-latest']
         python-version: ['3.9', '3.10', '3.11', '3.12']
+    runs-on: ${{ matrix.os }}
     steps:
       - uses: actions/checkout@v4
       - uses: actions/setup-python@v5
@@ -66,6 +67,8 @@ jobs:
 
           # Test
           python -m pytest src/ tests/
+        env:
+          PYTHONUTF8: 1
 
   cleanup:
     runs-on: ubuntu-latest

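Note on the `PYTHONUTF8: 1` setting added to the test step above: it turns on Python's UTF-8 mode, which matters mostly on `windows-latest`, where the default text encoding is otherwise locale-dependent. A minimal sketch for checking the effect locally, using only the standard library; the helper name `utf8_mode_in_child` is hypothetical and not part of this repository:

```python
import os
import subprocess
import sys


def utf8_mode_in_child(value: str) -> bool:
    """Start a child interpreter with PYTHONUTF8 set to `value` and
    report whether UTF-8 mode is enabled inside it."""
    # Copy the current environment and override only PYTHONUTF8.
    env = dict(os.environ, PYTHONUTF8=value)
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.flags.utf8_mode)"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    # The child prints 1 when UTF-8 mode is on and 0 when it is off.
    return result.stdout.strip() == "1"


if __name__ == "__main__":
    print("UTF-8 mode in this interpreter:", bool(sys.flags.utf8_mode))
    print("UTF-8 mode with PYTHONUTF8=1:", utf8_mode_in_child("1"))
```

The variable is read at interpreter startup, so it has to be set before Python launches, which is why the workflow sets it through `env:` rather than inside the test code.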
diff --git a/.gitignore b/.gitignore
@@ -180,3 +180,7 @@ doc/auto_examples/
 auto-save-list
 tramp
 .\#*
+
+# Sphinx documentation
+doc/_build/
+doc/auto_examples/
diff --git a/README.md b/README.md
@@ -1,74 +1,94 @@
 # 👋 Welcome to skore
 
 ![ci](https://github.com/probabl-ai/skore/actions/workflows/ci.yml/badge.svg?event=push)
-![python](https://img.shields.io/badge/python-3.11%20|%203.12-blue?style=flat&logo=python)
+![python](https://img.shields.io/badge/python-3.9%20%7C%203.10%20%7C%203.11%20%7C%203.12-blue?style=flat&logo=python)
+[![pypi](https://img.shields.io/pypi/v/skore)](https://pypi.org/project/skore/)
+![license](https://img.shields.io/pypi/l/skore)
 
-`skore` allows data scientists to create tracking and visualization from their Python code:
-1. Users can store objects of different types: python lists and dictionaries, `numpy` arrays, `scikit-learn` fitted models, `matplotlib`, `altair`, and `plotly` figures, etc. Storing some values over time allows one to perform **tracking** and also to **visualize** them:
-2. They can visualize these stored objects on a dashboard. The dashboard is user-friendly: objects can easily be organized.
-3. This dashboard can be exported into a HTML file.
+With `skore`, data scientists can:
+1. Store objects of different types from their Python code: Python lists, `scikit-learn` fitted pipelines, `plotly` figures, and more.
+2. **Track** and **visualize** these stored objects on a user-friendly dashboard.
+3. Export the dashboard to an HTML file.
 
-These are only the first features of `skore`'s roadmap.
-`skore` is a work in progress and, on the long run, it aims to be an all-inclusive library for data scientists.
+These are only the first features: `skore` is a work in progress and aims to be an end-to-end library for data scientists.
 Stay tuned!
 
-<p align="center">
-    <img width="100%" src="https://github.com/sylvaincom/sylvaincom.github.io/blob/master/files/probabl/skore/2024_10_08_skore_demo.gif"/>
-</p>
+![GIF: short demo of `skore`](https://raw.githubusercontent.com/sylvaincom/sylvaincom.github.io/master/files/probabl/skore/2024_10_14_skore_demo.gif)
 
 ## ⚙️ Installation
 
-You can install `skore` by using `pip`:
+First of all, we recommend using a [virtual environment (venv)](https://docs.python.org/3/tutorial/venv.html). You need `python>=3.9`.
+
+Then, you can install `skore` by using `pip`:
 ```bash
 pip install -U skore
 ```
 
+🚨 For Windows users, the encoding must be set to [UTF-8](https://docs.python.org/3/using/windows.html#utf-8-mode): see [PYTHONUTF8](https://docs.python.org/3/using/cmdline.html#envvar-PYTHONUTF8).
+
 ## 🚀 Quick start
 
-In your shell, run the following to create a project file `project.skore` (the default) in your current working directory:
+1. From your shell, initialize a `skore` project, here named `project.skore`, in your current working directory:
 ```bash
-python -m skore create 'project.skore'
+python -m skore create "project.skore"
 ```
-
-Run the following in your Python code (in the same working directory) to load the project, store some objects, delete them, etc:
+2. Then, from your Python code (in the same directory), load the project and store an item, for example an integer:
 ```python
 from skore import load
-
-# load the project
 project = load("project.skore")
+project.put("my_int", 3)
+```
+3. Finally, from your shell (in the same directory), start the UI locally:
+```bash
+python -m skore launch "project.skore"
+```
+This will automatically open a browser at the UI's location:
+1. On the top left, create a new `View`.
+2. From the `Elements` section on the bottom left, you can add stored items to this view, either by double-clicking on them or by dragging and dropping them.
 
-# save an item you need to track in your project
-project.put("my int", 3)
+## 👨‍💻 More examples
 
-# get an item's value
-project.get("my int")
+💡 Note that after launching the dashboard, you can keep modifying the current items or store new ones from your Python code, and the dashboard will automatically be refreshed.
 
-# by default, strings are assumed to be Markdown:
-project.put("my string", "Hello world!")
+Storing a `pandas` dataframe:
+```python
+import numpy as np
+import pandas as pd
 
-# `put` overwrites previous data
-project.put("my string", "Hello again!")
+my_df = pd.DataFrame(np.random.randn(3, 3))
+project.put("my_df", my_df)
+```
 
-# list all the keys in a project
-print(project.list_item_keys())
+Storing a `matplotlib` figure:
+```python
+import matplotlib.pyplot as plt
 
-# delete an item
-project.delete_item("my int")
+x = [0, 1, 2, 3, 4, 5]
+fig, ax = plt.subplots(figsize=(5, 3), layout="constrained")
+ax.plot(x)
+project.put("my_figure", fig)
 ```
 
-Then, in the directory containing your project, run the following command in your shell to start the UI locally:
-```bash
-python -m skore launch project.skore
+Storing a `scikit-learn` fitted pipeline:
+```python
+from sklearn.datasets import load_diabetes
+from sklearn.linear_model import Lasso
+from sklearn.pipeline import Pipeline
+from sklearn.preprocessing import StandardScaler
+
+diabetes = load_diabetes()
+X = diabetes.data[:150]
+y = diabetes.target[:150]
+my_pipeline = Pipeline(
+    [("standard_scaler", StandardScaler()), ("lasso", Lasso(alpha=2))]
+)
+my_pipeline.fit(X, y)
+project.put("my_fitted_pipeline", my_pipeline)
 ```
-This will automatically open a browser at the UI's location.
-In the `Elements` tab on the left, you can visualize the stored items.
-Create a new `View`, then you can then add items into this view.
-
-💡 Note that after launching the dashboard, you can keep modifying current items or store new ones, and the dashboard will automatically be refreshed.
 
-👨‍🏫 For a complete introductory example, see our [basic usage notebook](https://github.com/probabl-ai/skore/blob/main/examples/basic_usage.ipynb).
-It shows you how to store all types of items: python lists and dictionaries, `numpy` arrays, `scikit-learn` fitted models, `matplotlib`, `altair`, and `plotly` figures, etc.
-The resulting `skore` report has been exported to [this HTML file](https://sylvaincom.github.io/files/probabl/skore/basic_usage.html).
+👨‍🏫 For a complete introductory example, see our [basic usage notebook](https://github.com/probabl-ai/skore/blob/main/examples/01_basic_usage.ipynb).
+It shows you how to store all types of items: Python lists and dictionaries, `numpy` arrays, `pandas` dataframes, `scikit-learn` fitted models, figures (`matplotlib`, `altair`, and `plotly`), etc.
+The resulting `skore` report has been exported to [this HTML file](https://sylvaincom.github.io/files/probabl/skore/01_basic_usage.html).
 
 ## 🔨 Contributing
 
@@ -88,8 +108,8 @@ See [CONTRIBUTING.md](https://github.com/probabl-ai/skore/blob/main/CONTRIBUTING
 
 ---
 
-Brought to you by:
+Brought to you by
 
 <a href="https://probabl.ai" target="_blank">
-    <img width="120" src="https://sylvaincom.github.io/files/probabl/logo_probabl.svg" alt="Probabl logo">
+    <img width="120" src="https://sylvaincom.github.io/files/probabl/Logo-orange.png" alt="Probabl logo">
 </a>
diff --git a/examples/00_getting_started.ipynb b/examples/00_getting_started.ipynb
@@ -0,0 +1,201 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "0",
+   "metadata": {},
+   "source": [
+    "# Getting started with `skore`\n",
+    "\n",
+    "This guide provides a quick start to `skore`, an open-source package that aims to enable data scientists to:\n",
+    "1. Store objects of different types from their Python code: Python lists, `scikit-learn` fitted pipelines, `plotly` figures, and more.\n",
+    "2. **Track** and **visualize** these stored objects on a user-friendly dashboard.\n",
+    "3. Export the dashboard to an HTML file."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "1",
+   "metadata": {},
+   "source": [
+    "## Initialize a Project and launch the UI\n",
+    "\n",
+    "From your shell, initialize a `skore` project, here named `project.skore`, in your current working directory:\n",
+    "```bash\n",
+    "python -m skore create \"project.skore\"\n",
+    "```\n",
+    "This will create a skore project directory named `project.skore` in the current directory.\n",
+    "\n",
+    "From your shell (in the same directory), start the UI locally:\n",
+    "```bash\n",
+    "python -m skore launch \"project.skore\"\n",
+    "```\n",
+    "This will automatically open a browser at the UI's location.\n",
+    "\n",
+    "Now that the project exists, we can load it in our notebook so that we can read from and write to it:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "2",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from skore import load\n",
+    "\n",
+    "project = load(\"project.skore\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "3",
+   "metadata": {},
+   "source": [
+    "## Storing some items"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "4",
+   "metadata": {},
+   "source": [
+    "Storing an integer:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "5",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "project.put(\"my_int\", 3)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "6",
+   "metadata": {},
+   "source": [
+    "Here, the name of the stored item is `my_int` and its integer value is 3."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "7",
+   "metadata": {},
+   "source": [
+    "For a `pandas` data frame:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "8",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import numpy as np\n",
+    "import pandas as pd\n",
+    "\n",
+    "my_df = pd.DataFrame(np.random.randn(3, 3))\n",
+    "project.put(\"my_df\", my_df)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "9",
+   "metadata": {},
+   "source": [
+    "For a `matplotlib` figure:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "10",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import matplotlib.pyplot as plt\n",
+    "\n",
+    "x = [0, 1, 2, 3, 4, 5]\n",
+    "fig, ax = plt.subplots(figsize=(5, 3), layout=\"constrained\")\n",
+    "ax.plot(x)\n",
+    "project.put(\"my_figure\", fig)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "11",
+   "metadata": {},
+   "source": [
+    "For a `scikit-learn` fitted pipeline:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "12",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from sklearn.datasets import load_diabetes\n",
+    "from sklearn.linear_model import Lasso\n",
+    "from sklearn.pipeline import Pipeline\n",
+    "from sklearn.preprocessing import StandardScaler\n",
+    "\n",
+    "diabetes = load_diabetes()\n",
+    "X = diabetes.data[:150]\n",
+    "y = diabetes.target[:150]\n",
+    "my_pipeline = Pipeline(\n",
+    "    [(\"standard_scaler\", StandardScaler()), (\"lasso\", Lasso(alpha=2))]\n",
+    ")\n",
+    "my_pipeline.fit(X, y)\n",
+    "project.put(\"my_fitted_pipeline\", my_pipeline)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "13",
+   "metadata": {},
+   "source": [
+    "## Back to the dashboard\n",
+    "\n",
+    "1. On the top left, create a new `View`.\n",
+    "2. From the `Elements` section on the bottom left, you can add stored items to this view, either by double-clicking on them or by dragging and dropping them."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "14",
+   "metadata": {},
+   "source": []
+  }
+ ],
+ "metadata": {
+  "jupytext": {
+   "formats": "ipynb,py:percent"
+  },
+  "kernelspec": {
+   "display_name": ".venv",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.12.7"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
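
Note on the notebook added above: it is plain nbformat 4 JSON, where each cell carries a `cell_type` and a `source` list of lines, so its code can be reviewed without opening Jupyter. A minimal sketch, assuming only the standard library; `NOTEBOOK_JSON` is a small made-up stand-in for the real file and `code_cells` is a hypothetical helper, not part of `skore`:

```python
import json

# A tiny stand-in for a notebook file, following the nbformat 4 layout
# seen in the diff (a "cells" list whose entries have "cell_type" and "source").
NOTEBOOK_JSON = """
{
 "cells": [
  {"cell_type": "markdown", "id": "0", "metadata": {},
   "source": ["# Title\\n", "Some text."]},
  {"cell_type": "code", "execution_count": null, "id": "1", "metadata": {},
   "outputs": [], "source": ["x = 1\\n", "print(x)"]}
 ],
 "metadata": {},
 "nbformat": 4,
 "nbformat_minor": 5
}
"""


def code_cells(notebook: dict) -> list[str]:
    """Return the source of every code cell, one string per cell."""
    return [
        # "source" is a list of lines; joining it restores the cell text.
        "".join(cell["source"])
        for cell in notebook["cells"]
        if cell["cell_type"] == "code"
    ]


if __name__ == "__main__":
    for index, source in enumerate(code_cells(json.loads(NOTEBOOK_JSON))):
        print(f"--- code cell {index} ---")
        print(source)
```

To run it on the real file, replace `json.loads(NOTEBOOK_JSON)` with `json.load(open(path, encoding="utf-8"))`, where `path` points at `examples/00_getting_started.ipynb`.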