Commit: Merge branch 'main' of github.com:brainets/hoi
chrisferreyra13 committed Jul 9, 2024
2 parents 862c512 + 80ff722, commit b9febcf
Showing 60 changed files with 3,028 additions and 1,044 deletions.
45 changes: 22 additions & 23 deletions .github/workflows/pypi-publish.yml
@@ -1,31 +1,30 @@
 # This workflow will upload a Python Package using Twine when a release is created
 # For more information see: https://help.github.com/en/actions/language-and-framework-guides/using-python-with-github-actions#publishing-to-package-registries

-name: Upload Python Package
+name: Upload Python Package to PyPI when a Release is Created

 on:
   release:
     types: [created]

 jobs:
-  deploy:
+  pypi-publish:
+    name: Publish release to PyPI
     runs-on: ubuntu-latest
+    environment:
+      name: pypi
+      url: https://pypi.org/p/hoi
+    permissions:
+      id-token: write
     steps:
-      - uses: actions/checkout@v2.3.4
-      - name: Set up Python
-        uses: actions/setup-python@v5
-        with:
-          python-version: '3.x'
-      - name: Install dependencies
-        run: |
-          python -m pip install --upgrade pip
-          pip install setuptools wheel twine
-      - name: Build and publish
-        env:
-          TWINE_USERNAME: ${{ secrets.PYPI_USERNAME }}
-          TWINE_PASSWORD: ${{ secrets.PYPI_PASSWORD }}
-        run: |
-          make clean_dist
-          make build_dist
-          make check_dist
-          make upload_dist
+      - uses: actions/checkout@v4
+      - name: Set up Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: "3.x"
+      - name: Install dependencies
+        run: |
+          python -m pip install --upgrade pip
+          pip install setuptools wheel
+      - name: Build package
+        run: |
+          python setup.py sdist bdist_wheel  # Could also be python -m build
+      - name: Publish package distributions to PyPI
+        uses: pypa/gh-action-pypi-publish@release/v1
2 changes: 1 addition & 1 deletion .github/workflows/test_doc.yml
@@ -73,7 +73,7 @@ jobs:
           touch _build/html/.nojekyll
       - name: Deploy Github Pages 🚀
-        uses: JamesIves/github-pages-deploy-action@v4.4.1
+        uses: JamesIves/github-pages-deploy-action@v4.6.3
         with:
           branch: gh-pages
           folder: docs/_build/html/
1 change: 1 addition & 0 deletions .gitignore
@@ -157,3 +157,4 @@ yarn.lock
 *.dir
 *.zip
 *ipynb
+develop/
22 changes: 22 additions & 0 deletions Makefile
@@ -0,0 +1,22 @@
+
+# clean dist
+clean_dist:
+	@rm -rf build/
+	@rm -rf build/
+	@rm -rf frites.egg-info/
+	@rm -rf dist/
+	@echo "Dist cleaned"
+
+# build dist
+build_dist: clean_dist
+	python setup.py sdist
+	python setup.py bdist_wheel
+	@echo "Dist built"
+
+# check distribution
+check_dist:
+	twine check dist/*
+
+# upload distribution
+upload_dist:
+	twine upload --verbose dist/*
Binary file added docs/_static/jax_cgpu_entropy.png
Binary file added docs/_static/jax_cgpu_oinfo.png
5 changes: 3 additions & 2 deletions docs/_templates/layout.html
@@ -23,10 +23,11 @@
 <footer>
   <div class="foot">
     <img src="https://cibul.s3.amazonaws.com/e9619f705931403093351210dc70d8ee.base.image.jpg" alt="INT" height="80">
-    <img src="https://www.engagement.fr/wp-content/uploads/2013/02/Aix-Marseille-Universit%C3%A9.png" alt="Aix-Marseille university" height="80">
+    <img src="https://www.univ-amu.fr/system/files/2021-05/AMU%20logo.png" alt="Aix-Marseille university" height="80">
     <img src="https://developers.google.com/open-source/gsoc/resources/downloads/GSoC-Vertical.png" alt="Gsoc" height="80">
+    <img src="https://enlight-eu.org/images/logos/Logo_Gent.png" alt="Ghent" height="80">
     <br>
-    <p>&copy; Copyright {{ copyright }}.</p>
+    <!-- <p>&copy; Copyright {{ copyright }}.</p> -->
   </div>
 </footer>
 {% endblock %}
6 changes: 3 additions & 3 deletions docs/api/api_core.rst
@@ -12,12 +12,12 @@ Measures of Entropy
    :toctree: generated/

    get_entropy
-   entropy_gcmi
+   entropy_gc
+   entropy_gauss
    entropy_bin
    entropy_knn
    entropy_kernel
    copnorm_nd
-   prepare_for_entropy
+   prepare_for_it

 Measures of Mutual Information
 ++++++++++++++++++++++++++++++++
2 changes: 2 additions & 0 deletions docs/api/api_metrics.rst
@@ -1,3 +1,5 @@
+.. _metrics:
+
 ``hoi.metrics``
 ---------------
2 changes: 1 addition & 1 deletion docs/api/api_sim.rst
@@ -8,4 +8,4 @@ Simulate HOI.
 .. autosummary::
    :toctree: generated/

-   simulate_hois_gauss
+   simulate_hoi_gauss
5 changes: 3 additions & 2 deletions docs/conf.py
@@ -16,8 +16,8 @@
 sys.path.insert(0, os.path.abspath(".."))

 project = "HOI"
-copyright = "BraiNets"
-author = "BraiNets"
+# copyright = "BraiNets"
+# author = "BraiNets"
 release = hoi.__version__

@@ -96,6 +96,7 @@
         "../examples/tutorials",
         "../examples/it",
         "../examples/metrics",
+        "../examples/statistics",
         "../examples/miscellaneous",
     ]
 ),
2 changes: 2 additions & 0 deletions docs/contributor_guide.rst
@@ -1,3 +1,5 @@
+.. _contribute:
+
 Developer Documentation
 =======================
13 changes: 11 additions & 2 deletions docs/glossary.rst
@@ -1,3 +1,5 @@
+.. _glossary:
+
 Glossary
 ========

@@ -25,7 +27,14 @@ Glossary
      Partial Information Decomposition (PID) :cite:`williams2010nonnegative` is a framework for quantifying the unique, shared, and synergistic information that multiple variables provide about a target variable. It aims to decompose the mutual information between a set of predictor variables and a target variable into non-negative components, representing the unique information contributed by each predictor variable, the redundant information shared among predictor variables, and the synergistic information that can only be obtained by considering multiple predictor variables together. PID provides a more nuanced understanding of the relationships between variables in complex systems, beyond traditional pairwise measures of association.

   Network behavior
-      Higher Order Interactions between a set of variables.
+      Higher Order Interactions between a set of variables. Metrics of intrinsic
+      information :cite:`luppi2024information`, i.e. information carried by a group of variables about their
+      future, belong to this category. `Undirected` metrics :cite:`rosas2024characterising`, such as the
+      O-information, also fall into this category.

   Network encoding
-      Higher Order Interactions between a set of variables about a target variable.
+      Higher Order Interactions between a set of variables modulated by a target variable.
+      Measures of extrinsic information :cite:`luppi2024information`, i.e. information carried by a group of
+      variables about an external target, are part of this group.
+      `Directed` metrics :cite:`rosas2024characterising`, such as the
+      Redundancy-Synergy Index (RSI), are also part of this group.
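To make the `Network behavior` entry concrete: for jointly gaussian variables, the O-information can be computed in closed form from the covariance matrix. The sketch below is a from-scratch illustration, not hoi's implementation; positive values flag redundancy-dominated interactions, negative values synergy-dominated ones.

```python
import numpy as np

def gauss_entropy(c):
    """Entropy (in nats) of a gaussian with covariance matrix c."""
    c = np.atleast_2d(c)
    k = c.shape[0]
    return 0.5 * (k * np.log(2 * np.pi * np.e) + np.linalg.slogdet(c)[1])

def o_information(x):
    """O-information of gaussian data x, shape (n_variables, n_samples)."""
    n = x.shape[0]
    c = np.cov(x)
    # Omega(X) = (n - 2) * H(X) + sum_i [H(X_i) - H(X_{-i})]
    o = (n - 2) * gauss_entropy(c)
    for i in range(n):
        rest = [j for j in range(n) if j != i]
        o += gauss_entropy(c[i, i]) - gauss_entropy(c[np.ix_(rest, rest)])
    return o

rng = np.random.default_rng(0)
x = rng.standard_normal((3, 100000))  # independent variables
print(o_information(x))               # close to 0: no higher-order structure
```

Three near-copies of the same signal would instead yield a clearly positive (redundant) value.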
125 changes: 125 additions & 0 deletions docs/jax.rst
@@ -4,3 +4,128 @@ Jax: linear algebra backend

One of the main issues in the study of the higher-order structure of complex systems is the computational cost of investigating, one by one, all the multiplets of any order. When using information-theoretic tools, each metric relies on a complex set of operations that has to be performed for every multiplet of variables in the data set. The number of possible multiplets of :math:`k` nodes in a data set of :math:`n` variables grows as :math:`\binom{n}{k}`. This means that, in a data set of :math:`100` variables, the number of multiplets of three nodes is :math:`\simeq 10^5`, of four nodes :math:`\simeq 4 \times 10^6`, of five nodes :math:`\simeq 7 \times 10^7`, etc. This leads to huge computational costs that can pose real problems to the study of higher-order interactions in different research fields.

To deal with this problem, this toolbox uses the recently developed Python library `Jax <https://github.com/google/jax>`_, which uses XLA to compile and run NumPy programs on CPU, GPU and TPU.
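The combinatorial growth quoted above can be checked directly with the standard library (a quick sanity check, not part of the hoi docs):

```python
from math import comb

# number of multiplets of size k in a data set of n = 100 variables
n = 100
counts = {k: comb(n, k) for k in (3, 4, 5)}
print(counts)  # {3: 161700, 4: 3921225, 5: 75287520} -> ~1e5, ~4e6, ~7.5e7
```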

CPU vs. GPU: Performance comparison
++++++++++++++++++++++++++++++++++++

Computing entropy on large multi-dimensional arrays
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

In this first part, we compare the time taken to compute entropy on arrays of increasing size. To run this comparison, we recommend using `Colab <https://colab.research.google.com/>`_: go to *Modify > Notebook settings* and select a GPU environment.

In the first cell, install hoi and import some modules:

.. code-block:: python

    !pip install hoi

    import numpy as np
    import jax
    import jax.numpy as jnp
    from time import time

    from hoi.metrics import Oinfo
    from hoi.core import get_entropy

    import matplotlib.pyplot as plt
    plt.style.use("ggplot")

In a new cell, paste the following code. It computes the Gaussian Copula entropy of arrays of growing size, both on CPU and on GPU:

.. code-block:: python

    def compute_timings(n=15):
        n_samples = np.linspace(10, 10e2, n).astype(int)
        n_features = np.linspace(1, 10, n).astype(int)
        n_variables = np.linspace(1, 10e2, n).astype(int)

        # vectorize the entropy function over the first axis
        entropy = jax.vmap(get_entropy(method="gc"), in_axes=(0,))

        # dry run (trigger compilation)
        entropy(np.random.rand(2, 2, 10))

        timings = []
        data_size = []
        for n_s, n_f, n_v in zip(n_samples, n_features, n_variables):
            # generate random data
            x = np.random.rand(n_v, n_f, n_s)
            x = jnp.asarray(x)

            # compute entropy
            start = time()
            entropy(x)
            timings.append(time() - start)
            data_size.append(n_s * n_f * n_v)

        return data_size, timings

    with jax.default_device(jax.devices("gpu")[0]):
        data_size, timings_gpu = compute_timings()

    with jax.default_device(jax.devices("cpu")[0]):
        data_size, timings_cpu = compute_timings()

Finally, plot the timing comparison:

.. code-block:: python

    plt.plot(data_size, timings_cpu, label="CPU")
    plt.plot(data_size, timings_gpu, label="GPU")
    plt.xlabel("Data size")
    plt.ylabel("Time (s)")
    plt.title("CPU vs. GPU for computing entropy", fontweight="bold")
    plt.legend()

.. image:: _static/jax_cgpu_entropy.png

On CPU, the computing time increases linearly as the array gets larger. On GPU, however, it does not grow as fast.
Computing Higher-Order Interactions on large multiplets
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

In the next example, we compute Higher-Order Interactions on a network of 10 nodes with an increasing order (i.e. multiplets up to size 3, 4, ..., 10), both on CPU and GPU.

.. code-block:: python

    def compute_timings():
        # create a dynamic network with 1000 samples, 10 nodes and
        # 100 time points
        x = np.random.rand(1000, 10, 100)

        # define the model
        model = Oinfo(x, verbose=False)

        # compute hoi for increasing orders
        order = np.arange(3, 11)
        timings = []
        for o in order:
            start = time()
            model.fit(minsize=3, maxsize=o)
            timings.append(time() - start)

        return order, timings

    with jax.default_device(jax.devices("gpu")[0]):
        order, timings_gpu = compute_timings()

    with jax.default_device(jax.devices("cpu")[0]):
        order, timings_cpu = compute_timings()

Let's plot the results:

.. code-block:: python

    plt.plot(order, timings_cpu, label="CPU")
    plt.plot(order, timings_gpu, label="GPU")
    plt.xlabel("Multiplet order")
    plt.ylabel("Time (s)")
    plt.title("CPU vs. GPU for computing the O-information", fontweight="bold")
    plt.legend()

.. image:: _static/jax_cgpu_oinfo.png

On this toy example, computing the O-information takes ~13 seconds per order on CPU, against ~3 seconds on GPU: GPU computations are roughly 4 times faster.
68 changes: 66 additions & 2 deletions docs/quickstart.rst
@@ -1,7 +1,71 @@
Quickstart
==========

HOI is a Python package to estimate :term:`Higher Order Interactions`. A network is composed of nodes (e.g. users in a social network, brain areas in neuroscience, musicians in an orchestra, etc.) that interact together. Traditionally, we measure pairwise interactions. HOI goes beyond pairwise interactions by quantifying the interactions between 3, 4, ..., N nodes of the system. As we are using measures from :term:`Information Theory`, we can further describe the type of interactions, i.e. whether nodes of the network tend to have redundant or synergistic interactions (see the definitions of :term:`Redundancy` and :term:`Synergy`).

* **Installation:** to install HOI with its dependencies, see :ref:`installation`. If you are a developer or if you want to contribute to HOI, check out the :ref:`contribute` guide.
* **Theoretical background:** for a detailed introduction to information theory and HOI, see :ref:`theory`. You can also have a look at our :ref:`glossary` for the definitions of the terms used here.
* **API and examples:** the list of functions and classes can be found in the section :ref:`hoi_modules`. For practical examples on how to use those functions, see :doc:`auto_examples/index`. For faster computations, HOI is built on top of Jax; check out the page :doc:`jax` for the performance claims.

Installation
++++++++++++

To install or update HOI, run the following command in your terminal:

.. code-block:: bash

    pip install -U hoi
Simulate data
+++++++++++++

We provide functions to simulate data and toy examples. In a notebook or in a Python script, you can run the following lines to simulate synergistic interactions between three variables:

.. code-block:: python

    from hoi.simulation import simulate_hoi_gauss

    data = simulate_hoi_gauss(n_samples=1000, triplet_character='synergy')
Compute Higher-Order Interactions
+++++++++++++++++++++++++++++++++

We provide a list of HOI metrics (see :ref:`metrics`). Here, we use the O-information (:class:`hoi.metrics.Oinfo`):

.. code-block:: python

    # import the O-information
    from hoi.metrics import Oinfo

    # define the model
    model = Oinfo(data)

    # compute hoi for multiplets with a minimum and maximum size of 3,
    # using the Gaussian Copula entropy
    hoi = model.fit(minsize=3, maxsize=3, method="gc")

Inspect the results
+++++++++++++++++++

To inspect your results, we provide a plotting function called :func:`hoi.plot.plot_landscape` to see how the information spreads across orders, together with :func:`hoi.utils.get_nbest_mult` to get a table of the multiplets with the strongest synergy or redundancy:

.. code-block:: python

    from hoi.plot import plot_landscape
    from hoi.utils import get_nbest_mult

    # plot the landscape
    plot_landscape(hoi, model=model)

    # print the summary table
    print(get_nbest_mult(hoi, model=model))
Practical recommendations
+++++++++++++++++++++++++

Robust estimation of HOI strongly relies on the accuracy of measuring entropy/mutual information on/between (potentially highly) multivariate data. In the :doc:`auto_examples/index` section you can find benchmarks of our entropy estimators. Here we recommend:

* **Measuring entropy and mutual information:** we recommend the Gaussian Copula method (`method="gc"`). Although this measure is not accurate for capturing relationships beyond the gaussian assumption (see :ref:`sphx_glr_auto_examples_it_plot_entropies.py`), it performs relatively well on multivariate data (see :ref:`sphx_glr_auto_examples_it_plot_entropies_mvar.py`).
* **Measuring Higher-Order Interactions for network behavior and network encoding:** for network behavior and encoding, we recommend respectively the O-information :class:`hoi.metrics.Oinfo` and the :class:`hoi.metrics.GradientOinfo`. Although both metrics suffer from the same limitations, like spreading to higher orders, this can be mitigated using a bootstrap approach (see :ref:`sphx_glr_auto_examples_statistics_plot_bootstrapping.py`). Otherwise, both metrics are usually accurate in retrieving the type of interactions between variables, especially when combined with the Gaussian Copula.
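As a side note on what `method="gc"` does conceptually: the Gaussian Copula approach first maps each variable's ranks onto a standard normal, then applies parametric gaussian estimators. A minimal sketch of that normalization step, for illustration only (this is not hoi's implementation):

```python
import numpy as np
from scipy.stats import norm, rankdata

def copnorm(x):
    """Map each row's ranks onto a standard normal (gaussian copula transform)."""
    ranks = rankdata(x, axis=-1)
    return norm.ppf(ranks / (x.shape[-1] + 1))

x = np.random.lognormal(size=(5, 1000))  # strongly non-gaussian marginals
xn = copnorm(x)                          # same shape, gaussian marginals
```

The transform is monotone per variable, so rank relationships (and hence the copula dependence structure) are preserved while marginals become gaussian.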
