Implement SparseKDE, QuickShift and add H2O-BLYP-Piglet dataset (#222)
* Add the class `SparseKDE`, located at `src/skmatter/utils/_sparsekde.py`.
  It mitigates the high cost of KDE on large datasets by performing KDE only
  on selected data points (e.g. grid points chosen by farthest point sampling).
  The class takes the original dataset as a parameter and fits the model
  using the sampled grid points. The corresponding tests can be found in
  `tests/test_neighbors.py`.
* Add the class `QuickShift` in `src/skmatter/clustering/_quick_shift.py`
  implementing the quick shift clustering algorithm with corresponding tests in
  `tests/test_clustering.py`.
* Add the H2O-BLYP-Piglet dataset, containing 27233 hydrogen bonds with 3D
  descriptors and weights. The corresponding tests can be found in
  `tests/test_datasets.py`.
* Add two auxiliary functions, `effdim` and `oas`, stored in
  `src/skmatter/utils/_sparsekde.py` with corresponding tests in
  `tests/test_neighbors.py`.
* Add two distance metrics compatible with periodic boundary conditions (PBC),
  `pairwise_euclidean_distances` and `pairwise_mahalanobis_distances`, stored in
  `src/skmatter/metrics/pairwise.py` with corresponding tests in
  `tests/test_metrics.py`.
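
The idea behind the first bullet can be illustrated with a minimal NumPy sketch. This is not the `SparseKDE` implementation from this commit: the helper names `farthest_point_sampling` and `gaussian_kde_on_grid`, the fixed isotropic bandwidth, and the random test data are all assumptions made for illustration, and the actual class may differ in bandwidth choice and other details.

```python
import numpy as np


def farthest_point_sampling(X, n_grid, start=0):
    """Greedily pick n_grid row indices of X that are mutually far apart."""
    selected = [start]
    # Distance from every sample to its nearest already-selected point.
    min_dist = np.linalg.norm(X - X[start], axis=1)
    for _ in range(n_grid - 1):
        nxt = int(np.argmax(min_dist))
        selected.append(nxt)
        min_dist = np.minimum(min_dist, np.linalg.norm(X - X[nxt], axis=1))
    return np.array(selected)


def gaussian_kde_on_grid(X, grid, bandwidth, weights=None):
    """Evaluate an isotropic Gaussian KDE built from X at the grid points only."""
    n, d = X.shape
    w = np.full(n, 1.0 / n) if weights is None else weights / weights.sum()
    # Squared distances between every grid point and every sample.
    sq_dist = ((grid[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    norm = (2.0 * np.pi * bandwidth**2) ** (-d / 2.0)
    return norm * (np.exp(-0.5 * sq_dist / bandwidth**2) @ w)


rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))  # hypothetical dataset
grid_idx = farthest_point_sampling(X, n_grid=20)
density = gaussian_kde_on_grid(X, X[grid_idx], bandwidth=0.5)
```

The density is evaluated at 20 grid points instead of all 500 samples, which is where the cost saving comes from.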
GardevoirX authored Oct 9, 2024
1 parent 3c784c9 commit ad56b1d
Showing 29 changed files with 2,251 additions and 1 deletion.
7 changes: 7 additions & 0 deletions CHANGELOG
@@ -16,6 +16,13 @@ The rules for CHANGELOG file:
- Updating ``FPS`` to allow a numpy array of ints as an initialize parameter (#145)
- Supported Python versions are now ranging from 3.9 - 3.12.
- Updating ``skmatter.datasets`` submodule to support sklearn 1.5.0 (#229)
- Add `SparseKDE` class (#222)
- Add `QuickShift` class (#222)
- Add an example on how to run the PAMM algorithm with `SparseKDE` and `QuickShift`
  (#222)
- Add H2O-BLYP-Piglet dataset (#222)
- Add two distance metrics that support periodic boundary conditions,
  `periodic_pairwise_euclidean_distances` and `pairwise_mahalanobis_distances` (#222)

0.2.0 (2023/08/24)
------------------
6 changes: 6 additions & 0 deletions docs/src/bibliography.rst
@@ -6,6 +6,12 @@ References
"Principal covariates regression: Part I. Theory", Chemom. intell. lab. syst. 14
(1992) 155-164 https://doi.org/10.1016/0169-7439(92)80100-I
.. [Gasparotto2014]
Piero Gasparotto, Michele Ceriotti,
"Recognizing molecular patterns by machine learning: An agnostic structural
definition of the hydrogen bond", J. Chem. Phys., 141 (17): 174110.
https://doi.org/10.1063/1.4900655.
.. [Imbalzano2018]
Giulio Imbalzano, Andrea Anelli, Daniele Giofré,Sinja Klees, Jörg Behler, and
Michele Ceriotti, “Automatic selection of atomic fingerprints and reference
2 changes: 1 addition & 1 deletion docs/src/conf.py
@@ -54,7 +54,7 @@
"sphinx_toggleprompt",
]

example_subdirs = ["pcovr", "selection", "regression", "reconstruction"]
example_subdirs = ["pcovr", "selection", "regression", "reconstruction", "neighbors"]
sphinx_gallery_conf = {
"filename_pattern": "/*",
"examples_dirs": [f"../../examples/{p}" for p in example_subdirs],
11 changes: 11 additions & 0 deletions docs/src/references/clustering.rst
@@ -0,0 +1,11 @@
Clustering
==========

.. automodule:: skmatter.clustering

.. _quick-shift-api:

Quick Shift
------------

.. autoclass:: skmatter.clustering.QuickShift
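
For readers unfamiliar with quick shift, the core linking step can be sketched as follows. This is a simplified illustration, not the `QuickShift` class added in this commit: the function name `quick_shift_sketch`, its parameters, and the toy data are assumptions. Each point is linked to its nearest neighbour of higher density within a distance cutoff, and points without such a neighbour become cluster roots.

```python
import numpy as np


def quick_shift_sketch(X, density, dist_cutoff):
    """Label each point by the root it reaches via higher-density links."""
    n = len(X)
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    parent = np.arange(n)  # every point starts as its own root
    for i in range(n):
        higher = np.flatnonzero(density > density[i])
        if higher.size:
            j = higher[np.argmin(dist[i, higher])]
            if dist[i, j] <= dist_cutoff:
                parent[i] = j
    # Follow the links until every point has reached a root.
    labels = parent.copy()
    while True:
        nxt = parent[labels]
        if np.array_equal(nxt, labels):
            break
        labels = nxt
    return labels


# Two well-separated 1D groups with one density peak each (toy data).
X = np.array([[0.0], [0.1], [0.2], [5.0], [5.1], [5.2]])
density = np.array([1.0, 2.0, 1.0, 1.0, 3.0, 1.0])
labels = quick_shift_sketch(X, density, dist_cutoff=1.0)
```

Because links always point towards strictly higher density, the link graph has no cycles and the loop terminates.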
2 changes: 2 additions & 0 deletions docs/src/references/datasets.rst
@@ -5,6 +5,8 @@ Datasets

.. include:: ../../../src/skmatter/datasets/descr/degenerate_CH4_manifold.rst

.. include:: ../../../src/skmatter/datasets/descr/h2o-blyp-piglet.rst

.. include:: ../../../src/skmatter/datasets/descr/nice_dataset.rst

.. include:: ../../../src/skmatter/datasets/descr/who_dataset.rst
2 changes: 2 additions & 0 deletions docs/src/references/index.rst
@@ -10,7 +10,9 @@ API Reference
preprocessing
selection
linear_models
clustering
decomposition
metrics
neighbors
datasets
utils
15 changes: 15 additions & 0 deletions docs/src/references/metrics.rst
@@ -40,3 +40,18 @@ Component-wise Prediction Rigidity
----------------------------------

.. autofunction:: skmatter.metrics.componentwise_prediction_rigidity


.. _pairwise-euclidian-api:

Pairwise Euclidean Distances
----------------------------

.. autofunction:: skmatter.metrics.periodic_pairwise_euclidean_distances

.. _pairwise-mahalanobis-api:

Pairwise Mahalanobis Distance
-----------------------------

.. autofunction:: skmatter.metrics.pairwise_mahalanobis_distances
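
What "periodic" means for the Euclidean metric can be shown with a short sketch based on the minimum image convention. This is not the skmatter function documented above: the name `periodic_euclidean_sketch`, the `cell_length` argument, and the restriction to an orthorhombic cell are assumptions for illustration only.

```python
import numpy as np


def periodic_euclidean_sketch(X, Y, cell_length=None):
    """Pairwise Euclidean distances, optionally in an orthorhombic periodic cell."""
    diff = X[:, None, :] - Y[None, :, :]
    if cell_length is not None:
        cell_length = np.asarray(cell_length, dtype=float)
        # Minimum image convention: wrap each component into [-L/2, L/2].
        diff -= cell_length * np.round(diff / cell_length)
    return np.sqrt((diff**2).sum(axis=-1))


X = np.array([[0.1, 0.0]])
Y = np.array([[0.9, 0.0]])
d_open = periodic_euclidean_sketch(X, Y)  # plain distance, 0.8
d_pbc = periodic_euclidean_sketch(X, Y, cell_length=[1.0, 1.0])  # wrapped, 0.2
```

With a unit cell, the two points are closer through the boundary than through the interior, so the periodic distance is the shorter of the two.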
16 changes: 16 additions & 0 deletions docs/src/references/neighbors.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,16 @@
Neighbors
=========

.. automodule:: skmatter.neighbors

.. _sparse-kde-api:

Sparse Kernel Density Estimation
--------------------------------

.. autoclass:: skmatter.neighbors.SparseKDE
    :show-inheritance:

    .. automethod:: fit
    .. automethod:: score_samples
    .. automethod:: score
11 changes: 11 additions & 0 deletions docs/src/references/utils.rst
@@ -30,3 +30,14 @@ Random Partitioning with Overlaps
---------------------------------

.. autofunction:: skmatter.model_selection.train_test_split


Effective Dimension of Covariance Matrix
----------------------------------------

.. autofunction:: skmatter.utils.effdim

Oracle Approximating Shrinkage
------------------------------

.. autofunction:: skmatter.utils.oas
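
One common way to define the effective dimension of a covariance matrix is the exponential of the Shannon entropy of its normalized eigenvalues. The sketch below follows that definition. Whether `skmatter.utils.effdim` uses exactly this formula is an assumption, and the name `effective_dimension` is hypothetical.

```python
import numpy as np


def effective_dimension(cov):
    """Exponential of the Shannon entropy of the normalized eigenvalues."""
    eigval = np.linalg.eigvalsh(cov)
    eigval = eigval[eigval > 0.0]  # drop zero (or numerically negative) modes
    p = eigval / eigval.sum()
    return float(np.exp(-(p * np.log(p)).sum()))


# Equal variance in all three directions versus variance in one direction only.
isotropic = effective_dimension(np.eye(3))
one_direction = effective_dimension(np.diag([1.0, 0.0, 0.0]))
```

Under this definition an isotropic covariance in three dimensions has effective dimension 3, while a covariance with all variance along a single axis has effective dimension 1.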
1 change: 1 addition & 0 deletions docs/src/tutorials.rst
@@ -6,3 +6,4 @@
examples/selection/index
examples/regression/index
examples/reconstruction/index
examples/neighbors/index
2 changes: 2 additions & 0 deletions examples/neighbors/README.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
Neighbors
=========