embed comments from review
agoscinski committed Aug 8, 2023
1 parent 28980c4 commit 2691cae
Showing 7 changed files with 133 additions and 152 deletions.
53 changes: 6 additions & 47 deletions docs/src/getting-started.rst
@@ -7,25 +7,8 @@ For a detailed explanation, please look at the :ref:`selection-api`
Features and Samples Selection
------------------------------

.. include:: selection.rst
:start-after: marker-selection-introduction-begin
:end-before: marker-selection-introduction-end


These selectors are available:

* :ref:`CUR-api`: an iterative feature selection method based upon the
singular value decomposition.
* :ref:`PCov-CUR-api` decomposition extends upon CUR by using augmented right or left
singular vectors inspired by Principal Covariates Regression.
* :ref:`FPS-api`: a common selection technique intended to exploit the diversity of
the input space. The selection of the first point is made at random or by a
separate metric.
* :ref:`PCov-FPS-api` extends upon FPS much like PCov-CUR does to CUR.
* :ref:`Voronoi-FPS-api`: conducts FPS selection, taking advantage of Voronoi
tessellations to accelerate selection.
* :ref:`DCH-api`: selects samples by constructing a directional convex hull and
determining which samples lie on the bounding surface.
.. automodule:: skmatter._selection
:noindex:

Examples
^^^^^^^^
@@ -37,19 +20,8 @@ Examples
Reconstruction Measures
-----------------------

.. include:: gfrm.rst
:start-after: marker-reconstruction-introduction-begin
:end-before: marker-reconstruction-introduction-end


These reconstruction measures are available:

* :ref:`GRE-api` (GRE) computes the amount of linearly-decodable information
recovered through a global linear reconstruction.
* :ref:`GRD-api` (GRD) computes the amount of distortion contained in a global linear
reconstruction.
* :ref:`LRE-api` (LRE) computes the amount of decodable information recovered through
a local linear reconstruction for the k-nearest neighborhood of each sample.
.. automodule:: skmatter.metrics
:noindex:

Examples
^^^^^^^^
@@ -60,21 +32,8 @@ Examples
Principal Covariates Regression
-------------------------------

.. include:: pcovr.rst
:start-after: marker-pcovr-introduction-begin
:end-before: marker-pcovr-introduction-end

It includes

* :ref:`PCovR-api` the standard Principal Covariates Regression. Utilises a
combination of a PCA-like and an LR-like loss, and therefore attempts to find
a low-dimensional projection of the feature vectors that simultaneously minimises
information loss and error in predicting the target properties using only the
latent space vectors :math:`\mathbf{T}`.
* :ref:`KPCovR-api` the Kernel Principal Covariates Regression, a kernel-based
variation on the original PCovR method, proposed in [Helfrecht2020]_.

.. automodule:: skmatter.decomposition
:noindex:

Examples
^^^^^^^^
23 changes: 10 additions & 13 deletions docs/src/gfrm.rst
@@ -5,19 +5,16 @@

.. marker-reconstruction-introduction-begin
A set of easily-interpretable error measures of the relative information capacity of
feature space `F` with respect to feature space `F'`. The methods return a value
between 0 and 1, where 0 means that `F` and `F'` are completely distinct in terms of
linearly-decodable information, and where 1 means that `F'` is contained in `F`. All
methods are implemented as the root mean-square error for the regression of the
feature matrix `X_F'` (or sometimes called `Y` in the doc) from `X_F` (or sometimes
called `X` in the doc) for transformations with different constraints (linear,
orthogonal, locally-linear). By default, a custom 2-fold cross-validation
:py:class:`skmatter.linear_model.RidgeRegression2FoldCV` is used to ensure the
generalization of the transformation and efficiency of the computation, since we deal
with a multi-target regression problem. Methods were applied to compare different
forms of featurizations through different hyperparameters and induced metrics and
kernels [Goscinski2021]_.
.. automodule:: skmatter.metrics

These reconstruction measures are available:

* :ref:`GRE-api` (GRE) computes the amount of linearly-decodable information
recovered through a global linear reconstruction.
* :ref:`GRD-api` (GRD) computes the amount of distortion contained in a global linear
reconstruction.
* :ref:`LRE-api` (LRE) computes the amount of decodable information recovered through
a local linear reconstruction for the k-nearest neighborhood of each sample.

.. marker-reconstruction-introduction-end
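A minimal sketch of calling the two global measures (the function names,
`global_reconstruction_error` and `global_reconstruction_distortion`, and the
two-matrix call signature are assumptions of this sketch based on the GRE/GRD
descriptions above; `X_F` and `X_Fprime` stand for the feature matrices of `F`
and `F'`):

.. doctest::

>>> import numpy as np
>>> from skmatter.metrics import global_reconstruction_distortion, global_reconstruction_error
>>> rng = np.random.default_rng(0)
>>> X_F = rng.normal(size=(20, 4))  # feature space F
>>> X_Fprime = X_F[:, :2]  # feature space F', here a subset of F
>>> gre = global_reconstruction_error(X_F, X_Fprime)  # GRE of reconstructing F' from F
>>> grd = global_reconstruction_distortion(X_F, X_Fprime)  # GRD of the same reconstruction
>>> bool(gre >= 0.0) and bool(grd >= 0.0)  # both are root-mean-square errors
True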
23 changes: 0 additions & 23 deletions docs/src/pcovr.rst
@@ -1,29 +1,6 @@
Principal Covariates Regression (PCovR)
=======================================


.. marker-pcovr-introduction-begin
Often, one wants to construct new ML features from their
current representation in order to compress data or visualise
trends in the dataset. In the archetypal method for this
dimensionality reduction, principal components analysis (PCA),
features are transformed into the latent space which best
preserves the variance of the original data. Principal Covariates
Regression (PCovR), as introduced by [deJong1992]_,
is a modification to PCA that incorporates target information,
such that the resulting embedding could be tuned using a
mixing parameter α to improve performance in regression
tasks (:math:`\alpha = 0` corresponding to linear regression
and :math:`\alpha = 1` corresponding to PCA).
[Helfrecht2020]_ introduced the non-linear
version, Kernel Principal Covariates Regression (KPCovR),
where the mixing parameter α now interpolates between kernel ridge
regression (:math:`\alpha = 0`) and kernel principal components
analysis (KPCA, :math:`\alpha = 1`).

.. marker-pcovr-introduction-end
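A minimal usage sketch of PCovR (keyword names and shapes follow the description
above; the exact defaults are assumptions):

.. doctest::

>>> import numpy as np
>>> from skmatter.decomposition import PCovR
>>> X = np.array([[-1.0, 1.0, -3.0], [1.0, -2.0, 1.0], [-2.0, 0.0, -2.0], [2.0, 1.0, 4.0]])
>>> Y = np.array([[0.0, -5.0], [-1.0, 1.0], [1.0, -5.0], [-3.0, 2.0]])
>>> # the mixing parameter alpha = 0.5 balances the PCA-like and the LR-like loss
>>> pcovr = PCovR(mixing=0.5, n_components=2)
>>> _ = pcovr.fit(X, Y)
>>> T = pcovr.transform(X)  # projection onto the latent space
>>> print(T.shape)
(4, 2)
>>> Yp = pcovr.predict(X)  # targets predicted from the latent space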
.. _PCovR-api:

PCovR
65 changes: 1 addition & 64 deletions docs/src/selection.rst
@@ -3,70 +3,7 @@
Feature and Sample Selection
============================

.. marker-selection-introduction-begin
Data sub-selection modules primarily corresponding to methods derived from
CUR matrix decomposition and Farthest Point Sampling. In their classical form,
CUR and FPS determine a data subset that maximizes the variance (CUR) or
distribution (FPS) of the features or samples.
These methods can be modified to incorporate supervised target information, denoted by
the methods `PCov-CUR` and `PCov-FPS`.
For further reading, refer to [Imbalzano2018]_ and [Cersonsky2021]_.

These selectors can be used for both feature and sample selection, with similar
instantiations. All sub-selection methods score each feature or sample
(without an estimator)
and choose the one with the maximum score. As a simple example:

.. doctest::

>>> # feature selection
>>> import numpy as np
>>> from skmatter.feature_selection import CUR, FPS, PCovCUR, PCovFPS
>>> selector = CUR(
... # the number of selections to make
... # if None, set to half the samples or features
... # if float, fraction of the total dataset to select
... # if int, absolute number of selections to make
... n_to_select=2,
... # option to use `tqdm <https://tqdm.github.io/>`_ progress bar
... progress_bar=True,
... # float, cutoff score to stop selecting
... score_threshold=1e-12,
... # boolean, whether to select randomly after non-redundant selections
... # are exhausted
... full=False,
... )
>>> X = np.array(
... [
... [0.12, 0.21, 0.02], # 3 samples, 3 features
... [-0.09, 0.32, -0.10],
... [-0.03, -0.53, 0.08],
... ]
... )
>>> y = np.array([0.0, 0.0, 1.0]) # classes of each sample
>>> selector.fit(X)
CUR(n_to_select=2, progress_bar=True, score_threshold=1e-12)
>>> Xr = selector.transform(X)
>>> print(Xr.shape)
(3, 2)
>>> selector = PCovCUR(n_to_select=2)
>>> selector.fit(X, y)
PCovCUR(n_to_select=2)
>>> Xr = selector.transform(X)
>>> print(Xr.shape)
(3, 2)
>>>
>>> # Now sample selection
>>> from skmatter.sample_selection import CUR, FPS, PCovCUR, PCovFPS
>>> selector = CUR(n_to_select=2)
>>> selector.fit(X)
CUR(n_to_select=2)
>>> Xr = X[selector.selected_idx_]
>>> print(Xr.shape)
(2, 3)

.. marker-selection-introduction-end
.. automodule:: skmatter._selection

.. _CUR-api:

76 changes: 74 additions & 2 deletions src/skmatter/_selection.py
@@ -1,5 +1,77 @@
"""
Sequential selection
r"""
This module contains data sub-selection modules primarily corresponding to
methods derived from CUR matrix decomposition and Farthest Point Sampling. In
their classical form, CUR and FPS determine a data subset that maximizes the
variance (CUR) or distribution (FPS) of the features or samples. These methods
can be modified to incorporate supervised target information, denoted by the methods
`PCov-CUR` and `PCov-FPS`. For further reading, refer to [Imbalzano2018]_ and
[Cersonsky2021]_. These selectors can be used for both feature and sample
selection, with similar instantiations. All sub-selection methods score each
feature or sample (without an estimator) and choose the one with the maximum
score. A simple example of usage:
.. doctest::
>>> # feature selection
>>> import numpy as np
>>> from skmatter.feature_selection import CUR, FPS, PCovCUR, PCovFPS
>>> selector = CUR(
... # the number of selections to make
... # if None, set to half the samples or features
... # if float, fraction of the total dataset to select
... # if int, absolute number of selections to make
... n_to_select=2,
... # option to use `tqdm <https://tqdm.github.io/>`_ progress bar
... progress_bar=True,
... # float, cutoff score to stop selecting
... score_threshold=1e-12,
... # boolean, whether to select randomly after non-redundant selections
... # are exhausted
... full=False,
... )
>>> X = np.array(
... [
... [0.12, 0.21, 0.02], # 3 samples, 3 features
... [-0.09, 0.32, -0.10],
... [-0.03, -0.53, 0.08],
... ]
... )
>>> y = np.array([0.0, 0.0, 1.0]) # classes of each sample
>>> selector.fit(X)
CUR(n_to_select=2, progress_bar=True, score_threshold=1e-12)
>>> Xr = selector.transform(X)
>>> print(Xr.shape)
(3, 2)
>>> selector = PCovCUR(n_to_select=2)
>>> selector.fit(X, y)
PCovCUR(n_to_select=2)
>>> Xr = selector.transform(X)
>>> print(Xr.shape)
(3, 2)
>>>
>>> # Now sample selection
>>> from skmatter.sample_selection import CUR, FPS, PCovCUR, PCovFPS
>>> selector = CUR(n_to_select=2)
>>> selector.fit(X)
CUR(n_to_select=2)
>>> Xr = X[selector.selected_idx_]
>>> print(Xr.shape)
(2, 3)
These selectors are available:
* :ref:`CUR-api`: an iterative feature selection method based upon the
singular value decomposition.
* :ref:`PCov-CUR-api` decomposition extends upon CUR by using augmented right or left
singular vectors inspired by Principal Covariates Regression.
* :ref:`FPS-api`: a common selection technique intended to exploit the diversity of
the input space. The selection of the first point is made at random or by a
separate metric (see the short FPS sketch after this list).
* :ref:`PCov-FPS-api` extends upon FPS much like PCov-CUR does to CUR.
* :ref:`Voronoi-FPS-api`: conducts FPS selection, taking advantage of Voronoi
tessellations to accelerate selection.
* :ref:`DCH-api`: selects samples by constructing a directional convex hull and
determining which samples lie on the bounding surface.
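For example, a farthest point sampling run on the same `X` (a minimal sketch; the
`initialize` keyword used here to pick the index of the first selection is an
assumption of this sketch):

.. doctest::

>>> from skmatter.feature_selection import FPS
>>> selector = FPS(n_to_select=2, initialize=0)  # start from the first feature
>>> _ = selector.fit(X)  # unsupervised, only X is needed
>>> print(selector.transform(X).shape)
(3, 2)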
"""

import numbers
28 changes: 25 additions & 3 deletions src/skmatter/decomposition/__init__.py
@@ -1,6 +1,28 @@
"""
The :mod:`skmatter.decomposition` module includes the two distance
measures, as defined by Principal Covariates Regression (PCovR)
r"""
Often, one wants to construct new ML features from their current representation
in order to compress data or visualise trends in the dataset. In the archetypal
method for this dimensionality reduction, principal components analysis (PCA),
features are transformed into the latent space which best preserves the
variance of the original data. This module provides Principal Covariates
Regression (PCovR), as introduced by [deJong1992]_, a modification to PCA
that incorporates target information, such that the resulting embedding can
be tuned using a mixing parameter α to improve performance in regression tasks
(:math:`\alpha = 0` corresponding to linear regression and :math:`\alpha = 1`
corresponding to PCA). [Helfrecht2020]_ introduced the non-linear version,
Kernel Principal Covariates Regression (KPCovR), where the mixing parameter α
now interpolates between kernel ridge regression (:math:`\alpha = 0`) and
kernel principal components analysis (KPCA, :math:`\alpha = 1`).
The module includes:
* :ref:`PCovR-api` the standard Principal Covariates Regression. Utilises a
combination of a PCA-like and an LR-like loss, and therefore attempts to find
a low-dimensional projection of the feature vectors that simultaneously minimises
information loss and error in predicting the target properties using only the
latent space vectors :math:`\mathbf{T}`.
* :ref:`KPCovR-api` the Kernel Principal Covariates Regression, a kernel-based
variation on the original PCovR method, proposed in [Helfrecht2020]_ (a short
usage sketch follows this list).
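A minimal usage sketch of the kernel variant (passing a matching
:class:`sklearn.kernel_ridge.KernelRidge` regressor and the `kernel`/`gamma` keywords
is an assumption of this sketch, not a definitive API reference):

.. doctest::

>>> import numpy as np
>>> from sklearn.kernel_ridge import KernelRidge
>>> from skmatter.decomposition import KernelPCovR
>>> X = np.array([[-1.0, 1.0], [1.0, -2.0], [-2.0, 0.0], [2.0, 1.0]])
>>> Y = np.array([[0.0, -1.0], [1.0, 0.5], [-1.0, 2.0], [2.0, -0.5]])
>>> kpcovr = KernelPCovR(
...     mixing=0.5,
...     n_components=2,
...     regressor=KernelRidge(kernel="rbf", gamma=1.0),
...     kernel="rbf",
...     gamma=1.0,
... )
>>> _ = kpcovr.fit(X, Y)  # alpha (mixing) interpolates between KRR and KPCA
>>> print(kpcovr.transform(X).shape)  # latent space T
(4, 2)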
"""

from ._kernel_pcovr import KernelPCovR
17 changes: 17 additions & 0 deletions src/skmatter/metrics/__init__.py
@@ -1,3 +1,20 @@
r"""
This module contains a set of easily-interpretable error measures of the
relative information capacity of feature space `F` with respect to feature
space `F'`. The methods return a value between 0 and 1, where 0 means that
`F` and `F'` are completely distinct in terms of linearly-decodable
information, and where 1 means that `F'` is contained in `F`. All methods
are implemented as the root mean-square error for the regression of the
feature matrix `X_F'` (or sometimes called `Y` in the doc) from `X_F` (or
sometimes called `X` in the doc) for transformations with different
constraints (linear, orthogonal, locally-linear). By default, a custom 2-fold
cross-validation :py:class:`skmatter.linear_model.RidgeRegression2FoldCV` is
used to ensure the generalization of the transformation and efficiency of
the computation, since we deal with a multi-target regression problem.
Methods were applied to compare different forms of featurizations through
different hyperparameters and induced metrics and kernels [Goscinski2021]_.
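A minimal sketch of the local measure (the `local_reconstruction_error` name and its
third argument, the number of local points, are assumptions of this sketch):

.. doctest::

>>> import numpy as np
>>> from skmatter.metrics import local_reconstruction_error
>>> rng = np.random.default_rng(0)
>>> X_F = rng.normal(size=(30, 4))  # feature space F
>>> X_Fprime = X_F @ rng.normal(size=(4, 3))  # feature space F', a linear map of F
>>> lre = local_reconstruction_error(X_F, X_Fprime, 5)  # local fits over 5 nearest neighbours
>>> bool(lre >= 0.0)  # a root-mean-square error, hence non-negative
True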
"""

from ._reconstruction_measures import (
check_global_reconstruction_measures_input,
check_local_reconstruction_measures_input,
