
Introduce PIC approximation #492

Open · peddie wants to merge 12 commits into master

Conversation
@peddie (Contributor) commented Aug 5, 2024

This commit implements the Partially Independent Conditional (PIC) sparse GP approximation introduced in
https://proceedings.mlr.press/v2/snelson07a/snelson07a.pdf

Although I'm not aware of any outstanding bugs, this approximation remains experimental at the time of writing.
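
For readers new to the method, a rough summary (notation adapted from the paper, not from this codebase): given inducing points $u$ and a partition of the training data into independent groups, the exact prior covariance is replaced by

$$K_{\mathrm{PIC}} = Q_{ff} + \mathrm{blkdiag}\left[K_{ff} - Q_{ff}\right], \qquad Q_{ff} = K_{fu} K_{uu}^{-1} K_{uf},$$

so each group keeps its exact within-block covariance while cross-block structure flows only through the inducing points. What distinguishes PIC from PITC is prediction: a test point is treated as a member of its group's block, so it is conditioned on that block's data exactly rather than only through $u$.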


Still reviewing the equations of the first section.
The Woodbury lemma variant proof is pretty brutal!
I wonder whether the condition you highlighted in the conclusion is strictly necessary, or whether it was only needed for this derivation. Maybe one could just multiply the inverse by $(A - B C^{-1} B^\top)$ on the right or left and check that the product is the identity, and maybe that doesn't assume those conditions? (Just an idea.)
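
For reference, the identity in question is presumably the Schur-complement variant of the Woodbury lemma,

$$\left(A - B C^{-1} B^\top\right)^{-1} = A^{-1} + A^{-1} B \left(C - B^\top A^{-1} B\right)^{-1} B^\top A^{-1},$$

and multiplying either side by $\left(A - B C^{-1} B^\top\right)$ does verify it directly; that check only needs $A$, $C$, and $C - B^\top A^{-1} B$ to be invertible.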


Still reviewing/trying to understand; I'll message you offline if I find something in the equations.

K_xu[i] = cov_(x, inducing_points_[i]);
K_uy[i] = cov_(inducing_points_[i], y);
}
// const Eigen::VectorXd K_uy = cov_(inducing_points_, y);
Collaborator:

nit: remove comment


template <typename CovarianceType, typename InducingFeatureType,
          typename GrouperFunction>
class BruteForcePIC
Collaborator:

Can you add a comment describing what brute-force PIC does?
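
For what it's worth, here is a minimal standalone sketch of what I'd expect such a comment to describe (Eigen only; this is not the albatross API, and every name is made up for illustration): form the Nyström approximation through the inducing points, then restore the exact covariance within each independent block, and run exact GP algebra against the result.

```cpp
#include <vector>

#include <Eigen/Dense>

// Hypothetical illustration of the brute-force PIC/PITC training prior:
// Q = K_fu K_uu^{-1} K_uf everywhere, with the exact covariance restored
// inside each independent block. `blocks` lists the training indices
// belonging to each group.
Eigen::MatrixXd brute_force_pic_train_covariance(
    const Eigen::MatrixXd &K_ff, const Eigen::MatrixXd &K_fu,
    const Eigen::MatrixXd &K_uu,
    const std::vector<std::vector<Eigen::Index>> &blocks) {
  const Eigen::MatrixXd Q = K_fu * K_uu.ldlt().solve(K_fu.transpose());
  Eigen::MatrixXd K_pic = Q;
  for (const auto &block : blocks) {
    for (const Eigen::Index i : block) {
      for (const Eigen::Index j : block) {
        // Within a block, PIC keeps the exact prior covariance.
        K_pic(i, j) = K_ff(i, j);
      }
    }
  }
  return K_pic;
}
```

An exact fit against `K_pic` is O(n^3), so presumably this class exists as a reference implementation for testing the fast path rather than for production use.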


void shift_mean(const Eigen::VectorXd &mean_shift) {
ALBATROSS_ASSERT(mean_shift.size() == information.size());
information += train_covariance.solve(mean_shift);
Collaborator:

Do we also have to shift the local group values (mean_w)? Maybe worth just hard-failing here (or removing this?) if it hasn't been tested.
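
A minimal sketch of the hard-fail suggested above (assuming the surrounding class context; the assertion message is made up):

```cpp
void shift_mean(const Eigen::VectorXd &mean_shift) {
  ALBATROSS_ASSERT(mean_shift.size() == information.size());
  // The PIC fit also carries per-group correction terms (mean_w) that this
  // shift does not update; fail loudly until that interaction is tested.
  ALBATROSS_ASSERT(false && "shift_mean() is untested for PIC fits");
  information += train_covariance.solve(mean_shift);
}
```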

};

/*
* This class implements an approximation technique for Gaussian processes
Collaborator:

Is this for PIC or PITC?

Comment on lines +506 to +585
MarginalDistribution _predict_impl(
    const std::vector<FeatureType> &features,
    const Fit<PICGPFit<GrouperFunction, InducingFeatureType, FitFeatureType>>
        &sparse_gp_fit,
    PredictTypeIdentity<MarginalDistribution> &&) const {
  const auto K_up =
      this->covariance_function_(sparse_gp_fit.inducing_features, features);
  Eigen::VectorXd mean = gp_mean_prediction(K_up, sparse_gp_fit.information);
  this->mean_function_.add_to(features, &mean);

  // First pass: compute an O(1) mapping from features to groups
  std::vector<std::size_t> feature_to_block(features.size());
  std::vector<decltype(sparse_gp_fit.measurement_groups.begin())> groups(
      features.size());
  Eigen::VectorXi col_alloc{Eigen::VectorXi::Zero(features.size())};
  bool all_same_group = true;
  for (Eigen::Index j = 0; j < cast::to_index(features.size()); ++j) {
    groups[j] = sparse_gp_fit.measurement_groups.find(
        independent_group_function_(without_measurement(features[j])));
    if (groups[j] != sparse_gp_fit.measurement_groups.end()) {
      col_alloc(j) = groups[j]->second.block_size;
      all_same_group = all_same_group && groups[j] == groups[0];
    }
    feature_to_block[j] =
        std::distance(sparse_gp_fit.measurement_groups.begin(), groups[j]);
  }

  // Second pass: compute mean vector and fill sparse Vp matrix
  Eigen::VectorXd mean_correction = Eigen::VectorXd::Zero(features.size());
  Eigen::SparseMatrix<double> Vp(sparse_gp_fit.train_features.size(),
                                 features.size());
  Vp.reserve(col_alloc);
  for (Eigen::Index j = 0; j < cast::to_index(features.size()); ++j) {
    if (groups[j] == sparse_gp_fit.measurement_groups.end()) {
      continue;
    }

    const auto &B = groups[j]->second;
    // Exact local cross-covariance minus its approximation through the
    // inducing points.
    Eigen::VectorXd Vbp =
        this->covariance_function_(B.dataset.features,
                                   std::vector<FeatureType>{features[j]}) -
        (K_up.col(j).transpose() * sparse_gp_fit.covariance_Y[B.block_index])
            .transpose();
    for (Eigen::Index i = 0; i < Vbp.size(); ++i) {
      Vp.insert(B.initial_row + i, j) = Vbp(i);
    }
    mean_correction[j] = Vbp.dot(sparse_gp_fit.mean_w[B.block_index]);
  }
  Vp.makeCompressed();

  // Local-block variance contribution: diag(Vp^T (A^{-1} - Z^T Z) Vp).
  Eigen::MatrixXd xi_lambda = sparse_gp_fit.A_ldlt.sqrt_solve(Vp);
  Eigen::MatrixXd xi_u = sparse_gp_fit.Z * Vp;
  Eigen::VectorXd VSV_diag{
      (xi_lambda.transpose() * xi_lambda - xi_u.transpose() * xi_u)
          .diagonal()};

  // Cross term coupling the inducing-point and local-block corrections.
  const Eigen::VectorXd U_diag =
      (K_up.transpose() * sparse_gp_fit.W * Vp).diagonal();

  Eigen::VectorXd marginal_variance(cast::to_index(features.size()));
  for (Eigen::Index i = 0; i < marginal_variance.size(); ++i) {
    marginal_variance[i] = this->covariance_function_(
        features[cast::to_size(i)], features[cast::to_size(i)]);
  }

  // Subtract the explained (Nystrom) part: diag(K_pu K_uu^{-1} K_up).
  const Eigen::MatrixXd Q_sqrt =
      sparse_gp_fit.train_covariance.sqrt_solve(K_up);
  const Eigen::VectorXd Q_diag =
      Q_sqrt.cwiseProduct(Q_sqrt).array().colwise().sum();
  marginal_variance -= Q_diag;

  // Add back the posterior uncertainty of the inducing points.
  const Eigen::MatrixXd S_sqrt =
      sqrt_solve(sparse_gp_fit.sigma_R, sparse_gp_fit.P, K_up);
  const Eigen::VectorXd S_diag =
      S_sqrt.cwiseProduct(S_sqrt).array().colwise().sum();
  marginal_variance += S_diag;

  mean += mean_correction;

  return MarginalDistribution(mean,
                              marginal_variance - (2 * U_diag + VSV_diag));
Collaborator:

👏 can't say I follow all of it, but I know this was a tough one. Well done.
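
For anyone else trying to follow the variance assembly: reading the symbols off the fit object, the returned marginal for a test point $p$ appears to decompose as

$$\operatorname{var}[f_p] = K_{pp} - K_{pu} K_{uu}^{-1} K_{up} + K_{pu}\,\Sigma\,K_{up} - 2\,K_{pu} W V_p - V_p^\top\!\left(A^{-1} - Z^\top Z\right) V_p,$$

where the terms after the prior variance correspond to Q_diag, S_diag, U_diag, and VSV_diag; $\Sigma$ is the inducing-point posterior covariance implied by sigma_R and P, and $V_p$ holds the local-block correction columns assembled in the second pass.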

#include <fstream>
#include <gtest/gtest.h>

// #include "albatross/src/eigen/serializable_ldlt.hpp"
Collaborator:

nit: remove

@akleeman (Collaborator) left a comment:

A few nits, and (as you're aware) I'm sure there are optimizations/improvements, BUT this is F*&#ing awesome, and I don't see any changes here that would cause backward incompatibility for use cases currently using PITC. Good job!
