
Introduce PIC approximation #492

Open · peddie wants to merge 12 commits into master

Conversation
@peddie (Contributor) commented Aug 5, 2024

This commit implements the Partially Independent Conditional (PIC) sparse GP approximation introduced in
https://proceedings.mlr.press/v2/snelson07a/snelson07a.pdf

Although I'm not aware of any outstanding bugs, this approximation remains experimental at the time of writing.
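
For readers new to the method, a rough summary (notation adapted from the paper, not from this codebase): given inducing points $u$ and a partition of the training data into independent groups, the exact prior covariance is replaced by

$$K_{\mathrm{PIC}} = Q_{ff} + \mathrm{blkdiag}\left[K_{ff} - Q_{ff}\right], \qquad Q_{ff} = K_{fu} K_{uu}^{-1} K_{uf},$$

so each group keeps its exact within-block covariance while cross-block structure flows only through the inducing points. What distinguishes PIC from PITC is prediction: a test point is treated as a member of its group's block, so it is conditioned on that block's data exactly rather than only through $u$.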


Still reviewing the equations of the first section.
The Woodbury lemma variant proof is pretty brutal!
I wonder whether the condition you highlighted in the conclusion is strictly necessary, or whether it was only needed for this derivation. Maybe one could just multiply the inverse by $(A - B C^{-1} B^\top)$ on the right or left and check that the product is the identity, and maybe that doesn't assume those conditions? (Just an idea.)
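
For reference, the identity in question is presumably the Schur-complement variant of the Woodbury lemma,

$$\left(A - B C^{-1} B^\top\right)^{-1} = A^{-1} + A^{-1} B \left(C - B^\top A^{-1} B\right)^{-1} B^\top A^{-1},$$

and multiplying either side by $\left(A - B C^{-1} B^\top\right)$ does verify it directly; that check only needs $A$, $C$, and $C - B^\top A^{-1} B$ to be invertible.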


Still reviewing/trying to understand; I'll message you offline if I find something in the equations.

K_xu[i] = cov_(x, inducing_points_[i]);
K_uy[i] = cov_(inducing_points_[i], y);
}
// const Eigen::VectorXd K_uy = cov_(inducing_points_, y);
Collaborator:

nit: remove comment


template <typename CovarianceType, typename InducingFeatureType,
          typename GrouperFunction>
class BruteForcePIC
Collaborator:

Can you add a comment describing what brute-force PIC does?
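
For what it's worth, here is a minimal standalone sketch of what I'd expect such a comment to describe (Eigen only; this is not the albatross API, and every name is made up for illustration): form the Nyström approximation through the inducing points, then restore the exact covariance within each independent block, and run exact GP algebra against the result.

```cpp
#include <vector>

#include <Eigen/Dense>

// Hypothetical illustration of the brute-force PIC/PITC training prior:
// Q = K_fu K_uu^{-1} K_uf everywhere, with the exact covariance restored
// inside each independent block. `blocks` lists the training indices
// belonging to each group.
Eigen::MatrixXd brute_force_pic_train_covariance(
    const Eigen::MatrixXd &K_ff, const Eigen::MatrixXd &K_fu,
    const Eigen::MatrixXd &K_uu,
    const std::vector<std::vector<Eigen::Index>> &blocks) {
  const Eigen::MatrixXd Q = K_fu * K_uu.ldlt().solve(K_fu.transpose());
  Eigen::MatrixXd K_pic = Q;
  for (const auto &block : blocks) {
    for (const Eigen::Index i : block) {
      for (const Eigen::Index j : block) {
        // Within a block, PIC keeps the exact prior covariance.
        K_pic(i, j) = K_ff(i, j);
      }
    }
  }
  return K_pic;
}
```

An exact fit against `K_pic` is O(n^3), so presumably this class exists as a reference implementation for testing the fast path rather than for production use.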


void shift_mean(const Eigen::VectorXd &mean_shift) {
ALBATROSS_ASSERT(mean_shift.size() == information.size());
information += train_covariance.solve(mean_shift);
Collaborator:

Do we also have to shift the local group values (mean_w)? Maybe worth just hard-failing here (or removing this?) if it hasn't been tested.
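
A minimal sketch of the hard-fail suggested above (assuming the surrounding class context; the assertion message is made up):

```cpp
void shift_mean(const Eigen::VectorXd &mean_shift) {
  ALBATROSS_ASSERT(mean_shift.size() == information.size());
  // The PIC fit also carries per-group correction terms (mean_w) that this
  // shift does not update; fail loudly until that interaction is tested.
  ALBATROSS_ASSERT(false && "shift_mean() is untested for PIC fits");
  information += train_covariance.solve(mean_shift);
}
```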

};

/*
* This class implements an approximation technique for Gaussian processes
Collaborator:

Is this for PIC or PITC?

Comment on lines +506 to +585
MarginalDistribution _predict_impl(
    const std::vector<FeatureType> &features,
    const Fit<PICGPFit<GrouperFunction, InducingFeatureType, FitFeatureType>>
        &sparse_gp_fit,
    PredictTypeIdentity<MarginalDistribution> &&) const {
  const auto K_up =
      this->covariance_function_(sparse_gp_fit.inducing_features, features);
  Eigen::VectorXd mean = gp_mean_prediction(K_up, sparse_gp_fit.information);
  this->mean_function_.add_to(features, &mean);

  // First pass: compute an O(1) mapping from features to groups
  std::vector<std::size_t> feature_to_block(features.size());
  std::vector<decltype(sparse_gp_fit.measurement_groups.begin())> groups(
      features.size());
  Eigen::VectorXi col_alloc{Eigen::VectorXi::Zero(features.size())};
  bool all_same_group = true;
  for (Eigen::Index j = 0; j < cast::to_index(features.size()); ++j) {
    groups[j] = sparse_gp_fit.measurement_groups.find(
        independent_group_function_(without_measurement(features[j])));
    if (groups[j] != sparse_gp_fit.measurement_groups.end()) {
      col_alloc(j) = groups[j]->second.block_size;
      all_same_group = all_same_group && groups[j] == groups[0];
    }
    feature_to_block[j] =
        std::distance(sparse_gp_fit.measurement_groups.begin(), groups[j]);
  }

  // Second pass: compute mean vector and fill sparse Vp matrix
  Eigen::VectorXd mean_correction = Eigen::VectorXd::Zero(features.size());
  Eigen::SparseMatrix<double> Vp(sparse_gp_fit.train_features.size(),
                                 features.size());
  Vp.reserve(col_alloc);
  for (Eigen::Index j = 0; j < cast::to_index(features.size()); ++j) {
    if (groups[j] == sparse_gp_fit.measurement_groups.end()) {
      continue;
    }

    const auto &B = groups[j]->second;
    // Exact local cross-covariance minus its approximation through the
    // inducing points.
    Eigen::VectorXd Vbp =
        this->covariance_function_(B.dataset.features,
                                   std::vector<FeatureType>{features[j]}) -
        (K_up.col(j).transpose() * sparse_gp_fit.covariance_Y[B.block_index])
            .transpose();
    for (Eigen::Index i = 0; i < Vbp.size(); ++i) {
      Vp.insert(B.initial_row + i, j) = Vbp(i);
    }
    mean_correction[j] = Vbp.dot(sparse_gp_fit.mean_w[B.block_index]);
  }
  Vp.makeCompressed();

  // Local-block variance contribution: diag(Vp^T (A^{-1} - Z^T Z) Vp).
  Eigen::MatrixXd xi_lambda = sparse_gp_fit.A_ldlt.sqrt_solve(Vp);
  Eigen::MatrixXd xi_u = sparse_gp_fit.Z * Vp;
  Eigen::VectorXd VSV_diag{
      (xi_lambda.transpose() * xi_lambda - xi_u.transpose() * xi_u)
          .diagonal()};

  // Cross term coupling the inducing-point and local-block corrections.
  const Eigen::VectorXd U_diag =
      (K_up.transpose() * sparse_gp_fit.W * Vp).diagonal();

  Eigen::VectorXd marginal_variance(cast::to_index(features.size()));
  for (Eigen::Index i = 0; i < marginal_variance.size(); ++i) {
    marginal_variance[i] = this->covariance_function_(
        features[cast::to_size(i)], features[cast::to_size(i)]);
  }

  // Subtract the explained (Nystrom) part: diag(K_pu K_uu^{-1} K_up).
  const Eigen::MatrixXd Q_sqrt =
      sparse_gp_fit.train_covariance.sqrt_solve(K_up);
  const Eigen::VectorXd Q_diag =
      Q_sqrt.cwiseProduct(Q_sqrt).array().colwise().sum();
  marginal_variance -= Q_diag;

  // Add back the posterior uncertainty of the inducing points.
  const Eigen::MatrixXd S_sqrt =
      sqrt_solve(sparse_gp_fit.sigma_R, sparse_gp_fit.P, K_up);
  const Eigen::VectorXd S_diag =
      S_sqrt.cwiseProduct(S_sqrt).array().colwise().sum();
  marginal_variance += S_diag;

  mean += mean_correction;

  return MarginalDistribution(mean,
                              marginal_variance - (2 * U_diag + VSV_diag));
Collaborator:

👏 can't say I follow all of it, but I know this was a tough one. Well done.
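
For anyone else trying to follow the variance assembly: reading the symbols off the fit object, the returned marginal for a test point $p$ appears to decompose as

$$\operatorname{var}[f_p] = K_{pp} - K_{pu} K_{uu}^{-1} K_{up} + K_{pu}\,\Sigma\,K_{up} - 2\,K_{pu} W V_p - V_p^\top\!\left(A^{-1} - Z^\top Z\right) V_p,$$

where the terms after the prior variance correspond to Q_diag, S_diag, U_diag, and VSV_diag; $\Sigma$ is the inducing-point posterior covariance implied by sigma_R and P, and $V_p$ holds the local-block correction columns assembled in the second pass.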

#include <fstream>
#include <gtest/gtest.h>

// #include "albatross/src/eigen/serializable_ldlt.hpp"
Collaborator:

nit: remove

@akleeman (Collaborator) left a comment:

A few nits, and (as you're aware) I'm sure there are optimizations/improvements, BUT this is F*&#ing awesome, and I don't see any changes here that would cause backward incompatibility for use cases currently using PITC. Good job!
