
Non-uniform vector quantization #374

Open · wants to merge 174 commits into main
Conversation

marianotepper (Collaborator) commented:

This PR adds a new feature that uses quantized vectors for re-ranking. The quantization is computed on subvectors of each vector in the index, using a non-uniform quantizer.
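As background on the idea, a non-uniform quantizer warps values through a nonlinearity before rounding, spending more resolution where values concentrate. The sketch below is conceptual only, not this PR's implementation; the parameter names mirror the `nvqCosine8bit` signature quoted later in the review, but the logistic warp and the exact math here are assumptions.

```java
class NvqSketch {
    // Conceptual non-uniform quantization of one subvector: a logistic warp
    // spreads resolution unevenly before rounding to 2^bits levels.
    // Parameter names mirror nvqCosine8bit below; the exact math is illustrative.
    static byte[] quantizeSubvector(float[] sub, int bits, float growthRate,
                                    float midpoint, float minValue, float maxValue) {
        int levels = (1 << bits) - 1;
        byte[] out = new byte[sub.length];
        for (int i = 0; i < sub.length; i++) {
            float u = (sub[i] - minValue) / (maxValue - minValue);  // normalize to [0, 1]
            double warped = 1.0 / (1.0 + Math.exp(-growthRate * (u - midpoint)));
            out[i] = (byte) Math.round(warped * levels);            // non-uniform level index
        }
        return out;
    }
}
```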

…n OptimizationResult that facilitates error checking.
# Conflicts:
#	jvector-base/src/main/java/io/github/jbellis/jvector/vector/VectorUtil.java
#	jvector-base/src/main/java/io/github/jbellis/jvector/vector/VectorUtilSupport.java
#	jvector-native/src/main/java/io/github/jbellis/jvector/vector/NativeVectorUtilSupport.java
#	jvector-twenty/src/main/java/io/github/jbellis/jvector/vector/PanamaVectorUtilSupport.java
#	jvector-twenty/src/main/java/io/github/jbellis/jvector/vector/SimdOps.java
@marianotepper marked this pull request as ready for review on December 19, 2024 at 00:42.
@jkni (Collaborator) left a comment:


Some minor nits/questions. Overall, looks quite good.

```java
var function = scorer.scoreFunctionFor(queryVector, vsf);

return new ScoreFunction.ExactScoreFunction() {
    private final QuantizedVector scratch = NVQuantization.QuantizedVector.createEmpty(nvq.subvectorSizesAndOffsets, nvq.bitsPerDimension);
```

In most other places like this, we use a reusable thread-local scratch. I'm not sure if it's worth it, so consider this a possible deferred optimization.
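For reference, the reusable pattern would look roughly like this (a sketch using plain `java.lang.ThreadLocal`; field placement and names are illustrative):

```java
// Sketch: one reusable scratch buffer per thread instead of a fresh
// allocation per score function.
private final ThreadLocal<QuantizedVector> reusableScratch =
        ThreadLocal.withInitial(() -> NVQuantization.QuantizedVector.createEmpty(
                nvq.subvectorSizesAndOffsets, nvq.bitsPerDimension));

// ...then inside the score function:
// QuantizedVector scratch = reusableScratch.get();
```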

```diff
@@ -324,6 +324,8 @@ public OnDiskGraphIndexWriter build() throws IOException {
         int dimension;
         if (features.containsKey(FeatureId.INLINE_VECTORS)) {
             dimension = ((InlineVectors) features.get(FeatureId.INLINE_VECTORS)).dimension();
+        } else if (features.containsKey(FeatureId.NVQ_VECTORS)) {
+            dimension = ((NVQ) features.get(FeatureId.NVQ_VECTORS)).dimension();
         } else {
             throw new IllegalArgumentException("Inline vectors must be provided.");
```

We should update this error message to indicate that either inline vectors or NVQ vectors must be provided.
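For instance (wording is just a suggestion):

```java
throw new IllegalArgumentException("Either inline vectors or NVQ vectors must be provided.");
```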

```java
 */
public LossFunction(int nDims) {
    if (nDims <= 0) {
        throw new IllegalArgumentException("The standard deviation initSigma must be positive");
```

Bad error message copy/paste; this check validates nDims, not initSigma.
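A message matching the actual check might be:

```java
if (nDims <= 0) {
    throw new IllegalArgumentException("The number of dimensions nDims must be positive");
}
```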

```java
minBounds = new float[nDims];
maxBounds = new float[nDims];
for (int d = 0; d < nDims; d++) {
    minBounds[d] = Float.NEGATIVE_INFINITY;
```

Could use Arrays.fill here
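i.e. (assuming maxBounds gets the symmetric treatment in the elided loop body):

```java
// requires java.util.Arrays
Arrays.fill(minBounds, Float.NEGATIVE_INFINITY);
Arrays.fill(maxBounds, Float.POSITIVE_INFINITY);
```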

```java
 */
public class NESOptimizer {
    public enum Distribution {
        MULTINORMAL, // This is for adding future support for the multinormal case (Algorithm 5 in [1])
```

Do we think we'll get back to MULTINORMAL relatively soon? If not, is this worth cleaning up for now?

```java
float[] nvqCosine8bit(VectorFloat<?> vector, ByteSequence<?> bytes, float growthRate, float midpoint, float minValue, float maxValue, VectorFloat<?> centroid);

/**
 * When using 4-bit NVQ quantization and vector instructions, it is easier to unpack all even entries, and then all
 * odd entries.
```

Why the reference to 4-bit quantization in 8bit javadoc?
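(For context on the 4-bit case the javadoc describes: two 4-bit values are packed per byte, so with the Vector API it is cheapest to mask out all low nibbles first, the even entries, and then shift out all high nibbles, the odd entries. A rough sketch, with illustrative names, not the PR's code:)

```java
import jdk.incubator.vector.ByteVector;
import jdk.incubator.vector.VectorOperators;

class NibbleUnpackSketch {
    // Deinterleave packed nibbles: the low nibble of each byte holds an
    // even-indexed 4-bit value, the high nibble an odd-indexed one.
    static void unpack(byte[] packed, int offset) {
        ByteVector v = ByteVector.fromArray(ByteVector.SPECIES_PREFERRED, packed, offset);
        ByteVector even = v.and((byte) 0x0F);                  // entries 0, 2, 4, ...
        ByteVector odd  = v.lanewise(VectorOperators.LSHR, 4); // entries 1, 3, 5, ...
        // ...each half would then be converted to float and dequantized separately.
    }
}
```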

```java
    }
}

{
```

introducing the new variable scopes here is perfectly fine, but IMO, would also be fine to put this all in one scope and re-use the variables. There are definitely codebases where this would be more idiomatic, but I can't think of other places we do this in JVector. Just thinking out loud -- don't feel like this has to change.

```java
public void testSaveLoadNVQ() throws Exception {

    int[][] testsConfigAndResults = {
        //Tuples of: nDimensions, nSubvectors, number of bots per dimension, and the expected number of bytes
```

number of bots per dimension can be removed here (and also, typo, but it will get removed anyway)

```java
// NVQ quantization instructions start here
//---------------------------------------------

static FloatVector const1f = FloatVector.broadcast(FloatVector.SPECIES_PREFERRED, 1.f);
```

same static final discussion here as in VectorSimdOps
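For reference, the suggestion is simply to mark the constant final, which lets the JIT treat the broadcast vector as a true constant:

```java
// static final allows the broadcast vector to be constant-folded
static final FloatVector const1f = FloatVector.broadcast(FloatVector.SPECIES_PREFERRED, 1.f);
```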


```java
/**
 * Quantize a subvector as an 8-bit quantized subvector.
 * All values of the vector must be in [0, 1]. For example, the input vector has been
```

I'm confused by "All values of the vector..." and then the references to bias/scale in this/nvqLoss/nvqUniformLoss. Are those remnants of a previous approach?
