Class OptimizedScalarQuantizer

java.lang.Object
org.apache.lucene.util.quantization.OptimizedScalarQuantizer

public class OptimizedScalarQuantizer extends Object
This is a scalar quantizer that optimizes the quantization intervals for a given vector. This is done by optimizing the quantiles of the vector centered on a provided centroid. The optimization is done by minimizing the quantization loss via coordinate descent.

Local vector quantization parameters were originally proposed with LVQ in "Similarity search in the blink of an eye with compressed indices". This technique builds on LVQ, but instead of taking the min/max values, a grid search over the centered vector is done to find the optimal quantization intervals, taking into account anisotropic loss.

Anisotropic loss was first discussed in depth in "Accelerating Large-Scale Inference with Anisotropic Vector Quantization" by Ruiqi Guo et al.
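
To make the idea of an anisotropic quantization loss concrete, the following is a minimal sketch of one way such a loss could be evaluated for a candidate interval [a, b]. The class name `AnisotropicLossSketch`, the method `loss`, and the exact weighting (lambda applied to the total squared error, 1 - lambda applied to the squared error component parallel to the vector) are assumptions for illustration; this page does not state the formula the class uses. The input is assumed to be already centered on the centroid and non-zero.

```java
/** Hypothetical sketch of an anisotropic quantization loss; not the Lucene implementation. */
final class AnisotropicLossSketch {

  /**
   * Evaluates the loss of quantizing an already-centered vector onto the
   * uniform grid of 2^bits points spanning [a, b].
   */
  static double loss(float[] centered, double a, double b, int bits, double lambda) {
    int levels = (1 << bits) - 1;
    double step = (b - a) / levels;
    double norm2 = 0.0; // squared norm of the centered vector
    double parallel = 0.0; // <x, x - xq>: error projected onto the vector
    double total = 0.0; // ||x - xq||^2: total squared error
    for (float x : centered) {
      double clamped = Math.min(b, Math.max(a, x));
      double xq = a + step * Math.round((clamped - a) / step);
      double err = x - xq;
      norm2 += x * x;
      parallel += x * err;
      total += err * err;
    }
    // Assumed weighting: lambda trades parallel error against total error.
    return (1.0 - lambda) * parallel * parallel / norm2 + lambda * total;
  }

  public static void main(String[] args) {
    float[] centered = {-1.5f, -0.5f, 0.5f, 1.5f};
    System.out.println(loss(centered, -1.5, 1.5, 2, 0.1)); // grid hits every value
    System.out.println(loss(centered, -1.0, 1.0, 1, 0.1)); // coarser grid, non-zero loss
  }
}
```

An optimizer in this style would adjust a and b in turn, keeping whichever change lowers the loss, until the configured number of iterations is reached.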

WARNING: This API is experimental and might change in incompatible ways in the next release.
  • Nested Class Summary

    Nested Classes
    Modifier and Type
    Class
    Description
    static final record 
    OptimizedScalarQuantizer.QuantizationResult
    Quantization result containing the lower and upper interval bounds, the additional correction
  • Constructor Summary

    Constructors
    Constructor
    Description
    OptimizedScalarQuantizer(VectorSimilarityFunction similarityFunction)
    Create a new scalar quantizer with the default lambda and number of iterations.
    OptimizedScalarQuantizer(VectorSimilarityFunction similarityFunction, float lambda, int iters)
    Create a new scalar quantizer with the given similarity function, lambda, and number of iterations.
  • Method Summary

    Modifier and Type
    Method
    Description
    static float[]
    deQuantize(byte[] quantized, float[] dequantized, byte bits, float lowerInterval, float upperInterval, float[] centroid)
    Dequantizes a quantized byte vector back to float values.
    static int
    discretize(int value, int bucket)
     
    OptimizedScalarQuantizer.QuantizationResult[]
    multiScalarQuantize(float[] vector, byte[][] destinations, byte[] bits, float[] centroid)
    Quantize the vector to multiple bit levels.
    static void
    packAsBinary(byte[] vector, byte[] packed)
    Pack the vector as a binary array.
    OptimizedScalarQuantizer.QuantizationResult
    scalarQuantize(float[] vector, byte[] destination, byte bits, float[] centroid)
    Quantize the vector to the given bit level.
    static void
    transposeDibit(byte[] vector, byte[] packed)
    Transpose a 2-bit (dibit) quantized vector into a byte array for efficient bitwise operations.
    static void
    transposeHalfByte(byte[] q, byte[] quantQueryByte)
    Transpose the query vector into a byte array allowing for efficient bitwise operations with the index bit vectors.
    static void
    unpackBinary(byte[] packed, byte[] vector)
    Unpack a binary array back to a vector.
    static void
    untransposeDibit(byte[] packed, byte[] vector)
    Untranspose a packed 2-bit (dibit) vector back to its original form.

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Constructor Details

    • OptimizedScalarQuantizer

      public OptimizedScalarQuantizer(VectorSimilarityFunction similarityFunction, float lambda, int iters)
      Create a new scalar quantizer with the given similarity function, lambda, and number of iterations.
      Parameters:
      similarityFunction - similarity function to use
      lambda - lambda value to use
      iters - number of iterations to use
    • OptimizedScalarQuantizer

      public OptimizedScalarQuantizer(VectorSimilarityFunction similarityFunction)
      Create a new scalar quantizer with the default lambda and number of iterations.
      Parameters:
      similarityFunction - similarity function to use
  • Method Details

    • multiScalarQuantize

      public OptimizedScalarQuantizer.QuantizationResult[] multiScalarQuantize(float[] vector, byte[][] destinations, byte[] bits, float[] centroid)
      Quantize the vector to multiple bit levels.
      Parameters:
      vector - raw vector
      destinations - array of destinations to store the quantized vectors
      bits - array of bit levels to quantize the vector to
      centroid - centroid to center the vector
      Returns:
      array of quantization results
    • scalarQuantize

      public OptimizedScalarQuantizer.QuantizationResult scalarQuantize(float[] vector, byte[] destination, byte bits, float[] centroid)
      Quantize the vector to the given bit level.
      Parameters:
      vector - raw vector
      destination - destination to store the quantized vector
      bits - number of bits to quantize the vector
      centroid - centroid to center the vector
      Returns:
      quantization result
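
As a rough illustration of what quantizing a centered vector to a given bit level involves, here is a minimal sketch that uses a fixed interval supplied by the caller (for example the LVQ-style min/max of the centered vector) rather than the optimized interval this class computes. The class `IntervalQuantizeSketch` and its `quantize` method are hypothetical and are not part of Lucene.

```java
import java.util.Arrays;

/** Hypothetical sketch of interval-based scalar quantization; not the Lucene implementation. */
final class IntervalQuantizeSketch {

  /**
   * Centers the vector on the centroid, clamps each component to [lower, upper],
   * and maps it to the nearest integer level in [0, 2^bits - 1].
   * Assumes upper > lower and bits <= 8.
   */
  static void quantize(
      float[] vector, float[] centroid, float lower, float upper, int bits, byte[] destination) {
    int levels = (1 << bits) - 1;
    float step = (upper - lower) / levels;
    for (int i = 0; i < vector.length; i++) {
      float centered = vector[i] - centroid[i];
      float clamped = Math.min(upper, Math.max(lower, centered));
      destination[i] = (byte) Math.round((clamped - lower) / step);
    }
  }

  public static void main(String[] args) {
    float[] vector = {1.0f, 2.0f, 3.0f, 4.0f};
    float[] centroid = {2.5f, 2.5f, 2.5f, 2.5f};
    byte[] out = new byte[vector.length];
    // Interval taken as the min and max of the centered vector.
    quantize(vector, centroid, -1.5f, 1.5f, 2, out);
    System.out.println(Arrays.toString(out)); // prints [0, 1, 2, 3]
  }
}
```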
    • deQuantize

      public static float[] deQuantize(byte[] quantized, float[] dequantized, byte bits, float lowerInterval, float upperInterval, float[] centroid)
      Dequantizes a quantized byte vector back to float values.

      This method reconstructs a float vector from its quantized byte representation by linearly interpolating across the quantization range [lowerInterval, upperInterval] and then adding back the centroid.

      Parameters:
      quantized - the quantized byte vector to dequantize
      dequantized - the output array to store dequantized float values
      bits - the number of bits used for quantization
      lowerInterval - lower value of quantization range
      upperInterval - upper value of quantization range
      centroid - the centroid vector that was subtracted during quantization
      Returns:
      the dequantized float array (same as dequantized parameter)
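
The interpolation described for this method can be sketched as follows. This is a standalone illustration under the assumption that each quantized byte holds an unsigned level in [0, 2^bits - 1]; the class `DequantizeSketch` is hypothetical and the real method may differ in detail.

```java
import java.util.Arrays;

/** Hypothetical sketch of linear dequantization; not the Lucene implementation. */
final class DequantizeSketch {

  /** Maps each level back into [lower, upper] and re-adds the centroid. */
  static float[] dequantize(
      byte[] quantized, float[] dequantized, int bits, float lower, float upper, float[] centroid) {
    int levels = (1 << bits) - 1;
    double step = (double) (upper - lower) / levels;
    for (int i = 0; i < quantized.length; i++) {
      int level = quantized[i] & 0xFF; // read the byte as an unsigned level
      dequantized[i] = (float) (lower + level * step + centroid[i]);
    }
    return dequantized;
  }

  public static void main(String[] args) {
    byte[] quantized = {0, 1, 2, 3};
    float[] centroid = {2.5f, 2.5f, 2.5f, 2.5f};
    float[] out = dequantize(quantized, new float[4], 2, -1.5f, 1.5f, centroid);
    System.out.println(Arrays.toString(out)); // prints [1.0, 2.0, 3.0, 4.0]
  }
}
```

The reconstruction is exact here only because the original values happened to lie on the quantization grid; in general each component is recovered to within half a step.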
    • discretize

      public static int discretize(int value, int bucket)
    • transposeHalfByte

      public static void transposeHalfByte(byte[] q, byte[] quantQueryByte)
      Transpose the query vector into a byte array allowing for efficient bitwise operations with the index bit vectors. The idea is to organize the query vector bits such that the first bit of every dimension is stored in the first block of dimensions bits, that is, the first (dimensions/8) bytes. The second, third, and fourth bits are stored in the second, third, and fourth blocks of dimensions bits, respectively. This allows direct bitwise comparison with the stored index vectors by summing the bitwise results, each shifted by the position of its bit.

      This bit decomposition for fast bitwise SIMD operations was first proposed in:

         Gao, Jianyang, and Cheng Long. "RaBitQ: Quantizing High-Dimensional Vectors with a
         Theoretical Error Bound for Approximate Nearest Neighbor Search."
         Proceedings of the ACM on Management of Data 2, no. 3 (2024): 1-27.
         
      Parameters:
      q - the query vector, assumed to be half-byte quantized with values between 0 and 15
      quantQueryByte - the byte array to store the transposed query vector
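
The bit-plane layout described above, and the way it turns a dot product against a binary vector into AND plus popcount, can be sketched as follows. The class `HalfByteTransposeSketch`, the choice to place the first dimension in the high bit of each byte, and the `dot` helper are assumptions for illustration and are not taken from Lucene.

```java
import java.util.Arrays;

/** Hypothetical sketch of 4-bit plane transposition; not the Lucene implementation. */
final class HalfByteTransposeSketch {

  /** Writes bit k of every 4-bit value into plane k; planes.length must be 4 * ceil(dims / 8). */
  static void transpose(byte[] q, byte[] planes) {
    Arrays.fill(planes, (byte) 0);
    int stripe = planes.length / 4; // bytes per bit plane
    for (int i = 0; i < q.length; i++) {
      int shift = 7 - (i % 8); // assumption: first dimension goes to the high bit
      for (int k = 0; k < 4; k++) {
        int bit = (q[i] >> k) & 1;
        planes[k * stripe + i / 8] |= (byte) (bit << shift);
      }
    }
  }

  /** Dot product of the transposed query with a packed binary vector. */
  static int dot(byte[] planes, byte[] binaryDoc) {
    int stripe = planes.length / 4;
    int sum = 0;
    for (int k = 0; k < 4; k++) {
      int count = 0;
      for (int b = 0; b < stripe; b++) {
        count += Integer.bitCount(planes[k * stripe + b] & binaryDoc[b] & 0xFF);
      }
      sum += count << k; // plane k carries weight 2^k
    }
    return sum;
  }

  public static void main(String[] args) {
    byte[] q = {3, 15, 0, 8, 1, 2, 4, 7};
    byte[] planes = new byte[4]; // 4 planes of 1 byte for 8 dimensions
    transpose(q, planes);
    byte[] doc = {(byte) 0b10110101}; // binary vector 1,0,1,1,0,1,0,1
    System.out.println(dot(planes, doc)); // prints 20
  }
}
```

The result matches the plain dot product 3 + 0 + 8 + 2 + 7 = 20, computed with four AND/popcount passes instead of eight multiplications.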
    • packAsBinary

      public static void packAsBinary(byte[] vector, byte[] packed)
      Pack the vector as a binary array.
      Parameters:
      vector - the vector to pack
      packed - the packed vector
    • unpackBinary

      public static void unpackBinary(byte[] packed, byte[] vector)
      Unpack a binary array back to a vector.
      Parameters:
      packed - the packed binary array
      vector - the unpacked vector (each byte will be 0 or 1)
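
A minimal round-trip sketch of packing a 0/1 byte vector into bits and unpacking it again is shown below. The class `BinaryPackSketch` is hypothetical, and the bit order within each byte (first dimension in the high bit) is an assumption that this page does not specify.

```java
import java.util.Arrays;

/** Hypothetical sketch of binary packing; not the Lucene implementation. */
final class BinaryPackSketch {

  /** Packs one bit per dimension; packed.length must be at least ceil(vector.length / 8). */
  static void pack(byte[] vector, byte[] packed) {
    Arrays.fill(packed, (byte) 0);
    for (int i = 0; i < vector.length; i++) {
      if (vector[i] != 0) {
        packed[i / 8] |= (byte) (1 << (7 - (i % 8)));
      }
    }
  }

  /** Expands each packed bit back into a byte holding 0 or 1. */
  static void unpack(byte[] packed, byte[] vector) {
    for (int i = 0; i < vector.length; i++) {
      vector[i] = (byte) ((packed[i / 8] >> (7 - (i % 8))) & 1);
    }
  }

  public static void main(String[] args) {
    byte[] vector = {1, 0, 1, 1, 0, 1, 0, 1, 1, 1};
    byte[] packed = new byte[2];
    pack(vector, packed);
    byte[] restored = new byte[vector.length];
    unpack(packed, restored);
    System.out.println(Arrays.equals(vector, restored)); // prints true
  }
}
```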
    • transposeDibit

      public static void transposeDibit(byte[] vector, byte[] packed)
      Transpose a 2-bit (dibit) quantized vector into a byte array for efficient bitwise operations. The result has 2 stripes, similar to transposeHalfByte(byte[], byte[]) but for only 2 bits.
      Parameters:
      vector - the 2-bit quantized vector (values 0-3)
      packed - the byte array to store the transposed vector
    • untransposeDibit

      public static void untransposeDibit(byte[] packed, byte[] vector)
      Untranspose a packed 2-bit (dibit) vector back to its original form. This is the reverse of transposeDibit(byte[], byte[]).
      Parameters:
      packed - the packed/transposed byte array
      vector - the output vector where each byte will contain a 2-bit value (0-3)
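
The two-stripe layout and its inverse can be sketched as a round trip. As with the other sketches, the class `DibitTransposeSketch` is hypothetical; placing the low bits in the first stripe and the first dimension in the high bit of each byte are assumptions, not details confirmed by this page.

```java
import java.util.Arrays;

/** Hypothetical sketch of 2-bit plane transposition; not the Lucene implementation. */
final class DibitTransposeSketch {

  /** Stripe 0 receives the low bit of each value, stripe 1 the high bit. */
  static void transpose(byte[] vector, byte[] packed) {
    Arrays.fill(packed, (byte) 0);
    int stripe = packed.length / 2; // bytes per stripe
    for (int i = 0; i < vector.length; i++) {
      int shift = 7 - (i % 8);
      packed[i / 8] |= (byte) ((vector[i] & 1) << shift);
      packed[stripe + i / 8] |= (byte) (((vector[i] >> 1) & 1) << shift);
    }
  }

  /** Recombines the two stripes into values in the range 0-3. */
  static void untranspose(byte[] packed, byte[] vector) {
    int stripe = packed.length / 2;
    for (int i = 0; i < vector.length; i++) {
      int shift = 7 - (i % 8);
      int low = (packed[i / 8] >> shift) & 1;
      int high = (packed[stripe + i / 8] >> shift) & 1;
      vector[i] = (byte) (low | (high << 1));
    }
  }

  public static void main(String[] args) {
    byte[] vector = {0, 1, 2, 3, 3, 2, 1, 0, 2};
    byte[] packed = new byte[4]; // 2 stripes of ceil(9 / 8) = 2 bytes
    transpose(vector, packed);
    byte[] restored = new byte[vector.length];
    untranspose(packed, restored);
    System.out.println(Arrays.equals(vector, restored)); // prints true
  }
}
```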