Interface VectorUtilSupport


public interface VectorUtilSupport
Interface for implementations of VectorUtil support.
NOTE: This API is for internal purposes only and might change in incompatible ways in the next release.
  • Method Summary

    Modifier and Type, Method
        Description
    float cosine(byte[] a, byte[] b)
        Returns the cosine similarity between the two byte vectors.
    float cosine(float[] v1, float[] v2)
        Returns the cosine similarity between the two vectors.
    int dotProduct(byte[] a, byte[] b)
        Returns the dot product computed over signed bytes.
    float dotProduct(float[] a, float[] b)
        Calculates the dot product of the given float arrays.
    int filterByScore(int[] docBuffer, double[] scoreBuffer, double minScoreInclusive, int upTo)
        Filters docBuffer and scoreBuffer by minScoreInclusive. Entries at the same index in docBuffer and scoreBuffer form a pair; a pair is kept only if its score is greater than or equal to minScoreInclusive, and all other pairs are filtered out of the arrays.
    int findNextGEQ(int[] buffer, int target, int from, int to)
        Given an array buffer that is sorted between indexes 0 inclusive and to exclusive, find the first array index whose value is greater than or equal to target.
    long int4BitDotProduct(byte[] int4Quantized, byte[] binaryQuantized)
        Computes the dot product between a quantized int4 vector and a binary quantized vector.
    long int4DibitDotProduct(byte[] int4Quantized, byte[] dibitQuantized)
        Computes the dot product between a quantized int4 vector and a dibit (2-bit) quantized vector.
    int int4DotProduct(byte[] a, byte[] b)
        Returns the dot product computed over unsigned half-bytes, both uncompressed.
    int int4DotProductBothPacked(byte[] a, byte[] b)
        Returns the dot product computed over unsigned half-bytes, both compressed.
    int int4DotProductSinglePacked(byte[] unpacked, byte[] packed)
        Returns the dot product computed over unsigned half-bytes, one compressed.
    int int4SquareDistance(byte[] a, byte[] b)
        Returns the sum of squared differences between two unsigned half-byte vectors, both uncompressed.
    int int4SquareDistanceBothPacked(byte[] a, byte[] b)
        Returns the sum of squared differences between two unsigned half-byte vectors, both compressed.
    int int4SquareDistanceSinglePacked(byte[] unpacked, byte[] packed)
        Returns the sum of squared differences between two unsigned half-byte vectors, one compressed.
    float minMaxScalarQuantize(float[] vector, byte[] dest, float scale, float alpha, float minQuantile, float maxQuantile)
        Quantizes vector, putting the result into dest.
    float recalculateScalarQuantizationOffset(byte[] vector, float oldAlpha, float oldMinQuantile, float scale, float alpha, float minQuantile, float maxQuantile)
        Recalculates the offset for vector.
    int squareDistance(byte[] a, byte[] b)
        Returns the sum of squared differences of the two byte vectors.
    float squareDistance(float[] a, float[] b)
        Returns the sum of squared differences of the two vectors.
    int uint8DotProduct(byte[] a, byte[] b)
        Returns the dot product computed as though the bytes were unsigned.
    int uint8SquareDistance(byte[] a, byte[] b)
        Returns the sum of squared differences of the two unsigned byte vectors.
  • Method Details

    • dotProduct

      float dotProduct(float[] a, float[] b)
      Calculates the dot product of the given float arrays.
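      As a point of reference, the contract can be sketched with a plain scalar loop. This is a minimal sketch, not the library's implementation: the class name DotProductSketch is hypothetical, and the sketch assumes both arrays have the same length.

```java
// Hypothetical scalar reference for dotProduct(float[], float[]); not library code.
final class DotProductSketch {
  static float dotProduct(float[] a, float[] b) {
    if (a.length != b.length) {
      throw new IllegalArgumentException(
          "vector dimensions differ: " + a.length + " != " + b.length);
    }
    float sum = 0f;
    for (int i = 0; i < a.length; i++) {
      sum += a[i] * b[i]; // accumulate the element-wise products
    }
    return sum;
  }

  public static void main(String[] args) {
    System.out.println(dotProduct(new float[] {1f, 2f, 3f}, new float[] {4f, 5f, 6f}));
  }
}
```

      An optimized implementation may sum the products in a different order, so its result can differ from this loop in the last few bits.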
    • cosine

      float cosine(float[] v1, float[] v2)
      Returns the cosine similarity between the two vectors.
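      Cosine similarity is the dot product divided by the product of the two vector norms. The sketch below illustrates that definition under the assumption of equal-length inputs; CosineSketch is a hypothetical name, and accumulating in double is a choice made for this example only.

```java
// Hypothetical scalar reference for cosine(float[], float[]); not library code.
final class CosineSketch {
  static float cosine(float[] v1, float[] v2) {
    if (v1.length != v2.length) {
      throw new IllegalArgumentException(
          "vector dimensions differ: " + v1.length + " != " + v2.length);
    }
    double dot = 0, norm1 = 0, norm2 = 0;
    for (int i = 0; i < v1.length; i++) {
      dot += (double) v1[i] * v2[i];
      norm1 += (double) v1[i] * v1[i];
      norm2 += (double) v2[i] * v2[i];
    }
    // If either vector is all zeros this divides 0 by 0 and yields NaN.
    return (float) (dot / Math.sqrt(norm1 * norm2));
  }

  public static void main(String[] args) {
    System.out.println(cosine(new float[] {1f, 0f}, new float[] {0f, 1f}));
  }
}
```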
    • squareDistance

      float squareDistance(float[] a, float[] b)
      Returns the sum of squared differences of the two vectors.
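      Note that the result is the squared Euclidean distance, with no square root applied. A minimal sketch, assuming equal-length inputs (SquareDistanceSketch is a hypothetical name):

```java
// Hypothetical scalar reference for squareDistance(float[], float[]); not library code.
final class SquareDistanceSketch {
  static float squareDistance(float[] a, float[] b) {
    if (a.length != b.length) {
      throw new IllegalArgumentException(
          "vector dimensions differ: " + a.length + " != " + b.length);
    }
    float sum = 0f;
    for (int i = 0; i < a.length; i++) {
      float diff = a[i] - b[i];
      sum += diff * diff; // squared difference per dimension
    }
    return sum;
  }

  public static void main(String[] args) {
    System.out.println(squareDistance(new float[] {1f, 2f, 3f}, new float[] {4f, 6f, 3f}));
  }
}
```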
    • dotProduct

      int dotProduct(byte[] a, byte[] b)
      Returns the dot product computed over signed bytes.
    • int4DotProduct

      int int4DotProduct(byte[] a, byte[] b)
      Returns the dot product computed over unsigned half-bytes, both uncompressed.
    • int4DotProductSinglePacked

      int int4DotProductSinglePacked(byte[] unpacked, byte[] packed)
      Returns the dot product computed over unsigned half-bytes, one compressed.
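      To illustrate what "one compressed" can mean, the sketch below packs two 4-bit values into each byte and multiplies them against an unpacked vector. The packing layout is an assumption made for this example (element i in the high nibble, element i + packed.length in the low nibble); this page does not specify the layout. Int4PackedSketch and its pack helper are hypothetical names.

```java
// Hypothetical sketch of a half-byte dot product with one packed operand; not library code.
final class Int4PackedSketch {
  // ASSUMED layout: packed[i] = (unpacked[i] << 4) | unpacked[i + packed.length].
  static byte[] pack(byte[] unpacked) {
    if ((unpacked.length & 1) != 0) {
      throw new IllegalArgumentException("length must be even: " + unpacked.length);
    }
    int half = unpacked.length / 2;
    byte[] packed = new byte[half];
    for (int i = 0; i < half; i++) {
      packed[i] = (byte) ((unpacked[i] << 4) | (unpacked[half + i] & 0x0F));
    }
    return packed;
  }

  static int int4DotProductSinglePacked(byte[] unpacked, byte[] packed) {
    if (unpacked.length != packed.length * 2) {
      throw new IllegalArgumentException("unpacked must be twice as long as packed");
    }
    int total = 0;
    for (int i = 0; i < packed.length; i++) {
      int high = (packed[i] & 0xFF) >>> 4; // element i of the packed vector
      int low = packed[i] & 0x0F;          // element i + packed.length
      total += high * unpacked[i];
      total += low * unpacked[i + packed.length];
    }
    return total;
  }

  public static void main(String[] args) {
    byte[] a = {1, 2, 3, 4};
    byte[] b = {5, 6, 7, 15};
    System.out.println(int4DotProductSinglePacked(a, pack(b)));
  }
}
```

      Under this layout the packed vector takes half the bytes of the unpacked one, and the result equals the plain dot product of the two original half-byte vectors.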
    • int4DotProductBothPacked

      int int4DotProductBothPacked(byte[] a, byte[] b)
      Returns the dot product computed over unsigned half-bytes, both compressed.
    • uint8DotProduct

      int uint8DotProduct(byte[] a, byte[] b)
      Returns the dot product computed as though the bytes were unsigned.
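      Java's byte type is signed, so treating bytes as unsigned means masking each value into the range 0 to 255 before multiplying. A minimal sketch of that interpretation, assuming equal-length inputs (Uint8DotProductSketch is a hypothetical name):

```java
// Hypothetical scalar reference for uint8DotProduct(byte[], byte[]); not library code.
final class Uint8DotProductSketch {
  static int uint8DotProduct(byte[] a, byte[] b) {
    if (a.length != b.length) {
      throw new IllegalArgumentException(
          "vector dimensions differ: " + a.length + " != " + b.length);
    }
    int total = 0;
    for (int i = 0; i < a.length; i++) {
      // Byte.toUnsignedInt maps the signed range -128..127 onto 0..255.
      total += Byte.toUnsignedInt(a[i]) * Byte.toUnsignedInt(b[i]);
    }
    return total;
  }

  public static void main(String[] args) {
    byte[] a = {(byte) 200, 1};
    byte[] b = {2, (byte) 255};
    System.out.println(uint8DotProduct(a, b));
  }
}
```

      For the inputs in main, a signed dot product would see (byte) 200 as -56 and (byte) 255 as -1, which is why the signed and unsigned variants are separate methods.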
    • cosine

      float cosine(byte[] a, byte[] b)
      Returns the cosine similarity between the two byte vectors.
    • squareDistance

      int squareDistance(byte[] a, byte[] b)
      Returns the sum of squared differences of the two byte vectors.
    • int4SquareDistance

      int int4SquareDistance(byte[] a, byte[] b)
      Returns the sum of squared differences between two unsigned half-byte vectors, both uncompressed.
    • int4SquareDistanceSinglePacked

      int int4SquareDistanceSinglePacked(byte[] unpacked, byte[] packed)
      Returns the sum of squared differences between two unsigned half-byte vectors, one compressed.
    • int4SquareDistanceBothPacked

      int int4SquareDistanceBothPacked(byte[] a, byte[] b)
      Returns the sum of squared differences between two unsigned half-byte vectors, both compressed.
    • uint8SquareDistance

      int uint8SquareDistance(byte[] a, byte[] b)
      Returns the sum of squared differences of the two unsigned byte vectors.
    • findNextGEQ

      int findNextGEQ(int[] buffer, int target, int from, int to)
      Given an array buffer that is sorted between indexes 0 inclusive and to exclusive, find the first array index whose value is greater than or equal to target. This index is guaranteed to be at least from. If there is no such array index, to is returned.
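      The contract above can be illustrated with a simple linear scan. This is only a sketch of the documented behavior, not the library's implementation, which may search differently; FindNextGeqSketch is a hypothetical name.

```java
// Hypothetical linear-scan reference for findNextGEQ; not library code.
final class FindNextGeqSketch {
  static int findNextGEQ(int[] buffer, int target, int from, int to) {
    for (int i = from; i < to; i++) {
      if (buffer[i] >= target) {
        return i; // first index at or after 'from' whose value reaches the target
      }
    }
    return to; // no such index
  }

  public static void main(String[] args) {
    int[] buffer = {1, 3, 5, 7, 9};
    System.out.println(findNextGEQ(buffer, 6, 0, buffer.length));
  }
}
```

      Because the searched range is sorted, a binary or vectorized search can return the same index; the scan is shown only because it makes the contract easy to read.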
    • int4BitDotProduct

      long int4BitDotProduct(byte[] int4Quantized, byte[] binaryQuantized)
      Computes the dot product between a quantized int4 vector and a binary quantized vector. It is assumed that the int4 quantized bits are packed in the byte array the same way as by OptimizedScalarQuantizer.transposeHalfByte(byte[], byte[]), and that the binary bits are packed the same way as by OptimizedScalarQuantizer.packAsBinary(byte[], byte[]).
      Parameters:
      int4Quantized - half-byte packed int4 quantized vector
      binaryQuantized - byte-packed binary quantized vector
      Returns:
      the dot product
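      One way to see why a striped layout helps is a bit-plane dot product: AND each plane of the int4 vector with the binary vector, count the set bits, and weight the count by the plane's power of two. The sketch below assumes a layout chosen for illustration: four consecutive stripes, each as long as the binary vector, where stripe p holds bit p of every int4 value in the same bit order as the binary vector. Whether this matches the exact output of transposeHalfByte is not verified here, and BitPlaneDotProductSketch is a hypothetical name.

```java
// Hypothetical bit-plane sketch of an int4 x binary dot product; not library code.
final class BitPlaneDotProductSketch {
  // ASSUMED layout: int4Striped = [plane 0 | plane 1 | plane 2 | plane 3],
  // each plane bits.length bytes long, plane p holding bit p of every value.
  static long int4BitDotProduct(byte[] int4Striped, byte[] bits) {
    if (int4Striped.length != bits.length * 4) {
      throw new IllegalArgumentException("expected 4 stripes of " + bits.length + " bytes");
    }
    int stripe = bits.length;
    long total = 0;
    for (int plane = 0; plane < 4; plane++) {
      long count = 0;
      for (int i = 0; i < stripe; i++) {
        // Dimensions where both this plane's bit and the binary bit are set.
        count += Integer.bitCount((int4Striped[plane * stripe + i] & bits[i]) & 0xFF);
      }
      total += count << plane; // plane p contributes with weight 2^p
    }
    return total;
  }

  public static void main(String[] args) {
    // int4 values {1, 2, 3, 4, 5, 6, 7, 15}, dimension j stored in bit j of each plane byte.
    byte[] striped = {(byte) 0xD5, (byte) 0xE6, (byte) 0xF8, (byte) 0x80};
    // Binary vector with dimensions 0, 2 and 7 set.
    byte[] bits = {(byte) 0x85};
    System.out.println(int4BitDotProduct(striped, bits));
  }
}
```

      With the inputs in main, the set dimensions select the values 1, 3 and 15, so the result is their sum.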
    • int4DibitDotProduct

      long int4DibitDotProduct(byte[] int4Quantized, byte[] dibitQuantized)
      Computes the dot product between a quantized int4 vector and a dibit (2-bit) quantized vector. It is assumed that the int4 quantized bits are packed in the byte array the same way as by OptimizedScalarQuantizer.transposeHalfByte(byte[], byte[]), and that the dibit bits are packed the same way as by OptimizedScalarQuantizer.transposeDibit(byte[], byte[]).
      Parameters:
      int4Quantized - half-byte packed int4 quantized vector (4 stripes)
      dibitQuantized - dibit-packed quantized vector (2 stripes)
      Returns:
      the dot product
    • minMaxScalarQuantize

      float minMaxScalarQuantize(float[] vector, byte[] dest, float scale, float alpha, float minQuantile, float maxQuantile)
      Quantizes vector, putting the result into dest.
      Parameters:
      vector - the vector to quantize
      dest - the destination vector
      scale - the scaling factor
      alpha - the alpha value
      minQuantile - the lower quantile of the distribution
      maxQuantile - the upper quantile of the distribution
      Returns:
      the corrective offset that needs to be applied to the score
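      The value-mapping step of a min-max scalar quantizer can be sketched as clamping each component to the quantile range and scaling it to an integer. This is an illustration under assumptions, not the documented algorithm: it assumes scale maps the range from minQuantile to maxQuantile onto the target integer range (for example 127 / (maxQuantile - minQuantile)), and it deliberately leaves out alpha and the corrective offset that the real method returns, because their formulas are not given on this page. MinMaxQuantizeSketch and quantizeValues are hypothetical names.

```java
// Hypothetical sketch of the clamp-and-scale step only; not library code.
final class MinMaxQuantizeSketch {
  static void quantizeValues(
      float[] vector, byte[] dest, float scale, float minQuantile, float maxQuantile) {
    if (vector.length != dest.length) {
      throw new IllegalArgumentException("vector and dest must have the same length");
    }
    for (int i = 0; i < vector.length; i++) {
      // Cut off values outside the quantile range.
      float clamped = Math.max(minQuantile, Math.min(maxQuantile, vector[i]));
      // Shift to start at zero, scale, and round to the nearest integer bucket.
      dest[i] = (byte) Math.round(scale * (clamped - minQuantile));
    }
  }

  public static void main(String[] args) {
    float[] vector = {-1f, 0f, 0.5f, 2f};
    byte[] dest = new byte[vector.length];
    quantizeValues(vector, dest, 127f, 0f, 1f);
    System.out.println(java.util.Arrays.toString(dest));
  }
}
```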
    • recalculateScalarQuantizationOffset

      float recalculateScalarQuantizationOffset(byte[] vector, float oldAlpha, float oldMinQuantile, float scale, float alpha, float minQuantile, float maxQuantile)
      Recalculates the offset for vector.
      Parameters:
      vector - the quantized vector whose offset is recalculated
      oldAlpha - the previous alpha value
      oldMinQuantile - the previous lower quantile
      scale - the scaling factor
      alpha - the alpha value
      minQuantile - the lower quantile of the distribution
      maxQuantile - the upper quantile of the distribution
      Returns:
      the new corrective offset
    • filterByScore

      int filterByScore(int[] docBuffer, double[] scoreBuffer, double minScoreInclusive, int upTo)
      Filters docBuffer and scoreBuffer by minScoreInclusive. Entries at the same index in docBuffer and scoreBuffer form a pair; a pair is kept only if its score is greater than or equal to minScoreInclusive, and all other pairs are filtered out of the arrays.
      Parameters:
      docBuffer - the doc buffer, containing docs (or any other values that form pairs with scoreBuffer)
      scoreBuffer - the score buffer, containing the scores to compare with minScoreInclusive
      minScoreInclusive - the minimum score a pair needs in order not to be filtered out
      upTo - where the filter should end
      Returns:
      the number of pairs left after filtering
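      The filtering can be pictured as an in-place compaction of the two parallel arrays. The sketch below makes two assumptions that this page does not state: upTo is treated as an exclusive end index, and surviving pairs are moved to the front of both arrays in their original order. FilterByScoreSketch is a hypothetical name.

```java
// Hypothetical in-place compaction sketch for filterByScore; not library code.
final class FilterByScoreSketch {
  static int filterByScore(
      int[] docBuffer, double[] scoreBuffer, double minScoreInclusive, int upTo) {
    int kept = 0;
    for (int i = 0; i < upTo; i++) { // ASSUMPTION: upTo is exclusive
      if (scoreBuffer[i] >= minScoreInclusive) {
        // Move the surviving pair to the next free slot at the front.
        docBuffer[kept] = docBuffer[i];
        scoreBuffer[kept] = scoreBuffer[i];
        kept++;
      }
    }
    return kept; // entries at index >= kept are leftovers and should be ignored
  }

  public static void main(String[] args) {
    int[] docs = {10, 11, 12, 13};
    double[] scores = {0.5, 2.0, 1.0, 3.0};
    int kept = filterByScore(docs, scores, 1.0, docs.length);
    System.out.println(kept + " pairs kept");
  }
}
```

      A caller would then read only the first kept entries of each buffer.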