Package org.apache.lucene.util.hnsw
Class InitializedHnswGraphBuilder
java.lang.Object
org.apache.lucene.util.hnsw.HnswGraphBuilder
org.apache.lucene.util.hnsw.InitializedHnswGraphBuilder
- All Implemented Interfaces:
HnswBuilder
This creates a graph builder that is initialized with the provided HnswGraph. This is useful for
merging HnswGraphs from multiple segments.
The builder performs the following operations:
- Copies the graph structure from the initializer graph with ordinal remapping
- Identifies and repairs disconnected nodes (nodes that lost a portion of their neighbors due to deletions)
- Rebalances the graph hierarchy to maintain proper level distribution according to the HNSW probabilistic model
- Allows incremental addition of new nodes while preserving initialized nodes
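The "HNSW probabilistic model" mentioned in the list above is usually stated as: a node's top level is floor(-ln(u) / ln(M)) for u uniform in (0, 1), so roughly 1/M of the nodes on one level also appear on the next. The following is a minimal sketch of that rule, not the builder's actual rebalancing code; the class and method names are hypothetical, and whether this builder uses exactly this formula is an assumption.

```java
import java.util.Arrays;
import java.util.SplittableRandom;

/** Illustrative sketch only; names are hypothetical and not part of Lucene. */
final class HnswLevelSketch {

  /** Draws a top level so that "level >= k" occurs with probability about maxConn^-k. */
  static int randomLevel(int maxConn, SplittableRandom random) {
    double ml = 1.0 / Math.log(maxConn); // normalization factor from the HNSW model
    double u;
    do {
      u = random.nextDouble(); // uniform in [0, 1)
    } while (u == 0.0); // avoid log(0)
    return (int) (-Math.log(u) * ml);
  }

  /** Expected number of nodes present at or above each level, for n nodes in total. */
  static long[] expectedNodesAtOrAbove(int n, int maxConn, int levels) {
    long[] expected = new long[levels];
    for (int level = 0; level < levels; level++) {
      expected[level] = Math.round(n / Math.pow(maxConn, level));
    }
    return expected;
  }

  public static void main(String[] args) {
    // Prints [4096, 256, 16, 1]
    System.out.println(Arrays.toString(expectedNodesAtOrAbove(4096, 16, 4)));
  }
}
```

A graph copied from a segment with many deletions can drift from this distribution, which is presumably why some nodes need to be promoted during rebalancing.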
Disconnected Node Detection: A node is considered disconnected if it retains less than a
DISCONNECTED_NODE_FACTOR fraction of its original neighbor count from the source graph. This
typically occurs when many of the node's neighbors belonged to deleted documents and so could
not be remapped.
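The detection rule can be sketched as a simple threshold comparison. This is an illustration under stated assumptions: the value of DISCONNECTED_NODE_FACTOR is not given on this page, so the constant below is a made-up placeholder, and the class and method names are hypothetical.

```java
/** Illustrative sketch only; names and the constant value are hypothetical. */
final class DisconnectedNodeSketch {

  // Placeholder value for illustration; the real DISCONNECTED_NODE_FACTOR is not stated here.
  static final double ASSUMED_DISCONNECTED_NODE_FACTOR = 0.5;

  /** Returns true if the node kept less than the assumed fraction of its original neighbors. */
  static boolean isDisconnected(int originalNeighborCount, int retainedNeighborCount) {
    return retainedNeighborCount < originalNeighborCount * ASSUMED_DISCONNECTED_NODE_FACTOR;
  }

  public static void main(String[] args) {
    System.out.println(isDisconnected(16, 5)); // prints true (5 < 8)
    System.out.println(isDisconnected(16, 8)); // prints false (8 is not below 8)
  }
}
```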
- WARNING: This API is experimental and might change in incompatible ways in the next release.
Nested Class Summary
Nested classes/interfaces inherited from class org.apache.lucene.util.hnsw.HnswGraphBuilder
HnswGraphBuilder.GraphBuilderKnnCollector
Field Summary
Fields inherited from class org.apache.lucene.util.hnsw.HnswGraphBuilder
beamCandidates, DEFAULT_BEAM_WIDTH, DEFAULT_MAX_CONN, frozen, graphSearcher, hnsw, HNSW_COMPONENT, hnswLock, infoStream, M, randSeed, scorer
Method Summary
Modifier and Type, Method, Description
- void addGraphNode(int node)
Inserts a doc with a vector value into the graph.
- static InitializedHnswGraphBuilder fromGraph(RandomVectorScorerSupplier scorerSupplier, int beamWidth, long seed, HnswGraph initializerGraph, int[] newOrdMap, BitSet initializedNodes, int totalNumberOfVectors)
Creates an initialized HNSW graph builder from an existing graph.
- static OnHeapHnswGraph initGraph(HnswGraph initializerGraph, int[] newOrdMap, int totalNumberOfVectors, int beamWidth, RandomVectorScorerSupplier scorerSupplier)
Convenience method to create a fully initialized on-heap HNSW graph without tracking initialized nodes.
Methods inherited from class org.apache.lucene.util.hnsw.HnswGraphBuilder
addGraphNode, addVectors, build, create, create, getCompletedGraph, getGraph, setInfoStream
Method Details
fromGraph
public static InitializedHnswGraphBuilder fromGraph(RandomVectorScorerSupplier scorerSupplier, int beamWidth, long seed, HnswGraph initializerGraph, int[] newOrdMap, BitSet initializedNodes, int totalNumberOfVectors) throws IOException
Creates an initialized HNSW graph builder from an existing graph.
This factory method constructs a new graph builder, initializes it with the structure from the provided graph (applying ordinal remapping), and returns the builder ready for additional operations.
- Parameters:
scorerSupplier - provides vector similarity scoring for graph operations
beamWidth - the search beam width for finding neighbors during graph construction
seed - random seed for level assignment and node promotion during rebalancing
initializerGraph - the source graph to copy structure from
newOrdMap - maps old ordinals in the initializer graph to new ordinals in the merged graph; -1 indicates a deleted document that should be skipped
initializedNodes - bit set marking which nodes are already initialized (can be null if not tracking)
totalNumberOfVectors - the total number of vectors in the merged graph (used for pre-allocation)
- Returns:
a new builder initialized with the provided graph structure
- Throws:
IOException - if an I/O error occurs during graph initialization
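To make the newOrdMap convention concrete, here is a small self-contained sketch of remapping one node's neighbor list, where -1 marks a deleted document that is skipped. It works on plain int arrays rather than Lucene types; the class and method names are hypothetical, and the builder's real copy logic may differ.

```java
import java.util.Arrays;

/** Illustrative sketch only; names are hypothetical and not part of Lucene. */
final class OrdinalRemapSketch {

  /** Maps old neighbor ordinals to new ones, dropping neighbors whose mapping is -1. */
  static int[] remapNeighbors(int[] oldNeighbors, int[] newOrdMap) {
    int[] remapped = new int[oldNeighbors.length];
    int count = 0;
    for (int oldOrd : oldNeighbors) {
      int newOrd = newOrdMap[oldOrd];
      if (newOrd != -1) { // -1 means the document was deleted
        remapped[count++] = newOrd;
      }
    }
    return Arrays.copyOf(remapped, count);
  }

  public static void main(String[] args) {
    int[] newOrdMap = {0, -1, 1, 2, -1}; // old ordinals 1 and 4 were deleted
    int[] oldNeighbors = {1, 2, 4, 3};
    // Prints [1, 2]: old ordinals 2 and 3 survive as new ordinals 1 and 2
    System.out.println(Arrays.toString(remapNeighbors(oldNeighbors, newOrdMap)));
  }
}
```

In this example the node keeps two of its four original neighbors, which is the kind of loss the disconnected-node check is meant to catch.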
initGraph
public static OnHeapHnswGraph initGraph(HnswGraph initializerGraph, int[] newOrdMap, int totalNumberOfVectors, int beamWidth, RandomVectorScorerSupplier scorerSupplier) throws IOException
Convenience method to create a fully initialized on-heap HNSW graph without tracking initialized nodes. This is useful when you just need the resulting graph structure without planning to add additional nodes incrementally.
- Parameters:
initializerGraph - the source graph to copy structure from
newOrdMap - maps old ordinals to new ordinals; -1 indicates deleted documents
totalNumberOfVectors - the total number of vectors in the merged graph
beamWidth - the search beam width for graph construction
scorerSupplier - provides vector similarity scoring
- Returns:
a fully initialized on-heap HNSW graph
- Throws:
IOException - if an I/O error occurs during graph initialization
addGraphNode
public void addGraphNode(int node) throws IOException
Description copied from interface: HnswBuilder
Inserts a doc with a vector value into the graph.
- Specified by:
addGraphNode in interface HnswBuilder
- Overrides:
addGraphNode in class HnswGraphBuilder
- Throws:
IOException