from n2 import HnswIndex
import random
f = 40
t = HnswIndex(f) # HnswIndex(f, "L2, euclidean, or angular")
for i in xrange(1000):
v = [random.gauss(0, 1) for z in xrange(f)]
t.add_data(v)
t.build(m=5, max_m0=10, n_threads=4)
t.save('test.hnsw')
u = HnswIndex(f)
u.load('test.hnsw')
print(u.search_by_id(0, 1000))Note that if a user passes a negative value, it will be set to a default value when the metric has the default value.
HnswIndex(dim, metric): returns a new Hnsw indexdim(int): dimension of vectorsmetric(string): an optional parameter for choosing a metric of distance. (‘L2’|'euclidean'|‘angular’)
index.add_data(v): adds vectorvv(list of float): a vector with dimensiondim
index.build(m, max_m0, ef_construction, n_threads, mult, neighbor_selecting, graph_merging): builds a hnsw graph with given configurations.m(int): max number of edges for nodes at level>0 (default=12)max_m0(int): max number of edges for nodes at level==0 (default=24)ef_construction(int): efConstruction (see HNSW paper…) (default=150)n_threads(int): number of threads for building indexmult(float): level multiplier (recommend: use default value) (default=1/log(1.0*M))neighbor_selecting(string): neighbor selecting policy- available values
"heuristic"(default): select neighbors using algorithm4 on HNSW paper (recommended)"naive": select closest neighbors (not recommended)
- available values
graph_merging(string): graph merging heuristic- available values
"skip"(default): do not merge (recommended for large scale of data(over 10M))"merge_level0": build an another graph in reverse order. then merge edges of level0 (recommended for under 10M scale data)
- available values
index.save(fname): saves the index to a diskfname(string)
index.load(fname): loads an index from a disk.fname(string) -use_mmap(bool): An optional parameter, default value is true. If this parameter is set, N2 loads model through mmap.
index.unload(): unloads (unmap)index.search_by_id(item_id, k, ef_serach=-1, include_distances=False): returnsknearest items.item_id(int)k(int)ef_search(int): default value = 50 * kinclude_distances(boolean): If you set this argument to True, it will return a list of tuples((item_id, distance)).
index.search_by_vector(v, k, ef_serach=-1, include_distances=False): returnsknearest items.v(list of float): a query vectork(int)ef_search(int): default value = 50 * kinclude_distances(boolean): If you set this argument to True, it will return a list of tuples((item_id, distance)).