Typical usage: configure → build an index from a matrix of embeddings → search → persist → reload. This mirrors how you might call DCEE from a FastAPI worker or batch job.
```python
import numpy as np

from dcee import DCEEConfig, DCEEEngine, is_gpu_available

print("GPU:", is_gpu_available())

# Toy corpus of unit-normalized float32 embeddings.
emb = np.random.randn(10_000, 128).astype(np.float32)
emb /= np.linalg.norm(emb, axis=1, keepdims=True)

# Heuristic defaults chosen from corpus size and dimensionality.
cfg = DCEEConfig.tuned_for(len(emb), emb.shape[1])
engine = DCEEEngine(cfg)
engine.build(emb)

# Query with a known vector; its own row should rank first.
q = emb[0]
for idx, score in engine.search(q, top_k=5):
    print(idx, score)

# Persist to disk and reload.
engine.save("index.dce2")
loaded = DCEEEngine.from_file("index.dce2")
print(loaded.search(q, top_k=3))
```

Full signatures and more snippets: see the API reference.
- `DCEEConfig.tuned_for(n_vectors, dim)` sets heuristic defaults for cluster count, probes, refinement pool, and keyframe spacing.
- `search` returns a list of `(global_row_index, cosine_similarity)` pairs; vectors are treated as unit-normalized for scoring.
- Use `DCEEEngine.from_file(path, verbose=False)` in production to avoid extra logging during load.
- Set `cfg.verbose = False` for quieter builds (suppresses build logging and tqdm progress bars).

The main project ships helper scripts (run from the repo root after a dev install):
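The cosine-similarity contract above can be sanity-checked against an exact NumPy baseline. This is a sketch, not part of DCEE: `brute_force_search` is a hypothetical helper, and it assumes (as the notes state) that rows and query are unit-normalized, so a dot product equals cosine similarity.

```python
import numpy as np

def brute_force_search(emb, q, top_k):
    """Exact top-k by cosine similarity over unit-normalized rows."""
    scores = emb @ q                    # dot product == cosine for unit vectors
    top = np.argsort(-scores)[:top_k]   # indices of the highest scores
    return [(int(i), float(scores[i])) for i in top]

rng = np.random.default_rng(0)
emb = rng.standard_normal((1_000, 64)).astype(np.float32)
emb /= np.linalg.norm(emb, axis=1, keepdims=True)

# The query's own row should come back first with similarity ~1.0,
# mirroring the q = emb[0] query in the quick-start snippet.
print(brute_force_search(emb, emb[0], top_k=3)[0])
```

Comparing an engine's `search` output against this baseline on a small corpus is a quick way to confirm an index was built correctly.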
- `benchmark_dcee.py` — compare DCEE to FAISS baselines on the same queries.
- `tune_dcee.py` — random search over hyperparameters (recall vs. latency).
- `test_realworld_dcee.py` — run against synthetic or sentence-transformer corpora.
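Tuning runs of the kind `tune_dcee.py` performs typically score a configuration by recall against an exact baseline. A minimal sketch of that metric (the function name is illustrative, not something the scripts export):

```python
def recall_at_k(approx_ids, exact_ids, k):
    """Fraction of the exact top-k that the approximate top-k recovered."""
    return len(set(approx_ids[:k]) & set(exact_ids[:k])) / k

# Overlap between {3, 1, 7} and {1, 3, 5} is {1, 3}, i.e. 2 of 3.
print(recall_at_k([3, 1, 7, 9], [1, 3, 5, 9], k=3))
```

Sweeping `k` alongside latency measurements gives the recall-vs-latency trade-off the tuner searches over.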