Introduction
DCEE (Delta-Compressed Embedding Engine) is a Python library for approximate nearest-neighbor search over embeddings that are correlated in vector space, meaning that many vectors have close neighbors in the corpus. Typical examples are chunks from the same document, consecutive log lines, chat turns, or clusters of semantically similar items.
Core idea
Instead of storing every vector independently as dense floats, DCEE partitions the corpus with k-means, orders the vectors inside each cluster so that consecutive vectors are close, then stores keyframes and quantized deltas along that order. This mirrors video coding (keyframes plus differences) and saves storage when the deltas are small.
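The keyframe-plus-delta idea can be sketched in plain NumPy. This is a minimal illustration, not DCEE's actual storage format: the function names, the fixed `keyframe_interval`, the int8 quantization, and the one-scale-per-segment choice are all assumptions made for the example. It also assumes the cluster has already been ordered so that consecutive rows are close.

```python
import numpy as np

def encode_cluster(vectors, keyframe_interval=8):
    """Encode an already-ordered cluster as (keyframe, scale, int8 deltas) segments.

    Hypothetical layout for illustration only.
    """
    vectors = np.asarray(vectors, dtype=np.float32)
    segments = []
    for start in range(0, len(vectors), keyframe_interval):
        chunk = vectors[start:start + keyframe_interval]
        keyframe = chunk[0].copy()  # stored at full precision
        # One scale per segment, sized from the raw row-to-row differences.
        raw = np.diff(chunk, axis=0)
        max_abs = float(np.abs(raw).max()) if raw.size else 0.0
        scale = np.float32(max_abs / 127.0) if max_abs > 0 else np.float32(1.0)
        deltas = np.zeros(raw.shape, dtype=np.int8)
        prev = keyframe.copy()
        for i in range(1, len(chunk)):
            # Quantize against the *reconstructed* previous row so that
            # rounding error does not accumulate along the chain.
            q = np.clip(np.rint((chunk[i] - prev) / scale), -127, 127).astype(np.int8)
            deltas[i - 1] = q
            prev = prev + q.astype(np.float32) * scale
        segments.append((keyframe, scale, deltas))
    return segments

def decode_cluster(segments):
    """Rebuild float32 vectors by replaying the deltas from each keyframe."""
    rows = []
    for keyframe, scale, deltas in segments:
        prev = keyframe.copy()
        rows.append(prev)
        for q in deltas:
            prev = prev + q.astype(np.float32) * scale
            rows.append(prev)
    return np.stack(rows)

def encoded_nbytes(segments):
    """Total bytes held by keyframes, scales, and deltas."""
    return sum(k.nbytes + s.nbytes + d.nbytes for k, s, d in segments)

# Usage: a slowly drifting sequence stands in for correlated embeddings.
rng = np.random.default_rng(0)
walk = 1.0 + np.cumsum(0.01 * rng.standard_normal((20, 16)), axis=0)
segments = encode_cluster(walk)
restored = decode_cluster(segments)
print("max abs error:", float(np.abs(restored - walk).max()))
print("bytes:", encoded_nbytes(segments), "vs", walk.astype(np.float32).nbytes)
```

In this sketch the savings come from replacing most float32 rows with int8 rows. When consecutive vectors are far apart, the per-segment scale grows and the reconstruction error grows with it, which is why the approach depends on correlated data.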
Query pipeline
- Keyframe routing: compare the query to one representative vector per cluster (fast matmul).
- Adaptive Margin Probing (AMP): optionally scan extra clusters when keyframe scores sit on a flat plateau (ambiguous routing).
- Coarse scoring inside chosen clusters, then refinement on the best candidates with full-precision cosine similarity.
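The routing and refinement steps above can be sketched as follows. The source does not spell out the exact AMP rule, so the one used here (keep probing clusters whose keyframe score is within `margin` of the best score, up to `max_probe`) is an assumption, as are the function names and parameters. The coarse scoring stage is omitted for brevity.

```python
import numpy as np

def normalize(x):
    """Scale vectors to unit length along the last axis."""
    x = np.asarray(x, dtype=np.float32)
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def route(query, keyframes, n_probe=1, margin=0.05, max_probe=4):
    """Pick clusters to scan: the top n_probe, plus any near-ties.

    Hypothetical margin rule: a cluster beyond the first n_probe is added
    while its cosine score stays within `margin` of the best score.
    """
    scores = normalize(keyframes) @ normalize(query)  # one matmul over all clusters
    order = np.argsort(-scores)
    chosen = list(order[:n_probe])
    cutoff = scores[order[0]] - margin
    for cluster in order[n_probe:max_probe]:
        if scores[cluster] < cutoff:
            break  # scores have dropped off the plateau
        chosen.append(cluster)
    return [int(c) for c in chosen]

def refine(query, candidates, k):
    """Re-rank candidate vectors with full-precision cosine similarity."""
    sims = normalize(candidates) @ normalize(query)
    top = np.argsort(-sims)[:k]
    return top, sims[top]

# Usage: clusters 0 and 1 have similar keyframes, cluster 2 is far away.
keyframes = np.array([[1.0, 0.0], [0.95, 0.31], [0.0, 1.0]])
query = np.array([1.0, 0.1])
print(route(query, keyframes, margin=0.05))  # ambiguous routing: extra cluster probed
print(route(query, keyframes, margin=0.01))  # tight margin: best cluster only
```

A wider margin trades extra scanning for better recall when the query sits between clusters; a margin of zero reduces to plain top-`n_probe` routing.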
When it works well
- Embeddings from sliding windows over one file or topic thread.
- Corpora where MiniBatch k-means finds coherent groups.
When it does not
- IID / uncorrelated vectors — deltas do not compress; consider FAISS HNSW / IVF-PQ instead.
- Workloads that need exact nearest-neighbor guarantees: DCEE is approximate by design (quantization + partial cluster search).
Compute backend
Reconstruction and scoring use CuPy when a CUDA GPU is available; otherwise they fall back to NumPy on the CPU, with the same API.
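A common way to get this kind of fallback is to choose an array module once and write the numeric code against it. The sketch below shows that pattern under stated assumptions: `get_array_module`, `cosine_scores`, and `to_numpy` are hypothetical helper names, and DCEE's own detection logic may differ.

```python
import numpy as np

def get_array_module():
    """Return cupy if it imports and sees a CUDA device, else numpy."""
    try:
        import cupy as cp
        if cp.cuda.runtime.getDeviceCount() > 0:
            return cp
    except Exception:
        # CuPy is not installed, or no usable CUDA driver/device was found.
        pass
    return np

def cosine_scores(xp, query, matrix):
    """Cosine similarity of one query against each row, on the chosen backend."""
    q = xp.asarray(query, dtype=xp.float32)
    m = xp.asarray(matrix, dtype=xp.float32)
    q = q / xp.linalg.norm(q)
    m = m / xp.linalg.norm(m, axis=1, keepdims=True)
    return m @ q

def to_numpy(a):
    """Copy a result back to host memory if it lives on the GPU."""
    return a.get() if hasattr(a, "get") else np.asarray(a)

xp = get_array_module()
scores = cosine_scores(xp, [1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
print(xp.__name__, to_numpy(scores))
```

Because CuPy mirrors most of the NumPy API, the same `cosine_scores` body runs on either backend; only the final copy back to host memory is backend-specific.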