DCEE

Benchmark with FAISS baselines

The DCEE repository includes benchmark_dcee.py, which measures Recall@K against exact inner-product neighbors on the same queries for every method. The vectors are normalized, so inner product equals cosine similarity.
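As an illustration of the metric, here is a minimal sketch of Recall@K against exact inner-product neighbors. It is not taken from benchmark_dcee.py: the function names exact_top_k and recall_at_k and the random data are assumptions made for this example.

```python
import numpy as np

def exact_top_k(base, queries, k):
    """Exact inner-product neighbors by brute force (the ground truth)."""
    scores = queries @ base.T                       # shape: (n_queries, n_base)
    # argpartition selects the k largest scores per row without a full sort
    return np.argpartition(-scores, k - 1, axis=1)[:, :k]

def recall_at_k(approx_ids, exact_ids):
    """Mean fraction of the exact top-K ids that the approximate result contains."""
    hits = sum(len(set(a) & set(e)) for a, e in zip(approx_ids, exact_ids))
    return hits / exact_ids.size

rng = np.random.default_rng(0)
base = rng.standard_normal((2000, 64)).astype(np.float32)
base /= np.linalg.norm(base, axis=1, keepdims=True)      # normalize: IP == cosine
queries = rng.standard_normal((20, 64)).astype(np.float32)
queries /= np.linalg.norm(queries, axis=1, keepdims=True)

exact = exact_top_k(base, queries, k=5)

# Stand-in for an approximate index: it misses one of the five true neighbors.
approx = exact.copy()
approx[:, -1] = -1

print(recall_at_k(exact, exact))    # 1.0
print(recall_at_k(approx, exact))   # 0.8
```

Order within the top K does not matter for this metric; only set membership is counted.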

Example snapshot

Representative run: 50,000 normalized vectors, the same 200 queries for every method, K = 5, DCEE with tuned + AMP settings. Numbers depend on machine, drivers, and dataset; treat them as directional, not as guarantees.

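| Method                    | Recall@5 | P50 (ms) | P95 (ms) | QPS (approx.) | Build (s) | Size (MB) |
|---------------------------|----------|----------|----------|---------------|-----------|-----------|
| DCEE                      | 96.4%    | 0.97     | 1.01     | 422           | 8.57      | 6.40      |
| FAISS IndexFlatIP         | 100.0%   | 0.53     | 0.79     | 1897          | 0.01      | 25.60     |
| FAISS HNSW (M=32, ef=64)  | 100.0%   | 0.09     | 0.11     | 10689         | 0.63      | 39.21     |
| FAISS IVF-Flat (nprobe=8) | 90.6%    | 0.03     | 0.03     | 36364         | 0.48      | 26.47     |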

How to read this

  • Flat IP is the exact baseline: it stores full vectors and runs a brute-force search per query in this harness.
  • HNSW and IVF are mature approximate indexes. In this run both answer queries faster than DCEE, and both take more space (39.21 MB and 26.47 MB versus 6.40 MB).
  • DCEE emphasizes a smaller on-disk index (quantized deltas) at approximate recall; tune probes and refinement for your SLA.
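For readers reproducing the latency columns, the sketch below shows one way to obtain P50, P95, and approximate QPS from per-query timings. How benchmark_dcee.py actually times its queries is not stated here, so the one-query-at-a-time loop, the helper names (time_queries, summarize, flat_ip_search), and the NumPy brute-force search used as the system under test are all assumptions.

```python
import time
import numpy as np

def flat_ip_search(base):
    """Return a brute-force inner-product search function over `base`."""
    def search(q, k):
        scores = q @ base.T
        return np.argsort(-scores, axis=1)[:, :k]
    return search

def time_queries(search_fn, queries, k):
    """Run one query at a time and return per-query latencies in milliseconds."""
    latencies_ms = []
    for q in queries:
        start = time.perf_counter()
        search_fn(q[None, :], k)
        latencies_ms.append((time.perf_counter() - start) * 1e3)
    return np.asarray(latencies_ms)

def summarize(latencies_ms):
    """P50/P95 latency plus queries per second over the summed query time."""
    total_s = latencies_ms.sum() / 1e3
    return {
        "p50_ms": float(np.percentile(latencies_ms, 50)),
        "p95_ms": float(np.percentile(latencies_ms, 95)),
        "qps": len(latencies_ms) / total_s,
    }

rng = np.random.default_rng(0)
base = rng.standard_normal((5000, 64)).astype(np.float32)
queries = rng.standard_normal((200, 64)).astype(np.float32)

latencies = time_queries(flat_ip_search(base), queries, k=5)
stats = summarize(latencies)
print(stats)
```

Timing single queries gives tail latencies that batch timing hides, while batched search usually reports higher QPS, so the two styles are not directly comparable.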

Compression

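With int8 delta quantization, the vector payload is often about 4× smaller than raw float32 storage, because each component takes 1 byte instead of 4. The exact ratio depends on settings and corpus correlation. In the snapshot above, the DCEE index is 6.40 MB against 25.60 MB for Flat IP.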
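The arithmetic behind the roughly 4× figure can be sketched with a generic int8 delta quantizer. This is not DCEE's actual encoding, which is not described here: the choice of the corpus mean as the reference vector, the per-vector scale, the function names, and the dimension of 128 (picked so that raw float32 storage comes to 25.6 MB for 50,000 vectors) are all assumptions for illustration.

```python
import numpy as np

def quantize_deltas(vectors, reference):
    """Store each vector as int8 codes of (vector - reference) plus a float32 scale."""
    deltas = vectors - reference
    # Symmetric quantization: the largest |delta| in each row maps to 127.
    scale = np.abs(deltas).max(axis=1, keepdims=True) / 127.0
    scale = np.where(scale == 0, 1.0, scale).astype(np.float32)
    codes = np.round(deltas / scale).astype(np.int8)
    return codes, scale

def dequantize(codes, scale, reference):
    """Approximate reconstruction of the original float32 vectors."""
    return codes.astype(np.float32) * scale + reference

rng = np.random.default_rng(0)
vectors = rng.standard_normal((50_000, 128)).astype(np.float32)
reference = vectors.mean(axis=0, keepdims=True)      # assumed reference point

codes, scale = quantize_deltas(vectors, reference)
restored = dequantize(codes, scale, reference)

raw_mb = vectors.nbytes / 1e6                        # 4 bytes per component
code_mb = codes.nbytes / 1e6                         # 1 byte per component
total_mb = (codes.nbytes + scale.nbytes) / 1e6       # codes plus per-vector scales

print(raw_mb, code_mb, round(raw_mb / code_mb, 2))   # 25.6 6.4 4.0
print(round(raw_mb / total_mb, 2))                   # slightly under 4 once scales are counted
```

Deltas that are small relative to the reference use the int8 range better, which is one reason the achievable ratio and accuracy depend on how correlated the corpus is.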

Multi-hop retrieval benchmark

Besides single-hop Recall@K, DCEE can also be used in iterative multi-hop retrieval loops where each round expands the frontier with the top-K neighbors of all current hits (a common pattern in graph-style RAG systems).
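The expansion loop described above can be sketched as follows, using exact inner products in place of an index. The function name multihop_reach, the rule that a target counts as reached once it appears among the top-K neighbors of the current frontier, and the value of k are assumptions; the beam and max_depth defaults are borrowed from the benchmark settings quoted further down.

```python
import numpy as np

def multihop_reach(base, start, target, k=5, beam=32, max_depth=8):
    """Return the hop count at which `target` is first reached, or None."""
    if start == target:
        return 0
    frontier = [start]
    visited = {start}
    for depth in range(1, max_depth + 1):
        scores = base[frontier] @ base.T            # (len(frontier), n_base)
        candidates = {}                             # new node id -> best score seen
        for row in scores:
            # k + 1 because each frontier node is its own nearest neighbor
            for j in np.argsort(-row)[: k + 1]:
                j = int(j)
                if j not in visited:
                    candidates[j] = max(candidates.get(j, -np.inf), float(row[j]))
        if target in candidates:
            return depth
        if not candidates:
            return None
        # Keep only the `beam` best new nodes as the next frontier.
        frontier = sorted(candidates, key=candidates.get, reverse=True)[:beam]
        visited.update(frontier)
    return None

# Toy chain: six unit vectors in 2-D, each rotated 10 degrees from the previous one.
angles = np.deg2rad(np.arange(6) * 10.0)
chain = np.stack([np.cos(angles), np.sin(angles)], axis=1)

print(multihop_reach(chain, start=0, target=4, k=2))               # 3
print(multihop_reach(chain, start=0, target=4, k=2, max_depth=2))  # None
```

Swapping the `base[frontier] @ base.T` line for a call into an approximate index is what turns this oracle loop into the comparison the benchmark makes.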

The repository includes benchmark_multihop_retrieval_dcee.py, which simulates synthetic chains of correlated embeddings. It measures how often a distant target node becomes reachable within a small number of hops, applying the same expansion policy to DCEE and to an exact cosine oracle.
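One way to build such a chain is sketched below: each node is a unit vector whose cosine similarity to its predecessor is fixed at rho. The generator in benchmark_multihop_retrieval_dcee.py is not described here, so make_chain, rho, and the reading of "length" as the number of nodes are hypothetical choices for this example.

```python
import numpy as np

def make_chain(length, dim, rho, rng):
    """Unit vectors x_0 .. x_{length-1} with cos(x_i, x_{i+1}) equal to rho."""
    x = rng.standard_normal(dim)
    x /= np.linalg.norm(x)
    chain = [x]
    for _ in range(length - 1):
        noise = rng.standard_normal(dim)
        noise -= (noise @ x) * x                  # remove the component along x
        noise /= np.linalg.norm(noise)
        # rho * x and sqrt(1 - rho^2) * noise are orthogonal, so the sum has unit norm
        x = rho * x + np.sqrt(1.0 - rho**2) * noise
        chain.append(x)
    return np.stack(chain)

rng = np.random.default_rng(0)
chain = make_chain(length=5, dim=128, rho=0.8, rng=rng)

sims = chain @ chain.T
adjacent = np.diag(sims, k=1)     # similarity between consecutive nodes
print(np.round(adjacent, 3))      # [0.8 0.8 0.8 0.8]
```

Similarity decays with distance along the chain, so the last node is usually outside the first node's top-K and has to be reached hop by hop.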

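| Chain length L | Exact multi-hop % | DCEE multi-hop % | Exact hops (avg) | DCEE hops (avg) |
|----------------|-------------------|------------------|------------------|-----------------|
| 2              | 100.0%            | 100.0%           | 1.00             | 1.02            |
| 3              | 100.0%            | 100.0%           | 2.00             | 2.12            |
| 4              | 100.0%            | 100.0%           | 2.23             | 2.24            |
| 5              | 100.0%            | 100.0%           | 2.26             | 2.32            |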

On this synthetic multi-hop benchmark (chain length 2–5, beam = 32, max depth = 8), DCEE matches the exact cosine oracle at 100% multi-hop recall and needs at most 0.12 more hops on average (2.12 versus 2.00 at chain length 3). Multi-hop expansion stays in the tens of milliseconds per batch. On this workload, the compressed index is therefore reliable for multi-hop, retrieval-style expansion.

TurboQuant-style GloVe findings (DCEE)

Using benchmark_turboquant_style_dcee.py on GloVe dimensions 50/100/200/300 (100k base vectors, 1k queries), DCEE stores 512 to 2,912 bits per vector. Recall@10 falls from 94.51% at dimension 50 to 87.99% at dimension 300, and throughput falls from 261.6 to 82.4 QPS.

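| Dim | Recall@10 (%) | Build (s) | Query (s) | QPS   | Bits / vec | Est. MB |
|-----|---------------|-----------|-----------|-------|------------|---------|
| 50  | 94.51         | 5.3732    | 3.8229    | 261.6 | 512.0      | 6.40    |
| 100 | 92.59         | 9.7500    | 5.2909    | 189.0 | 992.0      | 12.40   |
| 200 | 90.33         | 23.1313   | 6.4738    | 154.5 | 1952.0     | 24.40   |
| 300 | 87.99         | 41.1703   | 12.1304   | 82.4  | 2912.0     | 36.40   |
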
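  • Bits/vec finding: bits/vec grows linearly with dimension in this run, from 512 at dimension 50 to 2,912 at dimension 300 (9.6 bits per dimension plus a constant 32), so index size is predictable from the dimension alone.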
  • Next dataset focus: DBpedia benchmark runs are in progress to compare with published TurboQuant-style settings at higher dimensions.
  • Next optimization focus: improve retrieval quality (Recall@10) while preserving DCEE's compact bit budget.
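The derived columns in the GloVe table can be cross-checked with a few lines of arithmetic. The sketch below uses only the table's own numbers; the two-point line fit and the comparison against 32 bits per float32 component are additions for illustration, not outputs of benchmark_turboquant_style_dcee.py.

```python
dims = [50, 100, 200, 300]
bits_per_vec = [512.0, 992.0, 1952.0, 2912.0]
query_s = [3.8229, 5.2909, 6.4738, 12.1304]
n_base, n_queries = 100_000, 1_000

# Fit a line through the first and last rows, then check it against every row.
slope = (bits_per_vec[-1] - bits_per_vec[0]) / (dims[-1] - dims[0])
intercept = bits_per_vec[0] - slope * dims[0]

rows = []
for d, bits, secs in zip(dims, bits_per_vec, query_s):
    rows.append({
        "dim": d,
        "fit_bits": round(slope * d + intercept, 1),   # line prediction for bits/vec
        "est_mb": round(bits * n_base / 8 / 1e6, 2),   # bits -> bytes -> MB
        "qps": round(n_queries / secs, 1),             # queries / total query time
        "vs_float32": round(32 * d / bits, 2),         # raw float32 bits / stored bits
    })

print(round(slope, 2), round(intercept, 2))            # 9.6 32.0
for row in rows:
    print(row)
```

The fitted line reproduces all four Bits / vec entries, and the Est. MB and QPS columns follow directly from bits per vector and total query time.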