Dot Product Similarity: The Fast Track to Vector Matching
Dot product similarity is one of the cheapest ways to compare vectors. With normalized embeddings it is equivalent to cosine similarity, which makes it a common choice in high-performance systems.
What Is Dot Product?
Dot product multiplies corresponding vector elements and sums them:
A · B = (A₁ × B₁) + (A₂ × B₂) + ... + (Aₙ × Bₙ)
Example:
A = [1, 2, 3]
B = [4, 5, 6]
A · B = (1×4) + (2×5) + (3×6) = 4 + 10 + 18 = 32
Dot Product vs. Cosine Similarity
With normalized vectors:
If ||A|| = 1 and ||B|| = 1 (unit vectors)
Cosine similarity = (A · B) / (||A|| × ||B||) = (A · B) / (1 × 1) = A · B
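The identity above can be checked numerically. The following is a minimal sketch using NumPy; the vectors are arbitrary illustration values and the helper name `cosine_similarity` is an assumption, not part of any particular library.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity: dot product divided by the product of the norms."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Arbitrary example vectors (not taken from a real embedding model)
a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])

# Scale each vector to unit length
a_unit = a / np.linalg.norm(a)
b_unit = b / np.linalg.norm(b)

cos_ab = cosine_similarity(a, b)          # cosine on the raw vectors
dot_unit = float(np.dot(a_unit, b_unit))  # plain dot product on unit vectors

# The two values agree up to floating-point rounding
print(cos_ab, dot_unit)
```

The raw dot product of these two vectors is 32, but once both are unit length the dot product and the cosine coincide.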
Result: Dot product and cosine similarity are identical!
Computational advantage:
Cosine similarity:
1. Compute A · B
2. Compute ||A||
3. Compute ||B||
4. Divide A · B by the product of the norms
Total operations: n multiplications + (n-1) additions + 2 norms + 1 division (each norm costs roughly another n multiplications, n-1 additions, and a square root)
Dot product (with normalized vectors):
1. Compute A · B
Total operations: n multiplications + (n-1) additions
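One way to picture the two procedures is to score a batch of documents both ways. This is a sketch on synthetic random data, assuming the documents are normalized once at indexing time; the function names `cosine_scores` and `dot_scores` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
docs = rng.normal(size=(1000, 768))  # synthetic "document" vectors
query = rng.normal(size=768)         # synthetic "query" vector

def cosine_scores(query, docs):
    """Cosine at query time: dot products plus norms for every vector."""
    dots = docs @ query
    return dots / (np.linalg.norm(docs, axis=1) * np.linalg.norm(query))

# One-time cost: normalize the documents when they are indexed
docs_unit = docs / np.linalg.norm(docs, axis=1, keepdims=True)
query_unit = query / np.linalg.norm(query)

def dot_scores(query_unit, docs_unit):
    """Query time on unit vectors: a single matrix-vector product."""
    return docs_unit @ query_unit

cos = cosine_scores(query, docs)
dot = dot_scores(query_unit, docs_unit)
best_cos = int(np.argmax(cos))
best_dot = int(np.argmax(dot))
```

Both functions return the same scores up to rounding, so the ranking is unchanged; the difference is only where the norm work is paid for.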
Savings: Skip all norm computations!
Practical speedup (approximate):
768-dimensional vectors
Cosine: ~2000 CPU cycles
Dot product (normalized): ~768 CPU cycles
Speedup: 2.6× faster
At scale (1M searches):
Cosine: 2 seconds
Dot product: 0.77 seconds
That is a practical difference in query serving.
Normalization: The Critical Requirement
Dot product similarity ONLY equals cosine similarity with normalized vectors.
What normalization means:
Vector: [3, 4]
Magnitude: √(9 + 16) = 5
Normalized: [3/5, 4/5] = [0.6, 0.8]
Check: √(0.36 + 0.64) = 1.0 ✓
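The same calculation can be written as a small helper. This is a sketch, assuming NumPy; the name `l2_normalize` and the `eps` guard against zero-length vectors are illustrative choices, not something the steps above prescribe.

```python
import numpy as np

def l2_normalize(vectors, eps=1e-12):
    """Scale a vector (or each row of a matrix) to unit L2 norm.

    eps avoids division by zero; an all-zero vector stays all zeros.
    """
    arr = np.asarray(vectors, dtype=np.float64)
    single = arr.ndim == 1
    mat = np.atleast_2d(arr)
    norms = np.linalg.norm(mat, axis=1, keepdims=True)
    out = mat / np.maximum(norms, eps)
    return out[0] if single else out

v = l2_normalize([3.0, 4.0])
print(v)                  # approximately [0.6, 0.8]
print(np.linalg.norm(v))  # approximately 1.0
```

Passing a 2-D array normalizes every row, which is the shape most embedding batches arrive in.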
All normalized vectors have magnitude 1.0.
Without normalization:
A = [1, 2, 3] (magnitude ≈ 3.74)
B = [2, 4, 6] (magnitude ≈ 7.48) (A scaled by 2)
Dot product: (1×2) + (2×4) + (3×6) = 2 + 8 + 18 = 28
Normalized:
A' = [0.267, 0.535, 0.802]
B' = [0.267, 0.535, 0.802]
A' · B' = (0.267×0.267) + (0.535×0.535) + (0.802×0.802) ≈ 1.0
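The numbers in this example can be reproduced in a few lines. This is a sketch with NumPy, where `cosine` is a hypothetical helper defined only for the comparison.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = 2.0 * a  # same direction as a, twice the magnitude

def cosine(u, v):
    """Cosine similarity of two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

raw_self = float(np.dot(a, a))    # a against itself
raw_scaled = float(np.dot(a, b))  # a against the scaled copy
cos_scaled = cosine(a, b)         # direction only, magnitude divided out

print(raw_self, raw_scaled, cos_scaled)
```

The raw dot product doubles when one vector is doubled, while the cosine stays at 1 (up to rounding) because both vectors point in the same direction.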
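Problem: The raw dot product depends on vector magnitude, so scaling a vector changes the score even though its direction is unchanged (A · A = 14, but A · B = 28). Cosine similarity handles this correctly: it is 1.0 in both cases.
Practical Implications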
Implementation in Vector Databases
Vector databases differ in how they handle normalization. Some factor magnitude out for you, while others expect unit vectors as input, so check the documentation for the metric you choose:
# Pinecone (index configured with the dot product metric)
# Assumes the embeddings are already unit length so scores match cosine similarity
index.upsert(vectors=[
    ("id1", embedding1),
    ("id2", embedding2),
])
results = index.query(vector=query_embedding, top_k=5)
# Elasticsearch (offers both metrics via the field mapping)
{
  "knn": {
    "field": "embedding",
    "query_vector": query_embedding,  # Must be normalized for dot_product
    "k": 5,
    "similarity": 0.5  # Minimum similarity threshold, not the metric
  }
}
# FAISS (IndexFlatIP scores by inner product)
index = faiss.IndexFlatIP(dimension)  # IP = Inner Product
index.add(embeddings)  # Expects embeddings you have already normalized
distances, indices = index.search(query_embedding, k=5)
Embedding Model Normalization
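Many embedding models output normalized vectors, but not all do, so check:
import numpy as np
from sentence_transformers import SentenceTransformer
model = SentenceTransformer('all-MiniLM-L6-v2')
embedding = model.encode("This is a sentence")
print(f"Embedding norm: {np.linalg.norm(embedding)}")  # Should be ~1.0
If the norm is not close to 1.0, normalize the vectors yourself before using dot product (with sentence-transformers, passing normalize_embeddings=True to encode does this for you).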
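A check like the one above can be wrapped into a guard that runs before vectors reach the index. This is a minimal sketch, assuming NumPy arrays with one embedding per row; `is_unit_norm`, `ensure_unit_norm`, and the `tol` tolerance are hypothetical names and values chosen for illustration.

```python
import numpy as np

def is_unit_norm(embeddings, tol=1e-3):
    """Return True if every row has L2 norm within tol of 1.0."""
    norms = np.linalg.norm(np.atleast_2d(embeddings), axis=1)
    return bool(np.all(np.abs(norms - 1.0) <= tol))

def ensure_unit_norm(embeddings, tol=1e-3):
    """Return the embeddings unchanged if already unit length, else normalize each row."""
    embeddings = np.atleast_2d(np.asarray(embeddings, dtype=np.float32))
    if is_unit_norm(embeddings, tol):
        return embeddings
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    return embeddings / np.maximum(norms, 1e-12)  # guard against zero vectors

# Made-up batch: the first row is not normalized, the second already is
batch = np.array([[3.0, 4.0], [0.6, 0.8]], dtype=np.float32)
fixed = ensure_unit_norm(batch)
print(is_unit_norm(batch), is_unit_norm(fixed))
```

Running the guard on every batch costs one pass over the data and removes the need to trust a model's defaults.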
When to Use Dot Product
Use dot product when:
✓ Vectors are normalized (magnitude = 1.0)
✓ Using FAISS, Qdrant, Milvus, or similar optimized systems
✓ Maximum performance is critical
✓ You're willing to ensure proper normalization
Use cosine similarity when:
✓ Vectors might not be normalized
✓ Readability matters more than microseconds
✓ Your team is less familiar with vector math
✓ Using general-purpose databases (PostgreSQL with pgvector)
Mathematical Relationship
Understanding the relationship clarifies when to use each:
For normalized vectors (||A|| = ||B|| = 1): Dot product = Cosine similarity
For non-normalized vectors:
Dot product ≠ Cosine similarity (in general)
Cosine = Dot product / (||A|| × ||B||)
Dot product = Cosine × ||A|| × ||B||
If A is scaled by k and B by m (with k, m > 0):
Dot product becomes k × m × original
Cosine remains unchanged
Performance Example: Dot Product at Scale
Building a search system for 10M documents:
import time
import numpy as np
import faiss
# Setup: 10M docs x 384 dims (about 15 GB as float32; reduce the count to try this locally)
embeddings = np.random.randn(10000000, 384).astype('float32')
# Normalize each row to unit length
embeddings = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
# Create index
index = faiss.IndexFlatIP(384)  # Dot product index
index.add(embeddings)
# Query
query = np.random.randn(1, 384).astype('float32')
query = query / np.linalg.norm(query)
# Time retrieval
start = time.perf_counter()
distances, indices = index.search(query, k=10)
elapsed = time.perf_counter() - start
print(f"Search time: {elapsed*1000:.2f}ms")  # Depends heavily on hardware
Results:
10M vectors, 384 dimensions
GPU (V100): ~2ms per search
CPU (modern): ~20-50ms per search
Throughput: 20-500 queries/second depending on hardware
Optimization: Quantization
Quantizing normalized embeddings with dot product:
# Original: float32 (4 bytes per element)
# Quantized: int8 (-128 to 127, 1 byte per element)
# Memory: 4x reduction
# Speed: up to 4-8x faster dot product (SIMD operations)
import numpy as np
original = np.array([0.1, 0.8, -0.2, ...])  # float32
quantized = np.round(original * 127).astype(np.int8)  # Scale, round, and convert
# Dot product of quantized vectors; accumulate in int32 because int8 arithmetic would overflow
dot_product = np.dot(quantized.astype(np.int32), quantized.astype(np.int32)) / (127 * 127)
# Similar results with 4x memory savings
Vector databases such as Qdrant offer built-in quantization options.
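To get a feel for how much accuracy the int8 version gives up, the exact and quantized dot products can be compared on synthetic unit vectors. This is a sketch of simple symmetric scalar quantization with a fixed scale of 127; the helper names are hypothetical, and production systems typically calibrate the scale per vector or per dataset rather than fixing it.

```python
import numpy as np

def quantize_int8(unit_vec):
    """Map components in [-1, 1] to int8 values in [-127, 127]."""
    return np.clip(np.round(unit_vec * 127.0), -127, 127).astype(np.int8)

def quantized_dot(qa, qb):
    """Dot product of int8 vectors, accumulated in int32 to avoid overflow."""
    acc = np.dot(qa.astype(np.int32), qb.astype(np.int32))
    return float(acc) / (127.0 * 127.0)

# Synthetic unit vectors standing in for real embeddings
rng = np.random.default_rng(0)
a = rng.normal(size=384).astype(np.float32)
b = rng.normal(size=384).astype(np.float32)
a /= np.linalg.norm(a)
b /= np.linalg.norm(b)

qa, qb = quantize_int8(a), quantize_int8(b)
exact = float(np.dot(a, b))
approx = quantized_dot(qa, qb)
error = abs(exact - approx)

print(f"exact={exact:.4f} approx={approx:.4f} error={error:.4f}")
print(f"bytes: float32={a.nbytes}, int8={qa.nbytes}")
```

With a fixed scale, high-dimensional unit vectors use only a few of the 255 available levels per component, which is one reason calibrated scales are usually preferred in practice.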
Advanced: Asymmetric Dot Product
For query-document pairs, asymmetry can help:
import numpy as np
def asymmetric_similarity(query, document):
    # encode_query and encode_document are placeholders for two separate encoders
    # Query encoded differently (shorter context)
    query_vec = encode_query(query)
    # Document encoded differently (longer context)
    doc_vec = encode_document(document)
    # Both normalized, dot product gives similarity
    return np.dot(query_vec, doc_vec)
Some systems train separate encoders for queries and documents, optimizing for asymmetric retrieval.
Debugging Dot Product Issues
Issue 1: Not Normalized
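# Problem
vectors = model.encode(texts)  # Might not be normalized
vec1, vec2 = vectors[0], vectors[1]
sim = np.dot(vec1, vec2)  # Not a cosine similarity if the vectors are not normalized!
# Solution
vectors = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
vec1, vec2 = vectors[0], vectors[1]
sim = np.dot(vec1, vec2)  # Now equals cosine similarity
Issue 2: Scale Drift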
# If documents are added over time with different normalization
initial_docs = normalize(docs_batch_1)
new_docs = docs_batch_2  # Forgot to normalize!
# Result: inconsistent similarities
# Solution: always normalize at ingestion
def ingest_documents(docs):
    embeddings = model.encode(docs)
    embeddings = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    return store_embeddings(embeddings)
Issue 3: Negative Similarities
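# After normalization, similarities fall in the [-1, 1] range
# Negative values are valid: the vectors point in roughly opposite directions
# Values outside [-1, 1] (beyond rounding error) indicate a normalization problem
similarity = np.dot(normalized_vec1, normalized_vec2)
if abs(similarity) > 1.0 + 1e-6:
    raise ValueError("Similarity outside [-1, 1] indicates a normalization error")
Dot Product in 2024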
Trends:
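- FAISS (Facebook’s vector index) supports both inner product and L2 distance, with dedicated inner-product indexes such as IndexFlatIP
- Qdrant and Milvus heavily optimize dot product
- Many new embedding models normalize their output automatically
- Quantized dot product is becoming standard for reducing cost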
Recommendation:
- Modern RAG systems default to dot product with normalized vectors
- It's the industry standard for performance-critical retrieval
- Understand both dot product and cosine similarity
- Know they're equivalent with normalization
Conclusion
Dot product similarity is one of the cheapest similarity metrics for normalized vectors. It’s the standard in high-performance vector databases and is mathematically identical to cosine similarity when vectors are normalized. For RAG systems serving high query throughput, understanding and using dot product correctly is essential.