HNSW Algorithm — Interactive Visualization

Hierarchical Navigable Small World graphs for Approximate Nearest Neighbor search

🏗️
1. Build the Graph
Click + Add 10 Points to populate the graph. Each point is randomly promoted to higher layers, creating a natural hierarchy (sketched in code after these steps).
🔍
2. Run a Search
Click Search Nearest to place a random query point. HNSW will find its nearest neighbor using the hierarchy.
👣
3. Step Through
Use Prev / Next to walk through each decision. Watch how the search descends from sparse layers to dense ones.
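
The random promotion in step 1 follows a simple rule from the HNSW paper: each new point's top layer is drawn from an exponentially decaying distribution, so each layer up keeps only a small, fixed fraction of the points. Here is a minimal Python sketch (the normalization mL = 1/ln(M) is the paper's recommendation; M = 16 is an assumed value, not necessarily what this page uses):

  import math
  import random
  from collections import Counter

  def random_level(m_l: float = 1 / math.log(16)) -> int:
      # P(level >= L) = exp(-L / m_l); with m_l = 1/ln(M), each layer up
      # keeps roughly a 1/M fraction of the points (here M = 16).
      u = 1.0 - random.random()      # uniform in (0, 1]; avoids log(0)
      return int(-math.log(u) * m_l)

  # Most points stay on layer 0; a few climb into the sparse "highway" layers.
  print(Counter(random_level() for _ in range(10_000)))
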
Node Colors: Entry Point (search starts here), Regular Node, Visited, Current Best, Query Point.

The visualization measures similarity with plain Euclidean distance: d = √((x₁−x₂)² + (y₁−y₂)²).
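For example, two points at (1, 1) and (4, 5) are √((1−4)² + (1−5)²) = √(9 + 16) = 5 units apart.
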
How HNSW Works

HNSW builds a multi-layer graph. Think of it like a city's transport network: highways (top layers) let you travel far quickly, then local roads (bottom layer) take you exactly where you need to go.

  • Layer 0 (bottom): Contains all points. Highest accuracy, but too slow to search exhaustively.
  • Higher layers: Contain a random subset of points. Sparser = faster to navigate.
  • Search strategy: Start at the top, greedily move toward the query, then descend. Each descent gives finer resolution.

Result: logarithmic search time even for millions of points, with high recall.
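
In code, the search itself is short. Below is a self-contained Python sketch of the greedy descent over a hypothetical two-layer graph (the points, links, and entry node are made up for illustration; a real HNSW search keeps a small beam of candidates, controlled by a search-time ef, rather than a single current node):

  import math

  def dist(p, q):
      # Euclidean distance, matching d = √((x₁−x₂)² + (y₁−y₂)²) above.
      return math.hypot(p[0] - q[0], p[1] - q[1])

  # Toy graph: five labeled 2-D points and two layers of links.
  pts = {"a": (1, 1), "b": (4, 2), "c": (8, 3), "d": (9, 8), "e": (2, 7)}
  layers = [
      {"a": ["b", "e"], "b": ["a", "c"], "c": ["b", "d"],
       "d": ["c", "e"], "e": ["a", "d"]},            # layer 0: every point
      {"a": ["c"], "c": ["a", "d"], "d": ["c"]},     # layer 1: sparse subset
  ]

  def greedy_search(query, entry="a"):
      # Start at the entry point on the top layer; hop to any closer
      # neighbor until none improves, then drop one layer and repeat.
      current = entry
      for layer in reversed(layers):
          while True:
              closer = [n for n in layer.get(current, [])
                        if dist(pts[n], query) < dist(pts[current], query)]
              if not closer:
                  break                              # local minimum: descend
              current = min(closer, key=lambda n: dist(pts[n], query))
      return current

  print(greedy_search((7, 7)))  # -> 'd', the true nearest stored point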

Understanding the Parameters
  • Layers: Number of hierarchy levels. 3 is good for small graphs. More layers help very large datasets find shortcuts. Too many layers = wasted memory.
  • M (connections): How many neighbors each node connects to. Higher M = denser graph = better accuracy, but slower inserts and more memory. Typical production values: 5–32.
  • ef (build quality): How hard the algorithm searches for candidate neighbors during construction (called efConstruction in the HNSW paper). Higher ef = better-quality connections = better recall later. It affects build time only, not search speed.
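
These knobs map directly onto real HNSW libraries. A minimal sketch using the hnswlib Python package (assuming it is installed; the dimension and sizes are arbitrary). Note that hnswlib exposes no layer-count knob: layer membership falls out of the random assignment sketched earlier.

  import numpy as np
  import hnswlib

  dim, n = 2, 1000
  data = np.random.rand(n, dim).astype(np.float32)

  index = hnswlib.Index(space="l2", dim=dim)   # "l2" = squared Euclidean
  index.init_index(max_elements=n,
                   M=16,                       # connections per node
                   ef_construction=200)        # build quality
  index.add_items(data)

  index.set_ef(50)                             # search-time ef: a separate knob
  labels, distances = index.knn_query(data[:1], k=5)
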
Why is HNSW Useful?

Exact nearest neighbor search in high-dimensional spaces (like embeddings from LLMs) is extremely slow — you'd have to compare every point. HNSW trades a tiny bit of accuracy for massive speed gains:

  • Search time scales as O(log n) instead of O(n)
  • Achieves 95–99% recall in practice
  • Powers vector databases like Pinecone, Weaviate, and Chroma, as well as libraries like FAISS and hnswlib
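
Both claims are easy to check empirically. A rough benchmark sketch (hypothetical sizes, random vectors standing in for real embeddings; absolute timings depend on your machine) comparing a brute-force scan to an hnswlib index:

  import time
  import numpy as np
  import hnswlib

  n, dim = 100_000, 128
  data = np.random.rand(n, dim).astype(np.float32)
  queries = np.random.rand(100, dim).astype(np.float32)

  # Exact search: each query is compared against every point, O(n) apiece.
  t0 = time.perf_counter()
  exact = np.array([np.argmin(((data - q) ** 2).sum(axis=1)) for q in queries])
  t_exact = time.perf_counter() - t0

  # HNSW: a handful of greedy hops per layer instead of a full scan.
  index = hnswlib.Index(space="l2", dim=dim)
  index.init_index(max_elements=n, M=16, ef_construction=200)
  index.add_items(data)
  index.set_ef(50)

  t0 = time.perf_counter()
  approx, _ = index.knn_query(queries, k=1)
  t_hnsw = time.perf_counter() - t0

  recall = (approx.ravel() == exact).mean()
  print(f"exact: {t_exact:.2f}s  hnsw: {t_hnsw:.4f}s  recall@1: {recall:.1%}")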