AI/ML

Indexing 100M Vectors in 20 Minutes on PostgreSQL with 12GB RAM

Codemurf Team

AI Content Generator

Dec 13, 2025
5 min read

Learn how to achieve high-performance vector indexing on PostgreSQL with pgvector using optimized strategies for AI embeddings and semantic search on constrained hardware.

The demand for efficient semantic search capabilities has exploded with the rise of AI applications. While specialized vector databases often grab headlines, PostgreSQL, enhanced by the pgvector extension, remains a formidable and production-ready contender. The real challenge lies in operational efficiency: how do you manage massive vector datasets on practical, cost-effective hardware? We demonstrate a methodology for indexing 100 million embedding vectors in just 20 minutes using a single PostgreSQL instance with only 12GB of RAM.

The Power of PostgreSQL and pgvector

PostgreSQL, the venerable open-source RDBMS, has evolved into a powerful multi-model database. The pgvector extension adds support for vector similarity search directly within SQL, enabling seamless integration of AI embeddings with traditional relational data. This eliminates complex data pipelines between separate systems for transactional and vector data.
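A minimal setup illustrates this integration; the table and column names here (items, embedding) are illustrative, and the halfvec (FP16) type assumes pgvector 0.7 or later:

```sql
-- Enable the extension (the pgvector package must be installed on the server)
CREATE EXTENSION IF NOT EXISTS vector;

-- Embeddings live alongside ordinary relational columns in one table
CREATE TABLE items (
    id        bigserial PRIMARY KEY,
    title     text,
    embedding halfvec(1536)  -- FP16 storage, matching the benchmark below
);
```

Because the embedding column is just another column, joins, filters, and transactions over vectors and metadata need no separate pipeline.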

Key to this performance is the Hierarchical Navigable Small World (HNSW) index. HNSW creates a multi-layered graph for approximate nearest neighbor (ANN) search, offering an excellent trade-off between query speed, recall accuracy, and build time. Unlike Inverted File (IVFFlat) indexes, HNSW doesn't require training and provides superior search performance out-of-the-box, making it ideal for dynamic datasets.
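At query time, an ANN search is an ordinary ORDER BY over a distance operator; the hnsw.ef_search setting controls the recall/latency trade-off. A sketch, assuming a table items with a halfvec(1536) column named embedding:

```sql
-- Larger ef_search = better recall, slower queries (pgvector default is 40)
SET hnsw.ef_search = 100;

-- Top-10 nearest neighbors by cosine distance (the <=> operator)
SELECT id, title
FROM items
ORDER BY embedding <=> '[/* 1536 floats of the query embedding */]'::halfvec(1536)
LIMIT 10;
```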

Architecture for Speed on Constrained Hardware

Indexing 100M vectors with limited RAM requires a deliberate approach to avoid swap thrashing and maximize I/O throughput. The core strategy involves decoupling the data loading phase from the index creation phase and leveraging PostgreSQL's tuning knobs.

1. Optimized Data Loading: Bulk insertion is performed using COPY commands, the fastest method to ingest data into PostgreSQL. Vectors are loaded into an unindexed table first. Crucially, we temporarily increase maintenance_work_mem significantly during the index build phase, allowing PostgreSQL to use more memory for the sorting and graph construction operations the HNSW algorithm requires.
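In SQL terms, this phase might look like the following; the file path and the 8GB setting are illustrative assumptions for a 12GB machine, not prescriptions:

```sql
-- 1. Bulk-load into the still-unindexed table: COPY is the fastest ingest path
COPY items (id, title, embedding)
FROM '/data/embeddings.csv' WITH (FORMAT csv);

-- 2. Before building the index, give it most of the available RAM
SET maintenance_work_mem = '8GB';
```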

2. Strategic Index Build Parameters: The HNSW index in pgvector accepts critical parameters: m (the maximum number of connections per layer) and ef_construction (the size of the dynamic candidate list during build). Higher values improve graph quality and subsequent search accuracy but increase build time and memory. For this scale, we found a moderate m of 16 and an ef_construction of 64 provided an optimal balance, enabling fast construction while maintaining high recall for search queries.
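With those values, the build statement is a one-liner; halfvec_cosine_ops assumes FP16 vectors and cosine distance, swap in the operator class matching your type and metric:

```sql
-- HNSW build with the parameters discussed above
CREATE INDEX ON items
USING hnsw (embedding halfvec_cosine_ops)
WITH (m = 16, ef_construction = 64);
```

Note that m = 16 and ef_construction = 64 are also pgvector's defaults, so omitting the WITH clause yields the same graph.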

3. System-Level Tuning: To leave headroom for the large maintenance_work_mem allocation, we moderately reduced shared_buffers (PostgreSQL's own buffer cache), since RAM held by the buffer pool is unavailable to the index build. We also ensured the system had a fast NVMe SSD to handle the intensive random I/O of graph construction. Finally, the max_parallel_maintenance_workers setting was increased to allow PostgreSQL to use multiple CPU cores for the index build.
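As a sketch, these settings can be applied with ALTER SYSTEM; the concrete values below are illustrative assumptions for a 4 vCPU / 12GB instance, not measured optima:

```sql
-- Reduced buffer cache to free RAM for the build (restart required to apply)
ALTER SYSTEM SET shared_buffers = '2GB';

-- Parallel HNSW builds require pgvector 0.6+; one worker per available core
ALTER SYSTEM SET max_parallel_maintenance_workers = 4;
```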

Key Takeaways and Performance Insights

This exercise yields several critical insights for engineering teams:

  • PostgreSQL is Production-Ready for Vectors: With proper tuning, it can handle web-scale vector workloads, simplifying architecture by co-locating vectors with related metadata.
  • RAM is for Indexing, Disk is for Storage: The 12GB RAM was used almost exclusively as working memory for the HNSW construction algorithm. The 100M vectors themselves (at 1536 dimensions, FP16) resided on disk, showcasing that massive vector datasets do not need to fit entirely in RAM.
  • Parameter Tuning is Non-Linear: The relationship between HNSW parameters (m, ef_construction), build time, memory, and final query performance is complex. Profiling with a subset of data is essential before a full production build.
  • The 20-Minute Benchmark: Achieved on a cloud instance with 4 vCPUs, 12GB RAM, and a local NVMe SSD. This highlights the incredible cost-performance ratio possible with mature open-source software.
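A back-of-envelope check makes the "disk is for storage" point concrete; the arithmetic below uses the stated 1536 dimensions at 2 bytes per component (FP16):

```sql
-- 100M vectors × 1536 dims × 2 bytes: raw vector payload, excluding
-- per-row and index overhead -- roughly 286 GB, ~24x the 12GB of RAM
SELECT pg_size_pretty(100000000::bigint * 1536 * 2);
```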

Conclusion

Specialized vector databases offer compelling features, but for many organizations, the simplest and most robust path to semantic search lies within their existing PostgreSQL infrastructure. By leveraging pgvector's HNSW index and applying targeted database tuning, it's possible to achieve remarkable performance—indexing 100 million vectors in 20 minutes—on modest, cost-effective hardware. This approach reduces system complexity, leverages existing SQL expertise, and provides a unified data store for both structured and vector data, making AI-powered search more accessible than ever.
