Object-storage-native search engine for vector, full-text, and hybrid retrieval at petabyte scale
turbopuffer is a fast search engine, documented at turbopuffer.com/docs, built natively on object storage (S3, GCS, Azure Blob) with NVMe and memory caching on the compute layer. The architecture docs describe a write-ahead log on object storage, SPFresh centroid-based ANN indexes for vectors, inverted BM25 indexes for full-text, exact metadata indexes with native filtering, and copy-on-write namespace branching. Per turbopuffer.com/docs/index and turbopuffer.com/docs/architecture, the API supports vector ANN queries, BM25 full-text, hybrid multi-queries, regex/trigram search, filters, and encryption with customer-managed keys. The docs cite observed production scale (4T+ documents, 10M+ writes/s, 25k+ queries/s) and two tradeoffs: higher write latency, because writes are made durable on object storage, and occasional cold queries against uncached namespaces.
Use cases
- First-stage retrieval narrowing millions of documents for RAG pipelines
- Cost-sensitive vector search where object storage economics beat in-memory-only DBs
- Hybrid lexical + semantic search with high-recall filtered ANN
- Petabyte-scale namespaces with pinned hot sets for low latency
- Agent search layers when paired with external embedders
Key features
- SPFresh centroid ANN index optimized for object storage roundtrips
- Native metadata filtering integrated into vector queries (not pre/post-filter only)
- Hybrid vector + BM25 full-text with branching namespaces
- Strongly consistent writes via WAL on object storage
- Multi-tenant, single-tenant, or BYOC deployment options
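The hybrid vector + BM25 retrieval listed above needs some way to merge two ranked result lists. A minimal sketch of one common technique, reciprocal rank fusion (RRF), follows. It is an illustration only: the function name, the document IDs, and the constant `k=60` are assumptions, and nothing here is turbopuffer's API or a statement of how turbopuffer itself combines results.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranked list.

    Each document scores 1 / (k + rank) per list it appears in; k=60 is a
    conventional default, assumed here rather than taken from any docs.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from a vector query and a BM25 query.
vector_hits = ["doc3", "doc1", "doc7"]
bm25_hits = ["doc1", "doc9", "doc3"]

fused = reciprocal_rank_fusion([vector_hits, bm25_hits])
print(fused)
```

Documents that rank well in both lists (here `doc1` and `doc3`) rise above documents that appear in only one list, which is the usual motivation for fusing lexical and semantic results.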
Who Is It For?
- Teams optimizing search cost at billion-document scale
- Engineers building RAG with strong filter requirements on vectors
- Startups wanting managed search without operating HNSW clusters
Frequently Asked Questions
- Is turbopuffer only a vector database?
- The docs position it as a search engine supporting vector, full-text, regex, and filtered queries, not as a vector-only database.
- What are the latency tradeoffs?
- The architecture docs cite ~165 ms p50 write latency for 500 kB upserts. Cold queries against uncached namespaces take on the order of hundreds of ms, while cached queries can reach the low tens of ms.
- Which index underpins vector search?
- turbopuffer.com/docs/architecture documents SPFresh centroid-based ANN rather than graph indexes like HNSW.
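To make the centroid-based idea concrete, here is a toy sketch of a centroid (cluster) index: vectors are grouped under their nearest centroid, and a query scans only the groups of the closest centroids instead of walking a graph. This is a simplified illustration under assumed names (`build_index`, `search`, `n_probe`) with hand-picked centroids. It is not SPFresh and not turbopuffer's implementation.

```python
import math


def build_index(vectors, centroids):
    """Assign each vector ID to its nearest centroid (one list per centroid)."""
    postings = {i: [] for i in range(len(centroids))}
    for vec_id, vec in vectors.items():
        nearest = min(range(len(centroids)),
                      key=lambda i: math.dist(vec, centroids[i]))
        postings[nearest].append(vec_id)
    return postings


def search(query, vectors, centroids, postings, top_k=2, n_probe=1):
    """Scan only the n_probe clusters whose centroids are closest to the query."""
    probe = sorted(range(len(centroids)),
                   key=lambda i: math.dist(query, centroids[i]))[:n_probe]
    candidates = [vec_id for c in probe for vec_id in postings[c]]
    # Exact distance ranking, restricted to the probed candidates.
    return sorted(candidates,
                  key=lambda vec_id: math.dist(query, vectors[vec_id]))[:top_k]


# Toy 2-D data; the centroids are chosen by hand for the example.
vectors = {"a": (0.0, 0.1), "b": (0.2, 0.0), "c": (5.0, 5.1), "d": (5.2, 4.9)}
centroids = [(0.0, 0.0), (5.0, 5.0)]

postings = build_index(vectors, centroids)
result = search((0.1, 0.0), vectors, centroids, postings)
print(postings, result)
```

In a layout like this, each cluster can be fetched as one contiguous read, which is one plausible reason a centroid design suits storage with high per-request latency. Raising `n_probe` trades more reads for higher recall.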
Related
Typesense
Typesense documents an open-source search engine at typesense.org/docs for fast typo-tolerant keyword search, faceting, and vector retrieval. Vector search docs at typesense.org/docs/30.2/api/vector-search describe KNN search on imported embeddings or auto-generated embeddings via OpenAI, Google PaLM API, or built-in Hugging Face models in huggingface.co/typesense/models (use the `ts` namespace prefix). Features include semantic search, hybrid search with rank fusion and adjustable `alpha` weighting, similar-document queries by ID, HNSW approximate search with optional `flat_search_cutoff` brute-force mode, and cosine `vector_distance` scoring. Deploy via Typesense Cloud or self-hosted Docker/binaries with REST API and official client libraries.
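As a rough illustration of the `alpha` and `flat_search_cutoff` options mentioned above, the sketch below builds the `vector_query` string that Typesense search requests carry. The helper `build_vector_query`, the field names (`title`, `embedding`, `vec`), and the query text are hypothetical, and the string syntax is reproduced from memory of the vector-search docs, so verify it against the Typesense version in use.

```python
def build_vector_query(field, vector=(), k=None, alpha=None,
                       flat_search_cutoff=None):
    """Build a Typesense-style vector_query string (hypothetical helper).

    An empty vector is written as [] and is meant for fields whose
    embeddings are auto-generated from the query text.
    """
    parts = ["[" + ", ".join(str(v) for v in vector) + "]"]
    if k is not None:
        parts.append(f"k: {k}")
    if alpha is not None:
        parts.append(f"alpha: {alpha}")
    if flat_search_cutoff is not None:
        parts.append(f"flat_search_cutoff: {flat_search_cutoff}")
    return f"{field}:({', '.join(parts)})"


# Hybrid search: keyword match on `title`, semantic match on `embedding`,
# with alpha setting the relative weight of the vector ranking.
hybrid_params = {
    "q": "running shoes",
    "query_by": "title,embedding",
    "vector_query": build_vector_query("embedding", alpha=0.8),
}

# Pure vector search with an explicit query vector and a brute-force cutoff.
knn_query = build_vector_query("vec", vector=[0.1, 0.2], k=100,
                               flat_search_cutoff=20)
print(hybrid_params["vector_query"])
print(knn_query)
```

The resulting dictionary would be passed as search parameters through the REST API or a client library. That call is omitted here because it needs a running server and an existing collection.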
LanceDB
LanceDB documents a multimodal lakehouse for AI at docs.lancedb.com, built on the open-source Lance columnar format for storing vectors, metadata, raw bytes, and embeddings in unified tables. LanceDB OSS is an embedded library with Python, TypeScript, and Rust SDKs for local development; LanceDB Enterprise is a distributed managed lakehouse for search, curation, feature engineering, and training workflows per docs.lancedb.com. Features include vector/semantic search, BM25 full-text search, hybrid search with SQL filters, versioning, and cloud object-store integration (S3, GCS, Azure).
Qdrant
Qdrant documents an AI-native vector search engine at qdrant.tech/documentation for storing, indexing, and querying high-dimensional vectors with optional payloads, supporting dense, sparse, and multi-vector configurations. Official guides cover Docker/Kubernetes self-hosting, Qdrant Cloud on AWS/GCP/Azure, Hybrid Cloud, Private Cloud, and Qdrant Edge for embedded retrieval. Client libraries include Python (`qdrant-client`), JavaScript/TypeScript (`@qdrant/js-client-rest`), Rust, Go, Java, and .NET with REST and gRPC APIs per the API reference at api.qdrant.tech.