AI Tool

turbopuffer

Object-storage-native search engine for vector, full-text, and hybrid retrieval at petabyte scale

turbopuffer is a search engine, documented at turbopuffer.com/docs, built natively on object storage (S3, GCS, Azure Blob) with NVMe and memory caching in the compute layer. The architecture docs describe a write-ahead log on object storage, SPFresh centroid-based ANN indexes for vectors, inverted BM25 indexes for full-text search, exact metadata indexes with native filtering, and copy-on-write namespace branching. Per turbopuffer.com/docs/index and turbopuffer.com/docs/architecture, the API supports vector ANN queries, BM25 full-text queries, hybrid multi-queries, regex/trigram search, filters, and encryption with customer keys. The docs cite limits observed in production (4T+ documents, 10M+ writes/s, 25k+ queries/s) and name two tradeoffs: higher write latency, because writes are made durable on object storage, and occasional cold queries on uncached namespaces.

Category: Developer Tools
Pricing: Usage-based (see turbopuffer.com/pricing)
Platforms: Cloud / API / Python / Rust
Tags: vector-search / full-text-search / hybrid-search

Use cases

  • First-stage retrieval narrowing millions of documents for RAG pipelines
  • Cost-sensitive vector search where object storage economics beat in-memory-only DBs
  • Hybrid lexical + semantic search with high-recall filtered ANN
  • Petabyte-scale namespaces with pinned hot sets for low latency
  • Agent search layers when paired with external embedders

Key features

  • SPFresh centroid ANN index optimized for object storage roundtrips
  • Native metadata filtering integrated into vector queries (not pre/post-filter only)
  • Hybrid vector + BM25 full-text with branching namespaces
  • Strongly consistent writes via WAL on object storage
  • Multi-tenant, single-tenant, or BYOC deployment options
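
As a rough illustration of the hybrid vector + BM25 feature above, one common way to merge a vector result list with a full-text result list is reciprocal rank fusion (RRF). The sketch below is plain Python and does not call the turbopuffer API; the function name, the constant k=60, and the sample document IDs are assumptions made for illustration, not something taken from the turbopuffer docs.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one list.

    Each document scores 1 / (k + rank) for every list it appears in,
    where rank starts at 1. k=60 is a commonly used default, assumed here.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical result lists, best match first.
vector_hits = ["doc3", "doc1", "doc7"]
bm25_hits = ["doc1", "doc9", "doc3"]

fused = reciprocal_rank_fusion([vector_hits, bm25_hits])
print(fused)
```

Documents that rank well in both lists rise to the top, which is why rank-based fusion is a popular default when vector distances and BM25 scores are not directly comparable.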

Who Is It For?

  • Teams optimizing search cost at billion-document scale
  • Engineers building RAG with strong filter requirements on vectors
  • Startups wanting managed search without operating HNSW clusters

Frequently Asked Questions

Is turbopuffer only a vector database?
The docs position it as a search engine that supports vector, full-text, and regex search plus filters, not as a vector-only database.
What are the latency tradeoffs?
The architecture docs cite ~165ms p50 write latency for 500kB upserts and cold (uncached) query latency in the hundreds of ms; cached queries can reach the low tens of ms.
Which index underpins vector search?
turbopuffer.com/docs/architecture documents SPFresh centroid-based ANN rather than graph indexes like HNSW.
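
To make the difference from graph indexes concrete, here is a minimal sketch of the general centroid-based idea: vectors are grouped under their nearest centroid, and a query scans only the clusters whose centroids are closest to it. This is a toy in plain Python, not SPFresh itself and not turbopuffer's implementation; the function names, the n_probe parameter, and the sample data are all assumptions made for illustration.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_index(vectors, centroids):
    """Assign every document ID to the cluster of its nearest centroid."""
    clusters = {i: [] for i in range(len(centroids))}
    for doc_id, vec in vectors.items():
        nearest = min(range(len(centroids)),
                      key=lambda i: sq_dist(vec, centroids[i]))
        clusters[nearest].append(doc_id)
    return clusters


def search(query, vectors, centroids, clusters, top_k=2, n_probe=1):
    """Scan only the n_probe clusters closest to the query, then rank."""
    probe = sorted(range(len(centroids)),
                   key=lambda i: sq_dist(query, centroids[i]))[:n_probe]
    candidates = [d for i in probe for d in clusters[i]]
    return sorted(candidates, key=lambda d: sq_dist(query, vectors[d]))[:top_k]


# Hypothetical 2-D data: two well-separated groups.
centroids = [(0.0, 0.0), (10.0, 10.0)]
vectors = {"a": (0.0, 1.0), "b": (1.0, 0.0), "c": (9.0, 10.0), "d": (10.0, 9.0)}

clusters = build_index(vectors, centroids)
hits = search((0.2, 0.9), vectors, centroids, clusters)
print(hits)
```

Because a query touches a small number of clusters rather than walking a graph node by node, this layout needs few storage roundtrips, which fits the "optimized for object storage roundtrips" point in the feature list.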

Related

3 Indexed items

Typesense

Developer Tools / Open source

Typesense documents an open-source search engine at typesense.org/docs for fast typo-tolerant keyword search, faceting, and vector retrieval. Vector search docs at typesense.org/docs/30.2/api/vector-search describe KNN search on imported embeddings or auto-generated embeddings via OpenAI, Google PaLM API, or built-in Hugging Face models in huggingface.co/typesense/models (use the `ts` namespace prefix). Features include semantic search, hybrid search with rank fusion and adjustable `alpha` weighting, similar-document queries by ID, HNSW approximate search with optional `flat_search_cutoff` brute-force mode, and cosine `vector_distance` scoring. Deploy via Typesense Cloud or self-hosted Docker/binaries with REST API and official client libraries.

LanceDB

Developer Tools / Open source

LanceDB documents a multimodal lakehouse for AI at docs.lancedb.com, built on the open-source Lance columnar format for storing vectors, metadata, raw bytes, and embeddings in unified tables. LanceDB OSS is an embedded library with Python, TypeScript, and Rust SDKs for local development; LanceDB Enterprise is a distributed managed lakehouse for search, curation, feature engineering, and training workflows per docs.lancedb.com. Features include vector/semantic search, BM25 full-text search, hybrid search with SQL filters, versioning, and cloud object-store integration (S3, GCS, Azure).

Qdrant

Developer Tools / Open source

Qdrant documents an AI-native vector search engine at qdrant.tech/documentation for storing, indexing, and querying high-dimensional vectors with optional payloads, supporting dense, sparse, and multi-vector configurations. Official guides cover Docker/Kubernetes self-hosting, Qdrant Cloud on AWS/GCP/Azure, Hybrid Cloud, Private Cloud, and Qdrant Edge for embedded retrieval. Client libraries include Python (`qdrant-client`), JavaScript/TypeScript (`@qdrant/js-client-rest`), Rust, Go, Java, and .NET with REST and gRPC APIs per the API reference at api.qdrant.tech.