New Video! Build a Hotel Voice Agent With Trieve + Vapi! »

· announcements · 4 min read

Nick Khami

Trieve's New Usage-Based Pricing

Pricing models supporting growth

Pricing models supporting growth

Our New Usage-Based Pricing

Trieve is growing. While effective at first, the tier-based pricing strategy was causing pain for our customers and the team. We’ve also expanded our free tier to make it more generous and builder-friendly.

Restrictive and Large Pricing Tiers

Tiering was designed to control concerns about our largest cost centers: Chunks Stored, File Storage, AI Messages, and Datasets. However, the floors, ceilings, and gaps between our tiers proved to be restrictive for our customer base.

As dataset count hockey-sticks, companies attempting to scale especially those using multi-tenant setups like HostAI and Vapi saw friction managing their usage to avoid jumping into the next tier by exceeding chunk, message, or dataset limits.

New Features that original pricing didn’t include

Trieve has grown so fast in the past year that our pricing table wasn’t able to keep up with the full breadth and depth of our product. Users found these features already baked in as we launched them and the tier-based pricing worked well enough to limit on chunk and dataset count.

For example

  • We added a first-party integration to chunkr, and our very own VLM-based pdf2md service.
  • We maintain our own webcrawler called firecrawl-simple for seamless and performant web-scraping driectly into a Trieve search index.
  • We created a search component generator to prototype and create drop-in discovery components.
  • We turbocharged our analytics dashboard and routes, added voice search, and the list goes on.

Pricing Table

Without further ado, our pricing table:

ProductFree TierCost above free tier
UsersFirst 5 Users free$5 / User
Platform Fee$ 25 / mo

Storage Cost

Chunk Storage1000 Chunks (11 MB)$132 / 1M chunks ($12.07 / GB)
File Storage10 GB$0.046 / GB
Datasets2 datasets$0.05 / dataset

Ingestion (Resets at end of month)

Write TokensFirst 3M tokens / mo free$0.028 / 1M tokens
File OCRFirst 100 / mo free$0.01 / Page
Web CrawlingFirst 10 pages / mo free$0.00086 / Page Crawled
Bytes IngestedFirst 1 GB$2 / GB ingested

Per Search charges

Search TokensFirst 3M tokens / mo free$0.028 / 1M tokens
Message TokensFirst 263,000 tokens / mo free$3.528 / 1M tokens
Analytic EventsFirst 1M events / mo free$0.0001 / event

Pricing Breakdown

Platform Fee

To account for the drastic price decreases, we feel a base platform fee for a full all in one solution is now needed. This also gives us flexiblity to expand to multiple tiers in the future for even better bulk cost savings in the future.

Chunk Storage

Chunk storage is the largest cost center. We hold all the vectors, payload indices, and metadata in RAM with 2x replication. Holding all of these in RAM allows us to provide the lowest possible latency for all of our searches.

We measure the size of a single chunk in Trieve to be (1536 * 4 bytes/vector) + (256 * 4 bytes/sparse vector) + (4096 bytes/payload ) = 11,264 bytes.

We host n4-standard-64 machines and use hyperdisk-balanced NVMe SSDs for our vector db storage layer. The cost per GB is $8.65/GB of RAM + 0.2/GB = $8.67/GB. We charge a ~40% markup which comes out to $12.07 / GB.

File Storage

S3 charges $0.023/GB. We charge $0.046/GB which is a 100% upcharge on S3 storage.

Datasets

We charge $0.05/dataset. Every dataset created adds a new index into our vector db’s HNSW index. Each dataset also gets an ingestion queue for files. Crawls and chunks have equal priority.

Search and Write Tokens

OpenAI’s text-embedding-3-small charges $0.02/M tokens. We charge a 40% markup to embed the input so that’s $0.02 * 40% markup = $0.028/M tokens. Both searches and writes are charged the same fee.

File OCR

Past the free first 100 pages, we charge $0.01/page. PDF pages average 800 tokens.

For our pdf2md OCR ingestion using gpt-4o, our raw cost is 800 tokens * (0.00001 LLM cost/token) = $0.008/page. Accounting for our markup, it comes out to $0.01 / page.

Web Crawling

We price at the industry standard level which is around $86/100,000 pages or $0.00086/page.

Bytes Ingested

The first 1GB of bytes ingested are free. Overages at the end of the month are $2/GB. This is to account for networking costs and match other databases’ write charges.

Message Tokens

As a basis, we charge per token. We allow proxying OpenRouter or OpenAI through Trieve. Embedding and search steps are bundled into the cost of an AI message in Trieve.

For each message, we create an embeddding and conduct an LLM inference step. OpenAI’s gtp-4o-mini model charges us $0.6/1M tokens and text-embedding-3-small charges $0.02/1M tokens.

Analytics

Trieve provides analytics for all searches and chat messages automatically. The first 1,000,000 searches and messages a month are free and have a 5 year retention period. Any additional events are $0.0001/event.

Trieve Enterprise

The prices above are for our public cloud service.

We offer dedicated cloud services that come with SLA guarantees and all of the enterprise goodies. If you require a custom quote for a large scale usecase, please contact us. We are happy to provide bulk discount quotes for massive usage.

Back to Blog

Related Posts

View All Posts »

Guide for Self-Hosting Trieve on a VPS

Instructions for self-hosting Trieve on a VPS using docker-compose. You'll be able to set up Trieve on a Hetzner server which comes with semantic and hybrid search, SPLADE fulltext search, re-ranker models, RAG AI Chat, recommendations, and analytics.