December 10, 2025 edition

ZeroEntropy

Enterprise RAG that actually works

ZeroEntropy Thinks Enterprise Search Is Broken, and the Fix Isn't Another Vector Database

Developer Tools · Enterprise · AI · Search

The Macro: Enterprise Search Has Been Bad for a Long Time

Here’s a number that should embarrass every enterprise software company: knowledge workers spend roughly 20% of their time looking for information they already have. That stat has been bouncing around since the McKinsey days, and it hasn’t improved much. If anything, it’s gotten worse. Companies have more documents, more systems, more internal wikis, more Confluence pages that nobody reads, and search across all of it is still terrible.

The RAG (retrieval-augmented generation) wave was supposed to fix this. The pitch: plug your documents into a vector database, let an LLM answer questions about them, and suddenly your enterprise knowledge is unlocked. Clean story. The reality is messier. Most RAG implementations fall apart on real enterprise documents. PDFs with tables. Scanned contracts. Engineering specs with nested diagrams. The kind of messy, formatting-heavy content that actually matters in a business context.
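The standard pitch can be sketched in a few lines. This is a toy illustration, not anyone's production system: the bag-of-words `embed` function stands in for a real embedding model, and the three hard-coded `documents` stand in for a vector database.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy stand-in for a real embedding model: a bag-of-words vector.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

documents = [
    "Our FAQ explains how to reset your password and update billing info.",
    "The 2019 master services agreement caps liability at twelve months of fees.",
    "Quarterly engineering update: the new ingestion pipeline shipped on time.",
]

# "Indexing" is just embedding every document up front.
index = [(doc, embed(doc)) for doc in documents]

def retrieve(query: str, k: int = 1) -> list[str]:
    # Rank documents by similarity to the query embedding.
    q = embed(query)
    ranked = sorted(index, key=lambda pair: cosine(q, pair[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]

# The retrieved chunk would then be stuffed into an LLM prompt as context.
print(retrieve("what is the liability cap in the 2019 contract?"))
```

On clean, self-describing text like this, similarity ranking surfaces the right chunk. The failure mode the article describes shows up when the answer lives in a table cell of a scanned PDF that embeds poorly, not in a tidy sentence.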

Vector search works great for blog posts and FAQ pages. It works poorly for the 400-page regulatory filing your compliance team needs to query. Semantic similarity is a useful trick, but it’s a blunt instrument when the query is “what was the liability cap in amendment 3 of the 2019 contract with Acme Corp.” That’s not a vibes question. That’s a precision question, and most RAG systems return garbage for it.

The competitive field is crowded but weak. Elastic has been trying to bolt AI onto its search engine. Pinecone and Weaviate sell the vector database layer. Glean built a decent enterprise search product but charges enterprise prices for enterprise deployment timelines. Algolia is fast but shallow. None of them have cracked the accuracy problem on complex documents, and that gap is where ZeroEntropy is planting its flag.

The Micro: Two Math Nerds With a Search API

ZeroEntropy is building a search API for unstructured enterprise data. The core claim: sub-second latency with 99.5% accuracy on complex documents. If that number holds up under real enterprise workloads, it’s a genuinely meaningful technical achievement. Most RAG systems are happy to claim 80% accuracy and hope nobody checks too carefully.

The approach is different from standard vector search. Instead of embedding documents into a vector space and hoping semantic similarity captures the right answer, ZeroEntropy uses AI agents to decompose complex retrieval queries into smaller, solvable pieces. It’s more expensive computationally, but the accuracy payoff is apparently dramatic. Their bet is that enterprises will pay more per query for answers that are actually correct.
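ZeroEntropy hasn’t published its internals, so what follows is an illustrative sketch of the general idea, not their implementation: a compound query like the liability-cap question above gets broken into a sequence of constraint-resolution steps, each of which is easy to answer exactly, instead of one fuzzy similarity lookup. The `Doc` schema, `STORE`, and hard-coded plan are all hypothetical.

```python
# Illustrative sketch only: the decomposition rules and document
# store here are hypothetical, not ZeroEntropy's actual pipeline.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Doc:
    party: str
    year: int
    amendment: Optional[int]
    text: str

STORE = [
    Doc("Acme Corp", 2019, None, "Master agreement. Liability cap: $1M."),
    Doc("Acme Corp", 2019, 3, "Amendment 3. Liability cap raised to $5M."),
    Doc("Acme Corp", 2021, 1, "Amendment 1. Payment terms net-45."),
]

def decompose(party: str, year: int, amendment: int):
    # An agent would plan these steps from the natural-language query;
    # here the plan is hard-coded for clarity.
    return [
        ("filter_party", party),
        ("filter_year", year),
        ("filter_amendment", amendment),
    ]

def execute(plan) -> list[Doc]:
    # Each step narrows the candidate set with an exact constraint,
    # rather than ranking everything by embedding similarity.
    candidates = STORE
    for step, arg in plan:
        if step == "filter_party":
            candidates = [d for d in candidates if d.party == arg]
        elif step == "filter_year":
            candidates = [d for d in candidates if d.year == arg]
        elif step == "filter_amendment":
            candidates = [d for d in candidates if d.amendment == arg]
    return candidates

# "What was the liability cap in amendment 3 of the 2019 Acme contract?"
hits = execute(decompose("Acme Corp", 2019, 3))
print(hits[0].text)
```

The point of the sketch: a pure similarity search over these three documents could plausibly return the master agreement (it also mentions a liability cap), while the stepwise version is forced to honor "amendment 3" and "2019" as hard constraints. The compute cost is the extra planning and filtering work per query.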

Ghita Houir Alami is the CEO and co-founder. She has two master’s degrees in applied mathematics, one from École Polytechnique and one from UC Berkeley. Nicholas Pipitone is the CTO. His background is in theoretical mathematics and computer science at CMU, where he dropped out to build things instead of proving theorems about them. Both have spent time building AI systems at earlier startups. They’re a two-person team in San Francisco, part of YC’s Winter 2025 batch.

The company raised $4.2 million in mid-2025, which is a healthy seed for a two-person infrastructure startup. The product is live and available as an API. The positioning is clear: they’re selling accuracy, not features. “High accuracy search API over unstructured data” is the kind of tagline that either resonates immediately with your buyer or doesn’t. There’s no consumer story to tell here. This is plumbing for enterprises that need to search their own mess.

The pricing model isn’t publicly listed, which usually means enterprise sales with custom contracts. That’s fine for this market. Nobody buying enterprise search infrastructure is comparison-shopping on a pricing page.

The Verdict

I think ZeroEntropy is solving a real problem that most RAG companies are hand-waving past. The accuracy gap between “works on demos” and “works on our actual documents” is where most enterprise AI deployments die, and going after that gap directly is a smart positioning choice.

The risk is the same one every infrastructure startup faces: can a two-person team sell to enterprises? Enterprise sales cycles are long, procurement is painful, and the buyers who care most about accuracy are also the ones with the most security requirements and compliance hoops. The $4.2 million gives them runway, but the clock is ticking on proving enterprise traction before the bigger players close the accuracy gap themselves.

In 30 days, I’d want to see how many paying customers they have and what their average contract size looks like. In 60 days, I’d want to understand their document type coverage. Does the accuracy hold on scanned PDFs? On spreadsheets embedded in Word docs? On the truly cursed document formats that enterprises actually use? In 90 days, the question is whether the “agents for retrieval” approach scales economically or whether the compute costs make the unit economics ugly at volume. The technical thesis is strong. The go-to-market is the hard part.