As generative AI reshapes industries, one of its most important yet invisible challenges is retrieval: the process of fetching the right data, with relevant context, from messy knowledge bases. Large language models (LLMs) are only as accurate as the information they can retrieve.
That’s where ZeroEntropy wants to make its mark. The San Francisco-based startup, co-founded by CEO Ghita Houir Alami and CTO Nicolas Pipitone, has raised $4.2 million in seed funding to help models retrieve relevant data quickly, accurately, and at scale.
The round was led by Initialized Capital, with participation from Y Combinator, Transpose Platform, 22 Ventures, a16z Scout, and a long list of angels, including operators from OpenAI, Hugging Face, and Front.
ZeroEntropy joins a growing wave of infrastructure companies using retrieval-augmented generation (RAG) to power search for the next generation of AI agents. Competitors range from MongoDB’s VoyageAI to fellow early-stage YC startups like Sid.ai.
“We’ve met a lot of teams building in and around RAG, but Ghita and Nicolas’s models outperform everything we’ve seen,” says Zoe Perret, partner at Initialized Capital. “Retrieval is undeniably a critical unlock in the next frontier of AI, and ZeroEntropy is building it.”
RAG pulls data from external documents and has become a go-to architecture for AI agents, whether that’s a chatbot surfacing HR policies or a legal assistant citing case law.
Yet the ZeroEntropy founders believe that for many AI apps, this layer is fragile: a cobbled-together collection of vector databases, keyword search, and re-ranking models. ZeroEntropy instead offers a single API that manages ingestion, indexing, re-ranking, and evaluation.
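The pipeline described here can be sketched in miniature: ingest documents, index them as vectors, then fetch the closest matches for a query. The toy version below uses term-frequency vectors and cosine similarity in place of learned embeddings; it illustrates the general RAG retrieval pattern, not ZeroEntropy's actual API, and all names and documents in it are invented for illustration.

```python
# Toy retrieval pipeline: ingest -> index -> fetch candidates by similarity.
# A stand-in for the vector-search layer, not ZeroEntropy's implementation.
from collections import Counter
import math

def tokenize(text):
    return text.lower().split()

def tf_vector(text):
    # Term-frequency "embedding": a bag-of-words stand-in for a real model.
    return Counter(tokenize(text))

def cosine(a, b):
    num = sum(a[t] * b[t] for t in set(a) & set(b))
    den = math.sqrt(sum(v * v for v in a.values())) * \
          math.sqrt(sum(v * v for v in b.values()))
    return num / den if den else 0.0

# Ingestion + indexing: precompute a vector per document.
docs = [
    "Employees accrue 15 days of paid vacation per year.",
    "The legal team reviews all vendor contracts.",
    "Vacation requests must be approved by a manager.",
]
index = [(d, tf_vector(d)) for d in docs]

def retrieve(query, k=2):
    # Score every indexed document against the query, best first.
    qv = tf_vector(query)
    scored = [(cosine(qv, dv), d) for d, dv in index]
    scored.sort(reverse=True)
    return [d for score, d in scored[:k] if score > 0]

print(retrieve("how many vacation days do I get?"))  # vacation-policy doc ranks first
```

In a production system the term-frequency vectors would be replaced by learned embeddings and the linear scan by an approximate-nearest-neighbor index, but the ingest/index/retrieve shape stays the same.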
What that means is that, unlike an enterprise employee search product such as Glean, ZeroEntropy is strictly a developer tool. It retrieves data quickly, even across messy internal documents. Houir Alami likens her startup to a “Supabase for search,” referring to the popular open-source platform that automates much of the work of database management.
“Right now, most teams are either stitching together existing tools from the market or dumping their entire knowledge base into an LLM’s context window. The first approach is time-consuming to build and maintain,” Houir Alami said. “The second approach can cause compounding errors. We’re building a developer-first search infrastructure; think of it like a Supabase for search, designed to make deploying accurate, fast retrieval systems simple and efficient.”
At its core is its proprietary re-ranker called ze-rank-1, which the company claims currently outperforms similar models from Cohere and Salesforce on both public and private retrieval benchmarks. It makes sure that when an AI system searches a knowledge base for answers, it surfaces the most relevant information first.
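Where a re-ranker sits in the stack can be shown with a small sketch: a fast first pass fetches candidate documents, then a second, precision-oriented stage reorders them so the best answer comes first. Both scorers below are crude stand-ins invented for illustration (keyword overlap, then length-normalized overlap), not ze-rank-1, which is a learned model.

```python
# Two-stage retrieval: cheap candidate fetch, then re-rank.
# Both scoring functions are toy stand-ins for learned models.

def first_pass(query, docs, k=3):
    # Recall-oriented stage: rank by raw keyword-overlap count.
    q = set(query.lower().split())
    return sorted(docs, key=lambda d: -len(q & set(d.lower().split())))[:k]

def rerank(query, candidates):
    # Precision-oriented stage: normalize overlap by document length,
    # a crude proxy for a model scoring each (query, doc) pair jointly.
    q = set(query.lower().split())
    def score(d):
        terms = d.lower().split()
        return len(q & set(terms)) / len(terms)
    return sorted(candidates, key=score, reverse=True)

docs = [
    "what is the policy on vacations travel expenses and many other "
    "unrelated topics in this very long rambling document which also "
    "mentions a refund once near the end",
    "our refund policy allows refunds within 30 days",
    "shipping times vary by region",
]
candidates = first_pass("what is the refund policy", docs)
# First pass favors the long rambling doc (more raw term matches);
# the re-ranker promotes the short, on-topic one.
print(rerank("what is the refund policy", candidates)[0])
# → "our refund policy allows refunds within 30 days"
```

The point of the example is that the first-pass winner and the re-ranked winner differ: a re-ranker's job is exactly this reordering of an initial candidate set.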