February 12, 2026 edition

Cube

AI agent that builds your data model and answers questions

Cube Wants to Be the Semantic Layer AI Actually Uses — Not Just the One You Deploy and Forget


The Macro: AI Analytics Is Confidently Wrong at Scale

The data analytics market is big and getting bigger in ways that are almost comically hard to pin down. Depending on which research firm you ask, it’s either $64 billion or $91 billion today, heading toward somewhere between $302 billion and $785 billion by the mid-2030s. The variance tells you something: this market is moving fast enough that even the people paid to forecast it are doing their own version of hallucinating.

What everyone agrees on is the direction. Analytics is growing, AI is eating it, and the central unsolved problem is trust. Can you actually believe what the AI tells you?

The short answer, right now, is often no.

The reason is structural. Most AI analytics tools sit on top of raw database tables and try to reason about your business from there. They don’t know that “revenue” means net of refunds in your accounting system, or that “active user” means logged in within 30 days in your product team’s view. They infer. Inference on ambiguous business logic produces confidently wrong answers at scale, which is arguably worse than no answer at all.

This is the problem the semantic layer was invented to solve. The idea is to define your business metrics, hierarchies, and logic once, in a governed layer, and let every downstream tool query that instead of raw tables. dbt has been pushing hard here. AtScale, Honeydew, and Promethium are all playing in adjacent parts of the space. Cube, the open-source framework (not to be confused with Cube Group or CUBE Global, unrelated companies that will absolutely make Googling this annoying), has been building semantic layer infrastructure for years. This launch is about what happens when you put an AI agent on top of it.

The Micro: The Semantic Layer Gets a Pilot

Here’s the core product decision that makes Cube interesting: instead of asking AI to figure out your business logic from scratch at query time, which is where hallucinations come from, Cube builds the semantic layer first, automatically, using an AI agent. The agent reads your data sources and constructs the structured definitions of your metrics and dimensions. Then, when you ask a question, the AI queries that governed layer rather than raw tables.
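To make that concrete, here’s roughly what a governed definition looks like in Cube’s open-source YAML modeling format. This is a hand-written illustration, not output from the new agent; the table, column, and measure names are invented, and the launch materials don’t show what the agent actually generates.

```yaml
cubes:
  - name: orders
    sql_table: public.orders   # hypothetical source table

    measures:
      - name: count
        type: count

      - name: net_revenue
        # "Revenue" defined once, net of refunds, so every downstream
        # tool (including the AI) inherits the same definition.
        sql: "amount - COALESCE(refund_amount, 0)"
        type: sum

    dimensions:
      - name: status
        sql: status
        type: string

      - name: created_at
        sql: created_at
        type: time
```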

That’s a meaningful architectural distinction.

The hallucination problem in AI analytics isn’t really a model problem. It’s a context problem. If the model doesn’t have an authoritative definition of what “churn” means in your business, it will approximate one. Building the semantic layer upfront gives the model something real to work from instead of a best guess.
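In practice, that means a question like “what was net revenue by month over the last six months” resolves to a structured query against those governed definitions rather than to freshly generated SQL over raw tables. Using Cube’s documented JSON query format and the hypothetical model sketched above, the query handed to the semantic layer looks roughly like this:

```json
{
  "measures": ["orders.net_revenue"],
  "timeDimensions": [
    {
      "dimension": "orders.created_at",
      "granularity": "month",
      "dateRange": "last 6 months"
    }
  ]
}
```

The model can only reference members that exist in the layer, which is the mechanism behind the anti-hallucination argument: there’s no free-form SQL for it to get creatively wrong.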

Cube is building this on top of its open-source semantic layer framework, which reportedly has over 19,000 GitHub stars. That’s not nothing. It suggests a real developer community and years of production use, which is relevant context for evaluating whether the underlying infrastructure is battle-tested or just vibes.

The product got solid traction on launch day, and there’s a free tier available. That’s the right call for a product targeting data teams who will absolutely want to poke at it before letting it anywhere near production.

What I’d want to know, and what isn’t surfaced in the launch materials: how long the AI agent actually takes to build a semantic layer on a complex schema, how much human review it requires before you’d trust it, and what happens when the agent makes a wrong assumption about your business logic. Those aren’t dealbreakers. They’re expected questions at this stage.

The Verdict

Cube is making a real technical argument, not just an AI-shaped marketing argument. The semantic-layer-first approach is architecturally coherent, and the existing open-source project gives them a foundation that most competitors are still building from scratch. That’s a legitimate advantage.

The question at 30 days is activation.

How fast can a new user get from “connected my data warehouse” to “I trust these answers enough to send them to my boss”? Semantic layer setup has historically been the part where data projects die. If the AI agent genuinely compresses that from weeks to minutes, Cube has something real. If it produces a layer that requires significant manual cleanup, it’s just shifted the work rather than removed it.

At 60 to 90 days, the question becomes whether data teams who already use Cube’s open-source layer adopt the agent workflow, or whether they see it as redundant. That existing community is the most valuable distribution channel Cube has. It’s also the most demanding audience they’ll face.

I think this is probably a strong fit for data teams that are already bought into semantic layer thinking and want to accelerate setup. It’s a harder sell for teams that haven’t done that work yet, because the AI agent is only as good as the business logic it’s given, and someone still has to define what good looks like. The pitch is “zero hallucinations,” which is a high bar to set in writing. If they back it up with specifics (“here’s what we get wrong, here’s how we catch it”), that’s actually more convincing than the claim alone.