The NL-2-SQL Agent Trap: Why LLMs Need an Ontology Layer to Stop Hallucinating Your Database

Hi there,

Here's the problem nobody tells you about NL-to-SQL agents: LLMs are remarkably good at writing SQL that runs, and remarkably bad at writing SQL that means what you asked for. Today's post is about the ontology layer that fixes that — and why Google's BigQuery + Gemini architecture had to build one to make NL2SQL actually work in enterprise data environments.

🔥 Featured Post

The NL-2-SQL Agent Trap: Why LLMs Need an Ontology Layer to Stop Hallucinating Your Database

LLMs generate SQL that is syntactically valid but semantically wrong (joining the wrong tables, filtering on the wrong column), and those errors are nearly impossible to catch without an ontology layer that validates intent against the schema
Google's BigQuery + Gemini NL2SQL pipeline uses embedding-based schema retrieval + a semantic/ontology layer to ground generation before the prompt even reaches the model
The five failure modes of naive NL2SQL (schema explosion, term ambiguity, dialect drift, aggregation mismatch, and join hallucination) and what an ontology layer does about each
The full agent loop: query decomposition → ontology lookup → schema linking → prompted SQL generation → self-correction → BigQuery execution → result interpretation
What to build today: a lightweight business glossary ontology is 80% of the value, and you don't need OWL/RDF to get there
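
To make that last point concrete, here is a minimal Python sketch of a business glossary used as a lightweight ontology. Everything in it is an illustrative assumption, not something taken from the post or from BigQuery or Gemini: the `GlossaryTerm` class, the `lookup_terms` and `build_grounding` helpers, and the `orders` and `customers` tables are all hypothetical. It also uses plain substring matching where a real pipeline would more likely use embedding-based retrieval.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class GlossaryTerm:
    """One business term mapped to the table and SQL expression that defines it."""
    name: str
    table: str
    expression: str
    synonyms: Tuple[str, ...] = ()


# Hypothetical glossary entries; table and column names are made up for illustration.
GLOSSARY = [
    GlossaryTerm("revenue", "orders", "SUM(orders.net_amount)", ("sales", "turnover")),
    GlossaryTerm("active customer", "customers", "customers.status = 'active'",
                 ("current customer",)),
]


def lookup_terms(question: str, glossary: List[GlossaryTerm] = GLOSSARY) -> List[GlossaryTerm]:
    """Return glossary terms whose name or synonym appears in the question."""
    q = question.lower()
    return [
        term for term in glossary
        if any(alias in q for alias in (term.name, *term.synonyms))
    ]


def build_grounding(question: str, glossary: List[GlossaryTerm] = GLOSSARY) -> str:
    """Render matched terms as text that could be prepended to an SQL-generation prompt."""
    hits = lookup_terms(question, glossary)
    return "\n".join(
        f"- {t.name}: {t.expression} (table: {t.table})" for t in hits
    )
```

Under these assumptions, `build_grounding("What were total sales last quarter?")` returns one line that maps "revenue" to `SUM(orders.net_amount)` on the `orders` table. The model is then told what "sales" means before it writes any SQL, instead of guessing from column names.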

Read the full post →

📚 In Case You Missed It

When Your AI Vendor Becomes Your Systems Integrator: The Enterprise Architecture Reckoning Behind the OpenAI-Anthropic PE Playbook — OpenAI's $10B 'Deployment Company' and Anthropic's $1.5B Blackstone-Goldman venture both launched May 4 with the same playbook (embed engineers, redesign workflows, lock in the model), and existing enterprise AI governance frameworks were not designed for a world where your model vendor IS your systems integrator.

Cerebras Files for $26.6B IPO With OpenAI as 86% of the Backlog: The Wafer-Scale Tier Just Became an Architecture Decision — Cerebras filed for a $3.5B IPO at a $26.6B valuation on May 4, with OpenAI's 750 MW Master Relationship Agreement and a $1B circular loan baked into the S-1, making wafer-scale inference a real architectural tier and a real concentration risk that enterprise AI teams now have to model into their LLM gateway, latency budget, and vendor strategy.

The 5% Problem: What Datadog's 2026 AI Engineering Data Says About the Production Reliability Crisis Nobody Is Talking About — Datadog's 2026 AI Engineering report found 5% of LLM calls fail in production — 60% from rate limits, not model quality — while 69% of orgs now use 3+ models with frameworks doubling year-over-year, creating a compounding reliability crisis that most enterprise AI teams haven't instrumented for yet.

More posts dropping every day. Stay curious.

— Bhanu @ superml.dev