
The AI Model Built for What LLMs Can't Do

TL;DR

  • Logical Intelligence is betting on energy-based models for ‘correctness,’ not just generation — CEO Eve says LLMs are already being pushed into codegen, chip design, and other mission-critical systems, but the market still lacks deterministic, verifiable AI for software and hardware.

  • Her core critique of LLMs is architectural: they guess token by token and can’t inspect or verify themselves mid-process — she contrasts that with EBMs, which are non-autoregressive, token-free, and designed to support both internal self-alignment and external verification tools like Lean 4.

  • Eve’s main metaphor is that LLMs navigate like someone with tunnel vision, while EBMs navigate with a bird’s-eye map — an autoregressive model picks one step at a time and can ‘fall into a hole,’ whereas an energy-based model can evaluate multiple routes and avoid dead ends.

  • She frames EBMs as better suited for non-language problems like spatial reasoning, data analysis, and applied engineering — her argument is that tasks like driving a car, distributing power on a grid, or modeling a person moving through an apartment should not be forced through a language-token interface.

  • The company’s model, Kona, is an ‘energy-based reasoning model with latent variables’ meant to learn the rules behind data, not just the surface patterns — Eve describes latent variables as a knowledge store in energy-landscape form that captures the world-model underlying observations.

  • Despite the LLM boom, she thinks progress is becoming incremental rather than another phase change, especially for B2B use cases — she says industries like banking, trading, drug discovery, and grid management still use humans for critical data analysis, and notes that some big model companies are already building EBMs internally.

The Breakdown

Why correctness matters more than flashy demos

Eve opens by positioning Logical Intelligence as a foundational AI company working on both LLMs and EBMs, but with a very specific obsession: correctness in software and hardware. Her argument is blunt: people are already putting AI into mission-critical systems, yet too few ask whether the outputs are actually correct, deterministic, or verifiable.

The airplane example that reframes hallucinations

When Dan asks why correctness matters if something “works,” Eve immediately reaches for visceral examples. A hallucinating AI in a car might be laughed off at a 20% failure rate; the same failure rate in a plane absolutely isn’t funny. Her point is that AI in high-stakes systems feels inevitable over the next decade, so reliability can’t stay an afterthought.

What an EBM is, in plain English and in physics

Eve defines EBM as an energy-based model and traces the term back to physics: write down the energy of a system, minimize it, and the system’s behavior falls out. She translates that into AI with a memorable image: if Dan comes home exhausted after “thousands of podcasts,” the lowest-energy state is probably him collapsing onto the couch with a drink, and that low point becomes the model’s most probable outcome.
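Her physics framing can be sketched in a few lines of code: define an energy over possible states, roll downhill, and read the minimum as the most probable outcome (since an EBM treats probability as proportional to exp(−E)). The quadratic energy below is invented purely for illustration, not anything from the episode:

```python
import math

# Toy 1-D energy landscape: lowest energy near x = 2.0 (the "couch").
# In an energy-based model, p(x) ∝ exp(-E(x)), so the energy minimum
# is the most probable state of the system.
def energy(x: float) -> float:
    return (x - 2.0) ** 2

def grad_energy(x: float) -> float:
    return 2.0 * (x - 2.0)

def minimize(x: float, lr: float = 0.1, steps: int = 200) -> float:
    # Gradient descent on the landscape: the system "rolls downhill"
    # toward its most probable configuration.
    for _ in range(steps):
        x -= lr * grad_energy(x)
    return x

x_star = minimize(x=-5.0)
# Unnormalized probabilities: the minimum dominates a point far uphill.
p_min = math.exp(-energy(x_star))
p_far = math.exp(-energy(5.0))
```

The "collapse onto the couch" intuition is exactly this: exhausted-Dan's state space has one deep minimum, and the dynamics find it.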

Energy landscapes versus token guessing

This is the big conceptual split of the episode. Eve says EBMs map data directly into an energy landscape with high points for unlikely states and low points for likely ones, while LLMs are forced to predict one token at a time. That matters because, in her view, a lot of real intelligence—walking through your house, driving, visual-spatial reasoning—has nothing to do with language, yet LLMs force those tasks through language anyway.
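The split can be made concrete with a toy decision. An autoregressive model commits to one token at a time; an energy-based view scores complete candidate states against each other and takes the global minimum. The candidates and energy values below are made up for illustration:

```python
import math

# Invented energies over whole candidate states: low = likely,
# high = unlikely. An EBM compares complete options on the landscape
# rather than extending a token prefix greedily.
candidates = {
    "walk to the couch": 0.3,       # low energy: plausible state
    "walk to the kitchen": 0.9,
    "walk through the wall": 8.0,   # high energy: physically absurd
}

def probs(energies: dict) -> dict:
    # p(x) ∝ exp(-E(x)), normalized: low-energy states dominate.
    z = sum(math.exp(-e) for e in energies.values())
    return {k: math.exp(-e) / z for k, e in energies.items()}

best = min(candidates, key=candidates.get)
dist = probs(candidates)
```

Note there is no token vocabulary anywhere in this picture; the states could just as well be driving maneuvers or grid configurations.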

Latent variables as stored knowledge about the world

Once Dan pushes on what “understanding” means, Eve says the key addition is latent variables. She describes them less as explicit symbolic rules and more as a stored world-model in another energy landscape: knowledge about couches being for sitting, kitchens being for cooking, and how to generalize when the environment changes. That, she says, is what lets the model infer beyond raw pattern matching.
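One way to sketch the latent-variable idea: couple each observation x to a hidden world state z through a joint energy E(x, z), so that inference means asking which stored explanation best accounts for what you see. The quadratic energy and its prior term below are assumptions chosen so the argmin has a closed form:

```python
# Toy latent-variable energy: E(x, z) ties an observation x to a
# hidden explanation z. The 0.1 * z**2 term plays the role of stored
# prior knowledge; good explanations make the residual (x - z) small.
def energy(x: float, z: float) -> float:
    return (x - z) ** 2 + 0.1 * z ** 2

def infer_z(x: float) -> float:
    # Inference = minimizing energy over z given x. For this quadratic:
    # dE/dz = -2*(x - z) + 0.2*z = 0  =>  z = x / 1.1
    return x / 1.1

z_hat = infer_z(2.2)
# Sanity check: no other z explains x = 2.2 with lower energy.
e_best = energy(2.2, z_hat)
```

The same shape scales up: in a real model z would be high-dimensional and E learned from data, but "understanding" is still energy minimization over the latent explanation.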

The San Francisco map analogy for why autoregression breaks down

Her sharpest metaphor comes later: an LLM trying to reason is like navigating San Francisco by choosing one direction at a time with tunnel vision. You might hallucinate a wrong turn, walk into a hole in the road, and because you’re autoregressive, you can’t really zoom out and reroute. An EBM, by contrast, keeps the bird’s-eye view and can choose another path before wasting all that compute.

From vibe coding to formally verified software

Dan connects this to his own experience with “vibe coding,” where codebases end up locally correct but globally messy—a patchwork of hot fixes instead of one coherent design. Eve says this is exactly where they want to help: moving from vibe coding in Python or C++ toward coding in natural language, backed by formal verification that checks whether new logic is compatible with the old system and whether constraints are actually obeyed.
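To make "formal verification" concrete: instead of testing a few inputs, a proof assistant like Lean 4 (which Eve names as an external verification tool) establishes a property for all inputs. The definition and theorem below are a minimal illustrative sketch, not anything from Logical Intelligence's stack:

```lean
-- A toy Lean 4 example of what "formally verified" means:
-- we prove doubling always yields an even number, for every Nat,
-- rather than spot-checking a handful of cases.
def double (n : Nat) : Nat := n + n

theorem double_even (n : Nat) : double n % 2 = 0 := by
  unfold double
  omega
```

This is the difference between a vibe-coded hot fix that passes today's tests and a constraint that is provably obeyed by the whole system.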

Why the industry keeps funding LLMs anyway

In the final stretch, Eve argues that LLM investment persists partly because the ecosystem is already built: data centers, hardware, portfolio companies, and billions already committed. She thinks LLM progress is still real but increasingly incremental, while industries like banking, trading, drug discovery, and grid management still don’t trust frontier models with critical data analysis. Her strategic pitch is not “replace LLMs,” but become a compatible layer under them—and she notes that some big AI companies already have EBM efforts in-house.