Engram Raises $98M to Cut AI Token Costs With Memory Models
The eight-month-old startup promises to slash enterprise AI expenses by up to 100x through organization-specific context retention.

An AI startup founded less than a year ago has raised $98 million to address one of enterprise technology's newest pain points: runaway costs from generative AI deployments.
Engram announced the funding round Tuesday, backed by General Catalyst, Kleiner Perkins, and Sequoia Capital. OpenAI co-founder Andrej Karpathy, who recently joined Anthropic, also participated as an investor.
The company positions itself as a "learned memory" layer for AI systems, building models that retain organization-specific workflows and context. This approach allows the technology to anticipate questions and deliver responses using significantly fewer tokens, the units of text a model processes and the basis on which AI usage is typically billed.
According to Engram, its models can match or exceed the performance of models from frontier labs while consuming up to 100 times fewer tokens. The claim arrives as newer, more sophisticated AI models prove increasingly expensive to run, challenging earlier assumptions that scale would drive costs downward.
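To make the arithmetic behind that claim concrete, here is a minimal sketch of how token counts translate into per-query cost. Every number in it is a hypothetical chosen for illustration: the price per million tokens, the baseline token count, and the assumption that both systems charge the same per-token price are not figures from Engram or any provider. Only the 100x factor comes from the company's stated upper bound.

```python
def query_cost(tokens: int, price_per_million_tokens: float) -> float:
    """Dollar cost of processing `tokens` tokens at a flat per-token price."""
    return tokens / 1_000_000 * price_per_million_tokens


# Hypothetical figures, for illustration only (not from Engram or any vendor).
PRICE_PER_MILLION = 10.0   # assumed price in dollars per million tokens
BASELINE_TOKENS = 50_000   # assumed tokens consumed per query today
REDUCTION_FACTOR = 100     # Engram's claimed upper bound ("up to 100 times fewer")

baseline_cost = query_cost(BASELINE_TOKENS, PRICE_PER_MILLION)
reduced_cost = query_cost(BASELINE_TOKENS // REDUCTION_FACTOR, PRICE_PER_MILLION)

print(f"baseline: ${baseline_cost:.4f} per query")
print(f"reduced:  ${reduced_cost:.4f} per query")
```

Under these assumptions, cost falls in direct proportion to token count. In practice the saving would also depend on what the specialized model itself costs to run, which the announcement does not address.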
Why it matters
As enterprises move from AI experimentation to production deployment, token costs are emerging as a major budget concern. Companies that embedded AI features throughout their operations now face mounting bills that weren't anticipated in initial pilots. Engram's approach—trading general capability for specialized efficiency—represents a potential path for organizations seeking to control these expenses without abandoning AI initiatives entirely.
Early traction with enterprise clients
Despite its youth, the 13-person company has already signed clients including Microsoft, Notion, and legal AI startup Harvey. Leigh Marie Braswell, a partner at Kleiner Perkins, described the value proposition as mapping organizational knowledge to deliver "orders of magnitude cheaper output" amid exploding data volumes and costs.
Co-founder and CEO Dan Biderman developed the concept during his time at Stanford University's AI lab, where he identified what he calls the "genius stranger model"—AI systems that demonstrate intelligence but lack persistent memory. His background includes a PhD in computational neuroscience from Columbia University, where he studied how memory traces form in the brain.
Biderman acknowledges that Engram's models aren't universally superior to those from OpenAI or Anthropic. Instead, they excel at specialization, sometimes sacrificing broader capabilities for stronger performance in specific domains.
The memory architecture challenge
The startup plans to deploy its new capital toward compute resources and hiring. Its core technical challenge involves building what Biderman describes as "a layer of intuition that humans have, and current models don't"—moving beyond simple note-taking to genuine contextual understanding.
The funding comes as corporate technology leaders begin scrutinizing AI spending more carefully after an initial period of relatively unconstrained developer experimentation. That shift creates an opening for infrastructure companies that can demonstrate concrete cost savings.
Details of the funding and company background were first reported by CNBC.
This is an original analysis by the Omega editorial team. Source reporting: AI Watch.