Elastic runs ElasticGPT and AgentEngine on Elasticsearch as sole data backend

5/26/2026, 1:37:50 AM

Elastic announced that ElasticGPT and AgentEngine run with the Elasticsearch Platform as the sole data backend. ElasticGPT has served more than 2,000 users across over 125,000 chats and roughly 400,000 interactions. The company replaced Redis, external vector databases and Postgres-style sidecars with a single engine that handles memory, retrieval and state, a move intended to simplify operations and reduce integration failure modes. AgentEngine's backend implements four distinct memory functions on top of Elasticsearch rather than relying on separate systems: episodic, semantic and procedural memory, plus persisted workflow state, so each capability is managed inside the indexed engine instead of in ad-hoc sidecars.

Episodic memory stores session-scoped conversation history as a data stream with index lifecycle management (ILM): recent turns remain on fast storage, content is compressed after 30 days, and records auto-delete after 90 days. Semantic memory accumulates long-term facts that are scored by importance and consolidated using locally computed embeddings. Procedural memory records tool usage and outcomes, while workflow state persists serialized execution snapshots produced by PydanticAI's graph API.
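The retention schedule described above could be expressed as an ILM policy along the following lines. This is a minimal sketch, not Elastic's actual configuration: the function name, the rollover thresholds, and the choice of a force-merge with the `best_compression` codec as the compression step are all assumptions, since the announcement only gives the 30-day and 90-day marks.

```python
import json


def episodic_ilm_policy(compress_after="30d", delete_after="90d"):
    """Build an ILM policy body for an episodic-memory data stream.

    Hypothetical helper: only the 30d/90d ages come from the article.
    """
    return {
        "policy": {
            "phases": {
                # Recent turns stay in the hot phase on fast storage.
                # Rollover thresholds here are illustrative assumptions.
                "hot": {
                    "actions": {
                        "rollover": {
                            "max_age": "1d",
                            "max_primary_shard_size": "10gb",
                        }
                    }
                },
                # Assumed compression mechanism: force-merge each backing
                # index to one segment using the best_compression codec.
                "warm": {
                    "min_age": compress_after,
                    "actions": {
                        "forcemerge": {
                            "max_num_segments": 1,
                            "index_codec": "best_compression",
                        }
                    },
                },
                # Records are removed automatically once they age out.
                "delete": {
                    "min_age": delete_after,
                    "actions": {"delete": {}},
                },
            }
        }
    }


policy = episodic_ilm_policy()
print(json.dumps(policy, indent=2))
```

A body like this would be registered with `PUT _ilm/policy/<name>` and referenced from the data stream's index template. When rollover is used, the `min_age` of later phases is measured from the rollover time of each backing index, so the effective ages are slightly longer than the raw 30 and 90 days.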

Engineering effort went mostly into retrieval and memory infrastructure rather than model selection. The team reports that choosing a model took about a week, while designing the data platform's retrieval and memory layers took months, a sign that the hardest problems in production AI are data and state management rather than inference tuning. Semantic consolidation runs as a background process that merges related facts, and episodic streams with ILM remove the need for cron jobs or manual retention scripting.
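To make the idea of consolidation concrete, here is a minimal sketch of merging related facts by embedding similarity. The announcement does not describe the actual algorithm, so everything here is an assumption: the `Fact` record, the greedy grouping, the 0.9 cosine threshold, and the rule that the most important fact in a group survives as its representative.

```python
import math
from dataclasses import dataclass, replace


@dataclass
class Fact:
    text: str
    importance: float
    embedding: list[float]
    merged: int = 1  # how many raw facts this record represents


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def consolidate(facts, threshold=0.9):
    """Greedy consolidation: visit facts from most to least important and
    fold each one into the first kept fact it closely resembles."""
    kept = []
    for fact in sorted(facts, key=lambda f: f.importance, reverse=True):
        for rep in kept:
            if cosine(fact.embedding, rep.embedding) >= threshold:
                rep.merged += 1  # absorbed by a more important fact
                break
        else:
            kept.append(replace(fact))  # copy so the input is not mutated
    return kept


facts = [
    Fact("User prefers Python", 0.8, [1.0, 0.0]),
    Fact("User likes Python best", 0.6, [0.99, 0.05]),
    Fact("User is based in Berlin", 0.5, [0.0, 1.0]),
]
result = consolidate(facts)
for fact in result:
    print(fact.text, fact.importance, fact.merged)
```

In this toy input the two Python-related facts collapse into one record and the location fact stays separate. A production version would more plausibly find merge candidates with a vector search rather than comparing every pair in memory.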

The company frames these features as direct replacements for common sidecars: episodic memory instead of a session cache like Redis; semantic memory instead of hosted vector stores; procedural memory instead of ad-hoc analytics or logging stores; and serialized workflow state instead of relational or NoSQL state backends such as Postgres or DynamoDB. Elastic says the unified approach reduces integration seams where context can go stale.

Enterprise AI projects typically stitch together vector databases for embeddings, document stores for context, caches for sessions and time-series or analytics stores for telemetry, which means multiple vendor contracts and a heavier operational burden. By consolidating search, embeddings, session history and state in one indexed engine, Elastic argues teams can lower total cost of ownership, simplify pipelines that convert company data into AI-ready formats and accelerate time-to-market.

For builders, the practical outcomes are tangible: a single-engine stack that remembers context across sessions, merges exact-term and semantic search results, resumes interrupted workflows from persisted graphs and lets agents learn from past tool chains through structured procedural records. Elastic presents this architecture as a strategic bet that data infrastructure, not model choice, is the dominant engineering challenge in production AI.
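One common way to merge exact-term and semantic result lists is reciprocal rank fusion (RRF). The announcement does not say which fusion method ElasticGPT or AgentEngine uses, so the sketch below is an illustration of the general technique under that assumption; the function name, the document IDs and the constant `k=60` are hypothetical choices.

```python
def rrf_merge(rankings, k=60, top_n=None):
    """Fuse several ranked lists of document IDs with reciprocal rank fusion.

    Each document scores 1 / (k + rank) for every list it appears in,
    with rank starting at 1; documents are returned by descending score.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Break score ties by document ID so the output is deterministic.
    merged = sorted(scores, key=lambda d: (-scores[d], d))
    return merged if top_n is None else merged[:top_n]


# Hypothetical result lists from a keyword query and a vector query.
lexical = ["doc-a", "doc-b", "doc-c"]
semantic = ["doc-c", "doc-a", "doc-d"]

print(rrf_merge([lexical, semantic]))
# → ['doc-a', 'doc-c', 'doc-b', 'doc-d']
```

Documents that appear in both lists rise to the top, since rank-based fusion needs no calibration between keyword scores and vector similarities.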

Sources

Elastic AI · 5/19/2026
