
Vector Memory

Vector memory is agent memory stored as embedding vectors in a vector database and retrieved by semantic similarity. Each piece of memory (a turn, a fact, a chunk of a document) is encoded into an embedding by a chosen model and stored alongside metadata; at recall time the query is embedded with the same model and the closest matches are returned. It is a common backbone for both episodic-style memory (every turn embedded) and semantic-style memory (each extracted fact embedded). It has three main limits: recall quality depends on the embedding model and the chunking strategy, structured updates (such as changing or deleting a single fact) are awkward, and for some query types similarity is not the same as relevance.
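
The store-then-recall loop described above can be sketched in a few lines. This is a minimal in-memory illustration, not a production design: `toy_embed` is a hypothetical stand-in for a real embedding model (it only hashes words into buckets, so it captures vocabulary overlap rather than meaning), and `VectorMemory`, `MemoryItem`, and the metadata keys are names invented for this example.

```python
import hashlib
import math
import re
from dataclasses import dataclass, field

DIM = 256  # toy dimensionality (assumption); real models use hundreds or thousands


def toy_embed(text: str) -> list[float]:
    """Hypothetical stand-in for an embedding model: hashed bag-of-words, L2-normalized."""
    vec = [0.0] * DIM
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        bucket = int(hashlib.md5(token.encode()).hexdigest(), 16) % DIM
        vec[bucket] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


@dataclass
class MemoryItem:
    text: str
    vector: list[float]
    metadata: dict = field(default_factory=dict)


class VectorMemory:
    """Brute-force store: embeds on write, ranks by cosine similarity on read."""

    def __init__(self) -> None:
        self.items: list[MemoryItem] = []

    def add(self, text: str, **metadata) -> None:
        self.items.append(MemoryItem(text, toy_embed(text), metadata))

    def recall(self, query: str, k: int = 3) -> list[tuple[float, MemoryItem]]:
        q = toy_embed(query)
        # Vectors are unit length, so the dot product equals cosine similarity.
        scored = [(sum(a * b for a, b in zip(q, item.vector)), item) for item in self.items]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:k]


memory = VectorMemory()
memory.add("User prefers dark mode in the dashboard", kind="fact")
memory.add("Invoice 1042 was refunded on Monday", kind="turn")
memory.add("The deployment runs on Kubernetes", kind="fact")

hits = memory.recall("refunded invoice", k=2)
for score, item in hits:
    print(f"{score:.3f}  {item.text}  {item.metadata}")
```

A real system would swap `toy_embed` for an actual embedding model and the list scan for a vector database index; the shape of the loop (embed on write, embed the query, rank by similarity) stays the same.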

Example

A customer-support agent stores every closed ticket as a vector memory in Pinecone. On a new ticket, it embeds the ticket description and retrieves the 5 nearest closed tickets. This is useful for "what did we do last time this happened" questions, but it falls short when the right past resolution does not share surface vocabulary with the current ticket. Hybrid search or rerankers usually help close that gap.
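
One common way to implement hybrid search is to run a vector query and a keyword query separately and then merge the two ranked lists. The sketch below uses reciprocal rank fusion for the merge, which is an assumption on our part rather than something the example prescribes. The ticket IDs are made up, and the two input lists stand in for results that would really come from the vector database and a keyword index.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60, top_n=5):
    """Fuse several best-first ranked lists of IDs into one list.

    Each ID earns 1 / (k + rank) from every list it appears in; k=60 is a
    conventional default that dampens the influence of any single top rank.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score (descending), breaking ties by ID for stable output.
    fused = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [doc_id for doc_id, _ in fused[:top_n]]


# Hypothetical closed-ticket IDs, best match first in each list.
vector_hits = ["T-101", "T-205", "T-317", "T-412", "T-550"]   # nearest by embedding
keyword_hits = ["T-317", "T-888", "T-205", "T-640", "T-101"]  # best by keyword match

top_tickets = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(top_tickets)
```

Tickets that appear in both lists (here `T-317`, `T-205`, and `T-101`) rise above tickets found by only one method, while a keyword-only match such as `T-888` can still make the final five. A reranker would be a further step applied to this fused shortlist.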
