Core Concepts
Understand the core concepts of using Cortex
The Cortex SDK is designed to give you the most flexible developer experience for building AI apps and agents. Set up a global memory store for individual users (B2C), or configure organizations as tenants and onboard their users (B2B).
Tenants
Tenants represent isolated knowledge stores for your application. Cortex supports both B2C and B2B use cases out of the box.
- In B2C, each end user operates with their own private memory store inside a single tenant that represents your project.
- In B2B, organizations are modeled as separate tenants, each with its own scoped environment where users, knowledge, memory, and policies are fully encapsulated.
Multi-tenancy ensures data isolation, access control, and efficient retrieval across contexts.
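The isolation model above can be illustrated with a minimal in-memory sketch. The `TenantStore` class, its methods, and the tenant and user IDs are hypothetical names invented for this example; they are not part of the Cortex SDK.

```python
from collections import defaultdict


class TenantStore:
    """Toy in-memory store scoped by tenant and user (illustrative, not the Cortex API)."""

    def __init__(self):
        # tenant_id -> user_id -> list of stored items
        self._data = defaultdict(lambda: defaultdict(list))

    def add(self, tenant_id, user_id, item):
        self._data[tenant_id][user_id].append(item)

    def items_for(self, tenant_id, user_id):
        # Reads only consult the requested tenant and user scope.
        tenant = self._data.get(tenant_id, {})
        return list(tenant.get(user_id, []))


store = TenantStore()
# B2C: one tenant for the whole project, one private store per end user.
store.add("my-project", "alice", "prefers dark mode")
# B2B: each organization is its own tenant.
store.add("acme-corp", "bob", "Q3 sales report")
```

Because every read names both a tenant and a user, a lookup for `alice` under `acme-corp` returns nothing even though `alice` has data in `my-project`.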
Knowledge
Knowledge items are the atoms that power your retrievals. Any context you or your users add is treated as knowledge for the retrieval engine to search and retrieve. This includes documents, chats, images, webpages, presentations, reports, and any other form of unstructured information. Cortex can also digest CSVs.
Each piece of knowledge you add is cleaned, parsed using in-house parsers, chunked, embedded, and prepared for retrieval.
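To make the chunking step concrete, here is a minimal sketch of one common approach: fixed-size word windows with overlap. This is an assumption for illustration only; the source does not describe how Cortex actually chunks content, and `chunk_words` and its parameters are hypothetical.

```python
def chunk_words(text, size=50, overlap=10):
    """Split text into windows of `size` words; consecutive windows share `overlap` words."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    words = text.split()  # also collapses runs of whitespace
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # this window already reached the end of the text
    return chunks
```

Overlap keeps a sentence that straddles a boundary retrievable from either neighboring chunk, at the cost of storing some words twice.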
AI Memory
Memory is where personalization lives. Unlike static knowledge, memory is dynamic and user-specific. Cortex stores evolving insights about users, their preferences, past interactions, and contextual signals.
- Think of it as a long-term cache of the user's profile: not just what they said, but what they meant and what they have done before, including their preferences, intentions, and statements.
- Cortex memories update automatically through conversation, queries, and usage, giving your app the ability to learn, remember, and adapt.
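The difference between static knowledge and evolving memory can be sketched with a small per-user structure. `UserMemory` and `observe` are hypothetical names; how Cortex represents and updates memories internally is not described in this document.

```python
from dataclasses import dataclass, field


@dataclass
class UserMemory:
    """Toy per-user memory: the latest value per key plus a log of observations."""
    facts: dict = field(default_factory=dict)
    history: list = field(default_factory=list)

    def observe(self, key, value):
        # A newer observation replaces the older one, so the profile evolves over time.
        self.facts[key] = value
        self.history.append((key, value))


memory = UserMemory()
memory.observe("language", "Python")
memory.observe("language", "Rust")  # the user's preference changed
```

After both calls, `memory.facts["language"]` holds the most recent preference while `memory.history` still records both observations.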
Querying
Querying is how you extract meaningful answers from your knowledge and memory layers. Cortex gives you control over how retrieval works, using a combination of:
- Top-K search across documents, passages, or snippets
- Alpha, recency bias, and contextual expansion: Each query is processed deeply to determine the optimal retrieval strategy. This ensures results are not just relevant, but also personalized and self-improving over time.
- Memory injection automatically infuses long-term user context (e.g. preferences, past interactions) into every query.
- Citations and Bounding Boxes: Every answer returned by Cortex is backed by verifiable citations, with exact source snippets and optional bounding boxes for visual references (e.g. PDFs, slides, or structured files). This builds trust and lets users trace every insight back to its origin.
- Mixture-of-Experts Routing: Cortex dynamically routes each query through a Mixture-of-Experts engine to select the best answering strategy, whether it needs to use memory, search more deeply, dynamically adjust prompts, or chain multi-hop lookups.
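How Top-K, alpha, and recency bias might interact can be sketched as follows. This document does not define these parameters, so the sketch assumes the convention common in hybrid search: alpha weights a semantic score against a keyword score, and recency bias mixes in an exponentially decaying freshness term. All function names, parameter names, and default values are hypothetical.

```python
import heapq
import math


def blended_score(semantic, keyword, age_days, alpha=0.7, recency_bias=0.0, half_life_days=30.0):
    """Blend two relevance scores, then mix in an exponential freshness term."""
    relevance = alpha * semantic + (1.0 - alpha) * keyword
    # Freshness is 1.0 for a brand-new item and halves every `half_life_days`.
    freshness = math.exp(-math.log(2) * age_days / half_life_days)
    return (1.0 - recency_bias) * relevance + recency_bias * freshness


def top_k(candidates, k, **weights):
    """Return the k highest-scoring candidates (dicts with semantic, keyword, age_days)."""
    return heapq.nlargest(
        k,
        candidates,
        key=lambda c: blended_score(c["semantic"], c["keyword"], c["age_days"], **weights),
    )


docs = [
    {"id": "old-but-relevant", "semantic": 0.9, "keyword": 0.8, "age_days": 365},
    {"id": "fresh-but-weaker", "semantic": 0.6, "keyword": 0.5, "age_days": 1},
]
```

With the defaults, `top_k(docs, 1)` favors the older, more relevant document. Raising `recency_bias` toward 1.0 shifts the ranking toward the fresher one.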
Metadata (& Agentic Querying)
Metadata allows you to structure and filter retrieval with precision. You can tag documents, pages, or entities with key-value pairs and filter results by user role, product, team, time, or any custom logic.
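Key-value filtering of this kind can be sketched as an exact-match check over each document's tags. The document shape, the `matches` and `filter_by_metadata` helpers, and the sample tags are assumptions for illustration, not the Cortex filter syntax.

```python
def matches(metadata, filters):
    """True when every filter key is present in metadata with an equal value."""
    return all(key in metadata and metadata[key] == value for key, value in filters.items())


def filter_by_metadata(documents, filters):
    """Keep only the documents whose metadata satisfies all filters."""
    return [doc for doc in documents if matches(doc["metadata"], filters)]


documents = [
    {"id": "pricing-faq", "metadata": {"team": "sales", "product": "pro"}},
    {"id": "oncall-runbook", "metadata": {"team": "infra", "product": "pro"}},
]
sales_docs = filter_by_metadata(documents, {"team": "sales"})
```

Applying the filter before ranking narrows the candidate set, so retrieval only scores documents the caller is allowed or intended to see.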
Agentic Querying goes a step further—enabling agents to retrieve, reason, and act autonomously based on structured metadata and memory signals. This unlocks use cases like:
- Agent workflows that chain multi-step queries
- Personalized agent routing per user or org
- Task-specific memory selection for copilots and assistants
Cortex makes it easy to build intelligent agents that retrieve exactly what they need, when they need it.
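A workflow that chains multi-step queries can be sketched as a loop in which each step builds its query from the results gathered so far. `run_chain`, the step functions, and the tiny `facts` dictionary standing in for a retrieval call are all hypothetical.

```python
def run_chain(question, steps, retrieve):
    """Run retrieval steps in order; each step builds its query from the question and earlier results."""
    results = []
    for build_query in steps:
        query = build_query(question, results)
        results.append(retrieve(query))
    return results


# Hypothetical stand-in for a retrieval call: a lookup in a small dictionary.
facts = {
    "owner of billing service": "team-payments",
    "on-call for team-payments": "dana",
}
steps = [
    lambda question, results: "owner of billing service",
    lambda question, results: f"on-call for {results[-1]}",
]
answers = run_chain("Who is on call for billing?", steps, facts.get)
```

The second step cannot be written until the first has answered, which is what separates a multi-hop lookup from a single query.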