Search Capabilities
Cortex delivers blazing-fast hybrid search, filtering, and reranking over billions of vectors. With compute-storage separation for up to 100x cost savings, Cortex supports vector, full-text, regex, and metadata search. Develop locally and scale to petabytes in the cloud, backed by object storage. The result is serverless search and retrieval that is accurate, fast, and built for AI.
Hybrid Search Architecture
Cortex’s search engine combines multiple search methodologies to deliver the most relevant results:
- Vector Search: Semantic understanding using embeddings
- Full-Text Search: Traditional keyword-based search
- Regex Search: Pattern-based text matching
- Metadata Search: Structured data filtering and querying
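The blending of these methodologies can be sketched in miniature. The scoring functions and the `alpha` weighting below are illustrative assumptions, not Cortex's internals:

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def keyword_score(query, text):
    """Fraction of query terms that appear verbatim in the document."""
    q, t = set(query.lower().split()), set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0

def hybrid_score(query, query_vec, doc, alpha=0.5):
    """Blend semantic and keyword relevance; alpha=1.0 is pure vector search."""
    return (alpha * cosine(query_vec, doc["vec"])
            + (1 - alpha) * keyword_score(query, doc["text"]))

docs = [
    {"text": "invoice payment failed", "vec": [0.9, 0.1]},
    {"text": "team offsite schedule", "vec": [0.1, 0.9]},
]
# Rank documents for the query "payment declined" with a toy query embedding.
ranked = sorted(docs,
                key=lambda d: hybrid_score("payment declined", [0.8, 0.2], d, alpha=0.6),
                reverse=True)
```

In a real deployment the keyword side would be a BM25-style score and the vectors would come from an embedding model; the point here is only how one weight can trade off the two signals.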
Metadata Search
Cortex delivers a highly optimised, two-stage hybrid search pipeline that combines metadata search, keyword search, and vector search—each tailored to work in harmony for maximum context and precision. At the heart of our architecture is a flexible metadata engine built on fully dynamic schemas, allowing users to define arbitrary key-value pairs—like status: approved, owner: nishkarsh@usecortex.in, or created_on: 03-10-2025—without being locked into predefined fields. Metadata is ingested and stored separately from document content, enabling fast, index-free filtering and complex relational querying. Our pipeline operates in two stages:
Stage 1: Metadata-Based Candidate Retrieval
When a user submits a natural language query—e.g., “show me all the emails by nishkarsh@usecortex.in on 03-10-2025” or “which drafts have been marked as status ‘approved’”—we run the query through a custom query understanding module. This module extracts potential metadata keys and values, and using statistical sampling, pre-indexed metadata maps, and frequency analysis, we intelligently generate the most likely filter combinations. These filters are used to query the metadata database directly, returning a scoped set of high-confidence candidate results. This drastically reduces the search space, ensures relevance from the outset, and avoids brute-force full-text or embedding scans across the entire corpus.
Stage 2: Context-Aware Semantic Retrieval
The second stage takes the metadata-filtered results and uses them to query the vector store for semantically rich, conceptually related chunks or documents. Instead of naïvely searching the entire vector space, we condition the vector queries using the Stage 1 candidates—essentially saying: “Find me content that is not only semantically relevant, but also contextually related to these filtered results.” This hybrid strategy allows us to go beyond traditional semantic search. For instance, if the Stage 1 metadata narrowed the results to 100 documents tagged as finance reports from last month, Stage 2 ensures we extract meaningful passages within those docs—even if they don’t share exact keywords—surfacing deeper, latent context aligned with the user’s intent. This staged architecture offers several critical advantages:
- Reduced noise by filtering early with structured data
- Increased precision by anchoring vector search to scoped metadata hits
- Improved recall through embedding-driven context expansion
- Sub-second latency thanks to narrowed query scope and parallelisation
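The two-stage flow above can be sketched as follows. The document shape, field names, and exact-match filter semantics are assumptions for illustration, not Cortex's data model:

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def stage1_metadata_filter(docs, filters):
    """Stage 1: scope candidates with exact-match metadata filters."""
    return [d for d in docs
            if all(d["meta"].get(k) == v for k, v in filters.items())]

def stage2_semantic_rank(candidates, query_vec, top_k=3):
    """Stage 2: rank only the scoped candidates by embedding similarity."""
    return sorted(candidates,
                  key=lambda d: cosine(query_vec, d["vec"]),
                  reverse=True)[:top_k]

docs = [
    {"id": 1, "meta": {"status": "approved", "type": "finance_report"}, "vec": [0.9, 0.1]},
    {"id": 2, "meta": {"status": "draft", "type": "finance_report"}, "vec": [0.95, 0.05]},
    {"id": 3, "meta": {"status": "approved", "type": "memo"}, "vec": [0.2, 0.8]},
]
# Filters such as {"status": "approved"} would come from the query
# understanding module described above.
candidates = stage1_metadata_filter(docs, {"status": "approved"})
results = stage2_semantic_rank(candidates, query_vec=[1.0, 0.0])
```

Note that document 2, despite being the best semantic match, never reaches Stage 2 because it fails the metadata filter—this is the early noise reduction the list above describes.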
Fine-Tuning Your Retrievals
What it is:
Cortex gives you complete control over how your AI retrieves and ranks results. You can adjust parameters like recency_bias to prefer newer documents, or search_alpha to balance between exact keyword matches and more flexible semantic understanding. This tuning ensures that your search behavior fits the specific context of your application—whether you’re building a chatbot, search engine, or intelligent assistant.
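As an illustration of how a recency_bias knob might behave, here is a minimal reranker sketch. The decay formula, field names, and the fixed reference date are assumptions, not Cortex's actual implementation:

```python
from datetime import date

def apply_recency_bias(results, recency_bias=0.3, today=date(2025, 10, 3)):
    """Rerank results by decaying each score with document age.

    recency_bias=0 leaves scores unchanged; larger values penalise
    older documents more aggressively.
    """
    reranked = []
    for r in results:
        age_months = (today - r["created_on"]).days / 30.0
        decay = 1.0 / (1.0 + recency_bias * age_months)
        reranked.append({**r, "score": r["score"] * decay})
    return sorted(reranked, key=lambda r: r["score"], reverse=True)

hits = [
    {"id": "old-ticket", "score": 0.92, "created_on": date(2024, 10, 3)},
    {"id": "new-ticket", "score": 0.85, "created_on": date(2025, 9, 28)},
]
reranked = apply_recency_bias(hits, recency_bias=0.3)
```

With the bias applied, the slightly lower-scoring but year-fresher ticket overtakes the stale one—the behavior the support-chatbot scenario below relies on.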
Benefit:
You get more accurate, relevant, and high-quality results every time. Users won’t be shown outdated or irrelevant information. Instead, they receive answers that match both what they say and what they mean. This builds trust and improves task success.
Use Case & Scenario:
Imagine you’re building a support chatbot for a SaaS product. A user asks, “Why was my payment declined?” Instead of retrieving every document with the word “payment,” the bot prioritizes recent tickets that semantically match issues related to failed billing. The user gets an instant, relevant response without sifting through old or off-topic results.
Search Types
Vector Search
Semantic search using embeddings to understand meaning and context beyond exact keyword matches. Perfect for finding conceptually related content even when the exact words don’t match.
Full-Text Search
Traditional keyword-based search that finds exact matches and partial matches in document content. Ideal for precise term searching and phrase matching.
Regex Search
Pattern-based text matching using regular expressions. Useful for finding specific text patterns, formats, or structured data within documents.
Metadata Search
Structured data filtering and querying using key-value pairs. Enables filtering by document properties, tags, categories, and other structured attributes.
Performance & Scalability
- Compute-Storage Separation: Up to 100x cost savings
- Petabyte Scale: Built for massive datasets
- Sub-Second Latency: Optimized for real-time applications
- Multi-Tenant Support: Secure isolation between users
- Local Development: Develop and test locally before scaling to cloud