Search

POST /search/retrieve

Sample Request

curl --request POST \
  --url https://api.usecortex.ai/search/retrieve \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
  "query": "Which mode does user prefer",
  "tenant_id": "tenant_1234",
  "sub_tenant_id": "sub_tenant_4567",
  "alpha": 0.8,
  "recency_bias": 0,
  "personalise_search": true
}'
Search across your tenant’s knowledge base using both semantic and keyword matching for comprehensive results.
Default Sub-Tenant Behavior: If you don’t specify a sub_tenant_id, the search will be performed within the default sub-tenant created when your tenant was set up. This searches across organization-wide documents.
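
The sample request can also be assembled from Python. The sketch below uses only the standard library and builds the request without sending it. The helper name build_search_request is illustrative and not part of any Cortex SDK. It leaves sub_tenant_id out of the body when none is given, so the default sub-tenant applies.

```python
import json
import urllib.request

API_URL = "https://api.usecortex.ai/search/retrieve"

def build_search_request(token, query, tenant_id, sub_tenant_id=None, **params):
    """Build (but do not send) a POST request for /search/retrieve.

    Hypothetical helper: extra keyword arguments such as alpha or
    recency_bias are copied into the JSON body unchanged.
    """
    body = {"query": query, "tenant_id": tenant_id}
    if sub_tenant_id is not None:
        # Omitting this key lets the API fall back to the default sub-tenant.
        body["sub_tenant_id"] = sub_tenant_id
    body.update(params)
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_search_request(
    "<token>", "Which mode does user prefer", "tenant_1234",
    alpha=0.8, recency_bias=0,
)
# To send it: with urllib.request.urlopen(req) as resp: chunks = json.load(resp)
```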

Search Modes

The Hybrid Search endpoint combines multiple search strategies to provide the most relevant results:

Semantic Search

  • Purpose: Finds content based on meaning and context, not just exact keywords
  • Best for: Conceptual queries, finding related content, understanding intent
  • Example: Searching for “machine learning” will also find content about “AI”, “neural networks”, “deep learning”

Keyword Search

  • Purpose: Finds content containing specific terms or phrases
  • Best for: Exact term matching, technical specifications, proper nouns
  • Example: Searching for “TensorFlow 2.0” will find documents mentioning this specific version

Hybrid Approach

  • Purpose: Combines semantic understanding with keyword precision
  • Best for: Most use cases where you want both relevance and accuracy
  • Example: “Python data analysis libraries” finds both semantic matches (pandas, numpy) and exact keyword matches

Search Parameters

Alpha Parameter

Controls the balance between semantic and keyword search:
  • 0.0 - Pure keyword search only
    • Best for: Exact term matching, technical specifications
    • Use when: You need precise keyword matches
  • 1.0 - Pure semantic search only
    • Best for: Conceptual queries, finding related content
    • Use when: You want to discover related concepts
  • 0.8 - Default balanced approach (recommended)
    • Best for: Most general use cases
    • Provides optimal balance of precision and recall
  • "auto" - Intelligent auto-selection
    • Cortex analyzes your query and chooses the optimal alpha
    • Best for: When you’re unsure which approach to use
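
One way to picture alpha is as the weight in a linear blend of a semantic score and a keyword score. The sketch below is only an illustration of that intuition. This page does not state how Cortex combines scores internally, and blend_scores and its inputs are hypothetical.

```python
def blend_scores(semantic_score: float, keyword_score: float, alpha: float) -> float:
    """Illustrative linear blend (an assumption, not Cortex's documented formula).

    alpha = 0.0 keeps only the keyword score; alpha = 1.0 keeps only the
    semantic score; values in between mix the two.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0.0 and 1.0")
    return alpha * semantic_score + (1.0 - alpha) * keyword_score

# With the default alpha of 0.8, the semantic score dominates the result.
blended = blend_scores(semantic_score=0.9, keyword_score=0.2, alpha=0.8)
```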

Recency Bias

Controls how much recent content is prioritized:
  • 0.0 - No recency bias (default)
  • 0.1-0.5 - Light to moderate recency preference
  • 0.6-1.0 - Strong recency preference
  • Best for: News, documentation updates, time-sensitive information

Max Chunks

Controls the number of results returned:
  • Range: 1-1001 chunks
  • Default: System limit
  • Recommendation: Start with 10-20 for most use cases

Personalise Search

Enables personalized search results based on user memories from the corresponding tenant and sub-tenant combination:
  • true - Enable personalized search results
    • Leverages user memories stored in the tenant/sub-tenant combination
    • Provides more relevant and tailored search results
    • Considers user’s historical interactions and preferences
  • false - Standard search without personalization (default)
    • Returns results based purely on content relevance
    • No user-specific context applied
Best Practice: Enable personalise_search for applications where user context significantly impacts result relevance, such as personalized dashboards, recommendation systems, or user-specific knowledge bases.
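
The parameter ranges in this section can be checked on the client before a request is sent. The following is a minimal sketch that assumes the ranges listed above are the ones the API enforces. validate_search_params is a hypothetical helper, and the server may apply different or additional rules.

```python
def validate_search_params(alpha=0.8, recency_bias=0.0, max_chunks=None,
                           personalise_search=False):
    """Return a list of problems; an empty list means the values look valid."""

    def is_number(value):
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(value, (int, float)) and not isinstance(value, bool)

    problems = []
    if alpha != "auto" and not (is_number(alpha) and 0.0 <= alpha <= 1.0):
        problems.append('alpha must be between 0.0 and 1.0, or "auto"')
    if not (is_number(recency_bias) and 0.0 <= recency_bias <= 1.0):
        problems.append("recency_bias must be between 0.0 and 1.0")
    if max_chunks is not None:
        is_int = isinstance(max_chunks, int) and not isinstance(max_chunks, bool)
        if not (is_int and 1 <= max_chunks <= 1001):
            problems.append("max_chunks must be an integer from 1 to 1001")
    if not isinstance(personalise_search, bool):
        problems.append("personalise_search must be true or false")
    return problems

# Out-of-range values such as alpha=123 are reported rather than sent.
problems = validate_search_params(alpha=123, recency_bias=0.3, max_chunks=20)
```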

Search Optimization Tips

For Better Precision

  • Use lower alpha values (0.2-0.4) for exact term matching
  • Include specific terminology in your queries
  • Set higher max_chunks to get more comprehensive results

For Better Recall

  • Use higher alpha values (0.6-0.8) for broader semantic matching
  • Try synonyms and related terms in your queries
  • Use conceptual language rather than specific terms
  • Enable recency bias for time-sensitive content

For Complex Queries

  • Use “auto” alpha to let Cortex optimize automatically
  • Combine specific terms with conceptual language
  • Adjust recency bias based on content type
  • Experiment with different alpha values to find optimal results
  • Enable personalise_search for user-specific contexts and preferences
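
The tips above can be captured as starting-point presets that are merged into a request body. The preset names, the specific values picked from the suggested ranges, and the with_preset helper are all illustrative. Treat them as defaults to tune rather than recommended settings.

```python
# Hypothetical presets; each alpha is taken from the ranges suggested above.
PRESETS = {
    "precision": {"alpha": 0.3},    # lower alpha favors exact term matching
    "recall": {"alpha": 0.7},       # higher alpha favors semantic matching
    "complex": {"alpha": "auto"},   # let Cortex choose alpha per query
}

def with_preset(body: dict, preset: str, **overrides) -> dict:
    """Return a new request body with the preset applied, then any overrides."""
    return {**body, **PRESETS[preset], **overrides}

base = {"query": "Python data analysis libraries", "tenant_id": "tenant_1234"}
recall_body = with_preset(base, "recall", recency_bias=0.5)
```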

Response

Returns an array of relevant content chunks from your indexed sources based on the search query.
[
  {
    "chunk_uuid": "CortexDoc42f7a8c91e234d89ab7f3e612bc9a1047891234567_15_v1",
    "source_id": "CortexDoc42f7a8c91e234d89ab7f3e612bc9a1047891234567",
    "source_title": "Advanced Machine Learning Algorithms.pdf",
    "chunk_content": "PROCEEDINGS OF NEURAL NETWORKS, VOL. 22, NO. 3, MARCH 2024 8 Fig. 5: Performance metrics comparison between transformer architectures on natural language processing tasks...",
    "source_upload_time": "1753761802.079415",
    "source_collection": [],
    "source_type": "file",
    "layout": "{\"coordinates\": {\"x\": 48.96399688720703, \"y\": 26.49277114868164, \"width\": 514.0717697143555, \"height\": 721.1119651794434}, \"page\": 12}",
    "version": "v1",
    "source_last_updated_time": "",
    "relevancy_score": 0.8363813161849976
  },
  {
    "chunk_uuid": "CortexDoc8b1e3d5f72c84a16951e8742abd6f3029876543210_7_v1",
    "source_id": "CortexDoc8b1e3d5f72c84a16951e8742abd6f3029876543210",
    "source_title": "Quantum Computing Fundamentals.pdf",
    "chunk_content": "IV. QUANTUM ALGORITHMS This section explores quantum gate operations and their applications in cryptographic systems. The proposed quantum circuit demonstrates improved efficiency in...",
    "source_upload_time": "1753762013.1677928",
    "source_collection": [],
    "source_type": "file",
    "layout": "{\"coordinates\": {\"x\": 311.9779968261719, \"y\": 712.3123168945312, \"width\": 251.05746459960938, \"height\": 35.72357177734375}, \"page\": 2}",
    "version": "v1",
    "source_last_updated_time": "",
    "relevancy_score": 0.800000011920929
  }
]
Note: This API returns hybrid search results without AI-generated answers. For conversational Q&A with AI-generated responses, use the /search/qna API instead.
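
Because layout arrives as a stringified object and the timestamps arrive as strings, some decoding is usually needed before results are shown. The sketch below works on a trimmed copy of the sample response. It assumes source_upload_time holds Unix epoch seconds, which matches the sample values but is not stated on this page. The helper names are hypothetical.

```python
import json
from datetime import datetime, timezone

def parse_layout(layout):
    """Decode the stringified layout field; return None if absent or malformed."""
    if not layout:
        return None
    try:
        data = json.loads(layout)
    except (TypeError, ValueError):
        return None
    return data if isinstance(data, dict) else None

def summarize_chunks(chunks, min_score=0.0):
    """Keep chunks scoring at least min_score, best match first."""
    rows = []
    for chunk in chunks:
        score = chunk.get("relevancy_score")
        if score is None or score < min_score:
            continue
        layout = parse_layout(chunk.get("layout")) or {}
        uploaded = chunk.get("source_upload_time")
        rows.append({
            "title": chunk.get("source_title", ""),
            "score": score,
            "page": layout.get("page"),
            # Assumption: the string holds Unix epoch seconds.
            "uploaded_at": (
                datetime.fromtimestamp(float(uploaded), tz=timezone.utc)
                if uploaded else None
            ),
        })
    return sorted(rows, key=lambda row: row["score"], reverse=True)

# Trimmed from the sample response; numeric values are shortened.
sample = [
    {"source_title": "Quantum Computing Fundamentals.pdf",
     "source_upload_time": "1753762013.1677928",
     "layout": '{"coordinates": {"x": 311.98, "y": 712.31}, "page": 2}',
     "relevancy_score": 0.8},
    {"source_title": "Advanced Machine Learning Algorithms.pdf",
     "source_upload_time": "1753761802.079415",
     "layout": '{"coordinates": {"x": 48.96, "y": 26.49}, "page": 12}',
     "relevancy_score": 0.836},
]
rows = summarize_chunks(sample)
```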

Error Responses

All endpoints return consistent error responses following the standard format. For detailed error information, see our Error Responses documentation.

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Body

application/json
query
string
required

Search terms to find relevant content

Example:

"Which mode does user prefer"

tenant_id
string
required

Unique identifier for the tenant/organization

Example:

"tenant_1234"

sub_tenant_id
string | null
default:""

Optional sub-tenant identifier used to organize data within a tenant. If omitted, the default sub-tenant created during tenant setup will be used.

Example:

"sub_tenant_4567"

max_chunks
integer | null

Maximum number of results to return

alpha
default:0.8

Search ranking algorithm parameter (0.0-1.0 or 'auto')

recency_bias
number | null
default:0

Preference for newer content (0.0 = no bias, 1.0 = strong recency preference)

personalise_search
boolean
default:false

Enable personalized search results based on user preferences

Response

Successful Response

chunk_uuid
string
required

Unique identifier for this content chunk

Example:

"<chunk_uuid>"

source_id
string
required

Unique identifier for the source document

Example:

"CortexDoc1234"

chunk_content
string
required

The actual text content of this chunk

Example:

"<chunk_content>"

source_type
string
default:""

Type of the source document (file, webpage, etc.)

Example:

"<source_type>"

source_upload_time
string
default:""

When the source document was originally uploaded

Example:

"<source_upload_time>"

source_title
string
default:""

Title or name of the source document

Example:

"<source_title>"

source_last_updated_time
string
default:""

When the source document was last modified

Example:

"<source_last_updated_time>"

layout
string | null

Layout of the chunk in the original document. This is generally a stringified dict with two keys: offsets and page (optional). The offsets value contains document_level_start_index and page_level_start_index (optional).

Example:

"{\"offsets\": {\"document_level_start_index\": 1024, \"page_level_start_index\": 50}, \"page\": 2}"

relevancy_score
number | null

Score indicating how relevant this chunk is to your search query, with higher values indicating better matches

document_metadata
object | null

Metadata extracted from the source document

Example:

{
  "author": "John Doe",
  "category": "Internal"
}

tenant_metadata
object | null

Custom metadata associated with your tenant

Example:

{ "department": "R&D" }