POST /embeddings/search
Search Embeddings
curl --request POST \
  --url https://api.usecortex.ai/embeddings/search \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
  "tenant_id": "tenant_1234",
  "embeddings": [
    [
      0.123413,
      0.655367,
      0.987654,
      0.123456,
      0.789012
    ],
    [
      0.123413,
      0.655367,
      0.987654,
      0.123456,
      0.789012
    ]
  ],
  "sub_tenant_id": "sub_tenant_4567",
  "max_chunks": 1
}'
{
  "chunk_ids": [
    "<string>"
  ],
  "scores": [
    123
  ],
  "success": true,
  "message": "Embeddings search completed successfully"
}

Search for similar content using vector embeddings by comparing your input embedding against the vector database to find the most similar content chunks.
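
For callers who are not using curl, the sample request can be sketched in Python with only the standard library. The function name `build_search_request` is hypothetical, and the nested-list shape of `embeddings` simply mirrors the sample request; treat both as assumptions rather than an official client.

```python
import json
import urllib.request

API_URL = "https://api.usecortex.ai/embeddings/search"  # URL from the sample request


def build_search_request(token, tenant_id, embeddings, sub_tenant_id=None, max_chunks=10):
    """Build (but do not send) a POST request for the embeddings search endpoint."""
    payload = {
        "tenant_id": tenant_id,
        "embeddings": embeddings,
        "max_chunks": max_chunks,
    }
    # sub_tenant_id is optional; when omitted, the default sub-tenant is used.
    if sub_tenant_id is not None:
        payload["sub_tenant_id"] = sub_tenant_id
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_search_request(
    token="<token>",
    tenant_id="tenant_1234",
    embeddings=[[0.123413, 0.655367, 0.987654, 0.123456, 0.789012]],
    sub_tenant_id="sub_tenant_4567",
    max_chunks=1,
)
# To send it, pass `request` to urllib.request.urlopen() and decode the JSON body.
```

Building the request separately from sending it makes the payload easy to inspect or log before any network call is made.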

Vector Search Concepts

What are Embeddings?

Embeddings are high-dimensional vector representations of text that capture semantic meaning:
  • Semantic Understanding: Similar concepts have similar vector representations
  • Mathematical Distance: Content similarity is measured by vector distance
  • Language Agnostic: Works across different languages and formats
  • Context Preservation: Maintains meaning and relationships between concepts

How Vector Search Works

  1. Input Processing: Your embedding vector is compared against all stored embeddings
  2. Similarity Calculation: Cosine similarity or other distance metrics are computed
  3. Ranking: Results are ranked by similarity score (higher = more similar)
  4. Retrieval: Most similar chunks are returned with their similarity scores
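
The four steps above can be illustrated with a brute-force sketch in plain Python. It assumes cosine similarity as the metric and an in-memory dictionary as the store; the names `cosine_similarity` and `rank_chunks` are hypothetical, and a production vector database would typically use an index rather than a linear scan.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction, so treat it as dissimilar
    return dot / (norm_a * norm_b)


def rank_chunks(query, stored, max_chunks=10):
    """Score every stored vector against the query and keep the best matches."""
    scored = [(chunk_id, cosine_similarity(query, vec)) for chunk_id, vec in stored.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)  # higher score = more similar
    return scored[:max_chunks]


# Toy data: three stored chunks with 3-dimensional embeddings.
stored = {
    "chunk_a": [1.0, 0.0, 0.0],
    "chunk_b": [0.7, 0.7, 0.0],
    "chunk_c": [0.0, 0.0, 1.0],
}
top = rank_chunks([1.0, 0.1, 0.0], stored, max_chunks=2)
```

Here `top` holds `chunk_a` first and `chunk_b` second, because the query points almost exactly along the first axis.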

Embedding Dimensions

  • Standard Dimensions: Most embeddings use 384, 512, 768, or 1536 dimensions
  • Quality vs Speed: Higher-dimensional embeddings can capture more nuance, but they take more storage and are slower to search
  • Compatibility: Ensure your embedding model matches Cortex’s expected format

Search Parameters

Max Chunks

Controls the number of results returned:
  • Range: 1-200 chunks
  • Default: 10 chunks
  • Recommendation:
    • Start with 10-20 for most use cases
    • Use 50-100 for comprehensive searches
    • Use 1-5 for precise, top results only

Embedding Format

  • Type: Single embedding vector (1D array of numeric values)
  • Values: Floating-point numbers (typically between -1 and 1)
  • Length: Must match the embedding model’s dimension size
  • Example: [0.1, -0.2, 0.3, 0.4, -0.5, ...]
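
A client-side check along these lines can catch malformed vectors before a request is sent. This is a minimal sketch: `validate_embedding` and its `expected_dim` parameter are hypothetical, and the expected dimension has to come from whichever embedding model you use.

```python
import math


def validate_embedding(vector, expected_dim):
    """Return the vector as a list of floats, or raise if it is malformed."""
    if not isinstance(vector, (list, tuple)):
        raise TypeError("embedding must be a list or tuple of numbers")
    if len(vector) != expected_dim:
        raise ValueError(f"expected {expected_dim} dimensions, got {len(vector)}")
    cleaned = []
    for i, value in enumerate(vector):
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise TypeError(f"element {i} is not a number: {value!r}")
        if not math.isfinite(value):
            raise ValueError(f"element {i} is not finite: {value!r}")
        cleaned.append(float(value))
    return cleaned


vector = validate_embedding([0.1, -0.2, 0.3, 0.4, -0.5], expected_dim=5)
```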

Use Cases

  • Content Discovery: Find documents similar to a reference document
  • Recommendation Systems: Suggest related content based on user interests
  • Duplicate Detection: Identify similar or duplicate content
  • Content Clustering: Group related documents together
  • Multilingual Content: Find similar content across different languages
  • Translation Support: Search for content in one language using another
  • Global Knowledge: Access information regardless of original language

Advanced Retrieval

  • Conceptual Search: Find content based on meaning, not exact keywords
  • Context-Aware Search: Retrieve content that matches conceptual context
  • Fuzzy Matching: Find content even with different wording or phrasing

Best Practices

Embedding Quality

  • Use High-Quality Models: Choose well-trained embedding models (OpenAI, Cohere, etc.)
  • Consistent Models: Use the same embedding model for both indexing and searching
  • Preprocessing: Clean and normalize text before generating embeddings
  • Batch Processing: Generate embeddings in batches for better performance

Search Optimization

  • Appropriate Max Chunks: Start with 10-20, adjust based on your needs
  • Similarity Thresholds: Set minimum similarity scores to filter low-quality matches
  • Multiple Queries: Try different embedding representations of the same concept
  • Hybrid Approaches: Combine vector search with keyword search for better results
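
The similarity-threshold idea can be applied on the client after a search returns. The sketch below assumes that `chunk_ids` and `scores` are parallel arrays (the i-th score belongs to the i-th chunk ID); `filter_by_score` and the 0.85 cutoff are illustrative, not part of the API.

```python
def filter_by_score(chunk_ids, scores, min_score):
    """Keep only the (chunk_id, score) pairs at or above min_score."""
    if len(chunk_ids) != len(scores):
        raise ValueError("chunk_ids and scores must have the same length")
    return [(cid, score) for cid, score in zip(chunk_ids, scores) if score >= min_score]


# Values taken from the sample response on this page.
chunk_ids = [
    "CortexEmbeddings123_0",
    "CortexEmbeddings456_0",
    "CortexEmbeddings456_1",
    "CortexEmbeddings123_2",
    "CortexEmbeddings123_8",
]
scores = [0.95, 0.89, 0.87, 0.82, 0.78]

strong_matches = filter_by_score(chunk_ids, scores, min_score=0.85)
```

A useful threshold depends on the embedding model and the data, so it is worth tuning against real queries rather than fixing it up front.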

Performance Considerations

  • Vector Size: Larger vectors can improve quality but make each search slower
  • Index Size: Search time grows as more content is indexed
  • Batch Requests: Process multiple embeddings simultaneously when possible
  • Caching: Cache frequently used embeddings to improve response times
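
Caching can be as simple as memoizing the function that produces embeddings. In the sketch below, `embed_text` is a hypothetical stand-in that returns a fake two-number vector so the example runs without a model; in practice it would call your embedding provider.

```python
from functools import lru_cache

CALLS = {"count": 0}  # counts how often the (simulated) model is invoked


def embed_text(text):
    """Hypothetical stand-in for a real embedding model call."""
    CALLS["count"] += 1
    return [float(len(text)), float(sum(map(ord, text)) % 97)]


@lru_cache(maxsize=1024)
def cached_embedding(text):
    # Return a tuple so callers cannot mutate the cached value in place.
    return tuple(embed_text(text))


first = cached_embedding("refund policy")
second = cached_embedding("refund policy")  # served from the cache
```

After both calls, `CALLS["count"]` is 1, because the second lookup never reaches `embed_text`. Convert the tuple back to a list before placing it in a request body.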

Common Patterns

Document Similarity

{
  "embeddings": [0.1, 0.2, 0.3, ...],
  "max_chunks": 20
}
Use when you want to find documents similar to a reference document.

{
  "embeddings": [0.4, -0.1, 0.8, ...],
  "max_chunks": 10
}
Use when searching for content related to a specific concept or topic.

Recommendation Engine

{
  "embeddings": [0.2, 0.5, -0.3, ...],
  "max_chunks": 50
}
Use when building recommendation systems that need many similar items.

Sample Response

{
  "chunk_ids": [
    "CortexEmbeddings123_0",
    "CortexEmbeddings456_0",
    "CortexEmbeddings456_1",
    "CortexEmbeddings123_2",
    "CortexEmbeddings123_8"
  ],
  "scores": [
    0.95,
    0.89,
    0.87,
    0.82,
    0.78
  ]
}
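
A response like the one above can be turned into ranked pairs with a few lines of Python. This sketch assumes `chunk_ids` and `scores` line up index by index and that a missing `success` field means success (its documented default is true); `parse_search_response` is a hypothetical helper name.

```python
import json


def parse_search_response(body):
    """Parse a response body into (chunk_id, score) pairs, best match first."""
    data = json.loads(body)
    if not data.get("success", True):
        raise RuntimeError(data.get("message", "embeddings search failed"))
    chunk_ids, scores = data["chunk_ids"], data["scores"]
    if len(chunk_ids) != len(scores):
        raise ValueError("chunk_ids and scores differ in length")
    return sorted(zip(chunk_ids, scores), key=lambda pair: pair[1], reverse=True)


sample_body = """
{
  "chunk_ids": ["CortexEmbeddings123_0", "CortexEmbeddings456_0",
                "CortexEmbeddings456_1", "CortexEmbeddings123_2",
                "CortexEmbeddings123_8"],
  "scores": [0.95, 0.89, 0.87, 0.82, 0.78]
}
"""
results = parse_search_response(sample_body)
best_id, best_score = results[0]
```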

Error Responses

All endpoints return consistent error responses following the standard format. For detailed error information, see our Error Responses documentation.

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Body

application/json
tenant_id
string
required

Unique identifier for the tenant/organization

Example:

"tenant_1234"

embeddings
number[]

The embedding vector for search

Example:
[
  [
    0.123413,
    0.655367,
    0.987654,
    0.123456,
    0.789012
  ],
  [
    0.123413,
    0.655367,
    0.987654,
    0.123456,
    0.789012
  ]
]
sub_tenant_id
string
default:""

Optional sub-tenant identifier used to organize data within a tenant. If omitted, the default sub-tenant created during tenant setup will be used.

Example:

"sub_tenant_4567"

max_chunks
integer
default:10

Maximum number of chunks to return (1-200)

Example:

1

Response

Successful Response

chunk_ids
string[]
required

List of chunk IDs that match the search query

scores
number[]
required

Similarity scores for each matching chunk (higher is more similar)

success
boolean
default:true

Indicates whether the embeddings search operation completed successfully

Example:

true

message
string
default:Embeddings search completed successfully

Status message about the search operation
