Use the Try it button to try this API now in our playground. It's the best way to check the full request and response in one place, customize your parameters, and generate ready-to-use code snippets.
Sample Request
Embedding Processing Pipeline
When you upload pre-computed embeddings, they go through a streamlined processing pipeline optimized for vector data:
1. Immediate Upload & Validation
- Your embedding vectors are immediately accepted and validated
- Dimensional consistency is checked across all vectors
- Format validation ensures proper numeric array structure
- You receive a confirmation response with a file_id for tracking
2. Vector Processing Phase
Our system automatically handles:
- Dimensional Validation: Ensuring all vectors have consistent dimensions
- Data Type Normalization: Converting to optimal numeric formats
- Vector Quality Assessment: Checking for valid numeric ranges and patterns
- Batch ID Generation: Creating unique chunk IDs for each embedding vector
3. Chunk ID Assignment
- Each embedding vector receives a unique chunk ID in the format {batch_id}_{index}
- These IDs serve as references for retrieval and linking to original content
- Example: [0.1, 0.2, 0.3, 0.4, 0.5] becomes CortexEmbeddings123_0
- You can use these chunk IDs to link back to your original text content
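A minimal sketch of linking chunk IDs back to source text, assuming you know the batch ID for an upload and that row indices start at 0 as in the example above. The helper name build_chunk_lookup is hypothetical and not part of the API.

```python
from typing import Dict, List


def build_chunk_lookup(batch_id: str, texts: List[str]) -> Dict[str, str]:
    """Map chunk IDs of the form {batch_id}_{index} to the original texts.

    Assumes texts[i] is the text whose embedding was uploaded as row i.
    """
    return {f"{batch_id}_{index}": text for index, text in enumerate(texts)}


# Batch ID taken from the example above; the texts are illustrative.
lookup = build_chunk_lookup("CortexEmbeddings123", ["first passage", "second passage"])
print(lookup["CortexEmbeddings123_0"])  # prints "first passage"
```

Keeping a lookup like this on your side lets you resolve the chunk IDs returned by a query to the text that was embedded.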
4. Direct Indexing
- Embeddings are directly stored in our vector database (no embedding generation needed)
- Full-text search indexes are created for associated metadata
- Metadata is indexed for filtering and faceted search
- Cross-references are established for related embedding batches
5. Quality Assurance
- Automated quality checks ensure vector integrity
- Dimensional consistency validation across the tenant
- Vector range and format validation
- Database storage verification
If you omit sub_tenant_id, the embeddings will be uploaded to the default sub-tenant created when your tenant was set up. This is well suited to organization-wide embeddings that should be accessible across all departments.
Requirements
- Maximum size: 2000 rows × 3024 columns, i.e., at most 2000 chunks per upload, each with no more than 3024 dimensions
- Format: 2D array of numeric values (int or float)
- Consistency: All embedding vectors must have the same dimension
- Content: Embeddings array cannot be empty
- Processing: Generates unique chunk IDs in the format {batch_id}_{index} for each row.
  - Consider them references to that particular embedding vector. You will get back these chunk_ids when you run a query.
  - For example, the reference to [0.1, 0.2, 0.3, 0.4, 0.5] is CortexEmbeddings123_0
  - You can use these chunk IDs to link back to the original text that was embedded
- Dimensional consistency per tenant: All embedding vectors within a tenant must have identical dimensions. Vectors with different dimensions require separate tenants.
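The requirements above can be checked on the client before uploading. The following is a rough sketch only: validate_embeddings and the constant names are hypothetical, and the server performs its own validation, which may differ in detail.

```python
from numbers import Real
from typing import List, Sequence

MAX_ROWS = 2000        # maximum number of embedding vectors per upload
MAX_DIMENSIONS = 3024  # maximum length of each vector


def validate_embeddings(embeddings: Sequence[Sequence[float]]) -> List[str]:
    """Return a list of problems found; an empty list means all checks passed."""
    if len(embeddings) == 0:
        return ["embeddings array is empty"]

    problems: List[str] = []
    if len(embeddings) > MAX_ROWS:
        problems.append(f"too many rows: {len(embeddings)} > {MAX_ROWS}")

    # Every row is compared against the dimension of the first row.
    expected_dim = len(embeddings[0])
    if expected_dim > MAX_DIMENSIONS:
        problems.append(
            f"vectors have {expected_dim} dimensions, maximum is {MAX_DIMENSIONS}"
        )

    for i, row in enumerate(embeddings):
        if len(row) != expected_dim:
            problems.append(
                f"row {i} has {len(row)} dimensions, expected {expected_dim}"
            )
        # bool is a subclass of int in Python, so it is excluded explicitly.
        if not all(isinstance(v, Real) and not isinstance(v, bool) for v in row):
            problems.append(f"row {i} contains non-numeric values")
    return problems


print(validate_embeddings([[0.1, 0.2, 0.3], [0.4, 0.5]]))
```

This sketch does not cover the per-tenant rule, since checking it requires knowing the dimension already in use for the tenant.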
File ID Management: When you provide a file_id as a key in the document_metadata object, that specific ID will be used to identify your content. If no file_id is provided in the document_metadata, the system will automatically generate a unique identifier for you. This allows you to maintain consistent references to your content across your application while ensuring every piece of content has a unique identifier.
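The file_id rule can be pictured with a small client-side model. This is an illustration of the described behavior, not the service's implementation: resolve_file_id is a hypothetical helper, and the use of a UUID for the generated identifier is an assumption, since the format of system-generated IDs is not specified here.

```python
import uuid
from typing import Any, Dict


def resolve_file_id(document_metadata: Dict[str, Any]) -> str:
    """Return the caller-supplied file_id, or generate one if it is absent."""
    file_id = document_metadata.get("file_id")
    if file_id:
        return str(file_id)
    # Assumption: a random UUID stands in for the system-generated identifier.
    return uuid.uuid4().hex


print(resolve_file_id({"file_id": "handbook_v2"}))  # prints "handbook_v2"
print(resolve_file_id({"title": "No explicit ID"}))
```

Supplying your own file_id is what makes later updates to the same content addressable from your application.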
Duplicate File ID Behavior
When you upload embeddings with a file_id that already exists in your tenant:
- Overwrite Behavior: The existing embeddings with the same file_id will be completely replaced with the new embeddings
- Processing: The new embeddings will go through validation and direct indexing (no embedding generation needed)
- Search Results: Previous search results and vector data from the old embeddings will be replaced with the new embeddings
- Idempotency: Uploading the same embeddings with the same file_id multiple times is safe and will result in the same final state
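The overwrite and idempotency semantics can be modeled with an in-memory dictionary. This is only a toy model of the behavior described above, with a hypothetical upload function; it says nothing about how the service stores data.

```python
from typing import Dict, List

Store = Dict[str, List[List[float]]]


def upload(store: Store, file_id: str, embeddings: List[List[float]]) -> None:
    """Replace whatever is stored under file_id with the new embeddings."""
    store[file_id] = [list(row) for row in embeddings]


store: Store = {}
upload(store, "doc_1", [[0.1, 0.2], [0.3, 0.4]])  # initial upload
upload(store, "doc_1", [[0.5, 0.6]])              # same file_id: old vectors are replaced
upload(store, "doc_1", [[0.5, 0.6]])              # repeating the upload changes nothing
print(store)  # prints {'doc_1': [[0.5, 0.6]]}
```

Note that the first two vectors are gone after the second call, which is why reusing a file_id should be a deliberate choice.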
Existing embeddings with the same file_id will be permanently removed and replaced. This action cannot be undone.
Error Responses
All endpoints return consistent error responses following the standard format. For detailed error information, see our Error Responses documentation.
Authorizations
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
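A minimal sketch of assembling the Bearer header together with the tenant and sub-tenant query parameters described under Query Parameters. Several names here are assumptions: the endpoint URL is a placeholder, the parameter name tenant_id is a guess (sub_tenant_id appears earlier on this page), and build_request is a hypothetical helper. No request is sent.

```python
from typing import Dict, Optional, Tuple
from urllib.parse import urlencode

# Hypothetical placeholder; substitute the real endpoint URL for this API.
ENDPOINT = "https://api.example.com/embeddings/upload"


def build_request(
    token: str, tenant_id: str, sub_tenant_id: Optional[str] = None
) -> Tuple[str, Dict[str, str]]:
    """Return the request URL with query parameters and the auth headers."""
    # Assumption: the tenant parameter is named tenant_id.
    params = {"tenant_id": tenant_id}
    if sub_tenant_id is not None:
        # Leaving this out means the default sub-tenant is used.
        params["sub_tenant_id"] = sub_tenant_id
    headers = {"Authorization": f"Bearer {token}"}
    return f"{ENDPOINT}?{urlencode(params)}", headers


url, headers = build_request("YOUR_TOKEN", "tenant_1234", "sub_tenant_4567")
print(url)
print(headers)
```

The returned URL and headers can be passed to whichever HTTP client you already use.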
Query Parameters
Unique identifier for the tenant/organization. Example: "tenant_1234"
Optional sub-tenant identifier used to organize data within a tenant. If omitted, the default sub-tenant created during tenant setup will be used. Example: "sub_tenant_4567"