Inference API

Run inference on a model deployment

POST

This endpoint runs inference using a specified model deployment. It accepts optional chat history to provide context for chat-based models. The response includes the model's output and other relevant information.

Authorizations
x-api-key (string, required)
Body
model (string · uuid, required)

The UUID of the model to be used for inference.

Example: a4a2feb3-efc1-49d6-96ca-5f7ec05cde98
deployment_version (string | nullable, optional)

The ID of the specific deployment version to use. If not provided, the default version will be used.

Example: 1.0.0
Responses
200 (application/json)

Successful response with inference results.
POST /api/v0/deploy/inference
