Inference API

Run inference on a model deployment

POST /api/v0/deploy/inference

This endpoint runs inference using the specified model deployment. It accepts optional chat history to provide context for chat-based models. The response includes the model's output and metadata about the inference run, such as the inference UUID.

Authorizations

x-api-key string Required

The API key used to authenticate the request, passed in the x-api-key header.
Body
model string · uuid Required

The UUID of the model to be used for inference.

Example: a4a2feb3-efc1-49d6-96ca-5f7ec05cde98
deployment_version string | nullable Optional

The ID of the specific deployment version to use. If not provided, the default version is used.

Example: 1.0.0

params object

Template parameters for the deployment, passed as key/value pairs (see the request example below, e.g. {"resume_text": "..."}).

chat_history array Optional

Prior messages providing context for chat-based models. Each entry has a speaker and a text field, as shown in the request example below.
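
For typed clients, the request body can be modeled as follows. This is a minimal Python sketch inferred from the fields above and the example request; the class names are illustrative and not part of the API.

from typing import TypedDict

class ChatMessage(TypedDict):
    speaker: str  # e.g. "user"
    text: str

# total=False marks every key optional; per the field docs above, only
# "model" is required. Class names here are illustrative assumptions.
class InferenceRequest(TypedDict, total=False):
    model: str                       # UUID of the model (required)
    deployment_version: str | None   # falls back to the default version
    params: dict[str, str]           # e.g. {"resume_text": "..."}
    chat_history: list[ChatMessage]  # optional context for chat models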
Responses
200

Successful response with inference results.

application/json

Example request:

POST /api/v0/deploy/inference HTTP/1.1
Host: api.sandbox.gradientj.com
x-api-key: YOUR_API_KEY
Content-Type: application/json
Accept: */*
Content-Length: 200

{
  "model": "a4a2feb3-efc1-49d6-96ca-5f7ec05cde98",
  "deployment_version": "1.0.0",
  "params": {
    "resume_text": "This is the resume text to process."
  },
  "chat_history": [
    {
      "speaker": "user",
      "text": "USER_MESSAGE_1"
    }
  ]
}

Example response (200):

{
  "result": {
    "output": "This is the output from the model.",
    "metadata": {
      "inference_uuid": "9572ef17-5767-4c43-8c89-ec484a97644c"
    }
  }
}
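
The same call can be made from Python. Below is a minimal sketch using the requests library; the run_inference helper name is illustrative, and it assumes the sandbox host and x-api-key header shown in the raw HTTP example above.

import requests

API_KEY = "YOUR_API_KEY"  # placeholder, as in the raw HTTP example
BASE_URL = "https://api.sandbox.gradientj.com"

def run_inference(model_uuid, params, deployment_version=None, chat_history=None):
    """POST to /api/v0/deploy/inference and return the parsed result."""
    body = {"model": model_uuid, "params": params}
    if deployment_version is not None:
        body["deployment_version"] = deployment_version
    if chat_history is not None:
        body["chat_history"] = chat_history

    resp = requests.post(
        f"{BASE_URL}/api/v0/deploy/inference",
        headers={"x-api-key": API_KEY},
        json=body,  # requests serializes the body and sets Content-Type
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["result"]

result = run_inference(
    "a4a2feb3-efc1-49d6-96ca-5f7ec05cde98",
    params={"resume_text": "This is the resume text to process."},
    deployment_version="1.0.0",
    chat_history=[{"speaker": "user", "text": "USER_MESSAGE_1"}],
)
print(result["output"])                      # "This is the output from the model."
print(result["metadata"]["inference_uuid"])

Passing the body via json= lets requests serialize it and set the Content-Type and Content-Length headers automatically, matching the raw HTTP request shown above.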
