IONOS Cloud - OpenAI compatible AI Model Hub API (1.0.0)


IONOS Cloud AI Model Hub OpenAI compatible API

Please note that this API is not affiliated with OpenAI and is not endorsed by OpenAI in any way.

OpenAI Compatible Endpoints

Endpoints compatible with OpenAI's API specification

Create Chat Completions

Create Chat Completions by calling an available model in a format that is compatible with the OpenAI API

Authorizations:
tokenAuth
Request Body schema: application/json
model
required
string

ID of the model to use

messages
required
Array of objects

A list of messages comprising the conversation so far
temperature
number

The sampling temperature to be used

top_p
number

An alternative to sampling with temperature

n
integer

The number of chat completion choices to generate for each input message

stream
boolean

If set to true, partial message deltas are streamed back as they are generated

stop
Array of strings

Up to 4 sequences where the API will stop generating further tokens

max_tokens
integer

The maximum number of tokens to generate in the chat completion

presence_penalty
number

Penalizes new tokens based on whether they already appear in the text so far

frequency_penalty
number

Penalizes new tokens based on how frequently they appear in the text so far

logit_bias
object

Used to modify the probability of specific tokens appearing in the completion

user
string

A unique identifier representing your end-user

Responses

Request samples

Content type
application/json
Example
{
  "model": "meta-llama/Meta-Llama-3-70B-Instruct",
  "messages": [ ],
  "temperature": 0.7,
  "top_p": 0.9,
  "n": 1,
  "stream": false,
  "stop": [ ],
  "max_tokens": 1000,
  "presence_penalty": 0,
  "frequency_penalty": 0,
  "logit_bias": { },
  "user": "user-123"
}

Response samples

Content type
application/json
{
  "id": "string",
  "choices": [ ],
  "created": 0,
  "object": "string",
  "model": "string",
  "system_fingerprint": "string",
  "usage": { }
}
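Because the endpoint follows the OpenAI request schema, a chat completion call can be sketched with the Python standard library alone. The base URL below is an assumption (substitute your actual IONOS AI Model Hub endpoint), and `build_chat_payload` is a hypothetical helper for assembling the documented request body; authentication uses the tokenAuth bearer scheme noted above.

```python
import json
import urllib.request

# Assumed base URL -- replace with your actual IONOS AI Model Hub endpoint.
BASE_URL = "https://openai.inference.de-txl.ionos.com/v1"


def build_chat_payload(model, messages, **options):
    """Assemble a request body matching the schema documented above.

    `options` may carry any of the optional parameters: temperature,
    top_p, n, stream, stop, max_tokens, presence_penalty,
    frequency_penalty, logit_bias, user.
    """
    payload = {"model": model, "messages": messages}
    payload.update(options)
    return payload


def create_chat_completion(token, payload):
    """POST the payload to /chat/completions with a Bearer token."""
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


payload = build_chat_payload(
    "meta-llama/Meta-Llama-3-70B-Instruct",
    [{"role": "user", "content": "Say this is a test"}],
    temperature=0.7, top_p=0.9, n=1, stream=False, max_tokens=1000,
)
```

A successful call returns a body shaped like the response sample above; the generated text sits inside the `choices` array.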

Create Completions

Create Completions by calling an available model in a format that is compatible with the OpenAI API

Authorizations:
tokenAuth
Request Body schema: application/json
model
required
string

ID of the model to use

prompt
required
string

The prompt to generate completions from

temperature
number

The sampling temperature to be used

top_p
number

An alternative to sampling with temperature

n
integer

The number of completion choices to generate for each prompt

stream
boolean

If set to true, partial completion tokens are streamed back as they are generated

stop
Array of strings

Up to 4 sequences where the API will stop generating further tokens

max_tokens
integer

The maximum number of tokens to generate in the completion

presence_penalty
number

Penalizes new tokens based on whether they already appear in the text so far

frequency_penalty
number

Penalizes new tokens based on how frequently they appear in the text so far

logit_bias
object

Used to modify the probability of specific tokens appearing in the completion

user
string

A unique identifier representing your end-user

Responses

Request samples

Content type
application/json
{
  "model": "meta-llama/Meta-Llama-3-70B-Instruct",
  "prompt": "Say this is a test",
  "temperature": 0.01,
  "top_p": 0.9,
  "n": 1,
  "stream": false,
  "stop": [ ],
  "max_tokens": 1000,
  "presence_penalty": 0,
  "frequency_penalty": 0,
  "logit_bias": { },
  "user": "user-123"
}

Response samples

Content type
application/json
{
  "id": "string",
  "choices": [ ],
  "created": 0,
  "object": "string",
  "model": "string",
  "usage": { }
}
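The /completions endpoint differs from /chat/completions only in taking a `prompt` string instead of a `messages` array. A minimal sketch, again with the base URL as an assumption and a hypothetical payload-builder helper:

```python
import json
import urllib.request

# Assumed base URL -- replace with your actual IONOS AI Model Hub endpoint.
BASE_URL = "https://openai.inference.de-txl.ionos.com/v1"


def build_completion_payload(model, prompt, **options):
    """Request body for /completions: 'prompt' replaces 'messages'."""
    return {"model": model, "prompt": prompt, **options}


def create_completion(token, payload):
    """POST the payload to /completions with a Bearer token."""
    req = urllib.request.Request(
        f"{BASE_URL}/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


payload = build_completion_payload(
    "meta-llama/Meta-Llama-3-70B-Instruct",
    "Say this is a test",
    temperature=0.01, max_tokens=1000, stop=["\n\n"],
)
```

A low temperature such as 0.01 (as in the request sample above) makes the completion close to deterministic, which suits test prompts like this one.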

Get the entire list of available models

Get the entire list of available models in a format that is compatible with the OpenAI API

Authorizations:
tokenAuth

Responses
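Listing models is a plain authenticated GET. The sketch below assumes the response follows the OpenAI model-list shape (`{"object": "list", "data": [...]}`), which is what "compatible with the OpenAI API" implies but which this page does not spell out; the base URL is again an assumption.

```python
import json
import urllib.request

# Assumed base URL -- replace with your actual IONOS AI Model Hub endpoint.
BASE_URL = "https://openai.inference.de-txl.ionos.com/v1"


def list_models(token):
    """GET /models and return the parsed response body."""
    req = urllib.request.Request(
        f"{BASE_URL}/models",
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def model_ids(models_response):
    """Extract model IDs from an OpenAI-style model list."""
    return [m["id"] for m in models_response.get("data", [])]


# Usage against a response shaped like the OpenAI models list:
sample = {
    "object": "list",
    "data": [{"id": "meta-llama/Meta-Llama-3-70B-Instruct",
              "object": "model"}],
}
```

The extracted IDs are exactly the values accepted by the `model` field of the two endpoints above.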