Text-to-Speech API
Last Updated: 2026-05-11
Speech Synthesis
Service Overview
- Converts input text into natural-sounding speech audio using an AI speech engine.
Service Access
The Speech Synthesis API uses a console-based project access model. Register and activate the service on the ViiTor AI official website (https://www.viitor.com/), then create a project in the console to obtain the required authentication credentials.
The business service endpoint is provided by the project gateway. This document uses the following gateway as an example:
https://video-translation.ilivedata.com
Integration
Parameter Specification
- Request URL: https://video-translation.ilivedata.com/speechSynthesis/textToSpeech
- Method: POST
- Content-Type: application/json
- Response Format: {code, message, data}
HTTP Headers
| Header | Required | Type | Description |
|---|---|---|---|
| Content-Type | Yes | String | application/json;charset=UTF-8 |
| Accept | Yes | String | application/json;charset=UTF-8 |
| Authorization | Yes | String | Login token (Bearer Token) |
| X-User-Id | Yes | Long | User ID. The server overrides userId in the request body with this value. |
| X-Channel | No | Integer | Channel code. Defaults to 0 if missing or invalid. |
| X-App-Source | No | String | Source marker. Effective only when X-Channel != 100. |
Notes:
- The server uses X-User-Id from the headers as the source of truth.
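As an illustration of the header table, the following is a minimal sketch of a helper that assembles the headers for a request. The function name `build_headers` and its parameters are hypothetical and not part of the API. Omitting X-App-Source when the channel is 100 is a client-side choice based on the table's note, not a documented server requirement.

```python
from typing import Dict, Optional


def build_headers(token: str, user_id: int,
                  channel: Optional[int] = None,
                  app_source: Optional[str] = None) -> Dict[str, str]:
    """Assemble the HTTP headers listed in the header table (hypothetical helper)."""
    headers = {
        "Content-Type": "application/json;charset=UTF-8",
        "Accept": "application/json;charset=UTF-8",
        "Authorization": f"Bearer {token}",
        # The server treats this header, not the request body, as the user identity.
        "X-User-Id": str(user_id),
    }
    if channel is not None:
        headers["X-Channel"] = str(channel)
    # X-App-Source is documented as effective only when X-Channel != 100,
    # so this sketch leaves it out for channel 100.
    if app_source is not None and channel != 100:
        headers["X-App-Source"] = app_source
    return headers
```

For example, `build_headers("<TOKEN>", 123456, channel=100)` yields the four required headers plus X-Channel.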
Request Method: POST
Request Body
Field Definitions
| Field | Type | Required | Description |
|---|---|---|---|
| sourceText | String | Yes | Input text to synthesize. Max length is 5000 chars; extra content will be truncated. |
| targetLanguage | String | No | Target language. Must be within the supported language list if provided. |
| voiceName | String | Conditionally required | Voice name. At least one of voiceName or timbreNumber is required. |
| common | Integer | No | Voice scope marker (public/private). |
| speedFactor | Float | No | Speech rate, range [0.5, 2], default 1.0. |
| volume | Float | No | Volume, range [-60, 20], default 0. |
| emotion | Integer | No | Emotion parameter, range [0, 6], default 0. |
| selectedEngine | Integer | No | Preferred synthesis engine. |
| format | String | No | Output format. Supported: pcm/wav/mp3. Default: wav. |
Validation Rules
- sourceText is required.
- If targetLanguage is provided, it must be in the supported language list.
- Invalid format values will fall back to wav.
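The field constraints and validation rules can be checked on the client before sending a request. The sketch below is one possible approach, not the server's actual validation logic. The function name `check_request_body` is hypothetical, the supported language list is not given in this document and is therefore not checked, and whether the server counts the 5000-character limit the same way as Python's `len` is an assumption.

```python
from typing import Any, Dict, List

MAX_TEXT_LENGTH = 5000
SUPPORTED_FORMATS = ("pcm", "wav", "mp3")
# Numeric ranges taken from the field table: (field, low, high).
NUMERIC_RANGES = (("speedFactor", 0.5, 2), ("volume", -60, 20), ("emotion", 0, 6))


def check_request_body(body: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means no problem was found."""
    problems: List[str] = []

    text = body.get("sourceText")
    if not isinstance(text, str) or not text:
        problems.append("sourceText is required")
    elif len(text) > MAX_TEXT_LENGTH:
        problems.append("sourceText exceeds 5000 chars and would be truncated")

    if not body.get("voiceName") and body.get("timbreNumber") is None:
        problems.append("one of voiceName or timbreNumber is required")

    for name, low, high in NUMERIC_RANGES:
        value = body.get(name)
        if value is None:
            continue  # optional field; the server applies its default
        if not isinstance(value, (int, float)) or not (low <= value <= high):
            problems.append(f"{name} must be within [{low}, {high}]")

    fmt = body.get("format")
    if fmt is not None and fmt not in SUPPORTED_FORMATS:
        problems.append("format is not pcm/wav/mp3; the server would fall back to wav")

    return problems
```

Reporting an unsupported format as a problem, rather than silently accepting the wav fallback, is a design choice here: it surfaces a likely typo before audio arrives in an unexpected format.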
Response Body
Standard Response Schema
| Field | Type | Description |
|---|---|---|
| code | Integer | 0 means success; non-zero means failure |
| message | String | Response message |
| data | Object | Business payload |
data Fields on Success
| Field | Type | Description |
|---|---|---|
| taskId | String | Task ID |
| targetAudioUrl | String | Synthesized audio URL |
| sourceLanguage | String | Detected source language |
| targetLanguage | String | Target language |
| speeches | Array | Segmented speech details (if available) |
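To show how the response schema might be consumed, here is a minimal sketch that turns a decoded response into a typed object. The `SynthesisResult` class and `parse_success` function are hypothetical names. The element structure of speeches is not specified in this document, so the sketch keeps the entries untyped.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class SynthesisResult:
    task_id: Optional[str]
    target_audio_url: Optional[str]
    source_language: Optional[str]
    target_language: Optional[str]
    # Element structure is not documented, so entries are kept as-is.
    speeches: List[Any] = field(default_factory=list)


def parse_success(envelope: Dict[str, Any]) -> SynthesisResult:
    """Extract the documented data fields from a {code, message, data} response."""
    if envelope.get("code") != 0:
        raise ValueError(
            f"request failed: code={envelope.get('code')} message={envelope.get('message')}"
        )
    data = envelope.get("data") or {}
    return SynthesisResult(
        task_id=data.get("taskId"),
        target_audio_url=data.get("targetAudioUrl"),
        source_language=data.get("sourceLanguage"),
        target_language=data.get("targetLanguage"),
        speeches=data.get("speeches") or [],  # present only "if available"
    )
```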
Examples
cURL Request
curl -X POST "https://video-translation.ilivedata.com/speechSynthesis/textToSpeech" \
-H "Content-Type: application/json" \
-H "Accept: application/json" \
-H "Authorization: Bearer <TOKEN>" \
-H "X-User-Id: 123456" \
-H "X-Channel: 100" \
-d '{
"sourceText": "Hello, this is a speech synthesis test.",
"targetLanguage": "en",
"voiceName": "clone",
"speedFactor": 1.0,
"volume": 0,
"emotion": 0,
"format": "mp3"
}'
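The same request can be sketched in Python using only the standard library. The function names `build_request` and `synthesize` are hypothetical, and the timeout value is an arbitrary choice. The gateway URL is the example gateway used in this document and may differ for your project.

```python
import json
import urllib.request

# Example gateway from this document; replace with your project gateway.
ENDPOINT = "https://video-translation.ilivedata.com/speechSynthesis/textToSpeech"


def build_request(token: str, user_id: int, body: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request for text-to-speech."""
    payload = json.dumps(body, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(
        ENDPOINT,
        data=payload,
        method="POST",
        headers={
            "Content-Type": "application/json;charset=UTF-8",
            "Accept": "application/json;charset=UTF-8",
            "Authorization": f"Bearer {token}",
            "X-User-Id": str(user_id),
        },
    )


def synthesize(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON response body."""
    # urlopen raises urllib.error.HTTPError for non-2xx HTTP statuses.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call might look like `synthesize(build_request("<TOKEN>", 123456, {"sourceText": "Hello", "voiceName": "clone", "format": "mp3"}))`.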
Success Response Example
{
"code": 0,
"message": "OK",
"data": {
"userId": 123456,
"taskId": "ViiTor_AI_XXXXXXXXXXXXXXXXXXXXXXXX",
"targetAudioUrl": "https://cdn.example.com/tts/result.mp3",
"sourceLanguage": "en",
"targetLanguage": "en"
}
}
Failure Response Example (Invalid Parameters)
{
"code": 2001,
"message": "Invalid Parameter",
"data": null
}
Common Error Codes
| Code | Meaning | Typical Cause |
|---|---|---|
| 2000 | Missing Parameter | Missing required fields (e.g., sourceText) |
| 2001 | Invalid Parameter | Invalid speed/volume/emotion/language/voice params |
| 10091 | insufficient points | User has insufficient points |
| 2107 | voice synthesis failed | Internal synthesis failure |
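One way to handle these codes uniformly is to convert any non-zero code into a single exception type that carries the code and a hint. The sketch below assumes only the four codes listed in the table; the class `TtsApiError`, the function `raise_for_code`, and the hint wording are hypothetical, and other codes may exist.

```python
from typing import Any, Dict

# Hints derived from the "Typical Cause" column of the error code table.
ERROR_HINTS = {
    2000: "a required field such as sourceText is missing",
    2001: "a speed/volume/emotion/language/voice parameter is invalid",
    10091: "the user has insufficient points",
    2107: "synthesis failed inside the service",
}


class TtsApiError(Exception):
    """Raised for any response whose code is not 0."""

    def __init__(self, code: Any, message: Any) -> None:
        self.code = code
        self.message = message
        self.hint = ERROR_HINTS.get(code, "unlisted error code")
        super().__init__(f"[{code}] {message}: {self.hint}")


def raise_for_code(envelope: Dict[str, Any]) -> None:
    """Do nothing on success (code 0); raise TtsApiError otherwise."""
    code = envelope.get("code")
    if code != 0:
        raise TtsApiError(code, envelope.get("message"))
```

Callers can then catch `TtsApiError` in one place and branch on `error.code`, for example to prompt the user to top up when the code is 10091.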
Client Integration Recommendations
- Perform client-side pre-validation for sourceText, voiceName, and format.
- Handle all code != 0 responses uniformly, especially parameter errors and insufficient points.