Audio Transcription
POST /v1/audio/transcriptions
Transcribes audio into text. The audio can be sent as a multipart file upload or referenced by URL.
Endpoint
POST https://api.leaper.one/v1/audio/transcriptions
Models
| Model | Pricing | Description |
|---|---|---|
| rapid | 0.006 credits/min | Fast transcription, supports file_uri |
| whisper-1 | 0.006 credits/min | High accuracy with prompt support |
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| file | file | Either file or file_uri | Audio file to transcribe (multipart upload) |
| file_uri | string | Either file or file_uri | URL of the audio file to transcribe |
| model | string | No | Model to use: "rapid" or "whisper-1". Default: "rapid" |
| response_format | string | No | Output format: "text", "json", "verbose_json", "srt", or "vtt". Default: "json" |
| language | string | No | ISO 639-1 language code (e.g. "en", "zh"). Improves accuracy if specified |
| prompt | string | No | Hint text to improve transcription quality |
| temperature | number | No | Sampling temperature between 0 and 1 (whisper-1 only) |
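The constraints in the table can be checked locally before a request is sent. The bash sketch below is illustrative only: the function name check_transcription_args and its argument order are hypothetical, not part of the API, and it assumes that supplying both file and file_uri at once is an error, which the table does not state.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check for the parameters listed above.
# Usage: check_transcription_args FILE FILE_URI [MODEL] [RESPONSE_FORMAT] [TEMPERATURE]
# Pass an empty string for any argument that is not used.
check_transcription_args() {
  local file="${1:-}" file_uri="${2:-}" model="${3:-rapid}" format="${4:-json}" temperature="${5:-}"

  # One audio source is required. Rejecting both at once is an assumption.
  if [ -z "$file" ] && [ -z "$file_uri" ]; then
    echo "error: provide either file or file_uri" >&2
    return 1
  fi
  if [ -n "$file" ] && [ -n "$file_uri" ]; then
    echo "error: provide only one of file or file_uri" >&2
    return 1
  fi

  # Only the two documented models are accepted.
  case "$model" in
    rapid|whisper-1) ;;
    *) echo "error: unknown model: $model" >&2; return 1 ;;
  esac

  # Only the five documented output formats are accepted.
  case "$format" in
    text|json|verbose_json|srt|vtt) ;;
    *) echo "error: unknown response_format: $format" >&2; return 1 ;;
  esac

  # temperature is documented as whisper-1 only.
  if [ -n "$temperature" ] && [ "$model" != "whisper-1" ]; then
    echo "error: temperature requires model=whisper-1" >&2
    return 1
  fi
  return 0
}
```

Omitted arguments fall back to the documented defaults (rapid and json), so a call such as check_transcription_args recording.mp3 "" succeeds without naming a model.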
Request
Upload a file
curl -X POST https://api.leaper.one/v1/audio/transcriptions \
-H "Authorization: Bearer sk-your-api-key" \
-F file=@recording.mp3 \
-F model=rapid \
-F response_format=json
Pass a URL (recommended for large files)
curl -X POST https://api.leaper.one/v1/audio/transcriptions \
-H "Authorization: Bearer sk-your-api-key" \
-F file_uri=https://example.com/audio.mp3 \
-F model=rapid \
-F language=zh \
-F response_format=verbose_json
Using file_uri avoids uploading large files through your own network, because the audio is fetched server-side. Both the rapid and whisper-1 models support file_uri.
Using whisper-1 with prompt
curl -X POST https://api.leaper.one/v1/audio/transcriptions \
-H "Authorization: Bearer sk-your-api-key" \
-F file=@recording.mp3 \
-F model=whisper-1 \
-F response_format=json \
-F prompt="LEAPERone, GTC, NVIDIA"
Requesting subtitle output
curl -X POST https://api.leaper.one/v1/audio/transcriptions \
-H "Authorization: Bearer sk-your-api-key" \
-F file=@recording.mp3 \
-F model=whisper-1 \
-F response_format=srt
Response
{
"text": "Hello, this is a sample transcription of the audio file."
}
When using verbose_json, the response includes timestamps and segments:
{
"task": "transcribe",
"language": "en",
"duration": 42.5,
"text": "Hello, this is a sample transcription of the audio file.",
"segments": [
{ "start": 0.0, "end": 2.4, "text": "Hello, this is a sample transcription." }
]
}
When using srt, the response is plain-text subtitle output:
1
00:00:00,000 --> 00:00:02,400
Hello, this is a sample transcription.
When using vtt, the response is returned as WebVTT:
WEBVTT

00:00:00.000 --> 00:00:02.400
Hello, this is a sample transcription.
Supported Audio Formats
mp3, mp4, mpeg, mpga, m4a, wav, webm, opus
File uploads are limited to 25 MB. When using file_uri, there is no size limit.
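The 25 MB upload limit suggests a simple rule for choosing between the two input fields. The bash sketch below is one way to apply it, under stated assumptions: the function name audio_form_field is hypothetical, 25 MB is taken to mean 25 × 1024 × 1024 bytes (the exact byte count is not documented), and the URL is a placeholder for wherever you host the same audio.

```shell
#!/usr/bin/env bash
# Assumption: "25 MB" means 25 * 1024 * 1024 bytes; the docs give no byte count.
MAX_UPLOAD_BYTES=$((25 * 1024 * 1024))

# Hypothetical helper: print the curl -F argument to use for an audio source.
# $1 = local file path, $2 = URL where the same audio is hosted (may be empty).
audio_form_field() {
  local path="$1" url="${2:-}" size

  if [ ! -f "$path" ]; then
    echo "error: no such file: $path" >&2
    return 1
  fi

  # wc -c counts bytes; tr strips the padding some wc implementations add.
  size=$(wc -c < "$path" | tr -d '[:space:]')

  if [ "$size" -le "$MAX_UPLOAD_BYTES" ]; then
    # Small enough for a multipart upload.
    echo "file=@$path"
  elif [ -n "$url" ]; then
    # Too large to upload: let the server fetch the audio instead.
    echo "file_uri=$url"
  else
    echo "error: $path exceeds the upload limit and no URL was given" >&2
    return 1
  fi
}
```

The printed value can be passed straight to curl, for example -F "$(audio_form_field recording.mp3 https://example.com/audio.mp3)", alongside the usual model and response_format fields.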
Notes
- Billing is based on audio duration, charged at the per-model rate listed above.
- If no model is specified, rapid is used by default.
- Streaming (stream=true) is not supported at this time.
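Because billing follows audio duration, a rough cost can be estimated from a duration in seconds, such as the duration field of a verbose_json response. The bash sketch below assumes billing is linear in seconds with no rounding or minimum charge; the documentation does not describe rounding, so treat the result as an estimate. The function name estimate_credits is hypothetical.

```shell
#!/usr/bin/env bash
# Rate listed for both models in the pricing table, in credits per minute.
RATE_PER_MIN=0.006

# Hypothetical helper: estimate credits for a duration given in seconds.
# Assumes linear, unrounded billing, which the docs do not confirm.
estimate_credits() {
  awk -v secs="$1" -v rate="$RATE_PER_MIN" 'BEGIN { printf "%.6f\n", secs / 60 * rate }'
}

estimate_credits 42.5   # the duration from the verbose_json example; prints 0.004250
```

A ten-minute recording (600 seconds) comes out at 0.060000 credits under the same assumptions.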