LEAPERone Docs

Audio Transcription

Transcribe audio files to text.

The Audio Transcription endpoint converts speech to text. You can either upload an audio file (the file field) or pass a URL to hosted audio (the file_uri field).

Models

| Model | Pricing | Best for |
| --- | --- | --- |
| rapid (default) | 0.006 credits/min | Fast, general-purpose transcription |
| whisper-1 | 0.006 credits/min | High accuracy with prompt support |

Quick Start

POST /v1/audio/transcriptions
curl -X POST https://api.leaper.one/v1/audio/transcriptions \
  -H "Authorization: Bearer sk-your-api-key" \
  -F file=@meeting.mp3
Response
{
  "text": "Welcome to today's meeting. Let's start with the agenda..."
}
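The same upload can be sketched in Python using only the standard library. This is a minimal sketch rather than an official client: the helper names `build_multipart` and `transcribe_file` are hypothetical, and it assumes the endpoint accepts an ordinary multipart/form-data body with the audio in a field named `file`, as the curl `-F` example suggests.

```python
import json
import os
import urllib.request
import uuid

API_URL = "https://api.leaper.one/v1/audio/transcriptions"


def build_multipart(fields, file_name=None, file_bytes=None):
    """Encode form fields (and optionally one file) as multipart/form-data.

    Returns a (body_bytes, content_type_header) tuple.
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f'{value}\r\n'.encode("utf-8")
        )
    if file_name is not None:
        header = (
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="file"; filename="{file_name}"\r\n'
            'Content-Type: application/octet-stream\r\n\r\n'
        ).encode("utf-8")
        parts.append(header + file_bytes + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def transcribe_file(path, api_key, **fields):
    """Upload a local audio file and return the parsed JSON response.

    Assumes a JSON response (response_format json or verbose_json).
    """
    with open(path, "rb") as audio_file:
        audio = audio_file.read()
    body, content_type = build_multipart(
        fields, file_name=os.path.basename(path), file_bytes=audio
    )
    request = urllib.request.Request(
        API_URL,
        data=body,
        headers={"Authorization": f"Bearer {api_key}", "Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a real key, `transcribe_file("meeting.mp3", "sk-your-api-key")["text"]` would return the transcript string shown in the response above.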

Using file_uri

Instead of uploading a file, you can pass a URL. This is recommended for large files or when your audio is already hosted online.

Transcribe from URL
curl -X POST https://api.leaper.one/v1/audio/transcriptions \
  -H "Authorization: Bearer sk-your-api-key" \
  -F file_uri=https://example.com/podcast-episode.mp3 \
  -F language=en \
  -F response_format=verbose_json

file_uri accepts any publicly accessible audio URL. Supported formats include mp3, opus, m4a, wav, and more. There is no file size limit when you pass a URL.
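When building the request in code, it can help to assemble and sanity-check the form fields before sending them. The sketch below is illustrative only: `build_uri_fields` is a hypothetical helper, and the http/https check is an assumption about what "publicly accessible URL" means, not a documented rule of the API.

```python
from urllib.parse import urlparse


def build_uri_fields(file_uri, language=None, response_format=None, model=None):
    """Return the form fields for a URL-based transcription request."""
    parsed = urlparse(file_uri)
    # Assumption: only absolute http(s) URLs can be fetched by the service.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"file_uri must be an absolute http(s) URL: {file_uri!r}")

    fields = {"file_uri": file_uri}
    # Optional parameters are only sent when the caller sets them,
    # so the API defaults apply otherwise.
    if model is not None:
        fields["model"] = model
    if language is not None:
        fields["language"] = language
    if response_format is not None:
        fields["response_format"] = response_format
    return fields


fields = build_uri_fields(
    "https://example.com/podcast-episode.mp3",
    language="en",
    response_format="verbose_json",
)
```

The resulting dictionary mirrors the three `-F` arguments in the curl command above and can be sent as multipart form fields by any HTTP client.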

Choosing a Model

rapid (default)

Best for quick transcription without extra configuration. Because rapid is the default, no model parameter is needed. It supports the language and prompt parameters.

Using rapid
curl -X POST https://api.leaper.one/v1/audio/transcriptions \
  -H "Authorization: Bearer sk-your-api-key" \
  -F file=@meeting.mp3 \
  -F language=zh \
  -F response_format=json

whisper-1

OpenAI's Whisper model. It supports the prompt parameter to improve recognition of specific terms, and the verbose_json response format for word-level timestamps.

Using whisper-1 with prompt
curl -X POST https://api.leaper.one/v1/audio/transcriptions \
  -H "Authorization: Bearer sk-your-api-key" \
  -F file=@meeting.mp3 \
  -F model=whisper-1 \
  -F response_format=verbose_json \
  -F prompt="LEAPERone, API, transcription"
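If the terms for the prompt come from a glossary or a list in your application, a small helper can turn them into a prompt string like the one above. This is a sketch under assumptions: `build_prompt` is a hypothetical name, and the comma-separated layout is inferred from the example prompt rather than required by the API.

```python
def build_prompt(terms):
    """Join domain terms into a comma-separated prompt, dropping duplicates."""
    seen = set()
    unique = []
    for term in terms:
        cleaned = term.strip()
        key = cleaned.lower()  # compare case-insensitively, keep original casing
        if cleaned and key not in seen:
            seen.add(key)
            unique.append(cleaned)
    return ", ".join(unique)


# Form fields matching the curl example above (the file itself is sent separately).
fields = {
    "model": "whisper-1",
    "response_format": "verbose_json",
    "prompt": build_prompt(["LEAPERone", "API", "transcription", "api"]),
}
```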

Supported Formats

| Format | Extension |
| --- | --- |
| MP3 | .mp3 |
| MP4 | .mp4 |
| MPEG | .mpeg, .mpga |
| M4A | .m4a |
| WAV | .wav |
| WebM | .webm |
| Opus | .opus |
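A client can check a file's extension against this list before uploading, which avoids a wasted request for an unsupported file. The sketch below assumes the check is done purely on the file extension; `is_supported_audio` is a hypothetical helper, and the service itself may inspect the file contents instead.

```python
from pathlib import Path

# Extensions taken from the Supported Formats list above.
SUPPORTED_EXTENSIONS = {
    ".mp3", ".mp4", ".mpeg", ".mpga", ".m4a", ".wav", ".webm", ".opus",
}


def is_supported_audio(path):
    """Return True if the file extension is in the supported list."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS
```

The comparison is case-insensitive, so `meeting.MP3` is treated the same as `meeting.mp3`.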

Response Formats

Set response_format to control the output:

| Value | Description |
| --- | --- |
| json | JSON object with a text field (default). |
| text | Plain text transcription. |
| verbose_json | JSON with timestamps, segments, and metadata. |
| srt | SubRip subtitle format. |
| vtt | WebVTT subtitle format. |
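Because some formats return JSON and others return plain text, client code usually needs to branch on the format it requested. The following is a minimal sketch: `parse_transcription` is a hypothetical helper, and it assumes that response bodies are UTF-8 and that text, srt, and vtt come back as raw text rather than wrapped in JSON.

```python
import json

JSON_FORMATS = {"json", "verbose_json"}
TEXT_FORMATS = {"text", "srt", "vtt"}


def parse_transcription(body, response_format="json"):
    """Decode a raw response body according to the requested response_format.

    Returns a dict for JSON formats and a str for text-based formats.
    """
    if response_format in JSON_FORMATS:
        return json.loads(body.decode("utf-8"))
    if response_format in TEXT_FORMATS:
        return body.decode("utf-8")
    raise ValueError(f"unsupported response_format: {response_format!r}")
```

For the default format, the transcript is then available as `parse_transcription(body)["text"]`.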

Billing is based on audio duration. See the API Reference for per-model pricing.
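As a rough illustration of duration-based billing, the sketch below estimates the cost of a recording from the rates in the Models table. It assumes usage is prorated by the second with no rounding or minimum charge, which this page does not state; `estimate_credits` is a hypothetical helper, so treat the result as an estimate only.

```python
from decimal import Decimal

# Rates from the Models table; check the API Reference for current pricing.
CREDITS_PER_MINUTE = {
    "rapid": Decimal("0.006"),
    "whisper-1": Decimal("0.006"),
}


def estimate_credits(duration_seconds, model="rapid"):
    """Estimate the credits for an audio clip of the given length in seconds."""
    minutes = Decimal(str(duration_seconds)) / Decimal(60)
    return minutes * CREDITS_PER_MINUTE[model]
```

Under these assumptions, a 45-minute recording (`estimate_credits(45 * 60)`) comes to 0.27 credits with either model.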