Despeech Docs
Despeech offers a simple API for scheduling transcription jobs and retrieving their output. Each endpoint below requires an Authorization header carrying a Bearer token set to the API key shown in your dashboard:
POST /api/v1/transcribe
Accepts a 'url' argument giving the location of the audio to transcribe. Optionally accepts a 'model' with which to transcribe. Responds with the id and initial status of the transcription job created.
Request:
curl -X POST https://despeech.com/api/v1/transcribe \
-H "Authorization: Bearer $API_KEY" \
-H "Content-Type: application/json" \
-d '{"url": "https://myaudiofile.com/audio.mp3"}'
To enable speaker diarization, set "diarization": true in the request body. Optionally provide both min_speakers and max_speakers to constrain the number of speakers detected.
Diarization request:
curl -X POST https://despeech.com/api/v1/transcribe \
-H "Authorization: Bearer $API_KEY" \
-H "Content-Type: application/json" \
-d '{"url": "https://myaudiofile.com/audio.mp3", "diarization": true, "min_speakers": 2, "max_speakers": 4}'
Response:
{
"id": "beaa6e89-3935-4c72-9ed1-f06237832388-e1",
"status": "IN_QUEUE"
}
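The request body described above can also be assembled programmatically. The following is a minimal Python sketch, not an official client: the helper names build_transcribe_request and submit are hypothetical, and the checks on min_speakers and max_speakers are client-side assumptions (the wording above suggests the two bounds are supplied together, but the API may accept one alone).

```python
import json
import urllib.request

API_URL = "https://despeech.com/api/v1/transcribe"  # endpoint documented above


def build_transcribe_request(url, api_key, model=None, diarization=False,
                             min_speakers=None, max_speakers=None):
    """Return (headers, body) for POST /api/v1/transcribe without sending it."""
    payload = {"url": url}
    if model is not None:
        payload["model"] = model
    if diarization:
        payload["diarization"] = True
        # Assumption: speaker bounds come as a pair, or not at all.
        if (min_speakers is None) != (max_speakers is None):
            raise ValueError("provide both min_speakers and max_speakers, or neither")
        if min_speakers is not None:
            # Assumption: a client-side sanity check, not documented API behaviour.
            if min_speakers > max_speakers:
                raise ValueError("min_speakers must not exceed max_speakers")
            payload["min_speakers"] = min_speakers
            payload["max_speakers"] = max_speakers
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    return headers, json.dumps(payload)


def submit(url, api_key, **options):
    """Send the request and return the decoded JSON response (id and status)."""
    headers, body = build_transcribe_request(url, api_key, **options)
    req = urllib.request.Request(API_URL, data=body.encode("utf-8"),
                                 headers=headers, method="POST")
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Calling submit("https://myaudiofile.com/audio.mp3", api_key, diarization=True, min_speakers=2, max_speakers=4) should send the same body as the diarization curl request above.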
POST /api/v1/status/{id}
Returns the status of the transcription, along with output if complete.
Request:
curl -X POST https://despeech.com/api/v1/status/beaa6e89-3935-4c72-9ed1-f06237832388-e1 \
-H "Authorization: Bearer $API_KEY"
In-progress response:
{
"id": "beaa6e89-3935-4c72-9ed1-f06237832388-e1",
"status": "PROCESSING"
}
Completed response:
{
"delayTime": 11829,
"executionTime": 2668,
"id": "beaa6e89-3935-4c72-9ed1-f06237832388-e1",
"output": {
"detected_language": "en",
"device": "cuda",
"model": "base",
"transcription": "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Morbi magna velit, aliquam eget metus eget, suscipit tempus orci.",
"translation": null,
"segments": {
...
}
},
"status": "COMPLETED",
"workerId": "w7it92xoifddgx"
}
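Because a job reports IN_QUEUE or PROCESSING before it reaches COMPLETED, a client typically polls the status endpoint until the status changes. The sketch below shows one way to do that in Python. The function names, the polling interval, and the timeout are all assumptions, and since these docs do not list any failure statuses, the loop simply returns whatever non-pending status it sees and leaves the check to the caller.

```python
import json
import time
import urllib.request

STATUS_URL = "https://despeech.com/api/v1/status/{id}"  # endpoint documented above
PENDING = {"IN_QUEUE", "PROCESSING"}  # the only non-final statuses shown in these docs


def fetch_status(job_id, api_key):
    """POST to the status endpoint and return the decoded JSON body."""
    req = urllib.request.Request(
        STATUS_URL.format(id=job_id),
        headers={"Authorization": f"Bearer {api_key}"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def wait_for_job(job_id, api_key, fetch=fetch_status,
                 interval=2.0, timeout=600.0, sleep=time.sleep):
    """Poll until the job leaves the pending statuses, then return the response.

    `fetch` and `sleep` are injectable so the loop can be exercised offline.
    The timeout counts time spent sleeping, not time spent on requests.
    """
    waited = 0.0
    while True:
        job = fetch(job_id, api_key)
        if job["status"] not in PENDING:
            return job  # the caller should still check for "COMPLETED"
        if waited >= timeout:
            raise TimeoutError(f"job {job_id} still {job['status']} after {waited:.0f}s")
        sleep(interval)
        waited += interval
```

A caller would then read job["output"] only when job["status"] == "COMPLETED".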
Completed response (Diarized):
{
"delayTime": 11829,
"executionTime": 5432,
"id": "beaa6e89-3935-4c72-9ed1-f06237832388-e1",
"output": {
"segments": [
{ "start": 0.0, "end": 3.2, "text": "Lorem ipsum dolor sit amet.", "speaker": "SPEAKER_00" },
{ "start": 3.5, "end": 7.1, "text": "Consectetur adipiscing elit.", "speaker": "SPEAKER_01" }
]
},
"status": "COMPLETED",
"workerId": "w7it92xoifddgx"
}
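The diarized segments list lends itself to a speaker-labelled transcript. As an illustration only, the Python sketch below merges consecutive segments from the same speaker into one turn and prints one line per turn. The helper names are hypothetical, and the third segment in the sample data is invented to show the merge; only the first two come from the response above.

```python
def merge_speaker_turns(segments):
    """Join consecutive segments by the same speaker into a single turn."""
    turns = []
    for seg in segments:
        if turns and turns[-1]["speaker"] == seg["speaker"]:
            # Same speaker keeps talking: extend the current turn.
            turns[-1]["end"] = seg["end"]
            turns[-1]["text"] += " " + seg["text"]
        else:
            turns.append(dict(seg))  # copy so the input list is left untouched
    return turns


def format_turns(turns):
    """Render turns as '[start-end] SPEAKER: text', one per line."""
    return "\n".join(
        f'[{t["start"]:.1f}-{t["end"]:.1f}] {t["speaker"]}: {t["text"]}'
        for t in turns
    )


segments = [
    {"start": 0.0, "end": 3.2, "text": "Lorem ipsum dolor sit amet.", "speaker": "SPEAKER_00"},
    {"start": 3.5, "end": 7.1, "text": "Consectetur adipiscing elit.", "speaker": "SPEAKER_01"},
    # Hypothetical extra segment, added here only to demonstrate merging.
    {"start": 7.4, "end": 9.0, "text": "Morbi magna velit.", "speaker": "SPEAKER_01"},
]
print(format_turns(merge_speaker_turns(segments)))
```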
GET /api/v1/transcript/{id}
Returns the transcription output for the job with the given id.
Request:
curl -X GET https://despeech.com/api/v1/transcript/beaa6e89-3935-4c72-9ed1-f06237832388-e1 \
-H "Authorization: Bearer $API_KEY"
Response:
{
"transcript_id": "beaa6e89-3935-4c72-9ed1-f06237832388-e1",
"transcript": {
"detected_language": "en",
"device": "cuda",
"model": "base",
"transcription": "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Morbi magna velit, aliquam eget metus eget, suscipit tempus orci.",
"translation": null,
"segments": {
...
}
}
}