With the Speechly Batch API you can transcribe a set of audio files asynchronously.
To transcribe multiple audio files, repeat steps 2 and 3.
This example transcribes the following sample audio file and prints the results of the speech-to-text operation in the terminal.
Start by opening a bash, sh or zsh shell in a Unix-like environment (macOS, Linux or Windows Subsystem for Linux).
Store a valid app id from Speechly Dashboard in a shell variable. You can use any English, speech-to-text only app configuration.
# Copy a valid app id from Speechly Dashboard
SPEECHLY_APP_ID=my_app_id
Call the Login method from speechly.identity.v2.IdentityAPI with curl.
curl -X POST https://api.speechly.com/speechly.identity.v2.IdentityAPI/Login \
-H 'Content-Type: application/json' \
-d \
'{
"deviceId": "'`uuidgen`'",
"application": {
"appId": "'$SPEECHLY_APP_ID'"
}
}'
Copy the authorization token’s value from the response and store it in a shell variable for the following requests.
SPEECHLY_AUTH_TOKEN=my_token_value
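Instead of copying the value by hand, the token can be pulled out of the response in the shell. The following is a minimal sketch, assuming the Login response is a JSON object whose token is stored in a string field named `token`. That field name, the `json_string_field` helper and the sample response are illustrative assumptions, not taken from the API reference.

```shell
# Hypothetical helper: print the value of a string field from a JSON document on stdin.
# Sketch only: it assumes the value contains no escaped quotes.
# A dedicated JSON tool is more robust for real use.
json_string_field() {
  field=$1
  tr -d '\n' | sed -n 's/.*"'"$field"'"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Sample response invented for illustration; the real response may contain other fields.
sample_response='{"token": "example-token-value"}'

SPEECHLY_AUTH_TOKEN=$(printf '%s' "$sample_response" | json_string_field token)
echo "$SPEECHLY_AUTH_TOKEN"
```

With a real request, the output of the Login call would be piped into the helper in place of the sample response.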
Call the ProcessAudio method from speechly.slu.v1.BatchAPI to queue an audio file URI for processing.
# Send an audio file for processing
curl -X POST https://api.speechly.com/speechly.slu.v1.BatchAPI/ProcessAudio \
-H 'Content-Type: application/json' \
-H 'authorization: Bearer '$SPEECHLY_AUTH_TOKEN \
-d \
'[{
"appId": "'$SPEECHLY_APP_ID'",
"config": {
"encoding": 1,
"sampleRateHertz": 16000,
"channels": 1
},
"uri": "https://dreamy-cori-a02de1.netlify.app/test1_en.wav"
}]'
Copy the operation id from the response and store it in a shell variable for querying the progress and transcription results.
SPEECHLY_OPERATION_ID=my_operation_id
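When several files are queued, it can help to generate the request body per URI rather than editing it by hand. The sketch below assumes every file shares the audio configuration from the example request above. The helper names `process_audio_body` and `queue_audio` and the example.com URI are hypothetical.

```shell
# Hypothetical helper: print a ProcessAudio request body for one audio URI.
# The config values mirror the example request above; adjust them to match your audio.
process_audio_body() {
  app_id=$1
  uri=$2
  printf '[{"appId": "%s", "config": {"encoding": 1, "sampleRateHertz": 16000, "channels": 1}, "uri": "%s"}]\n' "$app_id" "$uri"
}

# Hypothetical wrapper: queue one URI and print the raw response.
# Expects SPEECHLY_APP_ID and SPEECHLY_AUTH_TOKEN to be set as in the earlier steps.
queue_audio() {
  curl -s -X POST https://api.speechly.com/speechly.slu.v1.BatchAPI/ProcessAudio \
    -H 'Content-Type: application/json' \
    -H "authorization: Bearer $SPEECHLY_AUTH_TOKEN" \
    -d "$(process_audio_body "$SPEECHLY_APP_ID" "$1")"
}

# Preview the body that would be sent (no network request is made here).
process_audio_body my_app_id https://example.com/audio1.wav
```

Calling `queue_audio` once per URI in a `for` loop queues each file in turn. Each response carries its own operation id, which needs to be stored separately.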
Call the QueryStatus method from speechly.slu.v1.BatchAPI to get the current status of the transcription operation. Call the method periodically until the status changes to STATUS_DONE.
curl -X POST https://api.speechly.com/speechly.slu.v1.BatchAPI/QueryStatus \
-H 'Content-Type: application/json' \
-H 'authorization: Bearer '$SPEECHLY_AUTH_TOKEN \
-d \
'{
"id": "'$SPEECHLY_OPERATION_ID'"
}'
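The periodic check can be scripted as a simple polling loop. This is a rough sketch rather than an official client: `poll_until_done` and `query_status` are hypothetical helpers, and `POLL_INTERVAL` and `MAX_ATTEMPTS` are illustrative shell variables, not API settings. The loop only looks for STATUS_DONE in the output and gives up after a fixed number of attempts.

```shell
# Hypothetical helper: run the given command repeatedly until its output
# reports STATUS_DONE, then print that final output.
# POLL_INTERVAL (seconds) and MAX_ATTEMPTS are illustrative knobs.
poll_until_done() {
  attempts=0
  while [ "$attempts" -lt "${MAX_ATTEMPTS:-60}" ]; do
    response=$("$@")
    case $response in
      *STATUS_DONE*)
        printf '%s\n' "$response"
        return 0
        ;;
    esac
    attempts=$((attempts + 1))
    sleep "${POLL_INTERVAL:-5}"
  done
  echo "operation did not finish in time" >&2
  return 1
}

# Hypothetical wrapper around the QueryStatus request.
# Expects SPEECHLY_AUTH_TOKEN and SPEECHLY_OPERATION_ID to be set as in the earlier steps.
query_status() {
  curl -s -X POST https://api.speechly.com/speechly.slu.v1.BatchAPI/QueryStatus \
    -H 'Content-Type: application/json' \
    -H "authorization: Bearer $SPEECHLY_AUTH_TOKEN" \
    -d '{"id": "'"$SPEECHLY_OPERATION_ID"'"}'
}
```

Running `poll_until_done query_status` would then print the final response once the operation is done. The sketch does not distinguish a failed operation from one that is still in progress, so a failure only surfaces as a timeout.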
The response for a finished operation contains a transcripts array with all the detected words:
// Response JSON
{
"operation": {
"id": "12345678-1234-1234-1234-123456789012",
"status": "STATUS_DONE",
"appId": "12345678-1234-1234-1234-123456789012",
"deviceId": "12345678-1234-1234-1234-123456789012",
"transcripts": [
{
"word": "BANANAS",
"index": 0,
"startTime": 300,
"endTime": 1300
},
{
"word": "APPLES",
"index": 1,
"startTime": 2050,
"endTime": 3120
}
]
}
}
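To turn the transcripts array into plain text, the word values can be extracted and joined. The sketch below uses only grep, sed and paste and works on an abbreviated copy of the response above. The `transcript_words` helper is hypothetical, and the pattern matching assumes words contain no escaped quotes.

```shell
# Hypothetical helper: print the "word" values from a QueryStatus response on stdin, one per line.
transcript_words() {
  grep -o '"word"[[:space:]]*:[[:space:]]*"[^"]*"' | sed 's/.*:[[:space:]]*"\([^"]*\)"$/\1/'
}

# Abbreviated copy of the response shown above.
sample_response='{
  "operation": {
    "status": "STATUS_DONE",
    "transcripts": [
      {"word": "BANANAS", "index": 0, "startTime": 300, "endTime": 1300},
      {"word": "APPLES", "index": 1, "startTime": 2050, "endTime": 3120}
    ]
  }
}'

# Join the words into a single line of text.
# prints: BANANAS APPLES
printf '%s\n' "$sample_response" | transcript_words | paste -sd ' ' -
```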
Last updated by Mathias Lindholm on November 2, 2022 at 11:47 +0200