Library | Installation and usage | API Reference |
---|---|---|
JavaScript | Installation and usage (GitHub) | API reference (TypeDoc in GitHub) |
React | Installation and usage | API reference (TypeDoc in GitHub) |
Unity and C# | Installation and usage (GitHub) | API reference (DocFX) |
Android (Kotlin) | Installation and usage (GitHub) | |
iOS (Swift) | Installation and usage (GitHub) | |
See also Web Speech API, Web Components, Unreal Engine 4 and gRPC protobuf definitions.
// Pseudocode. Preparation steps may vary per client library.
// See specific API reference for details.
speechly_client = new SpeechlyClient()
speechly_client.initialize()
// Open the microphone or audio source before attaching
speechly_client.attach( audio_source )
Preparation consists of the following tasks:
An authorization token needs to be provided during the preparation steps. Either an `app_id` or a `project_id` can be used. Both can be acquired via the Dashboard or the Command Line Tool.
// Pseudocode. See specific API reference for details.
speechly_client.onSegmentChange( fn(segment) )
`fn` is the function you pass to handle the words, intents and entities detected by the speech recognition engine. They are passed in a `segment` structure.
As the user speaks, the handler is called repeatedly with updated results.
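Because the handler receives the same segment several times, it helps to keep only the latest version of each one. The following is a minimal sketch of that idea, not part of any client library: `SegmentStore` and the trimmed-down `Segment` shape are hypothetical names that mirror the pseudocode on this page.

```typescript
// Hypothetical shapes mirroring the Segment pseudocode on this page.
interface Word {
  index: number;
  value: string;
  isFinal: boolean;
}

interface Segment {
  contextId: string;
  id: number;
  isFinal: boolean;
  words: Word[];
}

// Keeps only the most recent version of each segment,
// keyed by audio context and segment index.
class SegmentStore {
  private latest = new Map<string, Segment>();

  // Intended to be passed as the segment change handler.
  handle = (segment: Segment): void => {
    this.latest.set(`${segment.contextId}/${segment.id}`, segment);
  };

  // Segments that will not receive further updates.
  finalized(): Segment[] {
    return Array.from(this.latest.values()).filter((s) => s.isFinal);
  }

  // Number of distinct segments seen so far.
  size(): number {
    return this.latest.size;
  }
}
```

A user interface would typically redraw from the stored segment on every call and trigger side effects (such as running a search) only for segments returned by `finalized()`.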
// Pseudocode
context_id = await speechly_client.start( app_id )
Starts streaming audio from the microphone (or other audio source) to the speech recognition engine. The client library gathers result events from the HTTP/gRPC API and fires `onSegmentChange` and other relevant callbacks.
Authorizing with a `project_id` during the preparation allows you to direct the audio to any app configuration within the project by providing the `app_id` argument.
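The relationship between the two authorization modes and the `app_id` argument can be sketched as follows. This is an illustration only: `Credentials` and `resolveTargetApp` are hypothetical names, and the error cases are assumptions rather than documented behavior of any client library.

```typescript
// Hypothetical credential shape: exactly one of the two ids is set during preparation.
type Credentials =
  | { kind: "app"; appId: string }
  | { kind: "project"; projectId: string };

// Decides which app configuration a start() call would target.
function resolveTargetApp(credentials: Credentials, appIdArg?: string): string {
  if (credentials.kind === "app") {
    // Assumption: with app_id authorization the target is fixed,
    // so a conflicting argument is treated as an error.
    if (appIdArg !== undefined && appIdArg !== credentials.appId) {
      throw new Error("app_id argument conflicts with app_id authorization");
    }
    return credentials.appId;
  }
  // Assumption: with project_id authorization the caller chooses the app explicitly.
  if (appIdArg === undefined) {
    throw new Error("project_id authorization needs an app_id argument");
  }
  return appIdArg;
}
```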
// Pseudocode
await speechly_client.stop()
Stops streaming audio to the speech recognition engine and waits for the remaining results to arrive. Callbacks fire until the audio stream has been fully processed.
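The ordering of these calls can be tried out without a microphone by using an in-memory stand-in. The sketch below is not a real client: `FakeSpeechClient`, its fixed context id and its canned results are all hypothetical. It only illustrates that results may still reach the handler while `stop()` is pending.

```typescript
type SegmentUpdate = { contextId: string; id: number; isFinal: boolean };
type Handler = (segment: SegmentUpdate) => void;

// In-memory stand-in for a client library; not a real Speechly class.
class FakeSpeechClient {
  private handler: Handler = () => {};
  private contextId: string | null = null;
  private pending: boolean[] = [];

  onSegmentChange(handler: Handler): void {
    this.handler = handler;
  }

  async start(): Promise<string> {
    this.contextId = "ctx-1"; // a real service would return a UUID
    this.pending = [false, false, true]; // two tentative updates, then the final one
    return this.contextId;
  }

  async stop(): Promise<void> {
    const contextId = this.contextId;
    if (contextId === null) return;
    // Deliver the remaining results before resolving.
    for (const isFinal of this.pending) {
      this.handler({ contextId, id: 0, isFinal });
    }
    this.pending = [];
    this.contextId = null;
  }
}

// Runs one start/stop cycle and records the isFinal flag of every update.
async function runSession(): Promise<boolean[]> {
  const client = new FakeSpeechClient();
  const seen: boolean[] = [];
  client.onSegmentChange((segment) => {
    seen.push(segment.isFinal);
  });
  await client.start();
  await client.stop();
  return seen;
}
```

Awaiting `stop()` before reading the results is the point of the sketch: code that inspects the segment right after calling `stop()` without awaiting it may miss the final update.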
// Pseudocode
struct Segment {
contextId: string,
id: int,
isFinal: boolean,
intent: Intent,
entities: list<Entity>,
words: list<Transcript>
}
Name | Type | Description |
---|---|---|
contextId | string | The audio context to which this segment belongs (UUID). |
id | int | The index (zero-based) of this segment within the audio context. An audio context can consist of several consecutive segments. |
isFinal | boolean | Indicates whether this is the last time the callback is called with this segment. Subsequent calls to the callback within the same audio context refer to the next segment. Note that none of the data associated with this segment is carried over to the next segment. |
intent | Intent | The intent associated with this segment. There can only be one intent for a segment. |
entities | list<Entity> | A list of entities. There can be several entities that belong to the same segment. |
words | list<Transcript> | A list of Transcript objects. Together these contain the text produced by speech recognition. |
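For readers working in TypeScript, the structure above could be written down roughly as follows. These interfaces are a sketch derived from the pseudocode and tables on this page, not the types exported by any client library; the `type` field on `Entity` follows the Entity table, and `segmentText` is a hypothetical helper.

```typescript
interface Transcript {
  index: number;
  value: string;
  isFinal: boolean;
}

interface Intent {
  name: string;
  isFinal: boolean;
}

interface Entity {
  type: string;
  value: string;
  isFinal: boolean;
  startIndex: number;
  endIndex: number;
}

interface Segment {
  contextId: string;
  id: number;
  isFinal: boolean;
  intent: Intent;
  entities: Entity[];
  words: Transcript[];
}

// Joins the words of a segment into one string, ordered by their index.
function segmentText(segment: Segment): string {
  return segment.words
    .slice() // copy so the caller's array is not reordered
    .sort((a, b) => a.index - b.index)
    .map((word) => word.value)
    .join(" ");
}
```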
Intent { name: string, isFinal: boolean }
Name | Type | Description |
---|---|---|
name | string | Name of the intent. |
isFinal | boolean | Boolean that indicates if the intent name is finalised. When isFinal is false, the intent name may still change in subsequent calls to the callback. When isFinal is true, the intent name is guaranteed not to change until the segment changes. |
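One way to use this flag is to act on an intent only once its name can no longer change. The helper below is a small sketch under that assumption; `stableIntentName` is a hypothetical name and the `Intent` shape is copied from the pseudocode above.

```typescript
interface Intent {
  name: string;
  isFinal: boolean;
}

// Returns the intent name once it is final, or null while it may still change.
function stableIntentName(intent: Intent): string | null {
  return intent.isFinal ? intent.name : null;
}
```

Tentative intents are still useful for early visual feedback; the `null` result here only signals that irreversible actions are better deferred.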
Entity { type: string, value: string, isFinal: boolean,
startIndex: int, endIndex: int }
Name | Type | Description |
---|---|---|
type | string | The name of the entity. |
value | string | The value of the entity. |
isFinal | boolean | Boolean that indicates if the entity is finalised. Behaves in the same way as Intent.isFinal. |
startIndex | int | Index of the Transcript that contains the first token of the transcript span this entity was extracted from. |
endIndex | int | Index of the Transcript that contains the last token of the transcript span this entity was extracted from. |
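The two indices make it possible to recover the words an entity was extracted from. The sketch below assumes that `endIndex` is inclusive, meaning it points at the last word of the span; verify this against the API reference of the library you use before relying on it. `entityWords` is a hypothetical helper, not a library function.

```typescript
interface Transcript {
  index: number;
  value: string;
}

interface Entity {
  type: string;
  value: string;
  startIndex: number;
  endIndex: number;
}

// Returns the words covered by the entity, in transcript order.
// Assumption: endIndex is inclusive.
function entityWords(entity: Entity, words: Transcript[]): string[] {
  return words
    .filter((word) => word.index >= entity.startIndex && word.index <= entity.endIndex)
    .sort((a, b) => a.index - b.index)
    .map((word) => word.value);
}
```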
Transcript { index: int, value: string, isFinal: boolean }
Name | Type | Description |
---|---|---|
index | int | Position of this Transcript in the complete transcript. |
value | string | The word of this Transcript. |
isFinal | boolean | Boolean that indicates if the word associated with this Transcript is final, or if it can change in subsequent calls to the callback. |
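A common use of the per-word flag is to render finalised words differently from words that may still change. The following sketch shows one possible way to split them; `splitByFinality` is a hypothetical helper and the `Transcript` shape is taken from the pseudocode above.

```typescript
interface Transcript {
  index: number;
  value: string;
  isFinal: boolean;
}

// Splits the transcript into text that is settled and text that may still change.
function splitByFinality(words: Transcript[]): { stable: string; tentative: string } {
  const ordered = words.slice().sort((a, b) => a.index - b.index);
  const join = (items: Transcript[]): string => items.map((word) => word.value).join(" ");
  return {
    stable: join(ordered.filter((word) => word.isFinal)),
    tentative: join(ordered.filter((word) => !word.isFinal)),
  };
}
```

A display could, for example, show the `stable` part in full color and the `tentative` part dimmed until it is confirmed.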
Last updated by Mathias Lindholm on October 25, 2022 at 21:29 +0300