Client Library API Reference

Library	Installation and usage	API Reference
JavaScript	Installation and usage (GitHub)	API reference (TypeDoc in GitHub)
React	Installation and usage	API reference (TypeDoc in GitHub)
Unity and C#	Installation and usage (GitHub)	API reference (DocFX)
Android (Kotlin)	Installation and usage (GitHub)
iOS (Swift)	Installation and usage (GitHub)

Overview

Preparing the client library and the audio source
Provide a handler for the results
Start speech processing
Stop speech processing
The Segment data structures
Intent
Entity
Transcript

Preparing the client library and the audio source

// Pseudocode. Preparation steps may vary per client library.
// See specific API reference for details.
speechly_client = new SpeechlyClient()
speechly_client.initialize()
// Open the microphone or audio source before attaching
speechly_client.attach( audio_source )

Preparation consists of the following tasks:

Creating the client instance.
Initializing Speechly’s speech recognition engine.
Opening and attaching an audio source (e.g. microphone).

An authorization token must be provided during the preparation steps. Either an app_id or a project_id can be used; both are available from the Dashboard or the Command Line Tool.

Provide a handler for the results

// Pseudocode. See specific API reference for details.
speechly_client.onSegmentChange( fn(segment) )

fn is the function you pass for handling the words, intents and entities detected by the speech recognition engine. They are passed in a segment structure.

As the user speaks, the handler is called repeatedly with updated results.

See The Segment data structures.
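As a rough illustration of such a handler, the TypeScript sketch below formats each incoming segment as one line of text. The interfaces are local stand-ins that mirror the pseudocode structures in this document and are not taken from any client library; `describeSegment` and the sample data are hypothetical.

```typescript
// Hypothetical local types mirroring the structures described in this document.
// They are NOT imported from a Speechly client library.
interface Transcript { index: number; value: string; isFinal: boolean }
interface Intent { name: string; isFinal: boolean }
interface Entity {
  type: string; // field name follows the Entity table
  value: string;
  isFinal: boolean;
  startIndex: number;
  endIndex: number;
}
interface Segment {
  contextId: string;
  id: number;
  isFinal: boolean;
  intent: Intent;
  entities: Entity[];
  words: Transcript[];
}

// Hypothetical handler body: summarises the latest snapshot of a segment.
function describeSegment(segment: Segment): string {
  const text = segment.words.map((w) => w.value).join(" ");
  const entities = segment.entities.map((e) => `${e.type}=${e.value}`).join(", ");
  const label = segment.isFinal ? "final" : "tentative";
  return `[${label}] intent=${segment.intent.name} entities=(${entities}) text="${text}"`;
}

// The handler is called repeatedly, so simulate two updates of the same segment.
const tentativeUpdate: Segment = {
  contextId: "ctx-1", // placeholder; real context ids are UUIDs
  id: 0,
  isFinal: false,
  intent: { name: "book", isFinal: false },
  entities: [],
  words: [
    { index: 0, value: "book", isFinal: true },
    { index: 1, value: "flight", isFinal: false },
  ],
};
const finalUpdate: Segment = {
  ...tentativeUpdate,
  isFinal: true,
  intent: { name: "book", isFinal: true },
  entities: [{ type: "city", value: "paris", isFinal: true, startIndex: 3, endIndex: 3 }],
  words: [
    { index: 0, value: "book", isFinal: true },
    { index: 1, value: "flight", isFinal: true },
    { index: 2, value: "to", isFinal: true },
    { index: 3, value: "paris", isFinal: true },
  ],
};

console.log(describeSegment(tentativeUpdate));
console.log(describeSegment(finalUpdate));
```

With a real client library, a function like this would be the one passed to onSegmentChange; the exact registration call depends on the library.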

Start speech processing

// Pseudocode
context_id = await speechly_client.start( app_id )

Starts streaming audio from the microphone (or other audio source) to the speech recognition engine. The client library gathers result events from the HTTP/gRPC API and fires onSegmentChange and other relevant callbacks.

If you authorized with a project_id during preparation, you can direct the audio to any app configuration within the project by providing the app_id argument.

Stop speech processing

// Pseudocode
await speechly_client.stop()

Stops streaming audio to the speech recognition engine and waits for the remaining results to arrive. Callbacks fire until the audio stream has been fully processed.
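One consequence of this behaviour is that application code should still expect segment callbacks after it has requested a stop. The TypeScript sketch below models that lifecycle as a small state machine. The state names, event names and `nextState` function are assumptions made for illustration and do not correspond to any client library API.

```typescript
// Hypothetical lifecycle model; names are illustrative only.
type ClientState = "idle" | "listening" | "stopping";
type ClientEvent = "start" | "segment" | "stop" | "streamProcessed";

// Returns the next state, or throws if the event is not expected in this state.
function nextState(current: ClientState, ev: ClientEvent): ClientState {
  switch (current) {
    case "idle":
      if (ev === "start") return "listening";
      break;
    case "listening":
      if (ev === "segment") return "listening";
      if (ev === "stop") return "stopping";
      break;
    case "stopping":
      // Results for audio that was already streamed can still arrive here.
      if (ev === "segment") return "stopping";
      if (ev === "streamProcessed") return "idle";
      break;
  }
  throw new Error(`Unexpected "${ev}" while ${current}`);
}

// A typical session: one result arrives while listening, one more after stop.
const sessionEvents: ClientEvent[] = ["start", "segment", "stop", "segment", "streamProcessed"];
let sessionState: ClientState = "idle";
for (const ev of sessionEvents) {
  sessionState = nextState(sessionState, ev);
}
console.log(sessionState);
```

In this model the application only treats the session as finished once the "stopping" state has been left, which corresponds to the awaited stop call completing.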

The Segment data structures

// Pseudocode
struct Segment {
    contextId: string,
    id: int,
    isFinal: boolean,
    intent: Intent,
    entities: list<Entity>,
    words: list<Transcript>
}

Name	Type	Description
`contextId`	`string`	The audio context to which this segment belongs (UUID).
`id`	`int`	The index (zero-based) of this segment within the audio context. An audio context can consist of several consecutive segments.
`isFinal`	`boolean`	Indicates whether this is the last time the callback is called with this segment. Subsequent calls to the callback within the same audio context refer to the next segment. Note that none of the data associated with this segment is carried over to the next segment.
`intent`	`Intent`	The intent associated with this segment. There can only be one intent for a segment.
`entities`	`List<Entity>`	A list of entities. There can be several entities that belong to the same segment.
`words`	`List<Transcript>`	A list of Transcript objects. Together these contain the text produced by speech recognition.
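Because each callback delivers a fresh snapshot of a segment, one way to work with the fields above is to key snapshots by `contextId` and `id` and keep only the latest one. The following TypeScript sketch does that; `SegmentStore` is a hypothetical helper, and the interfaces only declare the fields the sketch reads.

```typescript
// Only the fields this sketch reads; shapes follow the tables in this document.
interface Transcript { index: number; value: string; isFinal: boolean }
interface Segment { contextId: string; id: number; isFinal: boolean; words: Transcript[] }

// Hypothetical helper that keeps the latest snapshot of every segment.
class SegmentStore {
  private latest = new Map<string, Segment>();

  // A segment is identified by its audio context plus its index in that context.
  private static key(s: Segment): string {
    return `${s.contextId}/${s.id}`;
  }

  // Call from the result handler; each call replaces the previous snapshot.
  update(s: Segment): void {
    this.latest.set(SegmentStore.key(s), s);
  }

  // Final segments of one audio context, ordered by segment id.
  finalSegments(contextId: string): Segment[] {
    return Array.from(this.latest.values())
      .filter((s) => s.contextId === contextId && s.isFinal)
      .sort((a, b) => a.id - b.id);
  }
}

const store = new SegmentStore();
const word = (index: number, value: string, isFinal: boolean): Transcript => ({ index, value, isFinal });

// Segment 0 is updated twice, then the same audio context moves on to segment 1.
// "ctx-1" is a placeholder; real context ids are UUIDs.
store.update({ contextId: "ctx-1", id: 0, isFinal: false, words: [word(0, "turn", false)] });
store.update({ contextId: "ctx-1", id: 0, isFinal: true, words: [word(0, "turn", true), word(1, "left", true)] });
store.update({ contextId: "ctx-1", id: 1, isFinal: false, words: [] });

console.log(store.finalSegments("ctx-1").map((s) => s.id));
```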

Intent

Intent { name: string, isFinal: boolean }

Name	Type	Description
`name`	`string`	Name of the intent.
`isFinal`	`boolean`	Indicates whether the intent name is finalised. While isFinal is false, the intent name can still change in subsequent calls to the callback. Once isFinal is true, the intent name is guaranteed not to change until the segment changes.
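A practical use of `Intent.isFinal` is to postpone acting on an intent until its name can no longer change. The TypeScript sketch below runs an action at most once per segment, the first time the intent is reported as final. `createIntentDispatcher` is a hypothetical helper, and the interfaces are local stand-ins limited to the fields it needs.

```typescript
// Local stand-ins with only the fields this sketch reads.
interface Intent { name: string; isFinal: boolean }
interface Segment { contextId: string; id: number; intent: Intent }

// Hypothetical helper: calls `onIntent` at most once per segment, the first time
// that segment's intent is reported as final.
function createIntentDispatcher(onIntent: (intentName: string) => void) {
  const dispatched = new Set<string>();
  return (segment: Segment): void => {
    const key = `${segment.contextId}/${segment.id}`;
    if (!segment.intent.isFinal || dispatched.has(key)) return;
    dispatched.add(key);
    onIntent(segment.intent.name);
  };
}

const actedOn: string[] = [];
const handleSegment = createIntentDispatcher((intentName) => {
  actedOn.push(intentName);
});

// The tentative intent changes from "search" to "book" before it is finalised.
handleSegment({ contextId: "ctx-1", id: 0, intent: { name: "search", isFinal: false } });
handleSegment({ contextId: "ctx-1", id: 0, intent: { name: "book", isFinal: true } });
// Later updates of the same segment repeat the final intent and are ignored.
handleSegment({ contextId: "ctx-1", id: 0, intent: { name: "book", isFinal: true } });

console.log(actedOn);
```

Whether to wait for the final intent or to react to tentative ones (for example, to update a UI early) is an application design choice; this sketch shows only the first option.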

Entity

Entity { type: string, value: string, isFinal: boolean,
         startIndex: int, endIndex: int }

Name	Type	Description
`type`	`string`	The name of the entity.
`value`	`string`	The value of the entity.
`isFinal`	`boolean`	Boolean that indicates if the entity is finalised. Behaves in the same way as Intent.isFinal.
`startIndex`	`int`	Index of the Transcript that contains the first token of the transcript span this entity was extracted from.
`endIndex`	`int`	Index of the Transcript that contains the last token of the transcript span this entity was extracted from.
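The `startIndex` and `endIndex` fields make it possible to recover the words an entity was extracted from. The TypeScript sketch below shows one way to do this. It assumes that `endIndex` is inclusive, which this document does not state explicitly; `entityWords` and the sample data are hypothetical.

```typescript
// Local stand-ins mirroring the Entity and Transcript tables.
interface Transcript { index: number; value: string; isFinal: boolean }
interface Entity {
  type: string;
  value: string;
  isFinal: boolean;
  startIndex: number;
  endIndex: number;
}

// Returns the words spanned by an entity, in transcript order.
// ASSUMPTION: endIndex is inclusive. If the client library you use treats it
// as exclusive, change `<=` to `<`.
function entityWords(entity: Entity, words: Transcript[]): string[] {
  return words
    .filter((w) => w.index >= entity.startIndex && w.index <= entity.endIndex)
    .sort((a, b) => a.index - b.index)
    .map((w) => w.value);
}

const sampleWords: Transcript[] = ["book", "a", "flight", "to", "new", "york"].map(
  (value, index) => ({ index, value, isFinal: true }),
);
const city: Entity = { type: "city", value: "new york", isFinal: true, startIndex: 4, endIndex: 5 };

console.log(entityWords(city, sampleWords)); // [ 'new', 'york' ]
```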

Transcript

Transcript { index: int, value: string, isFinal: boolean }

Name	Type	Description
`index`	`int`	Position of this Transcript in the complete transcript.
`value`	`string`	The word of this Transcript.
`isFinal`	`boolean`	Indicates whether the word associated with this Transcript is final or can still change in subsequent calls to the callback.
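To show how `index` and `isFinal` can be used together, the TypeScript sketch below rebuilds display text from a list of Transcript objects. Sorting by `index` is a defensive assumption rather than a documented requirement, and wrapping non-final words in parentheses is a display convention invented for this example; `renderTranscript` is hypothetical.

```typescript
// Local stand-in mirroring the Transcript table.
interface Transcript { index: number; value: string; isFinal: boolean }

// Joins words in transcript order. Words that may still change are wrapped in
// parentheses so a UI could style them differently.
function renderTranscript(words: Transcript[]): string {
  return words
    .slice() // copy so the caller's array is not reordered
    .sort((a, b) => a.index - b.index)
    .map((w) => (w.isFinal ? w.value : `(${w.value})`))
    .join(" ");
}

const partialWords: Transcript[] = [
  { index: 2, value: "lights", isFinal: false },
  { index: 0, value: "turn", isFinal: true },
  { index: 1, value: "off", isFinal: true },
];

console.log(renderTranscript(partialWords)); // turn off (lights)
```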