Real-time Natural Language Understanding (NLU) features

Real-time intent detection: returns the meaning of the speech segment.
Real-time entity detection and classification: returns the keywords and their types in the speech segment.
Speech-to-text adaptation: tunes the ASR engine to prefer the vocabulary used in the NLU configuration. This improves accuracy for your use case, especially if it involves uncommon brand names or specialist jargon.
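
To make the first two features concrete, here is a minimal sketch of what an app might receive for one speech segment. The `Segment` and `Entity` classes and the `describe` helper are hypothetical names chosen for illustration; they are not the Speechly client API, and the actual response shape may differ.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Entity:
    type: str   # entity type, e.g. "destination"
    value: str  # keyword(s) from the utterance, e.g. "new york"


@dataclass
class Segment:
    transcript: str
    intent: str = ""  # left empty here to model a result without NLU
    entities: List[Entity] = field(default_factory=list)


def describe(segment: Segment) -> str:
    """Summarise a segment as plain text for logging or debugging."""
    if not segment.intent:
        return f"transcript only: {segment.transcript}"
    pairs = ", ".join(f"{e.type}={e.value}" for e in segment.entities)
    return f"intent={segment.intent}; entities: {pairs}"


example = Segment(
    transcript="book a flight to new york",
    intent="book",
    entities=[Entity(type="destination", value="new york")],
)
print(describe(example))
```

The intent captures what the user wants to do, while each entity pairs a keyword from the utterance with its type.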

Note: NLU features are not available via the Batch API or the On-device APIs.

Enabling NLU

The NLU configuration is empty by default and NLU features are disabled. Without NLU, Speechly operates in speech-to-text mode and returns no intents or entities.

To enable the NLU features, you’ll need to provide an NLU configuration for your app ID in the Dashboard or with the CLI tool.

The configuration contains text phrases users might say. Each phrase is tagged with an intent. Keywords can be tagged, too, so they get returned to your app as entities. There’s also special syntax for generating phrases automatically.
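
As a rough illustration of how a tagged phrase carries an intent and entities, the sketch below parses a single annotated line. It assumes a simplified annotation shape, `*intent words [value](entity_type)`, and ignores the phrase-generating syntax entirely. The `parse_phrase` helper is hypothetical and is not part of any Speechly tool; consult the SAL syntax reference for the actual rules.

```python
import re
from typing import Dict, List, Tuple

# Assumed simplified shape of a tagged keyword: [value](entity_type)
ENTITY = re.compile(r"\[([^\]]+)\]\(([^)]+)\)")


def parse_phrase(line: str) -> Tuple[str, str, List[Dict[str, str]]]:
    """Split one annotated phrase into intent, plain text, and entities."""
    head, _, body = line.strip().partition(" ")
    if not head.startswith("*") or len(head) == 1:
        raise ValueError("phrase must start with *intent")
    intent = head[1:]
    entities = [
        {"type": m.group(2), "value": m.group(1)}
        for m in ENTITY.finditer(body)
    ]
    # Drop the annotation markup to recover what the user would say.
    plain = ENTITY.sub(lambda m: m.group(1), body)
    return intent, plain, entities


intent, text, entities = parse_phrase(
    "*book book a flight to [new york](destination)"
)
print(intent, "|", text, "|", entities)
```

Here the phrase is tagged with the intent `book`, and the keyword "new york" is tagged as a `destination` entity.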

Getting started

Configuration basics gives a brief introduction to basic configuration concepts.
Speechly Annotation Language Video Tutorial Series walks you through Speechly configuration using a flight booking example.
Speechly Annotation Language Syntax explains the details of SAL syntax.
Speechly Annotation Language Semantics explains the details of SAL semantics.
Example configurations are useful learning material.
Standard Variables are useful when your configuration must support numbers, dates, times, etc.
Entity Data Types are useful when combined with the Standard Variables to obtain entity values in a normalized format.
Imports and Lookups allow you to import external data into your configuration, and have the API return normalized entity values by using simple lookup tables.

Why must I configure my application?

In general, it is necessary to design the utterances for each application separately. With Speechly, the configuration serves two equally important purposes:

Teaching our speech recognition system the vocabulary that is relevant in your application. An application may require the use of uncommon words (e.g. obscure brand names or specialist jargon) that must explicitly be taught to our speech recognition model.
Defining the information (intents and entities) that should be extracted from users' utterances. It is difficult to provide ready-made configurations that would sufficiently suit a variety of use cases. The set of intents and entities is tightly coupled with the workings of each specific application.
