Speech Recognition Component (Yandex Cloud SpeechKit)

Description

Recognizes the caller's speech and converts it to text.
The service is provided by Yandex Cloud SpeechKit over HTTP. It has no technical limitations in terms of performance.

For the Yandex Cloud account registration procedure and the cost of the service, see Yandex.

The tokens are configured in the domain settings, in the 'yandex_cloud' field.

Recording can be interrupted on silence after a spoken phrase.

Table 1. System Characteristics

Index: 223
Short title: asr_yandex
Types of scenarios: IVR
Starter module: r_sip_ivr_script_component_asr_yandex
Mode: Asynchronous
Icon: 223
Branching pattern: Branching, interrupting

Properties

Table 2. Properties
Specification Description

Title: Yandex Account
Code: accountKey
Visibility: no
Default: default

Specifies the Yandex account whose settings are used to connect to Yandex.
The list includes the value 'default', which selects the root fields 'speech' and 'storage' of the 'settings.yandex_cloud' object.
The list also includes the keys of the 'settings.yandex_cloud.accounts' object; each key maps to an object with separately configured access parameters.
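
The account lookup described here can be sketched in Python. This is a minimal illustration, not the platform's implementation: the helper name resolve_account and the error raised for an unknown key are assumptions. Only the 'default' rule and the use of the 'settings.yandex_cloud.accounts' keys come from the description.

```python
def resolve_account(settings: dict, account_key: str = "default") -> dict:
    """Return the connection parameters selected by accountKey (hypothetical helper)."""
    cloud = settings["yandex_cloud"]
    if account_key == "default":
        # 'default' selects the root 'speech' and 'storage' fields.
        return {"speech": cloud.get("speech"), "storage": cloud.get("storage")}
    accounts = cloud.get("accounts", {})
    if account_key not in accounts:
        # Assumption: an unknown key is treated as a configuration error.
        raise KeyError(f"unknown Yandex account: {account_key}")
    # Any other value is a key of 'accounts' with its own access parameters.
    return accounts[account_key]
```

The contents of the returned objects are not specified by this document, so the sketch passes them through unchanged.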

Title: Theme
Code: topic
Visibility: no
Default: general

Yandex Cloud SpeechKit recognition service parameter: recognition topic.
Possible options:

  • general (0)

  • maps (1)

  • dates (2)

  • names (3)

  • numbers (4)

  • Other (custom, 100) - Specifies an arbitrary topic via argument.

Title: User Topic
Code: topicCustom
Visibility: yes
Default: — 

Argument containing a custom topic for the Yandex Cloud SpeechKit recognition service.
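
How the 'topic' and 'topicCustom' properties combine can be sketched as follows. The numeric values are taken from the option list of the 'topic' property. The helper name resolve_topic and the error raised for an empty custom value are assumptions made for illustration.

```python
# Numeric values as listed in the options of the 'topic' property.
TOPICS = {0: "general", 1: "maps", 2: "dates", 3: "names", 4: "numbers"}
CUSTOM_TOPIC = 100  # 'Other': the topic is taken from the topicCustom argument


def resolve_topic(topic: int = 0, topic_custom: str = "") -> str:
    """Return the topic name to send to the recognition service (hypothetical helper)."""
    if topic == CUSTOM_TOPIC:
        if not topic_custom:
            # Assumption: an empty custom topic is rejected rather than defaulted.
            raise ValueError("topicCustom must be set when topic is 'Other'")
        return topic_custom
    return TOPICS[topic]
```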

Title: Language
Code: lang
Visibility: no
Default: Russian

Yandex Cloud SpeechKit recognition service parameter: recognition language.
Possible options:

  • Auto (100) - Automatic language detection

  • ru-RU (0) – Russian

  • en-US (1) – English

  • de-DE (2) – German

  • es-ES (3) – Spanish

  • fi-FI (4) – Finnish

  • fr-FR (5) – French

  • he-HE (6) – Hebrew

  • it-IT (7) – Italian

  • kk-KZ (8) – Kazakh

  • nl-NL (9) – Dutch

  • pl-PL (10) – Polish

  • pt-PT (11) – Portuguese

  • pt-BR (12) – Brazilian Portuguese

  • sv-SE (13) – Swedish

  • tr-TR (14) – Turkish

  • uz-UZ (15) – Uzbek (Latin alphabet)

Title: Profanity filter
Code: profanityFilter
Visibility: no
Default: Disable

Yandex Cloud SpeechKit recognition service parameter: profanity filter switch.

Title: Recording timeout, s
Code: recordTimeoutSec
Visibility: no
Default: 30

Maximum allowed recording time, counted from the end of the pre-play, in seconds.

Title: Break by DTMF
Code: checkDTMF
Visibility: no
Default: none

DTMF detector switch. When enabled, opens the settings for storing received characters and for interrupting the operation.

Title: Buffer for DTMF
Code: dtmfBuffer
Visibility: yes
Default: — 

Variable to store received DTMF characters.

Title: Clear DTMF buffer
Code: clearDtmfBuffer
Visibility: yes
Default: Yes

Switch for clearing the DTMF buffer beforehand.

Title: Number of characters
Code: maxSymbolCount
Visibility: yes
Default: — 

An argument containing a limit on the number of characters that can be entered.
When the specified number of DTMF characters is received during the component execution, the recording is automatically terminated and the last portion of voice data is sent to the recognition service.

Title: Interrupt Symbols
Code: interruptSymbols
Visibility: yes
Default: — 

A string containing sequences of interrupt characters separated by commas.
When a character sequence matching one of the specified interrupt sequences is detected at the end of the DTMF buffer, the recording is automatically terminated and the last portion of data is sent to the recognition service.
For example, *, 7, 123, 9395.
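
The two DTMF stop conditions (character limit and interrupt sequences) can be sketched as follows. This is an illustration under stated assumptions, not the component's code: the function names are hypothetical, and ignoring whitespace around the commas is an assumption.

```python
from typing import List, Optional


def parse_interrupt_symbols(value: str) -> List[str]:
    """Split the comma-separated interruptSymbols string into sequences."""
    # Assumption: whitespace around each sequence is ignored.
    return [item.strip() for item in value.split(",") if item.strip()]


def dtmf_should_stop(
    buffer: str,
    received_count: int,
    interrupt_symbols: str = "",
    max_symbol_count: Optional[int] = None,
) -> bool:
    """Return True when recording should end because of DTMF input."""
    # maxSymbolCount: the limit applies to characters received during execution.
    if max_symbol_count is not None and received_count >= max_symbol_count:
        return True
    # interruptSymbols: a sequence must match the END of the DTMF buffer.
    sequences = parse_interrupt_symbols(interrupt_symbols)
    return any(buffer.endswith(seq) for seq in sequences)
```

With the example string "*, 7, 123, 9395", a buffer of "45123" stops the recording because it ends with "123", while "1234" does not match any sequence.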

Title: Interrupt when silence is detected
Code: abortOnSilence
Visibility: no
Default: Yes

Voice activity detector (VAD) switch. When enabled, recording ends automatically and the last portion of voice data is sent to the recognition service.
The stop criterion is the presence of voice for at least 300 ms followed by its absence for the specified interval.

Title: Silence interval, s
Code: silenceTimeoutSec
Visibility: yes
Default: 2

Silence interval after which the voice activity detector (VAD) automatically stops recording and sends the last portion of voice data to the recognition service.
Applies when 'Interrupt when silence is detected' is enabled.

Title: VAD threshold, dB
Code: vadThreshold
Visibility: yes
Default: 30

Argument containing the VAD threshold, in decibels. Voice is detected when the signal level crosses the threshold upwards.
Any noise with a level below the threshold is treated as silence.
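
The interaction of 'abortOnSilence', 'silenceTimeoutSec' and 'vadThreshold' can be sketched over a sequence of per-frame signal levels. This is a simplified model, not the actual detector. The 20 ms frame length, the function name, and the requirement that the 300 ms of voice be one continuous run are all assumptions.

```python
from typing import Optional, Sequence


def silence_stop_index(
    levels_db: Sequence[float],
    frame_ms: int = 20,              # assumption: fixed-length analysis frames
    threshold_db: float = 30.0,      # vadThreshold
    min_voice_ms: int = 300,         # voice must be present at least this long
    silence_timeout_s: float = 2.0,  # silenceTimeoutSec
) -> Optional[int]:
    """Return the index of the frame at which recording stops, or None."""
    voice_ms = 0
    silence_ms = 0
    voice_seen = False
    for i, level in enumerate(levels_db):
        if level >= threshold_db:
            # A level below the threshold is silence; anything else counts as voice.
            silence_ms = 0
            if not voice_seen:
                voice_ms += frame_ms
                voice_seen = voice_ms >= min_voice_ms
        elif voice_seen:
            silence_ms += frame_ms
            if silence_ms >= silence_timeout_s * 1000:
                return i
        else:
            # Assumption: the voice run must be continuous, so silence resets it.
            voice_ms = 0
    return None
```

In this model, silence before any qualifying voice never stops the recording, and only 'recordTimeoutSec' limits its length.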

Title: Response timeout, s
Code: responseTimeoutSec
Visibility: no
Default: 5

Timeout for waiting for a response from the Yandex Cloud SpeechKit recognition service after sending it the last portion of voice data.
When the timeout expires, control is passed to the next component on the 'Transition, Time' branch.

Title: Result to variable
Code: varText
Visibility: no
Default: — 

Variable to store the recognized text.

Title: Response code to variable
Code: varHttpCode
Visibility: no
Default: — 

Variable to store the HTTP response code of the recognition service.

Title: Response body to variable
Code: varHttpBody
Visibility: no
Default: — 

Variable to store the full content of the HTTP response of the recognition service.

Title: Save Record File
Code: saveRec
Visibility: no
Default: None

Switch for saving the recording file sent to the recognition service.

Title: File path to a variable
Code: varRecordPath
Visibility: yes
Default: — 

Variable to store the path to the recording file.
The file is placed in the script's temporary directory and is deleted when the script completes.
To keep the file long-term, the script must additionally move it to a permanent storage location.

The recording is made on the server with the mg role that serves the current call and is then transferred to the server with the ivr role that serves the current scenario. The transfer always takes place within the site.

Title: Pre-play
Code: prePlayFile
Visibility: no
Default: — 

Audio file pre-played to the subscriber. The voice detector is also active during playback.
If there is no voice from the subscriber (taking into account the VAD noise threshold), no data is sent to the recognition service.

The file can be specified in one of the following modes:

  • a static file attached to the script (loaded from the Script Editor application or via the API);

  • a path built from an argument, which must include one of the file categories.

Title: Transition
Code: transfer
Visibility: no
Default: — 

Component to which control is passed in case of successful completion of the operation.

Title: Transition, Time
Code: transferTimeout
Visibility: no
Default: — 

Component to which control is passed when the timeout for the HTTP response from the recognition service expires.

Title: Transition, Error
Code: transferError
Visibility: no
Default: — 

The component to which control is passed if an error occurs.
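
The choice between the three transition properties can be sketched as follows. The function name and the precedence of an error over a timeout are assumptions for illustration. The document defines only which branch corresponds to which outcome.

```python
from typing import Optional


def select_transition(
    props: dict, timed_out: bool = False, error: Optional[str] = None
) -> Optional[str]:
    """Return the next component for the given outcome (hypothetical helper)."""
    if error is not None:
        # Assumption: an error takes precedence over a timeout.
        return props.get("transferError")
    if timed_out:
        # The response timeout (responseTimeoutSec) expired.
        return props.get("transferTimeout")
    # Successful completion of the operation.
    return props.get("transfer")
```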

See also