This AGI script uses Google's Cloud Speech API to convert speech to text and return the result to the dialplan as an Asterisk channel variable. See README for a complete list of supported languages.
Requirements:
Perl: The Perl Programming Language
perl-libwww: The World-Wide Web library for Perl
perl-libjson: Module for manipulating JSON-formatted data
IO-Socket-SSL: Perl module that implements an interface to SSL sockets
flac: Free Lossless Audio Codec
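Whether the items listed above are present can be checked from a shell before installing. The following is a minimal sketch, not part of the script itself: the helper names missing_cmds and missing_perl_modules are hypothetical, and the module names suggested in the note below (LWP::UserAgent, JSON, IO::Socket::SSL) are an assumption about what the listed packages provide.

```shell
#!/bin/sh
# Hypothetical helpers for checking prerequisites; not shipped with speech-recog.agi.

# Print every command name given as an argument that is not found in PATH.
missing_cmds() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
    done
}

# Print every Perl module given as an argument that fails to load.
missing_perl_modules() {
    for mod in "$@"; do
        perl -M"$mod" -e 1 >/dev/null 2>&1 || echo "$mod"
    done
}
```

Running "missing_cmds perl flac" followed by "missing_perl_modules LWP::UserAgent JSON IO::Socket::SSL" should print nothing when everything is installed; any name printed is something that still needs to be installed.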
To install, copy speech-recog.agi to your agi-bin directory.
Usually this is /var/lib/asterisk/agi-bin/
To confirm the location, check your /etc/asterisk/asterisk.conf file.
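The installation step can be sketched as a pair of shell functions. This is an illustration under stated assumptions: agi_dir and install_agi are hypothetical names, the AGI directory is assumed to appear in asterisk.conf as a line of the form "astagidir => /path", and paths containing spaces are not handled.

```shell
#!/bin/sh
# Hypothetical helpers; not shipped with speech-recog.agi.

# Print the AGI directory named in an asterisk.conf-style file.
# Assumes a line such as: astagidir => /var/lib/asterisk/agi-bin
# Falls back to the usual default when no such line is found.
agi_dir() {
    conf="${1:-/etc/asterisk/asterisk.conf}"
    dir=""
    if [ -r "$conf" ]; then
        # Lines starting with ';' are comments and do not match.
        dir=$(sed -n 's/^[[:space:]]*astagidir[[:space:]]*=>[[:space:]]*\([^;[:space:]]*\).*$/\1/p' "$conf" | head -n 1)
    fi
    echo "${dir:-/var/lib/asterisk/agi-bin}"
}

# Copy a script into the AGI directory and make it executable.
# Usage: install_agi <script> [path-to-asterisk.conf]
install_agi() {
    src="$1"
    dest=$(agi_dir "$2")
    cp "$src" "$dest/" && chmod 755 "$dest/$(basename "$src")"
}
```

With these defined, "install_agi speech-recog.agi" (run with sufficient privileges) copies the script to whatever directory agi_dir reports.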
agi(speech-recog.agi,[lang],[timeout],[intkey],[NOBEEP]): Records from the current
channel until 3 seconds of silence are detected (setting the 'timeout' argument to -1 disables
the silence detection) or the interrupt key (# by default) is pressed. If NOBEEP is set, no
beep is played to the user to indicate the start of the recording. The recorded
sound is sent to the Google speech recognition service and the returned text string is
assigned as the value of the channel variable 'utterance'.
The script sets the following channel variables:
utterance: The transcribed text string.
confidence: A value between 0 and 1 indicating the probability of a correct
recognition. Values above 0.90 usually mean that the resulting text is correct.
Asterisk dialplan examples:
In these examples, the googletts.agi script
is used for speech synthesis:
;;Simple speech recognition
exten => 1234,1,Answer()
exten => 1234,n,agi(speech-recog.agi,en-US)
exten => 1234,n,Verbose(1,The text you just said is: ${utterance})
exten => 1234,n,Verbose(1,The probability to be right is: ${confidence})
exten => 1234,n,Hangup()

;;Speech recognition demo:
exten => 1235,1,Answer()
exten => 1235,n,agi(googletts.agi,"Say something in English, when done press the pound key.",en)
exten => 1235,n(record),agi(speech-recog.agi,en-US)
exten => 1235,n,Verbose(1,Script returned: ${confidence} , ${utterance})
;Check the probability of a successful recognition:
exten => 1235,n(success),GotoIf($["${confidence}" > "0.8"]?playback:retry)
;Playback the text:
exten => 1235,n(playback),agi(googletts.agi,"The text you just said was...",en)
exten => 1235,n,agi(googletts.agi,"${utterance}",en)
exten => 1235,n,goto(end)
;Retry in case speech recognition wasn't successful:
exten => 1235,n(retry),agi(googletts.agi,"Can you please repeat more clearly?",en)
exten => 1235,n,goto(record)
exten => 1235,n(fail),agi(googletts.agi,"Failed to get speech data.",en)
exten => 1235,n(end),Hangup()

;;Voice dialing example
exten => 1236,1,Answer()
exten => 1236,n,agi(googletts.agi,"Please say the number you want to dial.",en)
exten => 1236,n(record),agi(speech-recog.agi,en-US)
exten => 1236,n,GotoIf($["${confidence}" > "0.8"]?success:retry)
exten => 1236,n(success),goto(${utterance},1)
exten => 1236,n(retry),agi(googletts.agi,"Can you please repeat?",en)
exten => 1236,n,goto(record)
The speech-recog script for Asterisk is distributed under the GNU General Public License v2.
Author: Lefteris Zafiris (zaf@fastmail.com)
Development snapshots are available in either zip or tar formats.
You can also clone the project with Git by running:
$ git clone git://github.com/zaf/asterisk-speech-recog
Related projects:
GoogleTTS text to speech script for Asterisk
Text translation using Google Translate API for Asterisk
Speech synthesis using Microsoft Translator API for Asterisk
Asterisk Flite text to speech module
Asterisk e-Speak text to speech module