iSpeech plugins for Asterisk

Automatic Speech Recognition and Text To Speech plugins for Asterisk that use the iSpeech API

These AGI scripts use the iSpeech API to provide ASR and TTS capabilities for the Asterisk PBX. See the README for a complete list of features and supported languages.

Dependencies

Perl: The Perl Programming Language
perl-libwww: The World-Wide Web library for Perl
IO-Socket-SSL: Perl module that implements an interface to SSL sockets
format_mp3: Asterisk module that supports mp3 playback
Internet access in order to contact iSpeech servers and get the speech and text data.
speex: Patent-free audio compression format designed for speech (Optional).

Install

To install, copy ispeech-asr.agi and ispeech-tts.agi to your agi-bin directory.
Usually this is /var/lib/asterisk/agi-bin/.
To make sure, check the astagidir setting in your /etc/asterisk/asterisk.conf file.

Usage

TTS:

agi(ispeech-tts.agi,"text",[voice],[intkey],[speed]): This will invoke the iSpeech TTS engine, render the text string to speech and play it back to the user. If 'intkey' is set, the script will wait for user input. Pressing any of the given interrupt keys causes the playback to terminate immediately and the dialplan to proceed to the matching extension (this is mainly for use in IVR, see README for examples). If 'speed' is set, the speech rate is altered by that factor.
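
As a rough illustration of the interrupt key behaviour described above, the sketch below plays a prompt that only keys 1 and 2 can interrupt, then continues at the matching extension. The context name tts_demo, the prompts, and the form of the key list (the digits written together as 12) are assumptions made for this example; check the README for the values the script actually accepts.

```
;TTS menu with specific interrupt keys (illustrative sketch)
[tts_demo]
exten => s,1,Answer()
;Empty voice argument keeps the default voice; keys 1 and 2 are assumed to interrupt playback
exten => s,n,agi(ispeech-tts.agi,"Press 1 for sales or 2 for support.",,12)
exten => s,n,WaitExten()

;Reached when the caller presses 1 during or after the prompt
exten => 1,1,agi(ispeech-tts.agi,"You chose sales.")
exten => 1,n,Hangup()

;Reached when the caller presses 2 during or after the prompt
exten => 2,1,agi(ispeech-tts.agi,"You chose support.")
exten => 2,n,Hangup()
```

Keeping the menu in its own context means a key press can only match the extensions defined for that menu.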

ASR:

agi(ispeech-asr.agi,[lang],[freeform],[model],[timeout],[intkey],[NOBEEP]): Records from the current channel until 3 seconds of silence are detected (this can be changed with the 'timeout' argument, -1 for no timeout) or the interrupt key (# by default) is pressed. If NOBEEP is set, no beep sound is played back to the user to indicate the start of the recording. For 'freeform' and 'model' please refer to the iSpeech API manual. 'freeform' defaults to 3 (Normal speech). The recorded sound is sent to the iSpeech ASR service and the returned text string is assigned as the value of the channel variable 'utterance'.
The script sets the following channel variables:
status: Return status. 0 means success, non-zero values indicate different errors.
utterance: The generated text string.
confidence: A value between 0 and 1 indicating the probability of a correct recognition. Values greater than 0.90 usually mean that the resulting text is correct.
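
The optional arguments and the channel variables can be combined as in the following minimal sketch. It follows the argument order given above: 'freeform' and 'model' are left empty, the silence timeout is set to 5 seconds, * replaces # as the interrupt key, and the beep is suppressed. The extension number 8, the labels, and the chosen values are assumptions made for this example.

```
;ASR with custom silence timeout, interrupt key and no beep (illustrative sketch)
exten => 8,1,Answer()
;Arguments: lang=en-US, freeform and model empty, 5 seconds of silence, * as interrupt key, no beep
exten => 8,n,agi(ispeech-asr.agi,en-US,,,5,*,NOBEEP)
;A non-zero status means the recognition failed
exten => 8,n,GotoIf($["${status}" = "0"]?check:fail)
;Accept the result only when the confidence is above 0.90
exten => 8,n(check),GotoIf($["${confidence}" > "0.90"]?accept:fail)
exten => 8,n(accept),Noop(Recognized: ${utterance})
exten => 8,n,Hangup()
exten => 8,n(fail),Noop(Recognition failed or confidence too low)
exten => 8,n,Hangup()
```

A threshold of 0.90 rejects more results than a lower value would, so it suits cases where acting on a wrong recognition is costly.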

Asterisk dialplan examples:

;iSpeech TTS test
exten => 3,1,Answer()
exten => 3,n,agi(ispeech-tts.agi,"This is a test of the ispeech text to speech engine in asterisk.")
exten => 3,n,agi(ispeech-tts.agi,"Esta es una simple prueba en español.",usspanishfemale)
exten => 3,n,agi(ispeech-tts.agi,"这是一个简单的测试,在中国。有一个愉快的一天。",chchinesefemale)
exten => 3,n,Hangup()

;Speech recognition test
exten => 4,1,Answer()
exten => 4,n,agi(ispeech-tts.agi,"Please say something in English. When done press the pound key.")
exten => 4,n(record),agi(ispeech-asr.agi,en-US)
exten => 4,n,Noop(== Script returned: ${status} , ${confidence} , ${utterance} ==)
exten => 4,n,GotoIf($["${status}" = "0"]?success:fail)

exten => 4,n(success),GotoIf($["${confidence}" > "0.3"]?playback:retry)

exten => 4,n(retry),agi(ispeech-tts.agi,"Can you please repeat more clearly?")
exten => 4,n,goto(record)

exten => 4,n(playback),agi(ispeech-tts.agi,"The text you just said was...")
exten => 4,n,agi(ispeech-tts.agi,"${utterance}")
exten => 4,n,goto(end)

exten => 4,n(fail),agi(ispeech-tts.agi,"Failed to get speech data.")
exten => 4,n(end),Hangup()

;Voice dialing example
exten => 5,1,Answer()
exten => 5,n,agi(ispeech-tts.agi,"Please say the number you wish to dial.")
exten => 5,n(record),agi(ispeech-asr.agi,en-US,,phonenumber)
exten => 5,n,GotoIf($[$["${status}" = "0"] & $["${confidence}" > "0.3"]]?success:retry)

exten => 5,n(success),agi(ispeech-tts.agi,"Dialing ${utterance}")
exten => 5,n,goto(${utterance},1)

exten => 5,n(retry),agi(ispeech-tts.agi,"Can you please repeat?")
exten => 5,n,goto(record)

;IVR test
exten => 6,1,goto(my_ivr,s,1)

[my_ivr]
exten => s,1,Answer()
exten => s,n,Set(TIMEOUT(digit)=5)
exten => s,n,Set(TIMEOUT(response)=8)
exten => s,n,agi(ispeech-tts.agi,"Welcome to my small interactive voice response menu.")
exten => s,n(start),agi(ispeech-tts.agi,"Please dial a digit.",,any)
exten => s,n,Waitexten()

exten => _X,1,agi(ispeech-tts.agi,"You just pressed ${EXTEN}. Try another one please.",,any)
exten => _X,n,Waitexten()

exten => i,1,agi(ispeech-tts.agi,"Invalid extension.")
exten => i,n,goto(s,start)

exten => t,1,agi(ispeech-tts.agi,"Request timed out.")
exten => t,n,goto(h,1)


License

The iSpeech plugins for Asterisk are distributed under the GNU General Public License v2.

Authors

Lefteris Zafiris (zaf@fastmail.com)

Download

Development snapshots are available in either zip or tar formats.

You can also clone the project with Git by running:

$ git clone git://github.com/zaf/asterisk-ispeech

Links

GoogleTTS text to speech script for asterisk
Speech Recognition for asterisk using Google
Text translation using Google Translate API for Asterisk
Speech synthesis using Microsoft Translator API for Asterisk
Asterisk Flite text to speech module
Asterisk e-Speak text to speech module