Text to Speech Application

Text to speech (TTS) converts text into the corresponding audio (wave) output. To simplify call automation, Xtend IVR provides a sample script that demonstrates text to speech. The automated IVR can speak text using the TTS engine. The script uses various SAPI XML tags in the SPEAK command to vary volume, pitch, rate and emphasis, and to control how numbers, dates and currency are spoken.

Download the evaluation version of Xtend IVR and install the telephony application on your system. Run the sample script from the Script Editor. The code is listed below.

The following XML tags are used in this script.

<SPELL> Spells out the text
<SILENCE> Introduces an interval of silence
<PARTOFSP> Selects the correct pronunciation of a word that has multiple pronunciations, depending on its part of speech
<VOLUME> Adjusts the output volume level
<VOICE> Selects a voice based on its attributes: Age, Gender, Language, Name, Vendor and VendorPreferred
<LANG> Selects a voice based solely on its language attribute
<EMPH> Emphasizes a section of text
<CONTEXT> Enables the voice to distinguish and normalize special formats like dates, numbers and currency
<PITCH> Controls the pitch of a voice
<RATE> Controls the rate/speed of the voice
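
When the text to be spoken is generated at run time, it can help to build the tagged string with a small helper instead of concatenating angle brackets by hand. The following is a minimal Python sketch, not part of Xtend IVR: the function name `sapi_tag` is hypothetical, and it assumes the TTS engine decodes the standard XML entities produced by escaping. How the resulting string reaches the SPEAK command is outside the scope of this sketch.

```python
from xml.sax.saxutils import escape, quoteattr


def sapi_tag(name: str, text: str = "", **attrs: str) -> str:
    """Build one SAPI XML element as a string (hypothetical helper).

    With no text, a self-closing tag such as <silence msec="1000"/> is produced.
    """
    # quoteattr adds the surrounding quotes and escapes the attribute value.
    attr_str = "".join(f" {key}={quoteattr(value)}" for key, value in attrs.items())
    if not text:
        return f"<{name}{attr_str}/>"
    # escape replaces &, < and > in the spoken text so the markup stays well formed.
    return f"<{name}{attr_str}>{escape(text)}</{name}>"


date = sapi_tag("context", "03/04/2001", id="date_mdy")
pause = sapi_tag("silence", msec="1000")
line = f"Your appointment is on {date}.{pause}Good bye."
print(line)
```

The tag and attribute names used here (`context`, `id="date_mdy"`, `silence`, `msec`) are the ones listed above and used in the sample script.
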
Download the source file (zip) for the Text to Speech Application.

MAIN:
	
	answer 1
	
	speak "Welcome. A variety of speak commands are given below."
	
	speak 'The following words are spelled out. <spell>These words should be
								spelled out</spell>'
	
	speak 'One Thousand milliseconds of silence <silence msec="1000"/> just occurred.'
	
	
	speak 'The following text differentiates the word "record" depending on its
		part of speech. Did you <partofsp part="verb"> record </partofsp> that
		<partofsp part="noun"> record </partofsp>?'
	
	speak '<volume level="50">This text should be spoken at volume level fifty.
		<volume level="80">This text should be spoken at volume level eighty.
		</volume></volume><volume level="100"/>All text which follows should be
		spoken at volume level one hundred.'
	
	speak '<voice required="Language=409;gender=female">A U.S. English female voice
		should speak this.</voice><lang langid="809">A British English voice
		should speak this.</lang>'
	
	speak '<SAPI>This text is spoken without emphasis. This text is spoken <EMPH>
		with emphasis.</EMPH></SAPI>'
	
	speak 'Date is spoken now as month, day, year. <context id="date_mdy">
		03/04/2001 </context>'
	
	 
	speak 'Date is spoken now as day, month, year. <context id="date_dmy">
		03/04/2001 </context>'
	
	 
	speak 'A Cardinal number is spoken next. <context ID = "number_cardinal">3432
		</context>'
	
	
	speak 'The following Number is spoken as digits. <context ID = "number_digit">
		3432</context> '
	
	
	speak 'A Fractional number is spoken now. <context ID = "number_fraction">3/15
		</context> '
	
	
	speak 'Following is a Decimal Number. <context ID = "number_decimal">423.1243
		</context> '
	
	
	speak 'A pronunciation for Currency follows. <context ID = "currency">$34.90
		</context> '
	
	
	speak '<pitch absmiddle="5">This text should be spoken at pitch five.<pitch 
		absmiddle="-5">This text should be spoken at pitch negative five.</pitch>
		</pitch><pitch absmiddle="10"/>All following text is spoken at pitch ten.'
	
	speak '<rate absspeed="5">This text should be spoken at rate five.<rate 
		absspeed="-5">This text should be spoken at rate negative five.</rate>
		</rate><rate absspeed="2"/>All following text is spoken at rate two.'
	
	speak "Good bye."
		
	hangup
	goto MAIN


ONHANGUP:
	hangup
	goto MAIN


ONSYSTEMERROR:
    log $error
    display $error
    hangup
    goto MAIN
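
Because the SPEAK strings above nest tags such as <volume>, <pitch> and <rate>, an unbalanced tag is easy to introduce while editing. The following Python sketch, which is separate from Xtend IVR, checks a fragment for well-formedness before it is pasted into a script. The function name `is_well_formed` is hypothetical, and the check assumes the fragment follows strict XML rules; the TTS engine itself may be more lenient, for example about tag case.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment: str) -> bool:
    """Return True if the tagged fragment parses as well-formed XML."""
    try:
        # A dummy root lets the fragment contain plain text and
        # several top-level tags, as the SPEAK strings do.
        ET.fromstring(f"<root>{fragment}</root>")
        return True
    except ET.ParseError:
        return False


good = ('<volume level="50">Fifty.<volume level="80">Eighty.'
        '</volume></volume><volume level="100"/>Full volume.')
# The outer <pitch> tag is never closed.
bad = ('<pitch absmiddle="5">Pitch five.'
       '<pitch absmiddle="-5">Negative five.</pitch>')

print(is_well_formed(good))
print(is_well_formed(bad))
```

A literal `&` or `<` in the spoken text also fails this check, so such characters need to be escaped or avoided in the fragment.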