Feature Article
Speech Recognition - Let your web pages do the talking



By Wendy Herman, Nortel

Advances in speech technologies are making it possible to convert text into voice and voice into text, providing new ways to access information by telephone or computer. Any information from a web page or computer database can now be transferred to voice applications, giving users access from virtually anywhere, at any time, on a wider variety of devices.

In the following questions and answers, Nortel's Dr. Manish Sharma discusses the current capabilities of speech technologies and how they are being transferred to real-world applications. Sharma has been involved in speech technology research for 13 years, starting his career researching speech verification technologies for military applications while completing his doctorate at Rutgers University. He currently leads a Nortel team of speech application designers and developers that adapts current technologies to specific customer needs.

Sharma was named one of the world's 10 top leaders in speech technology at the speech industry's largest conference, SpeechTEK 2004, held in New York City.

Nortel, an industry leader in speech technologies for more than 25 years, received the Frost & Sullivan 2004 Speech Solutions Competitive Strategy Award for its accomplishments in deploying speech technologies for enterprises. Frost & Sullivan also selected Nortel as the recipient of its 2004 Market Leadership Award for the company's lead in the U.S. Interactive Voice Response (IVR) market.

According to Frost & Sullivan market research, the company has designed and deployed over 200 speech applications in 15 languages across more than 16 countries.

Q. You were recognized at SpeechTEK as one of the world's 10 top leaders for your contributions to the speech technology industry over the past year. How would you describe those contributions?

Sharma: I believe the award reflects both my personal achievements and Nortel's leadership in the industry. The SpeechTEK awards are based on voting by my peers and our competitors, and they see me as the face of Nortel, leading this particular area of technology development for the company. Nortel is well thought of in the speech technology industry and people respect our position in the enterprise marketplace. I see myself as benefiting from our very positive industry position.

Q. What are the most popular ways speech technologies are being used today?

Sharma: Speech recognition and text-to-speech technologies are now entering mainstream use in a wide range of business sectors, such as financial services, airlines, telecommunications, utilities, entertainment and government, where they complete repetitive tasks over the telephone without involving a human operator.

My team at Nortel has been designing and deploying speech applications tailored to each customer's needs in a wide range of industries, especially those where call centers are an integral part of daily operations. Call centers are so common today that voice applications can be deployed in just about any industry to automate routine, repetitive tasks like routing calls to the right department or taking a customer's name and address. Customer satisfaction also increases because simple transactions and inquiries are either completed easily or quickly referred to the appropriate agent based on the caller's spoken request.

One large utility company we are working with is going to deploy speech services for bill payments and for starting or stopping utility services. This has been difficult to automate until now because customers can't enter street names on a telephone keypad, so every call involving an address had to be handled by an agent. Speech technologies can handle that routine service, freeing up operators and agents to work with customers on more complex inquiries.

Now that text on web pages or in computer databases can easily be converted to speech, the possibilities are huge for how information can be accessed either on the computer or by telephone.

Q. What is driving today's growing acceptance of speech applications for business use?

Sharma: In my opinion, there are two key drivers - the maturity of the solutions and the establishment of industry standards.

First, speech solutions have matured over the last several years. The error rate of speech recognition systems is now very low, which means they can understand the human voice across a variety of languages and accents without frustrating users. Natural language routing technology lets a user answer an open-ended question about why they are calling, and the application uses natural language understanding to route the call accurately to the appropriate destination.

Along with the science and technology of speech, the art of voice user interface design has also matured significantly. Not only can these applications understand naturally spoken caller input, but the system's prompting and spoken output are also very natural. The whole human-machine interface has become very refined. This maturity in the science and art of speech application design has greatly increased the success of new speech applications, and that has made business leaders more comfortable putting these technologies into daily use.

Second, there are now industry standards for speech application development that have been endorsed by the World Wide Web Consortium (W3C). For example, HTML has been a web standard for years. Everyone uses it to present text visually over an easy-to-read graphical interface. Now, industry standards for converting text to speech and for speech recognition applications, such as Voice eXtensible Markup Language (VXML), Call Control eXtensible Markup Language (CCXML) and Session Initiation Protocol (SIP), are providing an impetus to the entire industry. Having industry standards has strengthened the credibility of the speech solutions industry because everyone is designing applications to the same standard, and businesses feel their investments are protected because they can buy applications from more than one vendor.

Q. New security systems are being developed through biometrics, the science of using a person's biological characteristics such as fingerprints, voice or eyes as a way to authenticate their identity. Do speech technologies have a role to play in biometrics?

Sharma: The long-term vision is that our voice will become like our fingerprint, but the adoption of voice verification is not quite there yet. The error rates in voice verification systems are as low as one or two percent, which is acceptable for several commercial applications, but others need ironclad 100 percent accuracy, and that will come.

The commercial value of biometric technologies is strong because security is based on something you are that doesn't change, like your voice or fingerprint, rather than something you have that can be lost like a credit card, or something you know, that can be stolen like a PIN. Voice verification will always be more secure than a PIN because voice is too difficult for an imposter to duplicate.

The range of applications for voice verification in biometrics is broader than with fingerprints or eyes because you don't have to physically be at a location to authenticate who you are. If your voice print is on record with a government agency or company, you can be automatically authenticated over the telephone.

Q. What can we expect from speech technologies in the next four years?

Sharma: It's difficult to predict which speech technology will be most prevalent, but I believe that in the next two to four years, speech recognition will become the predominant interface of choice for voice self-service applications. We will also see more speech applications integrated into Internet Protocol (IP)-based technologies like SIP, which is beginning to be used for the management of an individual's communication services.

One popular capability of SIP is that it can direct all your messages - voice, email and fax - to one inbox, which can be accessed from a computer or PDA. Add speech recognition, text-to-speech and speech-to-text capabilities to SIP and you will be able to access all your messages by phone from wherever you are, without computer access. An email message can be spoken to you over the phone, for example, and you can dictate a response to send back.

In general, the real-world applications of speech technologies have advanced faster than I ever thought possible. Ten years ago I would have said the widespread use of speech recognition and text-to-speech applications was at least 20 years away, yet they are already starting to surround us today. They matured very quickly and I'm sure the same will hold true for other applications now being developed, particularly in the area of customer service and multimedia applications like SIP. These are areas where Nortel is focusing its research and development efforts, and we expect them to bring exciting, leading-edge competitive advantages in productivity, cost savings and customer satisfaction for all businesses, regardless of size.
