CMU scientists designing automated phone system that 'speaks your language'

Monday, November 25, 2002

By Byron Spice, Post-Gazette Science Editor

Maxine Eskenazi doesn't need to be convinced that elderly people have trouble with automated phone systems. Consider the collect call that the Carnegie Mellon University researcher placed -- or tried to place -- to her aging mother.

As Eskenazi waited on one end of the line, she heard her mother answer the phone and she listened as a fast-paced recorded message asked if her mother would accept the call.

"What?" came her mother's reply.

The message was replayed, just as fast.

"What?"

The recording began again.

Click.

Her mother's hangup continues to reverberate today. Eskenazi, a systems scientist, and computer scientist Alan Black, both of CMU's Language Technologies Institute, are beginning a three-year project that will help the Port Authority of Allegheny County develop an automated information system for its bus and light rail schedules and make it mother-friendly.

The system would respond to spoken queries, not to numbers that the caller punches on the phone's keypad.

The people who staff the Port Authority's customer service line now handle about 3,500 calls a day, said Bob Grove, Port Authority spokesman. But they are available only during certain hours, such as 7 a.m. to 7 p.m. weekdays. Schedules also are available on the authority's Web site, though not everybody has access to the Internet.

Once it is in service, however, the automated system would provide the information by phone 24 hours a day. It would be offered in addition to the existing customer service representatives, Grove added.

But, if it's going to work, it has to work for people like Eskenazi's mother. So, with the help of a $650,000 grant from the National Science Foundation, she and Black are trying to figure out how to design a system that can make itself understood to elderly people and that can also understand people of all ages for whom English is not a native language.

If they can do that, the reasoning goes, then they should have the tools to build a system that could be used by just about anyone.

Getting computers to carry on a spoken dialogue with people isn't easy, but it's possible. The Language Technologies Institute already has done it, Black noted, as part of a program sponsored by the Defense Advanced Research Projects Agency. Called CMU Communicator, it's an automated airplane reservation phone line.

"You simply call and say, 'I want to fly to Boston tomorrow and I don't want to fly through Newark,'" Black said. The system then makes the reservation and prints out tickets.

That project is finished and, though there is talk of developing a commercial version, its use was limited to a narrow group of researchers. Limited tests during CMU Homecoming festivities showed that older people had some trouble with it, Black noted. Bus and T schedules are simpler than airline schedules, Black said, but the broad cross-section of people who will use the Port Authority system will pose a design challenge.

Improving the ability of computers to talk with people has implications well beyond bus schedules, however. As computers become more prevalent, technologists increasingly are looking at ways for people to interact with computers without use of a keyboard. Automobiles may someday rely on voice-activated systems, for instance. Mobile robots being designed to assist nursing home residents might also interact using speech.

Many people who called the Port Authority customer service line in the past month may already be part of the research project, which is called "Let's Go." As the phone line's recorded greeting notes, calls are "monitored and recorded" for, among other reasons, "technical research." In this case, CMU researchers are analyzing the interchanges, paying attention to what the callers say, what kind of information they seek, and how the customer service workers respond.

"We want to see what people do when someone's having trouble understanding," Eskenazi explained. Do they change their choice of words? the speed at which they talk? the emphasis placed on certain words?

One thing is for sure, she said: "Speaking louder doesn't work."

Even for older people, who hear less well as they lose their ability to hear high-frequency sounds, the problems with understanding computer-synthesized speech have less to do with volume than with the unnatural sound of the speech and the inability of automated systems to make sure that the listener is paying attention.

"We sing a sort of song whenever we speak," Eskenazi said. "That keeps people's attention. Synthetic speech isn't quite there yet."

Black, an expert in speech synthesis, said eliminating the robotic sound of synthetic speech is difficult. Synthetic speech is typically "concatenated" speech -- assembled by stringing together a series of sounds, or phonemes, that have been recorded by human speakers. If done artfully, the result doesn't sound electronic, but it isn't quite human either.

That is because of subtleties of inflection, phrasing and rhythm that are still poorly understood, Black said.

The Let's Go project will try to determine which of these factors are important.

Non-native speakers pose a different challenge. Eskenazi, who has developed software that teaches correct pronunciation, said the system must be designed to handle mispronounced or inappropriate words. Consequently, it will need to first confirm that it understood the request before it conveys the information.

If someone asked for a "timelog," for instance, the system might reply by asking, "Did you say timelog?" or "I believe you asked for the bus schedule..."

The Let's Go team, which includes senior research scientist Lori Levin, visiting scientist Rita Singh and graduate students Antoine Raux and Brian Langner, expects to have a crude form of the system operating and available for initial testing within a few months. Grove said the Port Authority has no firm plans about when the system will be placed in use.


Byron Spice can be reached at bspice@post-gazette.com or 412-263-1578.
