Researchers suggest more effective way for automatic speech recognition

Washington [US], September 2 (ANI): Popular voice assistants such as Siri and Amazon Alexa have made automated speech recognition (ASR) available to the general public. Despite decades of development, ASR models still struggle with consistency and dependability, particularly in noisy environments. Chinese researchers have created a framework that significantly improves ASR performance amid the chaos of everyday acoustic environments.

ANI Sep 02, 2022 23:55 IST

Researchers from the Hong Kong University of Science and Technology and WeBank proposed a new framework, phonetic-semantic pre-training (PSP), and demonstrated the robustness of their model on synthetic, highly noisy speech datasets.
Their study was published in CAAI Artificial Intelligence Research on Aug. 28.
"Robustness is a long-standing challenge for ASR," said Xueyang Wu from the Hong Kong University of Science and Technology Department of Computer Science and Engineering. "We want to increase the robustness of the Chinese ASR system with a low cost."
ASR uses machine learning and other artificial intelligence techniques to automatically transcribe speech into text for uses such as voice-activated systems and transcription software. But new consumer-focused applications increasingly call for voice recognition to work better: to handle more languages and accents, and to perform more reliably in real-life situations such as video conferencing and live interviews.
Traditionally, training the acoustic and language models that comprise ASR requires large amounts of noise-specific data, which can be time- and cost-prohibitive.
The acoustic model (AM) turns spoken words into sequences of "phones," the basic units of speech sound. The language model (LM) decodes phones into natural-language sentences, usually in a two-step process: a fast but relatively weak LM generates a set of sentence candidates, and a powerful but computationally expensive LM selects the best sentence from those candidates.
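The two-step decoding described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy lexicon, the unigram and bigram scores, and the function names are hypothetical and do not come from the study. A cheap unigram scorer stands in for the fast first-pass LM, and a context-aware bigram scorer stands in for the expensive second-pass LM.

```python
import itertools

# Hypothetical toy lexicon: each phone group maps to its homophone candidates.
LEXICON = {
    "ay": ["I", "eye"],
    "s iy": ["see", "sea"],
    "dh ah": ["the"],
}

# Hypothetical unigram log-scores used by the fast first pass.
UNIGRAM = {"I": -1.0, "eye": -3.0, "see": -2.0, "sea": -1.5, "the": -0.5}

# Hypothetical bigram bonuses standing in for the expensive second-pass LM.
BIGRAM = {("I", "see"): 2.0, ("see", "the"): 1.0, ("the", "sea"): 2.0}


def first_pass(phone_groups, n_best=3):
    """Fast, weak LM: score each word combination by unigrams, keep the top n."""
    options = [LEXICON[group] for group in phone_groups]
    scored = [
        (sum(UNIGRAM[word] for word in words), list(words))
        for words in itertools.product(*options)
    ]
    # Sort by score only, so ties keep their enumeration order.
    scored.sort(key=lambda item: item[0], reverse=True)
    return [words for _, words in scored[:n_best]]


def second_pass(candidates):
    """Slow, strong LM: rescore the candidates using word-to-word context."""
    def score(words):
        unigram = sum(UNIGRAM[word] for word in words)
        bigram = sum(BIGRAM.get(pair, 0.0) for pair in zip(words, words[1:]))
        return unigram + bigram
    return max(candidates, key=score)


def decode(phone_groups, n_best=3):
    """Run both passes: generate candidates, then pick the best one."""
    return second_pass(first_pass(phone_groups, n_best))


if __name__ == "__main__":
    phones = ["ay", "s iy", "dh ah", "s iy"]
    print(decode(phones))            # second pass chooses among three candidates
    print(decode(phones, n_best=1))  # second pass can only accept the first-pass pick
```

With `n_best=1` the second pass receives a single candidate and cannot correct it, which is the failure mode a weak first pass creates.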
"Traditional learning models are not robust against noisy acoustic model outputs, especially for Chinese polyphonic words with identical pronunciation," Wu said. "If the first pass of the learning model decoding is incorrect, it is extremely hard for the second pass to make it up."
The newly proposed PSP framework makes it easier to recover misclassified words. By pre-training a model that translates the AM outputs directly into sentences, using the full context information, researchers can help the LM recover efficiently from noisy AM outputs.
The PSP framework trains the model through a pre-training regime called a noise-aware curriculum, which introduces new skills gradually, starting with easy tasks and moving on to more complex ones.
"The most crucial part of our proposed method, Noise-aware Curriculum Learning, simulates the mechanism of how human beings recognize a sentence from noisy speech," Wu said.
Warm-up is the first stage, in which researchers pre-train a phone-to-word transducer on clean phone sequences. These sequences are generated from unlabeled text data alone, which cuts down on annotation time. This stage "warms up" the model, initializing the basic parameters that map phone sequences to words.
In the second stage, self-supervised learning, the transducer learns from more complex data generated by self-supervised training techniques and functions. Finally, the resultant phone-to-word transducer is fine-tuned with real-world speech data.
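The idea of starting on clean phone sequences and then moving to harder, noisier ones can be sketched as a data schedule. This is only an illustration under stated assumptions: the article does not describe the study's actual noise functions, so the random phone substitution, the linear ramp, the phone inventory, and all names below are hypothetical. The final fine-tuning on real-world speech data is not modelled.

```python
import random

# Hypothetical phone inventory; a real system would use the AM's own phone set.
PHONES = ["a", "i", "u", "b", "d", "g", "m", "n", "sh", "zh"]


def corrupt(phones, noise_rate, rng):
    """Swap each phone for a different random phone with probability noise_rate."""
    noisy = []
    for phone in phones:
        if rng.random() < noise_rate:
            noisy.append(rng.choice([p for p in PHONES if p != phone]))
        else:
            noisy.append(phone)
    return noisy


def curriculum(num_epochs, max_noise=0.3, warmup_epochs=2):
    """Yield one noise rate per epoch: zero during warm-up, then a linear ramp."""
    ramp = max(num_epochs - warmup_epochs, 1)
    for epoch in range(num_epochs):
        if epoch < warmup_epochs:
            yield 0.0  # warm-up: clean phone sequences only
        else:
            yield max_noise * (epoch - warmup_epochs + 1) / ramp


def make_training_pairs(clean_pairs, num_epochs, seed=0):
    """Yield (epoch, noise rate, batch) with (noisy phones, target words) pairs."""
    rng = random.Random(seed)
    for epoch, rate in enumerate(curriculum(num_epochs)):
        batch = [(corrupt(phones, rate, rng), words) for phones, words in clean_pairs]
        yield epoch, rate, batch


if __name__ == "__main__":
    clean = [(["n", "i", "b", "a"], ["ni", "ba"])]
    for epoch, rate, batch in make_training_pairs(clean, num_epochs=5):
        print(epoch, round(rate, 2), batch[0][0])
```

A phone-to-word model trained on these batches would see the same target words throughout while its inputs become progressively less reliable.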
The researchers experimentally demonstrated the effectiveness of their framework on two real-life datasets collected from industrial scenarios and synthetic noise. Results showed that the PSP framework improves the traditional ASR pipeline, with relative reductions in character error rate of 28.63% on the first dataset and 26.38% on the second.
Next, the researchers plan to investigate more effective PSP pre-training methods with larger unpaired datasets, seeking to maximize the effectiveness of pre-training for a noise-robust LM.
Other contributors include Rongzhong Lian, Di Jiang, Yuanfeng Song, Weiwei Zhao, Qian Xu, and Qiang Yang from WeBank Co. Ltd. Qian Xu and Qiang Yang are also affiliated with The Hong Kong University of Science and Technology. (ANI)
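For readers unfamiliar with the metric, a short sketch shows how character error rate (CER) and a relative reduction are commonly computed. The article gives only the relative figures, so the absolute error rates in the example are hypothetical numbers chosen for illustration, and the function names are likewise assumptions.

```python
def edit_distance(reference, hypothesis):
    """Levenshtein distance: minimum insertions, deletions and substitutions."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            current.append(min(
                previous[j] + 1,                           # deletion
                current[j - 1] + 1,                        # insertion
                previous[j - 1] + (ref_char != hyp_char),  # substitution or match
            ))
        previous = current
    return previous[-1]


def cer(reference, hypothesis):
    """Character error rate: edit distance divided by the reference length."""
    return edit_distance(reference, hypothesis) / len(reference)


def relative_reduction(baseline_cer, new_cer):
    """Fraction of the baseline error that the new system removes."""
    return (baseline_cer - new_cer) / baseline_cer


if __name__ == "__main__":
    print(cer("abcd", "abxd"))
    # Hypothetical absolute CERs: a drop from 10% to 7.137% is a 28.63% relative cut.
    print(round(relative_reduction(0.10, 0.07137), 4))
```

A relative reduction is measured against the baseline error, so a 28.63% relative reduction does not mean the error rate fell by 28.63 percentage points.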
