# Technology

Google AI can focus on individual speakers in a crowd

This is an important development, as computers are not as good as humans at focusing their attention on a particular person in a noisy environment.

IANS | San Francisco | April 13, 2018 11:41 am


Just as most smartphone cameras now allow users to focus on a single object among many, it may soon be possible to pick out individual voices in a crowd by suppressing all other sounds, thanks to a new Artificial Intelligence (AI) system developed by Google researchers.

This is an important development, as computers are not as good as humans at focusing their attention on a particular person in a noisy environment.

Known as the cocktail party effect, the capability to mentally “mute” all other voices and sounds comes naturally to humans.

However, automatic speech separation — separating an audio signal into its individual speech sources — remains a significant challenge for computers, Inbar Mosseri and Oran Lang, software engineers at Google Research, wrote in a blog post this week.

In a new paper, the researchers presented a deep learning audio-visual model for isolating a single speech signal from a mixture of sounds such as other voices and background noise.

“In this work, we are able to computationally produce videos in which speech of specific people is enhanced while all other sounds are suppressed,” Mosseri and Lang said.

The method works on ordinary videos with a single audio track, and all that is required from the user is to select the face of the person in the video they want to hear, or to have such a person be selected algorithmically based on context.

The researchers believe this capability can have a wide range of applications, from speech enhancement and recognition in videos, through video conferencing, to improved hearing aids, especially in situations where there are multiple people speaking.

“A unique aspect of our technique is in combining both the auditory and visual signals of an input video to separate the speech,” the researchers said.

“Intuitively, movements of a person’s mouth, for example, should correlate with the sounds produced as that person is speaking, which in turn can help identify which parts of the audio correspond to that person,” they explained.

The visual signal not only improves the speech separation quality significantly in cases of mixed speech, but, importantly, it also associates the separated, clean speech tracks with the visible speakers in the video, the researchers said.
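The intuition the researchers describe, that mouth movement should track the sound a person produces, can be illustrated with a toy sketch. This is not the Google model, which is a deep learning system; it is a minimal example that assumes the audio has already been split into candidate tracks and that a per-frame "mouth opening" measurement is available for the selected face. The function names and all of the numbers below are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def match_speaker(mouth_opening, track_energies):
    """Pick the audio track whose loudness best follows the face's mouth movement.

    mouth_opening: per-frame mouth-opening values for the selected face (assumed input).
    track_energies: dict mapping a track name to its per-frame audio energy (assumed input).
    """
    scores = {
        name: pearson(mouth_opening, energy)
        for name, energy in track_energies.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Made-up data: eight video frames for one visible speaker and two candidate tracks.
mouth_opening = [0.1, 0.8, 0.9, 0.2, 0.1, 0.7, 0.8, 0.1]
track_energies = {
    "track_a": [0.2, 0.9, 1.0, 0.3, 0.2, 0.8, 0.9, 0.2],  # loud when the mouth is open
    "track_b": [0.9, 0.2, 0.1, 0.8, 0.9, 0.3, 0.2, 0.9],  # loud when the mouth is closed
}

best, scores = match_speaker(mouth_opening, track_energies)
print(best, scores)
```

With this made-up data the sketch selects `track_a`, the track that is loud whenever the mouth is open. A real system has to learn this relationship from video and audio jointly rather than rely on a single hand-picked correlation.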
