Monday, March 10, 2025


Growing Threat of AI-Generated Voice Deepfakes: A New Tool for Scams and Misinformation

News 1 DB

At a recent rally supporting the impeachment of South Korean President Yoon Suk Yeol, a song synthesized using artificial intelligence (AI) to mimic the president’s voice echoed through the crowd. Deep learning technology used for voice replication has advanced rapidly, making it increasingly difficult to distinguish between real and synthetic audio.

The main concern is the misuse of voice deepfakes in scams like voice phishing or spreading election-related misinformation.

The tech industry is actively developing detection technologies to counter these threats and differentiate between human and AI-generated voices.

According to a weekly technology trend report released Tuesday by the Institute for Information & Communication Technology Planning & Evaluation (IITP), voice deepfake detection technology analyzes the differences between real and synthetic voices. Researchers compile extensive voice datasets and train deep learning models to identify variations in frequency and acoustic characteristics.

These models analyze frequency bands using dedicated spectral metrics. Deepfake voices contain distinctive high-frequency components that differ from natural human speech, making detection possible.
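The report does not include code, but the frequency-band idea can be illustrated with a short sketch. This is a toy heuristic, not a real detector: the 4 kHz cutoff and the energy-ratio metric here are hypothetical stand-ins for characteristics that actual models learn from data.

```python
import numpy as np

def high_freq_energy_ratio(signal: np.ndarray, sample_rate: int,
                           cutoff_hz: float = 4000.0) -> float:
    """Fraction of spectral energy above cutoff_hz (a simple frequency-band metric)."""
    spectrum = np.abs(np.fft.rfft(signal)) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    total = spectrum.sum()
    if total == 0:
        return 0.0
    return float(spectrum[freqs >= cutoff_hz].sum() / total)

# Example: a pure 440 Hz tone has essentially no energy above 4 kHz.
sr = 16000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440 * t)
print(high_freq_energy_ratio(tone, sr))  # ≈ 0.0
```

A real system would compare such spectral statistics between large corpora of genuine and synthesized speech rather than apply a single fixed threshold.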

However, as text-to-speech (TTS) models improve, relying solely on frequency analysis has limitations. Recent methods use large-scale speech corpora to learn acoustic characteristics such as tone and intonation.

Institute for Information & Communication Technology Planning & Evaluation

Once voice characteristics are extracted, detection models identify differences. The latest detection technologies use deep learning models called AASIST and Conformer.

The AASIST model learns frequency and temporal voice information. It accurately detects voice spoofing using a Graph Attention Network (GAT) that assigns weights to key features in graph data.

The Conformer model combines Convolution and Transformer modules. The Convolution module captures short-term patterns and local voice features, while the Transformer module learns the signal's global characteristics. This enables the model to analyze long-context information effectively.

This combination allows Conformer to recognize both long-context and detailed patterns, significantly improving voice recognition accuracy.
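The local-plus-global idea behind Conformer can be sketched in a few lines. This is a heavily simplified toy block, not the published architecture (which also uses feed-forward modules, layer normalization, and multi-head attention); it only shows how a convolution over neighboring frames and self-attention over all frames combine on the same features.

```python
import numpy as np

def depthwise_conv(x: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Slide a 1-D kernel over time, per feature channel: local, short-term patterns."""
    pad = len(kernel) // 2
    xp = np.pad(x, ((pad, pad), (0, 0)), mode="edge")
    return np.array([(xp[t:t + len(kernel)] * kernel[:, None]).sum(axis=0)
                     for t in range(x.shape[0])])

def self_attention(x: np.ndarray) -> np.ndarray:
    """Single-head self-attention: every frame attends to all frames (global context)."""
    scores = x @ x.T / np.sqrt(x.shape[1])
    w = np.exp(scores - scores.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)
    return w @ x

def toy_conformer_block(x: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Residual attention (global) followed by residual convolution (local)."""
    x = x + self_attention(x)
    return x + depthwise_conv(x, kernel)

frames = np.random.default_rng(0).normal(size=(10, 4))  # 10 frames x 4 features
out = toy_conformer_block(frames, np.ones(3) / 3)       # shape preserved: (10, 4)
```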

Detection technology performance is evaluated using the Equal Error Rate (EER), which measures the point where the False Acceptance Rate (FAR) equals the False Rejection Rate (FRR). A lower EER indicates higher accuracy.
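The EER metric described above is straightforward to compute from a set of detector scores. A minimal sketch, assuming higher scores mean "more likely genuine" and labeling genuine audio 1 and spoofed audio 0:

```python
import numpy as np

def equal_error_rate(scores, labels) -> float:
    """EER: the operating point where FAR (spoofs accepted) equals FRR (genuine rejected).

    scores: higher = more likely genuine; labels: 1 = genuine, 0 = spoof.
    Sweeps every candidate threshold and returns (FAR + FRR) / 2 where they are closest.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    eer, best_gap = 1.0, np.inf
    for th in np.sort(np.unique(scores)):
        accept = scores >= th
        far = accept[labels == 0].mean()      # false acceptance rate
        frr = (~accept)[labels == 1].mean()   # false rejection rate
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return float(eer)

# Perfectly separable scores give an EER of 0 — a lower EER means higher accuracy.
print(equal_error_rate([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0]))  # 0.0
```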

However, detection technology alone cannot guarantee complete protection. New threats have emerged, such as adding noise to voices or partially synthesizing them with TTS.

Researchers counter adversarial attacks such as noise insertion with adversarial training, generating varied adversarial samples and using them to harden detection models. Because every attack type must be implemented and each sample tested, this method is resource-intensive.
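One simple instance of this augmentation, sketched below, mixes white noise into clean samples at several signal-to-noise ratios so the detector sees each attack condition during training. The function names and SNR levels are hypothetical choices for illustration.

```python
import numpy as np

def add_noise(signal, snr_db: float, rng=None) -> np.ndarray:
    """Mix white noise into a signal at a target signal-to-noise ratio (in dB)."""
    rng = np.random.default_rng(rng)
    signal = np.asarray(signal, dtype=float)
    noise_power = np.mean(signal ** 2) / (10 ** (snr_db / 10))
    return signal + rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)

def augment_dataset(clean_samples, snr_levels=(20, 10, 5)):
    """Expand the training set with one noisy variant per clean sample per SNR level."""
    augmented = list(clean_samples)
    for sample in clean_samples:
        for snr in snr_levels:
            augmented.append(add_noise(sample, snr))
    return augmented
```

The cost the article mentions is visible here: each attack type multiplies the dataset (one clean sample becomes four above), and every variant must be generated and evaluated.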

For partial modulation, detection is applied at both the segment and utterance levels. Segment-level analysis breaks sentences into parts to detect alterations, while utterance-level analysis checks whether the entire sentence has been modified.
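The two-level scheme can be sketched as follows. The detector here is a toy stand-in (mean absolute amplitude, a hypothetical score), but it shows why segment-level analysis matters: a short spliced region can be diluted below the threshold when the whole utterance is scored at once, yet stand out within its own segment.

```python
import numpy as np

def segment_scores(signal, detector, segment_len: int):
    """Score each fixed-length segment separately to catch partial synthesis."""
    n_segments = len(signal) // segment_len
    return [detector(signal[i * segment_len:(i + 1) * segment_len])
            for i in range(n_segments)]

def flag_deepfake(signal, detector, segment_len: int, threshold: float = 0.5) -> bool:
    """Utterance-level check on the whole signal, plus segment-level checks."""
    if detector(signal) > threshold:          # was the entire sentence modified?
        return True
    return any(score > threshold              # was any single part modified?
               for score in segment_scores(signal, detector, segment_len))

# Toy detector: mean absolute amplitude stands in for a real model's score.
toy_detector = lambda x: float(np.mean(np.abs(x)))

signal = np.zeros(100)
signal[50:60] = 1.0  # a short "spliced" region
print(flag_deepfake(signal, toy_detector, segment_len=10))  # True (caught at segment level)
```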

Professor Hong Gi Hoon from Soongsil University’s Department of Electronic Information Engineering emphasized the growing importance of detecting voice deepfakes. He warned that fake voices can spread misinformation and stressed the need for continued research led by governments and academic institutions to establish a secure AI environment.
