In recent years, authentication requirements in the financial services sector have undergone a major transformation. It’s no longer just about validating credentials; it’s about protecting identities, preserving user experience, and, most importantly, complying with increasingly strict regulations like PSD2 and the upcoming PSD3.
One of the most promising developments in this area is the use of voice biometrics for authentication. Thanks to its unique characteristics, this AI-powered technology is gaining traction as a secure, convenient, and regulation-compliant method, whether as a standalone factor in certain contexts or as part of a multi-factor authentication (MFA) framework.
The new authentication paradigm under PSD2 and PSD3
The revised Payment Services Directive (PSD2) introduced the concept of Strong Customer Authentication (SCA), which requires payment service providers (PSPs) to apply at least two out of three authentication elements:
- Knowledge: Something only the user knows (e.g., a password).
- Possession: Something only the user has (e.g., a mobile device).
- Inherence: Something the user is (e.g., a biometric trait).
Voice biometrics falls into this third category.
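The two-out-of-three rule can be expressed as a simple check. The sketch below is illustrative only: the factor names and the `FACTOR_CATEGORY` mapping are hypothetical, and a real PSP would derive them from its own authentication stack and legal guidance.

```python
from enum import Enum


class Category(Enum):
    KNOWLEDGE = "knowledge"
    POSSESSION = "possession"
    INHERENCE = "inherence"


# Hypothetical mapping of factor names to SCA categories (illustrative only).
FACTOR_CATEGORY = {
    "password": Category.KNOWLEDGE,
    "pin": Category.KNOWLEDGE,
    "sms_otp": Category.POSSESSION,
    "device_binding": Category.POSSESSION,
    "voice": Category.INHERENCE,
    "fingerprint": Category.INHERENCE,
}


def satisfies_sca(verified_factors):
    """Return True if the verified factors cover at least two distinct categories."""
    categories = {
        FACTOR_CATEGORY[name] for name in verified_factors if name in FACTOR_CATEGORY
    }
    return len(categories) >= 2


print(satisfies_sca(["sms_otp", "voice"]))  # possession + inherence
print(satisfies_sca(["password", "pin"]))   # knowledge only
```

Unknown factor names are ignored rather than trusted, so a typo or an unregistered method can never count toward the two required categories.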
The PSD3 proposal, still going through the legislative process, not only maintains this framework but adds more flexibility in how factors can be combined. It could allow, for instance, the use of two elements from the same category as long as they remain independent of each other, such as combining two different biometric methods (like voice and fingerprint), or two forms of possession.
The regulatory framework is also tightening requirements around remote identity verification, access traceability, and the need for auditable mechanisms that demonstrate how authentication was applied in each transaction. This is where AI-driven solutions, particularly voice biometrics, offer both a competitive and operational edge.
What is voice biometrics, and how does it work?
Voice biometrics authenticates a person based on the unique characteristics of their voice. Unlike speech recognition, which focuses on what is being said, voice biometrics focuses on who is saying it.
Every human voice has distinctive patterns resulting from both physiological factors (like the length of the vocal tract or size of the larynx) and behavioral ones (intonation, rhythm, pauses). Voice biometric systems analyze these traits to generate a unique digital voiceprint.
When a user interacts with a system, such as calling a support center or speaking to a virtual assistant, AI compares their voice with the stored voiceprint and calculates a match score to determine if the authentication is valid.
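As a rough illustration of the matching step, the sketch below compares two fixed-length vectors with cosine similarity and applies a threshold. It assumes the voiceprint and the incoming sample have already been converted into numeric embeddings by some speaker model, which is not shown; the function names, the example vectors, and the 0.75 threshold are all hypothetical, and production systems calibrate thresholds against measured error rates.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors, in [-1, 1]."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a silent or empty sample matches nothing
    return dot / (norm_a * norm_b)


def verify_speaker(sample_embedding, stored_voiceprint, threshold=0.75):
    """Return (accepted, score). The threshold is illustrative, not a recommendation."""
    score = cosine_similarity(sample_embedding, stored_voiceprint)
    return score >= threshold, score


# Toy 4-dimensional vectors; real embeddings have hundreds of dimensions.
voiceprint = [0.12, 0.80, 0.35, 0.41]
same_speaker = [0.10, 0.78, 0.40, 0.39]
other_speaker = [0.90, 0.05, 0.10, 0.02]

print(verify_speaker(same_speaker, voiceprint))
print(verify_speaker(other_speaker, voiceprint))
```

Returning the score alongside the decision matters for the audit requirements discussed later: the score is the evidence of how the decision was reached.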
Voice biometrics as an inherence authentication method
Under the SCA requirements, voice biometrics is an especially effective “inherence” method and offers several advantages over other biometric technologies:
- No special hardware required: Fingerprint recognition needs a dedicated sensor and facial recognition needs a camera. Voice needs only a microphone, which is built into virtually every smartphone, computer, and telephone.
- Remote and frictionless: No physical interaction or complex gestures needed. Users can authenticate simply by speaking.
- High accuracy and noise tolerance: Thanks to advancements in deep learning, modern systems can authenticate users in noisy environments, even with short phrases or natural language.
Voice biometrics also enables passive authentication; the system can verify the user’s identity while they’re performing another task, such as requesting a service or dictating a message, with no additional steps required.
As a second authentication factor: seamless security
In many implementations, voice biometrics is used not as the sole authentication method but as part of a multi-factor approach. This hybrid model is particularly effective when:
- You want to eliminate passwords or PINs, which are often weak or vulnerable.
- The user doesn’t have access to a visual interface (such as during phone interactions).
- There’s a need to boost security without sacrificing user experience.
A typical secure authentication might involve:
- A code sent by SMS (possession).
- Voice verification upon response (inherence).
This setup combines two different SCA categories, as PSD2 requires and as the PSD3 proposal continues to require, and it offers a better user experience than more invasive methods like facial recognition, especially in environments where such methods aren’t practical (offices, outdoor areas, etc.).
Strengths compared to other biometric methods
| Method | Strengths | Limitations |
|---|---|---|
| Fingerprint | Fast, familiar | Requires a physical sensor |
| Facial Recognition | High visual precision | Affected by lighting and positioning |
| Voice Biometrics | Hands-free, non-intrusive, no special hardware; can be combined with deepfake detection | Can be impacted by audio quality and ambient noise |
Voice biometrics stands out especially in screenless environments (call centers, IoT devices, conversational interfaces) and for user segments where ease of use is critical, like older adults or individuals with visual impairments.
Regulatory compliance and traceability
Voice biometric authentication not only aligns with SCA principles but also enables:
- Process auditability: Each authentication produces a match score, a decision, and session logs, making it easy to audit.
- Secure and complete recording: With certified recordings or interaction hashes, the authentication process can be recorded in a tamper-evident way.
- User privacy protection: Encryption and pseudonymization of voiceprints support GDPR compliance. Under GDPR, biometric data used to identify a person is a special category of personal data, so a valid legal basis such as explicit consent is also needed.
These features not only satisfy regulators but also offer transparency in the case of disputes over unauthorized access.
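One way to make such records tamper-evident is to chain them with hashes, so that changing any past entry invalidates everything after it. The sketch below is a minimal illustration under assumed field names (`session_id`, `match_score`, `decision`, `timestamp`); it is not a certified recording mechanism, and a real deployment would also anchor the latest hash somewhere the operator cannot rewrite.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first record


def _digest(body):
    """SHA-256 over a canonical JSON encoding of the record body."""
    payload = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_audit_record(log, session_id, match_score, decision, timestamp):
    """Append a record whose hash covers its content and the previous record's hash."""
    body = {
        "session_id": session_id,
        "match_score": match_score,
        "decision": decision,
        "timestamp": timestamp,
        "prev_hash": log[-1]["hash"] if log else GENESIS_HASH,
    }
    record = dict(body, hash=_digest(body))
    log.append(record)
    return record


def verify_audit_log(log):
    """Recompute every hash; return False if any record was altered or reordered."""
    prev_hash = GENESIS_HASH
    for record in log:
        body = {key: value for key, value in record.items() if key != "hash"}
        if body["prev_hash"] != prev_hash or _digest(body) != record["hash"]:
            return False
        prev_hash = record["hash"]
    return True


log = []
append_audit_record(log, "sess-001", 0.93, "accept", "2025-01-15T10:00:00Z")
append_audit_record(log, "sess-002", 0.41, "reject", "2025-01-15T10:05:00Z")
print(verify_audit_log(log))
```

Note that the record stores only the score and decision, not the audio or the voiceprint, which keeps biometric data out of the audit trail.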
Real-world use cases in financial services
Voice biometrics is already being adopted in various real-world financial scenarios:
- Banking call centers: Users are automatically identified during the call, eliminating the need for security questions and reducing call duration.
- Payment app access: A virtual assistant confirms the user’s identity by voice before authorizing a transaction.
- Voice-based contract signing: Paired with certified recordings, this validates the signer’s identity and intent.
- Remote onboarding identity verification: Voice complements document checks and facial recognition.
All of these use cases align closely with PSD2/PSD3 objectives: secure authentication that’s traceable, audit-ready, and low-friction.
Technical challenges and considerations
Despite the clear advantages, there are a few challenges to address when implementing voice biometrics:
- Spoofing protection: AI must detect voice fraud attempts, such as replay attacks or voice synthesis. Modern systems now include anti-spoofing and deepfake detection mechanisms.
- Voice changes: Illness, aging, or emotional states can alter a person’s voice. Reliable systems need to adapt or allow for controlled retraining.
- Privacy concerns: While less invasive than other biometrics, some users may still hesitate. Clear communication about data protection and user benefits is essential.
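The first two challenges can be sketched together. The example below assumes a separate detector supplies a `spoof_score` between 0 and 1 (that detector is hypothetical and not shown), rejects suspicious samples before looking at the match score, and lets the stored voiceprint drift slowly toward recent high-confidence samples using an exponential moving average. All thresholds and the update rate are illustrative values.

```python
def decide(match_score, spoof_score, match_threshold=0.75, spoof_threshold=0.5):
    """Accept only if the sample looks genuine and the voice matches."""
    if spoof_score >= spoof_threshold:
        return "reject_spoof"  # likely replayed or synthetic audio
    if match_score < match_threshold:
        return "reject_mismatch"
    return "accept"


def update_voiceprint(voiceprint, sample_embedding, match_score,
                      update_threshold=0.9, rate=0.1):
    """Blend a high-confidence sample into the voiceprint (exponential moving average)."""
    if match_score < update_threshold:
        return list(voiceprint)  # not confident enough to adapt
    return [
        (1 - rate) * stored + rate * new
        for stored, new in zip(voiceprint, sample_embedding)
    ]


print(decide(match_score=0.95, spoof_score=0.8))
print(decide(match_score=0.95, spoof_score=0.1))
print(update_voiceprint([1.0, 0.0], [0.0, 1.0], match_score=0.95))
```

Requiring a higher score to update than to accept is a deliberate choice in this sketch: it reduces the risk that a borderline or fraudulent sample gradually shifts the voiceprint toward an impostor.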
What role does AI play in all of this?
Modern voice biometrics is powered by machine learning models trained on thousands of hours of speech. These systems:
- Learn to distinguish individual voices, even in tough conditions.
- Detect anomalies in real time.
- Continuously update to adapt to new threats, use cases, and regulations.
AI isn’t just a supporting tool; it’s the backbone that enables voice-based authentication to be secure, scalable, and user-friendly.
In today’s landscape, and even more so in what’s coming with PSD3, voice biometrics is emerging as a strategic solution for secure authentication. Especially when integrated into broader AI-driven architectures, it not only meets regulatory demands but also enhances user experience and automates formerly manual processes.