Any voice-enabled product needs to perform well in a noisy environment, and audio front-end (AFE) algorithms play an important role in building a sensitive voice-user interface (VUI). Espressif’s AI Lab has created a set of audio front-end algorithms for this purpose. Customers can run these algorithms on Espressif’s powerful ESP32 and ESP32-S3 SoCs to build high-performance, yet low-cost, products with a voice-user interface.
Acoustic Echo Cancellation (AEC)
The Acoustic Echo Cancellation algorithm removes echoes from the audio captured by the microphone. This is beneficial when the device is playing back audio through its own speakers, because that playback would otherwise be picked up again as input.
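To illustrate the general idea of echo cancellation, here is a minimal sketch that uses a normalized least-mean-squares (NLMS) adaptive filter, a common textbook technique. It is not Espressif's implementation. The function names, the filter length, the step size and the synthetic echo path are all assumptions chosen for the example.

```python
import random

def nlms_echo_cancel(mic, ref, taps=8, mu=0.5, eps=1e-6):
    """Subtract an adaptive estimate of the loudspeaker echo from the mic signal.

    mic: samples captured by the microphone (echo only, in this sketch)
    ref: samples sent to the loudspeaker (the playback reference)
    """
    w = [0.0] * taps      # adaptive weights: running estimate of the echo path
    buf = [0.0] * taps    # most recent reference samples, newest first
    out = []
    for d, x in zip(mic, ref):
        buf = [x] + buf[:-1]
        y = sum(wi * xi for wi, xi in zip(w, buf))    # estimated echo
        e = d - y                                     # echo-reduced output
        norm = sum(xi * xi for xi in buf) + eps       # reference energy
        w = [wi + mu * e * xi / norm for wi, xi in zip(w, buf)]
        out.append(e)
    return out

def power(sig):
    """Mean squared amplitude of a signal."""
    return sum(s * s for s in sig) / len(sig)

# Synthetic test data (assumption): white-noise playback and a short,
# hypothetical echo path made of delayed, attenuated copies.
random.seed(0)
ref = [random.uniform(-1.0, 1.0) for _ in range(4000)]
echo_path = [0.0, 0.6, 0.3, -0.1]
mic = [
    sum(h * ref[n - k] for k, h in enumerate(echo_path) if n - k >= 0)
    for n in range(len(ref))
]

residual = nlms_echo_cancel(mic, ref)
print("echo power before:", power(mic[-1000:]))
print("echo power after: ", power(residual[-1000:]))
```

In this sketch the microphone signal contains only echo, so the residual should shrink toward zero as the filter converges. A real device would also have near-end speech in the microphone signal, and the filter would need to avoid adapting while the user is talking.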
Blind Source Separation (BSS)
The Blind Source Separation algorithm uses multiple microphones to detect the direction of the incoming audio and to enhance the input from that direction. This improves the quality of the desired audio source in a noisy environment.
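As a rough illustration of how two microphones can reveal the direction of a sound, the sketch below estimates the arrival delay between the two channels with a plain cross-correlation and converts it to an angle. This is a simplified direction estimate, not blind source separation itself and not Espressif's algorithm. The 16 kHz sample rate, the 60 mm spacing and all function names are assumptions made for the example.

```python
import math
import random

SPEED_OF_SOUND = 343.0  # m/s, in air at about 20 °C

def estimate_lag(a, b, max_lag):
    """Return the lag, in samples, by which channel b trails channel a.

    The lag is the shift that maximizes the cross-correlation of a and b.
    """
    n = len(a)
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = sum(
            a[i] * b[i + lag]
            for i in range(max(0, -lag), min(n, n - lag))
        )
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

def lag_to_angle(lag, spacing_m, sample_rate_hz):
    """Convert an inter-microphone lag to an arrival angle in degrees.

    0 degrees means the source is broadside (equidistant from both mics).
    """
    s = lag / sample_rate_hz * SPEED_OF_SOUND / spacing_m
    s = max(-1.0, min(1.0, s))  # clamp against rounding past the physical limit
    return math.degrees(math.asin(s))

# Synthetic test data (assumption): the same noise-like source reaches
# mic_b two samples later than mic_a.
random.seed(1)
source = [random.uniform(-1.0, 1.0) for _ in range(2000)]
true_lag = 2
mic_a = source
mic_b = [0.0] * true_lag + source[:-true_lag]

lag = estimate_lag(mic_a, mic_b, max_lag=4)
angle = lag_to_angle(lag, spacing_m=0.06, sample_rate_hz=16000)
print("estimated lag (samples):", lag)
print("estimated angle (degrees):", round(angle, 1))
```

Once a direction is known, a front end can favor sound arriving from it, which is the enhancement step described above.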
Noise Suppression (NS)
The Noise Suppression algorithm operates on single-channel audio signals. It suppresses unwanted non-human noise (for example, the sound of a vacuum cleaner or an air conditioner), thus improving the audio signal that needs to be processed.
WakeNet
Espressif's wake word engine, WakeNet, is designed to provide high-performance wake word detection with a low memory footprint. It enables devices to 'hear' wake words, such as “Alexa”, “Hi, lexin” and “Hi, ESP”.
Advantages
Outstanding Acoustic Performance
Espressif's AFE algorithms have been qualified by Amazon in Alexa far-field tests. The algorithms use our own wake word engine, which can meet Amazon's test requirements for multiple languages.
Low-Resource Consumption
Espressif's AFE algorithms are optimized to take advantage of the AI accelerator that is available in the ESP32-S3 SoC. They consume around 22% of CPU, 48 KB of SRAM and 1.1 MB of PSRAM. This leaves sufficient headroom for customer applications on the ESP32-S3 SoC.
Flexibility
Espressif's AFE algorithms offer an easy and intuitive API for customer applications, so that their behavior can be adjusted dynamically as required. The distance between the two microphones can be anywhere from 20 mm to 80 mm, which allows considerable flexibility in the hardware design of developers’ end-products.
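To give a feel for what the 20 mm to 80 mm spacing range means acoustically, the following sketch computes two standard quantities for a two-microphone array: the largest possible arrival delay between the microphones, and the frequency above which a plane wave can no longer be located unambiguously (half a wavelength equals the spacing). The 16 kHz sample rate and the function names are assumptions made for the example; they are not stated in the text above.

```python
SPEED_OF_SOUND = 343.0  # m/s, in air at about 20 °C

def max_delay_samples(spacing_mm, sample_rate_hz=16000):
    """Largest inter-microphone delay, in samples, for sound arriving end-on."""
    return spacing_mm / 1000.0 / SPEED_OF_SOUND * sample_rate_hz

def spatial_aliasing_hz(spacing_mm):
    """Frequency at which half a wavelength equals the microphone spacing."""
    return SPEED_OF_SOUND / (2.0 * spacing_mm / 1000.0)

for spacing in (20, 80):
    print(
        f"{spacing} mm: max delay {max_delay_samples(spacing):.2f} samples, "
        f"aliasing above {spatial_aliasing_hz(spacing):.0f} Hz"
    )
```

Under these assumptions, a wider spacing gives a larger delay to measure, while a narrower spacing stays unambiguous up to a higher frequency, which is one way to reason about the trade-off when placing the microphones.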
Amazon-Qualified “Audio Front End” Solution
Espressif's AFE algorithms have been qualified by Amazon as an “Audio Front End” solution for Alexa built-in devices. The algorithms have been optimized for the AI accelerator of ESP32-S3. The combination of Espressif’s audio algorithms and hardware provides 360-degree voice pickup while using just two microphones.