Technical Paper
A Target-Speech-Feature-Aware Module for U-Net Based Speech Enhancement
2024-04-09
2024-01-2021
Speech enhancement recovers target speech from noise-contaminated signals and improves its perceptual quality and intelligibility. The technology has significant potential for intelligent voice interaction in automotive applications. However, the in-vehicle noise environment is highly complex, and interfering human voices in particular pose substantial challenges for automotive voice interaction systems. To address this issue, this paper proposes a target-speech-feature-aware module for U-Net-based speech enhancement that extracts clean speech in environments with human voice interference, improving its perceptual quality and intelligibility. To extract the features of the target speech, the paper proposes an LSTM-based method designed for the intermediate layer of the U-Net. First, a bidirectional LSTM is used to capture the temporal characteristics of the compressed features produced by the encoder.
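The bidirectional LSTM step can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: the frame count, channel count, hidden size, random weights, and the helper names `init_lstm`, `lstm_pass`, and `bilstm` are all assumptions chosen for illustration, and a trained model would normally use a deep-learning framework instead. The sketch only shows how a forward and a backward pass over the encoder's compressed features are concatenated, so that each frame's output reflects both past and future context.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_lstm(input_size, hidden_size, rng):
    """Random weights for one LSTM direction (illustrative, untrained)."""
    scale = 1.0 / np.sqrt(hidden_size)
    return {
        "W": rng.uniform(-scale, scale, (4 * hidden_size, input_size)),
        "U": rng.uniform(-scale, scale, (4 * hidden_size, hidden_size)),
        "b": np.zeros(4 * hidden_size),
    }

def lstm_pass(x, params):
    """Run one LSTM direction over x with shape (frames, input_size)."""
    hidden_size = params["U"].shape[1]
    h = np.zeros(hidden_size)  # hidden state
    c = np.zeros(hidden_size)  # cell state
    outputs = []
    for x_t in x:
        gates = params["W"] @ x_t + params["U"] @ h + params["b"]
        i, f, g, o = np.split(gates, 4)  # input, forget, candidate, output
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
        outputs.append(h)
    return np.stack(outputs)

def bilstm(x, fwd_params, bwd_params):
    """Concatenate a forward pass and a time-reversed backward pass."""
    forward = lstm_pass(x, fwd_params)
    backward = lstm_pass(x[::-1], bwd_params)[::-1]  # re-align to frame order
    return np.concatenate([forward, backward], axis=1)

# Assumed shapes: 50 time frames of 64-channel bottleneck features.
rng = np.random.default_rng(0)
frames, channels, hidden = 50, 64, 32
bottleneck = rng.standard_normal((frames, channels))  # stand-in for encoder output
fwd_params = init_lstm(channels, hidden, rng)
bwd_params = init_lstm(channels, hidden, rng)

features = bilstm(bottleneck, fwd_params, bwd_params)
print(features.shape)  # (50, 64): forward and backward states per frame
```

Each output frame has `2 * hidden` values: the first half summarizes the frames up to and including the current one, and the second half summarizes the current frame and those after it.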