End-to-End Deep Learning-Based Adaptation Control for Linear Acoustic Echo Cancellation (2024)

research-article

Authors: Thomas Haubner, Andreas Brendel, and Walter Kellermann

IEEE/ACM Transactions on Audio, Speech, and Language Processing, Volume 32

Pages 227 - 238

Published: 19 October 2023


    Abstract

    The attenuation of acoustic loudspeaker echoes remains one of the open challenges in achieving pleasant full-duplex hands-free speech communication. In many modern signal enhancement interfaces, this problem is addressed by a linear acoustic echo canceler, which subtracts a loudspeaker echo estimate from the recorded microphone signal. To obtain precise echo estimates, the parameters of the echo canceler, i.e., the filter coefficients, need to be estimated quickly and precisely from the observed loudspeaker and microphone signals. This requires a sophisticated adaptation control that copes with high-power double-talk and rapidly tracks time-varying acoustic environments, which are often encountered with portable devices. In this paper, we address this problem by end-to-end deep learning. In particular, we propose inferring the step-size for a least mean squares frequency-domain adaptive filter update by a Deep Neural Network (DNN). Two different step-size inference approaches are investigated: broadband approaches, which use a single DNN to jointly infer step-sizes for all frequency bands, and narrowband methods, which exploit individual DNNs per frequency band. The discussion of the benefits and disadvantages of both approaches leads to a novel hybrid approach, which shows improved echo cancellation while requiring only small DNN architectures. Furthermore, we investigate the effect of different loss functions, signal feature vectors, and DNN output layer architectures on the echo cancellation performance, from which we obtain valuable insights into the general design and functionality of DNN-based adaptation control algorithms.
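
To make the setting in the abstract concrete, the following is a minimal sketch of a frequency-domain adaptive filter whose per-bin step-size is supplied by an external function. It is not the authors' implementation: the overlap-save structure, the names `fdaf_echo_canceller` and `fixed_step_size`, the power-smoothing constant, and the constant step-size of 0.5 are all assumptions chosen for illustration. In the paper's approach a trained DNN would take the place of `fixed_step_size`; the broadband variant would map features of all bins to all step-sizes at once, while the narrowband variant would process each bin separately.

```python
import numpy as np


def fixed_step_size(X, E, psd):
    """Hypothetical stand-in for the DNN: one constant step-size per bin."""
    return np.full(X.shape, 0.5)


def fdaf_echo_canceller(x, y, filt_len, step_size_fn):
    """Constrained overlap-save frequency-domain LMS echo canceller (sketch).

    x: loudspeaker signal, y: microphone signal (same length).
    step_size_fn(X, E, psd) returns one real step-size per frequency bin.
    Returns the error signal, i.e. the microphone signal minus the echo estimate.
    """
    M = filt_len
    n_fft = 2 * M
    W = np.zeros(M + 1, dtype=complex)   # filter coefficients per frequency bin
    psd = np.full(M + 1, 1e-6)           # smoothed loudspeaker power per bin
    x_buf = np.zeros(n_fft)              # previous and current input block
    e_out = np.zeros(len(y))
    for b in range(len(y) // M):
        sl = slice(b * M, (b + 1) * M)
        x_buf = np.concatenate([x_buf[M:], x[sl]])
        X = np.fft.rfft(x_buf)
        # Overlap-save: only the last M output samples are a linear convolution.
        y_hat = np.fft.irfft(X * W, n=n_fft)[M:]
        e = y[sl] - y_hat                # subtract the echo estimate
        e_out[sl] = e
        E = np.fft.rfft(np.concatenate([np.zeros(M), e]))
        psd = 0.5 * psd + 0.5 * np.abs(X) ** 2
        mu = step_size_fn(X, E, psd)     # adaptation control happens here
        grad = np.fft.irfft(mu * np.conj(X) * E / (psd + 1e-10), n=n_fft)
        grad[M:] = 0.0                   # gradient constraint: keep M filter taps
        W = W + np.fft.rfft(grad)
    return e_out


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    M = 64
    x = rng.standard_normal(M * 400)                            # synthetic far-end signal
    h = rng.standard_normal(M) * np.exp(-np.arange(M) / 10.0)   # toy echo path
    y = np.convolve(x, h)[: len(x)]                             # echo only, no near-end speech
    e = fdaf_echo_canceller(x, y, M, fixed_step_size)
    tail = slice(-M * 50, None)
    erle_db = 10 * np.log10(np.sum(y[tail] ** 2) / (np.sum(e[tail] ** 2) + 1e-30))
    print(f"ERLE over the last 50 blocks: {erle_db:.1f} dB")
```

The demo uses noise-free synthetic data, so a constant step-size is sufficient. During double-talk or after an echo path change, a fixed value is a poor compromise between robustness and tracking speed, which is the gap a learned, signal-dependent step-size is meant to close.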


    Index Terms

    1. End-to-End Deep Learning-Based Adaptation Control for Linear Acoustic Echo Cancellation

      1. Applied computing

        1. Arts and humanities

          1. Sound and music computing

      2. Computing methodologies

        1. Machine learning

          1. Machine learning approaches

            1. Neural networks

      3. Hardware

        1. Communication hardware, interfaces and storage

          1. Signal processing systems

      4. Information systems

        1. Information retrieval

          1. Specialized information retrieval

            1. Multimedia and multimodal retrieval

    Index terms have been assigned to the content through auto-classification.


            Published In

            IEEE/ACM Transactions on Audio, Speech and Language Processing Volume 32, Issue

            2024

            2883 pages

            ISSN:2329-9290

            EISSN:2329-9304

            2329-9290 © 2023 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See https://www.ieee.org/publications/rights/index.html for more information.

            Publisher

            IEEE Press
