What is claimed is:

1. A method for spatial audio processing comprising:
receiving, with at least one wearable sensor, sensor data corresponding to a direction of a user's head within an acoustic environment;
determining, with at least one processor, at least one source location within the acoustic environment based at least in part on the sensor data;
receiving, with an audio processor, an audio input comprising audio signals captured within the acoustic environment, wherein the audio input comprises at least one target audio signal emanating from the at least one source location;
converting, with the audio processor, the audio input from a time domain to a frequency domain according to at least one transform function;
determining, with the audio processor, at least one acoustic propagation model for the at least one source location, wherein determining the at least one acoustic propagation model comprises calculating one or more spatial and temporal properties for a sound field of the audio input;
processing, with the audio processor, the audio input according to the at least one acoustic propagation model to spatially filter the at least one target audio signal from one or more non-target audio signals in the audio input, wherein processing the audio input according to the at least one acoustic propagation model comprises refocusing the sound field of the audio input to extract the at least one target audio signal emanating from the at least one source location; and
applying, with the audio processor, a whitening filter to a spatially filtered target audio signal to derive at least one separated audio output signal, wherein applying the whitening filter comprises suppressing the one or more non-target audio signals in the audio input according to the at least one acoustic propagation model, wherein the one or more non-target audio signals comprise one or more audio signals emanating from a location in the acoustic environment other than the at least one source location.

2. The method of claim 1, wherein the at least one transform function is selected from the group consisting of Fourier transform, fast Fourier transform, short-time Fourier transform, and modulated complex lapped transform.

3. The method of claim 1, wherein the audio input comprises a training audio input.

4. The method of claim 1, wherein the acoustic environment comprises a waveguide location.

5. The method of claim 1, further comprising rendering, with the audio processor, an audio file comprising the at least one separated audio output signal.

6. The method of claim 4, further comprising rendering, with at least one loudspeaker, an audio output comprising the at least one separated audio output signal.

7. The method of claim 6, wherein the at least one loudspeaker is incorporated within a loudspeaker array.

8. The method of claim 7, wherein the loudspeaker array corresponds to the waveguide location.

9. The method of claim 1, wherein the audio input comprises two or more channels of audio input data.

10. The method of claim 9, wherein each channel in the two or more channels of audio input data corresponds to a transducer located in the acoustic environment.

11. The method of claim 1, further comprising determining, with the audio processor, the at least one source location according to at least one training audio input.
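Outside the claim language itself, the processing chain recited in claim 1 (time-to-frequency conversion, a free-field acoustic propagation model for a known source direction, spatial filtering that refocuses the sound field on that direction, and a whitening step) can be sketched as follows. This is a minimal illustrative sketch, not the claimed implementation: the function name, parameters, the delay-and-sum steering, and the magnitude-flattening "whitening" are all assumptions chosen for brevity.

```python
import numpy as np

def separate_source(x, mic_pos, src_dir, fs=16000, c=343.0, nfft=512):
    """Illustrative sketch of the claimed chain (names/parameters hypothetical).

    x:       (n_mics, n_samples) time-domain capture
    mic_pos: (n_mics, 3) microphone positions in meters
    src_dir: unit vector toward the determined source location
    """
    # Convert the audio input from the time domain to the frequency domain.
    X = np.fft.rfft(x, n=nfft, axis=1)
    freqs = np.fft.rfftfreq(nfft, d=1.0 / fs)

    # Free-field propagation model: per-microphone delays toward src_dir.
    delays = mic_pos @ src_dir / c

    # Spatial filter: phase-align (refocus) each channel on the source
    # direction, then average -- a simple delay-and-sum beamformer.
    steer = np.exp(2j * np.pi * np.outer(delays, freqs))
    Y = (steer * X).mean(axis=0)

    # Crude whitening: flatten the magnitude spectrum, keep the phase.
    Y_white = Y / (np.abs(Y) + 1e-12)

    return np.fft.irfft(Y, n=nfft), np.fft.irfft(Y_white, n=nfft)
```

For a source broadside to the array the modeled delays are zero, so the beamformer reduces to a plain channel average and the target signal passes through unchanged, while signals from off-axis locations are attenuated by phase misalignment.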
12. A spatial audio processing system, comprising:
at least one wearable sensor configured to receive at least one sensor input corresponding to a movement and direction of a user's head;
a processing device comprising an audio processing module configured to receive an audio input comprising acoustic audio signals captured within an acoustic environment; and
at least one non-transitory computer-readable medium communicably engaged with the processing device and having instructions stored thereon that, when executed, cause the processing device to perform one or more audio processing operations, the one or more audio processing operations comprising:
receiving sensor data corresponding to the direction of the user's head within the acoustic environment;
determining at least one source location within the acoustic environment based at least in part on the sensor data;
receiving the audio input comprising the acoustic audio signals captured within the acoustic environment, wherein the audio input comprises at least one target audio signal emanating from the at least one source location;
converting the audio input from a time domain to a frequency domain according to at least one transform function;
determining at least one acoustic propagation model for the at least one source location within the acoustic environment, wherein determining the at least one acoustic propagation model comprises calculating one or more spatial and temporal properties for a sound field of the audio input;
processing the audio input according to the at least one acoustic propagation model to spatially filter the at least one target audio signal from one or more non-target audio signals in the audio input, wherein processing the audio input according to the at least one acoustic propagation model comprises refocusing the sound field of the audio input to extract the at least one target audio signal emanating from the at least one source location; and
applying a whitening filter to a spatially filtered target audio signal to derive at least one separated audio output signal, wherein applying the whitening filter comprises suppressing the one or more non-target audio signals in the audio input according to the at least one acoustic propagation model, wherein the one or more non-target audio signals comprise one or more audio signals emanating from a location in the acoustic environment other than the at least one source location.

13. The system of claim 12, wherein the at least one transform function is selected from the group consisting of Fourier transform, fast Fourier transform, short-time Fourier transform, and modulated complex lapped transform.

14. The system of claim 12, further comprising two or more transducers communicably engaged with the processing device.

15. The system of claim 14, wherein each transducer in the two or more transducers comprises a separate audio input or output channel.

16. The system of claim 12, wherein the one or more audio processing operations further comprise rendering an audio file comprising the at least one separated audio output signal.
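The system claims recite per-transducer input channels and a time-to-frequency conversion step; a short-time Fourier transform (one member of the recited Markush group) applied independently to each channel is one natural realization. The sketch below is illustrative only; the function name, frame length, hop size, and Hann window are assumptions, not limitations drawn from the claims.

```python
import numpy as np

def stft_channels(x, frame=256, hop=128):
    """Short-time Fourier transform of each input channel.

    x: (n_channels, n_samples), one row per transducer channel.
    Returns (n_channels, n_frames, frame // 2 + 1) complex spectra.
    """
    win = np.hanning(frame)  # analysis window (hypothetical choice)
    n_frames = 1 + (x.shape[1] - frame) // hop
    out = np.empty((x.shape[0], n_frames, frame // 2 + 1), dtype=complex)
    for ch in range(x.shape[0]):
        for i in range(n_frames):
            seg = x[ch, i * hop : i * hop + frame] * win
            out[ch, i] = np.fft.rfft(seg)  # time -> frequency for this frame
    return out
```

Each channel keeps its own time-frequency representation, so the downstream spatial filter can compare phase across transducers frame by frame.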
17. A method for spatial audio processing comprising:
receiving, with at least one camera, a live video feed of an acoustic environment;
displaying, on at least one display device, the live video feed of the acoustic environment;
selecting, with at least one input device, an audio source within the live video feed;
determining, with at least one processor, at least one source location within the acoustic environment based at least in part on the selected audio source within the live video feed;
receiving, with an audio processor, an audio input comprising audio signals captured within the acoustic environment, wherein the audio input comprises at least one target audio signal emanating from the at least one source location;
converting, with the audio processor, the audio input from a time domain to a frequency domain according to at least one transform function;
determining, with the audio processor, at least one acoustic propagation model for the at least one source location, wherein determining the at least one acoustic propagation model comprises calculating one or more spatial and temporal properties for a sound field of the audio input;
processing, with the audio processor, the audio input according to the at least one acoustic propagation model to spatially filter the at least one target audio signal from one or more non-target audio signals in the audio input, wherein processing the audio input according to the at least one acoustic propagation model comprises refocusing the sound field of the audio input to extract the at least one target audio signal emanating from the at least one source location; and
applying, with the audio processor, a whitening filter to a spatially filtered target audio signal to derive at least one separated audio output signal, wherein applying the whitening filter comprises suppressing the one or more non-target audio signals in the audio input according to the at least one acoustic propagation model, wherein the one or more non-target audio signals comprise one or more audio signals emanating from a location in the acoustic environment other than the at least one source location.

18. The method of claim 17, wherein the at least one transform function is selected from the group consisting of Fourier transform, fast Fourier transform, short-time Fourier transform, and modulated complex lapped transform.

19. The method of claim 17, further comprising rendering, with the audio processor, an audio file comprising the at least one separated audio output signal.

20. The method of claim 17, wherein the audio input comprises two or more channels of audio input data.
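Claim 17 determines the source location from a pixel the user selects in the live video feed. One conventional way to turn a selected pixel into a direction usable by the spatial filter is a pinhole camera model; the sketch below is an illustrative assumption (function name, field-of-view parameter, and intrinsics are hypothetical, and the claims do not require this particular model).

```python
import numpy as np

def pixel_to_direction(u, v, width, height, fov_deg=60.0):
    """Map a selected pixel (u, v) to a unit direction vector in the
    camera frame via a pinhole model (hypothetical intrinsics)."""
    # Focal length in pixels from the assumed horizontal field of view.
    f = (width / 2) / np.tan(np.radians(fov_deg) / 2)
    # Ray through the pixel, relative to the image center; +z is forward.
    d = np.array([u - width / 2, v - height / 2, f], dtype=float)
    return d / np.linalg.norm(d)
```

The resulting unit vector can seed the acoustic propagation model as the steering direction, closing the loop between the video selection and the spatial filtering recited in the claim.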