
Computerized intelligent assistant for conferences

Patent number
US10867610B2
Publication date
2020-12-15
Applicant
Microsoft Technology Licensing, LLC (Redmond, WA, US)
Inventors
Adi Diamant; Karen Master Ben-Dor; Eyal Krupka; Raz Halaly; Yoni Smolin; Ilya Gurvich; Aviv Hurvitz; Lijuan Qin; Wei Xiong; Shixiong Zhang; Lingfeng Wu; Xiong Xiao; Ido Leichter; Moshe David; Xuedong Huang; Amit Kumar Agarwal
IPC classification
H04N7/14; G10L15/26; H04N7/15; G06K9/00; G10L17/00
Technical field
conference, transcript, assistant, speech, machine, remote
Region: Redmond, WA

Abstract

A method for facilitating a remote conference includes receiving a digital video and a first computer-readable audio signal. A face recognition machine is operated to recognize a face of a first conference participant in the digital video, and a speech recognition machine is operated to translate the first computer-readable audio signal into a first text. An attribution machine attributes the first text to the first conference participant. A second computer-readable audio signal is processed similarly, to obtain a second text attributed to a second conference participant. A transcription machine automatically creates a transcript including the first text attributed to the first conference participant and the second text attributed to the second conference participant.
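
The final step of the abstract, combining attributed texts into one transcript, can be illustrated with a minimal sketch. The `Utterance` record, the `build_transcript` helper, the timestamp field, and the sample sentences are all hypothetical names and data introduced here for illustration; the source does not specify a data layout, and the recognition and attribution steps are assumed to have already run.

```python
from dataclasses import dataclass


@dataclass
class Utterance:
    """One recognized piece of speech after attribution (hypothetical layout)."""
    start_s: float  # assumed field: seconds since the conference started
    speaker: str    # participant chosen by the attribution step
    text: str       # text produced by the speech recognition step


def build_transcript(utterances):
    """Sort attributed utterances by start time and render one line each."""
    ordered = sorted(utterances, key=lambda u: u.start_s)
    return "\n".join(
        f"[{u.start_s:06.1f}s] {u.speaker}: {u.text}" for u in ordered
    )


# Made-up sample data: two texts attributed to two participants.
utterances = [
    Utterance(12.4, "Participant B", "I can take the second item."),
    Utterance(3.0, "Participant A", "Let's start with the agenda."),
]
transcript = build_transcript(utterances)
print(transcript)
```

Sorting by a start time is only one possible ordering choice; a real system might also merge adjacent utterances from the same speaker.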

Description

Returning briefly to FIG. 1B, computerized conference assistant 106 includes a sound source localization (SSL) machine 120 that is configured to estimate the location(s) of sound(s) based on signals 112. FIG. 2 schematically shows SSL machine 120 analyzing signals 112a-g to output an estimated origination 140 of the sound modeled by signals 112a-g. As introduced above, signals 112a-g are respectively generated by microphones 108a-g. Each microphone has a different physical position and/or is aimed in a different direction. Microphones that are farther from a sound source and/or aimed away from a sound source will generate a relatively lower-amplitude and/or slightly phase-delayed signal 112 relative to microphones that are closer to and/or aimed toward the sound source. As an example, while microphones 108a and 108d may respectively produce signals 112a and 112d in response to the same sound, signal 112a may have a measurably greater amplitude if the recorded sound originated in front of microphone 108a. Similarly, signal 112d may lag signal 112a in phase due to the longer time of flight (ToF) of the sound to microphone 108d. SSL machine 120 may use the amplitude, phase difference, and/or other parameters of the signals 112a-g to estimate the origination 140 of a sound. SSL machine 120 may be configured to implement any suitable two- or three-dimensional localization algorithms, including but not limited to previously trained artificial neural networks, maximum likelihood algorithms, multiple signal classification algorithms, and cross-power spectrum phase analysis algorithms. Depending on the algorithm(s) used in a particular application, the SSL machine 120 may output an angle, vector, coordinate, and/or other parameter estimating the origination 140 of a sound.
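
To make the time-of-flight idea concrete, the following is a minimal sketch of estimating an arrival angle from the delay between two microphone signals. It uses plain time-domain cross-correlation, a simpler relative of the cross-power spectrum phase analysis named above, and is not presented as the method of SSL machine 120. The function names, the 0.1 m microphone spacing, the 16 kHz sample rate, the far-field assumption, and the synthetic signals are all assumptions made for illustration.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at roughly 20 degrees C


def estimate_delay_samples(ref, other, max_lag):
    """Return the lag (in samples) at which `other` best matches `ref`.

    A positive result means `other` lags behind `ref`.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for n, r in enumerate(ref):
            m = n + lag
            if 0 <= m < len(other):
                score += r * other[m]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def arrival_angle_deg(delay_samples, sample_rate_hz, mic_spacing_m):
    """Far-field angle from broadside of a two-microphone pair, in degrees."""
    tau = delay_samples / sample_rate_hz
    # Clamp so noisy delays cannot push asin outside its domain.
    ratio = max(-1.0, min(1.0, SPEED_OF_SOUND_M_S * tau / mic_spacing_m))
    return math.degrees(math.asin(ratio))


# Synthetic input: a decaying tone burst that reaches the second
# microphone 3 samples after the first.
sample_rate_hz = 16_000
burst = [
    math.sin(2 * math.pi * 440 * n / sample_rate_hz) * math.exp(-n / 40)
    for n in range(200)
]
sig_a = burst + [0.0] * 10
sig_d = [0.0] * 3 + burst + [0.0] * 7

delay = estimate_delay_samples(sig_a, sig_d, max_lag=8)  # 3 for this input
angle = arrival_angle_deg(delay, sample_rate_hz, mic_spacing_m=0.1)
print(delay, round(angle, 1))
```

A single microphone pair only resolves one angle, with a front/back ambiguity; combining several pairs, as an array of seven microphones would allow, is one way to obtain a two- or three-dimensional estimate.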

Claims

1