Computerized intelligent assistant for conferences

Patent number
US10867610B2
Publication date
2020-12-15
Applicant
Microsoft Technology Licensing, LLC (US WA Redmond)
Inventors
Adi Diamant; Karen Master Ben-Dor; Eyal Krupka; Raz Halaly; Yoni Smolin; Ilya Gurvich; Aviv Hurvitz; Lijuan Qin; Wei Xiong; Shixiong Zhang; Lingfeng Wu; Xiong Xiao; Ido Leichter; Moshe David; Xuedong Huang; Amit Kumar Agarwal
IPC classification
H04N7/14; G10L15/26; H04N7/15; G06K9/00; G10L17/00
Technical field
conference,transcript,assistant,or,may,in,speech,machine,e.g,remote
Region: Redmond, WA

Abstract

A method for facilitating a remote conference includes receiving a digital video and a computer-readable audio signal. A face recognition machine is operated to recognize a face of a first conference participant in the digital video, and a speech recognition machine is operated to translate the computer-readable audio signal into a first text. An attribution machine attributes the text to the first conference participant. A second computer-readable audio signal is processed similarly, to obtain a second text attributed to a second conference participant. A transcription machine automatically creates a transcript including the first text attributed to the first conference participant and the second text attributed to the second conference participant.
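The final step of the method, assembling attributed text into a transcript, can be illustrated with a minimal sketch. It assumes the face recognition, speech recognition, and attribution machines have already run and produced (speaker, text) pairs. The `Utterance` type, its `start` timestamp field, and the `build_transcript` function are hypothetical names introduced here for illustration; they do not come from the patent.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Utterance:
    """One recognized piece of speech, already attributed to a participant."""
    start: float   # assumed: seconds from the start of the conference
    speaker: str   # participant chosen by the attribution step
    text: str      # text produced by the speech recognition step


def build_transcript(utterances: List[Utterance]) -> str:
    """Order attributed utterances by start time, one line per utterance."""
    ordered = sorted(utterances, key=lambda u: u.start)
    return "\n".join(f"{u.speaker}: {u.text}" for u in ordered)


if __name__ == "__main__":
    print(build_transcript([
        Utterance(5.0, "Second participant", "Sounds good."),
        Utterance(1.0, "First participant", "Shall we begin?"),
    ]))
```

Sorting by a timestamp is one plausible way to interleave the first and second participants' text; the abstract itself only requires that the transcript include both attributed texts.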

Description

Non-limiting examples of training procedures for speech recognition machine 130 include supervised training, zero-shot, few-shot, unsupervised learning methods, reinforcement learning and/or generative adversarial neural network training methods. In some examples, a plurality of components of speech recognition machine 130 may be trained simultaneously with regard to an objective function measuring performance of collective functioning of the plurality of components in order to improve such collective functioning. In some examples, one or more components of speech recognition machine 130 may be trained independently of other components. In an example, speech recognition machine 130 may be trained via supervised training on labelled training data comprising speech audio annotated to indicate actual lexical data (e.g., words, phrases, and/or any other language data in textual form) corresponding to the speech audio, with regard to an objective function measuring an accuracy, precision, and/or recall of correctly recognizing lexical data corresponding to speech audio.
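As a rough illustration of an objective that measures precision and recall of recognized lexical data, the sketch below compares a recognized sentence against its labelled reference at the word level. It is a simplified bag-of-words measure that ignores word order, chosen here for brevity; it is not the objective function described in the patent, and `word_precision_recall` is a hypothetical name.

```python
from collections import Counter
from typing import Tuple


def word_precision_recall(reference: str, hypothesis: str) -> Tuple[float, float]:
    """Return (precision, recall) of hypothesis words against reference words.

    Words are lowercased and split on whitespace; matches are counted as a
    multiset intersection, so repeated words are only credited as often as
    they appear in both strings.
    """
    ref = Counter(reference.lower().split())
    hyp = Counter(hypothesis.lower().split())
    overlap = sum((ref & hyp).values())
    precision = overlap / sum(hyp.values()) if hyp else 0.0
    recall = overlap / sum(ref.values()) if ref else 0.0
    return precision, recall


if __name__ == "__main__":
    p, r = word_precision_recall(
        "schedule the meeting for monday",   # labelled annotation
        "schedule a meeting monday",         # recognizer output
    )
    print(f"precision={p:.2f} recall={r:.2f}")
```

In supervised training, a score of this kind would be computed over the labelled training data and the recognizer's parameters adjusted to improve it; production systems more commonly use an order-aware measure such as word error rate.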

Claims

1