
Computerized intelligent assistant for conferences

Patent number
US10867610B2
Publication date
2020-12-15
Applicant
Microsoft Technology Licensing, LLC (Redmond, WA, US)
Inventors
Adi Diamant; Karen Master Ben-Dor; Eyal Krupka; Raz Halaly; Yoni Smolin; Ilya Gurvich; Aviv Hurvitz; Lijuan Qin; Wei Xiong; Shixiong Zhang; Lingfeng Wu; Xiong Xiao; Ido Leichter; Moshe David; Xuedong Huang; Amit Kumar Agarwal
IPC classification
H04N7/14; G10L15/26; H04N7/15; G06K9/00; G10L17/00
Technical field
conference,transcript,assistant,or,may,in,speech,machine,e.g,remote
Region: Redmond, WA

Abstract

A method for facilitating a remote conference includes receiving a digital video and a computer-readable audio signal. A face recognition machine is operated to recognize a face of a first conference participant in the digital video, and a speech recognition machine is operated to translate the computer-readable audio signal into a first text. An attribution machine attributes the text to the first conference participant. A second computer-readable audio signal is processed similarly, to obtain a second text attributed to a second conference participant. A transcription machine automatically creates a transcript including the first text attributed to the first conference participant and the second text attributed to the second conference participant.
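The pipeline in the abstract (recognize faces, recognize speech, attribute text, assemble a transcript) can be illustrated with a minimal Python sketch. The source does not say how the attribution machine decides which recognized participant spoke, so the rule used here is purely an assumption for illustration: each recognized face and each speech segment carries a direction angle, and a segment is attributed to the recognized face nearest in angle. The names `Face`, `Segment`, `attribute`, and `build_transcript` are hypothetical, and the face and speech recognition steps are assumed to have already produced their outputs.

```python
from dataclasses import dataclass


@dataclass
class Face:
    """Hypothetical output of a face recognition step."""
    participant: str
    angle_deg: float  # assumed direction of the face relative to the device


@dataclass
class Segment:
    """Hypothetical output of a speech recognition step."""
    text: str
    start_s: float    # time at which the segment begins, in seconds
    angle_deg: float  # assumed direction the sound arrived from


def angular_distance(a: float, b: float) -> float:
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def attribute(segment: Segment, faces: list[Face]) -> str:
    """Attribute a segment to the recognized face nearest in angle (assumed rule)."""
    if not faces:
        return "Unknown"
    nearest = min(faces, key=lambda f: angular_distance(f.angle_deg, segment.angle_deg))
    return nearest.participant


def build_transcript(segments: list[Segment], faces: list[Face]) -> list[str]:
    """Create a time-ordered transcript with each text attributed to a participant."""
    lines = []
    for seg in sorted(segments, key=lambda s: s.start_s):
        lines.append(f"[{seg.start_s:.1f}s] {attribute(seg, faces)}: {seg.text}")
    return lines


faces = [Face("Alice", 30.0), Face("Bob", 120.0)]
segments = [Segment("Agreed.", 2.5, 118.0), Segment("Let's start.", 0.0, 28.0)]
for line in build_transcript(segments, faces):
    print(line)
```

Any other attribution signal could be substituted for the angle comparison without changing the overall shape of the transcript-building step.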

Description

CROSS REFERENCE TO RELATED APPLICATIONS

This application claims priority to U.S. Provisional Patent Application Ser. No. 62/667,368, filed May 4, 2018, the entirety of which is hereby incorporated herein by reference for all purposes.

BACKGROUND

Individuals and organizations frequently arrange conferences in which a plurality of local and/or remote users participate to share information and to plan and report on tasks and commitments. Such conferences may include sharing information across multiple different modalities, e.g., including spoken and textual conversation, shared visual images, shared digital files, gestures, and non-verbal cues.

SUMMARY

This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter. Furthermore, the claimed subject matter is not limited to implementations that solve any or all disadvantages noted in any part of this disclosure.

Claims

The invention claimed is:

1. A method for facilitating a remote conference, comprising:
receiving a digital video from a remote computing device;
receiving a first computer-readable audio signal from the remote computing device;
operating a face identification machine to recognize a first face of a first remote conference participant and a second face of a second remote conference participant in the digital video;
operating a speech recognition machine to translate the first computer-readable audio signal to a first text;
operating an attribution machine configured to 1) attribute first portions of the first text to the first remote conference participant based on the first remote conference participant being recognized by the face identification machine, and 2) attribute second portions of the first text to the second remote conference participant based on the second remote conference participant being recognized by the face identification machine; and
operating a transcription machine configured to automatically create a transcript of the conference, the transcript including 1) the first portions of the first text attributed to the first remote conference participant, and 2) the second portions of the first text attributed to the second remote conference participant.

2. The method of claim 1, wherein the transcript further includes an arrival time indicating a time of arrival of the first remote conference participant and a departure time indicating a time of departure of the first remote conference participant.

3. The method of claim 2, wherein the arrival time is determined based on a time of recognition of the first remote conference participant by the face identification machine.

4. The method of claim 1, wherein the transcription machine is configured to:
recognize content of interest for the first remote conference participant;
automatically recognize the content of interest in the transcript; and
include within the transcript an indication of a portion of the transcript related to the content of interest.

5. The method of claim 4, wherein the transcription machine is configured, responsive to recognizing the content of interest in the transcript, to send a notification to a companion device of the first remote conference participant including the indication of the portion of the transcript related to the content of interest.

6. The method of claim 1, wherein the transcription machine is further configured to receive, from a companion device of the first remote conference participant, an indication of a digital file to be shared with the second remote conference participant, wherein the transcript further includes an indication that the digital file was shared.

7. The method of claim 6, wherein the transcription machine is further configured to recognize a portion of the digital file being accessed by one or more of the first remote conference participant and the second remote conference participant, and wherein the transcript further includes an indication of the portion of the digital file that was accessed and a time at which the portion of the digital file was accessed.

8. The method of claim 1, wherein the transcription machine is further configured to recognize, in the digital video, visual information being shared by the first remote conference participant, and wherein the transcript further includes a digital image representing the visual information.

9. The method of claim 8, wherein the transcription machine is further configured to recognize a change to the visual information, and the transcript further includes a difference image showing the change to the visual information and an indication of a time at which the visual information was changed.

10. The method of claim 9, wherein the transcription machine is further configured to recognize an occlusion of the visual information and to process one or more difference images to create a processed image showing the visual information with the occlusion removed; and wherein the transcript further includes the processed image.

11. The method of claim 10, further comprising visually presenting a reviewable transcript at a companion device of a remote conference participant, wherein the reviewable transcript includes the difference image showing the change to the visual information and wherein the reviewable transcript is configured, responsive to selection of the difference image, to navigate to a portion of the transcript corresponding to the time at which the visual information was changed.

12. The method of claim 1, wherein the transcription machine is configured to transcribe speech of a first conference participant in real time, the method further comprising presenting a notification at a companion device of a second conference participant that the first conference participant is currently speaking and including transcribed speech of the first conference participant.

13. The method of claim 1, wherein the transcription machine is further configured to analyze the transcript to detect words having a predefined sentiment, the method further comprising presenting a sentiment analysis summary at a companion device of a conference participant, the sentiment analysis summary indicating a frequency of utterance of words having the predefined sentiment.

14. The method of claim 1, further comprising a gesture recognition machine configured to recognize a gesture by the first remote conference participant indicating an event of interest, and wherein the transcription machine is configured to include an indication that the event of interest occurred responsive to detection of the gesture by the gesture recognition machine.

15. A method for facilitating participation in a conference by a client device, comprising:
receiving a digital video captured by a camera;
receiving a computer-readable audio signal captured by a microphone;
operating a face identification machine to recognize a face of a local conference participant in the digital video;
operating a speech recognition machine to translate the computer-readable audio signal to text;
operating an attribution machine to attribute a portion of the text to the local conference participant based on the local conference participant being recognized by the face identification machine;
sending, to a conference server device, the portion of the text attributed to the local conference participant;
receiving, from the conference server device, a running transcript of the conference including the text attributed to the local conference participant, and further including different text attributed to a remote conference participant; and
displaying, in real time, new text added to the running transcript and attribution for the new text.

16. A computerized conference assistant, comprising:
a camera configured to convert light of one or more electromagnetic bands into digital video;
a face identification machine configured to 1) recognize a first face of a first local conference participant in the digital video, and 2) recognize a second face of a second local conference participant in the digital video;
a microphone array configured to convert sound into a computer-readable audio signal;
a speech recognition machine configured to translate the computer-readable audio signal to text;
an attribution machine configured to 1) attribute a first portion of the text to the first local conference participant based on the first local conference participant being recognized by the face identification machine, and 2) attribute a second portion of the text to the second local conference participant based on the second local conference participant being recognized by the face identification machine; and
a transcription machine configured to automatically create a transcript of the conference, the transcript including 1) the first portion of the text attributed to the first local conference participant, and 2) the second portion of the text attributed to the second local conference participant.

17. The computerized conference assistant of claim 16, further comprising a communication subsystem configured to receive a second text attributed to a remote conference participant, wherein the transcription machine is configured to add, to the transcript, the second text attributed to the remote conference participant.

18. The computerized conference assistant of claim 16, wherein the transcription machine is further configured to recognize, in the digital video, visual information being shared by a local conference participant, and wherein the transcript further includes a digital image representing the visual information.

19. The computerized conference assistant of claim 16, further comprising a gesture recognition machine configured to recognize a hand gesture by a local conference participant requesting that recording be stopped, wherein the transcription machine is configured to stop creating the transcript responsive to recognition of the hand gesture by the gesture recognition machine.

20. The method of claim 15, wherein:
the face identification machine is further configured to recognize, for each of a plurality of local conference participants in the digital video, the face of the local conference participant;
the attribution machine is further configured, for each local conference participant of the plurality of local conference participants, to attribute one or more portions of the text to the local conference participant; and
the transcript includes, for each local conference participant of the plurality of local conference participants, the one or more portions of the text attributed to the local conference participant.
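
As an illustration of the sentiment analysis summary recited in claim 13, the following Python sketch counts how often words from a predefined list are uttered in a transcript. It is a sketch under stated assumptions, not the claimed implementation: the claims do not define how sentiment words are chosen or matched, so the word list, the simple lowercase word matching, and the function name `sentiment_summary` are all hypothetical.

```python
import re
from collections import Counter


def sentiment_summary(transcript: list[tuple[str, str]], sentiment_words: set[str]) -> Counter:
    """Count utterances of each predefined sentiment word across the transcript.

    `transcript` is a list of (participant, text) pairs; `sentiment_words` is an
    assumed, caller-supplied set of lowercase words carrying the sentiment.
    """
    counts: Counter = Counter()
    for _participant, text in transcript:
        # Naive tokenization: lowercase runs of letters and apostrophes.
        for word in re.findall(r"[a-z']+", text.lower()):
            if word in sentiment_words:
                counts[word] += 1
    return counts


transcript = [
    ("Alice", "Great work, this is great."),
    ("Bob", "I am worried about the delay."),
]
summary = sentiment_summary(transcript, {"great", "worried"})
print(dict(summary))
```

A production system would likely replace the fixed word list with a trained sentiment model; the fixed list is used here only to keep the example self-contained.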