
Video conferencing system

Patent number
US11528451B2
Publication date
2022-12-13
Applicant
Eyecon AS (Stavanger, NO)
Inventors
Jan Ove Haaland; Eivind Nag; Joar Vaage
IPC classification
H04N7/15; G06T7/00; G06T17/20
Technical field
video, camera, data, sensor, image, virtual camera
Region: Stavanger

Abstract

A method of capturing data for use in a video conference includes capturing data of a first party at a first location using an array of one or more video cameras and/or one or more sensors. The three-dimensional position(s) of one or more features represented in the data captured by the video camera(s) and/or sensor(s) are determined. A virtual camera positioned at a three-dimensional virtual camera position is defined. The three-dimensional position(s) determined for the feature(s) are transformed into a common coordinate system to form a single view of the feature(s) as appearing to have been captured from the virtual camera. The video image and/or sensor data of the feature(s) viewed from the perspective of the virtual camera and/or data representative of the transformed three-dimensional position(s) of the feature(s) are then transmitted or stored.
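
The central operation described above is re-expressing features captured by the real cameras as if they had been captured from the virtual camera. As a rough illustration only (the function name, the pinhole-style model with a single `focal` parameter, and the planar-array simplification are assumptions, not taken from the patent), the Python sketch below shifts each feature's image-plane position by an amount proportional to the offset between the real and virtual camera and inversely proportional to the feature's depth, which is the parallax relationship the claims describe as an xy translation inversely proportional to depth.

```python
import numpy as np

def reproject_to_virtual_camera(features_xy, depths, cam_pos, virt_pos, focal=1.0):
    """Shift image-plane feature positions from one real camera so they appear
    as seen from the virtual camera (cameras assumed to lie in a common plane,
    looking perpendicular to it).

    features_xy : (N, 2) image-plane positions of the features.
    depths      : (N,) depth of each feature in front of the camera plane.
    cam_pos     : (2,) x/y position of the real camera within the plane.
    virt_pos    : (2,) x/y position of the virtual camera in the same plane.
    focal       : focal length in the same units as features_xy.
    """
    features_xy = np.asarray(features_xy, dtype=float)
    depths = np.asarray(depths, dtype=float)
    baseline = np.asarray(cam_pos, dtype=float) - np.asarray(virt_pos, dtype=float)

    # Parallax: the lateral (x, y) shift of a feature is proportional to the
    # offset between the two camera positions and inversely proportional to
    # the feature's depth, so near features move more than distant ones.
    shift = focal * baseline / depths[:, None]
    return features_xy + shift
```

For example, with a real camera 0.3 m to the left of the virtual camera position and a unit focal length, a feature at 2 m depth is shifted by 0.15 in normalised image coordinates, while a feature at 4 m is shifted by half that.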

Description

CROSS-REFERENCE TO RELATED APPLICATIONS

This application represents the U.S. National Phase of International Application No. PCT/EP2019/066091, entitled “Video Conferencing System”, filed 18 Jun. 2019, which claims the benefit of Great Britain Application No. 1809960.6, filed 18 Jun. 2018, each of which is incorporated herein by reference.

BACKGROUND OF THE INVENTION

This invention relates to a video conferencing system and a method of holding a video conferencing call, in particular to a video conferencing system that defines a virtual camera.

Video conferencing involves the exchange of audio and video data between multiple parties at different locations to facilitate audio and visual communication. While the inclusion of video data provides enhanced communication over a telephone call, video conferencing still does not provide the same experience as a face-to-face meeting.

One of the main problems is the lack of eye-to-eye contact between the participants involved in a video conference, which is an important part of human interaction. This is because, for each party, the images of the other participants in the video conference are shown on that party's screen, while the video camera capturing that party's image data lies outside the area of the screen.

SUMMARY OF THE INVENTION

It is an aim of the present invention to provide an improved video conferencing system.

When viewed from a first aspect the invention provides a method of capturing data for use in a video conference, the method comprising:
capturing data of a first party at a first location using an array of one or more video cameras and/or one or more sensors;
determining the three-dimensional position(s) of one or more features represented in the data captured by the video camera(s) and/or sensor(s);
defining a virtual camera positioned at a three-dimensional virtual camera position;
transforming the three-dimensional position(s) determined for the feature(s) into a common coordinate system to form a single view of the feature(s) as appearing to have been captured from the virtual camera; and
transmitting and/or storing the video image and/or sensor data of the feature(s) viewed from the perspective of the virtual camera and/or data representative of the transformed three-dimensional position(s) of the feature(s).
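
The "determining the three-dimensional position(s)" step above needs a depth estimate for each feature, and claim 3 below mentions triangulating feature positions from the camera and/or sensor data. As a hypothetical sketch under the simplest assumption of two cameras lying in the same plane with parallel optical axes (a rectified stereo pair), depth follows from the standard relation depth = focal × baseline / disparity. The function and variable names are illustrative, not from the patent.

```python
import numpy as np

def triangulate_depth(x_left, x_right, baseline, focal):
    """Estimate feature depth from a matched feature seen by two cameras that
    lie in the same plane and look perpendicular to it.

    x_left, x_right : horizontal image coordinates (pixels) of the same
                      feature in the left and right camera.
    baseline        : distance between the two camera centres (metres).
    focal           : focal length expressed in pixels.
    Returns the depth of the feature in metres.
    """
    disparity = np.asarray(x_left, dtype=float) - np.asarray(x_right, dtype=float)
    # A small disparity means the feature is far away; zero disparity would
    # place it at infinity, so guard against division by zero.
    disparity = np.where(np.abs(disparity) < 1e-6, np.nan, disparity)
    return focal * baseline / disparity
```

For instance, a matched feature at x_left = 640 px and x_right = 600 px with a 0.2 m baseline and a 1000 px focal length comes out at 1000 × 0.2 / 40 = 5 m.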

Claims

The invention claimed is:

1. A method of capturing data for use in a video conference, the method comprising:
capturing data of a party at a first location using an array of one or more video cameras and/or one or more sensors;
wherein the one or more video cameras and/or one or more sensors in the array are located in the same plane;
wherein the field of view of the one or more video cameras and/or one or more sensors in the array is directed outwards perpendicularly to the plane in which they are located; and
determining, for each of the one or more video cameras and/or each of the one or more sensors in the array, the three-dimensional position(s) of one or more features represented in the data captured by the video camera or sensor;
defining a virtual camera positioned at a three-dimensional virtual camera position;
transforming the three-dimensional position(s) determined for the feature(s) represented in the data into a common coordinate system to form a single view of the feature(s) as appearing to have been captured from the virtual camera using the video image data from the one or more video cameras and/or the data from the one or more sensors;
transmitting and/or storing the video image and/or sensor data of the feature(s) viewed from the perspective of the virtual camera and/or data representative of the transformed three-dimensional position(s) of the feature(s); and
wherein the method further comprises determining a depth component of the three-dimensional position(s) of the feature(s) and transforming the image data and/or the sensor data of the feature(s) into the common coordinate system using an xy translation inversely proportional to the determined depth of the feature(s).

2. The method as claimed in claim 1, further comprising selecting the feature(s) in the video image and/or sensor data having transformed three-dimensional position(s) in the common coordinate system that are within a particular range of three-dimensional positions.

3. The method as claimed in claim 1, wherein a depth component of the three-dimensional position(s) of the feature(s) is determined by triangulating the positions of the feature(s) using the video image data from the video camera(s) and/or the sensor data from the sensor(s).

4. The method as claimed in claim 1, wherein the method comprises calibrating the positions of the video camera(s) and/or sensor(s) in the array of video camera(s) and/or sensor(s).

5. The method as claimed in claim 1, wherein the method comprises identifying feature(s) in the video image data and/or the other sensor data captured by the array of video camera(s) and/or sensor(s).

6. The method as claimed in claim 5, wherein the step of identifying feature(s) in the video image data or other sensor data comprises identifying feature(s) in one or more blocks of the video image data and/or the other sensor data.

7. The method as claimed in claim 1, wherein the method comprises identifying participant(s) of the first party in the video image and/or sensor data captured by the array of video camera(s) and/or sensor(s).

8. The method as claimed in claim 7, wherein the virtual camera is positioned using the participant(s) of the first party identified in the captured video image and/or sensor data and/or the direction in which the participant(s) are looking or facing.

9. The method as claimed in claim 5, wherein the method comprises comparing one or more identified features or participants in the video image data and/or other sensor data from one of the video camera(s) and/or sensor(s) in the array with one or more identified features or participants in the video image data and/or other sensor data from other(s) of the video camera(s) and/or sensor(s) in the array, and matching the same or similar identified features or participants with each other.

10. The method as claimed in claim 9, wherein the method comprises matching the video image data and/or other sensor data from one or more pairs of video camera(s) and/or sensor(s) in the array.

11. The method as claimed in claim 9, wherein the method comprises forming a depth map, a 3D point cloud, a 3D mesh or a depth buffer for each pair of video camera(s) and/or sensor(s) in the array between which identified feature(s) have been matched and storing the determined three-dimensional position(s) of the identified and matched feature(s) in the depth map, 3D point cloud, 3D mesh or depth buffer.

12. The method as claimed in claim 9, wherein the method comprises using the video image data and/or sensor data from other(s) of the video camera(s) and/or sensor(s) in the array to refine the three-dimensional position(s) of the identified and matched feature(s).

13. The method as claimed in claim 1, the method further comprising defining a plurality of virtual cameras positioned at respective three-dimensional virtual camera positions.

14. The method as claimed in claim 1, the method further comprising filling a depth buffer with a transformed depth position of each of the features represented in the video image and/or sensor data.

15. The method as claimed in claim 1, wherein the single view of the feature(s) is formed such that the face(s) and/or eye(s) and/or body of the participant(s) in the video image and/or data are oriented perpendicularly to the direction to them from the virtual camera.

16. The method as claimed in claim 1, wherein the video image and/or sensor data from the array of video camera(s) and/or sensor(s) of the selected feature(s) are combined by forming a triangulated mesh, point cloud or depth buffer of the feature(s); and wherein the triangulated mesh, point cloud or depth buffer of the selected feature(s) is filled with image and/or sensor data of the selected feature(s) from the video camera(s) and/or sensor(s) in the array.

17. The method as claimed in claim 1, wherein the method comprises combining the video image data from the one or more video cameras and/or the data captured by the one or more sensors to form the single view of the feature(s) as appearing to have been captured from the virtual camera; and wherein the method comprises averaging the colour data from the one or more video cameras and/or the data captured by the one or more sensors to form the single view of the feature(s) as appearing to have been captured from the virtual camera.

18. A video conferencing system for capturing data for use in a video conference, the system comprising:
an array of one or more video cameras and/or one or more sensors for capturing data of a party at a first location;
wherein the one or more video cameras and/or one or more sensors in the array are located in the same plane;
wherein the field of view of the one or more video cameras and/or one or more sensors in the array is directed outwards perpendicularly to the plane in which they are located; and
processing circuitry configured to:
determine, for each of the one or more video cameras and/or each of the one or more sensors in the array, the three-dimensional position(s) of one or more features represented in the data captured by the video camera or sensor;
define a virtual camera positioned at a three-dimensional virtual camera position;
transform the three-dimensional position(s) determined for the feature(s) represented in the data into a common coordinate system to form a single view of the feature(s) as appearing to have been captured from the virtual camera using the video image data from the one or more video cameras and/or the data from the one or more sensors; and
transmit and/or store the video image and/or sensor data of the feature(s) as viewed from the perspective of the virtual camera(s) and/or data representative of the transformed three-dimensional position(s) of the feature(s); and
wherein the processing circuitry is further configured to determine a depth component of the three-dimensional position(s) of the feature(s) and transform the image data and/or sensor data of the feature(s) into the common coordinate system using an xy translation inversely proportional to the determined depth of the feature(s).

19. A non-transitory computer readable storage medium storing computer software code which when executing on a data processing system performs a method of capturing data for use in a video conference, the method comprising:
determining, for each of one or more video cameras and/or one or more sensors in an array, the three-dimensional position(s) of one or more features represented in data of a party at a first location captured by the video camera or sensor;
wherein the one or more video cameras and/or one or more sensors in the array are located in the same plane;
wherein the field of view of the one or more video cameras and/or one or more sensors in the array is directed outwards perpendicularly to the plane in which they are located; and
defining a virtual camera positioned at a three-dimensional virtual camera position;
transforming the three-dimensional position(s) determined for the feature(s) represented in the data into a common coordinate system to form a single view of the feature(s) as appearing to have been captured from the virtual camera using the video image data from the one or more video cameras and/or the data from the one or more sensors;
transmitting and/or storing the video image and/or sensor data of the feature(s) viewed from the perspective of the virtual camera and/or data representative of the transformed three-dimensional position(s) of the feature(s); and
wherein the method further comprises determining a depth component of the three-dimensional position(s) of the feature(s) and transforming the image data and/or the sensor data of the feature(s) into the common coordinate system using an xy translation inversely proportional to the determined depth of the feature(s).
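
Claims 14, 16 and 17 combine a depth buffer filled with the transformed depth of each feature with colour data that is averaged across cameras when forming the single virtual-camera view. The sketch below is a loose illustration of how those two ideas can interact (the point-splatting approach and every name in it are assumptions, not the patented implementation): nearer surfaces occlude farther ones in the depth buffer, and colour contributions from different cameras that land on the same, equally deep surface are averaged.

```python
import numpy as np

def compose_virtual_view(per_camera_points, per_camera_colours,
                         width, height, tol=0.01):
    """Splat features (already expressed in the virtual camera's image
    coordinates) from several cameras into one image and its depth buffer.

    per_camera_points  : list of (N_i, 3) arrays of (x, y, depth) per camera.
    per_camera_colours : list of (N_i, 3) RGB arrays aligned with the points.
    Returns a (height, width, 3) image and the matching depth buffer.
    """
    image = np.zeros((height, width, 3))
    counts = np.zeros((height, width))
    depth_buffer = np.full((height, width), np.inf)

    for points, colours in zip(per_camera_points, per_camera_colours):
        for (x, y, z), colour in zip(points, colours):
            u, v = int(round(x)), int(round(y))
            if not (0 <= u < width and 0 <= v < height):
                continue
            if z < depth_buffer[v, u] - tol:
                # Strictly nearer surface: it occludes whatever was stored.
                depth_buffer[v, u] = z
                image[v, u] = colour
                counts[v, u] = 1
            elif z <= depth_buffer[v, u] + tol:
                # Same surface seen by another camera: accumulate the colour
                # so contributions can be averaged afterwards.
                image[v, u] += colour
                counts[v, u] += 1

    seen = counts > 0
    image[seen] /= counts[seen, None]
    return image, depth_buffer
```

A fuller implementation might rasterise a triangulated mesh or point cloud rather than splatting individual points, but the depth-buffer and colour-averaging behaviour sketched here is the same.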