
Reduced rendering of six-degree of freedom video

Patent No.
US11212506B2
Publication Date
2021-12-28
Applicant
Intel Corporation (Santa Clara, CA, US)
Inventor
Jill Boyce
IPC Classification
H04N13/178; H04N13/398; H04N13/366
Technical Field
graphics, pipeline, shader, video, processor, data, execution, texture
Region: Santa Clara, CA, US

Abstract

Embodiments described herein provide for techniques to reduce the complexity of rendering immersive 3D video content. One embodiment provides for an apparatus comprising one or more processors to receive a data set that represents a two-dimensional encoding of planar projections of a frame of a three-dimensional video, decode the two-dimensional encoding into texture data, geometry data, and metadata, determine, based on the metadata, a visibility status and an occupancy status for a sample position in the three-dimensional video, and render video data for the sample position when the sample position is visible and occupied.
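The decision logic summarized in the abstract, rendering only sample positions that are both visible and occupied, can be illustrated with a minimal sketch. Everything below (the DecodedFrame container and the helper names) is hypothetical scaffolding chosen for illustration, not code from the patent:

```python
# Minimal sketch of the rendering gate described in the abstract: a sample
# position is shaded only when its projection plane is visible from the
# viewer AND the occupancy data marks the position as occupied. All names
# here (DecodedFrame, is_plane_visible, render_sample) are illustrative
# assumptions, not taken from the patent.
from dataclasses import dataclass

@dataclass
class DecodedFrame:
    texture: dict        # (plane_id, u, v) -> (r, g, b) color sample
    geometry: dict       # (plane_id, u, v) -> depth value
    occupancy: dict      # (plane_id, u, v) -> True if point cloud data exists
    plane_normals: dict  # plane_id -> (nx, ny, nz) outward plane normal

def is_plane_visible(view_dir, plane_normal):
    """Treat a plane as visible when it faces the viewer, i.e. the view
    direction and the outward plane normal point against each other."""
    dot = sum(v * n for v, n in zip(view_dir, plane_normal))
    return dot < 0.0

def render_sample(frame: DecodedFrame, view_dir, plane_id, u, v):
    """Return a color only for visible, occupied sample positions; skip
    all other samples without reconstructing the point cloud."""
    if not is_plane_visible(view_dir, frame.plane_normals[plane_id]):
        return None  # entire projection plane is culled for this viewpoint
    if not frame.occupancy.get((plane_id, u, v), False):
        return None  # unoccupied sample: no corresponding point cloud data
    return frame.texture[(plane_id, u, v)]
```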

Description

BACKGROUND

Six degree of freedom (6DoF) video is an emerging immersive video use case that provides a viewer an immersive media experience in which the viewer controls the viewpoint of a scene. The simpler three degree of freedom (3DoF) video (e.g., 360-degree or panoramic video) allows a viewer to change orientation around the X, Y, and Z axes (described as yaw, pitch, and roll) from a fixed position. 6DoF video additionally enables the viewer to change position through translational movements along the X, Y, and Z axes.
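As a concrete illustration of the difference, a 3DoF pose carries orientation only, while a 6DoF pose adds a translational position. The field names in this sketch are assumptions for illustration only:

```python
# Illustrative only: a 3DoF pose carries orientation alone (yaw, pitch,
# roll), while a 6DoF pose adds a translational position (x, y, z).
from dataclasses import dataclass

@dataclass
class Pose3DoF:
    yaw: float    # rotation about the vertical axis
    pitch: float  # rotation about the lateral axis
    roll: float   # rotation about the forward axis

@dataclass
class Pose6DoF(Pose3DoF):
    x: float = 0.0  # translation along the X axis
    y: float = 0.0  # translation along the Y axis
    z: float = 0.0  # translation along the Z axis
```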

6DoF video can be represented using point clouds. However, rendering point cloud data is computationally expensive, making it difficult to render point cloud video containing a large number of points at high frame rates. Furthermore, point cloud data rates are large, requiring significant capacity for storage or transmission.
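A rough back-of-the-envelope calculation shows the scale of the problem. The figures below (one million points per frame, three 32-bit float coordinates plus RGB per point, 30 frames per second) are assumptions chosen purely for illustration; real captures vary widely:

```python
# Back-of-the-envelope illustration of raw point cloud video data rates.
points_per_frame = 1_000_000
bytes_per_point = 3 * 4 + 3        # three 32-bit float coordinates + RGB
frames_per_second = 30

bytes_per_second = points_per_frame * bytes_per_point * frames_per_second
print(f"{bytes_per_second / 1e6:.0f} MB/s uncompressed")  # -> 450 MB/s
```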

BRIEF DESCRIPTION OF THE DRAWINGS

So that the manner in which the above recited features of the present embodiments can be understood in detail, a more particular description of the embodiments, briefly summarized above, may be had by reference to embodiments, some of which are illustrated in the appended drawings. It is to be noted, however, that the appended drawings illustrate only typical embodiments and are therefore not to be considered limiting of their scope.

FIG. 1 is a block diagram of a processing system, according to an embodiment;

FIG. 2 is a block diagram of a processor according to an embodiment;

FIG. 3 is a block diagram of a graphics processor, according to an embodiment;

Claims

What is claimed is:

1. An apparatus comprising:
one or more processors to:
receive a data set that represents a two-dimensional encoding of planar projections of an input frame of a three-dimensional video, wherein the planar projections are two-dimensional projections of a point cloud representation of the three-dimensional video;
decode the two-dimensional encoding into texture data, geometry data, and projection metadata, the geometry data to specify a set of sample positions in a set of geometry planes that are positioned relative to a set of projection planes of the input frame of the three-dimensional video and the texture data includes a set of texture planes to specify color data for the set of sample positions;
determine, based on the projection metadata, a visibility status for one or more projection planes in the set of projection planes and an occupancy status for one or more sample positions in the set of sample positions, the visibility status determined from multiple view points for the input frame; and
render video data for occupied sample positions that are associated with a visible projection plane, wherein to render the video data includes to color the occupied sample positions based on the texture data without reconstruction of the point cloud representation of the three-dimensional video.

2. The apparatus as in claim 1, wherein the set of geometry planes include depth data for each of a plurality of samples in the set of sample positions, the geometry planes generated based on texture patches from separate projections of the three-dimensional video.

3. The apparatus as in claim 2, wherein the projection metadata includes auxiliary patch information that indicates at least one projection plane for each of the plurality of samples in the set of sample positions and the auxiliary patch information is input to an occlusion filling unit to fill an occluded portion of an object in a first view based in part on the auxiliary patch information.

4. The apparatus as in claim 1, wherein the data set is a bitstream to be received from a remote device.

5. The apparatus as in claim 1, the one or more processors additionally to:
determine the visibility status from a first unit of the projection metadata; and
determine the occupancy status from a second unit of the projection metadata.

6. The apparatus as in claim 5, wherein to determine the visibility status, the one or more processors are to:
receive a position and orientation associated with a display device;
determine a projection plane associated with the sample position; and
determine the visibility status based on visibility of the projection plane from the position and orientation of the display device.

7. The apparatus as in claim 6, wherein the position and orientation associated with the display device is the position and orientation in a view space of the three-dimensional video.

8. The apparatus as in claim 7, wherein the geometry data and the texture data include two-dimensional patch image representations for projection angles determined based on a maximization of a dot product of a point normal and a plane normal.

9. The apparatus as in claim 7, wherein the first unit of the projection metadata includes auxiliary patch information that indicates the projection plane of the sample position.

10. The apparatus as in claim 7, wherein the second unit of the projection metadata includes an occupancy map that indicates whether a sample position has corresponding point cloud data.
11. A method comprising:
receiving a data set that represents a two-dimensional encoding of planar projections of an input frame of a three-dimensional video, wherein the planar projections are two-dimensional projections of a point cloud representation of the three-dimensional video;
decoding the two-dimensional encoding into texture data, geometry data, and projection metadata, wherein the geometry data specifies a set of sample positions in a set of geometry planes that are positioned relative to a set of projection planes of the input frame of the three-dimensional video and the texture data includes a set of texture planes that specify color data for the set of sample positions;
determining, based on the projection metadata, a visibility status for one or more projection planes in the set of projection planes and an occupancy status for one or more sample positions in the set of sample positions, the visibility status determined from multiple view points for the input frame; and
rendering video data for occupied sample positions that are associated with a visible projection plane, wherein to render the video data includes to color the occupied sample positions based on the texture data without reconstruction of the point cloud representation of the three-dimensional video.

12. The method as in claim 11, wherein the set of geometry planes include depth data for each of a plurality of samples in the set of sample positions, the geometry planes are generated based on texture patches from separate projections of the three-dimensional video, the projection metadata includes auxiliary patch information that indicates at least one projection plane for each of the plurality of samples in the set of sample positions, and the auxiliary patch information is input to an occlusion filling unit to fill an occluded portion of an object in a first view based in part on the auxiliary patch information.

13. The method as in claim 11, additionally comprising:
determining the visibility status from a first unit of the projection metadata; and
determining the occupancy status from a second unit of the projection metadata, wherein determining the visibility status includes receiving a position and orientation associated with a display device, determining a projection plane associated with the sample position, and determining the visibility status based on visibility of the projection plane from the position and orientation of the display device.

14. The method as in claim 13, wherein the position and orientation associated with the display device is the position and orientation in a view space of the three-dimensional video and the geometry data and the texture data include two-dimensional patch image representations for projection angles determined based on a maximization of a dot product of a point normal and a plane normal.
15. A non-transitory machine-readable medium storing instructions to cause one or more processors to perform operations comprising:
receiving a data set that represents a two-dimensional encoding of planar projections of an input frame of a three-dimensional video, wherein the planar projections are two-dimensional projections of a point cloud representation of the three-dimensional video;
decoding the two-dimensional encoding into texture data, geometry data, and projection metadata, wherein the geometry data specifies a set of sample positions in a set of geometry planes that are positioned relative to a set of projection planes of the input frame of the three-dimensional video and the texture data includes a set of texture planes that specify color data for the set of sample positions;
determining, based on the projection metadata, a visibility status for one or more projection planes in the set of projection planes and an occupancy status for one or more sample positions in the set of sample positions, the visibility status determined from multiple view points for the input frame; and
rendering video data for occupied sample positions that are associated with a visible projection plane, wherein to render the video data includes to color the occupied sample positions based on the texture data without reconstruction of the point cloud representation of the three-dimensional video.

16. The non-transitory machine-readable medium as in claim 15, wherein the set of geometry planes include depth data for each of a plurality of samples in the set of sample positions, the geometry planes are generated based on texture patches from separate projections of the three-dimensional video, the projection metadata includes auxiliary patch information that indicates at least one projection plane for each of the plurality of samples in the set of sample positions, and the auxiliary patch information is input to an occlusion filling unit to fill an occluded portion of an object in a first view based in part on the auxiliary patch information.

17. The non-transitory machine-readable medium as in claim 16, wherein the data set is a bitstream received from a remote device.

18. The non-transitory machine-readable medium as in claim 16, additionally comprising:
determining the visibility status from a first unit of the projection metadata; and
determining the occupancy status from a second unit of the projection metadata.

19. The non-transitory machine-readable medium as in claim 18, wherein determining the visibility status includes:
receiving a position and orientation associated with a display device;
determining a projection plane associated with the sample position; and
determining the visibility status based on visibility of the projection plane from the position and orientation of the display device.

20. The non-transitory machine-readable medium as in claim 19, wherein the position and orientation associated with the display device is the position and orientation in a view space of the three-dimensional video and the geometry data and the texture data include two-dimensional patch image representations for projection angles determined based on a maximization of a dot product of a point normal and a plane normal.
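Claims 8, 14, and 20 describe selecting projection angles by maximizing the dot product of a point normal and a plane normal. Below is a minimal sketch of that selection, assuming the six axis-aligned projection planes common in point cloud compression practice (e.g., V-PCC); the helper names are illustrative, not from the patent:

```python
# Hedged sketch of projection-plane selection per claims 8, 14, and 20:
# each point is assigned to the projection plane whose normal maximizes
# the dot product with the point's surface normal. The six axis-aligned
# plane normals are an assumption for illustration.
PLANE_NORMALS = [
    (1, 0, 0), (-1, 0, 0),   # +X / -X projection planes
    (0, 1, 0), (0, -1, 0),   # +Y / -Y projection planes
    (0, 0, 1), (0, 0, -1),   # +Z / -Z projection planes
]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def select_projection_plane(point_normal):
    """Return the index of the plane whose normal best matches the point
    normal, i.e. argmax over dot(plane_normal, point_normal)."""
    return max(range(len(PLANE_NORMALS)),
               key=lambda i: dot(PLANE_NORMALS[i], point_normal))

# Example: a point facing mostly along +Z maps to plane index 4 (the +Z plane).
assert select_projection_plane((0.1, 0.2, 0.97)) == 4
```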