a processing module, configured to perform video parsing and presentation based on the information about the multiplex video stream.
With reference to the third aspect, in a first possible implementation, the information about the multiplex video stream includes information about N multiplexed sub video streams that are respectively obtained by performing preset multiplexing processing on N sub video streams; the N sub video streams are generated by dividing the target spatial object into N sub spatial objects and encoding the N sub spatial objects; N is a natural number greater than 1; the target request feedback further includes multiplexing description information; and the multiplexing description information includes at least one of the following:
a quantity N of the sub video streams included in the information about the multiplex video stream;
a starting location offset, in the information about the multiplex video stream, of a starting sub video stream of the N sub video streams;
a data volume of the N multiplexed sub video streams;
spatial location information, in the VR content component, respectively corresponding to the N multiplexed sub video streams;
resolution information of the N multiplexed sub video streams; and
a video stream multiplexing type of the N multiplexed sub video streams.
With reference to the first possible implementation of the third aspect, in a second possible implementation, the multiplexing description information further includes: the spatial location information, in the VR content component, respectively corresponding to the N multiplexed sub video streams.
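For illustration only, the multiplexing description information listed above can be sketched as a simple record, together with one way a client might use it to separate the N multiplexed sub video streams before parsing. This is a minimal sketch under stated assumptions, not the claimed implementation: the names MultiplexDescription and split_sub_streams, the field names, and the (x, y, width, height) form of the spatial location are hypothetical; the "preset multiplexing processing" is assumed here to be plain byte concatenation in order; and the data volume is assumed to be given per sub video stream in bytes.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class MultiplexDescription:
    """Hypothetical container; every field is optional ("at least one of")."""
    count: Optional[int] = None          # quantity N of the sub video streams
    start_offset: int = 0                # byte offset of the starting sub video stream
    sizes: Optional[List[int]] = None    # assumed: data volume in bytes, one per sub stream
    # assumed encoding of spatial location in the VR content component: (x, y, width, height)
    spatial_locations: Optional[List[Tuple[int, int, int, int]]] = None
    resolutions: Optional[List[Tuple[int, int]]] = None  # (width, height) per sub stream
    mux_type: Optional[str] = None       # video stream multiplexing type


def split_sub_streams(payload: bytes, desc: MultiplexDescription) -> List[bytes]:
    """Split a payload into sub video streams, assuming in-order concatenation."""
    if desc.sizes is None:
        raise ValueError("per-stream data volumes are required for this sketch")
    if desc.count is not None and desc.count != len(desc.sizes):
        raise ValueError("quantity N does not match the number of data volumes")
    if len(desc.sizes) < 2:
        raise ValueError("N must be greater than 1")

    streams = []
    pos = desc.start_offset
    for size in desc.sizes:
        end = pos + size
        if end > len(payload):
            raise ValueError("payload is shorter than the description claims")
        streams.append(payload[pos:end])
        pos = end
    return streams


# Usage: a 4-byte placeholder header followed by two sub video streams.
desc = MultiplexDescription(count=2, start_offset=4, sizes=[3, 5], mux_type="concat")
parts = split_sub_streams(b"HDR0" + b"AAA" + b"BBBBB", desc)
```

In this sketch the starting location offset and the per-stream data volumes are sufficient to locate each sub video stream; the spatial location and resolution fields would then tell the processing module where, and at what size, to present each decoded sub video stream.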