Some embodiments perform a pre-analysis step in which the encoder performs a region-level analysis (e.g., for every N×M block with, for example, N=M=4, or an analysis based on object segmentation) to extract, for each color component in that region, the intensity (e.g., mean value, or lightness for luma and saturation for color), hue, variance/activity/texture characteristics, noise characteristics, and motion characteristics (e.g., motion vector and/or prediction distortion value).
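For illustration only, the block-based form of this pre-analysis can be sketched as follows. The sketch covers just two of the listed characteristics, the mean intensity and the variance of one color component, computed per N×M block. The function name `block_stats`, the list-of-rows representation of a component plane, the reading of N as block height and M as block width, and the skipping of partial blocks at the image edge are all assumptions of this example and are not taken from the embodiments described above.

```python
from statistics import fmean, pvariance


def block_stats(plane, n=4, m=4):
    """Return {(block_row, block_col): (mean, variance)} for one component plane.

    `plane` is a list of equal-length rows of sample values (hypothetical layout).
    Only complete n-by-m blocks are analyzed; edge remainders are skipped.
    """
    height = len(plane)
    width = len(plane[0]) if height else 0
    stats = {}
    for top in range(0, height - n + 1, n):
        for left in range(0, width - m + 1, m):
            # Gather every sample that falls inside the current block.
            samples = [
                plane[y][x]
                for y in range(top, top + n)
                for x in range(left, left + m)
            ]
            # Mean stands in for intensity, population variance for activity.
            stats[(top // n, left // m)] = (fmean(samples), pvariance(samples))
    return stats
```

Calling `block_stats` once per color component yields per-region statistics that a later stage could compare across components. Hue, noise, and motion characteristics would need additional inputs and are left out of this sketch.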
Since video content of different types can be combined into the same video stream or even the same video image, some embodiments identify different regions in an image that contain different types of video content. In some of these embodiments, regions with different types of video content are assigned different chroma QP offset values or are placed into different quantization groups. Some embodiments distinguish graphics content from real video content. Some embodiments distinguish 4:4:4 video content that was originally coded in 4:4:4 format from 4:4:4 video content that was up-sampled from 4:2:0 format. Some embodiments discern video content that may have originally been of different bit-depths. These characteristics of video content, in addition to their relationships across color components as well as rate control information, are used by some embodiments to determine the quantization levels or quantization relationships among all color components.
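As a minimal sketch of the assignment described above, the following example maps each region's content type to a pair of chroma QP offsets and then collects regions that share the same offsets into one group. The content-type labels, the offset values in `CHROMA_QP_OFFSETS`, the function name, and the (Cb, Cr) tuple layout are hypothetical placeholders chosen for this example. The embodiments above do not specify any of them, and the classification of regions into content types is assumed to have been done already.

```python
# Hypothetical content-type labels mapped to placeholder (Cb, Cr) chroma QP offsets.
# The numeric values are illustrative only and carry no recommended meaning.
CHROMA_QP_OFFSETS = {
    "graphics": (-2, -2),
    "native_444": (0, 0),
    "upsampled_from_420": (3, 3),
}


def group_regions_by_offset(region_types, table=CHROMA_QP_OFFSETS, default=(0, 0)):
    """Group region ids by the chroma QP offsets assigned to their content type.

    `region_types` maps a region id to a content-type label. Regions whose
    label is not in `table` fall back to `default`.
    """
    groups = {}
    for region, content_type in region_types.items():
        offsets = table.get(content_type, default)
        # Regions that receive identical offsets share one quantization group.
        groups.setdefault(offsets, []).append(region)
    return groups
```

For example, passing `{"r0": "graphics", "r1": "upsampled_from_420", "r2": "graphics"}` places `r0` and `r2` in one group and `r1` in another. Keying the groups by the offset pair is only one possible design; an encoder could equally keep an explicit group identifier per region.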