What is claimed is:

1. A method for video processing, the method comprising:
receiving a bitstream of a video block comprising a first transform block of a first color component and a second transform block of a second color component, wherein the first transform block and the second transform block are co-located blocks;
obtaining the first transform block of the first color component and the second transform block of the second color component from the bitstream of the video block;
determining a first flag indicating that all transform coefficients in the first transform block are zero;
determining a second flag indicating that a cross component level reconstruction (CCLR) is applied to the first transform block; and
in response to determining that CCLR is applied to the first transform block:
refining one or more of the transform coefficients in the first transform block by adding one or more offset values, to obtain a refined first transform block, the one or more offset values being derived based on transform coefficients that are in the second transform block and are co-located with the one or more of the transform coefficients in the first transform block;
determining a target transform kernel for the refined first transform block;
performing a reverse transform on the refined first transform block based on the target transform kernel to obtain a target block; and
reconstructing the first color component of the video block based on at least the target block.

2. The method of claim 1, wherein:
the first color component comprises one chroma component whereas the second color component comprises another chroma component;
the first color component comprises a luma component whereas the second color component comprises one chroma component; or
the first color component comprises one chroma component whereas the second color component comprises a luma component.

3. The method of claim 1, wherein determining the target transform kernel comprises:
selecting the target transform kernel for the refined first transform block as a same transform kernel for the second transform block.

4. The method of claim 1, wherein determining the target transform kernel comprises:
extracting an indicator signaled in the bitstream, wherein the indicator specifies the target transform kernel and the indicator is signaled in response to determining that the CCLR is applied to the first transform block; and
selecting the target transform kernel based on the indicator.

5. The method of claim 1, wherein determining the target transform kernel comprises:
in response to the video block being predicted under an intra prediction, deriving the target transform kernel based on a mode of the intra prediction.

6. The method of claim 5, wherein the target transform kernel is different from a transform kernel for the second transform block when CCLR is not applied to the second transform block.

7. The method of claim 1, wherein determining the target transform kernel comprises:
in response to the video block being inter predicted, selecting the target transform kernel according to a luma transform block co-located with the first transform block.

8. The method of claim 1, wherein determining the target transform kernel comprises:
selecting the target transform kernel from a list of kernels based on a block size of the first transform block, wherein the list of kernels is predefined or signaled in the bitstream.

9. The method of claim 1, wherein CCLR is only allowed to be applied on the first transform block when the first transform block is associated with a predefined set of primary transform types.

10. The method of claim 9, wherein a transform associated with each primary transform type in the predefined set of primary transform types is a two-dimensional transform, the two-dimensional transform being formed by two one-dimensional transforms, wherein the two one-dimensional transforms are both Discrete Cosine Transforms (DCTs) or both Incremental Distance Transforms (IDTs).

11. The method of claim 1, further comprising:
deriving the one or more offset values based on: 1) transform coefficients in the second transform block co-located with the one or more of the transform coefficients in the first transform block, and 2) the target transform kernel.

12. A device for video processing, the device comprising a memory for storing computer instructions and a processor in communication with the memory, wherein, when the processor executes the computer instructions, the processor is configured to cause the device to:
receive a bitstream of a video block comprising a first transform block of a first color component and a second transform block of a second color component, wherein the first transform block and the second transform block are co-located blocks;
obtain the first transform block of the first color component and the second transform block of the second color component from the bitstream of the video block;
determine a first flag indicating that all transform coefficients in the first transform block are zero;
determine a second flag indicating that a cross component level reconstruction (CCLR) is applied to the first transform block; and
in response to determining that CCLR is applied to the first transform block:
refine one or more of the transform coefficients in the first transform block by adding one or more offset values, to obtain a refined first transform block, the one or more offset values being derived based on transform coefficients that are in the second transform block and are co-located with the one or more of the transform coefficients in the first transform block;
determine a target transform kernel for the refined first transform block;
perform a reverse transform on the refined first transform block based on the target transform kernel to obtain a target block; and
reconstruct the first color component of the video block based on at least the target block.

13. The device of claim 12, wherein, when the processor is configured to cause the device to determine the target transform kernel, the processor is configured to cause the device to:
select the target transform kernel for the refined first transform block as a same transform kernel for the second transform block.

14. The device of claim 12, wherein, when the processor is configured to cause the device to determine the target transform kernel, the processor is configured to cause the device to:
extract an indicator signaled in the bitstream, wherein the indicator specifies the target transform kernel and the indicator is signaled in response to determining that the CCLR is applied to the first transform block; and
select the target transform kernel based on the indicator.

15. The device of claim 12, wherein, when the processor is configured to cause the device to determine the target transform kernel, the processor is configured to cause the device to:
in response to the video block being predicted under an intra prediction, derive the target transform kernel based on a mode of the intra prediction.

16. The device of claim 15, wherein the target transform kernel is different from a transform kernel for the second transform block when CCLR is not applied to the second transform block.

17. The device of claim 12, wherein, when the processor is configured to cause the device to determine the target transform kernel, the processor is configured to cause the device to:
in response to the video block being inter predicted, select the target transform kernel according to a luma transform block co-located with the first transform block.

18. A non-transitory storage medium for storing computer readable instructions, the computer readable instructions, when executed by a processor of a device for processing video data, causing the processor to:
receive a bitstream of a video block comprising a first transform block of a first color component and a second transform block of a second color component, wherein the first transform block and the second transform block are co-located blocks;
obtain the first transform block of the first color component and the second transform block of the second color component from the bitstream of the video block;
determine a first flag indicating that all transform coefficients in the first transform block are zero;
determine a second flag indicating that a cross component level reconstruction (CCLR) is applied to the first transform block; and
in response to determining that CCLR is applied to the first transform block:
refine one or more of the transform coefficients in the first transform block by adding one or more offset values, to obtain a refined first transform block, the one or more offset values being derived based on transform coefficients that are in the second transform block and are co-located with the one or more of the transform coefficients in the first transform block;
determine a target transform kernel for the refined first transform block;
perform a reverse transform on the refined first transform block based on the target transform kernel to obtain a target block; and
reconstruct the first color component of the video block based on at least the target block.

19. The non-transitory storage medium of claim 18, wherein, when the computer readable instructions cause the processor to determine the target transform kernel, the computer readable instructions cause the processor to:
extract an indicator signaled in the bitstream, wherein the indicator specifies the target transform kernel and the indicator is signaled in response to determining that the CCLR is applied to the first transform block; and
select the target transform kernel based on the indicator.

20. The non-transitory storage medium of claim 18, wherein, when the computer readable instructions cause the processor to determine the target transform kernel, the computer readable instructions cause the processor to:
in response to the video block being predicted under an intra prediction, derive the target transform kernel based on a mode of the intra prediction.
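
Illustrative note (not part of the claims). The decoding flow recited in claims 1, 3 and 4 may be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the claims do not specify how an offset value is computed, so the scaling factor `weight`, the default kernel used when CCLR is off, the kernel labels `"DCT"` and `"IDT"` (with `"IDT"` modelled as a pass-through), and all function names are hypothetical choices made for illustration only. The sketch also refines every coefficient, whereas the claims require only one or more.

```python
import math


def inverse_dct_1d(coeffs):
    """Orthonormal inverse DCT-II (that is, DCT-III) of a 1-D list."""
    n = len(coeffs)
    out = []
    for x in range(n):
        total = coeffs[0] / math.sqrt(n)
        for k in range(1, n):
            angle = math.pi * (2 * x + 1) * k / (2 * n)
            total += math.sqrt(2.0 / n) * coeffs[k] * math.cos(angle)
        out.append(total)
    return out


def passthrough_1d(coeffs):
    """Assumed behaviour of the "IDT" kernel: coefficients pass through."""
    return list(coeffs)


# Hypothetical kernel table; a real codec defines many more kernels.
KERNELS = {"DCT": inverse_dct_1d, "IDT": passthrough_1d}


def inverse_transform_2d(block, kernel):
    """Separable 2-D reverse transform: rows first, then columns."""
    inv = KERNELS[kernel]
    rows = [inv(row) for row in block]
    cols = [inv(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]  # transpose back


def refine_with_cclr(first, second, weight=0.5):
    """Add an offset derived from each co-located coefficient.

    ASSUMPTION: offset = weight * co-located coefficient of the second block.
    """
    return [[c1 + weight * c2 for c1, c2 in zip(r1, r2)]
            for r1, r2 in zip(first, second)]


def reconstruct_first_component(first, second, prediction, cclr_flag,
                                second_kernel, signaled_kernel=None):
    """Reconstruct the first color component of one block."""
    coeffs = first
    kernel = "DCT"  # ASSUMPTION: default kernel when CCLR is not applied
    if cclr_flag:
        coeffs = refine_with_cclr(first, second)
        # Claim 4 style: use a signaled indicator if present;
        # claim 3 style: otherwise reuse the second block's kernel.
        kernel = signaled_kernel or second_kernel
    residual = inverse_transform_2d(coeffs, kernel)
    return [[p + r for p, r in zip(p_row, r_row)]
            for p_row, r_row in zip(prediction, residual)]


# Example: the first block is all zero (first flag set), CCLR is applied.
first_block = [[0, 0], [0, 0]]
second_block = [[4, -2], [0, 6]]
predicted = [[10, 10], [10, 10]]
recon = reconstruct_first_component(first_block, second_block, predicted,
                                    cclr_flag=True, second_kernel="IDT")
print(recon)
```

In this sketch an all-zero first transform block still yields a non-zero residual once CCLR is applied, because the offsets come entirely from the co-located coefficients of the second transform block.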