Initially, the present inventors tried palette learning on random 32×32 patches extracted from images rescaled to 512×512 in the combined target and dirty training sets, but determined that random patches mostly contain muted solid colors from background and low-frequency areas of the image, making it hard not only to train on such data but also to evaluate performance on it. To overcome this, it was observed that histogram entropy computed over the patch colors is a rough indicator of color complexity present in the patch (
To encourage more difficult patches to occur in the training set, the random patch selector used during target patch selection was center-biased, as the central area of the image is more likely to contain the more detailed main subject. In addition, patches of random size were generated and then the distribution was rescaled. This generally makes the training set of the first neural network more analogous to the region-based input seen during training of the second neural network.
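By way of illustration only, the two ideas described above (scoring a patch by the entropy of its color histogram, and biasing the random patch selector toward the image center) may be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation used by the inventors: the function names, the quantization into 8 bins per channel, the Gaussian form of the center bias, and the `sigma_frac` parameter are all hypothetical choices made for the example. The random-size patch generation and rescaling step is not shown.

```python
import math
import random
from collections import Counter


def patch_color_entropy(pixels, bins_per_channel=8):
    """Shannon entropy, in bits, of the quantized color histogram of a patch.

    `pixels` is an iterable of (r, g, b) tuples with 8-bit channel values.
    The bin count is an assumption; the source does not specify one.
    """
    # Quantize each channel so that near-identical colors share a bin.
    counts = Counter(
        (r * bins_per_channel // 256,
         g * bins_per_channel // 256,
         b * bins_per_channel // 256)
        for r, g, b in pixels
    )
    total = sum(counts.values())
    # Sum of p * log2(1 / p) over the occupied bins.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def sample_center_biased_origin(rng, image_size=512, patch_size=32,
                                sigma_frac=0.2):
    """Top-left (row, col) of a patch whose center is drawn near the image center.

    A Gaussian centered on the image is an assumed form of the bias.
    """
    limit = image_size - patch_size
    origin = []
    for _ in range(2):
        center = rng.gauss(image_size / 2, sigma_frac * image_size)
        # Clamp so the whole patch stays inside the image.
        origin.append(min(max(round(center - patch_size / 2), 0), limit))
    return tuple(origin)


# Example usage with synthetic data.
rng = random.Random(0)
row, col = sample_center_biased_origin(rng)
flat_patch = [(120, 120, 120)] * (32 * 32)             # single muted color
two_tone_patch = [(0, 0, 0), (255, 255, 255)] * 512    # two colors, equal share
print(row, col)
print(patch_color_entropy(flat_patch), patch_color_entropy(two_tone_patch))
```

In such a sketch, a solid-color patch scores zero entropy while a patch split evenly between two colors scores one bit, so candidate patches could be ranked or filtered by this score to favor more color-complex regions.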