Existing solutions for producing multi-label synthetic images include geometric deformation of existing images and the use of Generative Adversarial Networks (GANs). Other solutions require additional semantic information, beyond what is available in the existing set of training data.
Henceforth, the terms “group of features” and “set of features” are used interchangeably. In addition, the term “vector of features” is used to mean a set (or group) of features having an identified order. Common implementations of computer vision that use groups of features use vectors of features; however, the present invention is not limited to the use of vectors of features and may be applied to one or more groups of features that are matched by a matching method other than by order.
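The distinction between a vector of features and a group of features matched other than by order may be illustrated by the following minimal sketch. The feature names, the values, and the averaging operation are hypothetical and are chosen only to show the two matching methods; they are not taken from any embodiment.

```python
import numpy as np

# Vector of features: the position of each value identifies the feature,
# so two vectors are matched element by element, i.e. by order.
vec_a = np.array([0.2, 0.9, 0.1])
vec_b = np.array([0.4, 0.1, 0.7])
by_order = (vec_a + vec_b) / 2  # averaging is an arbitrary example operation

# Group of features without an identified order: features are matched by
# another method, here by a (hypothetical) feature name.
group_a = {"edge": 0.2, "texture": 0.9, "color": 0.1}
group_b = {"color": 0.7, "edge": 0.4, "texture": 0.1}
by_name = {name: (group_a[name] + group_b[name]) / 2 for name in group_a}
```

In this sketch both matching methods pair the same features, so the two results agree once the named features are listed in the order used by the vectors.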
There exist methods for extracting a group of image features from a digital image, and other methods for synthesizing a synthetic digital image from a group of image features. The present invention, in some embodiments thereof, focuses on generating a new training data sample having a label set corresponding to a combination of label sets of one or more data samples of the existing training data, without requiring additional semantic information describing the existing training data. To do so, the present invention proposes, in some embodiments thereof, training one or more prediction models to combine, by example, features extracted from two or more input digital images, i.e. without explicitly specifying which features are to be combined, to produce a new group of features for performing a feature-related task. Some examples of a feature-related task are training an image classification model, retrieving an image according to the new group of features, and generating a new digital image according to the new group of features.
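Combining features by example may be illustrated by the following minimal sketch, given under simplifying assumptions. A linear least-squares map stands in for the one or more prediction models, synthetic feature vectors stand in for features extracted from digital images, and the combination of label sets is assumed to be a union. The names `fake_features`, `make_examples` and `combine`, as well as all sizes, are hypothetical. The model is given only pairs of input feature vectors and a target feature vector; it is never told which features relate to which label.

```python
import numpy as np

rng = np.random.default_rng(0)

NUM_LABELS = 4     # hypothetical size of the label vocabulary
FEATURE_DIM = 16   # hypothetical length of a vector of features

# Hypothetical stand-in for a feature extractor: every label contributes a
# fixed direction, and a sample's features are the sum over its labels
# plus a small amount of noise.
prototypes = rng.normal(size=(NUM_LABELS, FEATURE_DIM))

def fake_features(label_mask):
    """Return a synthetic vector of features for a 0/1 label mask."""
    return label_mask @ prototypes + 0.01 * rng.normal(size=FEATURE_DIM)

def make_examples(n):
    """Build (features A, features B) -> features of the label-set union."""
    inputs, targets = [], []
    for _ in range(n):
        a = (rng.random(NUM_LABELS) < 0.5).astype(float)
        b = (rng.random(NUM_LABELS) < 0.5).astype(float)
        union = np.maximum(a, b)
        inputs.append(np.concatenate([fake_features(a), fake_features(b)]))
        targets.append(fake_features(union))
    return np.array(inputs), np.array(targets)

# "Training by example": fit W to minimise ||inputs @ W - targets||^2.
# A linear map is an assumption made for brevity; it only approximates
# the union, which is not a linear function of the inputs.
inputs, targets = make_examples(500)
W, *_ = np.linalg.lstsq(inputs, targets, rcond=None)

def combine(features_a, features_b):
    """Produce a new group of features from two input groups."""
    return np.concatenate([features_a, features_b]) @ W

train_mse = float(np.mean((inputs @ W - targets) ** 2))
baseline_mse = float(np.mean(targets ** 2))  # error of always predicting zeros

new_features = combine(
    fake_features(np.array([1.0, 0.0, 0.0, 0.0])),
    fake_features(np.array([0.0, 1.0, 0.0, 0.0])),
)
```

The resulting `new_features` could then be used for a feature-related task, for example as an additional training sample for an image classification model. A non-linear prediction model would be expected to fit the union more closely than the linear map used here.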