
Method and system for producing digital image features

Patent number
US11176417B2
Publication date
2021-11-16
Applicant
International Business Machines Corporation (Armonk, NY, US)
Inventors
Amit Aides; Amit Alfassy; Leonid Karlinsky; Joseph Shtok
IPC classification
G06K9/62; G06N3/04; G06N3/08
Technical field
model, prediction, training, features, labels, group, plurality, optionally, score, operator
Region: Armonk, NY

Abstract

A system for generating a set of digital image features, comprising at least one hardware processor adapted for: producing a plurality of input groups of features, each produced by extracting a plurality of features from one of a plurality of digital images; computing an output group of features by inputting the plurality of input groups of features into at least one prediction model trained to produce a model group of features in response to at least two groups of features, such that a model set of labels indicative of the model group of features is similar, according to at least one similarity test, to a target set of labels computed by applying at least one set operator to a plurality of input sets of labels each indicative of one of the at least two groups of features; and providing the output group of features to at least one other processor.
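The target set of labels described in the abstract can be illustrated with ordinary set arithmetic. The following is a minimal sketch, not the patented implementation: the names `SET_OPERATORS`, `target_labels`, and `jaccard_similarity` are hypothetical, and the Jaccard index is only one assumed choice for the "similarity test", which the abstract does not specify.

```python
from typing import Callable, Dict, FrozenSet, Iterable

LabelSet = FrozenSet[str]

# The three set operators (union, intersection, subtraction) expressed
# on label sets. The dictionary name is hypothetical.
SET_OPERATORS: Dict[str, Callable[[LabelSet, LabelSet], LabelSet]] = {
    "union": lambda a, b: a | b,
    "intersection": lambda a, b: a & b,
    "subtraction": lambda a, b: a - b,
}


def target_labels(operator: str, labels_a: Iterable[str], labels_b: Iterable[str]) -> LabelSet:
    """Compute the target set of labels for two input sets of labels."""
    return SET_OPERATORS[operator](frozenset(labels_a), frozenset(labels_b))


def jaccard_similarity(x: Iterable[str], y: Iterable[str]) -> float:
    """One possible similarity test (an assumption): the Jaccard index."""
    x, y = frozenset(x), frozenset(y)
    if not x and not y:
        return 1.0  # two empty label sets are treated as identical
    return len(x & y) / len(x | y)


# Labels of two hypothetical images.
image_a = {"dog", "person"}
image_b = {"person", "bicycle"}

union_target = target_labels("union", image_a, image_b)
intersection_target = target_labels("intersection", image_a, image_b)
subtraction_target = target_labels("subtraction", image_a, image_b)
```

In this sketch, a prediction model for the subtraction operator would be expected to output features whose predicted labels are close to `subtraction_target` under the chosen similarity test.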

Description

BACKGROUND

The present invention, in some embodiments thereof, relates to data synthesis and, more specifically, but not exclusively, to synthesis of data related to digital images.

There is an increasing need for data augmentation and data synthesis, for example when accruing sufficient data for a task is difficult. In particular, there is an increasing need to augment existing training data or produce synthetic training data in order to provide sufficient data to accurately train a deep machine learning model, such as a deep neural network with a large number of internal layers. One field where training data is needed is computer vision, where a model is trained to identify, or extract, or both, semantic content of interest in a digital image using large quantities of labeled data tailored to a given task. In computer vision, the trained model is expected to encode all semantic content of interest, including one or more object categories present in a digital image, one or more visual attributes of an object, and the location of an object and of a visual attribute. However, it may be the case that only a small number of labeled samples depicting an object or a feature are available.

Claims

What is claimed is:

1. A system for generating a set of digital image features, comprising at least one hardware processor adapted for:
   producing a plurality of input groups of features, each produced by extracting a plurality of features from one of a plurality of digital images;
   computing an output group of features by inputting the plurality of input groups of features into at least one prediction model trained to produce a model group of features in response to at least two groups of features, such that a model set of labels indicative of the model group of features is similar, according to at least one similarity test, to a target set of labels computed by applying at least one set operator to a plurality of input sets of labels each indicative of one of the at least two groups of features; and
   providing the output group of features to at least one other hardware processor for the purpose of performing at least one feature related task;
   wherein the at least one prediction model comprises a first prediction model and a second prediction model;
   wherein the first prediction model is associated with a first set operator applied to another plurality of input sets of labels, each indicative of a plurality of features of one of the plurality of digital images;
   wherein the second prediction model is associated with a second set operator applied to the other plurality of input sets of labels;
   wherein computing the output group of features comprises:
      computing a first intermediate group of features by inputting into the first prediction model a first plurality of groups of features; and
      computing a second intermediate group of features by inputting into the second prediction model a second plurality of groups of features;
   wherein the second plurality of groups of features comprises the first intermediate group of features.

2. The system of claim 1, wherein the at least one prediction model is trained using a loss score, where computing the loss score comprises:
   computing the target set of labels by applying the at least one set operator to the plurality of input sets of labels;
   computing the model set of labels by providing the model group of features to at least one classification model; and
   computing a difference between the target set of labels and the model set of labels.

3. The system of claim 1, wherein at least one of the at least one set operator is selected from a group of set operators consisting of: union, intersection, and subtraction.

4. The system of claim 1, wherein the output group of features is the second intermediate group of features.

5. The system of claim 1, wherein at least some of the plurality of input groups of features are included in at least one of: the first plurality of groups of features, and the second plurality of groups of features.

6. The system of claim 1, wherein access to the plurality of digital images is by at least one of: receiving the plurality of digital images from at least one other hardware processor, and retrieving the plurality of digital images from at least one non-volatile digital storage, connected to the at least one hardware processor.

7. The system of claim 1, wherein at least one of the at least one prediction model is a neural network.

8. The system of claim 1, wherein at least one of the at least one feature related task is selected from a group of tasks consisting of: generating a digital image, retrieving a digital image, and training at least one other classification model.

9. A system for generating a set of digital image features, comprising at least one hardware processor adapted for:
   producing a plurality of input groups of features, each produced by extracting a plurality of features from one of a plurality of digital images;
   computing an output group of features by inputting the plurality of input groups of features into at least one prediction model trained to produce a model group of features in response to at least two groups of features, such that a model set of labels indicative of the model group of features is similar, according to at least one similarity test, to a target set of labels computed by applying at least one set operator to a plurality of input sets of labels each indicative of one of the at least two groups of features; and
   providing the output group of features to at least one other hardware processor for the purpose of performing at least one feature related task;
   wherein the at least one prediction model is trained using a loss score, where computing the loss score comprises:
      computing the target set of labels by applying the at least one set operator to the plurality of input sets of labels;
      computing the model set of labels by providing the model group of features to at least one classification model; and
      computing a difference between the target set of labels and the model set of labels;
   wherein training the at least one prediction model comprises:
      in each of a plurality of iterations:
         generating a plurality of groups of training features, each group of training features extracted from one of the plurality of training images;
         providing the plurality of groups of training features to each of a plurality of set-operator prediction models, each set-operator prediction model associated with one of a plurality of set operators and adapted to produce one of a plurality of model output groups of features corresponding to a model target set of labels computed by applying the respective set operator to the plurality of input sets of labels;
         providing the plurality of model output groups of features to at least one multi-label classification model to produce a plurality of output sets of labels, each output set of labels associated with one of the plurality of model output groups of features and having a score set comprising a plurality of label-scores, each label-score indicative of a confidence of identifying by the at least one multi-label classification model one label of the output set of labels in respective model output group of features;
         computing the loss score using the plurality of output sets of labels and the plurality of input sets of labels; and
         modifying at least one model value of at least one of the plurality of set-operator prediction models to reduce another loss score computed in another iteration of the plurality of iterations.

10. The system of claim 9, wherein computing the loss score comprises computing for each of the plurality of set-operator prediction models a model loss score, using the respective score set of the respective output set of labels, where the model loss score is indicative of the difference between the respective model target set of labels and a respective model output set of labels, predicted by at least one classification model for the model output group of features.

11. The system of claim 9, wherein providing the plurality of groups of training features to each of the plurality of set-operator prediction models comprises providing to at least one set-operator prediction model of the plurality of set-operator prediction models a first group of training features of the plurality of groups of training features as a first input and a second group of training features of the plurality of groups of training features as a second input, to produce a first model group of features; and
   wherein computing the loss score further comprises:
      providing to the at least one set-operation prediction model the first training group of features as the second input and the second group of training features as the first input, to produce a second model group of features;
      applying a mean square error method to the first model group of features and the second model group of features to produce a symmetric reconstruction error score; and
      computing the loss score further using the symmetric reconstruction error score.

12. The system of claim 9, wherein the plurality of set-operator prediction models comprises an intersection model of the plurality of set-operator prediction models such that a target intersection group of features of the intersection model is computed by applying an intersection operator to at least two first groups of features provided to the intersection model;
   wherein the plurality of set-operator prediction models comprises a subtraction model of the plurality of set-operator prediction models, such that a target subtraction group of features of the subtraction model is computed by applying a subtraction operator to at least two second groups of features provided to the subtraction model;
   wherein the plurality of set-operator prediction models comprises a union model of the plurality of set-operator prediction models, such that a target union group of features of the third model is computed by applying a union operator to at least two third groups of features provided to the union model; and
   wherein computing the loss score further comprises:
      providing a first group of features and a second group of features, both of the plurality of groups of training features, to the intersection model to produce an intersection group of features;
      providing the first group of features and the second group of features to the subtraction model to produce a subtraction group of features;
      providing the subtraction group of features and the intersection group of features to the union model, to produce a union group of features;
      applying a mean square error method to the union group of features and the first group of features to produce a mode-collapse reconstruction error score; and
      computing the loss score further using the mode-collapse reconstruction error score.

13. The system of claim 12, wherein computing the loss score further comprises:
   providing the second group of features and the first group of features to the subtraction model to produce another subtraction group of features;
   providing the other subtraction group of features and the intersection group of features to the union model to produce another union group of features;
   applying the mean square error method to the other union group of features and the second group of features to produce another mode-collapse reconstruction error score; and
   computing the loss score further using the other mode-collapse reconstruction error score.

14. The system of claim 9, further comprising training the at least one multi-label classification model comprising:
   in at least some of the plurality of iterations:
      computing a multi-label classification loss score using the plurality of output sets of labels and the plurality of input sets of labels; and
      modifying at least one classification model value of the at least one multi-label classification model to reduce another multi-label classification loss score computed in another iteration of the plurality of iterations;
   wherein computing the multi-label classification loss score comprises, for each of some of the plurality of training images:
      computing a set of classification scores by providing the training image to the at least one multi-label classification model; and
      computing a binary cross-entropy loss value for the set of classification scores and the respective input set of labels.

15. The system of claim 9, wherein the plurality of training images comprises a plurality of training image pairs each comprising two of the plurality of training images; and
   wherein the plurality of groups of training features comprises two groups of training features, each extracted from one of the two training images of one of the plurality of training image pairs.

16. The system of claim 9, wherein at least one of the plurality of set-operator prediction models is another neural network; and
   wherein the at least one multi-label classification model is a second other neural network.

17. A system for training a plurality of set-operation prediction models, comprising at least one hardware processor adapted to:
   in each of a plurality of iterations:
      generating a plurality of groups of training features, each group of training features extracted from one of a plurality of training images, each training image having an input set of labels indicative of a plurality of training features of the respective training image;
      providing the plurality of groups of training features to each of a plurality of set-operator prediction models, each set-operator prediction model associated with one of a plurality of set operators and adapted to produce one of a plurality of model output groups of features corresponding to a model target set of labels computed by applying the respective set operator to a plurality of input sets of labels of the plurality of training images;
      providing the plurality of model output groups of features to at least one multi-label classification model to produce a plurality of output sets of labels, each output set of labels associated with one of the plurality of model output groups of features and having a score set comprising a plurality of label-scores, each label-score indicative of a confidence of identifying by the at least one multi-label classification model one label of the output set of labels in respective model output group of features;
      computing a loss score using the plurality of output sets of labels and the plurality of input sets of labels; and
      modifying at least one model value of at least one of the plurality of set-operator prediction models to reduce another loss score computed in another iteration of the plurality of iterations;
   wherein the at least one prediction model comprises a first prediction model and a second prediction model;
   wherein the first prediction model is associated with a first set operator applied to another plurality of input sets of labels, each indicative of a plurality of features of one of the plurality of digital images;
   wherein the second prediction model is associated with a second set operator applied to the other plurality of input sets of labels;
   wherein computing the output group of features comprises:
      computing a first intermediate group of features by inputting into the first prediction model a first plurality of groups of features; and
      computing a second intermediate group of features by inputting into the second prediction model a second plurality of groups of features;
   wherein the second plurality of groups of features comprises the first intermediate group of features.

18. The system of claim 17, wherein computing the loss score comprises for at least one model of the plurality of set-operation prediction models:
   computing a target set of labels by applying at least one set operator associated with the at least one model to the plurality of input sets of labels; and
   computing a difference between the target set of labels and at least one output set of labels of the respective at least one model.
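The loss terms named in the claims (binary cross-entropy, the symmetric reconstruction error, and the mode-collapse reconstruction error) can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the patented implementation: every function name here is hypothetical, and the three "models" are toy element-wise stand-ins operating on binary indicator vectors, where the claims contemplate trained prediction models such as neural networks.

```python
import math
from typing import Callable, List, Sequence

Features = Sequence[float]
SetModel = Callable[[Features, Features], List[float]]


def binary_cross_entropy(scores: Sequence[float], targets: Sequence[float],
                         eps: float = 1e-7) -> float:
    """Mean binary cross-entropy between label-scores and 0/1 target labels."""
    total = 0.0
    for s, t in zip(scores, targets):
        s = min(max(s, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= t * math.log(s) + (1.0 - t) * math.log(1.0 - s)
    return total / len(scores)


def mean_square_error(x: Features, y: Features) -> float:
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)


def symmetric_reconstruction_error(model: SetModel, f_a: Features, f_b: Features) -> float:
    """MSE between model(a, b) and model(b, a), i.e. the inputs swapped."""
    return mean_square_error(model(f_a, f_b), model(f_b, f_a))


def mode_collapse_reconstruction_error(intersect: SetModel, subtract: SetModel,
                                       union: SetModel,
                                       f_a: Features, f_b: Features) -> float:
    """MSE between union(subtract(a, b), intersect(a, b)) and a."""
    rebuilt_a = union(subtract(f_a, f_b), intersect(f_a, f_b))
    return mean_square_error(rebuilt_a, f_a)


# Toy stand-ins (assumptions): element-wise operations on indicator vectors.
def toy_union(a: Features, b: Features) -> List[float]:
    return [max(x, y) for x, y in zip(a, b)]


def toy_intersection(a: Features, b: Features) -> List[float]:
    return [min(x, y) for x, y in zip(a, b)]


def toy_subtraction(a: Features, b: Features) -> List[float]:
    return [x * (1.0 - y) for x, y in zip(a, b)]


f_a = [1.0, 1.0, 0.0]
f_b = [0.0, 1.0, 1.0]

sym_err = symmetric_reconstruction_error(toy_union, f_a, f_b)
collapse_err = mode_collapse_reconstruction_error(
    toy_intersection, toy_subtraction, toy_union, f_a, f_b)
bce = binary_cross_entropy([0.5, 0.5], [1.0, 0.0])
```

Note that `mode_collapse_reconstruction_error` also shows the chaining described in claim 1: the outputs of the subtraction and intersection models serve as the inputs of the union model. How the individual terms are weighted into a single loss score is not stated in the claims and is left out of this sketch.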