As part of object detection 110, feature extraction 112 may identify various features within data that correspond to detected objects. For example, feature extraction 112 may be implemented as part of a deep neural network (e.g., a convolutional neural network (CNN)) which may be trained to generate feature vectors that can be compared with other feature vectors generated using the same deep learning model, so that the respective distance between the feature vectors indicates similarity between objects, in some embodiments. Feature extraction 112 may encode or generate extracted features (e.g., as a feature vector), in various embodiments, which may be used to represent a detected object. In some embodiments, features may be extracted using a CNN or other neural network model, and the extracted features may serve as intermediate features from which domain-specific attributes are extracted as additional features for object recognition. For example, a bounding box value detected for a recognized object in image data may then be used to determine sharpness, brightness, or other image-data-specific attributes for the bounding box area, which can be used as additional features (including as features for indexing criteria as discussed below).
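By way of illustration only, the comparison of feature vectors by distance and the derivation of additional attributes from a bounding box area might be sketched as follows. This is a minimal sketch under stated assumptions, not the described implementation: the function names (`euclidean_distance`, `crop`, `brightness`, `sharpness`, `augment_features`), the use of Euclidean distance, the grayscale image represented as a list of pixel rows, the `(x, y, width, height)` bounding box layout, and the particular brightness and sharpness measures are all hypothetical choices made for the example.

```python
import math


def euclidean_distance(a, b):
    """Distance between two feature vectors; a smaller value suggests more similar objects."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def crop(image, box):
    """Return the pixels inside a bounding box given as (x, y, width, height).

    The image is assumed to be a list of rows of grayscale values.
    """
    x, y, w, h = box
    return [row[x:x + w] for row in image[y:y + h]]


def brightness(region):
    """Mean pixel intensity of the region (an assumed brightness measure)."""
    pixels = [p for row in region for p in row]
    return sum(pixels) / len(pixels)


def sharpness(region):
    """Mean absolute difference between horizontally adjacent pixels.

    This is a simple stand-in for a sharpness measure, chosen for illustration.
    """
    diffs = [abs(row[i + 1] - row[i]) for row in region for i in range(len(row) - 1)]
    return sum(diffs) / len(diffs) if diffs else 0.0


def augment_features(embedding, image, box):
    """Append bounding-box attributes to an extracted feature vector."""
    region = crop(image, box)
    return list(embedding) + [brightness(region), sharpness(region)]


if __name__ == "__main__":
    # Hypothetical 4x4 grayscale image with one detected object in the centre.
    image = [
        [0, 0, 0, 0],
        [0, 10, 20, 0],
        [0, 30, 40, 0],
        [0, 0, 0, 0],
    ]
    box = (1, 1, 2, 2)
    print(augment_features([0.5], image, box))
    print(euclidean_distance([0.0, 0.0], [3.0, 4.0]))
```

In such a sketch, the appended attributes could be stored alongside the feature vector so that they are available both for comparison and as indexing criteria.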
In the illustrated example, object detection 110 may detect two objects 154, which may be surrounded by bounding boxes as detected in image data 152. Because object detection 110 may be tuned (or implemented separately) for detecting different types of objects (e.g., human faces, animals, inanimate objects, text, etc.), the previous examples are not intended to be limiting.