For example, the detection ID of each person detected from the frame to be processed may be paired with every detection ID of the persons detected in previously processed frames, and similarity determination may be performed for each pair. In this case, however, the number of pairs grows with every frame processed and becomes enormous. As a result, the processing speed may be reduced.
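The growth in the number of pairs can be illustrated with a minimal sketch. It is not taken from the embodiment: the function names, the detection ID strings, and the use of cosine similarity as the similarity measure are all assumptions made only for illustration, since the description states only that similarity between feature values is determined.

```python
import math


def cosine_similarity(a, b):
    # Assumed similarity measure; the description does not specify one.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_brute_force(known, new_feature, threshold):
    """Compare one new detection against every stored detection ID.

    known: dict mapping a (hypothetical) detection ID to its feature vector.
    Returns the matching detection IDs and the number of comparisons made.
    """
    matches = []
    comparisons = 0
    for detection_id, feature in known.items():
        comparisons += 1
        if cosine_similarity(feature, new_feature) >= threshold:
            matches.append(detection_id)
    return matches, comparisons


def total_comparisons(detections_per_frame):
    """Count the pairs examined when each new detection is paired with
    every detection from all previously processed frames."""
    total = 0
    seen = 0
    for count in detections_per_frame:
        total += count * seen  # each new detection meets all earlier ones
        seen += count
    return total
```

Under these assumptions, a new detection is always compared with every stored detection ID, so the total number of pairs grows roughly with the square of the number of detections. For instance, `total_comparisons([3, 3, 3])` counts 27 pairs for only nine detections over three frames.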
Therefore, for example, the person extraction unit 10 may index a person detected from each frame as shown in
The indexes shown in
In the third layer, nodes corresponding to all the detection IDs obtained from all the frames processed up to that point are arranged, one node per detection ID. Then, the plurality of nodes arranged in the third layer are grouped such that those having a similarity (similarity between the feature values shown in