Then, the person extraction unit 10 creates a pair of the detection ID of a person detected from the frame to be processed and each of the plurality of detection IDs included in the above group of the second layer. Then, the person extraction unit 10 calculates a similarity for each pair, and determines whether the calculated similarity is equal to or higher than the second threshold value. As described above, the second threshold value is higher than the first threshold value.
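As a non-limiting illustration, the pairing and threshold comparison described above may be sketched as follows. The sketch assumes that each outer appearance feature value is a numeric vector and that the similarity is a cosine similarity; the description fixes neither. The function names, the detection ID strings, and the threshold value 0.9 are hypothetical and serve only as an example.

```python
import math


def cosine_similarity(a, b):
    """Assumed similarity measure; the description does not specify one."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_in_second_layer(query_id, group_ids, features, second_threshold):
    """Pair the query detection ID with every detection ID in the
    second-layer group and keep the pairs whose similarity is equal to
    or higher than the second threshold value."""
    matches = []
    for candidate_id in group_ids:
        similarity = cosine_similarity(features[query_id], features[candidate_id])
        if similarity >= second_threshold:
            matches.append((candidate_id, similarity))
    return matches


# Hypothetical feature values keyed by hypothetical detection IDs.
features = {
    "F3-0001": [1.0, 0.0],  # person detected from the frame to be processed
    "F1-0001": [1.0, 0.0],  # member of the second-layer group
    "F2-0003": [0.0, 1.0],  # member of the second-layer group
}
matches = match_in_second_layer("F3-0001", ["F1-0001", "F2-0003"], features, 0.9)
```

An empty result corresponds to the case in which no detection ID in the group of the second layer has a similarity equal to or higher than the second threshold value.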
In a case where there is no detection ID whose similarity is equal to or higher than the second threshold value in the group of the second layer, the person extraction unit 10 determines that the outer appearance feature value of the person detected from the frame to be processed is not similar, by the predetermined level or more, to that of any of the persons detected in previously processed frames. Then, the person extraction unit 10 associates a new person ID with the detection ID of the person detected from the frame to be processed, and registers the new person ID in the detected person information shown in