In the first to sixth example embodiments, examples have been described in which moving image data and text data are the data to be analyzed. However, the same effect can be obtained by similar processing when analyzing other data, such as voice data, music data, image data, figure data, fingerprint data, biometric information, time series data (for example, stock price fluctuation time series data), file archives, object files, and binary data.
That is, a desired subject (a subject whose appearance frequency satisfies a predetermined condition) can be extracted by subjecting the above data to (1) processing for detecting predetermined subjects, (2) processing for grouping the detected subjects based on the similarity between them (the similarity between their feature values), (3) processing for calculating the appearance frequency of each subject based on the grouping result, and (4) processing for extracting a subject whose appearance frequency satisfies the predetermined condition.
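As an illustration only, processing (2) to (4) can be sketched as follows in Python. The sketch assumes that processing (1) has already produced one feature vector per detected subject. The cosine similarity measure, the greedy grouping against the first member of each group, the threshold of 0.9, and the condition "appears at least min_count times" are all assumptions chosen for the example and are not prescribed by the embodiments. The function names are likewise hypothetical.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity between two feature vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def group_subjects(features, threshold=0.9):
    """Processing (2): greedily group detections by feature similarity.

    Each group is a list of detection indices. A detection joins the first
    group whose first member is at least `threshold` similar to it;
    otherwise it starts a new group.
    """
    groups = []
    for i, feature in enumerate(features):
        for group in groups:
            if cosine_similarity(features[group[0]], feature) >= threshold:
                group.append(i)
                break
        else:
            groups.append([i])
    return groups


def extract_frequent(features, threshold=0.9, min_count=3):
    """Processing (3) and (4): count appearances per group and keep the
    groups whose count meets the assumed condition (count >= min_count)."""
    groups = group_subjects(features, threshold)
    return [group for group in groups if len(group) >= min_count]


# Feature vectors assumed to come from processing (1), one per detection.
detections = [(1.0, 0.0), (0.99, 0.05), (0.0, 1.0), (1.0, 0.01), (0.0, 0.98)]
frequent = extract_frequent(detections, threshold=0.9, min_count=3)
```

Here the detections at indices 0, 1, and 3 fall into one group and those at indices 2 and 4 into another, so only the first group meets the assumed condition. For data other than moving images, only the detector and the feature values would change; the grouping and counting steps stay the same.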
Hereinafter, examples of reference embodiments are additionally described.
1. A data processing apparatus including:
an extraction unit that analyzes data to be analyzed and extracts a subject whose appearance frequency in the data to be analyzed satisfies a predetermined condition among subjects detected in the data to be analyzed; and
an output unit that outputs information regarding the extracted subject.
2. The data processing apparatus described in 1,