In addition to the information described above, the output unit 20 may output the calculated appearance frequency. In a case where attribute information (for example, upload date and time, or data creation date and time) is associated with the text data inputted to the word string extraction unit 40, the output unit 20 may perform the output using the attribute information. For example, for each extracted group, the output unit 20 may count the appearance timings (for example, upload date and time, or data creation date and time) of the multiple word strings belonging to the group. Then, the output unit 20 may create and output a graph showing a temporal change in the appearance frequency. The output unit 20 may also create and output information indicating the presence or absence of appearance in each predetermined time period. As a result, it is possible to output the extraction result in the same manner of display as in
According to the present example embodiment described above, it is possible to extract a word string satisfying a predetermined condition from text data. For example, in a case where a plurality of pieces of text data created by a plurality of users are provided to the data processing apparatus 1 as data to be analyzed, it is possible to extract word strings relevant to topics that appear with high frequency across the text data of the plurality of users. As a result, a person viewing the extraction result can recognize topics with a high degree of attention.
It should be noted that, also in the present example embodiment, the technique described in the second example embodiment can be used.
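As a non-limiting illustration, the per-group counting of appearance timings and the presence-or-absence information described above may be sketched as follows. This is only a minimal sketch under stated assumptions: the function names, the representation of each appearance as a (group identifier, timestamp) pair, and the use of one calendar day as the predetermined time period are all hypothetical and are not taken from the present example embodiment.

```python
from collections import Counter, defaultdict
from datetime import datetime


def count_appearances_by_period(records, period_format="%Y-%m-%d"):
    """Count appearances per group and per time period.

    `records` is an iterable of (group_id, timestamp) pairs, one pair per
    appearance of a word string belonging to that group (assumed layout).
    `period_format` is a strftime pattern defining the time period; the
    default (one day) yields labels that sort chronologically as strings.
    """
    counts = defaultdict(Counter)
    for group_id, timestamp in records:
        counts[group_id][timestamp.strftime(period_format)] += 1
    # Sort each group's periods so the result can be plotted as a time series.
    return {g: dict(sorted(c.items())) for g, c in counts.items()}


def presence_by_period(counts, periods):
    """For each group, flag whether it appeared at all in each given period."""
    return {g: {p: p in c for p in periods} for g, c in counts.items()}


# Hypothetical sample data: two groups with appearances over three days.
records = [
    ("g1", datetime(2020, 1, 1, 9, 0)),
    ("g1", datetime(2020, 1, 1, 17, 30)),
    ("g1", datetime(2020, 1, 3, 8, 15)),
    ("g2", datetime(2020, 1, 2, 12, 0)),
]
counts = count_appearances_by_period(records)
presence = presence_by_period(counts, ["2020-01-01", "2020-01-02", "2020-01-03"])
```

The per-period counts obtained in this way could be passed to any plotting means to draw the graph showing the temporal change in appearance frequency, while the presence flags correspond to the information indicating the presence or absence of appearance for each period.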