As described above, a word string is a group of multiple words. For example, a word string may be multiple words in one sentence, multiple words in one paragraph, multiple words in one chapter, multiple words in one article, or multiple words on one page. Multiple words grouped in some other unit may also be a word string. This definition applies to each “word string” appearing below.
After detecting multiple word strings, the word string extraction unit 40 groups together word strings whose similarity to one another is equal to or higher than a predetermined level. In this manner, word strings relevant to similar topics can be grouped. Similarities between word strings may be calculated using techniques known in the related art.
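The grouping step can be illustrated with a minimal sketch. The text leaves the similarity measure to the related art, so the Jaccard similarity over word sets used here, the greedy comparison against the first member of each group, the threshold of 0.5, and the function names `jaccard` and `group_word_strings` are all assumptions made for illustration, not the method of the word string extraction unit 40.

```python
def jaccard(a, b):
    """Jaccard similarity between two word strings given as lists of words.

    Assumed measure for illustration; any known similarity could be used.
    """
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)


def group_word_strings(word_strings, threshold=0.5):
    """Greedy grouping: a word string joins the first group whose first
    member is at least `threshold` similar; otherwise it starts a new group.
    """
    groups = []
    for ws in word_strings:
        for group in groups:
            if jaccard(ws, group[0]) >= threshold:
                group.append(ws)
                break
        else:
            # No existing group was similar enough.
            groups.append([ws])
    return groups


# Hypothetical word strings (for example, one per sentence).
word_strings = [
    ["battery", "drains", "fast"],
    ["battery", "drains", "quickly"],
    ["screen", "cracked"],
    ["battery", "drains", "fast", "overnight"],
]
groups = group_word_strings(word_strings)
```

In this example the three battery-related word strings fall into one group and the screen-related word string forms its own group. A greedy pass is order dependent; a clustering method that compares all pairs would be an equally valid choice.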
Then, the word string extraction unit 40 calculates an appearance frequency for each group of word strings. The appearance frequency is calculated as, for example, the number of appearances, such as the number of word strings that are members of the group.
Thereafter, the word string extraction unit 40 extracts each group of word strings whose appearance frequency satisfies a predetermined condition (for example, an appearance frequency equal to or higher than a predetermined level). As a result, topics with a high appearance frequency and a high degree of attention are extracted.
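The frequency calculation and extraction can be sketched as follows, assuming that the appearance frequency is the number of member word strings in a group and that the predetermined condition is a minimum count. The function name `extract_frequent_groups` and the parameter `min_count` are hypothetical.

```python
def extract_frequent_groups(groups, min_count=2):
    """Return (group, frequency) pairs for groups meeting the condition.

    Assumption: the appearance frequency of a group is its member count,
    and the condition is frequency >= min_count.
    """
    extracted = []
    for group in groups:
        frequency = len(group)
        if frequency >= min_count:
            extracted.append((group, frequency))
    return extracted


# Hypothetical groups of word strings, as produced by a grouping step.
example_groups = [
    [["battery", "drains", "fast"], ["battery", "drains", "quickly"]],
    [["screen", "cracked"]],
]
frequent = extract_frequent_groups(example_groups, min_count=2)
```

Here only the first group, with two member word strings, is extracted. Other conditions, such as keeping a fixed number of the most frequent groups, would fit the same structure.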
The output unit 20 outputs information regarding the extracted group of word strings. Specifically, the output unit 20 outputs information from which the content of each extracted group can be recognized. For example, the output unit 20 may output some of the multiple word strings belonging to each extracted group. In addition, the output unit 20 may output words that appear in common in multiple word strings belonging to each extracted group.
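One possible form of this output is sketched below. It assumes that "words commonly appearing" means words present in every word string of the group, and that a fixed number of member word strings is shown as examples. The function name `summarize_group` and the parameter `max_examples` are hypothetical and are not part of the output unit 20 as described.

```python
def summarize_group(group, max_examples=2):
    """Build output information for one extracted, non-empty group.

    Assumption: common words are those appearing in every member word
    string; a looser rule (for example, a majority) is equally possible.
    """
    common = set(group[0])
    for ws in group[1:]:
        common &= set(ws)
    return {
        "examples": group[:max_examples],
        "common_words": sorted(common),
    }


# Hypothetical extracted group of word strings.
extracted_group = [
    ["battery", "drains", "fast"],
    ["battery", "drains", "quickly"],
    ["battery", "drains", "fast", "overnight"],
]
summary = summarize_group(extracted_group)
```

For this group the summary contains the first two word strings as examples and the common words "battery" and "drains".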