In an embodiment, the textual extraction application 106 may calculate string metrics for portions of the computer-readable text in the document 118. For instance, the string metrics may include Levenshtein distance, Damerau-Levenshtein distance, longest common subsequence (LCS) distance, Hamming distance, and/or Jaro distance. The textual extraction application 106 may further identify the at least one textual element based upon the string metrics.
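By way of a non-limiting illustration, the following sketch shows one way a string metric such as Levenshtein distance might be calculated and used to identify a textual element. The function names `levenshtein` and `closest_portion`, the case-insensitive comparison, and the `max_distance` threshold are hypothetical and are assumed solely for purposes of this example; they are not features of the textual extraction application 106 as described.

```python
from typing import Iterable, Optional


def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character insertions,
    deletions, or substitutions needed to turn a into b."""
    # Keep the shorter string in the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def closest_portion(portions: Iterable[str], target: str,
                    max_distance: int = 2) -> Optional[str]:
    """Return the portion nearest to target, or None if no portion is
    within max_distance edits (hypothetical threshold)."""
    best = None
    best_distance = max_distance + 1
    for portion in portions:
        distance = levenshtein(portion.lower(), target.lower())
        if distance < best_distance:
            best, best_distance = portion, distance
    return best
```

In this sketch, a misrecognized portion such as "Invoce" would still be matched to the target "invoice", because the two strings differ by a single edit.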
Responsive to identifying the at least one textual element, the textual extraction application 106 outputs the at least one textual element. In an example, the textual extraction application 106 may output the at least one textual element by presenting the at least one textual element as part of the graphical features 110 presented on the display 108 of the computing device 100. In another example, the textual extraction application 106 may output the at least one textual element by storing the at least one textual element in a data structure that is conducive to further data processing. For instance, the textual extraction application 106 may cause the at least one textual element to be stored in an eXtensible Markup Language (XML) file (e.g., an XML-based spreadsheet), in a comma-separated values (CSV) file, or as an entry in a database. The textual extraction application 106 may store the at least one textual element from the document 118 as part of the extracted textual elements 122 stored in the data store 114.
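By way of a non-limiting illustration, the following sketch shows how identified textual elements might be serialized to CSV or XML for further data processing. The function names `to_csv` and `to_xml`, the representation of each textual element as a (label, value) pair, and the column and tag names are hypothetical and are assumed solely for purposes of this example.

```python
import csv
import io
import xml.etree.ElementTree as ET
from typing import Iterable, Tuple

Element = Tuple[str, str]  # hypothetical (label, value) pair


def to_csv(elements: Iterable[Element]) -> str:
    """Serialize textual elements as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["label", "value"])
    for label, value in elements:
        # csv.writer quotes values that contain commas or quotes.
        writer.writerow([label, value])
    return buffer.getvalue()


def to_xml(elements: Iterable[Element]) -> str:
    """Serialize textual elements as a small XML document."""
    root = ET.Element("textual_elements")
    for label, value in elements:
        node = ET.SubElement(root, "element", {"label": label})
        node.text = value
    return ET.tostring(root, encoding="unicode")
```

Either string could then be written to a file or inserted as an entry in a database, depending on the downstream processing that is desired.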