Computing system for extraction of textual elements from a document

Patent number

US11176364B2

Publication date

2021-11-16

Applicant

Hyland Software, Inc. (Westlake, OH, US)

Inventors

Ralph Meier; Thorsten Wanschura; Johannes Hausmann; Harry Urbschat

IPC classification

G06K9/00; G06K9/20; G06T7/70; G06K9/72; G06T7/50; G06K9/62

Technical field

textual, document, text, computer, readable, in, extraction, element, computing, documents

Region: Westlake, OH

Abstract

Described herein are various technologies pertaining to text extraction from a document. A computing device receives the document. The document comprises computer-readable text and a layout, wherein the layout defines positions of the computer-readable text within a two-dimensional area represented by the document. Responsive to receiving the document, the computing device identifies at least one textual element in the computer-readable text based upon spatial factors between portions of the computer-readable text and contextual relationships between the portions of the computer-readable text. The computing device then outputs the at least one textual element.

Description

In an embodiment, the textual extraction application 106 may identify the at least one textual element in the computer-readable text based upon font types and/or font sizes of the computer-readable text in the document 118. The textual extraction application 106 may provide indications of the font types and/or the font sizes of the computer-readable text to the computer-implemented model 120. The plurality of textual elements and the scores output by the computer-implemented model 120 may thus be further based upon the font types and/or font sizes of the computer-readable text.
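As an illustration of how font information might be passed along as model input, the following is a minimal sketch and not the patented implementation. The `Token` class, `font_features`, and `toy_score` are hypothetical names introduced here; `toy_score` is only a stand-in for the computer-implemented model 120, whose actual form is not described in this passage.

```python
from dataclasses import dataclass
from statistics import median


@dataclass(frozen=True)
class Token:
    """Hypothetical container for one piece of computer-readable text."""
    text: str
    x: float          # left edge within the two-dimensional layout
    y: float          # top edge within the two-dimensional layout
    font_name: str
    font_size: float


def font_features(token: Token, body_size: float) -> dict:
    """Describe a token's font relative to the document's typical size."""
    return {
        "font_name": token.font_name,
        "font_size": token.font_size,
        "relative_size": token.font_size / body_size,
        "is_bold": "bold" in token.font_name.lower(),
    }


def toy_score(features: dict) -> float:
    """Stand-in for a model score (assumption): larger or bold text scores higher."""
    score = 0.4 * min(features["relative_size"], 2.0) / 2.0
    if features["is_bold"]:
        score += 0.3
    return round(score, 3)


# Example data invented for illustration only.
tokens = [
    Token("Invoice", 10, 10, "Helvetica-Bold", 18.0),
    Token("Total", 10, 200, "Helvetica", 10.0),
    Token("42.50", 80, 200, "Helvetica", 10.0),
]

# Use the median font size as a rough estimate of the body text size.
body_size = median(t.font_size for t in tokens)

# Pair each candidate textual element with its score, highest first.
scored = sorted(
    ((toy_score(font_features(t, body_size)), t.text) for t in tokens),
    reverse=True,
)
```

In this sketch the font indications are reduced to a small feature dictionary per token; a real model would presumably combine them with the spatial and contextual signals mentioned in the abstract.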

It is to be understood that the at least one textual element identified by the textual extraction application 106 may vary in length and/or type. In an example, the at least one textual element may include a first textual element and a second textual element. The first textual element may be a word in the computer-readable text of the document 118, while the second textual element may be a number in the computer-readable text of the document 118. In another example, the first textual element may be indicative of an identifier for the defined criteria that is found within the computer-readable text of the document 118, while the second textual element may be a word that meets the defined criteria. In yet another example, the first textual element may include a first word and a second word, while the second textual element may include a third word.
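The identifier-and-value example above can be sketched as follows. This is a hedged illustration under stated assumptions: the `Word` class, `element_type`, and `value_for_identifier` are hypothetical names, the "defined criteria" is assumed here to be "the element is a number", and the same-line, nearest-to-the-right rule is a simple stand-in for the spatial factors the patent refers to.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Word:
    """Hypothetical positioned word from the document's layout."""
    text: str
    x: float
    y: float


# Assumed criteria: digits with optional "." or "," separated groups.
NUMBER = re.compile(r"^\d+(?:[.,]\d+)*$")


def element_type(word: Word) -> str:
    """Classify a textual element as a 'number' or a 'word'."""
    return "number" if NUMBER.match(word.text) else "word"


def value_for_identifier(
    words: List[Word], identifier: str, max_dy: float = 2.0
) -> Optional[Word]:
    """Find the nearest number to the right of the identifier on the same line."""
    anchors = [w for w in words if w.text.lower() == identifier.lower()]
    if not anchors:
        return None
    anchor = anchors[0]
    candidates = [
        w for w in words
        if element_type(w) == "number"
        and abs(w.y - anchor.y) <= max_dy   # roughly the same text line
        and w.x > anchor.x                  # located to the right
    ]
    return min(candidates, key=lambda w: w.x - anchor.x, default=None)


# Example data invented for illustration only.
words = [
    Word("Invoice", 10, 10),
    Word("2021", 90, 10),
    Word("Total", 10, 200),
    Word("42.50", 80, 200),
    Word("EUR", 120, 200),
]

match = value_for_identifier(words, "Total")
```

Here "Total" plays the role of the first textual element (the identifier) and the returned `Word` plays the role of the second textual element that meets the assumed criteria.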
