
Computing system for extraction of textual elements from a document

Patent number
US11176364B2
Publication date
2021-11-16
Applicant
Hyland Software, Inc. (US OH Westlake)
Inventors
Ralph Meier; Thorsten Wanschura; Johannes Hausmann; Harry Urbschat
IPC classification
G06K9/00; G06K9/20; G06T7/70; G06K9/72; G06T7/50; G06K9/62
Technical field
textual, document, text, computer, readable, in, extraction, element, computing, documents
Region: Westlake, OH

Abstract

Described herein are various technologies pertaining to text extraction from a document. A computing device receives the document. The document comprises computer-readable text and a layout, wherein the layout defines positions of the computer-readable text within a two-dimensional area represented by the document. Responsive to receiving the document, the computing device identifies at least one textual element in the computer-readable text based upon spatial factors between portions of the computer-readable text and contextual relationships between the portions of the computer-readable text. The computing device then outputs the at least one textual element.

Description

In an embodiment, the computing device 100 may be in communication with a scanner (not shown). The scanner may generate the document image 116 by scanning a physical copy of a document.

The data store 114 also stores a document 118. The document 118 comprises computer-readable text (i.e., text that is searchable by the computing device 100) and a layout. The computer-readable text may include combinations of American Standard Code for Information Interchange (ASCII) characters and/or combinations of Unicode characters. For instance, the computer-readable text may include letters, numbers, punctuation, and/or mathematical symbols.

The layout defines positions of the computer-readable text within a two-dimensional area represented by the document 118. Thus, the document 118 has a length and a width. In a non-limiting example, the two-dimensional area may correspond to an A4 paper size, a letter paper size, or a legal paper size.
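As a rough illustration of a document that pairs computer-readable text with a layout, the sketch below models each piece of text as a positioned rectangle on a page of fixed width and height. It is not taken from the patent: the names `TextSpan`, `Document`, `new_document`, and `text_in_region`, the choice of points (1/72 inch) as the unit, and the top-left origin are all assumptions made for the example.

```python
from dataclasses import dataclass, field

# Page sizes in points (1/72 inch). A4 is rounded from 210 mm x 297 mm.
# The unit and the dictionary name are assumptions for this sketch.
PAGE_SIZES_PT = {
    "A4": (595.0, 842.0),
    "letter": (612.0, 792.0),   # 8.5 in x 11 in
    "legal": (612.0, 1008.0),   # 8.5 in x 14 in
}


@dataclass(frozen=True)
class TextSpan:
    """A piece of computer-readable text and its position in the layout."""
    text: str
    x: float       # left edge, measured from the left of the page
    y: float       # top edge, measured from the top of the page
    width: float
    height: float


@dataclass
class Document:
    """Hypothetical container: a two-dimensional area plus positioned text."""
    width: float
    height: float
    spans: list = field(default_factory=list)

    def add(self, span):
        # Reject text that would fall outside the two-dimensional area.
        if (span.x < 0 or span.y < 0
                or span.x + span.width > self.width
                or span.y + span.height > self.height):
            raise ValueError("span lies outside the page area")
        self.spans.append(span)

    def text_in_region(self, x0, y0, x1, y1):
        """Return spans fully inside the rectangle, top-to-bottom then left-to-right."""
        hits = [s for s in self.spans
                if s.x >= x0 and s.y >= y0
                and s.x + s.width <= x1 and s.y + s.height <= y1]
        return sorted(hits, key=lambda s: (s.y, s.x))


def new_document(size_name):
    """Create an empty document for one of the named paper sizes."""
    width, height = PAGE_SIZES_PT[size_name]
    return Document(width, height)
```

With this representation, a query such as `doc.text_in_region(0, 690, 595, 720)` returns the text found in one horizontal band of the page, which is the kind of positional lookup a layout makes possible.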

In an embodiment, the document 118 may be a tabular document such that the computer-readable text is arranged within one or more tables in the document 118. Thus, in the embodiment, the layout of the document 118 may define positions of the computer-readable text within the one or more tables.
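For the tabular case, one simple way to picture a layout that defines positions within a table is to give each piece of text a row and column index. The sketch below is only an illustration under that assumption; the `Cell` type, the `column_values` helper, and the convention that row 0 holds column headers are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    """Text at a given (row, col) position within a table (hypothetical type)."""
    row: int
    col: int
    text: str


def column_values(cells, header):
    """Return the texts below the column whose row-0 cell equals `header`.

    Assumes row 0 contains the column headers. Returns an empty list when
    no such header exists.
    """
    header_cols = [c.col for c in cells if c.row == 0 and c.text == header]
    if not header_cols:
        return []
    col = header_cols[0]
    below = sorted((c for c in cells if c.col == col and c.row > 0),
                   key=lambda c: c.row)
    return [c.text for c in below]
```

For example, given cells for a two-column table with headers "Item" and "Price", `column_values(cells, "Price")` yields the price texts in row order regardless of the order in which the cells were stored.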

Claims
