
Computing system for extraction of textual elements from a document

Patent number
US11176364B2
Publication date
2021-11-16
Applicant
Hyland Software, Inc. (US OH Westlake)
Inventors
Ralph Meier; Thorsten Wanschura; Johannes Hausmann; Harry Urbschat
IPC classification
G06K9/00; G06K9/20; G06T7/70; G06K9/72; G06T7/50; G06K9/62
Technical field
textual, document, text, computer, readable, in, extraction, element, computing, documents
Region: OH Westlake

Abstract

Described herein are various technologies pertaining to text extraction from a document. A computing device receives the document. The document comprises computer-readable text and a layout, wherein the layout defines positions of the computer-readable text within a two-dimensional area represented by the document. Responsive to receiving the document, the computing device identifies at least one textual element in the computer-readable text based upon spatial factors between portions of the computer-readable text and contextual relationships between the portions of the computer-readable text. The computing device then outputs the at least one textual element.

Description

In an example, the textual extraction application 106 may select the fifth textual element 402 and the sixth textual element 404 based upon an angle 412 between the fifth textual element 402, the sixth textual element 404, and an axis 410, despite the fact that the sixth textual element 404 is located at a different position than the second textual element shown in FIG. 3. In another example, the textual extraction application 106 may select the seventh textual element 406 and the eighth textual element 408 because the seventh textual element 406 and the eighth textual element 408 are within the distance range 328 described above. Notably, the textual extraction application 106 selects the seventh textual element 406 and the eighth textual element 408 despite the fact that positions of the seventh textual element 406 and the eighth textual element 408 are different from positions of the third textual element 324 and the fourth textual element 326, respectively, and despite the fact that the identifiers for the classes comprise four digits instead of three digits.
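
By way of illustration only, the two spatial tests described in the preceding paragraph may be sketched as follows. This is a minimal sketch under stated assumptions and not the claimed implementation: the names TextualElement, angle_to_axis, selected_by_angle, and selected_by_distance, the 10-degree tolerance, the numeric distance range, and the sample coordinates and text are all hypothetical and do not appear in the specification. The sketch is meant to show that both tests depend only on the relative geometry of two elements, so a pair may still be selected when it appears at different absolute positions in another document.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TextualElement:
    """Hypothetical container: a piece of text and its position in the layout."""
    text: str
    x: float
    y: float


def angle_to_axis(a, b, axis_deg=0.0):
    """Angle in degrees between the line a->b and an axis, folded into [0, 90]."""
    raw = math.degrees(math.atan2(b.y - a.y, b.x - a.x)) - axis_deg
    folded = abs(raw) % 180.0
    return min(folded, 180.0 - folded)


def selected_by_angle(a, b, max_angle_deg=10.0, axis_deg=0.0):
    """Angle test: the pair is selected if it is nearly aligned with the axis.

    The 10-degree tolerance is an assumed value chosen for illustration.
    """
    return angle_to_axis(a, b, axis_deg) <= max_angle_deg


def selected_by_distance(a, b, distance_range=(10.0, 60.0)):
    """Distance test: the pair is selected if its separation lies in the range.

    The range bounds are assumed values chosen for illustration.
    """
    low, high = distance_range
    return low <= math.hypot(b.x - a.x, b.y - a.y) <= high


# Made-up sample data: the same label/value pair at two different absolute
# positions, standing in for two documents with different layouts.
label_a = TextualElement("Name:", 100.0, 50.0)
value_a = TextualElement("Jane Doe", 300.0, 52.0)
label_b = TextualElement("Name:", 40.0, 400.0)
value_b = TextualElement("Jane Doe", 260.0, 395.0)

print(selected_by_angle(label_a, value_a))  # first layout
print(selected_by_angle(label_b, value_b))  # second layout, shifted on the page
```

Because neither function inspects the text itself, a change in the number of digits of an identifier would not affect the outcome of either test in this sketch.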

Claims
