
Computing system for extraction of textual elements from a document

Patent number
US11176364B2
Publication date
2021-11-16
Applicant
Hyland Software, Inc. (US OH Westlake)
Inventors
Ralph Meier; Thorsten Wanschura; Johannes Hausmann; Harry Urbschat
IPC classification
G06K9/00; G06K9/20; G06T7/70; G06K9/72; G06T7/50; G06K9/62
Technical field
textual, document, text, computer, readable, in, extraction, element, computing, documents
Region: OH Westlake

Abstract

Described herein are various technologies pertaining to text extraction from a document. A computing device receives the document. The document comprises computer-readable text and a layout, wherein the layout defines positions of the computer-readable text within a two-dimensional area represented by the document. Responsive to receiving the document, the computing device identifies at least one textual element in the computer-readable text based upon spatial factors between portions of the computer-readable text and contextual relationships between the portions of the computer-readable text. The computing device then outputs the at least one textual element.

Description

In an embodiment, the computing device 100 (or another computing device) may generate the document 118 from the document image 116 by applying an optical character recognition (OCR) process to the document image 116. Exemplary file formats for the document 118 include, but are not limited to, a searchable PDF and/or a document format, such as .doc or .docx.

Although the data store 114 has been depicted and described as storing a single document (the document 118), it is to be understood that the data store 114 may store many different documents having varying areas, layouts, computer-readable text, fonts, font sizes, and/or typographical emphasis. Moreover, the many different documents may be of different defined types.
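To make the idea of a document with computer-readable text and a layout more concrete, the following is a minimal sketch, not the patented method. It assumes each word carries an (x, y) position on the page, uses a label match as a stand-in for a contextual relationship, and uses "on the same row, to the right of the label" as a stand-in for a spatial factor. The names Word, extract_value, and line_tolerance are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Word:
    """One piece of computer-readable text plus its position in the layout."""
    text: str
    x: float  # horizontal position of the word on the page
    y: float  # vertical position of the word on the page


def extract_value(words: List[Word], label: str,
                  line_tolerance: float = 5.0) -> Optional[str]:
    """Return the word closest to the right of `label` on the same row.

    Contextual cue (assumed): a word whose text matches the label.
    Spatial cue (assumed): same row within `line_tolerance`, to the right.
    """
    wanted = label.lower()
    anchor = next(
        (w for w in words if w.text.lower().rstrip(":") == wanted), None
    )
    if anchor is None:
        return None

    # Keep only words on the anchor's row that lie to its right.
    candidates = [
        w for w in words
        if w is not anchor
        and abs(w.y - anchor.y) <= line_tolerance
        and w.x > anchor.x
    ]
    if not candidates:
        return None

    # The nearest candidate is treated as the extracted textual element.
    return min(candidates, key=lambda w: w.x - anchor.x).text


words = [
    Word("Invoice", 10, 10),
    Word("Date:", 10, 50),
    Word("2021-11-16", 70, 50),
    Word("Total:", 10, 100),
    Word("42.00", 80, 100),
]
print(extract_value(words, "Total"))
print(extract_value(words, "Date"))
```

A real system would weigh many more spatial and contextual signals. This sketch only shows how position data and text content can be combined to select a single element.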

Claims

1