
Computing system for extraction of textual elements from a document

Patent number
US11176364B2
Publication date
2021-11-16
Applicant
Hyland Software, Inc. (US OH Westlake)
Inventors
Ralph Meier; Thorsten Wanschura; Johannes Hausmann; Harry Urbschat
IPC classification
G06K9/00; G06K9/20; G06T7/70; G06K9/72; G06T7/50; G06K9/62
Technical field
textual, document, text, computer, readable, in, extraction, element, computing, documents
Region: Westlake, OH

Abstract

Described herein are various technologies pertaining to text extraction from a document. A computing device receives the document. The document comprises computer-readable text and a layout, wherein the layout defines positions of the computer-readable text within a two-dimensional area represented by the document. Responsive to receiving the document, the computing device identifies at least one textual element in the computer-readable text based upon spatial factors between portions of the computer-readable text and contextual relationships between the portions of the computer-readable text. The computing device then outputs the at least one textual element.
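
To make the idea of combining spatial factors with contextual relationships more concrete, here is a minimal sketch in Python. It is not the method claimed in the patent. It assumes the document has already been reduced to word tokens with page coordinates, and it uses a deliberately simple rule: find a label word (the contextual cue) and return the closest token that looks like a monetary amount (the spatial cue). The names `Token`, `AMOUNT`, and `extract_near_label` are hypothetical and introduced only for illustration.

```python
import math
import re
from dataclasses import dataclass


@dataclass
class Token:
    """One word of computer-readable text plus its position in the layout."""
    text: str
    x: float  # horizontal position of the token's left edge
    y: float  # vertical position of the token's top edge


# Hypothetical pattern for an amount such as "99.00" or "$1,250.00".
AMOUNT = re.compile(r"\$?\d[\d,]*\.\d{2}")


def extract_near_label(tokens, label, pattern=AMOUNT):
    """Return the text of the pattern-matching token nearest to a label token.

    Context: only tokens matching `pattern` are candidates, and only tokens
    whose text equals `label` (ignoring case and a trailing colon) are anchors.
    Space: among all anchor/candidate pairs, the smallest Euclidean distance wins.
    Returns None when no anchor or no candidate exists.
    """
    wanted = label.lower()
    anchors = [t for t in tokens if t.text.lower().rstrip(":") == wanted]
    candidates = [t for t in tokens if pattern.fullmatch(t.text)]
    if not anchors or not candidates:
        return None
    _, best = min(
        ((math.hypot(c.x - a.x, c.y - a.y), c) for a in anchors for c in candidates),
        key=lambda pair: pair[0],
    )
    return best.text


sample_tokens = [
    Token("Invoice", 50, 50),
    Token("Subtotal:", 300, 700),
    Token("90.00", 400, 700),
    Token("Total:", 300, 730),
    Token("99.00", 400, 730),
]
result = extract_near_label(sample_tokens, "total")
```

A real system would likely weigh direction (for example, preferring values to the right of or below a label) and use richer context than a single keyword; plain distance is used here only to keep the sketch short.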

Description

The document 118 may have a defined type that indicates the purpose of the document 118. Documents with the same purpose tend to have similar relevant content, often in similar spatial formats. In an example, the defined type may be an educational transcript that conveys information relating to grades received by a student for classes completed by the student. In another example, the defined type may be a taxation form that includes financial information of an entity that is used in determining taxes incurred by the entity. In yet another example, the defined type may be an invoice for goods or services. In a further example, the defined type may be a medical record. In an additional example, the defined type may be a personnel record. Other defined types may include human resource related documents, financial documents, such as documents related to insurance and mortgages, business cards, identification documents, such as drivers' licenses or visa documents, ballot papers, trade documents, bills of lading, and/or bank statements.

In such examples, there will be similar context and text, such as standalone capital letters (e.g., A, B, C, D, or F), optionally with a plus or minus sign, for a transcript, and numerical values near a side or bottom of the document for an invoice or tax form. While these documents share similarities, they also vary substantially, e.g., in where and how the content is presented. These variations make errors in automatic reading and extraction of relevant information from such documents by a computing device a significant problem, which the features disclosed herein address.
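
The two cues mentioned above (standalone grade letters on a transcript, numerical values near the bottom of an invoice or tax form) can be sketched as simple filters. This is an illustrative assumption about how such cues might be detected, not the approach disclosed in the patent. The names `GRADE`, `NUMBER`, `grade_tokens`, and `numbers_near_bottom` are hypothetical, as is the choice of treating the lowest quarter of the page as "near the bottom".

```python
import re

# Hypothetical patterns: a lone grade letter with optional sign, and a plain number.
GRADE = re.compile(r"[ABCDF][+-]?")
NUMBER = re.compile(r"\$?\d[\d,]*(\.\d+)?")


def grade_tokens(words):
    """Return the words that look like standalone letter grades (A, B+, F, ...)."""
    return [w for w in words if GRADE.fullmatch(w)]


def numbers_near_bottom(positioned_words, page_height, fraction=0.25):
    """Return numeric words whose y position lies in the lowest `fraction` of the page.

    `positioned_words` is a list of (text, y) pairs, with y measured downward
    from the top of the page.
    """
    cutoff = page_height * (1 - fraction)
    return [text for text, y in positioned_words
            if y >= cutoff and NUMBER.fullmatch(text)]


transcript_words = "Algebra A- History B+ Art F Gym E".split()
grades = grade_tokens(transcript_words)

invoice_words = [("2021", 40), ("3", 500), ("Total", 900), ("1,250.00", 900)]
bottom_numbers = numbers_near_bottom(invoice_words, page_height=1000)
```

Filters like these illustrate why variation is a problem: a capital "A" used as an article in running text also matches the grade pattern, and an invoice whose total sits mid-page is missed by the bottom-of-page rule.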

Claims
