Computing system for extraction of textual elements from a document

Patent number
US11176364B2
Publication date
2021-11-16
Applicant
Hyland Software, Inc. (US OH Westlake)
Inventors
Ralph Meier; Thorsten Wanschura; Johannes Hausmann; Harry Urbschat
IPC classification
G06K9/00; G06K9/20; G06T7/70; G06K9/72; G06T7/50; G06K9/62
Technical field
textual,document,text,computer,readable,in,extraction,element,computing,documents
Region: Westlake, OH

Abstract

Described herein are various technologies pertaining to text extraction from a document. A computing device receives the document. The document comprises computer-readable text and a layout, wherein the layout defines positions of the computer-readable text within a two-dimensional area represented by the document. Responsive to receiving the document, the computing device identifies at least one textual element in the computer-readable text based upon spatial factors between portions of the computer-readable text and contextual relationships between the portions of the computer-readable text. The computing device then outputs the at least one textual element.

Description

Turning now to FIG. 8, a methodology 800 executed by a computing device for extracting textual elements from computer-readable text of a document is illustrated. The methodology 800 begins at 802, and at 804, the computing device receives a document comprising computer-readable text and a layout. The layout defines positions of the computer-readable text within a two-dimensional area represented by the document.
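The receiving step at 804 can be pictured with a minimal Python sketch of a document that pairs text with a layout. The names TextPortion, Document, and receive_document, the rectangle-based coordinates, and the bounds check are all assumptions made for illustration; the description does not specify a data format.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class TextPortion:
    """One portion of computer-readable text plus its position in the layout."""
    text: str
    x: float       # left edge, in page units (assumed convention)
    y: float       # top edge, measured down from the top of the page (assumed)
    width: float
    height: float


@dataclass
class Document:
    """Text portions positioned within a two-dimensional area."""
    page_width: float
    page_height: float
    portions: List[TextPortion]

    def in_bounds(self, p: TextPortion) -> bool:
        # A portion is valid only if its rectangle lies inside the page area.
        return (p.x >= 0 and p.y >= 0
                and p.x + p.width <= self.page_width
                and p.y + p.height <= self.page_height)


def receive_document(
    page_width: float,
    page_height: float,
    raw_portions: Sequence[Tuple[str, float, float, float, float]],
) -> Document:
    """Hypothetical stand-in for step 804: build a Document from raw tuples."""
    doc = Document(page_width, page_height,
                   [TextPortion(*raw) for raw in raw_portions])
    outside = [p.text for p in doc.portions if not doc.in_bounds(p)]
    if outside:
        raise ValueError(f"portions outside the page area: {outside}")
    return doc
```

Here each tuple is (text, x, y, width, height), so receive_document(612, 792, [("Invoice No.", 50, 40, 80, 12)]) would yield a one-portion document on a letter-sized page measured in points.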

At 806, the computing device identifies at least one textual element in the computer-readable text of the document based upon spatial factors between portions of the computer-readable text and contextual relationships between the portions of the computer-readable text. To do so, the computing device provides the computer-readable text and the positions of the computer-readable text within the document as input to at least one computer-implemented model. The at least one computer-implemented model outputs, based upon the input, a plurality of textual elements within the computer-readable text and scores assigned to the textual elements. The at least one textual element is included in the plurality of textual elements. The computing device identifies the at least one textual element based on a score in the scores. The score is indicative of a likelihood that the at least one textual element represents relevant content in the document based upon defined criteria for a defined type of the document.
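The score-based identification at 806 can be illustrated with a small Python sketch. It is not the computer-implemented model of the description: the hand-written scoring rule, the exponential distance decay, the regular-expression check, the threshold, and every name below (Portion, spatial_factor, contextual_factor, score_candidates, identify) are assumptions chosen only to show how a spatial factor and a contextual relationship might combine into a score.

```python
import math
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Portion:
    text: str
    x: float  # horizontal position of the portion (assumed page units)
    y: float  # vertical position of the portion (assumed page units)


def spatial_factor(label: Portion, candidate: Portion, scale: float = 100.0) -> float:
    """Closer to the label gives a value nearer 1.0 (assumed exponential decay)."""
    distance = math.hypot(candidate.x - label.x, candidate.y - label.y)
    return math.exp(-distance / scale)


def contextual_factor(candidate: Portion, pattern: str) -> float:
    """1.0 if the text has the shape expected for the field, else 0.0."""
    return 1.0 if re.fullmatch(pattern, candidate.text) else 0.0


def score_candidates(
    portions: List[Portion], label_text: str, pattern: str
) -> List[Tuple[Portion, float]]:
    """Return (portion, score) pairs, highest score first."""
    labels = [p for p in portions if p.text == label_text]
    if not labels:
        return []
    label = labels[0]
    scored = [
        (p, spatial_factor(label, p) * contextual_factor(p, pattern))
        for p in portions
        if p is not label
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def identify(
    portions: List[Portion], label_text: str, pattern: str, threshold: float = 0.2
) -> Optional[Portion]:
    """Pick the top-scoring portion if its score clears the threshold."""
    scored = score_candidates(portions, label_text, pattern)
    if scored and scored[0][1] >= threshold:
        return scored[0][0]
    return None
```

In this sketch a five-digit string next to an "Invoice No." label outscores an identical-looking string at the far end of the page, because the spatial factor shrinks with distance while the contextual factor is the same for both. A trained model would replace the hand-written rule.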

At 808, responsive to identifying the at least one textual element in the computer-readable text, the computing device outputs the at least one textual element. The methodology 800 concludes at 810.

Claims
