
Handwriting detector, extractor, and language classifier

Patent number
US11176361B2
Publication date
2021-11-16
Applicant
Raytheon Company (Waltham, MA, US)
Inventors
Darrell L. Young; Kevin C. Holley
IPC classification
G06F40/171; G06F40/263; G06K9/00; G06K9/34; G06K9/38; G06K9/62; G06K9/68; G06K9/72
Technical field
language, may, or, in, bounding, be, hardware, features, geometric, image
Region: Waltham, MA

Abstract

Disclosed are methods for handwriting recognition. In some aspects, an image representing a page of a sample document is analyzed to identify a region having indications of handwriting. The region is analyzed to determine frequencies of a plurality of geometric features within the region. The frequencies may be compared to profiles or histograms of known language types, to determine if there are similarities between the frequencies in the sample document relative to those of the known language types. In some aspects, machine learning may be used to characterize the document as a particular language type based on the frequencies of the geometric features.
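
The comparison of feature frequencies against known language profiles can be illustrated with a minimal sketch. It assumes the geometric features have already been extracted and reduced to discrete labels, and it uses histogram intersection as the similarity measure, which is one common choice and is not stated in the text above. The feature labels, the profile values, and all function names are hypothetical.

```python
from collections import Counter


def normalized_histogram(features):
    """Turn a list of feature labels into relative frequencies."""
    counts = Counter(features)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: n / total for label, n in counts.items()}


def histogram_intersection(h1, h2):
    """Similarity in [0, 1]: sum of bin-wise minima of two normalized histograms."""
    return sum(min(h1.get(k, 0.0), h2.get(k, 0.0)) for k in set(h1) | set(h2))


def closest_language(sample_features, profiles):
    """Score the sample against every profile and return (best_label, all_scores)."""
    sample = normalized_histogram(sample_features)
    scores = {
        lang: histogram_intersection(sample, profile)
        for lang, profile in profiles.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical profiles: relative frequency of each feature per language type.
PROFILES = {
    "language_a": {"loop": 0.5, "vertical_stroke": 0.3, "dot": 0.2},
    "language_b": {"loop": 0.1, "vertical_stroke": 0.2, "dot": 0.7},
}

# Hypothetical features detected in one handwriting region.
sample = ["loop", "loop", "vertical_stroke", "dot", "loop"]
best, scores = closest_language(sample, PROFILES)
```

A fixed similarity measure like this is only a baseline; where machine learning is used, as the abstract mentions, a trained classifier would take the same frequency vectors as input.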

Description

FIG. 3 illustrates separation of algorithmic infrastructure from storage infrastructure. FIG. 3 shows that images 302 of documents may be stored in a data store 304. In some aspects, the data store 304 may utilize a “Mongo” database. A document processing component 306 reads the images 302 from the data store 304 to detect handwriting on the documents. The document processing component 306 may utilize a variety of technologies to perform this task. For example, the processing may be implemented in a variety of programming languages, including Java, C++, Python, Perl, or other languages known in the art. In some aspects, the document processing component 306 may rely on MATLAB® functions.
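
The separation of storage from processing can be sketched as a processing function that depends only on a narrow storage interface. This is an illustration under stated assumptions, not the arrangement of FIG. 3 itself: the `ImageStore` interface, the in-memory store, and the placeholder detector are all hypothetical, and a real deployment would back the interface with a database such as the one mentioned above.

```python
from typing import Callable, Dict, Iterable, Protocol, Tuple


class ImageStore(Protocol):
    """Hypothetical storage interface: yields (document id, raw image bytes)."""

    def iter_images(self) -> Iterable[Tuple[str, bytes]]: ...


class InMemoryImageStore:
    """Stand-in for a database-backed store, used here only for illustration."""

    def __init__(self, images: Dict[str, bytes]) -> None:
        self._images = dict(images)

    def iter_images(self) -> Iterable[Tuple[str, bytes]]:
        return iter(sorted(self._images.items()))


def process_documents(
    store: ImageStore, detect: Callable[[bytes], bool]
) -> Dict[str, bool]:
    """Run a detector over every stored image without knowing how it is stored."""
    return {doc_id: detect(data) for doc_id, data in store.iter_images()}


def placeholder_detector(data: bytes) -> bool:
    # Assumption: a trivial stand-in that flags any non-empty image.
    # A real handwriting detector would analyze the decoded pixels.
    return len(data) > 0


store = InMemoryImageStore({"page-1": b"\x00\x01", "page-2": b""})
results = process_documents(store, placeholder_detector)
```

Because `process_documents` sees only the interface, the storage backend and the detector can be replaced or scaled independently.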

By separating the processing infrastructure from the document storage infrastructure, as illustrated by FIG. 3, the infrastructure is able to scale to the available resources required to handle millions of documents in an automatic workflow. This arrangement also allows users to direct processing and annotate its results. Users can point the system to collections of scanned images and route the processed result to the appropriate language specialist. Users can annotate the machine learning results as being incorrect or missing. The annotations may be used for further analysis such as algorithm refinement.

One goal of binarization is to convert the input document so that the foreground, which includes the handwriting, is logical true. This seemingly simple procedure proves to be a difficult task due to variations in illumination, the condition of the paper, and other factors such as variations in the ink. The success of the later stages of handwriting recognition and language classification, however, depends on a good binarization.
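
As a minimal sketch of what such a binarization might look like, the example below applies a single global threshold chosen by Otsu's method. The passage above does not name a thresholding technique, so Otsu's method is an assumption made for illustration. The sketch also assumes 8-bit grayscale input with dark ink on lighter paper, and the function names are hypothetical.

```python
def otsu_threshold(pixels):
    """Return the threshold t in 0..255 that maximizes between-class variance.

    Pixels with value <= t form one class and the rest form the other.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels with value <= t
    sum0 = 0  # sum of the values of those pixels
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue  # no pixels in the dark class yet
        w1 = total - w0
        if w1 == 0:
            break  # every pixel is already in the dark class
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        var = w0 * w1 * (mean0 - mean1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def binarize(gray):
    """Map a 2-D grayscale image (rows of ints) to booleans; True marks foreground."""
    flat = [p for row in gray for p in row]
    t = otsu_threshold(flat)
    # Assumption: ink is darker than paper, so values at or below t are foreground.
    return [[p <= t for p in row] for row in gray]


# Tiny synthetic page: bright paper with a dark diagonal stroke.
gray = [
    [250, 245, 30],
    [240, 20, 235],
    [25, 250, 245],
]
mask = binarize(gray)
```

A single global threshold is sensitive to exactly the factors listed above, such as uneven illumination and paper condition; locally adaptive thresholds are a common way to reduce that sensitivity.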

Claims

1