Research on automatic filing system of detection report based on template matching and OCR recognition

Information Technology and Network Security

Zhang Chen1, Chen Yang2

(1. Guangdong Construction Engineering Quality and Safety Inspection Station Co., Ltd., Guangzhou 510500, China; 2. Guangdong Building Research Institute Group Co., Ltd., Guangzhou 510500, China)

Abstract: To address the large volume of inspection reports and the low efficiency of manual filing in the construction inspection industry, a highly robust OCR system for recognizing the numeric symbols in report files is built using a template matching algorithm and the LeNet framework. Because the position and size of the region of interest (ROI) in a report are not fixed, a template matching localization algorithm from the field of machine vision is used to locate the ROI in each report file. By combining the LeNet network with the template matching localization algorithm, traditional machine vision methods are integrated with artificial intelligence methods to build an automatic filing system for inspection reports. The system achieves a correct filing rate of 95.8%, effectively saving labor and time costs.

Key words: template matching; OCR recognition; automatic filing

CLC number: TP274
Document code: A
DOI: 10.19358/j.issn.2096-5133.2021.08.014
Citation: Zhang Chen, Chen Yang. Research on automatic filing system of detection report based on template matching and OCR recognition[J]. Information Technology and Network Security, 2021, 40(8): 84-89.

0 Introduction

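Optical character recognition (OCR) is the process of analyzing and recognizing image files of text documents to obtain their text and layout information; that is, the text in an image is recognized and returned in text form. It has broad market prospects in document archiving. After years of development, OCR network architectures such as LeNet[1], RRPN[2], DMPNet[3], and CTPN[4] have been proposed. Among them, CTPN is currently one of the most widely used text detection models. Its basic assumption is that a single character is easier to detect than a text line, which is far more heterogeneous. It therefore first detects individual characters in a manner similar to R-CNN and adds a bidirectional LSTM[5] to the detection network, so that the detections form a sequence that provides the contextual features of the text; multiple characters can then be merged into a text line. LeNet was proposed relatively early and has long been applied to handwritten character recognition on bank bills. These network architectures can effectively recognize characters in general scenes and can also detect non-vertical text.

In most OCR application scenarios, it is not necessary to recognize every character in an image; often only the characters in certain ROIs need to be detected. However, OCR techniques adapt poorly to translation and rotation of the ROI, so a separate network must be trained to locate it. Machine vision is widely used in manufacturing, with many mature algorithms for workpiece positioning, visual measurement, and related tasks. Among these, template matching algorithms target industrial positioning[6-7]; variants based on grayscale[8], edges[9], and the transform domain[10] have been proposed and can meet a wide range of industrial positioning needs[11-15].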
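To make the idea of locating an ROI by template matching concrete, the following is a minimal sketch of grayscale matching using zero-mean normalized cross-correlation, one common grayscale-based criterion. It is not taken from the paper: the function names `match_template_ncc` and `crop_roi`, the toy page, and the fixed anchor-to-ROI offset are all assumptions made for illustration, and the search handles translation only (no rotation or scaling).

```python
from math import sqrt


def match_template_ncc(image, template):
    """Return (row, col, score) of the best zero-mean NCC match.

    image and template are 2D lists of grayscale values. Hypothetical
    helper for illustration; translation-only search.
    """
    ih, iw = len(image), len(image[0])
    th, tw = len(template), len(template[0])
    n = th * tw

    # Pre-compute the zero-mean template and its norm once.
    t_mean = sum(sum(row) for row in template) / n
    t_zero = [[v - t_mean for v in row] for row in template]
    t_norm = sqrt(sum(v * v for row in t_zero for v in row))

    best = (0, 0, -1.0)
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            window = [image[r + i][c:c + tw] for i in range(th)]
            w_mean = sum(sum(row) for row in window) / n
            num = 0.0
            w_sq = 0.0
            for i in range(th):
                for j in range(tw):
                    w = window[i][j] - w_mean
                    num += w * t_zero[i][j]
                    w_sq += w * w
            denom = sqrt(w_sq) * t_norm
            # A flat window (or flat template) has no defined correlation.
            score = num / denom if denom > 0 else 0.0
            if score > best[2]:
                best = (r, c, score)
    return best


def crop_roi(image, anchor, offset, size):
    """Crop an ROI at an assumed fixed offset from the matched anchor."""
    top, left = anchor[0] + offset[0], anchor[1] + offset[1]
    return [row[left:left + size[1]] for row in image[top:top + size[0]]]


# Toy 6x8 "page": a distinctive 2x2 mark at (row 2, col 3), with the
# hypothetical ROI sitting immediately to its right.
page = [[0] * 8 for _ in range(6)]
page[2][3], page[2][4] = 9, 1
page[3][3], page[3][4] = 1, 9
page[2][5], page[2][6] = 5, 6
page[3][5], page[3][6] = 7, 8
mark = [[9, 1], [1, 9]]

row, col, score = match_template_ncc(page, mark)
roi = crop_roi(page, (row, col), offset=(0, 2), size=(2, 2))
```

Subtracting the means and dividing by the norms makes the score insensitive to uniform brightness and contrast changes between scans. The cropped `roi` is the part that would then be passed to a character classifier such as LeNet. A production system would more likely rely on an optimized library routine than on this brute-force loop.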

The full text of this article can be downloaded at: http://www.chinaaet.com/resource/share/2000003731


Original content statement: This content is original to the AET website and may not be reproduced without authorization.
