Comparative analysis of selected programs for optical text recognition

Edyta Łukasik

e.lukasik@pollub.pl
Institute of Computer Science, Lublin University of Technology, Nadbystrzycka 36B, 20-618 Lublin, Poland

Tomasz Zientarski


Institute of Computer Science, Lublin University of Technology, Nadbystrzycka 36B, 20-618 Lublin, Poland

Abstract

The aim of this article is to compare three programs for optical text recognition. The problem of optical text recognition is defined, and the functionality of the technology is briefly described. The most important programs that address this problem are also characterized. The selected programs were tested on two samples of machine-written text in Polish. For each program, the speed of the recognition process was measured, and the correctness of character and word recognition in the analyzed text was determined.
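The abstract does not state how recognition correctness was computed. As a minimal sketch, assuming correctness is taken as the share of reference characters (or words) that can be aligned with the OCR output, the two measures could be computed as below. The function names `char_accuracy` and `word_accuracy` are hypothetical, and the alignment relies on Python's standard `difflib.SequenceMatcher`, which is not necessarily the procedure the authors used.

```python
from difflib import SequenceMatcher


def _matched_fraction(reference, recognized) -> float:
    """Fraction of reference items that align with the recognized sequence."""
    if not reference:
        # An empty reference is fully matched only by an empty output.
        return 1.0 if not recognized else 0.0
    # autojunk=False keeps frequent characters (e.g. spaces) in the alignment.
    matcher = SequenceMatcher(None, reference, recognized, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(reference)


def char_accuracy(reference: str, recognized: str) -> float:
    """Character-level correctness (assumed metric, not taken from the article)."""
    return _matched_fraction(reference, recognized)


def word_accuracy(reference: str, recognized: str) -> float:
    """Word-level correctness on whitespace-separated tokens (assumed metric)."""
    return _matched_fraction(reference.split(), recognized.split())
```

With a three-word Polish reference in which the OCR output misreads a single diacritic, `word_accuracy` counts two of the three words as correct, while `char_accuracy` penalizes only the one wrong character. This shows why the two measures are worth reporting separately.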


Keywords:

Optical Character Recognition; OCR; Tesseract; Ocrad; GOCR

Published
2018-09-30

Łukasik, E., & Zientarski, T. (2018). Comparative analysis of selected programs for optical text recognition. Journal of Computer Sciences Institute, 7, 191–194. https://doi.org/10.35784/jcsi.676
