Abstract. This paper addresses the segmentation of images of text fragments when constraints on the relative position of the elements are known. We consider the model in which these constraints form a path graph. We show that in this case the segmentation problem can be solved exactly by a dynamic programming algorithm, and that this algorithm has optimal asymptotic complexity. The algorithm was built into two recognition systems. The first system recognizes identity documents, such as passports and driver's licenses, and uses the proposed algorithm to extract information fields. For this purpose a two-level field hierarchy was introduced: fields are grouped into lines, the fields within a line are ordered from left to right, and the lines themselves are ordered from top to bottom. The second system recognizes license plates and uses the proposed algorithm to segment a plate into individual characters, with the characters in their natural left-to-right order. These two applications demonstrate the generality of the proposed approach. Experiments on a closed data set measured the quality of the resulting solutions and their performance on a mobile phone. The results show that the solutions are superior in quality to algorithms that do not use constraints on the relative position of elements, and that their computational cost allows them to run on mobile devices in real time.

Keywords: text segmentation, dynamic programming, document recognition, image processing, OCR.

PP. 66-78. DOI 10.14357/20718632190306
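The abstract describes a model in which the position constraints form a path graph, but it does not give the algorithm itself. The following is therefore only a minimal sketch of a generic chain-structured dynamic program, not the authors' implementation. It assumes a one-dimensional axis of candidate positions, a per-element cost table, and a gap constraint between consecutive elements. The names `segment_chain`, `unary`, `min_gap`, and `max_gap` are hypothetical. The inner loop is deliberately naive, so the sketch makes no claim to the optimal complexity stated in the abstract.

```python
from typing import List, Sequence, Tuple

INF = float("inf")


def segment_chain(
    unary: Sequence[Sequence[float]], min_gap: int, max_gap: int
) -> Tuple[float, List[int]]:
    """Place len(unary) ordered elements on positions 0..m-1 (illustrative only).

    unary[i][x] is the assumed cost of putting element i at position x.
    Consecutive elements must satisfy min_gap <= x[i+1] - x[i] <= max_gap,
    which is one simple way to encode a path-graph constraint.
    Returns (total cost, positions); positions is empty if no placement exists.
    """
    n, m = len(unary), len(unary[0])
    best = [list(unary[0])]          # best[i][x]: min cost of elements 0..i, with i at x
    back: List[List[int]] = []       # back[i-1][x]: position of element i-1 on that path

    for i in range(1, n):
        prev = best[-1]
        row = [INF] * m
        arg = [-1] * m
        for x in range(m):
            best_prev, best_p = INF, -1
            # Naive scan over the admissible predecessors of position x.
            for p in range(max(0, x - max_gap), x - min_gap + 1):
                if prev[p] < best_prev:
                    best_prev, best_p = prev[p], p
            if best_prev < INF:
                row[x] = best_prev + unary[i][x]
                arg[x] = best_p
        best.append(row)
        back.append(arg)

    # Pick the cheapest end position, then follow the back-pointers.
    end = min(range(m), key=lambda x: best[-1][x])
    if best[-1][end] == INF:
        return INF, []
    positions = [end]
    for i in range(n - 1, 0, -1):
        end = back[i - 1][end]
        positions.append(end)
    positions.reverse()
    return best[-1][positions[-1]], positions


# Toy costs: element 1 is equally cheap at positions 0 and 3,
# but position 0 would place it to the left of element 0.
costs = [
    [5, 0, 5, 5, 5, 5],
    [0, 5, 5, 0, 5, 5],
    [5, 5, 5, 5, 5, 0],
]
total, placed = segment_chain(costs, min_gap=1, max_gap=3)
# Choosing each element independently ignores the ordering constraint.
independent = [min(range(len(row)), key=row.__getitem__) for row in costs]
```

In this toy input the independent choice puts element 1 at position 0, to the left of element 0, while the constrained program keeps the left-to-right order. This illustrates the kind of error that position constraints rule out. The scan over predecessors costs time proportional to `max_gap - min_gap + 1` per position; replacing it with a sliding-window minimum is one common way to remove that factor.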