INTELLIGENT SYSTEMS AND TECHNOLOGIES
A. V. Gayer, "Computationally Efficient Detection of the Region of Interest of a Russian Passport in an Image"
Abstract.

The paper addresses the localization of a passport of a citizen of the Russian Federation in photographs where the document occupies only a small part of the frame. The problem is especially relevant for remote verification systems that require the user to upload a selfie with the passport. At a small scale the document is captured at lower resolution, which makes it harder both to localize and to recognize. To improve localization accuracy, an ultra-lightweight neural network model, YOLO-Passport, is proposed for localizing the passport region, which reduces the task to the case of a fixed document scale. Compared with compact YOLO detectors, YOLO-Passport requires an order of magnitude fewer operations and parameters. The proposed approach raised the recall of Russian passport detection from 91.6% to 97.4%. The model runs in 3 ms on a CPU, and its size in 8-bit format is only 340 KB, which makes it well suited to industrial systems and WASM-based web applications.
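
The abstract does not describe how the detected region is converted into a fixed-scale input, so the following is only a minimal sketch of one plausible way to do it. It assumes a detector has already returned an axis-aligned box; the `Box` type, the helper names, the relative margin, and the target side length are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned box in pixel coordinates (hypothetical helper type)."""
    x0: float
    y0: float
    x1: float
    y1: float

    @property
    def width(self) -> float:
        return self.x1 - self.x0

    @property
    def height(self) -> float:
        return self.y1 - self.y0


def expand_and_clip(box: Box, margin: float, img_w: int, img_h: int) -> Box:
    """Grow the detected box by a relative margin, then clip it to the image."""
    dx, dy = box.width * margin, box.height * margin
    return Box(
        max(0.0, box.x0 - dx),
        max(0.0, box.y0 - dy),
        min(float(img_w), box.x1 + dx),
        min(float(img_h), box.y1 + dy),
    )


def scale_to_fixed_size(crop: Box, target_long_side: int) -> float:
    """Scale factor that brings the longer side of the crop to a fixed length."""
    return target_long_side / max(crop.width, crop.height)


# Assumed example: a 4000x3000 selfie in which the detector found a small box.
detected = Box(1800, 1400, 2200, 1700)  # 400x300 px region of interest
crop = expand_and_clip(detected, margin=0.25, img_w=4000, img_h=3000)
factor = scale_to_fixed_size(crop, target_long_side=480)
print(crop, factor)
```

Under these assumptions, every crop handed to the downstream localization step has the same long-side length regardless of how small the passport was in the original photo, which is what reducing the task to a fixed document scale would amount to in practice.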

Keywords:

document recognition, deep learning, object detection, YOLO.

DOI 10.14357/20718632250301 

EDN AEBPRY

Pp. 3-12.
