Abstract.
This article discusses principles of data storage organization designed to maintain strict consistency between data replicas in distributed databases. Distributed storage with strict consistency is needed to meet the challenges of digitalizing an economy whose actors and decision-making centers are geographically dispersed. Typical representatives of the recently emerged NewSQL class of DBMSs are examined, and conclusions are drawn about which data structures they use and why. The main problems of data organization are highlighted, and conclusions are drawn about the prospects of distributed databases with strict consistency. Key-value stores are singled out as a universal mechanism for storing distributed data. Assumptions are made about why such stores should be used and which mechanisms, such as hashing, can help organize distributed storage across a large number of nodes (up to hundreds of thousands) in a distributed database.
Keywords:
organization of data storage, strict consistency, distributed databases, NewSQL, key-value stores, hashing.
DOI 10.14357/20718632250210
EDN BFSUXU
PP. 113-122.