|
Abstract.
This work investigates how the number, complexity, and diversity of training views affect the quality of single-image 3D object reconstruction. Experiments are conducted on the 3D-R2N2 and DISN datasets with object renders from ShapeNet. Training views are divided into easy (fixed camera-object distance, limited range of viewing angles) and hard (variable distance, wide range of angles). The AutoSDF model is used; it generates objects as truncated signed distance fields. For easy views, six images per object are sufficient for satisfactory reconstruction quality, and increasing the count to 12 or 24 views yields comparable results. Hard views make the model more robust to variability in the input data and allow it to correct geometric distortions in input images, but they require a larger sample: at least 24 views per object. The findings yield recommendations for composing training sets that improve the model's generalization ability while preserving geometric correctness under varying input conditions.
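The distinction between easy and hard views, and the truncation applied to a signed distance value, can be illustrated with a minimal sketch. It is not the authors' rendering pipeline. The numeric ranges for distance, azimuth, and elevation, the truncation threshold, and the names `EASY`, `HARD`, `sample_camera`, `sample_views`, and `truncate_sdf` are all assumptions made for illustration. Only the view counts (6 and 24) come from the abstract.

```python
import math
import random

# Illustrative assumptions only: the paper's actual ranges are not given here.
# Easy: fixed camera-object distance, limited viewing angles.
EASY = {"distance": (2.0, 2.0), "azimuth_deg": (-45.0, 45.0), "elevation_deg": (20.0, 30.0)}
# Hard: variable distance, wide range of angles.
HARD = {"distance": (1.5, 3.5), "azimuth_deg": (0.0, 360.0), "elevation_deg": (-30.0, 60.0)}


def sample_camera(cfg, rng):
    """Sample one camera position around an object centered at the origin."""
    d = rng.uniform(*cfg["distance"])
    az = math.radians(rng.uniform(*cfg["azimuth_deg"]))
    el = math.radians(rng.uniform(*cfg["elevation_deg"]))
    # Spherical to Cartesian coordinates (z is up).
    x = d * math.cos(el) * math.cos(az)
    y = d * math.cos(el) * math.sin(az)
    z = d * math.sin(el)
    return (x, y, z)


def sample_views(cfg, n_views, seed=0):
    """Sample n_views camera positions for one object, reproducibly."""
    rng = random.Random(seed)
    return [sample_camera(cfg, rng) for _ in range(n_views)]


def truncate_sdf(value, threshold=0.2):
    """Clamp a signed distance to [-threshold, threshold] (hypothetical threshold)."""
    return max(-threshold, min(threshold, value))


easy_views = sample_views(EASY, 6)    # six easy views per object
hard_views = sample_views(HARD, 24)   # at least 24 hard views per object
```

In this sketch the easy set keeps every camera at the same distance and within a narrow angular window, while the hard set varies both, which is why it needs more samples to cover the same object.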
Keywords:
neural networks, signed distance field, 3D models, computer graphics, deep learning, visual quality, 3D reconstruction, machine learning, transformer, volumetric representation of objects.
DOI 10.14357/20718632260102
EDN IKPKBW
PP. 15-27.
|