Blender as a tool for generating synthetic data
Abstract
Acquiring data for neural network training is expensive and labour-intensive, especially when the data is
difficult to access. This article proposes using the Blender 3D graphics software as a tool for automatically
generating synthetic image data, with price labels as an example. Using the fastai library, price label classifiers
were trained on a synthetic data set and compared with classifiers trained on a real data set. The comparison of the
results showed that Blender can be used to generate synthetic training data. This can significantly accelerate
data acquisition and, consequently, the training of neural networks.
Keywords:
artificial neural networks, convolutional neural network, synthetic data, Blender
Journal of Computer Sciences Institute 16 (2020) 227–232
Authors
Maciej Pańczyk, Poland
Statistics
Abstract views: 922
PDF downloads: 726
License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.