A comparative analysis of the performance of the relational database and the Hadoop environment in the context of analytical data processing
Abstract
The article presents a detailed comparative analysis of the performance of a Microsoft SQL Server relational database and an Apache Hadoop environment in analytical data processing. The study executed more than a dozen research scenarios, each running different queries against datasets of varying sizes. For each scenario, the average query execution time was compared across the datasets. The results show that the average execution time of the queries in the presented scenarios is significantly shorter in MS SQL Server than in Apache Hadoop.
Keywords:
Apache Hadoop, SQL Server, relational database, OLAP
License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.