Analysis of data processing efficiency with use of Apache Hive and Apache Pig in Hadoop environment

Mikołaj Skrzypczyński; Piotr Muryjas

doi:10.35784/jcsi.4060

Published: Mar 20, 2024

Issue Vol. 30 (2024)

Articles

Analysis of data processing efficiency with use of Apache Hive and Apache Pig in Hadoop environment
Mikołaj Skrzypczyński, Piotr Muryjas

1-8
Analysis of the application for the DFD authoring usage possibilities
Marek Pieczykolan, Marcin Badurowicz

9-13
Comparative analysis of query execution speed using Entity Framework for selected database engines
Krzysztof Winiarczyk, Rafał Stęgierski

14-20
C++ and Kotlin performance on Android – a comparative analysis
Grzegorz Zaręba, Maciej Zarębski, Jakub Smołka

21-25
Comparative analysis of Node.js frameworks
Bartłomiej Zima, Marcin Barszcz

26-30
User experience analysis in virtual museums
Aleksandra Kobylska, Mariusz Dzieńkowski

31-38
Analysis of user experience during interaction with automotive repair workshop websites
Radosław Danielkiewicz, Mariusz Dzieńkowski

39-46
A comparative analysis of transitions generated using the Unity game development platform
Marek Tabiszewski

47-52
Comparative analysis of the performance of Unity and Unreal Engine game engines in 3D games
Kamil Abramowicz, Przemysław Borczuk

53-60
Classification Performance Comparison of BERT and IndoBERT on SelfReport of COVID-19 Status on Social Media
Irwan Budiman, Mohammad Reza Faisal, Astina Faridhah, Andi Farmadi, Muhammad Itqan Mazdadi, Triando Hamonangan Saragih, Friska Abadi

61-67

Authors

Mikołaj Skrzypczyński

mikolaj.skrzypczynski@pollub.edu.pl

Lublin University of Technology, Poland

Piotr Muryjas

p.muryjas@pollub.pl

Lublin University of Technology, Poland

Abstract

The aim of this paper is to analyze data processing efficiency using Apache Hive and Apache Pig in a Hadoop environment. The analysis compared the two tools on a large data set of 28 million records. The research used scripts and queries written for Apache Hive and Apache Pig, each executed 10 times in an environment provided by a purpose-built virtual machine. These methods were applied to the same data sets 16 times, according to previously prepared research scenarios. The authors conclude that Apache Hive is a more efficient tool than Apache Pig.

Keywords:

data processing, Apache Hive, Apache Pig, Hadoop

References

Skrzypczyński, M., & Muryjas, P. (2024). Analysis of data processing efficiency with use of Apache Hive and Apache Pig in Hadoop environment. Journal of Computer Sciences Institute, 30, 1–8. https://doi.org/10.35784/jcsi.4060
