CKSD: Comprehensive Kurdish-Sorani database
Issue Vol. 15 No. 1 (2025)
Jihad Anwar Qadir, Samer Kais Jameel, Wshyar Omar Khudhur, Kamaran H. Manguri (pp. 153-156)
Abstract
Every individual communicates in a specific language, and each language has its own letters and features that distinguish it from other languages. Ideas, cultures, and sciences are exchanged through language-processing tasks, including the retrieval, translation, and classification of texts from journals, books, research, and the internet. These tasks depend on the availability of databases. Unfortunately, for various reasons, Kurdish language databases may be rare or non-existent. In the present study, a Comprehensive Kurdish-Sorani Database (CKSD) is generated, which contains datasets of dates, letters, and common words in the Kurdish language, as well as the documents from which these datasets were extracted. The elements of these collections were extracted from documents written in 27 different fonts, which makes the CKSD comprehensive and useful to researchers. To determine how well classifiers can categorize such data, the datasets were used in classification experiments in this study. The results demonstrate the reliability of the data and its suitability for machine learning and other artificial intelligence applications.

