KEYSTROKE DYNAMICS ANALYSIS USING MACHINE LEARNING METHODS
Nataliya SHABLIY
natalinash@gmail.com
Ternopil Ivan Puluj National Technical University, Faculty of Computer Information Systems and Software Engineering, Computer Systems and Networks Department, Ternopil (Ukraine)
Serhii LUPENKO
Ternopil Ivan Puluj National Technical University, Faculty of Computer Information Systems and Software Engineering, Computer Systems and Networks Department, Ternopil (Ukraine)
Nadiia LUTSYK
Ternopil Ivan Puluj National Technical University, Faculty of Computer Information Systems and Software Engineering, Computer Systems and Networks Department, Ternopil (Ukraine)
Oleh YASNIY
Ternopil Ivan Puluj National Technical University, Faculty of Computer Information Systems and Software Engineering, Computer Systems and Networks Department, Ternopil (Ukraine)
Olha MALYSHEVSKA
Ivano-Frankivsk National Medical University, Department of Hygiene and Ecology, Ivano-Frankivsk (Ukraine)
Abstract
The primary objective of the paper was to identify a user from their keystroke dynamics using machine learning methods. This problem can be formulated as a classification task. To solve it, four supervised machine learning methods were employed: logistic regression, support vector machines, random forest, and a neural network. Each of three users typed the same seven-character word 600 times. Each row of the dataset consists of seven values, each being the length of time a particular key was held down. The ground-truth label is the user ID. Before the classification methods were applied, the features were standardized to z-scores. Classification metrics were obtained for each method. The following parameters were determined: precision, recall, F1-score, support, prediction, and area under the receiver operating characteristic curve (AUC). The AUC scores were high overall. The lowest AUC score, 0.928, was obtained with the logistic regression classifier. The highest AUC score was obtained with the neural network classifier. Support vector machines and random forest gave slightly lower results than the neural network. The same pattern holds for precision, recall, and F1-score. Nevertheless, the classification metrics are high in every case. Therefore, machine learning methods can be used efficiently to classify users based on keystroke patterns. The neural network is the recommended method for this kind of problem.
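The pipeline described in the abstract (z-score standardization of seven key hold times, a classifier, and an AUC score) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' code or data: the hold times are synthetic, the per-user means in `USER_MEANS` are hypothetical, and a simple nearest-centroid scorer stands in for the four classifiers named in the paper so that the example runs with the Python standard library only. The AUC is computed one-vs-rest per user and then macro-averaged, which is one common convention and may differ from the one the paper used.

```python
import random
import statistics

random.seed(0)

N_KEYS = 7         # hold times per typed word, as stated in the abstract
N_PER_USER = 100   # synthetic sample count per user (assumption, kept small for speed)
USER_MEANS = {0: 90.0, 1: 110.0, 2: 130.0}  # hypothetical mean hold times in ms

def make_samples():
    """Generate synthetic hold-time rows and their user-ID labels."""
    rows, labels = [], []
    for user, mean in USER_MEANS.items():
        for _ in range(N_PER_USER):
            rows.append([random.gauss(mean, 15.0) for _ in range(N_KEYS)])
            labels.append(user)
    return rows, labels

def fit_zscore(rows):
    """Return per-column means and population standard deviations."""
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    stds = [statistics.pstdev(c) or 1.0 for c in cols]  # guard constant columns
    return means, stds

def apply_zscore(rows, means, stds):
    """Standardize each feature: (value - mean) / std."""
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

def binary_auc(scores, targets):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for s, t in zip(scores, targets) if t]
    neg = [s for s, t in zip(scores, targets) if not t]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

rows, labels = make_samples()
order = list(range(len(rows)))
random.shuffle(order)
half = len(order) // 2
train_idx, test_idx = order[:half], order[half:]

# Fit the z-score parameters on the training half only, then apply to both halves.
means, stds = fit_zscore([rows[i] for i in train_idx])
X_train = apply_zscore([rows[i] for i in train_idx], means, stds)
X_test = apply_zscore([rows[i] for i in test_idx], means, stds)
y_train = [labels[i] for i in train_idx]
y_test = [labels[i] for i in test_idx]

# Stand-in classifier: one centroid per user in standardized feature space.
centroids = {}
for user in USER_MEANS:
    members = [x for x, y in zip(X_train, y_train) if y == user]
    centroids[user] = [statistics.fmean(c) for c in zip(*members)]

def score(x, user):
    """Higher score means x lies closer to the user's centroid."""
    return -sum((a - b) ** 2 for a, b in zip(x, centroids[user]))

# One-vs-rest AUC per user, then the macro average.
aucs = {
    user: binary_auc([score(x, user) for x in X_test], [y == user for y in y_test])
    for user in USER_MEANS
}
macro_auc = statistics.fmean(aucs.values())
print("per-user AUC:", aucs)
print("macro-averaged AUC:", macro_auc)
```

Fitting the standardization on the training half and reusing those parameters on the test half avoids leaking test statistics into the model. Whether the paper standardized before or after splitting is not stated in the abstract.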
Keywords:
keystroke dynamics analysis, machine learning, neural network, supervised learning, classification problem
License
This work is licensed under a Creative Commons Attribution 4.0 International License.
All articles published in Applied Computer Science are open-access and distributed under the terms of the Creative Commons Attribution 4.0 International License.