EVALUATING LARGE LANGUAGE MODELS FOR MEDICAL INFORMATION EXTRACTION: A COMPARATIVE STUDY OF ZERO-SHOT AND SCHEMA-BASED METHODS
Zakaria KADDARI
z.kaddari@ump.ac.ma
Université Mohammed Premier, National School of Applied Sciences, LaRSA laboratory, AIRES team (Morocco)
https://orcid.org/0000-0003-4034-5612
Ikram El HACHMI
Université Mohammed Premier, Faculty of Medicine and Pharmacy Oujda (Morocco)
https://orcid.org/0009-0008-7928-3088
Jamal BERRICH
Université Mohammed Premier, Faculty of Medicine and Pharmacy Oujda (Morocco)
https://orcid.org/0000-0001-8443-7223
Rim AMRANI
Université Mohammed Premier, Faculty of Medicine and Pharmacy Oujda (Morocco)
https://orcid.org/0000-0003-3906-5533
Toumi BOUCHENTOUF
Université Mohammed Premier, Faculty of Medicine and Pharmacy Oujda (Morocco)
https://orcid.org/0000-0002-2689-8678
Abstract
This study investigates the application of large language models, particularly ChatGPT, in the extraction and structuring of medical information from free-text patient reports. The authors explore two distinct methods: a zero-shot extraction approach and a schema-based extraction approach. The dataset, consisting of 1230 anonymized French medical reports from the Department of Neonatology of the Mohammed VI University Hospital, served as the basis for these experiments. The findings indicate that while ChatGPT demonstrates a significant capability in structuring medical data, certain challenges remain, particularly with complex and non-standardized text formats. The authors evaluate the model's performance using precision, recall, and F1 score metrics, providing a comprehensive assessment of its applicability in clinical settings.
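As an illustration of the evaluation metrics named in the abstract, the following is a minimal sketch of field-level precision, recall, and F1 for one structured report. It is not the authors' evaluation code: the exact-match criterion, the function name `field_level_scores`, and the field names in the usage note are assumptions made for this example only.

```python
from typing import Dict, Tuple


def field_level_scores(
    gold: Dict[str, str], predicted: Dict[str, str]
) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) for one report.

    Assumption: a predicted field counts as correct only when both the
    field name and the normalized value match the gold annotation exactly.
    Empty values are treated as "field not present".
    """
    # Normalize values so trivial case/whitespace differences do not count as errors.
    gold_items = {(k, v.strip().lower()) for k, v in gold.items() if v.strip()}
    pred_items = {(k, v.strip().lower()) for k, v in predicted.items() if v.strip()}

    true_positives = len(gold_items & pred_items)

    # Guard against division by zero when either side is empty.
    precision = true_positives / len(pred_items) if pred_items else 0.0
    recall = true_positives / len(gold_items) if gold_items else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall) > 0
        else 0.0
    )
    return precision, recall, f1


# Hypothetical gold annotation and model output (illustrative values only).
gold = {"birth_weight_g": "3200", "gestational_age_weeks": "38", "apgar_5min": "9"}
predicted = {
    "birth_weight_g": "3200",
    "gestational_age_weeks": "37",
    "apgar_5min": "9",
    "delivery_mode": "cesarean",
}
precision, recall, f1 = field_level_scores(gold, predicted)
```

In this hypothetical case two of the four predicted fields match the three gold fields, so precision is 2/4, recall is 2/3, and F1 is 4/7. A real evaluation would aggregate counts over all reports and might use a more lenient matching criterion for free-text values.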
Keywords:
Medical Information Extraction, Large Language Models, ChatGPT, schema-based extraction
License
This work is licensed under a Creative Commons Attribution 4.0 International License.
All articles published in Applied Computer Science are open-access and distributed under the terms of the Creative Commons Attribution 4.0 International License.