EVALUATING LARGE LANGUAGE MODELS FOR MEDICAL INFORMATION EXTRACTION: A COMPARATIVE STUDY OF ZERO-SHOT AND SCHEMA-BASED METHODS
Zakaria KADDARI
z.kaddari@ump.ac.ma
Université Mohammed Premier, National School of Applied Sciences, LaRSA laboratory, AIRES team (Morocco)
https://orcid.org/0000-0003-4034-5612
Ikram El HACHMI
Université Mohammed Premier, Faculty of Medicine and Pharmacy Oujda (Morocco)
https://orcid.org/0009-0008-7928-3088
Jamal BERRICH
Université Mohammed Premier, Faculty of Medicine and Pharmacy Oujda (Morocco)
https://orcid.org/0000-0001-8443-7223
Rim AMRANI
Université Mohammed Premier, Faculty of Medicine and Pharmacy Oujda (Morocco)
https://orcid.org/0000-0003-3906-5533
Toumi BOUCHENTOUF
Université Mohammed Premier, Faculty of Medicine and Pharmacy Oujda (Morocco)
https://orcid.org/0000-0002-2689-8678
Abstract
This study investigates the application of large language models, particularly ChatGPT, in the extraction and structuring of medical information from free-text patient reports. The authors explore two distinct methods: a zero-shot extraction approach and a schema-based extraction approach. The dataset, consisting of 1230 anonymized French medical reports from the Department of Neonatology of the Mohammed VI University Hospital, served as the basis for these experiments. The findings indicate that while ChatGPT demonstrates a significant capability in structuring medical data, certain challenges remain, particularly with complex and non-standardized text formats. The authors evaluate the model's performance using precision, recall, and F1 score metrics, providing a comprehensive assessment of its applicability in clinical settings.
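As a rough illustration of the metrics named in the abstract, the sketch below computes precision, recall, and F1 score by comparing the set of extracted (field, value) pairs with a gold-standard set for one report. The exact-match criterion, the function name `precision_recall_f1`, and the field names and values are assumptions made for this example; the abstract does not specify the matching scheme the authors used.

```python
def precision_recall_f1(predicted, gold):
    """Return (precision, recall, f1) for two collections of (field, value) pairs.

    Assumption: a predicted pair counts as correct only if it matches a gold
    pair exactly. This is an illustrative criterion, not the study's own.
    """
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical gold annotation and model output for a single report.
gold = {
    ("sex", "F"),
    ("birth_weight_g", "3200"),
    ("gestational_age_weeks", "39"),
}
predicted = {
    ("sex", "F"),
    ("birth_weight_g", "3100"),  # wrong value: one false positive, one false negative
    ("gestational_age_weeks", "39"),
}

precision, recall, f1 = precision_recall_f1(predicted, gold)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Here two of the three predicted pairs match the gold annotation, so precision and recall are both 2/3. A more lenient criterion, such as partial or normalized matching, would give different scores on the same output.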
Keywords:
Medical Information Extraction, Large Language Models, ChatGPT, schema-based extraction
License
This work is licensed under a Creative Commons Attribution 4.0 International License.
All articles published in Applied Computer Science are open-access and distributed under the terms of the Creative Commons Attribution 4.0 International License.