EVALUATING LARGE LANGUAGE MODELS FOR MEDICAL INFORMATION EXTRACTION: A COMPARATIVE STUDY OF ZERO-SHOT AND SCHEMA-BASED METHODS

Zakaria KADDARI; Ikram El HACHMI; Jamal BERRICH; Rim AMRANI; Toumi BOUCHENTOUF

doi:10.35784/acs-2024-44

EVALUATING LARGE LANGUAGE MODELS FOR MEDICAL INFORMATION EXTRACTION: A COMPARATIVE STUDY OF ZERO-SHOT AND SCHEMA-BASED METHODS

Zakaria KADDARI

z.kaddari@ump.ac.ma
Université Mohammed Premier, National School of Applied Sciences, LaRSA laboratory, AIRES team (Morocco)
https://orcid.org/0000-0003-4034-5612

Ikram El HACHMI

Université Mohammed Premier, Faculty of Medicine and Pharmacy Oujda (Morocco)
https://orcid.org/0009-0008-7928-3088

Jamal BERRICH

Université Mohammed Premier, Faculty of Medicine and Pharmacy Oujda (Morocco)
https://orcid.org/0000-0001-8443-7223

Rim AMRANI

Université Mohammed Premier, Faculty of Medicine and Pharmacy Oujda (Morocco)
https://orcid.org/0000-0003-3906-5533

Toumi BOUCHENTOUF

Université Mohammed Premier, Faculty of Medicine and Pharmacy Oujda (Morocco)
https://orcid.org/0000-0002-2689-8678

DOI: https://doi.org/10.35784/acs-2024-44

Abstract

This study investigates the application of large language models, particularly ChatGPT, in the extraction and structuring of medical information from free-text patient reports. The authors explore two distinct methods: a zero-shot extraction approach and a schema-based extraction approach. The dataset, consisting of 1230 anonymized French medical reports from the Department of Neonatology of the Mohammed VI University Hospital, served as the basis for these experiments. The findings indicate that while ChatGPT demonstrates a significant capability in structuring medical data, certain challenges remain, particularly with complex and non-standardized text formats. The authors evaluate the model's performance using precision, recall, and F1 score metrics, providing a comprehensive assessment of its applicability in clinical settings.

Keywords:

Medical Information Extraction, Large Language Models, ChatGPT, schema-based extraction

References

Agrawal, M., Hegselmann, S., Lang, H., Kim, Y., & Sontag, D. (2022). Large Language Models are few-shot clinical information extractors. ArXiv, abs/2205.12689. https://doi.org/10.48550/arXiv.2205.12689
DOI: https://doi.org/10.18653/v1/2022.emnlp-main.130 Google Scholar

Bergomi, L., Tommaso, M., Antonazzo, P., Alberghi, L., Bellazzi, R., Preda, L., Bortolotto, C., & Parimbelli, E. (2024). Reshaping free-text radiology notes into structured reports with generative question answering transformers. Artificial Intelligence in Medicine, 154, 102924. https://doi.org/10.1016/j.artmed.2024.102924
DOI: https://doi.org/10.1016/j.artmed.2024.102924 Google Scholar

Bhate, N., Mittal, A., He, Z., & Luo, X. (2023). Zero-shot learning with minimum instruction to extract social determinants and family history from clinical notes using GPT Model. IEEE International Conference on Big Data (BigData) (pp. 1476-1480). IEEE. https://doi.org/10.1109/BigData59044.2023.10386811
DOI: https://doi.org/10.1109/BigData59044.2023.10386811 Google Scholar

Huang, J., Yang, D. M., Rong, R., Nezafati, K., Treager, C., Chi, Z., Wang, S., Cheng, X., Guo, Y., Klesse, L. J., Xiao, G., Peterson, E. D., Zhan, X., & Xie, Y. (2024). A critical assessment of using ChatGPT for extracting structured data from clinical notes. Npj Digital Medicine, 7(1), 106. https://doi.org/10.1038/s41746-024-01079-8
DOI: https://doi.org/10.1038/s41746-024-01079-8 Google Scholar

Huang, L., Yu, W., Ma, W., Zhong, W., Feng, Z., Wang, H., Chen, Q., Peng, W., Feng, X., Qin, B., & Liu, T. (2024). A Survey on hallucination in Large Language Models: Principles, taxonomy, challenges, and open questions. ACM Transactions on Information Systems, 3703155. https://doi.org/10.1145/3703155
DOI: https://doi.org/10.1145/3703155 Google Scholar

Kaddari, Z., Mellah, Y., Berrich, J., Belkasmi, M. G., & Bouchentouf, T. (2021). Natural language processing: challenges and future directions. In T. Masrour, I. El Hassani, & A. Cherrafi (Eds.), Artificial Intelligence and Industrial Applications (Vol. 144, pp. 236–246). Springer International Publishing. https://doi.org/10.1007/978-3-030-53970-2_22
DOI: https://doi.org/10.1007/978-3-030-53970-2_22 Google Scholar

Kernberg, A., Gold, J., & Mohan, V. (2024). Using ChatGPT-4 to create structured medical notes from audio recordings of physician-patient encounters: Comparative study. Journal of Medical Internet Research, 26, e54419. https://doi.org/10.2196/54419
DOI: https://doi.org/10.2196/54419 Google Scholar

Ouyang, L., Wu, J., Jiang, X., Almeida, D., Wainwright, C. L., Mishkin, P., Zhang, C., Agarwal, S., Slama, K., Ray, A., Schulman, J., Hilton, J., Kelton, F., Miller, L., Simens, M., Askell, A., Welinder, P., Christiano, P., Leike, J., & Lowe, R. (2022). Training language models to follow instructions with human feedback. ArXiv, abs/2203.02155. https://doi.org/10.48550/arXiv.2203.02155
Google Scholar

Patra, B. G., Lepow, L. A., Kasi Reddy Jagadeesh Kumar, P., Vekaria, V., Sharma, M. M., Adekkanattu, P., Fennessy, B., Hynes, G., Landi, I., Sanchez-Ruiz, J. A., Ryu, E., Biernacka, J. M., Nadkarni, G. N., Talati, A., Weissman, M., Olfson, M., Mann, J. J., Zhang, Y., Charney, A. W., & Pathak, J. (2024). Extracting social support and social isolation information from clinical psychiatry notes: Comparing a rule-based natural language processing system and a large language model. Journal of the American Medical Informatics Association. https://doi.org/10.1093/jamia/ocae260
DOI: https://doi.org/10.1093/jamia/ocae260 Google Scholar

Ray, P. P. (2023). ChatGPT: A comprehensive review on background, applications, key challenges, bias, ethics, limitations and future scope. Internet of Things and Cyber-Physical Systems, 3, 121-154. https://doi.org/10.1016/j.iotcps.2023.04.003
DOI: https://doi.org/10.1016/j.iotcps.2023.04.003 Google Scholar

Straka, M., Náplava, J., Straková, J., & Samuel, D. (2021). RobeCzech: Czech RoBERTa, a monolingual contextualized language representation model. In K. Ekštein, F. Pártl, & M. Konopík (Eds.), Text, Speech, and Dialogue (Vol. 12848, pp. 197-209). Springer International Publishing. https://doi.org/10.1007/978-3-030-83527-9_17
DOI: https://doi.org/10.1007/978-3-030-83527-9_17 Google Scholar

Tsai, R. T.-H., Wu, S.-H., Chou, W.-C., Lin, Y.-C., He, D., Hsiang, J., Sung, T.-Y., & Hsu, W.-L. (2006). Various criteria in the evaluation of biomedical named entity recognition. BMC Bioinformatics, 7, 92. https://doi.org/10.1186/1471-2105-7-92
DOI: https://doi.org/10.1186/1471-2105-7-92 Google Scholar

Yifan, Y., Jinhao, D., Kaidi, X., Yuanfang, C., Zhibo, S., & Yue, Z. (2024). A survey on large language model (LLM) security and privacy: The Good, The Bad, and The Ugly. High-Confidence Computing, 4(2), 100211. https://doi.org/10.1016/j.hcc.2024.100211
DOI: https://doi.org/10.1016/j.hcc.2024.100211 Google Scholar

Zelina, P., Halamkova, J., & Novacek, V. (2022). Unsupervised extraction, labelling and clustering of segments from clinical notes. 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) (pp. 1362-1368). IEEE. http://dx.doi.org/10.1109/BIBM55620.2022.9995229
DOI: https://doi.org/10.1109/BIBM55620.2022.9995229 Google Scholar

Zhan, X., Humbert-Droz, M., Mukherjee, P., & Gevaert, O. (2021). Structuring clinical text with AI: Old versus new natural language processing techniques evaluated on eight common cardiovascular diseases. Patterns, 2(7), 100289. https://doi.org/10.1016/j.patter.2021.100289
DOI: https://doi.org/10.1016/j.patter.2021.100289 Google Scholar

Download

Published

2024-12-31

Cited by

KADDARI, Z., El HACHMI, I., BERRICH, J., AMRANI, R., & BOUCHENTOUF, T. (2024). EVALUATING LARGE LANGUAGE MODELS FOR MEDICAL INFORMATION EXTRACTION: A COMPARATIVE STUDY OF ZERO-SHOT AND SCHEMA-BASED METHODS. Applied Computer Science, 20(4), 138–148. https://doi.org/10.35784/acs-2024-44

Authors

Zakaria KADDARI
z.kaddari@ump.ac.ma
Université Mohammed Premier, National School of Applied Sciences, LaRSA laboratory, AIRES team Morocco
https://orcid.org/0000-0003-4034-5612

Authors

Ikram El HACHMI

Université Mohammed Premier, Faculty of Medicine and Pharmacy Oujda Morocco
https://orcid.org/0009-0008-7928-3088

Authors

Jamal BERRICH

Université Mohammed Premier, Faculty of Medicine and Pharmacy Oujda Morocco
https://orcid.org/0000-0001-8443-7223

Authors

Rim AMRANI

Université Mohammed Premier, Faculty of Medicine and Pharmacy Oujda Morocco
https://orcid.org/0000-0003-3906-5533

Authors

Toumi BOUCHENTOUF

Université Mohammed Premier, Faculty of Medicine and Pharmacy Oujda Morocco
https://orcid.org/0000-0002-2689-8678

Statistics

Abstract views: 266
PDF downloads: 64

License

This work is licensed under a Creative Commons Attribution 4.0 International License.

All articles published in Applied Computer Science are open-access and distributed under the terms of the Creative Commons Attribution 4.0 International License.

EVALUATING LARGE LANGUAGE MODELS FOR MEDICAL INFORMATION EXTRACTION: A COMPARATIVE STUDY OF ZERO-SHOT AND SCHEMA-BASED METHODS

Zakaria KADDARI

Ikram El HACHMI

Jamal BERRICH

Rim AMRANI

Toumi BOUCHENTOUF

Abstract

Keywords:

References

Authors

Authors

Authors

Authors

Authors

Statistics

License

Similar Articles

CURRENT ISSUE

Make a Submission

Lublin University of Technology Publishing House

Copyright